Appearance
Configuration
.nubos-pilot/config.json is the single source of truth for runtime, model selection, and workflow toggles. This page lists every key, its default, who reads it, and how to change it.
Where the file lives
| Path | Created by | Purpose |
|---|---|---|
.nubos-pilot/config.json | npx nubos-pilot install interview | Project-scoped configuration; committed to git |
The installer writes the initial file. After install, edit it directly or via npx nubos-pilot update (which preserves answers and refreshes the payload).
Default shape
json
{
"runtime": "claude",
"runtimes": ["claude"],
"scope": "local",
"model_profile": "frontier",
"response_language": "en",
"workflow": {
"commit_docs": true,
"commit_artifacts": true,
"worktree_isolation": false,
"tier_routing": false,
"research_tools": {
"WebFetch": true,
"Context7": true
}
},
"agents": {
"parallelization": true,
"research": true,
"plan_checker": true,
"verifier": true,
"architect": true,
"test_writer": true,
"economy": "ultra"
},
"loop": { "maxRounds": 3, "verify_runs": 1 },
"swarm": {
"research": { "k": 3, "threshold": 0.9, "minOccurrence": 3 },
"critic": { "style_tier": "haiku", "tests_tier": "sonnet", "acceptance_tier": "sonnet", "economy_tier": "haiku" },
"knowledge_adapter": "local"
},
"spawn": {
"headless": {
"enabled": false,
"agents": ["np-critic", "np-researcher"],
"timeout_ms": 600000,
"fallback_on_error": true
}
},
"compression": {
"enabled": false,
"min_block_bytes": 2048,
"verify_max_bytes": 2000,
"elision": { "enabled": true, "ttl_ms": 1800000 },
"proxy": { "enabled": false },
"output_steering": {
"enabled": false,
"verbosity_profile": "balanced",
"effort_routing": { "enabled": false, "base_effort": null, "mechanical_effort": "low" }
},
"cache_align": { "enabled": false }
},
"security": {
"enabled": true,
"scan_on_write": true,
"review_on_stop": true,
"review_on_commit": true,
"custom_rules_path": null,
"guidance_path": null,
"review_timeout_ms": 180000,
"max_stop_reviews_in_a_row": 3,
"max_commit_reviews_per_hour": 20,
"max_files_per_review": 30
},
"conformance": { "inject_criteria": true },
"learnings": {
"auto_capture": true,
"max_captures_per_hour": 10,
"max_in_a_row": 3,
"timeout_ms": 120000,
"max_files": 30
},
"auto_log_learning": true
}Authoritative defaults live in lib/config-defaults.cjs. Keys absent from the file fall back to the values shown above, so workflows never crash on a missing key.
Top-level keys
| Key | Type | Default | Who reads it | Effect |
|---|---|---|---|---|
runtime | string | from install detect | lib/runtime/index.cjs::detect | Active host CLI adapter. One of the 14 ids in KNOWN_RUNTIMES. |
runtimes | string[] | [runtime] | bin/install.js | List of runtimes to install for in multi-runtime mode (--agents flag). |
runtime_source | string | config | env | default | lib/runtime/index.cjs | Diagnostic — which detection path won. Read-only after install. |
scope | local | global | local | bin/install.js | local writes payload into project; global writes to runtime's home directory. |
model_profile | enum | frontier | np-tools.cjs resolve-model | Tier → model resolution. See Tier × Profile matrix. |
model_providers | object | (unset → claude-native) | lib/model-providers.cjs, np-tools.cjs resolve-model | Optional provider definitions for model-agnostic dispatch. See § model_providers.* and agent_routing.* and ADR-0021. |
agent_routing | object | (unset) | lib/model-providers.cjs, np-tools.cjs resolve-model | Optional per-agent provider/model routing. See § model_providers.* and agent_routing.*. |
response_language | ISO-639 string | en | lib/language.cjs, every workflow body | Language of workflow prose, prompts, dashboard labels, stats markdown. |
workflow.*
| Key | Type | Default | Who reads it | Effect |
|---|---|---|---|---|
workflow.commit_docs | boolean | true | install / re-install path | Whether the installer commits managed-block updates to CLAUDE.md / AGENTS.md / GEMINI.md. |
workflow.commit_artifacts | boolean | true | lib/commit-policy.cjs | Whether non-task artifacts (CONTEXT, PLAN, ROADMAP, etc.) are committed by their producing workflow. Set to false to keep planning artifacts out of git. See ADR-0004. |
workflow.worktree_isolation | boolean | false | lib/worktree.cjs::worktreeIsolationEnabled | Opt-in per-slice git worktree isolation during /np:execute-phase. See ADR-0008. |
workflow.tier_routing | boolean | false | /np:execute-phase spawn step | Opt-in cost-aware model routing. See § workflow.tier_routing below. |
workflow.text_mode | boolean | (unset) | lib/text-mode.cjs | Forces plain-text prompt rendering. When unset, falls back to process.env.CLAUDECODE. Set to true for non-TTY shells, false to override an inherited CLAUDECODE. |
workflow.research_tools.WebFetch | boolean | true | np-researcher startup probe | Whether the researcher tries WebFetch. Disabling forces local-only research. Overridden by env NP_TOOLS_WEBFETCH. |
workflow.research_tools.Context7 | boolean | true | np-researcher startup probe | Whether the researcher tries Context7 MCP. Overridden by env NP_TOOLS_CONTEXT7. |
workflow.tier_routing
Cost-aware executor model routing. Off by default — every task's Round-1 executor runs at the frontier profile (the strongest model), regardless of the task's planned tier. This preserves the historical behaviour: maximum quality on every task.
When set to true, the Round-1 executor instead resolves its model from the task's planner-assigned tier under the project's configured model_profile (default balanced):
Task tier | Model under balanced |
|---|---|
haiku (trivial — single-file docs/rename/format) | haiku |
sonnet (standard — ordinary single-concern work) | sonnet |
opus (large — many files, or security/data-sensitive) | opus |
The planner sets each task's tier; it makes that call evidence-based via np-tools.cjs derive-tier (file count + security/data-sensitivity signals — never implementation detail, per ADR-0013). Round-2+ retries (the np-build-fixer) always stay at frontier: fixing a failing task wants the strongest model regardless of routing. Because a task only drops below frontier when routing is explicitly enabled, a mis-classified tier is never a correctness risk — it only trades cost against the chosen model_profile.
bash
node np-tools.cjs config-get workflow.tier_routing
node np-tools.cjs derive-tier --files "app/Auth.php" --name "add login throttling" # → opusmodel_providers.* and agent_routing.*
Both keys are optional. Absent, every agent resolves to the implicit claude-native default and resolution is byte-for-byte unchanged. Together they make the workspace model-agnostic: each agent can run on a different provider — Claude, a hosted OpenAI-compatible API (OpenAI, xAI/Grok), or a local model (Ollama, vLLM, LM Studio). See ADR-0021.
Off-host dispatch (ADR-0021, Accepted)
An agent runs off-host automatically when its agent_routing resolves to an openai-compat provider (Ollama, OpenAI, Grok, vLLM). You can also run any off-host agent ad-hoc with np-tools spawn-offhost --agent <name> --task <…> [--task-id M<NNN>-S<NNN>-T<NNNN>].
What an agent writes decides whether it needs a worktree:
- Live-code editors (
np-executor,np-build-fixerin/np:execute-phase) edit the working tree, so they requireworkflow.worktree_isolation=true. Model-driven edits stay confined to the per-wave slice worktree and are ff-merged back, and--allow-bashis honoured only there. - Artefact writers (
np-planner,np-plan-checkerin/np:plan-phase) only write planning artefacts under.nubos-pilot/, inside the repo cwd, never live code. They run with the default cwd (Read/Grep/Glob over the whole repo, Write confined to cwd), no worktree and no Bash. There is no emit-and-persist contract; they write their files exactly as the native agent does. - Read-only emitters (
np-critic,np-verifier) run--read-onlyand return their result object as the final message, which the orchestrator persists.
Guards: Rule-9-audited agents (np-executor, np-researcher, np-build-fixer) need a canonical --task-id so the search-evidence ledger applies (a native knowledge-search tool is injected to satisfy the search bar); without one they are refused (offhost-audited-agent-unsupported). --allow-bash works only inside a slice worktree, otherwise offhost-bash-requires-sandbox. The native claude spawn path can't run a non-Claude model, so resolve-model redirects to spawn-offhost (off-host-not-on-native-path); use resolve-model <agent> --kind to detect routing without the refusal (it prints native or openai-compat). Off-host metrics rows leave tokens_in/out null, like every non-Claude runtime ([D-09]).
An openai-compat provider has to clear two bars, or the run fails loudly instead of producing garbage:
- A
GET {base_url}/modelsendpoint. Preflight calls it before every off-host spawn and refuses (preflight-failed) if the endpoint is unreachable or doesn't list the requested model, with an actionable hint such asrun: ollama pull <model>. A gateway that exposes/chat/completionsbut not/modelscan't be used off-host. - OpenAI function/tool-calling support. The agent loop drives Read/Edit/Bash as tool-calls. A model that won't emit tool-calls just returns plain text, which the loop treats as a final answer, so an off-host editor would silently make no edits. That isn't fatal and can't be preflighted cheaply, so it's caught afterwards: the dispatch envelope carries a
capabilityobject ({toolsAdvertised, toolCalls, mutating, ok}), andspawn-offhostprints a loud stderr hint when tools were advertised but none were called (loudest formutatingagents). Keep weak, non-tool-calling models off the editor agents.
Every workflow agent-spawn has an off-host branch: execute-phase (executor, build-fixer, researcher, critic, task-architect, test-writer), plan-phase (planner, plan-checker), discuss-phase (sc-extractor), research-phase (researcher, reconciler), architect-phase (architect, researcher, critic), validate-phase (nyquist-auditor), verify-work (verifier), scan-codebase (codebase-documenter). scripts/check-offhost-coverage.cjs (a test in tests/) enforces this: a new spawn site without an off-host branch fails the suite.
np-security-reviewer and np-learnings-extractor spawn through spawn-headless (a claude -p subprocess in lib code) rather than a workflow orchestrator spawn, so their off-host routing lives inside spawn-headless itself. When the agent routes to an openai-compat provider it calls dispatchOffHost instead of claude -p and writes the same {result} envelope, so review.cjs and extract.cjs parse it unchanged. (spawn-headless.run is now async; the CLI dispatcher and both in-process callers await it.) The default stays native, and keeping the security reviewer on Claude is still the recommendation, since a weak local model lowers review quality.
One capability bound is worth calling out: the off-host toolset has no WebFetch or context7, so np-researcher can only do offline (knowledge-search) research off-host. Route it native for online mode. The audited researcher uses the synthetic canonical task-id ${MILESTONE_ID}-S000-T0000 (milestone-level, no slice or task) for its Rule-9 ledger.
tier stays the agent's intrinsic difficulty property; agent_routing is the orthogonal where-it-runs layer.
jsonc
{
"model_providers": {
"default": "claude", // provider used when no routing entry matches
"claude": { "kind": "native" }, // delegate to the host runtime (Claude Code, etc.)
"openai": {
"kind": "openai-compat",
"base_url": "https://api.openai.com/v1",
"api_key_env": "OPENAI_API_KEY",
"models": { "haiku": "gpt-4o-mini", "sonnet": "gpt-4o", "opus": "gpt-4.1" }
},
"ollama": {
"kind": "openai-compat",
"base_url": "http://localhost:11434/v1",
"models": { "haiku": "qwen2.5-coder:7b", "sonnet": "qwen2.5-coder:32b", "opus": "qwen2.5-coder:32b" }
}
},
"agent_routing": {
"np-planner": { "provider": "claude", "model": "claude-opus-4-7" },
"np-critic*": { "provider": "openai", "model": "gpt-4o" },
"np-executor": { "provider": "ollama", "model": "qwen2.5-coder:32b" }
}
}model_providers.<name>
| Field | Type | Required | Effect |
|---|---|---|---|
default | string | no | Provider name used when no agent_routing entry matches. Absent ⇒ claude. |
<name>.kind | native | openai-compat | yes | native delegates to the host runtime (the host picks the model). openai-compat engages nubos-pilot's own dispatch layer over a single fetch-based client. |
<name>.base_url | string | for openai-compat | The /v1 endpoint. Ollama's is http://localhost:11434/v1. The only locator the dispatch layer needs. |
<name>.api_key_env | string | no | Name of the env var holding the API key. Omit for keyless local servers (Ollama). |
<name>.models | object | no | Tier → model id table, used as the fallback when an agent_routing entry pins no explicit model. |
agent_routing.<agent-or-glob>
Keys are exact agent names (np-executor) or a single trailing-* prefix glob (np-critic*). Exact beats glob; among globs the longest prefix wins. Bare tiers never route — they always resolve to the default provider.
| Field | Type | Required | Effect |
|---|---|---|---|
provider | string | yes | Must reference a defined model_providers.<name> — an undefined reference is a hard error (provider-undefined), never a silent fallback to Claude. |
model | string | no | Pins an exact model. Omitted ⇒ the provider's models[tier] is used (the agent's tier still drives size). For a native provider, a pin forces that exact host model id. |
Resolution precedence: agent_routing[<exact>] → agent_routing[<glob>] → model_providers.default (tier-mapped). An openai-compat provider with neither a pinned model nor a models[tier] entry for the agent's tier is a hard error (provider-model-unresolved).
bash
node np-tools.cjs resolve-model np-executor # prints the resolved model for the executorAgent skills
config.json may carry an agent_skills map that lists additional skill files an agent should load at spawn time:
json
{
"agent_skills": {
"np-researcher": ["nubos-research-skill"],
"np-executor": ["nubos-rails-skill", "nubos-typescript-skill"]
}
}The map is read by lib/agents.cjs::getAgentSkills(name) and surfaced through np-tools.cjs agent-skills <name>. Workflows that spawn the agent embed the resolved skill list into the spawn prompt; the agent loads each skill before performing the task.
Skills are runtime-managed: each entry is a string the host CLI can resolve to a skill file (Claude Code's skill-loader, Codex equivalent, etc.). nubos-pilot does not validate the strings; unknown skill ids are passed through and rejected by the runtime if invalid.
Default is no extra skills. Add the map only when you have project-specific skills you want every spawn of a given agent to load.
agents.*
Optional kill-switches for parts of the workflow pipeline. Default is "everything on".
| Key | Type | Default | Effect when false |
|---|---|---|---|
agents.parallelization | boolean | true | /np:execute-phase runs tasks within a slice serially instead of parallel. |
agents.research | boolean | true | /np:plan-phase skips the research gate prompt. |
agents.plan_checker | boolean | true | /np:plan-phase ships planner output without the adversarial review loop. |
agents.verifier | boolean | true | /np:verify-work becomes a no-op stub. |
agents.architect | boolean | true | The per-task architect step (np-task-architect) is skipped in the Nubosloop — the executor gets no up-front structural spec. |
agents.test_writer | boolean | true | The per-task TDD step (np-test-writer) is skipped — no tests are written before the executor runs. |
These are escape hatches for constrained environments. Leaving them on is the recommended default.
agents.architect & agents.test_writer (per-task Nubosloop steps)
Two round-1 steps that run inside /np:execute-phase, between the researcher swarm and the executor, in this order: architect → test-writer → executor.
agents.architectspawnsnp-task-architect(read-only). It reads the task plan,RULES.md(Conventions), and anyM<NNN>-ARCHITECTURE.md, then emits an ephemeral per-task structural spec (responsibilities, boundaries, paradigms, required test surfaces) that is injected into the test-writer and executor prompts. It writes no files and is never committed — so it cannot trip plan-lint. This is the per-task counterpart to the milestone-level/np:architect-phase; the milestone architect is reachable from planning via/np:plan-phase <N> --architect.agents.test_writerspawnsnp-test-writerafter the architect. It writes real, valid test files for the required surfaces before production code exists (TDD). The tests may start red; the executor makes them green and is told not to delete, skip, or weaken them. Thenp-critic-testsaxis re-audits afterwards for any skipped or vacuous assertions.
Both default to true and are backfilled to true on install and update when the key is absent (an explicit false is a deliberate choice and never overwritten — same conservative rule as agents.economy). Each step has a Layer-C SKIP-GUARD (loop-post-architect-missing-spawn-audit / loop-post-test-writer-missing-spawn-audit) so the orchestrator cannot silently skip a spawn it claims to have run. See ADR-0023.
agents.economy (graduated Economy axis)
The Economy axis (Ponytail-style anti-over-engineering) is graduated, not a binary toggle. One enum dials two mechanisms: the prevention ladder (the climb-the-ladder discipline injected into np-executor before it writes) and the in-loop critic (the np-critic-economy.md audit axis that reviews the diff after).
agents.economy | Prevention ladder | In-loop critic | Notes |
|---|---|---|---|
off | off | off | No economy pressure at all. |
lite | on | off | Prevention-first: the executor climbs the ladder, but no critic round is spent. |
full | on | on | Adds the Economy audit axis to the single np-critic spawn — flags over-engineering, stdlib-reinvention, native-duplication, and shrinkable logic (routed to the executor for simplification) on top of the always-on style/tests/acceptance axes. |
ultra (install default) | on | on | As full, but the critic lowers its shrinkable bar, hunts reuse repo-wide, and flags single-use abstractions harder — more rounds for a leaner result. |
Two defaults, on purpose. npx nubos-pilot (install and update) writes agents.economy: "ultra" into the config — into a fresh config on install, and backfilled into an existing config on update only when the key is absent (an explicit economy, or a legacy economy_critic, is treated as a deliberate choice and never overwritten). If the key is missing entirely at runtime, the resolved fallback is lite — a conservative safety net, not the shipped default. So a normal install/update runs ultra; only a hand-stripped config falls back to lite.
The critic is a deliberate counter-force to the Completeness doctrine: completeness always wins — in every mode the economy axis never flags a test, validation, error path, or security control as removable. Dial it down (full/lite/off) if the extra ultra rounds cost more than they save on your project. The same rubric is available on demand, without the loop, via /np:simplify-review. The resolved mode is exposed by the economy-mode CLI command (economy-mode --json prints the {mode, prevention, critic, ultra} gate flags the workflow reads).
Legacy agents.economy_critic (deprecated). The old boolean still validates and is honoured when agents.economy is absent: true → full, false → lite. Prefer the agents.economy enum; the bool will be removed in a future version.
loop.*
Per-task Nubosloop cap — see ADR-0010.
| Key | Type | Default | Effect |
|---|---|---|---|
loop.maxRounds | integer | 3 | Maximum rounds before the loop transitions a task to stuck. Range [1, 100] (clamped). The operator's "Weitermachen +5 Runden" stuck-dialog choice persists nubosloop.max_rounds_override on the checkpoint and overrides this value for the surviving task; /np:resume-work honours the override. |
loop.verify_runs | integer | 1 | pass@k reliability. How many times /np:execute-phase runs a task's <verify> command per round. With 1 (default) the verify runs once, as before. With k > 1 the orchestrator runs it k times and folds the exit codes via np-tools.cjs verify-reliability: a task goes green only when all k runs pass (pass^k). A flaky task (some pass, some fail) aggregates to red and flows through the normal np-build-fixer path, with a FLAKY: … line appended to the verify log so the fixer sees why. Raise this for suites with nondeterminism (timing, ordering, network) you want surfaced rather than tolerated. |
swarm.*
Researcher-Schwarm and Critic configuration.
| Key | Type | Default | Effect |
|---|---|---|---|
swarm.research.k | integer | 3 | Number of parallel np-researcher spawns per Researcher-Schwarm round. Layer-C requires k audit entries per round (k-of-k gate, loop-post-researcher-missing-spawn-audit). |
swarm.research.threshold | float [0,1] | 0.9 | Jaccard-similarity threshold for the pre-flight cache lookup. Hits at or above this similarity bypass the swarm. |
swarm.research.minOccurrence | integer | 3 | Minimum prior-occurrence count required for a learning to count as a cache hit. |
swarm.research.min_agreement_score | float [0,1] | 0.5 | Disagreement-gate floor (ADR-0018 Step 5.7) for the reconciler-reported agreement_score. Below this, /np:research-phase triggers an askuser (re-spawn / continue / manual). Workflow-level override: researcher-reconcile gate <N> --min-agreement-score N. |
swarm.research.max_contested | integer | 2 | Disagreement-gate ceiling (ADR-0018) for the reconciler-reported contested_count. Workflow-level override: researcher-reconcile gate <N> --max-contested N. |
swarm.critic.style_tier | enum | "haiku" | Tier the style axis runs at. Style work is mechanical, so haiku is the cost-optimal choice. |
swarm.critic.tests_tier | enum | "sonnet" | Tier the tests axis runs at. |
swarm.critic.acceptance_tier | enum | "sonnet" | Tier the acceptance axis runs at. Promote to "opus" for milestones with subtle acceptance criteria. |
swarm.critic.economy_tier | enum | "haiku" | Tier the Economy axis runs at when agents.economy is full or ultra. The economy ladder is mechanical pattern-matching, so haiku is the cost-optimal choice. Inert at off/lite. |
swarm.knowledge_adapter | enum | "local" | "local" reads/writes .nubos-pilot/knowledge/learnings.json through lib/knowledge-adapter.cjs. The only adapter shipped. |
The default for swarm.critic is { style_tier: "haiku", tests_tier: "sonnet", acceptance_tier: "sonnet", economy_tier: "haiku" } (lib/config-defaults.cjs). Under the Single-Critic Revision (ADR-0010 §2026-05-05) a single np-critic spawn audits the always-on style/tests/acceptance axes plus the opt-in economy axis (agents.economy ∈ {full, ultra}), and each axis tier still resolves through resolve-model.
spawn.headless.* (Cost Layer L6)
Opt-in headless-subprocess mode for critic and researcher spawns; see ADR-0010 §L6. The orchestrator routes the named agents through bin/np-tools/spawn-headless.cjs, which shells out to claude -p --output-format json. The subprocess conversation lives entirely outside the parent session; only the final-message JSON is captured to disk.
| Key | Type | Default | Effect |
|---|---|---|---|
spawn.headless.enabled | boolean | false | Master switch. false = native Agent tool path (no behaviour change for existing installs). true = headless path for the agents listed below. |
spawn.headless.agents | string[] | ["np-critic", "np-researcher"] | Which agents are eligible for headless dispatch. np-executor and np-build-fixer are intentionally excluded — they mutate the working tree, and headless file mutations would not surface through the parent runtime's diff/edit telemetry, breaking the Layer-A commit-task gate. Extending this list to executor-class agents is a Layer-A risk. |
spawn.headless.timeout_ms | integer | 600000 (10 min) | Per-spawn subprocess timeout. Below 1000 = spawn-headless-invalid-timeout. |
spawn.headless.fallback_on_error | boolean | true | When true, a failing claude -p spawn (binary missing, auth-failure, timeout, non-zero exit) falls back to the native Agent tool. The fallback is stamped on the checkpoint (nubosloop.spawn_headless_fallbacks[]) so dashboards can count fallback rate. |
The claude CLI must be on $PATH and authenticated independently from the parent session. The NUBOS_PILOT_CLAUDE_BIN env var overrides the binary path for split-install scenarios.
Trust-Layer compatibility. loop-audit-tool-use --agent <name> stamps are identical in both paths. Going headless does not bypass the Layer-C audit: the orchestrator MUST call loop-audit-tool-use after spawn-headless returns, exactly as it does after a native Agent-tool spawn.
compression.* (Context Compression + Elision Store)
Opt-in, deterministic context compression for the content nubos-pilot injects before it reaches a model. Large fenced blocks in a headless spawn prompt (bin/np-tools/spawn-headless.cjs) and oversized verify logs (bin/np-tools/loop-run-round.cjs) are crushed by rule-based reducers in lib/compress.cjs (JSON arrays, build/test logs, grep output, unified diffs, and source code — signatures/structure kept, statement bodies elided), each preserving error and outlier lines. Dropped originals are cached under .nubos-pilot/elision/<hash>.json and referenced inline by a ⟦elided:<hash>⟧ marker, so an agent can recover the full text on demand with np-tools.cjs elision-get <hash>. The elision store makes every lossy drop reversible: the worst case is an uncompressed answer, never a wrong one.
Where the compression bites depends on how the agent runs:
- Off-host agents (openai-compat providers via
lib/runtime/agent-loop.cjs) run their agentic loop inside nubos-pilot. Each tool result is crushed before it enters the message history that is re-sent every turn, so the savings compound across turns. These agents are also handed acontext-expandtool (injected automatically while compression is on) so the model can pull back a⟦elided:<hash>⟧original itself mid-loop. - Native
claude -pspawns run their loop inside the Claude CLI, invisible to nubos-pilot. To reach that traffic,compression.proxy.enabledstarts a localhost proxy (lib/elision-proxy.cjs, forked into its own process), points the child'sANTHROPIC_BASE_URLat it, and crushes oversizedtool_resultblocks in each/v1/messagesrequest before forwarding upstream. Onlytool_resultblocks are touched —system, tool definitions and ordinary text are left byte-identical, and because the rewrite is deterministic the compressed prefix stays cache-stable, so Anthropic prompt-caching is preserved.
| Key | Type | Default | Effect |
|---|---|---|---|
compression.enabled | boolean | false | Master switch. false = no behaviour change (verify logs fall back to dumb tail-verify_max_bytes truncation; prompts and tool results pass through untouched). true = crush injected/tool content and emit elision markers. |
compression.min_block_bytes | integer | 2048 | A block smaller than this is left byte-identical. Below the threshold the savings don't justify the marker overhead. |
compression.verify_max_bytes | integer | 2000 | Byte budget for the critic-facing verify-output excerpt. Error/stack lines are kept first, then the tail fills the remaining budget. |
compression.elision.enabled | boolean | true | When true, dropped originals are cached and retrievable via elision-get. When false, blocks are still crushed but no marker/cache is written (irreversible — not recommended). |
compression.elision.ttl_ms | integer | 1800000 (30 min) | Time-to-live for a cached original. After it elapses, elision-get reports expired; lib/elision.cjs::prune removes lapsed entries lazily. |
compression.proxy.enabled | boolean | false | When true (and compression.enabled is on), native claude -p spawns are routed through the localhost elision proxy so their in-loop tool_result traffic is compressed. Requires no upstream change — it forwards to the original ANTHROPIC_BASE_URL (or api.anthropic.com). Default off. |
When compression.enabled and compression.elision.enabled are both on, the Researcher-Schwarm also dedups its identical k-spawn input: it is cached once and each spawn spec carries an input_ref hash instead of re-inlining the payload k times.
Input-block coverage
The rule-based reducers in lib/compress.cjs also handle two brace-light block types beyond those listed above, both fully reversible via the elision store: prose (long natural-language tool output — head/tail sentences plus any directive/warning sentence kept, the middle sampled) and brace-light source such as Python (crushed by indent depth — module-level lines, block headers and raise/assert kept, deep statement bodies elided). Brace counting for C-family code masks string literals and //…/* */ comments first, so braces inside them never distort the structure detection. These are governed by the same compression.enabled master switch — no separate toggle.
compression.output_steering.* (Output-token reduction)
Deterministic output-token reduction — no model, no dependency. Two independent levers, both safe with the prompt cache:
- Verbosity steering appends a fixed terseness directive to the end of the system prompt (wrapped in a
<nubos_output_shaping>block, idempotent — re-enrichment never stacks). Appending after the cached prefix leaves prompt-caching intact. On the nativeclaude -ppath the directive is appended to the agent instructions instead; on off-host agents it enriches the system message each loop. - Effort routing classifies each off-host turn structurally — a fresh user ask or a turn recovering from a tool error keeps full effort (
base_effort), while a turn that only continues from cleantool_results is "mechanical" and has its effort downgraded tomechanical_effort. The effort reaches the provider as the OpenAI-compatreasoning_effortfield, which is only ever sent when you pin a concretebase_effort— that pin is your affirmation the provider supports it. Without it the lever is inert and the field is never sent, so providers without effort support are unaffected. Single-shot native spawns have no multi-turn loop, so routing does not apply there.
| Key | Type | Default | Effect |
|---|---|---|---|
compression.output_steering.enabled | boolean | false | Master switch for both levers. Requires compression.enabled. |
compression.output_steering.verbosity_profile | string | "balanced" | balanced = no steering (no-op). concise / terse / minimal = cumulatively terser directives. An unknown value is treated as balanced. |
compression.output_steering.effort_routing.enabled | boolean | false | When true and base_effort is pinned, downgrade effort on mechanical continuation turns (off-host loop only). |
compression.output_steering.effort_routing.base_effort | string | null | null | Effort tier for new-ask / error-recovery turns, sent as reasoning_effort. null = routing inert, field never sent. Set a concrete tier (low/medium/high/xhigh/max) only if your openai-compat provider supports reasoning effort. |
compression.output_steering.effort_routing.mechanical_effort | string | "low" | Target effort floor for mechanical turns. Only applied when it is lower than base_effort; never an upgrade. |
compression.cache_align.* (Prompt-cache alignment, proxy path)
Deterministic prompt-cache alignment for the native claude -p path only (it acts on the Anthropic /v1/messages body the elision proxy already sees; off-host openai-compat agents have no Anthropic tools/system prefix and are unaffected). Two levers in lib/cache-align.cjs, both idempotent:
- Detect (never mutates) — scans the system prompt for cache-volatile tokens (UUID, ISO-8601 datetime, JWT, long hex hash) and logs one warning per kind, so when a prefix keeps missing the cache the cause is visible rather than mysterious.
- Normalize (mutates the request, opt-in) — gives the tool-definition prefix a stable byte layout (tools sorted by name, schema object-keys sorted recursively; array element order is preserved) and adds exactly one
cache_control: {type: "ephemeral"}breakpoint on the last tool. Customer-placement-wins: if the caller already set anycache_controlonsystemortools, nothing is reordered or added — a deliberate cache layout is never disturbed.
Unlike the rest of compression.* (which keeps the prefix byte-identical), this lever deliberately rewrites the tool prefix to make it canonical and cacheable, which is why it is a separate opt-in. It only engages when the proxy path (compression.proxy.enabled) is in use.
| Key | Type | Default | Effect |
|---|---|---|---|
compression.cache_align.enabled | boolean | false | When true (and compression.enabled), normalize the tool prefix and add a cache breakpoint on proxied /v1/messages requests, and log volatile-token warnings. Default off. |
security.* (In-Session Security Review, ADR-0020)
Always-on, non-blocking in-session security layer; see In-Session Security Review and ADR-0020. Built-in checks cannot be disabled individually — only whole layers or the feature. Custom rules and guidance are additive. Toggling takes effect at runtime; no reinstall needed.
| Key | Type | Default | Effect |
|---|---|---|---|
security.enabled | boolean | true | Master switch for the whole feature. false = every hook no-ops silently. |
security.scan_on_write | boolean | true | Layer 1 — deterministic per-edit pattern scan (PostToolUse on Edit/Write/MultiEdit/NotebookEdit). No model call. |
security.review_on_stop | boolean | true | Layer 2 — background semantic review of the turn-diff at Stop, surfaced as a non-blocking follow-up. |
security.review_on_commit | boolean | true | Layer 3 — deeper background review on the agent's own git commit/git push (PostToolUse on Bash). |
security.custom_rules_path | string | null | null | JSON file of additional per-edit patterns (rule_name, reminder, regex/substrings, paths/exclude_paths). Additive; ≤50 rules; catastrophic-backtracking regexes skipped. |
security.guidance_path | string | null | null | Markdown "what to watch for" file injected into the model-backed reviews. Additive. |
security.review_timeout_ms | integer | 180000 | Per-review subprocess timeout for the independent reviewer spawn. |
security.max_stop_reviews_in_a_row | integer | 3 | Consecutive Stop surfacings before yielding back to the user. |
security.max_commit_reviews_per_hour | integer | 20 | Rolling-hour cap on commit/push reviews. |
security.max_files_per_review | integer | 30 | Max changed files reviewed per turn. |
conformance.* (Requirements awareness)
| Key | Type | Default | Effect |
|---|---|---|---|
conformance.inject_criteria | boolean | true | When true, /np:execute-phase injects the milestone success_criteria (the acceptance target, intent-level per ADR-0019) into the np-executor spawn prompt, so the executor writes against the requirements from round 1 instead of only against the verify command. Set to false to keep the executor prompt criteria-free (the post-task np-critic-acceptance review is unaffected either way). |
auto_log_learning
Top-level boolean. Default true. When true, lib/nubosloop.cjs::autoLogLearning persists the merged Researcher-Schwarm consensus + the Executor's final diff to the learnings store on every commit. _runCommit skips the auto-log when (a) cache_hit: true (the cached pattern is already in the store), (b) --learning-pattern is a sentinel placeholder (<...>), or (c) --learning-pattern is empty.
learnings.* (Stop-hook continuous-learning capture, ADR-0010)
Distinct from auto_log_learning (which captures inside the /np:execute-phase loop): this is session-level auto-capture for any work — including ad-hoc edits outside a milestone. On Claude Code, the np-learnings-hook fires on session Stop; np-tools.cjs learnings capture then (rate-limited) spawns the read-only np-learnings-extractor headlessly over the turn's diff, and folds the atomic {pattern, outcome} learnings it returns into the same store learning-match reads at execute-phase Round 1. A companion hook on UserPromptSubmit resets the consecutive-stop streak. The extractor is tier: haiku — a cheap observer. On by default, rate-limited so the LLM cost stays bounded.
| Key | Type | Default | Effect |
|---|---|---|---|
learnings.auto_capture | boolean | true | Master switch for Stop-hook capture. false disables it entirely (the hook returns immediately, no spawn, no cost). |
learnings.max_captures_per_hour | integer | 10 | Sliding per-hour cap per session. Once hit, further Stops skip capture until the window clears — the primary cost guard. |
learnings.max_in_a_row | integer | 3 | Consecutive-Stop cap. Prevents a tight edit/stop loop from firing an extraction every Stop; reset by the next UserPromptSubmit. |
learnings.timeout_ms | integer | 120000 | Headless extractor timeout (ms). |
learnings.max_files | integer | 30 | Cap on changed files fed to the extractor; the diff itself is byte-capped. |
The hooks install with npx nubos-pilot (the installer runs install-hooks --which all). To remove just these, re-run hook install without the learnings group, or set auto_capture: false to neutralise them in place.
Reading values
Programmatically, from inside a workflow:
bash
node np-tools.cjs config-get model_profile
node np-tools.cjs config-get workflow.commit_artifacts
node np-tools.cjs config-get workflow.research_tools.WebFetchDotted key paths are resolved by bin/np-tools/config.cjs. Forbidden segments (anything not matching /^[a-zA-Z0-9_-]+$/ or with a __proto__ lookalike) raise config-forbidden-key.
Writing values
There is no dedicated config-set command. To change a value:
- Edit
.nubos-pilot/config.jsondirectly with your editor of choice. - Or run
npx nubos-pilot updateand re-answer the install questions.
The file is small enough that hand-editing is the supported path. Workflows do not mutate it during normal operation; keeping it human-owned is part of the three-trees invariant.
Environment variable overrides
A handful of values can be overridden by env vars at workflow-spawn time, useful for CI:
| Env var | Overrides | Notes |
|---|---|---|
CLAUDECODE | workflow.text_mode | Set automatically by Claude Code; affects prompt rendering. |
NP_TOOLS_WEBFETCH | workflow.research_tools.WebFetch | 0/1/true/false. Forces researcher probe on/off. |
NP_TOOLS_CONTEXT7 | workflow.research_tools.Context7 | Same shape. |
NUBOS_PILOT_CLAUDE_BIN | binary path used by spawn-headless | When spawn.headless.enabled=true, overrides the claude binary the subprocess invokes. Useful for split-install or alternate CLI versions. |
Env vars win over config.json; config.json wins over compiled defaults.
Validation
bin/np-tools/config.cjs reads the file with JSON.parse on every call; there is no schema validator. Malformed JSON throws config-parse-error; an unknown dotted-key segment returns null (not an error). New keys can be added freely, and old keys can be left in place without runtime impact.
