Configuration

.nubos-pilot/config.json is the single source of truth for runtime, model selection, and workflow toggles. This page lists every key, its default, who reads it, and how to change it.

Where the file lives

Path	Created by	Purpose
`.nubos-pilot/config.json`	`npx nubos-pilot` install interview	Project-scoped configuration; committed to git

The installer writes the initial file. After install, edit it directly or via npx nubos-pilot update (which preserves answers and refreshes the payload).

Default shape

json

{
  "runtime": "claude",
  "runtimes": ["claude"],
  "scope": "local",
  "model_profile": "frontier",
  "response_language": "en",
  "workflow": {
    "commit_docs": true,
    "commit_artifacts": true,
    "worktree_isolation": false,
    "tier_routing": false,
    "research_tools": {
      "WebFetch": true,
      "Context7": true
    }
  },
  "agents": {
    "parallelization": true,
    "research": true,
    "plan_checker": true,
    "verifier": true,
    "architect": true,
    "test_writer": true,
    "economy": "ultra"
  },
  "loop": { "maxRounds": 3, "verify_runs": 1 },
  "swarm": {
    "research": { "k": 3, "threshold": 0.9, "minOccurrence": 3 },
    "critic":   { "style_tier": "haiku", "tests_tier": "sonnet", "acceptance_tier": "sonnet", "economy_tier": "haiku" },
    "knowledge_adapter": "local"
  },
  "spawn": {
    "headless": {
      "enabled": false,
      "agents": ["np-critic", "np-researcher"],
      "timeout_ms": 600000,
      "fallback_on_error": true
    }
  },
  "compression": {
    "enabled": false,
    "min_block_bytes": 2048,
    "verify_max_bytes": 2000,
    "elision": { "enabled": true, "ttl_ms": 1800000 },
    "proxy": { "enabled": false },
    "output_steering": {
      "enabled": false,
      "verbosity_profile": "balanced",
      "effort_routing": { "enabled": false, "base_effort": null, "mechanical_effort": "low" }
    },
    "cache_align": { "enabled": false }
  },
  "security": {
    "enabled": true,
    "scan_on_write": true,
    "review_on_stop": true,
    "review_on_commit": true,
    "custom_rules_path": null,
    "guidance_path": null,
    "review_timeout_ms": 180000,
    "max_stop_reviews_in_a_row": 3,
    "max_commit_reviews_per_hour": 20,
    "max_files_per_review": 30
  },
  "conformance": { "inject_criteria": true },
  "learnings": {
    "auto_capture": true,
    "max_captures_per_hour": 10,
    "max_in_a_row": 3,
    "timeout_ms": 120000,
    "max_files": 30
  },
  "auto_log_learning": true
}

Authoritative defaults live in lib/config-defaults.cjs. Keys absent from the file fall back to the values shown above, so workflows never crash on a missing key.

Top-level keys

Key	Type	Default	Who reads it	Effect
`runtime`	string	from install detect	`lib/runtime/index.cjs::detect`	Active host CLI adapter. One of the 14 ids in `KNOWN_RUNTIMES`.
`runtimes`	string[]	`[runtime]`	`bin/install.js`	List of runtimes to install for in multi-runtime mode (`--agents` flag).
`runtime_source`	string	`config` \| `env` \| `default`	`lib/runtime/index.cjs`	Diagnostic — which detection path won. Read-only after install.
`scope`	`local` \| `global`	`local`	`bin/install.js`	`local` writes payload into project; `global` writes to runtime's home directory.
`model_profile`	enum	`frontier`	`np-tools.cjs resolve-model`	Tier → model resolution. See Tier × Profile matrix.
`model_providers`	object	(unset → claude-native)	`lib/model-providers.cjs`, `np-tools.cjs resolve-model`	Optional provider definitions for model-agnostic dispatch. See § `model_providers.` and `agent_routing.` and ADR-0021.
`agent_routing`	object	(unset)	`lib/model-providers.cjs`, `np-tools.cjs resolve-model`	Optional per-agent provider/model routing. See § `model_providers.` and `agent_routing.`.
`response_language`	ISO-639 string	`en`	`lib/language.cjs`, every workflow body	Language of workflow prose, prompts, dashboard labels, stats markdown.

`workflow.*`

Key	Type	Default	Who reads it	Effect
`workflow.commit_docs`	boolean	`true`	install / re-install path	Whether the installer commits managed-block updates to `CLAUDE.md` / `AGENTS.md` / `GEMINI.md`.
`workflow.commit_artifacts`	boolean	`true`	`lib/commit-policy.cjs`	Whether non-task artifacts (CONTEXT, PLAN, ROADMAP, etc.) are committed by their producing workflow. Set to `false` to keep planning artifacts out of git. See ADR-0004.
`workflow.worktree_isolation`	boolean	`false`	`lib/worktree.cjs::worktreeIsolationEnabled`	Opt-in per-slice git worktree isolation during `/np:execute-phase`. See ADR-0008.
`workflow.tier_routing`	boolean	`false`	`/np:execute-phase` spawn step	Opt-in cost-aware model routing. See § `workflow.tier_routing` below.
`workflow.text_mode`	boolean	(unset)	`lib/text-mode.cjs`	Forces plain-text prompt rendering. When unset, falls back to `process.env.CLAUDECODE`. Set to `true` for non-TTY shells, `false` to override an inherited `CLAUDECODE`.
`workflow.research_tools.WebFetch`	boolean	`true`	`np-researcher` startup probe	Whether the researcher tries `WebFetch`. Disabling forces local-only research. Overridden by env `NP_TOOLS_WEBFETCH`.
`workflow.research_tools.Context7`	boolean	`true`	`np-researcher` startup probe	Whether the researcher tries Context7 MCP. Overridden by env `NP_TOOLS_CONTEXT7`.

`workflow.tier_routing`

Cost-aware executor model routing. Off by default — every task's Round-1 executor runs at the frontier profile (the strongest model), regardless of the task's planned tier. This preserves the historical behaviour: maximum quality on every task.

When set to true, the Round-1 executor instead resolves its model from the task's planner-assigned tier under the project's configured model_profile (default balanced):

Task `tier`	Model under `balanced`
`haiku` (trivial — single-file docs/rename/format)	`haiku`
`sonnet` (standard — ordinary single-concern work)	`sonnet`
`opus` (large — many files, or security/data-sensitive)	`opus`

The planner sets each task's tier; it makes that call evidence-based via np-tools.cjs derive-tier (file count + security/data-sensitivity signals — never implementation detail, per ADR-0013). Round-2+ retries (the np-build-fixer) always stay at frontier: fixing a failing task wants the strongest model regardless of routing. Because a task only drops below frontier when routing is explicitly enabled, a mis-classified tier is never a correctness risk — it only trades cost against the chosen model_profile.

bash

node np-tools.cjs config-get workflow.tier_routing
node np-tools.cjs derive-tier --files "app/Auth.php" --name "add login throttling"   # → opus

`model_providers.` and `agent_routing.`

Both keys are optional. Absent, every agent resolves to the implicit claude-native default and resolution is byte-for-byte unchanged. Together they make the workspace model-agnostic: each agent can run on a different provider — Claude, a hosted OpenAI-compatible API (OpenAI, xAI/Grok), or a local model (Ollama, vLLM, LM Studio). See ADR-0021.

Off-host dispatch (ADR-0021, Accepted)

An agent runs off-host automatically when its agent_routing resolves to an openai-compat provider (Ollama, OpenAI, Grok, vLLM). You can also run any off-host agent ad-hoc with np-tools spawn-offhost --agent <name> --task <…> [--task-id M<NNN>-S<NNN>-T<NNNN>].

What an agent writes decides whether it needs a worktree:

Live-code editors (np-executor, np-build-fixer in /np:execute-phase) edit the working tree, so they require workflow.worktree_isolation=true. Model-driven edits stay confined to the per-wave slice worktree and are ff-merged back, and --allow-bash is honoured only there.
Artefact writers (np-planner, np-plan-checker in /np:plan-phase) only write planning artefacts under .nubos-pilot/, inside the repo cwd, never live code. They run with the default cwd (Read/Grep/Glob over the whole repo, Write confined to cwd), no worktree and no Bash. There is no emit-and-persist contract; they write their files exactly as the native agent does.
Read-only emitters (np-critic, np-verifier) run --read-only and return their result object as the final message, which the orchestrator persists.

Guards: Rule-9-audited agents (np-executor, np-researcher, np-build-fixer) need a canonical --task-id so the search-evidence ledger applies (a native knowledge-search tool is injected to satisfy the search bar); without one they are refused (offhost-audited-agent-unsupported). --allow-bash works only inside a slice worktree, otherwise offhost-bash-requires-sandbox. The native claude spawn path can't run a non-Claude model, so resolve-model redirects to spawn-offhost (off-host-not-on-native-path); use resolve-model <agent> --kind to detect routing without the refusal (it prints native or openai-compat). Off-host metrics rows leave tokens_in/out null, like every non-Claude runtime ([D-09]).

An openai-compat provider has to clear two bars, or the run fails loudly instead of producing garbage:

A GET {base_url}/models endpoint. Preflight calls it before every off-host spawn and refuses (preflight-failed) if the endpoint is unreachable or doesn't list the requested model, with an actionable hint such as run: ollama pull <model>. A gateway that exposes /chat/completions but not /models can't be used off-host.
OpenAI function/tool-calling support. The agent loop drives Read/Edit/Bash as tool-calls. A model that won't emit tool-calls just returns plain text, which the loop treats as a final answer, so an off-host editor would silently make no edits. That isn't fatal and can't be preflighted cheaply, so it's caught afterwards: the dispatch envelope carries a capability object ({toolsAdvertised, toolCalls, mutating, ok}), and spawn-offhost prints a loud stderr hint when tools were advertised but none were called (loudest for mutating agents). Keep weak, non-tool-calling models off the editor agents.

Every workflow agent-spawn has an off-host branch: execute-phase (executor, build-fixer, researcher, critic, task-architect, test-writer), plan-phase (planner, plan-checker), discuss-phase (sc-extractor), research-phase (researcher, reconciler), architect-phase (architect, researcher, critic), validate-phase (nyquist-auditor), verify-work (verifier), scan-codebase (codebase-documenter). scripts/check-offhost-coverage.cjs (a test in tests/) enforces this: a new spawn site without an off-host branch fails the suite.

np-security-reviewer and np-learnings-extractor spawn through spawn-headless (a claude -p subprocess in lib code) rather than a workflow orchestrator spawn, so their off-host routing lives inside spawn-headless itself. When the agent routes to an openai-compat provider it calls dispatchOffHost instead of claude -p and writes the same {result} envelope, so review.cjs and extract.cjs parse it unchanged. (spawn-headless.run is now async; the CLI dispatcher and both in-process callers await it.) The default stays native, and keeping the security reviewer on Claude is still the recommendation, since a weak local model lowers review quality.

One capability bound is worth calling out: the off-host toolset has no WebFetch or context7, so np-researcher can only do offline (knowledge-search) research off-host. Route it native for online mode. The audited researcher uses the synthetic canonical task-id ${MILESTONE_ID}-S000-T0000 (milestone-level, no slice or task) for its Rule-9 ledger.

tier stays the agent's intrinsic difficulty property; agent_routing is the orthogonal where-it-runs layer.

jsonc

{
  "model_providers": {
    "default": "claude",                       // provider used when no routing entry matches
    "claude": { "kind": "native" },            // delegate to the host runtime (Claude Code, etc.)
    "openai": {
      "kind": "openai-compat",
      "base_url": "https://api.openai.com/v1",
      "api_key_env": "OPENAI_API_KEY",
      "models": { "haiku": "gpt-4o-mini", "sonnet": "gpt-4o", "opus": "gpt-4.1" }
    },
    "ollama": {
      "kind": "openai-compat",
      "base_url": "http://localhost:11434/v1",
      "models": { "haiku": "qwen2.5-coder:7b", "sonnet": "qwen2.5-coder:32b", "opus": "qwen2.5-coder:32b" }
    }
  },
  "agent_routing": {
    "np-planner":  { "provider": "claude", "model": "claude-opus-4-7" },
    "np-critic*":  { "provider": "openai", "model": "gpt-4o" },
    "np-executor": { "provider": "ollama", "model": "qwen2.5-coder:32b" }
  }
}

`model_providers.<name>`

Field	Type	Required	Effect
`default`	string	no	Provider name used when no `agent_routing` entry matches. Absent ⇒ `claude`.
`<name>.kind`	`native` \| `openai-compat`	yes	`native` delegates to the host runtime (the host picks the model). `openai-compat` engages nubos-pilot's own dispatch layer over a single `fetch`-based client.
`<name>.base_url`	string	for `openai-compat`	The `/v1` endpoint. Ollama's is `http://localhost:11434/v1`. The only locator the dispatch layer needs.
`<name>.api_key_env`	string	no	Name of the env var holding the API key. Omit for keyless local servers (Ollama).
`<name>.models`	object	no	Tier → model id table, used as the fallback when an `agent_routing` entry pins no explicit `model`.

`agent_routing.<agent-or-glob>`

Keys are exact agent names (np-executor) or a single trailing-* prefix glob (np-critic*). Exact beats glob; among globs the longest prefix wins. Bare tiers never route — they always resolve to the default provider.

Field	Type	Required	Effect
`provider`	string	yes	Must reference a defined `model_providers.<name>` — an undefined reference is a hard error (`provider-undefined`), never a silent fallback to Claude.
`model`	string	no	Pins an exact model. Omitted ⇒ the provider's `models[tier]` is used (the agent's `tier` still drives size). For a `native` provider, a pin forces that exact host model id.

Resolution precedence: agent_routing[<exact>] → agent_routing[<glob>] → model_providers.default (tier-mapped). An openai-compat provider with neither a pinned model nor a models[tier] entry for the agent's tier is a hard error (provider-model-unresolved).

bash

node np-tools.cjs resolve-model np-executor          # prints the resolved model for the executor

Agent skills

config.json may carry an agent_skills map that lists additional skill files an agent should load at spawn time:

json

{
  "agent_skills": {
    "np-researcher": ["nubos-research-skill"],
    "np-executor":   ["nubos-rails-skill", "nubos-typescript-skill"]
  }
}

The map is read by lib/agents.cjs::getAgentSkills(name) and surfaced through np-tools.cjs agent-skills <name>. Workflows that spawn the agent embed the resolved skill list into the spawn prompt; the agent loads each skill before performing the task.

Skills are runtime-managed: each entry is a string the host CLI can resolve to a skill file (Claude Code's skill-loader, Codex equivalent, etc.). nubos-pilot does not validate the strings; unknown skill ids are passed through and rejected by the runtime if invalid.

Default is no extra skills. Add the map only when you have project-specific skills you want every spawn of a given agent to load.

`agents.*`

Optional kill-switches for parts of the workflow pipeline. Default is "everything on".

Key	Type	Default	Effect when `false`
`agents.parallelization`	boolean	`true`	`/np:execute-phase` runs tasks within a slice serially instead of parallel.
`agents.research`	boolean	`true`	`/np:plan-phase` skips the research gate prompt.
`agents.plan_checker`	boolean	`true`	`/np:plan-phase` ships planner output without the adversarial review loop.
`agents.verifier`	boolean	`true`	`/np:verify-work` becomes a no-op stub.
`agents.architect`	boolean	`true`	The per-task architect step (`np-task-architect`) is skipped in the Nubosloop — the executor gets no up-front structural spec.
`agents.test_writer`	boolean	`true`	The per-task TDD step (`np-test-writer`) is skipped — no tests are written before the executor runs.

These are escape hatches for constrained environments. Leaving them on is the recommended default.

`agents.architect` & `agents.test_writer` (per-task Nubosloop steps)

Two round-1 steps that run inside /np:execute-phase, between the researcher swarm and the executor, in this order: architect → test-writer → executor.

agents.architect spawns np-task-architect (read-only). It reads the task plan, RULES.md (Conventions), and any M<NNN>-ARCHITECTURE.md, then emits an ephemeral per-task structural spec (responsibilities, boundaries, paradigms, required test surfaces) that is injected into the test-writer and executor prompts. It writes no files and is never committed — so it cannot trip plan-lint. This is the per-task counterpart to the milestone-level /np:architect-phase; the milestone architect is reachable from planning via /np:plan-phase <N> --architect.
agents.test_writer spawns np-test-writer after the architect. It writes real, valid test files for the required surfaces before production code exists (TDD). The tests may start red; the executor makes them green and is told not to delete, skip, or weaken them. The np-critic-tests axis re-audits afterwards for any skipped or vacuous assertions.

Both default to true and are backfilled to true on install and update when the key is absent (an explicit false is a deliberate choice and never overwritten — same conservative rule as agents.economy). Each step has a Layer-C SKIP-GUARD (loop-post-architect-missing-spawn-audit / loop-post-test-writer-missing-spawn-audit) so the orchestrator cannot silently skip a spawn it claims to have run. See ADR-0023.

`agents.economy` (graduated Economy axis)

The Economy axis (Ponytail-style anti-over-engineering) is graduated, not a binary toggle. One enum dials two mechanisms: the prevention ladder (the climb-the-ladder discipline injected into np-executor before it writes) and the in-loop critic (the np-critic-economy.md audit axis that reviews the diff after).

`agents.economy`	Prevention ladder	In-loop critic	Notes
`off`	off	off	No economy pressure at all.
`lite`	on	off	Prevention-first: the executor climbs the ladder, but no critic round is spent.
`full`	on	on	Adds the Economy audit axis to the single `np-critic` spawn — flags over-engineering, stdlib-reinvention, native-duplication, and shrinkable logic (routed to the executor for simplification) on top of the always-on style/tests/acceptance axes.
`ultra` (install default)	on	on	As `full`, but the critic lowers its `shrinkable` bar, hunts reuse repo-wide, and flags single-use abstractions harder — more rounds for a leaner result.

Two defaults, on purpose. npx nubos-pilot (install and update) writes agents.economy: "ultra" into the config — into a fresh config on install, and backfilled into an existing config on update only when the key is absent (an explicit economy, or a legacy economy_critic, is treated as a deliberate choice and never overwritten). If the key is missing entirely at runtime, the resolved fallback is lite — a conservative safety net, not the shipped default. So a normal install/update runs ultra; only a hand-stripped config falls back to lite.

The critic is a deliberate counter-force to the Completeness doctrine: completeness always wins — in every mode the economy axis never flags a test, validation, error path, or security control as removable. Dial it down (full/lite/off) if the extra ultra rounds cost more than they save on your project. The same rubric is available on demand, without the loop, via /np:simplify-review. The resolved mode is exposed by the economy-mode CLI command (economy-mode --json prints the {mode, prevention, critic, ultra} gate flags the workflow reads).

Legacy agents.economy_critic (deprecated). The old boolean still validates and is honoured when agents.economy is absent: true → full, false → lite. Prefer the agents.economy enum; the bool will be removed in a future version.

`loop.*`

Per-task Nubosloop cap — see ADR-0010.

Key	Type	Default	Effect
`loop.maxRounds`	integer	`3`	Maximum rounds before the loop transitions a task to `stuck`. Range `[1, 100]` (clamped). The operator's "Weitermachen +5 Runden" stuck-dialog choice persists `nubosloop.max_rounds_override` on the checkpoint and overrides this value for the surviving task; `/np:resume-work` honours the override.
`loop.verify_runs`	integer	`1`	pass@k reliability. How many times `/np:execute-phase` runs a task's `<verify>` command per round. With `1` (default) the verify runs once, as before. With `k > 1` the orchestrator runs it `k` times and folds the exit codes via `np-tools.cjs verify-reliability`: a task goes green only when all `k` runs pass (pass^k). A flaky task (some pass, some fail) aggregates to red and flows through the normal `np-build-fixer` path, with a `FLAKY: …` line appended to the verify log so the fixer sees why. Raise this for suites with nondeterminism (timing, ordering, network) you want surfaced rather than tolerated.

`swarm.*`

Researcher-Schwarm and Critic configuration.

Key	Type	Default	Effect
`swarm.research.k`	integer	`3`	Number of parallel `np-researcher` spawns per Researcher-Schwarm round. Layer-C requires k audit entries per round (k-of-k gate, `loop-post-researcher-missing-spawn-audit`).
`swarm.research.threshold`	float `[0,1]`	`0.9`	Jaccard-similarity threshold for the pre-flight cache lookup. Hits at or above this similarity bypass the swarm.
`swarm.research.minOccurrence`	integer	`3`	Minimum prior-occurrence count required for a learning to count as a cache hit.
`swarm.research.min_agreement_score`	float `[0,1]`	`0.5`	Disagreement-gate floor (ADR-0018 Step 5.7) for the reconciler-reported `agreement_score`. Below this, `/np:research-phase` triggers an askuser (re-spawn / continue / manual). Workflow-level override: `researcher-reconcile gate <N> --min-agreement-score N`.
`swarm.research.max_contested`	integer	`2`	Disagreement-gate ceiling (ADR-0018) for the reconciler-reported `contested_count`. Workflow-level override: `researcher-reconcile gate <N> --max-contested N`.
`swarm.critic.style_tier`	enum	`"haiku"`	Tier the style axis runs at. Style work is mechanical, so haiku is the cost-optimal choice.
`swarm.critic.tests_tier`	enum	`"sonnet"`	Tier the tests axis runs at.
`swarm.critic.acceptance_tier`	enum	`"sonnet"`	Tier the acceptance axis runs at. Promote to `"opus"` for milestones with subtle acceptance criteria.
`swarm.critic.economy_tier`	enum	`"haiku"`	Tier the Economy axis runs at when `agents.economy` is `full` or `ultra`. The economy ladder is mechanical pattern-matching, so haiku is the cost-optimal choice. Inert at `off`/`lite`.
`swarm.knowledge_adapter`	enum	`"local"`	`"local"` reads/writes `.nubos-pilot/knowledge/learnings.json` through `lib/knowledge-adapter.cjs`. The only adapter shipped.

The default for swarm.critic is { style_tier: "haiku", tests_tier: "sonnet", acceptance_tier: "sonnet", economy_tier: "haiku" } (lib/config-defaults.cjs). Under the Single-Critic Revision (ADR-0010 §2026-05-05) a single np-critic spawn audits the always-on style/tests/acceptance axes plus the opt-in economy axis (agents.economy ∈ {full, ultra}), and each axis tier still resolves through resolve-model.

`spawn.headless.*` (Cost Layer L6)

Opt-in headless-subprocess mode for critic and researcher spawns; see ADR-0010 §L6. The orchestrator routes the named agents through bin/np-tools/spawn-headless.cjs, which shells out to claude -p --output-format json. The subprocess conversation lives entirely outside the parent session; only the final-message JSON is captured to disk.

Key	Type	Default	Effect
`spawn.headless.enabled`	boolean	`false`	Master switch. `false` = native Agent tool path (no behaviour change for existing installs). `true` = headless path for the agents listed below.
`spawn.headless.agents`	string[]	`["np-critic", "np-researcher"]`	Which agents are eligible for headless dispatch. `np-executor` and `np-build-fixer` are intentionally excluded — they mutate the working tree, and headless file mutations would not surface through the parent runtime's diff/edit telemetry, breaking the Layer-A commit-task gate. Extending this list to executor-class agents is a Layer-A risk.
`spawn.headless.timeout_ms`	integer	`600000` (10 min)	Per-spawn subprocess timeout. Below 1000 = `spawn-headless-invalid-timeout`.
`spawn.headless.fallback_on_error`	boolean	`true`	When `true`, a failing `claude -p` spawn (binary missing, auth-failure, timeout, non-zero exit) falls back to the native Agent tool. The fallback is stamped on the checkpoint (`nubosloop.spawn_headless_fallbacks[]`) so dashboards can count fallback rate.

The claude CLI must be on $PATH and authenticated independently from the parent session. The NUBOS_PILOT_CLAUDE_BIN env var overrides the binary path for split-install scenarios.

Trust-Layer compatibility. loop-audit-tool-use --agent <name> stamps are identical in both paths. Going headless does not bypass the Layer-C audit: the orchestrator MUST call loop-audit-tool-use after spawn-headless returns, exactly as it does after a native Agent-tool spawn.

`compression.*` (Context Compression + Elision Store)

Opt-in, deterministic context compression for the content nubos-pilot injects before it reaches a model. Large fenced blocks in a headless spawn prompt (bin/np-tools/spawn-headless.cjs) and oversized verify logs (bin/np-tools/loop-run-round.cjs) are crushed by rule-based reducers in lib/compress.cjs (JSON arrays, build/test logs, grep output, unified diffs, and source code — signatures/structure kept, statement bodies elided), each preserving error and outlier lines. Dropped originals are cached under .nubos-pilot/elision/<hash>.json and referenced inline by a ⟦elided:<hash>⟧ marker, so an agent can recover the full text on demand with np-tools.cjs elision-get <hash>. The elision store makes every lossy drop reversible: the worst case is an uncompressed answer, never a wrong one.

Where the compression bites depends on how the agent runs:

Off-host agents (openai-compat providers via lib/runtime/agent-loop.cjs) run their agentic loop inside nubos-pilot. Each tool result is crushed before it enters the message history that is re-sent every turn, so the savings compound across turns. These agents are also handed a context-expand tool (injected automatically while compression is on) so the model can pull back a ⟦elided:<hash>⟧ original itself mid-loop.
Native claude -p spawns run their loop inside the Claude CLI, invisible to nubos-pilot. To reach that traffic, compression.proxy.enabled starts a localhost proxy (lib/elision-proxy.cjs, forked into its own process), points the child's ANTHROPIC_BASE_URL at it, and crushes oversized tool_result blocks in each /v1/messages request before forwarding upstream. Only tool_result blocks are touched — system, tool definitions and ordinary text are left byte-identical, and because the rewrite is deterministic the compressed prefix stays cache-stable, so Anthropic prompt-caching is preserved.

Key	Type	Default	Effect
`compression.enabled`	boolean	`false`	Master switch. `false` = no behaviour change (verify logs fall back to dumb tail-`verify_max_bytes` truncation; prompts and tool results pass through untouched). `true` = crush injected/tool content and emit elision markers.
`compression.min_block_bytes`	integer	`2048`	A block smaller than this is left byte-identical. Below the threshold the savings don't justify the marker overhead.
`compression.verify_max_bytes`	integer	`2000`	Byte budget for the critic-facing verify-output excerpt. Error/stack lines are kept first, then the tail fills the remaining budget.
`compression.elision.enabled`	boolean	`true`	When `true`, dropped originals are cached and retrievable via `elision-get`. When `false`, blocks are still crushed but no marker/cache is written (irreversible — not recommended).
`compression.elision.ttl_ms`	integer	`1800000` (30 min)	Time-to-live for a cached original. After it elapses, `elision-get` reports `expired`; `lib/elision.cjs::prune` removes lapsed entries lazily.
`compression.proxy.enabled`	boolean	`false`	When `true` (and `compression.enabled` is on), native `claude -p` spawns are routed through the localhost elision proxy so their in-loop `tool_result` traffic is compressed. Requires no upstream change — it forwards to the original `ANTHROPIC_BASE_URL` (or `api.anthropic.com`). Default off.

When compression.enabled and compression.elision.enabled are both on, the Researcher-Schwarm also dedups its identical k-spawn input: it is cached once and each spawn spec carries an input_ref hash instead of re-inlining the payload k times.

Input-block coverage

The rule-based reducers in lib/compress.cjs also handle two brace-light block types beyond those listed above, both fully reversible via the elision store: prose (long natural-language tool output — head/tail sentences plus any directive/warning sentence kept, the middle sampled) and brace-light source such as Python (crushed by indent depth — module-level lines, block headers and raise/assert kept, deep statement bodies elided). Brace counting for C-family code masks string literals and //…/* */ comments first, so braces inside them never distort the structure detection. These are governed by the same compression.enabled master switch — no separate toggle.

`compression.output_steering.*` (Output-token reduction)

Deterministic output-token reduction — no model, no dependency. Two independent levers, both safe with the prompt cache:

Verbosity steering appends a fixed terseness directive to the end of the system prompt (wrapped in a <nubos_output_shaping> block, idempotent — re-enrichment never stacks). Appending after the cached prefix leaves prompt-caching intact. On the native claude -p path the directive is appended to the agent instructions instead; on off-host agents it enriches the system message each loop.
Effort routing classifies each off-host turn structurally — a fresh user ask or a turn recovering from a tool error keeps full effort (base_effort), while a turn that only continues from clean tool_results is "mechanical" and has its effort downgraded to mechanical_effort. The effort reaches the provider as the OpenAI-compat reasoning_effort field, which is only ever sent when you pin a concrete base_effort — that pin is your affirmation the provider supports it. Without it the lever is inert and the field is never sent, so providers without effort support are unaffected. Single-shot native spawns have no multi-turn loop, so routing does not apply there.

Key	Type	Default	Effect
`compression.output_steering.enabled`	boolean	`false`	Master switch for both levers. Requires `compression.enabled`.
`compression.output_steering.verbosity_profile`	string	`"balanced"`	`balanced` = no steering (no-op). `concise` / `terse` / `minimal` = cumulatively terser directives. An unknown value is treated as `balanced`.
`compression.output_steering.effort_routing.enabled`	boolean	`false`	When `true` and `base_effort` is pinned, downgrade effort on mechanical continuation turns (off-host loop only).
`compression.output_steering.effort_routing.base_effort`	string \| null	`null`	Effort tier for new-ask / error-recovery turns, sent as `reasoning_effort`. `null` = routing inert, field never sent. Set a concrete tier (`low`/`medium`/`high`/`xhigh`/`max`) only if your openai-compat provider supports reasoning effort.
`compression.output_steering.effort_routing.mechanical_effort`	string	`"low"`	Target effort floor for mechanical turns. Only applied when it is lower than `base_effort`; never an upgrade.

`compression.cache_align.*` (Prompt-cache alignment, proxy path)

Deterministic prompt-cache alignment for the native claude -p path only (it acts on the Anthropic /v1/messages body the elision proxy already sees; off-host openai-compat agents have no Anthropic tools/system prefix and are unaffected). Two levers in lib/cache-align.cjs, both idempotent:

Detect (never mutates) — scans the system prompt for cache-volatile tokens (UUID, ISO-8601 datetime, JWT, long hex hash) and logs one warning per kind, so when a prefix keeps missing the cache the cause is visible rather than mysterious.
Normalize (mutates the request, opt-in) — gives the tool-definition prefix a stable byte layout (tools sorted by name, schema object-keys sorted recursively; array element order is preserved) and adds exactly one cache_control: {type: "ephemeral"} breakpoint on the last tool. Customer-placement-wins: if the caller already set any cache_control on system or tools, nothing is reordered or added — a deliberate cache layout is never disturbed.

Unlike the rest of compression.* (which keeps the prefix byte-identical), this lever deliberately rewrites the tool prefix to make it canonical and cacheable, which is why it is a separate opt-in. It only engages when the proxy path (compression.proxy.enabled) is in use.

Key	Type	Default	Effect
`compression.cache_align.enabled`	boolean	`false`	When `true` (and `compression.enabled`), normalize the tool prefix and add a cache breakpoint on proxied `/v1/messages` requests, and log volatile-token warnings. Default off.

`security.*` (In-Session Security Review, ADR-0020)

Always-on, non-blocking in-session security layer; see In-Session Security Review and ADR-0020. Built-in checks cannot be disabled individually — only whole layers or the feature. Custom rules and guidance are additive. Toggling takes effect at runtime; no reinstall needed.

Key	Type	Default	Effect
`security.enabled`	boolean	`true`	Master switch for the whole feature. `false` = every hook no-ops silently.
`security.scan_on_write`	boolean	`true`	Layer 1 — deterministic per-edit pattern scan (`PostToolUse` on `Edit`/`Write`/`MultiEdit`/`NotebookEdit`). No model call.
`security.review_on_stop`	boolean	`true`	Layer 2 — background semantic review of the turn-diff at `Stop`, surfaced as a non-blocking follow-up.
`security.review_on_commit`	boolean	`true`	Layer 3 — deeper background review on the agent's own `git commit`/`git push` (`PostToolUse` on `Bash`).
`security.custom_rules_path`	string \| `null`	`null`	JSON file of additional per-edit patterns (`rule_name`, `reminder`, `regex`/`substrings`, `paths`/`exclude_paths`). Additive; ≤50 rules; catastrophic-backtracking regexes skipped.
`security.guidance_path`	string \| `null`	`null`	Markdown "what to watch for" file injected into the model-backed reviews. Additive.
`security.review_timeout_ms`	integer	`180000`	Per-review subprocess timeout for the independent reviewer spawn.
`security.max_stop_reviews_in_a_row`	integer	`3`	Consecutive `Stop` surfacings before yielding back to the user.
`security.max_commit_reviews_per_hour`	integer	`20`	Rolling-hour cap on commit/push reviews.
`security.max_files_per_review`	integer	`30`	Max changed files reviewed per turn.

`conformance.*` (Requirements awareness)

Key	Type	Default	Effect
`conformance.inject_criteria`	boolean	`true`	When `true`, `/np:execute-phase` injects the milestone `success_criteria` (the acceptance target, intent-level per ADR-0019) into the `np-executor` spawn prompt, so the executor writes against the requirements from round 1 instead of only against the `verify` command. Set to `false` to keep the executor prompt criteria-free (the post-task `np-critic-acceptance` review is unaffected either way).

`auto_log_learning`

Top-level boolean. Default true. When true, lib/nubosloop.cjs::autoLogLearning persists the merged Researcher-Schwarm consensus + the Executor's final diff to the learnings store on every commit. _runCommit skips the auto-log when (a) cache_hit: true (the cached pattern is already in the store), (b) --learning-pattern is a sentinel placeholder (<...>), or (c) --learning-pattern is empty.

`learnings.*` (Stop-hook continuous-learning capture, ADR-0010)

Distinct from auto_log_learning (which captures inside the /np:execute-phase loop): this is session-level auto-capture for any work — including ad-hoc edits outside a milestone. On Claude Code, the np-learnings-hook fires on session Stop; np-tools.cjs learnings capture then (rate-limited) spawns the read-only np-learnings-extractor headlessly over the turn's diff, and folds the atomic {pattern, outcome} learnings it returns into the same store learning-match reads at execute-phase Round 1. A companion hook on UserPromptSubmit resets the consecutive-stop streak. The extractor is tier: haiku — a cheap observer. On by default, rate-limited so the LLM cost stays bounded.

Key	Type	Default	Effect
`learnings.auto_capture`	boolean	`true`	Master switch for Stop-hook capture. `false` disables it entirely (the hook returns immediately, no spawn, no cost).
`learnings.max_captures_per_hour`	integer	`10`	Sliding per-hour cap per session. Once hit, further Stops skip capture until the window clears — the primary cost guard.
`learnings.max_in_a_row`	integer	`3`	Consecutive-Stop cap. Prevents a tight edit/stop loop from firing an extraction every Stop; reset by the next `UserPromptSubmit`.
`learnings.timeout_ms`	integer	`120000`	Headless extractor timeout (ms).
`learnings.max_files`	integer	`30`	Cap on changed files fed to the extractor; the diff itself is byte-capped.

The hooks install with npx nubos-pilot (the installer runs install-hooks --which all). To remove just these, re-run hook install without the learnings group, or set auto_capture: false to neutralise them in place.

Reading values

Programmatically, from inside a workflow:

bash

node np-tools.cjs config-get model_profile
node np-tools.cjs config-get workflow.commit_artifacts
node np-tools.cjs config-get workflow.research_tools.WebFetch

Dotted key paths are resolved by bin/np-tools/config.cjs. Forbidden segments (anything not matching /^[a-zA-Z0-9_-]+$/ or with a __proto__ lookalike) raise config-forbidden-key.

Writing values

There is no dedicated config-set command. To change a value:

Edit .nubos-pilot/config.json directly with your editor of choice.
Or run npx nubos-pilot update and re-answer the install questions.

The file is small enough that hand-editing is the supported path. Workflows do not mutate it during normal operation; keeping it human-owned is part of the three-trees invariant.

Environment variable overrides

A handful of values can be overridden by env vars at workflow-spawn time, useful for CI:

Env var	Overrides	Notes
`CLAUDECODE`	`workflow.text_mode`	Set automatically by Claude Code; affects prompt rendering.
`NP_TOOLS_WEBFETCH`	`workflow.research_tools.WebFetch`	`0`/`1`/`true`/`false`. Forces researcher probe on/off.
`NP_TOOLS_CONTEXT7`	`workflow.research_tools.Context7`	Same shape.
`NUBOS_PILOT_CLAUDE_BIN`	binary path used by `spawn-headless`	When `spawn.headless.enabled=true`, overrides the `claude` binary the subprocess invokes. Useful for split-install or alternate CLI versions.

Env vars win over config.json; config.json wins over compiled defaults.

Validation

bin/np-tools/config.cjs reads the file with JSON.parse on every call; there is no schema validator. Malformed JSON throws config-parse-error; an unknown dotted-key segment returns null (not an error). New keys can be added freely, and old keys can be left in place without runtime impact.

Configuration ​

Where the file lives ​

Default shape ​

Top-level keys ​

workflow.* ​

workflow.tier_routing ​

model_providers.* and agent_routing.* ​

model_providers.<name> ​

agent_routing.<agent-or-glob> ​

Agent skills ​

agents.* ​

agents.architect & agents.test_writer (per-task Nubosloop steps) ​

agents.economy (graduated Economy axis) ​

loop.* ​

swarm.* ​

spawn.headless.* (Cost Layer L6) ​

compression.* (Context Compression + Elision Store) ​

Input-block coverage ​

compression.output_steering.* (Output-token reduction) ​

compression.cache_align.* (Prompt-cache alignment, proxy path) ​

security.* (In-Session Security Review, ADR-0020) ​

conformance.* (Requirements awareness) ​

auto_log_learning ​

learnings.* (Stop-hook continuous-learning capture, ADR-0010) ​

Reading values ​

Writing values ​

Environment variable overrides ​

Validation ​

Configuration

Where the file lives

Default shape

Top-level keys

`workflow.*`

`workflow.tier_routing`

`model_providers.` and `agent_routing.`

`model_providers.<name>`

`agent_routing.<agent-or-glob>`

Agent skills

`agents.*`

`agents.architect` & `agents.test_writer` (per-task Nubosloop steps)

`agents.economy` (graduated Economy axis)

`loop.*`

`swarm.*`

`spawn.headless.*` (Cost Layer L6)

`compression.*` (Context Compression + Elision Store)

Input-block coverage

`compression.output_steering.*` (Output-token reduction)

`compression.cache_align.*` (Prompt-cache alignment, proxy path)

`security.*` (In-Session Security Review, ADR-0020)

`conformance.*` (Requirements awareness)

`auto_log_learning`

`learnings.*` (Stop-hook continuous-learning capture, ADR-0010)

Reading values

Writing values

Environment variable overrides

Validation