np-critic

Attribute	Value
Tier	`sonnet`
Tools	`Read, Write, Bash, Grep, Glob`
Spawned by	`/np:execute-phase` — Critic step of the Nubosloop
Spawned with	alone (Single-Critic Revision, ADR-0010 §2026-05-05)
Reads	task plan, slice UAT, milestone CONTEXT, executor diff, verify output, the three core audit-surface modules (plus the economy module when `agents.economy` is `full`/`ultra`)
Writes	the full findings JSON to the orchestrator-supplied `<report_path>` only — never source

Role

np-critic is the per-task adversarial review for the Nubosloop. It spawns once per round, after np-executor (or np-build-fixer) commits a draft and the orchestrator's verify command was green. The critic audits the executor's diff against three orthogonal axes (code style, test coverage, and acceptance criteria), plus a fourth economy axis when the project opts into it, and emits a single structured findings JSON.

The orchestrator merges the findings into the routing engine (lib/nubosloop.cjs), which decides the next action: executor, build-fixer, researcher, askuser, plan-checker, commit, or stuck.

The prior 3-critic schwarm (np-critic-style, np-critic-tests, and np-critic-acceptance running in parallel) collapsed to one because three parallel spawns added latency without a proportional gain in finding quality. Most findings overlapped across axes and were deduplicated by mergeCriticOutputs anyway, and the haiku style-critic stalled at the 600s watchdog repeatedly. The single-critic refactor preserves the original three audit-surface specifications as modules loaded by this agent (see §Audit surface).

Audit surface

The critic loads its audit-surface modules via Read before it produces findings. They enumerate every category, severity rubric, and stop condition the routing engine expects. The first three load on every spawn. The fourth, np-critic-economy, loads only when the resolved Economy mode is full or ultra; at off or lite the orchestrator omits it from the <files_to_read> block and the critic produces no economy findings.

Module	Injected	What it covers
`np-critic-style`	always	Markers, dead code, dangling threads, lint-equivalents, comment & import hygiene
`np-critic-tests`	always	Missing tests, edge-case gaps, weak assertions, silenced failures, naming, non-determinism, verify-mismatch
`np-critic-acceptance`	always	Per-`success_criterion` verdict, locked-decision conformance, scope-creep, stuck-detection, infrastructure-mismatch
`np-critic-economy`	`full`/`ultra` only	Over-engineering, stdlib-reinvention, native-duplication, shrinkable logic — bounded so it never flags a test, validation, error path, or security control

These files carry module: true in their frontmatter and are NOT spawnable agents — lib/agents.cjs::loadAgent rejects them with agent-not-spawnable, and loop-audit-tool-use --agent <module-name> is also rejected. They are only ever loaded as <files_to_read> by np-critic.

Output schema — Verdict-Only Contract (ADR-0010 §L5)

The critic emits its audit in two artefacts: a full findings JSON written to disk, and a small envelope returned as the spawn's final message. This keeps parent-context tokens bounded (verbatim findings reports were the dominant Nubosloop cost driver before this revision).

Step 1 — write the full report to disk

The orchestrator passes a <report_path> value in the spawn prompt (typically ${TMPDIR}/nubos-pilot/critic-reports/critic-<task-id>-r<round>.json). The critic uses Write to emit:

json

{
  "critic": "critic",
  "task_id": "M001-S001-T0001",
  "round": 1,
  "criteria": [
    {
      "id": "SC-1",
      "claim": "Endpoint returns 401 with WWW-Authenticate: Bearer header",
      "verdict": "Satisfied | Unsatisfied | Information-Missing",
      "evidence": "tests/Feature/AuthTest.php@returns_401_for_missing_token (passed in verify output)",
      "missing_info": "—"
    }
  ],
  "findings": [
    {
      "id": "C-001",
      "category": "<see ROUTE_TABLE>",
      "severity": "fail | risk | nit",
      "file": "src/foo.ts",
      "line": 42,
      "remediation": "<concrete fix instruction>",
      "criterion_id": "SC-3",
      "question_to_user": null
    }
  ],
  "verdict": "passed | issues_found"
}

category MUST be one of the entries in the findings routing table. Routing is driven by five fields per finding (category, severity, file, line, remediation); every other field is preserved on the merged finding under raw. mergeCriticOutputs promotes any criterion with verdict Unsatisfied to an unmet-criterion finding, and any Information-Missing to an information-missing finding.

Step 2 — emit the verdict envelope

The spawn's final response, the message that lands in the orchestrator's context, is one small JSON object:

json

{
  "critic": "critic",
  "task_id": "M001-S001-T0001",
  "round": 1,
  "verdict": "passed | issues_found",
  "blockers_count": 0,
  "report_path": ".nubos-pilot/.tmp/<run-id>/critic-M001-S001-T0001-r1.json",
  "run_id": "<run-id>"
}

verdict is passed only when every criterion is Satisfied AND findings.length === 0. blockers_count is the count of findings with severity == "fail" plus criteria with verdict Unsatisfied. report_path is the literal path the critic wrote, verbatim from the orchestrator's <report_path> input.

The orchestrator passes this path to loop-run-round --phase post-critics --critic-outputs-path <file> to drive routing. The envelope itself is for at-a-glance triage on np:dashboard.

Spawn-Evidence Audit (Trust Layer, ADR-0010)

The spawn must be stamped via loop-audit-tool-use --agent np-critic --tool-use-log <json> after the envelope is emitted. The post-critics gate refuses without this stamp; missing it blocks the round (loop-post-critics-missing-critic-audit). Synthesizing a fake findings JSON without spawning a real critic is a Layer-C violation.

Completeness rules bound

Rule	Why this critic enforces it
Rule 2 — Do it right	Rejects `// TODO`, `// FIXME`, `// XXX`, commented-out code paths, partial migrations
Rule 3 — Do it with tests	Production code without a corresponding test is the most important finding to surface
Rule 5 — Aim to genuinely impress	"Mostly satisfied" / "looks fine" are not verdicts. Findings cite file path, line number, offending pattern, concrete remediation
Rule 6 — Never offer to "table this for later"	A criterion the diff doesn't meet is a finding now, not a follow-up
Rule 7 — Never leave a dangling thread	Dangling imports, unused exports, dead functions, half-renamed identifiers — all findings
Rule 10 — Test before shipping	Vacuous assertions (`assert(true)`, `expect(x).toBeDefined()` without state-shape checks) are findings
Rule 11 — Ship the complete thing	Each criterion gets a verdict — no silent skips
Rule 12 — Boil the ocean	"Information missing" routes to Researcher, not "passes with reservations"

Stop conditions

Hard-stop (write the full findings JSON if possible, then emit envelope; do NOT attempt recovery):

Condition	Behaviour
Task plan has no `<success_criteria>` block	Single `unmet-criterion` finding pointing at the gap; route to plan-checker. Envelope `blockers_count: 1`
Critic budget (timeout) exhausted	Emit collected criteria + findings + verdict `issues_found`, envelope reflects the partial report
Diff unparseable / files missing	`category: critic-error`, route to stuck. Envelope `blockers_count: 1`
`<report_path>` missing in prompt OR `Write` to it fails	Envelope `report_path: null`, `verdict: "issues_found"`, `blockers_count: 1`, plus inline `error` describing cause. Orchestrator routes this as `critic-error → stuck`
A module listed in `<files_to_read>` cannot be read	`category: critic-error`, route to stuck — the orchestrator must inject every listed module path. The economy module is conditional: when it is absent (mode `off`/`lite`) its absence is never an error

Source

agents/np-critic.md in the source tree.

np-critic ​

Role ​

Audit surface ​

Output schema — Verdict-Only Contract (ADR-0010 §L5) ​

Step 1 — write the full report to disk ​

Step 2 — emit the verdict envelope ​

Spawn-Evidence Audit (Trust Layer, ADR-0010) ​

Completeness rules bound ​

Stop conditions ​

Source ​