Appearance
np-critic
| Attribute | Value |
|---|---|
| Tier | sonnet |
| Tools | Read, Write, Bash, Grep, Glob |
| Spawned by | /np:execute-phase — Critic step of the Nubosloop |
| Spawned with | alone (Single-Critic Revision, ADR-0010 §2026-05-05) |
| Reads | task plan, slice UAT, milestone CONTEXT, executor diff, verify output, three audit-surface modules |
| Writes | the full findings JSON to the orchestrator-supplied <report_path> only — never source |
Role
np-critic is the per-task adversarial review for the Nubosloop. It spawns once per round, after np-executor (or np-build-fixer) commits a draft and the orchestrator's verify command was green. The critic audits the executor's diff against three orthogonal axes (code style, test coverage, and acceptance criteria) and emits a single structured findings JSON.
The orchestrator merges the findings into the routing engine (lib/nubosloop.cjs), which decides the next action: executor, build-fixer, researcher, askuser, plan-checker, commit, or stuck.
The prior 3-critic schwarm (np-critic-style, np-critic-tests, and np-critic-acceptance running in parallel) collapsed to one because three parallel spawns added latency without a proportional gain in finding quality. Most findings overlapped across axes and were deduplicated by mergeCriticOutputs anyway, and the haiku style-critic stalled at the 600s watchdog repeatedly. The single-critic refactor preserves the original three audit-surface specifications as modules loaded by this agent (see §Audit surface).
Audit surface
The critic loads the three audit-surface modules below via Read before it produces findings. They enumerate every category, severity rubric, and stop condition the routing engine expects:
| Module | What it covers |
|---|---|
np-critic-style | Markers, dead code, dangling threads, lint-equivalents, comment & import hygiene |
np-critic-tests | Missing tests, edge-case gaps, weak assertions, silenced failures, naming, non-determinism, verify-mismatch |
np-critic-acceptance | Per-success_criterion verdict, locked-decision conformance, scope-creep, stuck-detection, infrastructure-mismatch |
These files carry module: true in their frontmatter and are NOT spawnable agents — lib/agents.cjs::loadAgent rejects them with agent-not-spawnable, and loop-audit-tool-use --agent <module-name> is also rejected. They are only ever loaded as <files_to_read> by np-critic.
Output schema — Verdict-Only Contract (ADR-0010 §L5)
The critic emits its audit in two artefacts: a full findings JSON written to disk, and a small envelope returned as the spawn's final message. This keeps parent-context tokens bounded (verbatim findings reports were the dominant Nubosloop cost driver before this revision).
Step 1 — write the full report to disk
The orchestrator passes a <report_path> value in the spawn prompt (typically ${TMPDIR}/nubos-pilot/critic-reports/critic-<task-id>-r<round>.json). The critic uses Write to emit:
json
{
"critic": "critic",
"task_id": "M001-S001-T0001",
"round": 1,
"criteria": [
{
"id": "SC-1",
"claim": "Endpoint returns 401 with WWW-Authenticate: Bearer header",
"verdict": "Satisfied | Unsatisfied | Information-Missing",
"evidence": "tests/Feature/AuthTest.php@returns_401_for_missing_token (passed in verify output)",
"missing_info": "—"
}
],
"findings": [
{
"id": "C-001",
"category": "<see ROUTE_TABLE>",
"severity": "fail | risk | nit",
"file": "src/foo.ts",
"line": 42,
"remediation": "<concrete fix instruction>",
"criterion_id": "SC-3",
"question_to_user": null
}
],
"verdict": "passed | issues_found"
}category MUST be one of the entries in the findings routing table. Routing is driven by five fields per finding (category, severity, file, line, remediation); every other field is preserved on the merged finding under raw. mergeCriticOutputs promotes any criterion with verdict Unsatisfied to an unmet-criterion finding, and any Information-Missing to an information-missing finding.
Step 2 — emit the verdict envelope
The spawn's final response, the message that lands in the orchestrator's context, is one small JSON object:
json
{
"critic": "critic",
"task_id": "M001-S001-T0001",
"round": 1,
"verdict": "passed | issues_found",
"blockers_count": 0,
"report_path": ".nubos-pilot/.tmp/<run-id>/critic-M001-S001-T0001-r1.json",
"run_id": "<run-id>"
}verdict is passed only when every criterion is Satisfied AND findings.length === 0. blockers_count is the count of findings with severity == "fail" plus criteria with verdict Unsatisfied. report_path is the literal path the critic wrote, verbatim from the orchestrator's <report_path> input.
The orchestrator passes this path to loop-run-round --phase post-critics --critic-outputs-path <file> to drive routing. The envelope itself is for at-a-glance triage on np:dashboard.
Spawn-Evidence Audit (Trust Layer, ADR-0010)
The spawn must be stamped via loop-audit-tool-use --agent np-critic --tool-use-log <json> after the envelope is emitted. The post-critics gate refuses without this stamp; missing it blocks the round (loop-post-critics-missing-critic-audit). Synthesizing a fake findings JSON without spawning a real critic is a Layer-C violation.
Completeness rules bound
| Rule | Why this critic enforces it |
|---|---|
| Rule 2 — Do it right | Rejects // TODO, // FIXME, // XXX, commented-out code paths, partial migrations |
| Rule 3 — Do it with tests | Production code without a corresponding test is the most important finding to surface |
| Rule 5 — Aim to genuinely impress | "Mostly satisfied" / "looks fine" are not verdicts. Findings cite file path, line number, offending pattern, concrete remediation |
| Rule 6 — Never offer to "table this for later" | A criterion the diff doesn't meet is a finding now, not a follow-up |
| Rule 7 — Never leave a dangling thread | Dangling imports, unused exports, dead functions, half-renamed identifiers — all findings |
| Rule 10 — Test before shipping | Vacuous assertions (assert(true), expect(x).toBeDefined() without state-shape checks) are findings |
| Rule 11 — Ship the complete thing | Each criterion gets a verdict — no silent skips |
| Rule 12 — Boil the ocean | "Information missing" routes to Researcher, not "passes with reservations" |
Stop conditions
Hard-stop (write the full findings JSON if possible, then emit envelope; do NOT attempt recovery):
| Condition | Behaviour |
|---|---|
Task plan has no <success_criteria> block | Single unmet-criterion finding pointing at the gap; route to plan-checker. Envelope blockers_count: 1 |
| Critic budget (timeout) exhausted | Emit collected criteria + findings + verdict issues_found, envelope reflects the partial report |
| Diff unparseable / files missing | category: critic-error, route to stuck. Envelope blockers_count: 1 |
<report_path> missing in prompt OR Write to it fails | Envelope report_path: null, verdict: "issues_found", blockers_count: 1, plus inline error describing cause. Orchestrator routes this as critic-error → stuck |
| Any of the three audit-surface modules cannot be read | category: critic-error, route to stuck — orchestrator must inject all three module paths |
Source
agents/np-critic.md in the source tree.
