Skip to content

np-critic

AttributeValue
Tiersonnet
ToolsRead, Write, Bash, Grep, Glob
Spawned by/np:execute-phase — Critic step of the Nubosloop
Spawned withalone (Single-Critic Revision, ADR-0010 §2026-05-05)
Readstask plan, slice UAT, milestone CONTEXT, executor diff, verify output, three audit-surface modules
Writesthe full findings JSON to the orchestrator-supplied <report_path> only — never source

Role

np-critic is the per-task adversarial review for the Nubosloop. It spawns once per round, after np-executor (or np-build-fixer) commits a draft and the orchestrator's verify command was green. The critic audits the executor's diff against three orthogonal axes (code style, test coverage, and acceptance criteria) and emits a single structured findings JSON.

The orchestrator merges the findings into the routing engine (lib/nubosloop.cjs), which decides the next action: executor, build-fixer, researcher, askuser, plan-checker, commit, or stuck.

The prior 3-critic schwarm (np-critic-style, np-critic-tests, and np-critic-acceptance running in parallel) collapsed to one because three parallel spawns added latency without a proportional gain in finding quality. Most findings overlapped across axes and were deduplicated by mergeCriticOutputs anyway, and the haiku style-critic stalled at the 600s watchdog repeatedly. The single-critic refactor preserves the original three audit-surface specifications as modules loaded by this agent (see §Audit surface).

Audit surface

The critic loads the three audit-surface modules below via Read before it produces findings. They enumerate every category, severity rubric, and stop condition the routing engine expects:

ModuleWhat it covers
np-critic-styleMarkers, dead code, dangling threads, lint-equivalents, comment & import hygiene
np-critic-testsMissing tests, edge-case gaps, weak assertions, silenced failures, naming, non-determinism, verify-mismatch
np-critic-acceptancePer-success_criterion verdict, locked-decision conformance, scope-creep, stuck-detection, infrastructure-mismatch

These files carry module: true in their frontmatter and are NOT spawnable agents — lib/agents.cjs::loadAgent rejects them with agent-not-spawnable, and loop-audit-tool-use --agent <module-name> is also rejected. They are only ever loaded as <files_to_read> by np-critic.

Output schema — Verdict-Only Contract (ADR-0010 §L5)

The critic emits its audit in two artefacts: a full findings JSON written to disk, and a small envelope returned as the spawn's final message. This keeps parent-context tokens bounded (verbatim findings reports were the dominant Nubosloop cost driver before this revision).

Step 1 — write the full report to disk

The orchestrator passes a <report_path> value in the spawn prompt (typically ${TMPDIR}/nubos-pilot/critic-reports/critic-<task-id>-r<round>.json). The critic uses Write to emit:

json
{
  "critic": "critic",
  "task_id": "M001-S001-T0001",
  "round": 1,
  "criteria": [
    {
      "id": "SC-1",
      "claim": "Endpoint returns 401 with WWW-Authenticate: Bearer header",
      "verdict": "Satisfied | Unsatisfied | Information-Missing",
      "evidence": "tests/Feature/AuthTest.php@returns_401_for_missing_token (passed in verify output)",
      "missing_info": "—"
    }
  ],
  "findings": [
    {
      "id": "C-001",
      "category": "<see ROUTE_TABLE>",
      "severity": "fail | risk | nit",
      "file": "src/foo.ts",
      "line": 42,
      "remediation": "<concrete fix instruction>",
      "criterion_id": "SC-3",
      "question_to_user": null
    }
  ],
  "verdict": "passed | issues_found"
}

category MUST be one of the entries in the findings routing table. Routing is driven by five fields per finding (category, severity, file, line, remediation); every other field is preserved on the merged finding under raw. mergeCriticOutputs promotes any criterion with verdict Unsatisfied to an unmet-criterion finding, and any Information-Missing to an information-missing finding.

Step 2 — emit the verdict envelope

The spawn's final response, the message that lands in the orchestrator's context, is one small JSON object:

json
{
  "critic": "critic",
  "task_id": "M001-S001-T0001",
  "round": 1,
  "verdict": "passed | issues_found",
  "blockers_count": 0,
  "report_path": ".nubos-pilot/.tmp/<run-id>/critic-M001-S001-T0001-r1.json",
  "run_id": "<run-id>"
}

verdict is passed only when every criterion is Satisfied AND findings.length === 0. blockers_count is the count of findings with severity == "fail" plus criteria with verdict Unsatisfied. report_path is the literal path the critic wrote, verbatim from the orchestrator's <report_path> input.

The orchestrator passes this path to loop-run-round --phase post-critics --critic-outputs-path <file> to drive routing. The envelope itself is for at-a-glance triage on np:dashboard.

Spawn-Evidence Audit (Trust Layer, ADR-0010)

The spawn must be stamped via loop-audit-tool-use --agent np-critic --tool-use-log <json> after the envelope is emitted. The post-critics gate refuses without this stamp; missing it blocks the round (loop-post-critics-missing-critic-audit). Synthesizing a fake findings JSON without spawning a real critic is a Layer-C violation.

Completeness rules bound

RuleWhy this critic enforces it
Rule 2 — Do it rightRejects // TODO, // FIXME, // XXX, commented-out code paths, partial migrations
Rule 3 — Do it with testsProduction code without a corresponding test is the most important finding to surface
Rule 5 — Aim to genuinely impress"Mostly satisfied" / "looks fine" are not verdicts. Findings cite file path, line number, offending pattern, concrete remediation
Rule 6 — Never offer to "table this for later"A criterion the diff doesn't meet is a finding now, not a follow-up
Rule 7 — Never leave a dangling threadDangling imports, unused exports, dead functions, half-renamed identifiers — all findings
Rule 10 — Test before shippingVacuous assertions (assert(true), expect(x).toBeDefined() without state-shape checks) are findings
Rule 11 — Ship the complete thingEach criterion gets a verdict — no silent skips
Rule 12 — Boil the ocean"Information missing" routes to Researcher, not "passes with reservations"

Stop conditions

Hard-stop (write the full findings JSON if possible, then emit envelope; do NOT attempt recovery):

ConditionBehaviour
Task plan has no <success_criteria> blockSingle unmet-criterion finding pointing at the gap; route to plan-checker. Envelope blockers_count: 1
Critic budget (timeout) exhaustedEmit collected criteria + findings + verdict issues_found, envelope reflects the partial report
Diff unparseable / files missingcategory: critic-error, route to stuck. Envelope blockers_count: 1
<report_path> missing in prompt OR Write to it failsEnvelope report_path: null, verdict: "issues_found", blockers_count: 1, plus inline error describing cause. Orchestrator routes this as critic-error → stuck
Any of the three audit-surface modules cannot be readcategory: critic-error, route to stuck — orchestrator must inject all three module paths

Source

agents/np-critic.md in the source tree.