
ADR-0017: Strict Output-Schema Enforcement at Write-Time

Context and Problem Statement

Nubos-pilot agents write structured markdown artefacts (M<NNN>-VERIFICATION.md, M<NNN>-VALIDATION.md, S<NNN>-PLAN.md, and others) consumed downstream by other workflows: the close-project aggregator (ADR-0016), the dashboard, the planner. Each artefact has a contract: required frontmatter keys, header patterns, status enums, cross-field invariants. The contract has historically lived in prose, in agent body files (agents/np-*.md), workflow markdown, and template skeletons.

In production this surfaced as silent drift:

  1. np-verifier emitted ### SC-1: [object Object] because a success criterion entered the YAML as {id, text} instead of a string. Prose said "string"; runtime accepted both; rendering coerced it via String(sc).
  2. np-verifier rendered some files with ## SC-N — … (H2, em-dash) while others used ### SC-N: … (H3, colon). Prose specified H3-colon; nothing prevented H2-em-dash.
  3. np-nyquist-auditor produced VALIDATION.md whose frontmatter omitted the covered / under_sampled / uncovered counts, forcing downstream aggregators to grep the body for UNCOVERED, where they hit narrative prose as false positives.
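
The first failure is easy to reproduce in isolation. The snippet below is a minimal sketch, not the actual np-verifier code: `renderHeading` is a hypothetical stand-in for the renderer, and the criterion text is invented for illustration.

```javascript
'use strict';

// Hypothetical stand-in for the heading renderer; the real code is not shown in this ADR.
function renderHeading(index, sc) {
  // String() on a plain object yields "[object Object]", so an {id, text}
  // criterion produces a broken heading instead of raising an error.
  return `### SC-${index}: ${String(sc)}`;
}

console.log(renderHeading(1, 'Login succeeds'));
// prints "### SC-1: Login succeeds"
console.log(renderHeading(1, { id: 'SC-1', text: 'Login succeeds' }));
// prints "### SC-1: [object Object]"
```

Nothing in this path throws, which is why the producing agent could exit successfully while writing a broken artefact.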

In all three cases the drift first became visible at /np:close-project, the project-level aggregator, by which point N milestones had landed with subtly broken artefacts. The agents producing the bad files exited successfully; downstream consumers either accepted the drift silently or amplified it into phantom blockers.

The Completeness Doctrine (ADR-0012) demands mechanical checks for every rule it cites. Output-shape compliance was unenforced. Rule 3 ("do it with tests") and Rule 11 ("ship the complete thing") had no machine-checkable proxy for artefact correctness.

The Rule

Every workflow that writes a structured markdown artefact MUST (a) inject the artefact's output-schema into the spawn prompt before invocation, and (b) run node .nubos-pilot/bin/np-tools.cjs output-lint check --enforce against the just-written file before declaring success. Drift is a hard exit at write-time, not at consumer-time.

Schemas live in lib/schemas/<name>.cjs as plain frozen JS objects (zero-dep, ADR-0002). The validator (lib/output-lint.cjs) parses frontmatter via lib/frontmatter.cjs, checks required keys / types / enums / cross-field invariants, and walks body blocks with anchored regex, not free-text grep.

Decision Drivers

  • Fail fast, fail at the cause. Drift discovered at aggregation time is N times more expensive to fix than drift discovered at the producing spawn. Every consumer between cause and discovery is now suspect.
  • Schema-in-prompt is the strictest possible specification. Agents can't follow prose they didn't read; injecting the schema as a hard-contract section in the spawn prompt makes the contract impossible to miss.
  • Belt and suspenders. Pre-spawn injection sets the expectation; post-spawn lint catches deviation. Both layers are cheap; both catch different failure modes.
  • Mechanical, not advisory. The lint check exits non-zero on violation. No "best effort" downgrade. Aligns with ADR-0012 rules 5, 10, 11.
  • Zero deps. Pure Node fs plus the existing lib/frontmatter.cjs. No JSON-Schema validator dependency.
  • Drift visibility for legacy files. np:doctor walks every existing milestone and lints its artefacts against the current schemas, reporting drift that predates the rule.

Considered Options

  • A: Status quo, prose contracts only. Rejected: documented production failures.
  • B: Lint at consumer-time (close-project, dashboard). Rejected: the bug is observed N milestones too late, by which time partial trust in the artefact has propagated to commits and dependent workflows.
  • C: JSON Schema with an ajv dependency. Rejected: violates ADR-0002; brings ESM/CJS interop pain; the checks are simple enough for a ~330-line validator.
  • D: Schemas as JS objects + custom validator + hard-gate per workflow. Chosen.

Decision Outcome

Chosen: Option D, write-time enforcement via per-artefact schemas, because it fixes the documented failures at the only correct layer (the moment of write), composes cleanly with the existing zero-dep stack, and gives np:doctor a free retroactive drift detector.

Layout

lib/
  output-lint.cjs              # validator engine (lintContent, lintFile, enforceFile, schemaPrompt)
  output-lint.test.cjs         # unit tests
  schemas/
    index.cjs                  # getSchema(name), inferSchemaForFile(path), listSchemas(), REGISTRY
    verification.cjs           # M<NNN>-VERIFICATION.md contract
    validation.cjs             # M<NNN>-VALIDATION.md contract
    researcher-output.cjs      # per-spawn researcher contract  (ADR-0018)
    research-final.cjs         # reconciler output contract     (ADR-0018)
bin/np-tools/
  output-lint.cjs              # CLI: check | prompt | list
workflows/
  verify-work.md               # injects the verification schema, enforces post-spawn
  validate-phase.md            # injects the validation schema, enforces post-spawn

lib/schemas/index.cjs::inferSchemaForFile maps a basename to a schema by suffix: -VERIFICATION.md → verification, -VALIDATION.md → validation, -RESEARCH.md → research-final, spawn-<i>.md → researcher-output. getSchema throws output-schema-not-found (NubosPilotError) on an unknown name and reports the available set.
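
The suffix mapping can be sketched as a small rule table. This is an illustration under stated assumptions, not the contents of lib/schemas/index.cjs: the rule ordering, the digit-only reading of `spawn-<i>.md`, returning a schema name rather than a schema object, and returning `null` for unmatched files are all guesses.

```javascript
'use strict';
const path = require('node:path');

// Suffix table taken from the prose above. Treating <i> in spawn-<i>.md as
// one or more digits is an assumption of this sketch.
const SUFFIX_RULES = [
  { test: (base) => base.endsWith('-VERIFICATION.md'), schema: 'verification' },
  { test: (base) => base.endsWith('-VALIDATION.md'), schema: 'validation' },
  { test: (base) => base.endsWith('-RESEARCH.md'), schema: 'research-final' },
  { test: (base) => /^spawn-\d+\.md$/.test(base), schema: 'researcher-output' },
];

// Maps a file path to a schema name by looking only at its basename.
function inferSchemaForFile(filePath) {
  const base = path.basename(filePath);
  const rule = SUFFIX_RULES.find((r) => r.test(base));
  return rule ? rule.schema : null; // null for unmatched files is an assumption
}

console.log(inferSchemaForFile('milestones/M001/M001-VERIFICATION.md')); // prints "verification"
```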

Schema shape

A schema is a plain object with a name, an artifact path-pattern, a description, a frontmatter block (required[], typed properties, cross-field invariants[]), and a body block (blocks with heading_pattern / min_count / required_fields[] / heading_forbidden_substring, and anchored patterns[]). For example, the verification schema requires schema_version, milestone, milestone_status, sc_total, passed, failed, deferred, pending; constrains milestone_status to the enum verified | failed | deferred; and carries the invariant sc_total === passed + failed + deferred + pending. Its body requires at least one ### SC-N: … block, each with a Status field in Pass | Fail | Defer | Pending, forbids the [object Object] substring in headings, and requires a **Milestone Status:** header line.
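
To make the shape concrete, here is a rough sketch of what the verification schema might look like as a frozen object. The key names follow the description above, but the exact nesting, the regex sources, the `description` wording, and the string form of the invariant are assumptions; lib/schemas/verification.cjs remains the authority.

```javascript
'use strict';

// Object.freeze is shallow, so this helper also freezes nested objects and arrays.
// Whether the real schemas are deep-frozen is an assumption of this sketch.
function deepFreeze(value) {
  if (value && typeof value === 'object' && !Object.isFrozen(value)) {
    Object.values(value).forEach(deepFreeze);
    Object.freeze(value);
  }
  return value;
}

const verificationSchema = deepFreeze({
  name: 'verification',
  artifact: 'M<NNN>-VERIFICATION.md',
  description: 'Milestone verification report', // placeholder wording
  frontmatter: {
    required: [
      'schema_version', 'milestone', 'milestone_status',
      'sc_total', 'passed', 'failed', 'deferred', 'pending',
    ],
    properties: {
      milestone_status: { type: 'string', enum: ['verified', 'failed', 'deferred'] },
      sc_total: { type: 'number' },
    },
    invariants: ['sc_total === passed + failed + deferred + pending'],
  },
  body: {
    blocks: [
      {
        heading_pattern: '^### SC-\\d+: .+$', // H3 with colon, anchored at both ends
        min_count: 1,
        required_fields: [{ name: 'Status', enum: ['Pass', 'Fail', 'Defer', 'Pending'] }],
        heading_forbidden_substring: '[object Object]',
      },
    ],
    patterns: [{ pattern: '^\\*\\*Milestone Status:\\*\\*', min: 1 }],
  },
});

console.log(Object.isFrozen(verificationSchema.frontmatter.required)); // prints "true"
```

Keeping the schema as data rather than code is what lets the same object drive both the validator and the rendered prompt contract.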

Validator engine

lib/output-lint.cjs exposes four entry points:

  • lintContent(rawContent, schema): pure; returns { ok, violations[], frontmatter, schema_name }. Each violation carries a path, a code (missing-required, type, enum, min, invariant, body-pattern-min, body-pattern-max, forbidden-pattern, block-min, block-field-missing, block-field-enum, block-heading-forbidden, and so on), and a message.
  • lintFile(filePath, schema): reads the file and delegates; returns a file-missing violation on ENOENT.
  • enforceFile(filePath, schema): lints, and on any violation throws output-schema-violation (NubosPilotError) carrying { schema, file, violations }.
  • schemaPrompt(schema): renders the schema as a markdown contract block for injection into a spawn prompt; the rendered block ends with the hard-fail contract sentence.

The invariant evaluator supports the operators =/==/===, !=/!==, <, <=, >, >=, and additive (a + b + …) right-hand expressions resolved against the parsed frontmatter, enough for the count-sum invariants the schemas need without an expression-language dependency.
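
An evaluator of this kind fits in a few dozen lines. The following is a sketch of the general technique, not the code in lib/output-lint.cjs: it accepts sums on both sides (slightly more permissive than the right-hand-only form described above), and treating an unresolved or non-numeric key as a violated invariant is an assumption.

```javascript
'use strict';

// Operators from the prose, longest tokens first so "===" wins over "==" and "=".
const OPERATORS = ['===', '!==', '==', '!=', '<=', '>=', '=', '<', '>'];

// Resolves one term: a numeric literal, or a frontmatter key holding a number.
function resolveTerm(term, frontmatter) {
  const t = term.trim();
  if (/^-?\d+(\.\d+)?$/.test(t)) return Number(t);
  const value = frontmatter[t];
  return typeof value === 'number' ? value : NaN;
}

// Sums an additive expression such as "passed + failed + deferred".
function sumTerms(side, frontmatter) {
  return side.split('+').reduce((acc, term) => acc + resolveTerm(term, frontmatter), 0);
}

function evaluateInvariant(expr, frontmatter) {
  const op = OPERATORS.find((candidate) => expr.includes(candidate));
  if (!op) throw new Error(`no comparison operator in invariant: ${expr}`);
  const at = expr.indexOf(op);
  const left = sumTerms(expr.slice(0, at), frontmatter);
  const right = sumTerms(expr.slice(at + op.length), frontmatter);
  // An unresolved key makes the sum NaN; this sketch reports that as a failed invariant.
  if (Number.isNaN(left) || Number.isNaN(right)) return false;
  switch (op) {
    case '=': case '==': case '===': return left === right;
    case '!=': case '!==': return left !== right;
    case '<': return left < right;
    case '<=': return left <= right;
    case '>': return left > right;
    default: return left >= right; // only '>=' remains
  }
}

const counts = { sc_total: 5, passed: 3, failed: 1, deferred: 1, pending: 0 };
console.log(evaluateInvariant('sc_total === passed + failed + deferred + pending', counts)); // prints "true"
```

Because the expression is parsed rather than passed to `eval`, a schema cannot execute arbitrary code, which matters once schemas are injected into prompts and edited alongside agent files.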

Workflow integration pattern

Every workflow bound by this rule does two things:

```bash
# 1. Pre-spawn: inject schema into agent prompt
SCHEMA=$(node .nubos-pilot/bin/np-tools.cjs output-lint prompt --schema verification)
# … pass $SCHEMA as a literal section in the agent's spawn input …

# 2. Post-write: hard-gate (a non-zero exit from the check means a schema violation)
if ! node .nubos-pilot/bin/np-tools.cjs output-lint check \
  --file "$ARTIFACT_PATH" \
  --schema verification \
  --enforce \
  --text; then
  echo "Schema violation: re-spawn with violation list as feedback. Do NOT hand-edit." >&2
  exit 1
fi
```

Three layers of enforcement

| Layer | When | Cost | Catches |
|---|---|---|---|
| Pre-spawn injection | before agent invocation | one prompt block; trivial | agents that can't follow prose; reduces drift probability |
| Post-write hard-gate | immediately after the artefact's Write returns | one CLI call (~30 ms) | drift that slipped past the prompt |
| np:doctor drift scan | on-demand / pre-close-project | walks milestones/M*/, lints each VERIFICATION/VALIDATION | legacy artefacts written before the rule, hand-edits |

The doctor check returns output-schema-violation issues per file with the first violations attached. Operators see the exact frontmatter key or block pattern that broke.
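
A drift scan along these lines could be sketched as follows. The function name `scanMilestones`, the injected `lintFile` callback, the issue object's fields, and the cap of three attached violations are all assumptions made for illustration; the real np:doctor check may differ.

```javascript
'use strict';
const fs = require('node:fs');
const path = require('node:path');

// Hypothetical drift scan. lintFile is injected so the sketch stays independent
// of lib/output-lint.cjs; maxViolations is an assumed cap on attached violations.
function scanMilestones(rootDir, lintFile, maxViolations = 3) {
  const issues = [];
  const milestonesDir = path.join(rootDir, 'milestones');
  if (!fs.existsSync(milestonesDir)) return issues;

  for (const entry of fs.readdirSync(milestonesDir, { withFileTypes: true })) {
    // Only milestone directories (M*) are walked.
    if (!entry.isDirectory() || !entry.name.startsWith('M')) continue;
    const dir = path.join(milestonesDir, entry.name);

    for (const file of fs.readdirSync(dir)) {
      // Only VERIFICATION and VALIDATION artefacts are linted.
      if (!/-(VERIFICATION|VALIDATION)\.md$/.test(file)) continue;
      const filePath = path.join(dir, file);
      const result = lintFile(filePath);
      if (!result.ok) {
        issues.push({
          code: 'output-schema-violation',
          file: filePath,
          violations: result.violations.slice(0, maxViolations),
        });
      }
    }
  }
  return issues;
}
```

Truncating the attached violations keeps the report readable for a badly drifted file while still pointing at the first key or pattern that broke.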

Consequences

  • Good, because drift fails at the producing workflow, not three milestones later. Re-spawning the agent with the violation list as feedback is the canonical fix, not editing the artefact by hand.
  • Good, because the aggregator's job becomes trivial again: read frontmatter, trust the counts. The body word-grep fallback in lib/archive.cjs becomes a legacy-compatibility tail, not a primary code path.
  • Good, because /np:doctor becomes a project-state health check for output artefacts, not just install integrity.
  • Good, because schemas are discoverable via CLI: output-lint list lists them, output-lint prompt prints them as markdown; agents and operators see the same contract.
  • Good, because new artefact types onboard cheaply: add lib/schemas/<name>.cjs, register it in index.cjs::REGISTRY, wire one bash block into the producing workflow.
  • Bad, because it adds one more required workflow step per artefact-writing workflow. Mitigated: a single line plus an exit guard.
  • Bad, because schemas need maintenance. When lib/verify.cjs::renderVerificationMd changes shape, the schema must change in lockstep. Mitigated: a mismatch is caught by the existing test suite (fixtures + render round-trip).
  • Neutral, because schemas are intentionally permissive about extra frontmatter keys. Agents may add harmless metadata; only required keys and the type/enum of declared properties are checked.
  • Neutral, because body patterns are anchored, not exhaustive. The schema enforces only what consumers need (heading shape, status enum, required block fields); prose between blocks is unconstrained.
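
The onboarding step listed among the consequences amounts to one more registry entry. The sketch below shows the idea under stated assumptions: the schema bodies are reduced to a name, `task-summary` is a made-up schema name, and a plain `Error` with a `code` property stands in for NubosPilotError.

```javascript
'use strict';

// Hypothetical registry; the real one lives in lib/schemas/index.cjs.
const REGISTRY = Object.freeze({
  verification: Object.freeze({ name: 'verification' }),
  validation: Object.freeze({ name: 'validation' }),
  // Onboarding a new artefact type adds one entry here. The name is invented.
  'task-summary': Object.freeze({ name: 'task-summary' }),
});

// Returns the registered schema names in sorted order.
function listSchemas() {
  return Object.keys(REGISTRY).sort();
}

// Looks up a schema by name and fails loudly, naming the available set.
function getSchema(name) {
  if (!Object.prototype.hasOwnProperty.call(REGISTRY, name)) {
    const err = new Error(`unknown schema "${name}"; available: ${listSchemas().join(', ')}`);
    err.code = 'output-schema-not-found';
    throw err;
  }
  return REGISTRY[name];
}

console.log(listSchemas().join(', ')); // prints "task-summary, validation, verification"
```

Using an own-property check rather than a bare lookup avoids accidentally resolving inherited names such as `toString` as schemas.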

Pattern Conformance

  • S-2 NubosPilotError envelope: output-schema-violation, output-schema-not-found.
  • S-5 sandboxed tests: lib/output-lint.test.cjs builds artefact strings in-memory; no shared fixtures mutated.
  • S-6 CJS module footer: lib/output-lint.cjs and every lib/schemas/*.cjs end with module.exports.

Migration plan

  1. Land lib/output-lint.cjs + lib/schemas/{index,verification,validation}.cjs.
  2. Wire /np:verify-work and /np:validate-phase with pre-spawn injection + post-write hard-gate.
  3. Extend np:doctor with the output-schema drift scan.
  4. Existing projects with pre-rule artefacts: run np:doctor to surface drift, then re-run the producing workflow for each flagged file. There is no automatic rewriter; re-running guarantees the agent re-classifies with current Skills and Knowledge.
  5. Future artefact types (S<NNN>-PLAN.md, T<NNNN>-SUMMARY.md, PROJECT-SUMMARY.md, …): add a schema as consumer demand surfaces. Plans are covered separately by plan-lint (ADR-0019); unifying the two mechanisms is a pending follow-up.

More Information

  • Library: lib/output-lint.cjs, lib/schemas/index.cjs, lib/schemas/{verification,validation,researcher-output,research-final}.cjs.
  • CLI verb: bin/np-tools/output-lint.cjs (check | prompt | list).
  • Workflows: workflows/verify-work.md, workflows/validate-phase.md.
  • Related ADRs:
    • ADR-0002: preserved; pure Node built-ins, no JSON-Schema dep.
    • ADR-0012: the lint gate is the mechanical proxy for Rules 3, 5, 10, 11.
    • ADR-0016: the close-project aggregator is the consumer whose parse-fragility motivated this ADR.
    • ADR-0018: reuses this engine for the researcher-output and research-final schemas.
    • ADR-0019: plans are mechanically linted via a separate, parallel mechanism (plan-lint).

Origin: user feedback 2026-05-12 on a real /np:close-project run that surfaced 14 blockers, 11 of them phantoms caused by aggregator-side parse fragility. Root-cause analysis identified the writing workflows as the correct enforcement layer.