ADR-0023: Per-Task Architect + Test-Writer (TDD) in the Nubosloop

  • Status: Accepted. Both steps are wired into /np:execute-phase as round-1 ACTION CONTRACTs with Layer-C SKIP-GUARDs, config-gated and backfilled on install/update.
  • Date: 2026-06-30
  • Supersedes: None
  • Related: ADR-0010 (Execute-side Trust Layer), ADR-0012 (Completeness Mandate), ADR-0019 (Plan-side Trust Layer), ADR-0021 (Off-host dispatch)

Context and Problem Statement

The per-task Nubosloop (round 1) ran: researcher swarm → executor → critic. The executor decided the code structure, wrote the tests, and wrote the production code in a single pass. Two gaps followed:

  1. No up-front structural decision. Each executor invented the task's class/module layout ad hoc. A milestone-level architect exists (np-architect, /np:architect-phase) but it is intent-level and milestone-scoped — it does not shape an individual task's code.
  2. Tests came after (or alongside) the code. Tests written by the same agent that just wrote the implementation tend to assert what the code does, not what the task requires, and the "trivial enough to skip" rationalization creeps in.

We also needed a place for users to declare how code must be built (class construction, naming, paradigms) that the loop honours.

Decision

Insert two config-gated, round-1-only steps into the loop, in this order, between the researcher swarm and the executor:

researcher swarm → [architect] → [test-writer] → executor → critic
  • np-task-architect (agents.architect, default on). Read-only (Read/Grep/Glob). Reads the task plan, RULES.md Conventions, and any M<NNN>-ARCHITECTURE.md, then emits an ephemeral per-task structural spec — responsibilities, boundaries, paradigms, and the test surfaces TDD must cover — as its final message. The orchestrator injects it as <architecture_constraints> into the test-writer and executor prompts. It writes no files and is never committed.
  • np-test-writer (agents.test_writer, default on). Writes real, valid test files for every required surface before production code exists. Tests may be red; the executor makes them green and is told not to delete, skip, or weaken them. A built-in anti-skip self-check plus the downstream np-critic-tests axis guard against vacuous/skipped assertions.
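
The ordering and gating described above can be sketched roughly as follows. This is a minimal illustration, not the actual orchestrator: the function names `roundOneSteps` and `withArchitectureConstraints` are hypothetical, the step labels simply mirror the pipeline line, and treating a missing key as "on" at runtime is an assumption based on the stated defaults.

```javascript
'use strict';

// Hypothetical sketch. Only the config keys (agents.architect,
// agents.test_writer), the agent ids, and the <architecture_constraints>
// tag come from this ADR; everything else is illustrative.
function roundOneSteps(config) {
  const agents = (config && config.agents) || {};
  const steps = ['researcher-swarm'];
  // Assumption: both toggles default on, so only an explicit false disables a step.
  if (agents.architect !== false) steps.push('np-task-architect');
  if (agents.test_writer !== false) steps.push('np-test-writer');
  steps.push('executor', 'critic');
  return steps;
}

// The architect's final message is injected into downstream prompts;
// nothing is written to disk.
function withArchitectureConstraints(prompt, spec) {
  if (!spec) return prompt; // architect disabled: prompt is left unchanged
  return `<architecture_constraints>\n${spec}\n</architecture_constraints>\n\n${prompt}`;
}

console.log(roundOneSteps({ agents: { architect: true, test_writer: false } }).join(' → '));
// prints "researcher-swarm → np-task-architect → executor → critic"
```

The same spec string would be passed to both the test-writer and the executor prompt builders, so the two agents work from one structural decision.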

Conventions are expressed in .nubos-pilot/RULES.md under a structured ## Conventions block (Class/Module Structure, Naming, Code Style, Patterns & Paradigms), read by the architect, test-writer, executor, and np-critic-style.
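
As a rough illustration of how an agent prompt builder might pull that block out of RULES.md, here is a small sketch. The helper name `extractConventions` is hypothetical, and the use of `###` subsection headings in the sample input is an assumption; the ADR only states that the block lives under `## Conventions`.

```javascript
'use strict';

// Hypothetical helper: returns the text under "## Conventions" up to the
// next level-2 heading, or null when the block is missing.
function extractConventions(rulesMarkdown) {
  const lines = rulesMarkdown.split(/\r?\n/);
  const start = lines.findIndex((l) => /^##\s+Conventions\s*$/.test(l));
  if (start === -1) return null;
  const rest = lines.slice(start + 1);
  // "### ..." subsections do not match, because "##" must be followed by whitespace.
  const end = rest.findIndex((l) => /^##\s/.test(l));
  return (end === -1 ? rest : rest.slice(0, end)).join('\n').trim();
}

// Sample input; the subsection layout is assumed for illustration only.
const sample = [
  '# Rules',
  '',
  '## Conventions',
  '',
  '### Naming',
  'Use PascalCase for classes.',
  '',
  '## Other',
  'Unrelated content.',
].join('\n');

console.log(extractConventions(sample));
```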

Trust-layer integration

Each step has a Layer-C SKIP-GUARD phase in loop-run-round.cjs (post-architect / post-test-writer) that refuses to advance unless a loop-audit-tool-use spawn-evidence stamp exists for the round (loop-post-architect-missing-spawn-audit / loop-post-test-writer-missing-spawn-audit). Neither agent is Rule-9 audited (not in AUDITED_AGENTS): the architect is advisory/read-only, and the test-writer's quality is enforced by the tests critic. The prep steps never bump the round counter: TDD writes tests once; build-fixer rounds iterate.
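
A minimal sketch of the guard logic, under stated assumptions: the phase names and error codes are taken from this ADR, but the `skipGuard` function and the stamp shape (`{ round, agent }`) are hypothetical and do not describe the real loop-run-round.cjs internals.

```javascript
'use strict';

// Phase names and error codes come from the ADR; the mapping structure is assumed.
const GUARDS = {
  'post-architect': {
    agent: 'np-task-architect',
    error: 'loop-post-architect-missing-spawn-audit',
  },
  'post-test-writer': {
    agent: 'np-test-writer',
    error: 'loop-post-test-writer-missing-spawn-audit',
  },
};

// Refuses to advance unless a spawn-evidence stamp exists for this round.
// Assumption: each stamp records the round number and the spawned agent id.
function skipGuard(phase, round, stamps) {
  const guard = GUARDS[phase];
  if (!guard) throw new Error(`unknown guard phase: ${phase}`);
  const hasEvidence = stamps.some(
    (s) => s.round === round && s.agent === guard.agent
  );
  return hasEvidence ? { ok: true } : { ok: false, error: guard.error };
}

const stamps = [{ round: 1, agent: 'np-task-architect' }];
console.log(skipGuard('post-architect', 1, stamps));
console.log(skipGuard('post-test-writer', 1, stamps));
```

In this sketch the guard only checks that the agent was actually spawned; it says nothing about the quality of the output, which matches the division of labour described above.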

Granularity & plan-lint (ADR-0019)

The per-task architect decides structure (responsibilities, boundaries, required test surfaces), not line-level implementation — no schema DDL, no exact framework filenames, no code-style edicts already in RULES.md. Its output is ephemeral and never reaches PLAN.md, so plan-lint is untouched. The milestone architect stays reachable from planning via /np:plan-phase <N> --architect (exit-43 dispatch → /np:architect-phase → re-enter), preserving the research → architect → plan flow.

Consequences

  • Round 1 spawns up to two more agents when both toggles are on. Both default on and are backfilled to true on install/update when the key is absent (an explicit false is never overwritten — same rule as agents.economy).
  • The executor receives a decided shape and a failing test suite, which it must satisfy — TDD inside the loop.
  • The mechanical verify gate runs only after the executor, so red-until-executor is the expected state, not a failure.
  • Each new agent/critic-style step touched the standard parity sites: agent file + agents.test.cjs, config-defaults + config-schema, install backfill, the workflow ACTION CONTRACT + off-host branch, check-offhost-coverage EXPECTED, the agent catalog, and configuration.md.
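
The backfill rule from the first consequence above (fill with true when the key is absent, never overwrite an explicit false) could look roughly like this. The function name `backfillAgentToggles` and the in-memory config shape are assumptions for illustration; the real install/update code may differ.

```javascript
'use strict';

// Keys named in this ADR (agents.architect, agents.test_writer).
const BACKFILLED_KEYS = ['architect', 'test_writer'];

// Hypothetical sketch: returns a new config with missing toggles set to true.
function backfillAgentToggles(config) {
  const agents = { ...(config.agents || {}) };
  for (const key of BACKFILLED_KEYS) {
    // Fill only when the key is absent; an explicit false (or true) is kept.
    if (!Object.prototype.hasOwnProperty.call(agents, key)) {
      agents[key] = true;
    }
  }
  return { ...config, agents };
}

console.log(backfillAgentToggles({ agents: { architect: false } }));
// prints { agents: { architect: false, test_writer: true } }
```

Checking for key presence rather than truthiness is what keeps a user's explicit opt-out intact across updates.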

Shipped alongside: commit-task now strips the duplicated M…-S…-T… prefix from the subject and attaches a structured body (the task's <action>, <acceptance_criteria>, id, and files) via a second git commit -m, so history is self-explanatory months later. The task(<id>): subject prefix is preserved — revert and audit tooling parse it.
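
As a hedged sketch of the commit shape described above: the helper `buildCommitArgs`, the task object fields, and the body layout are all hypothetical. The only behaviours taken from this ADR are the preserved task(<id>): subject prefix, the stripped duplicate id, and the second -m carrying the body. Git treats each -m as a separate paragraph of the commit message.

```javascript
'use strict';

// Hypothetical helper: builds argv for `git commit` from a task record.
// The task id format is not assumed; the id is matched literally.
function buildCommitArgs(task) {
  let subject = task.subject;
  // Strip a duplicated leading task id from the subject, if present.
  if (subject.startsWith(task.id)) {
    subject = subject.slice(task.id.length).replace(/^[\s:-]+/, '');
  }
  // Assumed body layout; the ADR only lists which fields are included.
  const body = [
    `Task: ${task.id}`,
    `Action: ${task.action}`,
    `Acceptance criteria: ${task.acceptanceCriteria}`,
    `Files: ${task.files.join(', ')}`,
  ].join('\n');
  // First -m is the subject, second -m becomes the body paragraph.
  return ['commit', '-m', `task(${task.id}): ${subject}`, '-m', body];
}

const args = buildCommitArgs({
  id: 'EXAMPLE-ID', // placeholder id, not a real task id format
  subject: 'EXAMPLE-ID: add parser',
  action: 'Implement the parser module',
  acceptanceCriteria: 'Parser tests pass',
  files: ['src/parser.js', 'test/parser.test.js'],
});
console.log(args[2]);
// prints "task(EXAMPLE-ID): add parser"
```

The resulting array could be handed to a process spawner such as Node's `child_process.execFileSync('git', args)`, which avoids shell quoting issues in the message text.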