Appearance
ADR-0011: Researcher-Schwarm with Deterministic Consensus Merge
Context and Problem Statement
np-researcher is a single-agent spawn that produces RESEARCH.md for a milestone. A single LLM research pass has two well-documented failure modes:
- Halluzination with confidence: the agent commits to a library version, an architectural pattern, or a "best practice" that is wrong, and presents it with high confidence because no other voice contradicts it.
- Group-think on pre-existing knowledge: the agent retrieves what it already "knows" and skips searching, especially when the topic feels familiar.
Both fail silently: the planner consumes the bad RESEARCH.md and writes a milestone plan around the wrong claim. The plan-checker can spot inconsistencies but cannot validate factual claims; it has the same training data as the researcher.
The Nubos AI agent-harness brief specifies the cure: spawn k=3 independent researchers in parallel, then merge their outputs deterministically. Mehrheit for decisions; Union for risks; Schnittmenge for patterns. Independence is enforced by withholding the swarm-membership fact from each spawn, so each researcher believes itself the sole spawn.
The Rule
np:research-phase and np:plan-phase --research invoke lib/researcher-swarm.cjs spawnSwarm({ k }). The default is k=3. The orchestrator:
- Pre-flight — call
lib/learnings.cjs::matchExistingLearning(vialib/knowledge-adapter.cjs) against.nubos-pilot/knowledge/learnings.json. If a hit at similarity ≥swarm.research.threshold(default0.9) andoccurrence ≥ swarm.research.minOccurrence(default3) exists, the cached pattern is rendered asRESEARCH.mdwith provenance[CACHED]and the swarm is bypassed. - Parallel Spawn — spawn
knp-researcherinstances in parallel; each receives the same Ticket / CONTEXT / Stack input plus aseed_deltafield that nudges its prompt without disclosing the swarm. Each spawn produces a structured output matching the existingRESEARCH.mdschema plus a<consensus_meta>block carryingseed_deltaandagreement_score. - Merge —
mergeConsensus(outputs)runs three deterministic merge rules over the parsed outputs:- Decisions:
Mehrheit, at least⌈k/2⌉outputs agree on the same prescriptive decision (library, version, pattern). Disagreement →FLAGGEDwith all candidates listed for plan-checker review. - Risks:
Union, every risk surfaced by any spawn enters the merged list, deduplicated by semantic-fingerprint normalisation. - Patterns:
Schnittmenge, only patterns mentioned by ≥ 2 spawns enter the merged output. Solo-spawn patterns are demoted to[ASSUMED]in the merged output. - Open Questions / Sources: Union with credibility = max(seen).
- Decisions:
- Render — the merged output is written to
<milestone_dir>/<milestone>-RESEARCH.mdwith a<consensus_meta>block listingk,agreement_score,flagged_decisions[], and per-spawnseed_delta[]for audit.
agreement_score is the mean of per-decision agreement ratios across all merged decisions: for each decision bucket, compute min(1, agreement / k), then average over all buckets. 1.0 means every decision was unanimous; 1/k means every decision was solo-held. The plan-checker reads this as a single quality scalar; low agreement triggers extra scrutiny on flagged_decisions[].
The k-of-1 case (swarm.research.k = 1) is supported and degrades gracefully to the legacy single-spawn behaviour. k > 5 is rejected by the library: beyond 5 the cost outweighs the marginal information gain.
Decision Drivers
- Halluzination Resistance: three independent spawns are unlikely to halluzinate the same wrong fact. Disagreement is the signal that triggers Plan-Checker review.
- Determinism: same input + same
k+ same merge rules produce the same merged output. Not stochastic. - Bounded Cost:
k=3triples the research-token cost over single-spawn but is bounded; the cache-bypass at Pre-flight reclaims most of the cost as the project ages. - Independence: each spawn does not know about the others. No shared scratchpad. No iterative refinement. That is the property that prevents group-think.
Considered Options
- Single-spawn Researcher: current behaviour. Rejected. Hallucinates with confidence; no cross-check.
- Sequential refinement (Researcher A → B → C, each refining the prior): Rejected. Group-think; later spawns anchor on the first.
- k=3 with consensus voting on every finding: Rejected. Risks are not opinions; voting drops valid solo findings. Use Union with prefix-tagging instead.
- k=3 with Mehrheit / Union / Schnittmenge merge: chosen.
Decision Outcome
Chosen: k=3 parallel researchers with the Mehrheit / Union / Schnittmenge merge rules, because it produces a deterministic, halluzination-resistant RESEARCH.md whose <consensus_meta> block is a first-class plan-checker input.
Cache Adapter
The cache lookup goes through lib/knowledge-adapter.cjs. The local adapter (BM25 index over .nubos-pilot/knowledge/) is the only one shipped. The swarm.knowledge_adapter seam keeps the door open for additional adapters without touching the swarm logic.
Consequences
- Good, because every prescriptive decision in
RESEARCH.mdcarries explicitMehrheit-vs-FLAGGEDprovenance. - Good, because the cache short-circuits the swarm for repeat patterns. Token cost decays asymptotically.
- Good, because
<consensus_meta>is a deterministic plan-checker input, not a guess. - Bad, because
k=3triples first-time research cost. Accepted; cache reclaim + Rule 9 (Search before building) make the trade-off favourable. - Bad, because spawn coordination requires the orchestrator to hold three concurrent agent contexts. Accepted; this is exactly the kind of orchestration ADR-0001 frames as in-session.
More Information
- Concept: Researcher-Schwarm.
- Library:
lib/researcher-swarm.cjs,lib/knowledge-adapter.cjs,lib/learnings.cjs. - Related: ADR-0010, ADR-0012.
