Vector-Memory
Vector-Memory is the opt-in semantic-recall layer that augments the BM25 pre-flight cache with embedding-based similarity. It lives at .nubos-pilot/memory/ and is consulted by the Nubosloop's Pre-flight (hybrid score), the Researcher-Schwarm (pre-recall before web search), and the Planner (prior-decision context).
ADR-0014 ratifies the design. The orchestration lives in lib/memory.cjs; the local provider in lib/memory-provider-local.cjs (lazy-loaded @huggingface/transformers); the index in lib/memory-index-usearch.cjs (lazy-loaded usearch).
Why a vector layer
The BM25 / Jaccard pre-flight in lib/learnings.cjs::matchExistingLearning is blind to semantically close patterns expressed with different vocabulary. Three failure modes recur:
- Vocabulary drift between phases. A learning logged in M002 about "Filament Resource policy registration" does not match the M005 ticket "Resource autorisierung in admin panel", so the swarm re-derives the same conclusion at triple Researcher cost.
- Handoff-note bloat. lib/handoff.cjs returns the entire prior-phase note set as plain text regardless of relevance. Late-project phases load increasingly large irrelevant context.
- Critic-finding rediscovery. A Critic in M005 cannot recall that the same finding category appeared in M001 with a known remediation, so the routing engine re-explores the same dead ends.
The hybrid α·BM25 + (1−α)·vector score (default α = 0.6) catches both lexical and semantic repeats. A missing vector signal is treated as absent, so the lexical score stands; it is never treated as a zero that would drag a strong lexical hit down. A purely semantic match is also surfaced, but only when it clears swarm.research.threshold on its own.
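The scoring rule above can be sketched as a small function. This is a minimal illustration rather than the code in lib/knowledge-adapter.cjs: the name hybridScore and the use of null for an absent signal are assumptions, and the sketch assumes both scores are already normalised to [0, 1].

```javascript
// Hypothetical helper illustrating the hybrid rule; not the project's actual code.
// bm25 and vector are similarity scores in [0, 1], or null when that signal is absent.
function hybridScore(bm25, vector, alpha = 0.6) {
  if (bm25 == null && vector == null) return null; // nothing matched at all
  if (vector == null) return bm25; // missing vector signal: the lexical score stands
  if (bm25 == null) return vector; // purely semantic match: judged on its own score
  return alpha * bm25 + (1 - alpha) * vector; // both present: weighted blend
}

console.log(hybridScore(0.95, null)); // lexical-only hit keeps its score
console.log(hybridScore(0.9, 0.8));   // blended score, roughly 0.86
```

Returning the lexical score unchanged when the vector is absent is what keeps a strong BM25 hit from being multiplied down by α.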
Layout
.nubos-pilot/memory/ # strict sub-tree of Project-State (ADR-0005)
index.usearch # binary HNSW index (cosine, dim from provider)
index.usearch.keymap.json # BigInt-key ↔ string-uuid mapping
records.jsonl # 1:1 vector-id ↔ record, append-only, source-of-truth
manifest.json # embedding model, dim, version, alpha
records.jsonl schema (one record per line):
json
{ "id": "uuid",
  "type": "learning|handoff|critic|research",
  "phase": "M005-S007-T0002",
  "title": "...",
  "body": "...",
  "tags": ["feature-flags", "filament"],
  "provenance": "VERIFIED|CITED|ASSUMED|CACHED",
  "created_at": "2026-05-08T..." }
type and phase are exact-match filters at query time; tags is a set-overlap filter.
Activation
Disabled by default. Enable via .nubos-pilot/config.json:
json
{
"memory": {
"enabled": true,
"provider": "local",
"model": "Xenova/bge-small-en-v1.5",
"alpha": 0.6
}
}
When memory.enabled = false, no embedding model is loaded, no index is built, and Pre-flight falls back to BM25-only. The optional dependencies (@huggingface/transformers, usearch) are not resolved at install time. Run npm install --include=optional to pull them when activating the layer.
Where it plugs in
1. Pre-flight hybrid score (Step 1 of the Nubosloop)
lib/knowledge-adapter.cjs::_localAdapter.match is async: it runs BM25/Jaccard first, then — when memory.enabled = true — queries memory.query(text, { type: 'learning', k: limit }) and merges via _hybridMerge(bm25Hits, vectorHits, alpha, byFp), keyed by learning fingerprint. Every hit carries a retrieval tag — bm25, vector, or hybrid. Threshold gating (swarm.research.threshold, default 0.9): lexical and hybrid hits were already gated by matchExistingLearning and pass through; a vector-only hit must clear the threshold on its own before it becomes a cache hit. If memory.enabled = true but the vector layer cannot be built or queried, match returns a lexical-only result with a non-null degraded marker rather than silently pretending the hybrid path ran. Provenance of cache hits remains [CACHED] per ADR-0011.
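The merge and gating behaviour described above might look roughly like the following. This is a sketch under stated assumptions: the hit shape { fingerprint, score }, the function name hybridMergeSketch, and the omission of the byFp argument are all illustrative and do not reflect the real signature of _hybridMerge.

```javascript
// Illustrative only: hit shape and function name are assumptions, not the real _hybridMerge.
function hybridMergeSketch(bm25Hits, vectorHits, alpha = 0.6, threshold = 0.9) {
  const lexical = new Map(bm25Hits.map((h) => [h.fingerprint, h.score]));
  const semantic = new Map(vectorHits.map((h) => [h.fingerprint, h.score]));
  const merged = [];

  for (const fingerprint of new Set([...lexical.keys(), ...semantic.keys()])) {
    const bm25 = lexical.get(fingerprint);
    const vector = semantic.get(fingerprint);
    if (bm25 !== undefined && vector !== undefined) {
      // Seen by both retrievers: blend the two scores.
      merged.push({ fingerprint, score: alpha * bm25 + (1 - alpha) * vector, retrieval: "hybrid" });
    } else if (bm25 !== undefined) {
      // Lexical-only: already gated upstream, so it passes through unchanged.
      merged.push({ fingerprint, score: bm25, retrieval: "bm25" });
    } else if (vector >= threshold) {
      // Vector-only: must clear the threshold on its own to count as a hit.
      merged.push({ fingerprint, score: vector, retrieval: "vector" });
    }
  }
  return merged.sort((a, b) => b.score - a.score); // best score first
}

const result = hybridMergeSketch(
  [{ fingerprint: "fp-1", score: 0.95 }, { fingerprint: "fp-2", score: 0.92 }],
  [{ fingerprint: "fp-2", score: 0.8 }, { fingerprint: "fp-3", score: 0.93 }, { fingerprint: "fp-4", score: 0.5 }]
);
console.log(result.map((h) => `${h.fingerprint}:${h.retrieval}`));
```

In this sample, fp-4 is dropped because its vector-only score of 0.5 does not clear the threshold, while fp-2 survives as a hybrid hit because its lexical half was already gated.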
2. Researcher pre-recall
The np-researcher agent prompt instructs each spawn to query memory before issuing external research:
bash
node .nubos-pilot/bin/np-tools.cjs memory-query --text "<ticket-summary>" --k 5 --type research
node .nubos-pilot/bin/np-tools.cjs memory-query --text "<ticket-summary>" --k 3 --type learning
Hits with [VERIFIED] / [CITED] provenance enter the merged RESEARCH.md as [CACHED:VERIFIED] / [CACHED:CITED], with no duplicate web round-trip. memory-disabled is silently swallowed; the section is opt-in and additive.
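The provenance mapping for recalled hits can be illustrated with a short sketch. The helper name toCachedEntries and the sample titles are hypothetical, and dropping hits with any other provenance (ASSUMED, CACHED) is an assumption of this sketch; the text above only states what happens to VERIFIED and CITED hits.

```javascript
// Hypothetical helper: maps a memory hit's provenance to the tag used in RESEARCH.md.
// Assumption: hits that are neither VERIFIED nor CITED are not reused.
const REUSABLE = new Set(["VERIFIED", "CITED"]);

function toCachedEntries(hits) {
  return hits
    .filter((hit) => REUSABLE.has(hit.provenance))
    .map((hit) => ({ title: hit.title, tag: `[CACHED:${hit.provenance}]` }));
}

// Sample hits for illustration only.
const cachedEntries = toCachedEntries([
  { title: "use jose for jwt", provenance: "VERIFIED" },
  { title: "sample assumed note", provenance: "ASSUMED" },
  { title: "sample cited note", provenance: "CITED" },
]);
console.log(cachedEntries);
```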
3. Planner context-injection
np-planner queries memory before the reality-check pass and surfaces matching [VERIFIED] decisions in the slice plan's <context> block as prior-art. Locked-decisions in M<NNN>-CONTEXT.md remain canonical — memory hits are advisory only.
4. Derived-cache indexing
The vector index is a derived cache of the learnings store, not an independently-written store. lib/knowledge-adapter.cjs::_ensureLearningsIndexed runs at Pre-flight (inside _localAdapter.match): it embeds and indexes every learning whose fingerprint is not yet present, keyed by id = fingerprint. Because the key is the fingerprint, a re-logged pattern maps to the same record, so there is no duplication and no re-embedding.
lib/learnings.cjs (learnings.json) is the single source of truth. The commit phase does not write to the vector index; if the index is lost or stale, it is rebuilt deterministically, lazily at the next Pre-flight or explicitly via np:memory-rebuild.
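The idempotent, fingerprint-keyed indexing described in this section can be sketched as follows. Everything here is a stand-in: ensureIndexed, the Map-based index, the title-plus-body embedding text, and fakeEmbed are assumptions for illustration, not the implementation of _ensureLearningsIndexed.

```javascript
// Sketch of derived-cache indexing with hypothetical store shapes.
async function ensureIndexed(learnings, index, embed) {
  // Only learnings whose fingerprint is not yet a key in the index need work.
  const missing = learnings.filter((l) => !index.has(l.fingerprint));
  if (missing.length === 0) return 0;

  // Assumption: title and body together form the text that gets embedded.
  const vectors = await embed(missing.map((l) => `${l.title}\n${l.body}`));
  missing.forEach((l, i) => index.set(l.fingerprint, vectors[i])); // id = fingerprint
  return missing.length;
}

// Stand-in embedder for the demo: NOT a real model, just a fixed-size dummy vector.
let embedCalls = 0;
const fakeEmbed = async (texts) => {
  embedCalls += 1;
  return texts.map((t) => Float32Array.from([t.length, 0, 0]));
};

const index = new Map();
const learnings = [
  { fingerprint: "fp-1", title: "A", body: "first" },
  { fingerprint: "fp-2", title: "B", body: "second" },
];

const demo = (async () => {
  const first = await ensureIndexed(learnings, index, fakeEmbed);  // indexes both
  const second = await ensureIndexed(learnings, index, fakeEmbed); // nothing left to do
  console.log(first, second, embedCalls);
  return { first, second, embedCalls };
})();
```

Because the second pass finds every fingerprint already present, it returns without calling the embedder, which is the "no duplication and no re-embedding" property.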
Provider — local default
The local provider uses @huggingface/transformers (the successor to @xenova/transformers, maintained by the original author at Hugging Face) running Xenova/bge-small-en-v1.5 (or Xenova/bge-multilingual-base for non-English projects):
| Field | Value |
|---|---|
| Model size | ~70 MB downloaded on first run; cached under ~/.cache/nubos-pilot/models/ |
| Vector dim | 384 (bge-small) or 768 (bge-multilingual) |
| Provider interface | provider.embed(texts: string[]) → Promise<Float32Array[]> |
| First-run UX | one-time progress indicator while the model downloads; subsequent runs load the cached model |
| Runtime | dual CJS+ESM in v4; lib/memory-provider-local.cjs keeps require() |
A future Pro-tier remote provider (Jina Embeddings v3, multilingual + code-aware) will be specified in a successor ADR; it is out of scope here.
Index engine — usearch
usearch provides HNSW with cosine similarity:
- Why prebuilt binaries, not WASM: Both @huggingface/transformers (via onnxruntime-node) and usearch ship platform-specific prebuilt binaries via node-gyp-build / @img/sharp-*-style platform packages. There is no node-gyp invocation, no Python, and no build chain on the consumer machine. The install UX matches WASM and the runtime is faster. The deprecated prebuild-install package is not in the dependency tree of these pinned versions.
- Capacity: O(100K) records without strain. Per-project memory is expected to hold O(1K–10K) records over the project lifetime.
- Persistence: index.usearch is written via atomicWriteFileSync; corruption recovery runs via np:memory-rebuild from records.jsonl.
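To make the similarity metric concrete, here is a brute-force cosine search in plain JavaScript. It is not the usearch API and not HNSW: it scans every vector exhaustively, whereas HNSW is an approximate index that avoids the full scan. The names cosineSimilarity and topK are hypothetical.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // zero vector: define similarity as 0
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Exhaustive top-k search over all stored vectors, highest similarity first.
function topK(query, stored, k) {
  return stored
    .map(({ id, vector }) => ({ id, score: cosineSimilarity(query, vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

const stored = [
  { id: "a", vector: Float32Array.from([1, 0, 0]) },
  { id: "b", vector: Float32Array.from([0, 1, 0]) },
  { id: "c", vector: Float32Array.from([1, 1, 0]) },
];
const nearest = topK(Float32Array.from([1, 0, 0]), stored, 2);
console.log(nearest.map((n) => n.id)); // [ 'a', 'c' ]
```

At the O(1K–10K) record counts expected per project, even this exhaustive scan would be workable; the index matters more as the record count approaches the stated capacity.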
Memory is never committed
.nubos-pilot/memory/ is a runtime-state cache, not source-of-truth. The directory belongs in the consumer-project's .gitignore. Rebuild is deterministic from the source-of-truth files: lib/learnings.cjs learning-store, lib/handoff.cjs handoff-notes, milestone RESEARCH.md, and the Critic-report archive (per ADR-0010 §L5 <report_path>).
CLI
bash
# Bulk-index from a JSON-array or JSONL file (initial seeding)
node .nubos-pilot/bin/np-tools.cjs memory-index \
--records-file .nubos-pilot/memory/seed.jsonl
# Add a single record
node .nubos-pilot/bin/np-tools.cjs memory-add \
--type learning --title "use jose for jwt" --body "..." \
--tags filament,auth --provenance VERIFIED
# Query with optional filters
node .nubos-pilot/bin/np-tools.cjs memory-query \
--text "filament resource policy" \
--k 5 --type learning --tags filament
# Force full re-embed (after embedding-model change)
node .nubos-pilot/bin/np-tools.cjs memory-rebuild
# Print stats: count, dim, model, schema_version, created_at, rebuilt_at
node .nubos-pilot/bin/np-tools.cjs memory-stats
Each verb returns its primary result as JSON on stdout. memory.enabled = false makes every verb refuse with a memory-disabled envelope (S-2).
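The --type and --tags options of memory-query correspond to the record filters described under Layout: exact match on type and phase, set overlap on tags. A minimal sketch of that predicate follows, assuming "set overlap" means at least one shared tag. The names matchesFilter and parseRecords are hypothetical and are not functions from lib/memory.cjs.

```javascript
// Hypothetical filter predicate; field names follow the records.jsonl schema.
function matchesFilter(record, filter = {}) {
  // type and phase: exact match whenever the filter specifies them.
  if (filter.type !== undefined && record.type !== filter.type) return false;
  if (filter.phase !== undefined && record.phase !== filter.phase) return false;
  // tags: set overlap, assumed here to mean at least one requested tag is on the record.
  if (filter.tags && filter.tags.length > 0) {
    const recordTags = new Set(record.tags || []);
    if (!filter.tags.some((tag) => recordTags.has(tag))) return false;
  }
  return true;
}

// Parse JSONL: one JSON object per non-empty line.
function parseRecords(jsonl) {
  return jsonl
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line));
}

// Sample records for illustration only.
const sample = [
  '{"id":"a","type":"learning","phase":"M005-S007-T0002","tags":["feature-flags","filament"]}',
  '{"id":"b","type":"research","phase":"M002-S001-T0001","tags":["auth"]}',
].join("\n");

const filtered = parseRecords(sample).filter((r) =>
  matchesFilter(r, { type: "learning", tags: ["filament", "auth"] })
);
console.log(filtered.map((r) => r.id)); // [ 'a' ]
```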
Configuration reference
json
{
"memory": {
"enabled": false,
"provider": "local",
"model": "Xenova/bge-small-en-v1.5",
"alpha": 0.6
}
}
| Key | Default | Purpose |
|---|---|---|
| memory.enabled | false | gates the entire layer; refuses every verb with memory-disabled when off |
| memory.provider | "local" | provider implementation; only "local" ships today |
| memory.model | "Xenova/bge-small-en-v1.5" | embedding model; mismatch with manifest triggers memory-rebuild-required |
| memory.alpha | 0.6 | hybrid-score weight: α·BM25 + (1−α)·vector |
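How these keys might be resolved at startup can be sketched as below. The defaults come from the table above; everything else is an assumption for illustration: the helper names resolveMemoryConfig and needsRebuild, the range check on alpha, and the manifest field name model.

```javascript
// Defaults taken from the configuration reference.
const MEMORY_DEFAULTS = Object.freeze({
  enabled: false,
  provider: "local",
  model: "Xenova/bge-small-en-v1.5",
  alpha: 0.6,
});

// Hypothetical helper: merge user config over the defaults and sanity-check alpha.
function resolveMemoryConfig(userConfig = {}) {
  const memory = { ...MEMORY_DEFAULTS, ...(userConfig.memory || {}) };
  // Assumption: alpha is a weight, so it must lie in [0, 1].
  if (typeof memory.alpha !== "number" || memory.alpha < 0 || memory.alpha > 1) {
    throw new RangeError(`memory.alpha must be a number in [0, 1], got ${memory.alpha}`);
  }
  return memory;
}

// Hypothetical check: vectors embedded with a different model are not comparable,
// so a model mismatch against the manifest calls for a rebuild.
function needsRebuild(memory, manifest) {
  return manifest != null && manifest.model !== memory.model;
}

const resolved = resolveMemoryConfig({ memory: { enabled: true } });
console.log(resolved.alpha, needsRebuild(resolved, { model: "some-other-model" }));
```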
Related
- Nubosloop — Pre-flight (Step 1) hybrid score.
- Researcher-Schwarm — pre-recall before external search.
- ADR-0014 — full architectural decision record.
- ADR-0006 — the precedent for optionalDependencies amendments to ADR-0002.
