ADR-0013: Learnings-Store Schema Evolution Policy

Context and Problem Statement

The learnings store at .nubos-pilot/knowledge/learnings.json is the persistent surface that ADR-0010's pre-flight cache lookup reads. It accumulates value over the project's lifetime: every successful task commit appends or merges a learning, and a v1-trained store is the user's only protection against re-paying the Researcher-Schwarm cost.

A STORE_VERSION constant exists (lib/learnings.cjs), but the policy around bumping it was undefined. Without a policy:

  1. Silent wipe risk: in an earlier draft, a v2 reader that met a v1 store treated the version mismatch as "empty store" and could overwrite the user's accumulated patterns.
  2. No migration contract: neither the developer bumping the version nor the user upgrading nubos-pilot has a written guarantee about what happens to existing data.
  3. No per-version recovery procedure: the user has no documented path to recover from a corrupt or out-of-version store.

The Rule

The learnings store is forward-incompatible by default. When the loader reads a store whose version is not the running release's STORE_VERSION:

  1. It first runs the MIGRATORS registry from version got upward to expected. If a complete migration chain exists, the migrated store is returned.
  2. If any link in the chain is missing, it throws learnings-store-version-mismatch (NubosPilotError) with details: { expected, got, hint }. It never silently wipes the store.
  3. The user can recover either by upgrading nubos-pilot to a release that ships the missing migrator, or by backing up learnings.json and removing it (clean slate).
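
The migrate-then-throw order can be sketched as follows. This is a minimal illustration, not the code in lib/learnings.cjs: the migrateOrThrow helper, the NubosPilotError constructor shape, the hint wording, and a STORE_VERSION of 2 are all assumptions made so the chain has something to walk.

```javascript
// Hypothetical stand-in for the project's error class; the real constructor may differ.
class NubosPilotError extends Error {
  constructor(code, details) {
    super(code);
    this.code = code;
    this.details = details;
  }
}

const STORE_VERSION = 2; // assumed for illustration; the shipped schema is v1

// Keyed by the outgoing version: MIGRATORS[1] turns a v1 store into a v2 store.
const MIGRATORS = {
  1: (v1Store) => ({ ...v1Store, version: 2 }), // placeholder transformation
};

function migrateOrThrow(store) {
  const got = store.version;
  let current = store;
  // A chain can only exist for integer versions at or below the running one.
  let complete = Number.isInteger(got) && got <= STORE_VERSION;
  for (let v = got; complete && v < STORE_VERSION; v += 1) {
    const step = MIGRATORS[v];
    if (typeof step === 'function') current = step(current);
    else complete = false; // missing link: stop and fail loudly
  }
  if (!complete) {
    throw new NubosPilotError('learnings-store-version-mismatch', {
      expected: STORE_VERSION,
      got,
      hint: 'Upgrade nubos-pilot, or back up learnings.json and remove it.',
    });
  }
  return current;
}
```

In this sketch a store that is already at STORE_VERSION passes through untouched, and a store with no usable chain never reaches a write path, which is the property the rule is meant to guarantee.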

Bumping STORE_VERSION

Each version bump ships:

  1. A new entry in MIGRATORS keyed by the outgoing version (e.g. MIGRATORS[1] = function(v1Store) { return v2Store; }).
  2. A test case in lib/learnings.test.cjs that round-trips a frozen v1 fixture through the migrator and asserts every legacy field maps to the v2 shape.
  3. A note in this ADR appended under "## Version History" (no rewrite; append-only).
  4. A release note in the changelog explaining what changed and what users should do (typically nothing; migration is automatic).

Adding a field forward-compatibly

When the change is additive (e.g. the tokens cache field added in the second review), do not bump STORE_VERSION:

  • Add the field with sensible defaults at write time (logLearning lazy-fills it).
  • Make the reader tolerate its absence (Array.isArray(l.tokens) ? l.tokens : _tokenize(l.pattern)).
  • Document the additive field in this ADR's "## Field Index" table.

This is the cheapest evolution path and should be preferred whenever it preserves the read invariant.
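
A minimal sketch of the tolerant read, assuming a simple tokenizer. The _tokenize body here is a guess at "normalized + sorted unique tokens" and may not match the real normalization in lib/learnings.cjs; tokensOf is a hypothetical helper name.

```javascript
// Hypothetical tokenizer: lowercase, split on non-alphanumerics, dedupe, sort.
function _tokenize(pattern) {
  const words = pattern.toLowerCase().match(/[a-z0-9]+/g) || [];
  return [...new Set(words)].sort();
}

// Reader side: records written before `tokens` existed are still readable.
function tokensOf(learning) {
  return Array.isArray(learning.tokens) ? learning.tokens : _tokenize(learning.pattern);
}
```

Because the fallback derives tokens from pattern, which is a required field, old records behave as if the cache had always been there and no version bump is needed.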

Decision Drivers

  • Preserve user data: accumulated learnings are the project's compound interest. A bug that wipes them is a permanent loss.
  • Loud failure: unknown versions throw with hints, not silent recovery. The user finds out immediately.
  • Migration as code: every breaking change ships its migrator with the release. Deferring migration to "manual cleanup" is a regression.
  • Additive changes are free: forward-compatible additive fields need no version bump, which keeps small improvements cheap.

Considered Options

  • Silent wipe on mismatch: original behaviour. Rejected (FAIL-4 second review).
  • Refuse to start until manual remediation: Rejected. Too coarse for additive changes.
  • Migrator-registry + hard-throw on missing migrator + additive-changes-no-bump policy: chosen.

Decision Outcome

Chosen as documented above. Implementation in lib/learnings.cjs (_readStore, _migrate, MIGRATORS).

Fresh-repo behaviour

When learnings.json does not exist, _readStore returns the empty current-version store ({ version: STORE_VERSION, learnings: [] }) without throwing. The next logLearning creates the file via atomicWriteFileSync. This is the expected onboarding path; every project starts cold.
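
The cold-start branch might look like the sketch below. readStoreOrEmpty is a hypothetical name, and the version and field checks that _readStore performs are left out; only the missing-file handling is shown.

```javascript
const fs = require('node:fs');

const STORE_VERSION = 1;

function readStoreOrEmpty(storePath) {
  let raw;
  try {
    raw = fs.readFileSync(storePath, 'utf8');
  } catch (err) {
    // A missing file is the normal first-run state, not an error.
    if (err.code === 'ENOENT') return { version: STORE_VERSION, learnings: [] };
    throw err; // permissions, I/O errors, etc. still surface
  }
  return JSON.parse(raw); // version and field validation omitted in this sketch
}
```

Only ENOENT is swallowed here, so an unreadable file is never mistaken for a fresh repository.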

Outcome semantics

learning.outcome is last-known, not append-only. When a pattern is re-logged, the latest outcome overwrites the prior value. This is documented in lib/learnings.cjs::logLearning JSDoc and surfaced here so callers know not to treat the field as a journal. If a pattern's outcome flips between re-logs, last_seen advances and occurrence increments; those two fields ARE append-stamped.
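
The merge on re-log can be pictured roughly as below. mergeRelog and the shape of its incoming argument are assumptions for illustration; the actual merge lives inside logLearning.

```javascript
// Merge a re-logged pattern into its existing record.
// `incoming` ({ outcome, task_id, milestone_id }) is a hypothetical argument shape.
function mergeRelog(existing, incoming, nowIso) {
  const union = (list, item) => [...new Set([...list, item])];
  return {
    ...existing,
    outcome: incoming.outcome,           // last-known: overwrites, no history kept
    occurrence: existing.occurrence + 1, // append-stamped
    last_seen: nowIso,                   // append-stamped
    task_ids: union(existing.task_ids, incoming.task_id),
    milestone_ids: union(existing.milestone_ids, incoming.milestone_id),
  };
}
```

A caller that needs the history of outcomes would have to record it elsewhere, since the previous value is gone after the merge.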

Field Index (v1)

| Field | Type | Required | Notes |
| --- | --- | --- | --- |
| version | integer | yes | matches STORE_VERSION |
| learnings[] | array | yes | learning records below |
| learnings[].fingerprint | string (16-hex) | yes | sha1 of normalized tokens |
| learnings[].pattern | string (≤ MAX_PATTERN_BYTES) | yes | original prescriptive sentence |
| learnings[].tokens | array of strings | no (additive) | normalized + sorted unique tokens; lazy-filled |
| learnings[].outcome | string (≤ MAX_OUTCOME_BYTES) | yes | last-known outcome |
| learnings[].occurrence | integer | yes | counter incremented on every re-log |
| learnings[].first_seen | ISO-8601 string | yes | |
| learnings[].last_seen | ISO-8601 string | yes | |
| learnings[].task_ids[] | array of TASK_ID strings | yes | union of every task that triggered the log |
| learnings[].milestone_ids[] | array of MILESTONE_ID strings | yes | union of every milestone |
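
To make the index concrete, here is one way a per-record check could be written. It is a sketch under assumptions: the byte limits are invented values, the fingerprint is assumed to be lowercase hex, the timestamp test is deliberately loose, and invalidFields is a hypothetical helper rather than the validation that _readStore performs.

```javascript
// Byte limits are assumed values for this sketch, not the real constants.
const MAX_PATTERN_BYTES = 2048;
const MAX_OUTCOME_BYTES = 512;

const boundedString = (v, max) => typeof v === 'string' && Buffer.byteLength(v, 'utf8') <= max;
const stringArray = (v) => Array.isArray(v) && v.every((x) => typeof x === 'string');
// Loose timestamp check: Date.parse accepts ISO-8601 but also some other formats.
const timestamp = (v) => typeof v === 'string' && !Number.isNaN(Date.parse(v));

// Returns the names of the fields that violate the v1 Field Index.
function invalidFields(l) {
  const checks = {
    fingerprint: typeof l.fingerprint === 'string' && /^[0-9a-f]{16}$/.test(l.fingerprint),
    pattern: boundedString(l.pattern, MAX_PATTERN_BYTES),
    tokens: l.tokens === undefined || stringArray(l.tokens), // optional, additive
    outcome: boundedString(l.outcome, MAX_OUTCOME_BYTES),
    occurrence: Number.isInteger(l.occurrence),
    first_seen: timestamp(l.first_seen),
    last_seen: timestamp(l.last_seen),
    task_ids: stringArray(l.task_ids),
    milestone_ids: stringArray(l.milestone_ids),
  };
  return Object.keys(checks).filter((name) => !checks[name]);
}
```

Note that tokens is the only field allowed to be absent, which mirrors its "no (additive)" entry in the table.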

Bounded Store + Eviction

The store is capped to prevent unbounded growth from auto_log_learning:

  • MAX_LEARNINGS = 1000 — hard limit on entry count.
  • MAX_STORE_BYTES = 8 * 1024 * 1024 (8 MB) — hard limit on serialized size.

When logLearning would push the store past either cap, eviction runs inside the same lock window as the write:

  1. Sort candidates by (occurrence asc, last_seen asc) so the least-used + oldest entries leave first.
  2. Trim to MAX_LEARNINGS entries.
  3. If the serialized size still exceeds MAX_STORE_BYTES, drop one entry at a time (in eviction order) until it fits.

The newly added entry has the most recent last_seen, so among entries with the same occurrence it is the last to be evicted. High-occurrence entries are protected; they evict only after every single-use entry is gone. The eviction is silent (no error, no warning) because hitting the cap is expected steady-state behaviour, not a fault.
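
The three eviction steps can be sketched as a pure function. This is an illustration under assumptions, not the implementation: evict is a hypothetical name, last_seen values are assumed to share one UTC ISO-8601 format so that string comparison orders them chronologically, and the size is re-measured after every drop for clarity rather than speed.

```javascript
const MAX_LEARNINGS = 1000;
const MAX_STORE_BYTES = 8 * 1024 * 1024;

const compareStrings = (a, b) => (a < b ? -1 : a > b ? 1 : 0);

function evict(store, maxLearnings = MAX_LEARNINGS, maxBytes = MAX_STORE_BYTES) {
  // Step 1: eviction order is lowest occurrence first, ties broken by oldest last_seen.
  const order = [...store.learnings].sort(
    (a, b) => a.occurrence - b.occurrence || compareStrings(a.last_seen, b.last_seen)
  );
  // Step 2: drop from the front of the eviction order until the count cap holds.
  let keep = order.slice(Math.max(0, order.length - maxLearnings));
  // Step 3: keep dropping one entry at a time while the serialized store is too big.
  const sizeOf = (learnings) =>
    Buffer.byteLength(JSON.stringify({ ...store, learnings }), 'utf8');
  while (keep.length > 0 && sizeOf(keep) > maxBytes) keep = keep.slice(1);
  // Return survivors in their original order; the input store is not mutated.
  const survivors = new Set(keep);
  return { ...store, learnings: store.learnings.filter((l) => survivors.has(l)) };
}
```

In the real flow this would run inside the same lock window as the write, after the new or merged entry has been added to store.learnings.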

learning-list (CLI) returns entries sorted by (occurrence desc, last_seen desc), the inverse of the eviction order, so the most-trusted patterns surface first.

Locking Semantics

  • logLearning and clearLearnings acquire withFileLock around the whole read-modify-write, which makes each update atomic.
  • matchExistingLearning and listLearnings read without holding a lock. Safe on POSIX (rename-into-place is atomic). A transient race on filesystems where rename is not atomic (NFS, some Windows shares) surfaces as learnings-store-corrupt rather than silent corruption. That is the intended behaviour: fail loud, never silently corrupt.
  • _writeStore overwrites destructively with no read-merge step. It is intentionally low-level and only used by clearLearnings + tests; new callers wanting upsert semantics MUST use logLearning.
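
The rename-into-place pattern that makes lock-free reads safe on POSIX can be sketched as follows. atomicWriteJsonSync is a hypothetical stand-in written for this example; the project's own atomicWriteFileSync may differ in naming of the temporary file, flushing, and error handling.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Write to a sibling temp file, then rename over the target.
// Readers see either the old file or the new one, never a half-written file,
// on filesystems where rename is atomic.
function atomicWriteJsonSync(targetPath, value) {
  const tmpPath = path.join(
    path.dirname(targetPath),
    `.${path.basename(targetPath)}.${process.pid}.tmp`
  );
  fs.writeFileSync(tmpPath, JSON.stringify(value, null, 2));
  fs.renameSync(tmpPath, targetPath);
}
```

The temp file sits in the same directory as the target so the rename stays within one filesystem; a rename across filesystems would not be atomic.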

Version History

  • v1 (2026-05-03) — initial schema. Additive: tokens field (introduced same day, no bump).

More Information

  • Library: lib/learnings.cjs
  • Linter: none (the schema is enforced at parse time by _readStore).
  • Tests: lib/learnings.test.cjs (LRN-12..14 + LRN-VAL-1..5 cover version-mismatch, corruption, and field validation).