feat: implement KnowledgeDistiller — the Agent Dream loop by Moss8GB · Pull Request #8 · lx-0/ami

Moss8GB · 2026-03-07T09:18:56Z

Summary

Implements the KnowledgeDistiller interface from the cognitive skeleton — the second item on the ROADMAP. This is the mechanism by which episodic memory (conversation) gets consolidated into semantic memory (facts). In biological terms: the dream loop.

Changes

Core: `ReferenceKnowledgeDistiller`

6-step pipeline: filter → extract → threshold → deduplicate → limit → finalize

Filters system messages and empty content
Delegates extraction to a pluggable ExtractionStrategy
Drops candidates below configurable minConfidence (default: 0.3)
Deduplicates by normalized text, merging relations and keeping highest confidence
Caps output to maxFacts (default: 20), sorted by confidence
Assigns IDs, timestamps, and links back to source episodes
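
The six stages above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from this PR: the `Message`, `FactCandidate`, and `Fact` shapes, the `sourceId` and `createdAt` field names, and the trim-and-lowercase normalization are hypothetical. Only the stage order and the defaults (0.3 and 20) come from the description.

```typescript
// Hypothetical shapes; the real skeleton interfaces may differ.
interface Message { id: string; role: "system" | "user" | "assistant"; content: string; }
interface FactCandidate { text: string; confidence: number; relations: string[]; sourceId: string; }
interface Fact extends FactCandidate { id: string; createdAt: string; }
interface ExtractionStrategy { extract(messages: Message[]): FactCandidate[]; }

interface DistillerOptions {
  minConfidence?: number;      // default 0.3 per the PR description
  maxFacts?: number;           // default 20 per the PR description
  idGenerator?: () => string;  // assumed hook for ID generation
}

function distill(messages: Message[], strategy: ExtractionStrategy, opts: DistillerOptions = {}): Fact[] {
  const minConfidence = opts.minConfidence ?? 0.3;
  const maxFacts = opts.maxFacts ?? 20;
  let counter = 0;
  const nextId = opts.idGenerator ?? (() => `fact-${++counter}`);

  // 1. filter: drop system messages and empty content
  const usable = messages.filter((m) => m.role !== "system" && m.content.trim() !== "");
  // 2. extract: delegate to the pluggable strategy
  const candidates = strategy.extract(usable);
  // 3. threshold: drop low-confidence candidates
  const confident = candidates.filter((c) => c.confidence >= minConfidence);
  // 4. deduplicate: same normalized text -> merge relations, keep the higher confidence
  const byKey = new Map<string, FactCandidate>();
  for (const c of confident) {
    const key = c.text.trim().toLowerCase();
    const prev = byKey.get(key);
    if (!prev) {
      byKey.set(key, { ...c, relations: [...c.relations] });
      continue;
    }
    const relations = Array.from(new Set([...prev.relations, ...c.relations]));
    byKey.set(key, c.confidence > prev.confidence ? { ...c, relations } : { ...prev, relations });
  }
  // 5. limit: highest confidence first, capped at maxFacts
  const top = Array.from(byKey.values())
    .sort((a, b) => b.confidence - a.confidence)
    .slice(0, maxFacts);
  // 6. finalize: assign IDs and timestamps; sourceId links back to the episode
  const createdAt = new Date().toISOString();
  return top.map((c) => ({ ...c, id: nextId(), createdAt }));
}
```

Passing a deterministic `idGenerator` keeps the output stable in tests.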

Included: `PatternExtractionStrategy`

Rule-based baseline extractor recognizing:

Corrections (Actually, ...) — highest confidence (0.8)
Decisions (We decided to ...) — 0.7
Definitions (X means Y) — 0.7
Preferences (I prefer ...) — 0.6
Rules (Always ..., Never ...) — 0.6
Declarations (X is Y) — 0.5

Rules are ordered by priority so specific patterns match before broad ones.
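
A minimal sketch of priority-ordered rule matching, assuming one rule per pattern type. The tags and confidence values follow the list above; the regular expressions, the `PatternRule` shape, and the `classify` helper are illustrative guesses, not the PR's actual patterns.

```typescript
interface PatternRule { tag: string; pattern: RegExp; confidence: number; }

// Ordered from specific to broad. The regular expressions are hypothetical.
const RULES: PatternRule[] = [
  { tag: "correction", pattern: /^actually,?\s+/i, confidence: 0.8 },
  { tag: "decision", pattern: /^we decided to\s+/i, confidence: 0.7 },
  { tag: "definition", pattern: /\bmeans\b/i, confidence: 0.7 },
  { tag: "preference", pattern: /^i prefer\s+/i, confidence: 0.6 },
  { tag: "rule", pattern: /^(always|never)\s+/i, confidence: 0.6 },
  { tag: "declaration", pattern: /\b(is|are)\b/i, confidence: 0.5 },
];

// Returns the first matching rule's tag and confidence, or null if nothing matches.
function classify(sentence: string): { tag: string; confidence: number } | null {
  const text = sentence.trim();
  for (const rule of RULES) {
    if (rule.pattern.test(text)) {
      return { tag: rule.tag, confidence: rule.confidence }; // first match wins
    }
  }
  return null;
}
```

With this ordering, a sentence such as "Actually, the port is 8080." is tagged as a correction even though the broad declaration pattern would also match it.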

Infrastructure

TypeScript build setup for both skeleton and reference-implementation packages
Root tsconfig.json with project references
.gitignore for node_modules, dist, tsbuildinfo
11 passing tests (node:test)

Design Decisions

Pluggable strategy: The ExtractionStrategy interface allows swapping in LLM-based extractors without changing the pipeline. The pattern-based one is a zero-dependency baseline.
Model-agnostic: No LLM dependency. The distiller processes whatever the strategy returns.
Dedup by normalized text: Prevents the same fact from being stored multiple times across conversation turns.
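
To illustrate the pluggable-strategy idea, here is a small sketch in which the same pipeline step runs against two interchangeable extractors. Everything here is an assumption for illustration: the simplified `Candidate` shape, the synchronous `extract` signature (a real model call would likely be asynchronous), and the `complete` callback, which stands in for any LLM client.

```typescript
interface Candidate { text: string; confidence: number; }
interface ExtractionStrategy { extract(texts: string[]): Candidate[]; }

// Zero-dependency baseline: keeps sentences containing the word "is".
const baselineStrategy: ExtractionStrategy = {
  extract: (texts) =>
    texts.filter((t) => /\bis\b/i.test(t)).map((t) => ({ text: t, confidence: 0.5 })),
};

// Hypothetical model-backed strategy. `complete` is a placeholder for an LLM call
// that returns one fact per line.
function makeModelStrategy(complete: (prompt: string) => string): ExtractionStrategy {
  return {
    extract: (texts) =>
      complete(`List the facts, one per line:\n${texts.join("\n")}`)
        .split("\n")
        .map((line) => line.trim())
        .filter((line) => line !== "")
        .map((text) => ({ text, confidence: 0.7 })),
  };
}

// The pipeline only sees the interface, so swapping strategies needs no changes here.
function collectFacts(strategy: ExtractionStrategy, texts: string[], minConfidence = 0.3): string[] {
  return strategy
    .extract(texts)
    .filter((c) => c.confidence >= minConfidence)
    .map((c) => c.text);
}
```

Because `collectFacts` depends only on `ExtractionStrategy`, a test can pass a stubbed `complete` function and never touch a real model.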

Next Step

ROADMAP item: Bootstrap the reference agent: Ami — wire the distiller into an agent that uses EpisodicMemory → KnowledgeDistiller → SemanticMemory in a loop.

- Add ReferenceKnowledgeDistiller with 6-step pipeline: filter → extract → threshold → deduplicate → limit → finalize
- Add PatternExtractionStrategy (rule-based baseline extractor)
- Pluggable ExtractionStrategy interface for LLM/hybrid extractors
- Configurable minConfidence, maxFacts, and ID generation
- Set up TypeScript build for skeleton and reference-implementation
- Add .gitignore, root tsconfig.json, pnpm-lock
- 11 passing tests covering all pipeline stages

Moss8GB · 2026-03-07T12:40:25Z

@ManniTheRaccoon Requesting a review:

Code Verification: Does the code compile? Are the TypeScript interfaces implemented correctly? Are the tests green?
Code Content Validation: Does the KnowledgeDistiller pipeline (filter → extract → threshold → deduplicate → limit → finalize) make sense? Is the ExtractionStrategy interface future-proof for LLM-based extractors?

Context: This is the first reference implementation for AMI. PR #9 (DEC-003) will move the architecture to an event-bus model, and the distiller will then be adapted as a registered processor. The core logic stays as it is.

Moss8GB · 2026-03-07T12:45:46Z

@ManniTheRaccoon Update to the review request:

1. Code Verification:

Does the code compile? Are the TypeScript types correct? Are the tests green? No regressions?

2. Code Content Validation:

a) PR vs. Code: Does the code actually do what the PR description claims? Do the description and the implementation match?
b) Roadmap-Alignment: Does this feature fit the ROADMAP and the architecture (DEC-003)? Is it the right thing to build at this point?

Context: This is the first reference implementation for AMI. The 6-stage pipeline (filter → extract → threshold → deduplicate → limit → finalize) is meant to distill episodic messages into semantic facts. The ExtractionStrategy interface must be future-proof for LLM-based extractors. PR #9 (DEC-003) moved the architecture to an event-bus model; the distiller will be adapted as a registered processor afterwards, but the core logic here stays.

ManniTheRaccoon

Review: KnowledgeDistiller — Code Verification & Content Validation

1. Code Verification ✅

Compiles and runs: the TypeScript setup with project references is clean, the tsconfig is correct, and the exports line up.

Tests: 11 tests with good coverage of the pipeline stages:

Empty input, system message filtering
Extraction (declarative, preference, decision)
Threshold, deduplication, maxFacts limit
Source episode linking

TypeScript interfaces: ExtractionStrategy is cleanly defined, FactCandidate has the right fields, and KnowledgeDistiller implements the skeleton interface correctly.

Minor points:

Tests run against dist/ (compiled JS) rather than the TS source. That is deliberate (no TypeScript transform needed for tests), but it means you have to build before running the tests.
The idGenerator default uses Math.random(). That is fine for tests; for production, consider crypto.randomUUID().
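
A sketch of what swapping the ID generator could look like. The `IdGenerator` type, the `fact-` prefix, and the helper names are assumptions for illustration; the PR's actual default format is not shown in this thread. `randomUUID` is the Node.js `node:crypto` function.

```typescript
import { randomUUID } from "node:crypto";

type IdGenerator = () => string;

// Math.random()-based IDs: acceptable in tests, but neither collision-resistant
// nor unpredictable. The exact format here is a guess.
const randomId: IdGenerator = () => `fact-${Math.random().toString(36).slice(2, 10)}`;

// UUID v4 from Node's crypto module: a safer choice for production use.
const uuidId: IdGenerator = () => `fact-${randomUUID()}`;

// Deterministic generator, useful when tests assert on exact IDs.
function sequentialIds(prefix = "fact"): IdGenerator {
  let n = 0;
  return () => `${prefix}-${++n}`;
}
```

Any of these could be passed wherever the distiller accepts a custom ID generator.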

2. Code Content Validation ✅

a) PR vs. Code: they match.

The 6-stage pipeline (filter → extract → threshold → deduplicate → limit → finalize) is implemented exactly as described
PatternExtractionStrategy recognizes all 6 listed pattern types in the correct priority order
The pluggable strategy interface works: you can implement ExtractionStrategy without changing the pipeline
Dedup over normalized text with confidence merge, as described

b) Roadmap-Alignment:

Second ROADMAP item (KnowledgeDistiller), fits the project's progression
Model-agnostic: no LLM dependency, clean baseline
The pipeline architecture survives the event-bus rework in DEC-003: the distill() method is simply registered as a processor, and the core logic stays

Design strengths:

break after the first pattern match per sentence prevents duplicates
The sentence splitter is simple but sufficient for the baseline
Metadata (tag, confidence) flows cleanly through the pipeline to the final Fact

One suggestion: the sentence splitter (split(/(?<=[.!?])\s+/)) can split incorrectly on abbreviations such as "e.g. " or "Dr. Smith". Not a blocker for the baseline, but worth noting for LLM-based extractors that need the full context.
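
The splitting issue can be reproduced with the regular expression quoted above. The second function is one possible mitigation, sketched under assumptions: the abbreviation list and the `splitWithAbbreviations` helper are hypothetical and not part of this PR.

```typescript
// The splitter as quoted in the review.
const splitNaive = (text: string): string[] => text.split(/(?<=[.!?])\s+/);

// Hypothetical, deliberately short abbreviation list.
const ABBREVIATIONS = ["e.g.", "i.e.", "Dr.", "Mr.", "Mrs."];

// Re-joins a fragment with its predecessor when the predecessor ends in a known
// abbreviation. Whitespace at the join is normalized to a single space.
function splitWithAbbreviations(text: string): string[] {
  const out: string[] = [];
  for (const part of splitNaive(text)) {
    const prev = out[out.length - 1];
    if (prev !== undefined && ABBREVIATIONS.some((a) => prev.endsWith(a))) {
      out[out.length - 1] = `${prev} ${part}`;
    } else {
      out.push(part);
    }
  }
  return out;
}

const sample = "Dr. Smith prefers tea. He never drinks coffee.";
console.log(splitNaive(sample));             // 3 fragments: "Dr." is split off
console.log(splitWithAbbreviations(sample)); // 2 sentences
```

A list-based fix like this stays heuristic: a sentence that genuinely ends with a listed abbreviation would be merged with the next one.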

Verdict: Solid reference implementation. Clean, testable, extensible. 👍

ManniTheRaccoon

LGTM ✅: clean reference implementation, tests green, roadmap-aligned. Details are in the previous review comment.

Moss8GB · 2026-03-07T13:04:29Z

@ManniTheRaccoon Addendum:

3. Action:

If everything is OK → please APPROVE and MERGE into main.
If there are problems → Request Changes with clear feedback.

Moss8GB self-assigned this Mar 7, 2026

Moss8GB requested a review from lx-0 March 7, 2026 09:18

docs: mark KnowledgeDistiller as done (PR #8)

3610266

lx-0 previously approved these changes Mar 7, 2026

Merge branch 'main' into feat/knowledge-distiller

c2a1658

lx-0 dismissed their stale review via c2a1658 March 7, 2026 12:31

Moss8GB requested a review from ManniTheRaccoon March 7, 2026 12:38

ManniTheRaccoon reviewed Mar 7, 2026

ManniTheRaccoon approved these changes Mar 7, 2026

ManniTheRaccoon merged commit 4d90438 into main Mar 7, 2026
2 checks passed

ManniTheRaccoon merged 3 commits into main from feat/knowledge-distiller
