
WIP: drop importance_score and confidence columns from memories#55

Merged
amitpaz1 merged 12 commits into main from fix/drop-quality-score-columns May 9, 2026

amitpaz1 commented May 9, 2026

Status: WIP, ~50% done. Mid-refactor checkpoint. Ready for a fresh session to resume.

The full plan + remaining-work checklist is committed at:
docs/superpowers/plans/2026-05-09-drop-quality-score-columns.md

What's done (8 commits)

  • Phase 1: Schema migrations (migrations/025_drop_quality_score_columns.sql + SQLite version) — DROP TRIGGER, DROP INDEX, DROP COLUMN. Verified parses and drops both columns + indexes correctly.
  • Phase 2: Dataclass field removals (NewMemory, StoredMemory, MemoryPatch, ExportedMemory in persistence/types.py)
  • Phase 3: Persistence layer (sqlite.py, postgres.py, protocol.py) — INSERT, SELECT, recall scoring, RETURNING clauses, AVG aggregates, ScoredMemory mapping, GraphStats.avg_importance — all stripped. Recall score formula drops the importance_score multiplier (was a no-op since every row had identical 1.0).
  • Phase 4: src/lore/importance.py deleted; replaced by src/lore/decay.py with the still-needed pure functions (decay_factor, resolve_half_life); lore.py cleansed (time_adjusted_importance → _memory_decay); tests/test_importance_scoring.py deleted
  • Phase 5a: services/observations.py, services/memories.py, services/lessons.py: drop confidence parameter, drop importance_score from response dicts, rename min_confidence → min_score

What's left

  • Phase 5b — remaining services: conversations, snapshots, retrieve, graph/, consolidation, conversation/extractor, extract/, ingest/dedup
  • Phase 6 — MCP server (drop confidence param from remember())
  • Phase 7 — HTTP routes & response models (~10 files)
  • Phase 8 — top-level + CLI + export + UI source (~12 files)
  • Phase 9 — tests update + add regression test (~30 test files)
  • Phase 10 — lint, push, mark PR ready

Don't over-reach: leave these "confidence" concepts untouched

  • Graph relationship confidence/weight (entities/mentions/relationships tables)
  • recommend/engine.py RecommendationConfidence
  • freshness/detector.py staleness confidence
  • classify/ axis confidence

grep -rn "importance_score\|memory\.confidence" src/ | grep -v "graph\|recommend\|freshness\|classify" should drive the cleanup; don't touch refs in those four directories.
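
For sessions where a scriptable check is handier than the shell one-liner, the same scan can be approximated in Python. This is a rough sketch, not part of the PR: `find_stale_refs` is a hypothetical helper name, and it mirrors the grep by testing the exclusion words against the whole `path:line:text` record rather than only the directory.

```python
import re
from pathlib import Path

# Same pattern as the grep: importance_score, or the literal "memory.confidence".
STALE_REF = re.compile(r"importance_score|memory\.confidence")
# Same exclusions as `grep -v`: matched against the whole "path:line:text" record.
OUT_OF_SCOPE = re.compile(r"graph|recommend|freshness|classify")


def find_stale_refs(root):
    """Return "path:lineno:text" records for references still to be cleaned up."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        try:
            lines = path.read_text(encoding="utf-8").splitlines()
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        for lineno, text in enumerate(lines, start=1):
            record = f"{path.as_posix()}:{lineno}:{text}"
            if STALE_REF.search(text) and not OUT_OF_SCOPE.search(record):
                hits.append(record)
    return hits


if __name__ == "__main__":
    for hit in find_stale_refs("src"):
        print(hit)
```

Because the exclusion is applied to the full record, a line outside those four directories that merely mentions, say, "graph" is filtered out too, exactly as with the grep. Worth a manual glance if the list looks suspiciously short.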

Tests will be RED until Phase 5b–9 land

Until services / routes / tests are updated, the suite will fail with TypeError on dropped kwargs. That's expected at this checkpoint — DO NOT panic and revert.
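
For anyone resuming cold, this is the failure shape to expect. The snippet is a minimal illustration only: the `NewMemory` here is a trimmed-down, hypothetical stand-in for the real dataclass in persistence/types.py, and `create_memory_old_style` is an invented caller.

```python
from dataclasses import dataclass


@dataclass
class NewMemory:
    # Hypothetical stand-in: the real class has more fields, but no longer
    # has `confidence` or `importance_score`.
    content: str


def create_memory_old_style():
    # A caller that has not been updated yet still passes the dropped field.
    return NewMemory(content="hello", confidence=0.5)


try:
    create_memory_old_style()
except TypeError as exc:
    error_message = str(exc)
else:
    error_message = None

# The message names the unexpected keyword, which points at the call site to fix.
print(error_message)
```

The fix in each case is at the call site (drop the kwarg), not in the dataclass.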

🤖 Generated with Claude Code

amitpaz1 added 7 commits May 9, 2026 11:40
Schema migration only — no application-code changes yet. Application
side will follow in subsequent commits.

Both columns carried mechanical defaults (importance_score=1.0,
confidence=0.5 for observations) and were no-ops in recall scoring.
Industry consensus (mem0, Letta, LangMem, Zep, Cognee) is to skip
per-memory quality scores entirely. Lore's existing cosine + FTS
hybrid recall handles ranking without them.

Indexes are dropped before columns to satisfy SQLite's DROP COLUMN
restriction. The PG on-access trigger (if present) is also dropped.

Out of scope (intentionally untouched): per-relationship confidence
on graph entities/mentions/relationships.
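
The index-before-column ordering can be reproduced with Python's built-in sqlite3 module. This is a sketch under assumptions: the table is a cut-down, hypothetical version of memories, the index name `idx_memories_importance` is invented for the example, and the bundled SQLite must be 3.35 or newer for `ALTER TABLE ... DROP COLUMN` to exist at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical minimal schema; the real memories table has many more columns.
conn.executescript(
    """
    CREATE TABLE memories (
        id INTEGER PRIMARY KEY,
        content TEXT NOT NULL,
        importance_score REAL DEFAULT 1.0,
        confidence REAL DEFAULT 0.5
    );
    CREATE INDEX idx_memories_importance ON memories (importance_score);
    INSERT INTO memories (content) VALUES ('first'), ('second');
    """
)

# SQLite rejects DROP COLUMN while the column is still indexed.
try:
    conn.execute("ALTER TABLE memories DROP COLUMN importance_score")
    dropped_while_indexed = True
except sqlite3.OperationalError:
    dropped_while_indexed = False

# Index first, then the columns.
conn.execute("DROP INDEX IF EXISTS idx_memories_importance")
conn.execute("ALTER TABLE memories DROP COLUMN importance_score")
conn.execute("ALTER TABLE memories DROP COLUMN confidence")

# PRAGMA table_info returns one row per column; index 1 is the column name.
remaining = [row[1] for row in conn.execute("PRAGMA table_info(memories)")]
print(dropped_while_indexed, remaining)
```

Existing rows survive the column drops; only the two columns disappear.
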

Removed from:
* NewMemory.confidence (default 0.5)
* NewMemory.importance_score (Optional[float])
* StoredMemory.confidence
* StoredMemory.importance_score
* MemoryPatch.confidence
* ExportedMemory.confidence

Persistence implementations (sqlite/postgres) and all callers will
follow in subsequent commits — until they're updated, those modules
will fail to import (unused-attribute access on the dataclasses).

NewMention.confidence and StoredMention.confidence (graph relationship
quality) are intentionally untouched — separate column on a separate
table.

* INSERT statements no longer write the columns
* SELECT lists drop them everywhere they appeared
* _row_to_memory and _row_to_exported stop populating the dropped fields
* recall_by_embedding score formula drops the importance multiplier
  (was a no-op since every row had identical 1.0)
* bump_access_counts and record_memory_access UPDATE drops the
  importance_score recomputation
* list_candidate_memories_for_recommendation now orders by created_at DESC
  (importance_score column is gone)
* search_memories_text orders by created_at DESC
* GraphStats loses the avg_importance field — also removed from
  the dataclass in types.py
* AVG(importance_score) aggregate queries removed from get_graph_stats
* _MEMORY_COLS constant trimmed
* ScoredMemory mapping no longer copies the dropped fields

Graph-table per-mention confidence (entity_mentions.confidence) is
unaffected — different table, different concept.
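
Why dropping the multiplier is safe for ranking: multiplying every score by the same constant 1.0 cannot reorder results. The sketch below uses an assumed score shape (similarity times recency decay) and made-up numbers; the actual formula in recall_by_embedding may have more terms.

```python
# Hypothetical (memory id, cosine similarity, recency decay) triples.
candidates = [("a", 0.91, 0.50), ("b", 0.72, 0.95), ("c", 0.88, 0.80)]
IMPORTANCE = 1.0  # the value the commit message says every row carried


def rank(with_importance):
    """Return memory ids ordered by score, best first."""
    def score(item):
        _, similarity, decay = item
        base = similarity * decay
        return base * IMPORTANCE if with_importance else base

    return [item[0] for item in sorted(candidates, key=score, reverse=True)]


old_order = rank(with_importance=True)
new_order = rank(with_importance=False)
print(old_order == new_order)
```
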
…tence

* postgres.py: column lists, RETURNING clauses, recall scoring,
  bump_access_counts UPDATE, AVG aggregates, search/recall/timeline
  queries all stop carrying the dropped columns. Recall score formula
  drops the importance multiplier (was a no-op).
* protocol.py: import_extracted_memory and upsert_memory_with_embedding
  signatures lose the confidence parameter; bump_access_counts docstring
  updated.
* sqlite.py: matching shape changes for upsert_memory_with_embedding
  and import_extracted_memory.
* GraphStats.avg_importance removed in types.py; PG and SQLite
  graph-stats methods stop computing it.
* list_candidate_memories_for_recommendation now orders by created_at
  DESC (importance ordering becomes meaningless once the column is gone).

The lone remaining `m.confidence` reference in postgres.py (entity_mentions
INSERT) is the per-relationship graph confidence on a separate table —
intentional, out of scope.
…re.py

* Delete src/lore/importance.py (compute_importance, time_adjusted_importance
  no longer needed — importance_score column gone)
* New src/lore/decay.py with decay_factor + resolve_half_life
  (the two pure functions still needed for recall scoring)
* lore.py: replace time_adjusted_importance with new _memory_decay helper
  (recency decay only, no importance multiplier)
* lore.py: upvote/downvote stop recomputing importance_score
* lore.py: cleanup_expired drops the importance-threshold filter and
  uses decay-only thresholding
* lore.py: recalculate_importance method removed (no column to recompute)
* Delete tests/test_importance_scoring.py (whole file was about
  the dropped functions)
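
A rough idea of what decay-only scoring could look like, for readers who have not opened decay.py. Everything here is assumed for illustration: the signatures, the exponential half-life form, and the tier names and half-life values are not taken from the source, and `memory_decay` is only a stand-in for the `_memory_decay` helper.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical per-tier half-lives in days; the real values are not shown here.
DEFAULT_HALF_LIVES = {"short": 7.0, "long": 90.0}


def resolve_half_life(tier, overrides=None):
    """Pick the half-life for a tier, letting caller overrides win."""
    table = {**DEFAULT_HALF_LIVES, **(overrides or {})}
    return table.get(tier, DEFAULT_HALF_LIVES["long"])


def decay_factor(age_days, half_life_days):
    """Exponential decay: 1.0 at age 0, 0.5 after one half-life."""
    if half_life_days <= 0:
        raise ValueError("half_life_days must be positive")
    return 0.5 ** (max(age_days, 0.0) / half_life_days)


def memory_decay(created_at, tier, now=None):
    """Recency decay only; no importance multiplier is involved."""
    now = now or datetime.now(timezone.utc)
    age_days = (now - created_at).total_seconds() / 86400.0
    return decay_factor(age_days, resolve_half_life(tier))


# Usage: a memory exactly one half-life old.
now = datetime(2026, 5, 9, tzinfo=timezone.utc)
week_old = memory_decay(now - timedelta(days=7), "short", now=now)
print(week_old)
```
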
…sons services

* observations.py: drop the hardcoded confidence=0.5 / importance_score=0.5
  from create_observation
* memories.py: drop confidence parameter from create_memory and update_memory
* lessons.py: drop confidence param from create() and update();
  rename min_confidence to min_score in search() (it was always a
  min-final-score threshold, despite the name); strip
  importance_score / confidence from search response, record_access
  response, and bulk-upsert path

Service-layer callers (HTTP routes, MCP, CLI, tests) will be updated
in subsequent commits; until they catch up they'll fail with
TypeError on the dropped kwargs.
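
The min_score rename is easier to reason about with a small example of thresholding on the final score. This is a sketch only: `ScoredHit` and `filter_by_min_score` are hypothetical names, not the shapes used in lessons.py.

```python
from dataclasses import dataclass


@dataclass
class ScoredHit:
    # Hypothetical result shape: an identifier plus the final ranking score.
    memory_id: str
    score: float


def filter_by_min_score(hits, min_score=0.0):
    """Keep hits whose final score clears the threshold, best first."""
    kept = [hit for hit in hits if hit.score >= min_score]
    return sorted(kept, key=lambda hit: hit.score, reverse=True)


hits = [ScoredHit("a", 0.42), ScoredHit("b", 0.81), ScoredHit("c", 0.65)]
kept_ids = [hit.memory_id for hit in filter_by_min_score(hits, min_score=0.6)]
print(kept_ids)
```

The threshold applies to the computed score of each hit, not to any stored per-memory column, which is why the old name was misleading.
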

Records what's already done (Phases 1-5a, 6 commits) and what's left
(Phases 5b-10) so a fresh session can resume cleanly. Includes the
"out of scope" boundaries (graph mention confidence, recommendation
confidence, freshness confidence, classification confidence) so the
executor doesn't over-reach.
amitpaz1 force-pushed the fix/drop-quality-score-columns branch from 54859be to 53c408f May 9, 2026 09:46
amitpaz1 changed the title from "refactor: drop importance_score and confidence from memories" to "WIP: drop importance_score and confidence columns from memories" May 9, 2026
amitpaz1 and others added 5 commits May 9, 2026 13:18
- services/conversations.py: drop confidence pass-through to import_extracted_memory
- services/snapshots.py: drop confidence=1.0 from NewMemory
- services/retrieve.py: drop importance multiplier (always 1.0 in data) and importance signal
- services/graph/review.py: drop source_importance from risk score; RiskScore loses source_reliability
- services/graph/graph.py: drop importance/confidence from GraphNode and SearchHit; drop min_importance from get_graph_data
- server/routes/graph/{models,memories,stats}.py: drop matching response fields and query params
- consolidation.py: replace importance-based fallback with most-recent (created_at); drop fields from new Memory
- conversation/extractor.py: stop forwarding candidate confidence to lore.remember
- ingest/dedup.py + store/http.py: rename min_confidence -> min_score for the search() RPC

LEAVE entity_mention/relationship confidence and Fact.confidence (graph concepts)
LEAVE recommend / freshness / classify confidence (separate concepts)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
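
The most-recent fallback mentioned for consolidation.py amounts to picking the newest memory in a group. A minimal sketch, assuming a simplified `Memory` shape and a hypothetical `pick_survivor` helper (neither is the real code):

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Memory:
    # Hypothetical, trimmed-down shape; only the fields the fallback needs.
    id: str
    created_at: datetime


def pick_survivor(group):
    """Fallback choice: the most recently created memory wins."""
    return max(group, key=lambda memory: memory.created_at)


older = Memory("older", datetime(2026, 4, 1, tzinfo=timezone.utc))
newer = Memory("newer", datetime(2026, 4, 20, tzinfo=timezone.utc))
survivor = pick_survivor([older, newer])
print(survivor.id)
```

With `max`, two memories sharing the same created_at resolve to whichever comes first in the input, so callers and tests that need a deterministic winner should use distinct timestamps or add an explicit tie-breaker.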
…odels, and routes

MCP (mcp/server.py):
  - drop importance: ... from recall and list_memories text formatters
  - update on_this_day docstring (was "sorted by importance")

HTTP layer (server/models.py and routes):
  - LessonCreate/Response/Export/Import: drop confidence field
  - LessonSearchRequest, MemorySearchRequest: rename min_confidence -> min_score
  - MemoryCreate/Response/Update: drop confidence field
  - routes/lessons.py: drop confidence/min_confidence/importance_score plumbing
  - routes/memories.py: drop confidence param, drop importance_score from access response, fix update validator
  - routes/temporal.py: drop confidence from MemoryResponse
  - routes/recent.py: drop importance_score field and detailed-format suffix
  - routes/retention.py: drop min_importance_score query param + AffectedMemory.importance_score
  - routes/review.py: drop source_reliability from RiskScore (matches services side)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… export, UI

Top-level:
  - lore.py: drop confidence param from remember(); rename min_confidence -> min_score
    in recall (delegates) and _recall_local (now filters on score, not memory column);
    rename importance_threshold -> decay_threshold (the column it referenced is gone);
    cleanup_expired is now decay-only.
  - async_lore.py: drop confidence from remember(); drop recalculate_importance() (no-op
    after the column was removed); cleanup_expired keyword renamed for parity; drop
    confidence from PromptFormatter Memory adapter.
  - types.py: drop importance_score, confidence from Memory dataclass; drop
    avg_importance / below_threshold_count from MemoryStats.
  - temporal.py: drop importance_score from on_this_day sort key and format output.
  - recent.py: drop importance_score from detailed/structured formatters.
  - retention.py: drop min_importance_score from RetentionPolicy; _find_expired now
    age-only; archive output drops importance_score; module docstring updated.
  - store/http.py: drop confidence from create payload, response parser, update path.

CLI:
  - cli/__init__.py: drop --confidence and --min-importance CLI args; drop --sort importance.
  - cli/commands/remember.py, manage.py, misc.py: drop matching arg plumbing and the
    importance column from list/json output.

Export:
  - export/serializers.py: drop importance_score / confidence from memory_to_dict + dict_to_memory.
  - export/markdown.py: drop the same fields from per-memory frontmatter.

UI source (note: dist/app.js is a build artifact and needs re-build):
  - ui/src/panels/detail.js: drop the Importance and Confidence rows.
  - ui/src/panels/stats.js: drop avg_importance row.
  - ui/src/panels/filters.js: drop the min-importance slider, unused debounce import.
  - ui/src/state.js: drop minImportance from filters / URL state / matcher.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… regression test, repair lessons view

Migration fix:
  - migrations/025_drop_quality_score_columns.sql: DROP VIEW lessons CASCADE
    before the ALTER TABLE so the rewrite-rule-bearing view is rebuilt; recreate
    the view (and INSERT/UPDATE/DELETE rules) without the dropped columns so any
    legacy /v1/lessons callers keep working.
  - migrations_sqlite/025_drop_quality_score_columns.sql: same fix for the
    SQLite read-only view; was failing migration on a fresh DB
    ("error in view lessons after drop column: no such column: importance_score").

Tests:
  - Drop importance_score / confidence args from StoredMemory / NewMemory /
    ScoredMemory / ExportedMemory / Memory factories across ~30 test files.
  - Drop tests that exercised behaviours we removed: confidence validation,
    recalculate_importance noop, source_reliability risk component,
    importance-only ordering of recommendation candidates, vote-factor recall
    boost (no longer in the multiplicative formula).
  - Update test_consolidation full pipeline to use distinct created_at so the
    most-recent-wins fallback is deterministic; bump both timestamps past the
    short-tier 7-day eligibility threshold.
  - Update test_temporal to assert year-DESC + created_at-DESC instead of
    importance ordering.
  - Rename min_confidence kwarg to min_score in HTTP / lessons-search tests.
  - Add tests/persistence/test_quality_columns_dropped.py regression test:
    asserts SqliteStore.open() yields a memories table without
    importance_score / confidence and that ``from lore import importance``
    raises ImportError.

Code:
  - Drop the now-unused `Optional` import flagged by ruff.

Verification:
  - PYTHONPATH=src pytest tests/ --ignore=tests/integration → 2706 passed,
    1177 skipped.
  - ruff check src/ tests/ → clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
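
The SQLite side of the view fix, plus the column check the regression test performs, can be sketched with the standard sqlite3 module. Assumptions: a cut-down, hypothetical memories table and lessons view, a hypothetical `migrate` helper standing in for the migration file, and SQLite 3.35 or newer. The real test goes through SqliteStore.open() and also checks the ImportError, neither of which is reproduced here.

```python
import sqlite3

DROPPED = {"importance_score", "confidence"}


def migrate(conn):
    """Drop the dependent view, drop the columns, then rebuild the view."""
    conn.execute("DROP VIEW IF EXISTS lessons")
    for column in sorted(DROPPED):
        conn.execute(f"ALTER TABLE memories DROP COLUMN {column}")
    conn.execute("CREATE VIEW lessons AS SELECT id, content FROM memories")


def memory_columns(conn):
    """Column names of the memories table, via PRAGMA table_info."""
    return {row[1] for row in conn.execute("PRAGMA table_info(memories)")}


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE memories (
        id INTEGER PRIMARY KEY,
        content TEXT,
        importance_score REAL DEFAULT 1.0,
        confidence REAL DEFAULT 0.5
    );
    CREATE VIEW lessons AS
        SELECT id, content, importance_score, confidence FROM memories;
    INSERT INTO memories (content) VALUES ('keep me');
    """
)

# Without dropping the view first, SQLite refuses the column drop.
try:
    conn.execute("ALTER TABLE memories DROP COLUMN importance_score")
    failed_with_view = False
except sqlite3.OperationalError:
    failed_with_view = True

migrate(conn)
leftover = memory_columns(conn) & DROPPED
print(failed_with_view, leftover)
```
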
…y + assertions

CI exposed two missed sites in tests/integration/test_remote.py (the local
suite skips this directory):

- _stored_lesson_row helper: drop the legacy ``"confidence": 0.8`` row field;
  the wire shape no longer includes it.
- test_full_flow_publish_query_verify: drop the ``data["confidence"] == 0.8``
  assertion against the GET /v1/lessons/{id} response.
- test_export_import_between_contexts: drop the ``confidence=0.8`` kwargs from
  both ExportedMemory constructors and the corresponding round-trip POST body.

Verified: PYTHONPATH=src pytest tests/ → 2787 passed, 1205 skipped.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
amitpaz1 merged commit 9a38753 into main May 9, 2026
6 checks passed