graph-population PR B: route wiring + backfill endpoint #51

Merged: amitpaz1 merged 1 commit into main from graph-extraction-route-wiring on May 8, 2026

amitpaz1 (Collaborator) commented on May 8, 2026

Summary

Wires PR A's extraction service into the create-time path and adds the HTTP backfill endpoint. After this PR and a reinstall, every new memory / observation triggers graph extraction as a fire-and-forget task; the 50+ pre-existing unenriched memories can be recovered via `POST /v1/graph/backfill` (or the `lore graph-backfill` CLI).

Wiring

  • `routes/memories.py` — after the existing `enrich_memory_async` task, fire `graph_svc.extract_and_persist` if `graph_svc.is_enabled()`. Auto-on iff `claude` is on PATH; explicit override via `LORE_GRAPH_EXTRACTION_ENABLED`.
  • `routes/observations.py` — same hook. Closes the silent gap from the audit: observations were the bulk-write tier the dream subagent saves through, so without this the graph stayed empty even with PR A landed.
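
The gating and fire-and-forget behaviour described above can be sketched roughly as follows. This is an illustration under assumptions, not the code in this PR: `is_enabled`, `fire_and_forget`, and `create_memory` are hypothetical stand-ins, and the set of truthy spellings accepted for `LORE_GRAPH_EXTRACTION_ENABLED` is a guess.

```python
import asyncio
import os
import shutil


def is_enabled(env=None, which=shutil.which):
    """Hypothetical gate: an explicit env override wins, else auto-detect `claude`."""
    env = os.environ if env is None else env
    override = env.get("LORE_GRAPH_EXTRACTION_ENABLED")
    if override is not None:
        # Assumed truthy spellings; the real parser may accept others.
        return override.strip().lower() in {"1", "true", "yes", "on"}
    # No override: enabled only when the `claude` binary is on PATH.
    return which("claude") is not None


# Strong references so pending tasks are not garbage-collected mid-flight.
_background_tasks = set()


def fire_and_forget(coro):
    """Schedule a coroutine on the running loop without awaiting it."""
    task = asyncio.create_task(coro)
    _background_tasks.add(task)
    task.add_done_callback(_background_tasks.discard)
    return task


async def create_memory(memory_id, extract_and_persist, enabled):
    """Stand-in for a route handler: respond now, extract in the background."""
    if enabled:
        fire_and_forget(extract_and_persist(memory_id))
    return {"id": memory_id}
```

Because the task is only scheduled, the handler's response does not wait on extraction, and a disabled service costs nothing on the create path.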

New route — `POST /v1/graph/backfill`

  • Body: `{limit?: 1-100, force?: bool, project?: str}`.
  • Default: walks `list_memories_without_mentions` and runs extraction.
  • `force=true`: pulls from `list_memories`, re-extracts memories that already have mentions (use after prompt/model revision).
  • Returns `{processed, failed, results, enabled}`. `enabled=false` when extraction is disabled so callers don't silently see `processed=0` and assume success.
  • Auth: writer/admin.
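
A minimal sketch of the handler logic implied by the bullets above, assuming an injected `store` and `extract` callable. The function name, the default `limit`, and the per-result dictionary shape are illustrative assumptions; request validation, auth, and the concurrent fan-out are left out.

```python
import asyncio


async def backfill(store, extract, enabled, limit=20, force=False, project=None):
    """Hypothetical handler body; `store` and `extract` are injected stand-ins."""
    if not enabled:
        # Report the disabled state explicitly instead of an ambiguous processed=0.
        return {"processed": 0, "failed": 0, "results": [], "enabled": False}

    # force=True re-extracts memories that already have mentions.
    source = store.list_memories if force else store.list_memories_without_mentions
    memories = await source(project=project, limit=limit)

    results = []
    for memory_id in memories:
        try:
            await extract(memory_id)
            results.append({"memory_id": memory_id, "ok": True})
        except Exception as exc:  # one bad memory must not abort the page
            results.append({"memory_id": memory_id, "ok": False, "error": str(exc)})

    failed = sum(1 for r in results if not r["ok"])
    return {
        "processed": len(results) - failed,
        "failed": failed,
        "results": results,
        "enabled": True,
    }
```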

CLI — `lore graph-backfill`

  • Replaces the dead SDK-side path (which returned 0 silently in HTTP-store mode).
  • POSTs to `/v1/graph/backfill` via `HttpStore._request`, drains pages until `processed+failed == 0` (with a 50-page safety cap).
  • Prints a clear "extraction disabled" message when the server reports `enabled=false` instead of exiting 0 silently.
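
The page-draining loop could look roughly like the sketch below. `drain_backfill` and its `post(path, body)` callable are hypothetical; `post` stands in for `HttpStore._request`, whose real signature is not shown in this PR description.

```python
def drain_backfill(post, limit=20, force=False, project=None, max_pages=50):
    """Hypothetical CLI loop; `post(path, body)` returns the decoded JSON response."""
    body = {"limit": limit, "force": force}
    if project is not None:
        body["project"] = project

    total_processed = total_failed = 0
    for _ in range(max_pages):  # safety cap against a queue that never drains
        page = post("/v1/graph/backfill", body)
        if not page.get("enabled", True):
            # Surface the disabled state so the caller can print a clear message.
            return {"enabled": False, "processed": total_processed, "failed": total_failed}
        if page["processed"] + page["failed"] == 0:
            break  # nothing left to backfill
        total_processed += page["processed"]
        total_failed += page["failed"]
    return {"enabled": True, "processed": total_processed, "failed": total_failed}
```

Returning the `enabled` flag alongside the totals lets the command choose between a progress summary and the "extraction disabled" message.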

Test plan

  • Manual end-to-end (PR A install): ran extraction on 8 unenriched memories live → 12 entities + 28 entity_mentions + 13 relationships materialized; cross-memory dedup works (PR #50, "graph-population PR A: extraction service + store ops", referenced from 5+ memories, single entity row).
  • `ruff check src/ tests/` — clean
  • 12 new tests in `tests/test_graph_extraction_wiring.py`: create-time fire-and-forget on both routes (enabled + disabled), backfill happy path, no-unenriched edge case, `force=true` switch to `list_memories`, extraction-error → failed count, CLI body shape, CLI page draining, CLI disabled-message branch.
  • Existing `tests/test_enrichment_memories.py` fixture now sets `LORE_GRAPH_EXTRACTION_ENABLED=false` so its existing `assert_called_once` counts stay deterministic regardless of `claude` on PATH.
  • `pytest tests/ --ignore=test_http_store_integration` — 2815 passed (2803 baseline + 12 new), 0 failed.
  • CI green

🤖 Generated with Claude Code

Wires PR A's extraction service into the create-time path and adds the
HTTP backfill endpoint. After this PR + reinstall, every new memory /
observation triggers graph extraction as a fire-and-forget task; the
50+ pre-existing unenriched memories are recovered via
`POST /v1/graph/backfill` (or `lore graph-backfill` on the CLI).

Wiring:
  * routes/memories.py: after the existing enrich_memory_async task,
    fire `graph_svc.extract_and_persist` if `graph_svc.is_enabled()`.
    Auto-on iff `claude` is on PATH; explicit override via
    LORE_GRAPH_EXTRACTION_ENABLED.
  * routes/observations.py: same hook. Closes the silent gap from the
    audit — observations were the bulk-write tier the dream subagent
    saves through, so without this the graph stayed empty even with
    PR A landed.

New route — POST /v1/graph/backfill:
  * routes/graph_backfill.py mounted on top of the existing
    routes/graph package (which lives at /v1/ui).
  * Body: {limit?: 1-100, force?: bool, project?: str}.
  * Default: walks list_memories_without_mentions(org_id, project,
    limit) and runs extraction on each.
  * force=true: pulls from list_memories instead, re-extracts even
    memories that already have mentions (use after a prompt/model
    revision).
  * Concurrency: asyncio.gather over the page; the extraction
    service's semaphore caps actual subprocess fan-out at
    LORE_GRAPH_EXTRACTION_CONCURRENCY (default 2).
  * Returns processed/failed counts + per-memory result detail so the
    CLI can drain pages and report progress.
  * Response carries `enabled: false` (with empty results) when
    extraction is disabled, so callers don't silently see processed=0
    and assume the run succeeded.
  * Auth: writer/admin (require_role).
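
The concurrency bullet above (gather over the page, with a semaphore capping fan-out) can be illustrated with the sketch below. `make_semaphore` and `extract_page` are hypothetical names, and `return_exceptions=True` is an assumption about how failures are collected; the real service owns its own semaphore.

```python
import asyncio
import os


def make_semaphore(env=None):
    """Build the cap from LORE_GRAPH_EXTRACTION_CONCURRENCY (default 2)."""
    env = os.environ if env is None else env
    return asyncio.Semaphore(int(env.get("LORE_GRAPH_EXTRACTION_CONCURRENCY", "2")))


async def extract_page(memory_ids, extract_one, semaphore):
    """Start every extraction at once; the semaphore limits how many run together."""

    async def guarded(memory_id):
        async with semaphore:
            return await extract_one(memory_id)

    # return_exceptions=True keeps one failure from cancelling the rest of the page.
    return await asyncio.gather(
        *(guarded(m) for m in memory_ids), return_exceptions=True
    )
```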

CLI — `lore graph-backfill` (cli/commands/graph.py):
  * Replaces the dead SDK-side path (`Lore.graph_backfill` returns 0
    in HTTP-store mode because `_knowledge_graph_enabled` is only
    initialized for local-Sqlite SDK construction).
  * POSTs /v1/graph/backfill via HttpStore._request, drains pages
    until processed+failed == 0 (or 50-page safety cap).
  * Prints a clear "extraction disabled" message when the server
    reports enabled=false instead of silently exiting 0.

Test plan:
  * 12 new tests in tests/test_graph_extraction_wiring.py covering
    create-time fire-and-forget on both routes (enabled + disabled),
    backfill happy path, no-unenriched edge case, force=true switch
    to list_memories, extraction-error → failed count, and the CLI's
    HTTP body shape, page draining, and disabled-message branch.
  * Existing tests/test_enrichment_memories.py fixture now sets
    LORE_GRAPH_EXTRACTION_ENABLED=false so the asserting-create_task
    counts stay deterministic regardless of whether `claude` is on
    PATH locally.
  * Manual: ran extract_and_persist live against 8 unenriched
    memories on the editable install — 12 entities + 28
    entity_mentions + 13 relationships materialized cleanly.
  * pytest: 2815 passed locally (2803 baseline + 12 new), 0 failed.
  * ruff: clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

amitpaz1 merged commit 9f64e7f into main on May 8, 2026
6 checks passed