fix(summary): exclude duplicate edges from node summary generation #1223

Merged
prasmussen15 merged 2 commits into main from fix/exclude-duplicate-edges-from-summary on Feb 12, 2026
@prasmussen15
Collaborator

Summary

  • Fix a bug where extracted edges that matched existing graph edges were still included in node summary generation, causing facts to be duplicated in summaries
  • Update resolve_extracted_edges to return a third value: new_edges (edges that are new to the graph, not duplicates)
  • Only pass new_edges to extract_attributes_from_nodes for summary generation in add_episode()

Details

When an extracted edge is resolved against existing edges in the graph, if it matches an existing edge (duplicate), the resolved edge takes on the UUID of the existing edge. Previously, all resolved edges were passed to summary generation, causing duplicate facts.

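Now we track which edges are "new" by checking whether resolved_edge.uuid == extracted_edge.uuid. A duplicate takes on the existing edge's UUID, so matching UUIDs mean the extracted edge did not resolve to anything already in the graph. Only new edges (non-duplicates) are passed to the summary generation flow.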

Test plan

  • Updated existing tests to handle new return value
  • Added assertions to verify new_edges behavior
  • Linting passes
  • Type checking passes

🤖 Generated with Claude Code

When resolving extracted edges, edges that match existing edges in the
graph were still being passed to node summary generation, causing facts
to be duplicated in summaries.

Changes:
- Update resolve_extracted_edges to return new_edges (non-duplicates)
- Update _extract_and_resolve_edges to pass through new_edges
- Pass only new_edges to extract_attributes_from_nodes in add_episode
- An edge is considered "new" if its resolved UUID matches its extracted UUID

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Comment on lines 711 to +712
invalidated_edges.extend(result[1])
# result[2] is new_edges - not used in bulk flow since attributes
Contributor

The comment says "not used in bulk flow since attributes are extracted before edge resolution", but this means the bug being fixed in add_episode (duplicate facts in summaries) could still occur in the bulk flow.

If _resolve_nodes_and_edges_bulk extracts attributes before edge resolution, and those attributes include summaries based on edges, wouldn't the same duplication problem exist? The bulk flow appears to call extract_attributes_from_nodes with all edges including potential duplicates.

Consider clarifying in the PR description whether this is a known limitation, or investigate whether the bulk flow actually has this issue.

for extracted_edge, result in zip(extracted_edges, results, strict=True):
resolved_edge = result[0]
invalidated_edge_chunk = result[1]
# result[2] is duplicate_edges list
Contributor

This comment is misleading. result[2] is not a "duplicate_edges list" - looking at the resolve_extracted_edge function signature/return type would clarify what this actually is. The third element appears to be something else based on the tuple type annotation at line 433.

Consider either removing this comment or verifying what result[2] actually contains and documenting it accurately.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
prasmussen15 merged commit fe19482 into main on Feb 12, 2026
10 checks passed
prasmussen15 deleted the fix/exclude-duplicate-edges-from-summary branch on February 12, 2026 00:26
getzep locked and limited conversation to collaborators on Feb 12, 2026