test: add event filtering and non-code file tests for realtime updater by vitali87 · Pull Request #485 · vitali87/code-graph-rag

vitali87 · 2026-03-23T19:16:10Z

Summary

Follow-up to PR #405 (realtime updater fixes by @bhargavchippada). Adds 13 tests covering the bug fixes and new functionality.

Tests added (13 total)

Event filtering (6):

MODIFIED, CREATED, DELETED events trigger graph updates
"opened", "closed_no_write", "access" events are ignored

Non-code file handling (4):

.md and .json files create File nodes
Deleting non-code file removes File node
Non-code files don't create Module nodes

Mixed sequences (2):

Rapid create/modify/delete on same file
Multiple files changed simultaneously

Query isolation (1):

DELETE_FILE only targets specific path

Test plan

13 tests pass
Lint and format clean
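
The event-filtering behavior listed above can be pictured with a small stand-alone sketch. It does not use the project's handler or the watchdog library: `FakeEvent`, `RELEVANT_EVENT_TYPES`, and `should_update_graph` are hypothetical names, and the event-type strings are simply the ones named in the test list.

```python
from dataclasses import dataclass


# Hypothetical stand-in for a file system event; not the watchdog class.
@dataclass(frozen=True)
class FakeEvent:
    event_type: str
    src_path: str


# Assumption: only these three event types should trigger a graph update.
RELEVANT_EVENT_TYPES = frozenset({"modified", "created", "deleted"})


def should_update_graph(event: FakeEvent) -> bool:
    """Return True only for events whose type is in the relevant set."""
    return event.event_type in RELEVANT_EVENT_TYPES


events = [
    FakeEvent("modified", "app.py"),
    FakeEvent("opened", "app.py"),
    FakeEvent("closed_no_write", "app.py"),
    FakeEvent("access", "app.py"),
    FakeEvent("deleted", "gone.py"),
]

# Keep only the events that would reach the graph update path.
processed = [e.event_type for e in events if should_update_graph(e)]
```

In this sketch the ignored events never reach any write logic, which is the property the six event-filtering tests are meant to pin down.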

gemini-code-assist · 2026-03-23T19:16:25Z

Summary of Changes

Hello, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the test coverage for the realtime updater component. It introduces a suite of new tests to validate the updater's behavior across various file system events, ensuring that only relevant changes trigger graph updates and that non-code files are processed correctly without generating unnecessary module nodes. The new tests also cover complex scenarios involving rapid file changes and the precision of delete operations, contributing to a more robust and reliable codebase graph.

Highlights

Event Filtering Validation: Ensured that MODIFIED, CREATED, and DELETED file system events correctly trigger graph updates, while opened, closed_no_write, and access events are appropriately ignored by the realtime updater.
Non-Code File Processing: Verified that non-code files like Markdown and JSON are correctly handled, creating File nodes but not Module nodes, and that their deletion properly removes corresponding File nodes.
Mixed Event Sequence Handling: Added tests to confirm the updater's robustness when handling rapid sequences of create/modify/delete operations on the same file and when multiple files are changed simultaneously.
Delete Query Specificity: Confirmed that the DELETE_FILE Cypher query specifically targets the intended file path, preventing unintended deletions.

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

vitali87 · 2026-03-23T19:16:25Z

@greptile

vitali87 · 2026-03-23T19:16:26Z

/gemini review

gemini-code-assist

Code Review

This pull request adds a comprehensive suite of tests for the real-time updater, covering event filtering, non-code file handling, and various event sequences. The tests are well-structured and cover the intended functionality thoroughly. My feedback focuses on making some of the assertions more specific to improve test robustness and clarity, ensuring that not just the number of database operations is correct, but also that the right queries are being executed with the correct parameters.

gemini-code-assist · 2026-03-23T19:17:59Z

codebase_rag/tests/test_realtime_event_filtering.py

+        f = temp_repo / "app.py"
+        f.write_text("x = 1", encoding="utf-8")
+        handler.dispatch(FileModifiedEvent(str(f)))
+        assert mock_updater.ingestor.execute_write.call_count == 3


The current assertion only checks the total number of write operations. To make this test more robust and descriptive, it's better to assert which specific Cypher queries were executed and with what parameters. This ensures that not only the count is correct, but also that the right operations are being performed on the correct file.

Suggested change

-        assert mock_updater.ingestor.execute_write.call_count == 3
+        calls = mock_updater.ingestor.execute_write.call_args_list
+        assert len(calls) == 3
+        assert calls[0].args[0] == cs.CYPHER_DELETE_MODULE
+        assert calls[0].args[1] == {cs.KEY_PATH: "app.py"}
+        assert calls[1].args[0] == cs.CYPHER_DELETE_FILE
+        assert calls[1].args[1] == {cs.KEY_PATH: "app.py"}
+        assert calls[2].args[0] == cs.CYPHER_DELETE_CALLS
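
For readers unfamiliar with `call_args_list`, here is a minimal stand-alone sketch of the inspection this suggestion relies on. The query constants are hypothetical placeholders for the project's `cs.*` values, and the sketch assumes the third write is issued without a parameter dict, which the thread does not confirm.

```python
from unittest.mock import MagicMock

# Hypothetical placeholders for the project's constants; the real query text differs.
CYPHER_DELETE_MODULE = "DELETE_MODULE_QUERY"
CYPHER_DELETE_FILE = "DELETE_FILE_QUERY"
CYPHER_DELETE_CALLS = "DELETE_CALLS_QUERY"
KEY_PATH = "path"

ingestor = MagicMock()

# Simulate the three writes a single file event is expected to issue.
ingestor.execute_write(CYPHER_DELETE_MODULE, {KEY_PATH: "app.py"})
ingestor.execute_write(CYPHER_DELETE_FILE, {KEY_PATH: "app.py"})
ingestor.execute_write(CYPHER_DELETE_CALLS)  # assumption: no parameters here

# Each recorded call exposes its positional arguments via `.args`.
calls = ingestor.execute_write.call_args_list
queries = [c.args[0] for c in calls]
params = [c.args[1] if len(c.args) > 1 else None for c in calls]
```

Comparing `queries` and `params` against expected lists checks order, identity, and parameters in one place.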

gemini-code-assist · 2026-03-23T19:17:59Z

codebase_rag/tests/test_realtime_event_filtering.py

+        f = temp_repo / "new.py"
+        f.write_text("y = 2", encoding="utf-8")
+        handler.dispatch(FileCreatedEvent(str(f)))
+        assert mock_updater.ingestor.execute_write.call_count == 3


Similar to the modified event test, asserting the specific queries and their parameters will make this test more robust. It verifies that the correct cleanup and reprocessing steps are being triggered for a file creation event.

Suggested change

-        assert mock_updater.ingestor.execute_write.call_count == 3
+        calls = mock_updater.ingestor.execute_write.call_args_list
+        assert len(calls) == 3
+        assert calls[0].args[0] == cs.CYPHER_DELETE_MODULE
+        assert calls[0].args[1] == {cs.KEY_PATH: "new.py"}
+        assert calls[1].args[0] == cs.CYPHER_DELETE_FILE
+        assert calls[1].args[1] == {cs.KEY_PATH: "new.py"}
+        assert calls[2].args[0] == cs.CYPHER_DELETE_CALLS

gemini-code-assist · 2026-03-23T19:17:59Z

codebase_rag/tests/test_realtime_event_filtering.py

+    ) -> None:
+        f = temp_repo / "gone.py"
+        handler.dispatch(FileDeletedEvent(str(f)))
+        assert mock_updater.ingestor.execute_write.call_count == 3


To improve the test's specificity, it's better to assert the exact queries being called upon file deletion. This confirms that the correct DELETE operations are performed against the graph for the specified file.

Suggested change

-        assert mock_updater.ingestor.execute_write.call_count == 3
+        calls = mock_updater.ingestor.execute_write.call_args_list
+        assert len(calls) == 3
+        assert calls[0].args[0] == cs.CYPHER_DELETE_MODULE
+        assert calls[0].args[1] == {cs.KEY_PATH: "gone.py"}
+        assert calls[1].args[0] == cs.CYPHER_DELETE_FILE
+        assert calls[1].args[1] == {cs.KEY_PATH: "gone.py"}
+        assert calls[2].args[0] == cs.CYPHER_DELETE_CALLS

gemini-code-assist · 2026-03-23T19:17:59Z

codebase_rag/tests/test_realtime_event_filtering.py

+        # (H) After delete, no re-parse or file node creation
+        mock_updater.factory.definition_processor.process_file.assert_not_called()
+        mock_updater.factory.structure_processor.process_generic_file.assert_not_called()
+        assert mock_updater.ingestor.execute_write.call_count == 3


The assertion for the final delete event can be made more specific. By checking the exact queries and parameters, we can be more confident that the deletion part of this sequence behaves as expected, cleaning up the correct entities from the graph.

Suggested change

-        assert mock_updater.ingestor.execute_write.call_count == 3
+        calls = mock_updater.ingestor.execute_write.call_args_list
+        assert len(calls) == 3
+        assert calls[0].args[0] == cs.CYPHER_DELETE_MODULE
+        assert calls[0].args[1] == {cs.KEY_PATH: "ephemeral.py"}
+        assert calls[1].args[0] == cs.CYPHER_DELETE_FILE
+        assert calls[1].args[1] == {cs.KEY_PATH: "ephemeral.py"}
+        assert calls[2].args[0] == cs.CYPHER_DELETE_CALLS

gemini-code-assist · 2026-03-23T19:17:59Z

codebase_rag/tests/test_realtime_event_filtering.py

+        handler.dispatch(FileModifiedEvent(str(f1)))
+        handler.dispatch(FileModifiedEvent(str(f2)))
+
+        assert mock_updater.ingestor.execute_write.call_count == 6


Instead of just checking the total number of write calls, it would be more robust to verify the count of each type of query. This ensures that the correct set of operations is performed for each file modification, even when events happen close together. Using collections.Counter can make this check clean and effective.

Suggested change

-        assert mock_updater.ingestor.execute_write.call_count == 6
+        from collections import Counter
+        queries_called = [
+            c.args[0] for c in mock_updater.ingestor.execute_write.call_args_list
+        ]
+        counts = Counter(queries_called)
+        assert len(queries_called) == 6
+        assert counts[cs.CYPHER_DELETE_MODULE] == 2
+        assert counts[cs.CYPHER_DELETE_FILE] == 2
+        assert counts[cs.CYPHER_DELETE_CALLS] == 2
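
A self-contained sketch of the `Counter` approach follows, using a plain `MagicMock` in place of the real ingestor. The constants and the `simulate_modified` helper are hypothetical and only mimic the three-writes-per-file pattern described in this thread.

```python
from collections import Counter
from unittest.mock import MagicMock

# Hypothetical placeholders for the project's query constants.
CYPHER_DELETE_MODULE = "DELETE_MODULE_QUERY"
CYPHER_DELETE_FILE = "DELETE_FILE_QUERY"
CYPHER_DELETE_CALLS = "DELETE_CALLS_QUERY"

ingestor = MagicMock()


def simulate_modified(path: str) -> None:
    """Issue the three cleanup writes assumed for one modified file."""
    ingestor.execute_write(CYPHER_DELETE_MODULE, {"path": path})
    ingestor.execute_write(CYPHER_DELETE_FILE, {"path": path})
    ingestor.execute_write(CYPHER_DELETE_CALLS)


for changed in ("a.py", "b.py"):
    simulate_modified(changed)

# Tally how many times each query was issued, regardless of ordering.
counts = Counter(c.args[0] for c in ingestor.execute_write.call_args_list)
total_writes = sum(counts.values())
```

Because `Counter` ignores ordering, the per-query tallies remain stable even if the writes for the two files were interleaved.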

greptile-apps · 2026-03-23T19:20:25Z

Greptile Summary

This PR adds 13 tests that cover the event-filtering and non-code-file bug fixes from PR #405, and extends the orphan-node pruning logic to be project-scoped so that a multi-project graph database is not corrupted by false deletions.

Production changes

CYPHER_ALL_FILE_PATHS and CYPHER_ALL_FOLDER_PATHS now return absolute_path in addition to path, enabling the prune loop to confirm a node belongs to the current repo before deleting it.
CYPHER_ALL_MODULE_PATHS is renamed CYPHER_ALL_MODULE_PATHS_INTERNAL and gains a WHERE m.is_external IS NULL OR m.is_external = false guard, limiting pruning to modules ingested from local source code.
_prune_orphan_nodes now filters candidate orphans by absolute_path prefix (File/Folder) or qualified_name prefix (Module) before falling through to the path-existence check, preventing cross-project false deletions.

Test changes

test_graph_updater_pruning.py is rewritten to provide project-scoped fixture data that matches the new pruning signatures; a new test_prune_skips_other_projects test explicitly verifies that nodes from other projects are never deleted.
test_realtime_event_filtering.py is a new file with 13 tests covering MODIFIED / CREATED / DELETED event processing, ignored event types ("opened", "closed_no_write", "access"), non-code file handling (.md, .json), rapid multi-event sequences, and Cypher query isolation.

Notable design point: nodes stored without absolute_path (ingested before this change) bypass the absolute_path guard and remain subject to path-existence-only pruning. This is a known trade-off for backwards compatibility but means a first run after upgrade could still misidentify same-relative-path nodes from other projects as orphans.

Confidence Score: 4/5

Safe to merge; the production bugfix is well-covered and the one remaining comment is a non-blocking style improvement to test assertions.
The project-scoped pruning change is correct, tested, and addresses a real multi-project data-loss bug. Prior review threads already identified the two most important gaps (missing flush_all assertions on MODIFIED and DELETED test paths). The only new issue flagged here is the use of raw call-count assertions in TestEventFiltering instead of query-identity checks — a style improvement, not a correctness bug. All 13 new tests pass, lint is clean, and the refactored pruning tests align with the new fixture data.
Pay close attention to codebase_rag/tests/test_realtime_event_filtering.py (count-only assertions) and codebase_rag/graph_updater.py (backwards-compat note on nodes without absolute_path).

Important Files Changed

Filename	Overview
codebase_rag/constants.py	Added `absolute_path` to File/Folder queries and renamed `CYPHER_ALL_MODULE_PATHS` → `CYPHER_ALL_MODULE_PATHS_INTERNAL` with an `is_external` guard — clean, minimal, backward-compatible change.
codebase_rag/graph_updater.py	Project-scoped pruning logic added to `_prune_orphan_nodes`: File/Folder nodes are guarded by `absolute_path` prefix, Module nodes by `qualified_name` prefix. Nodes missing those fields fall through to path-existence check. Logic is correct but nodes without `absolute_path` (from older ingestion) remain subject to cross-repo path collisions.
codebase_rag/tests/test_graph_updater_pruning.py	Rewritten to use the new project-scoped pruning signatures. Tests for cross-project skip (`test_prune_skips_other_projects`) and multi-type orphan deletion are new and valuable. Explicit single-type File and Folder orphan tests were removed — coverage lives only in the combined `test_prune_multiple_orphans_across_types`.
codebase_rag/tests/test_realtime_event_filtering.py	13 new tests covering event filtering, non-code file handling, mixed sequences, and query isolation. Uses `autouse` monkeypatch to bypass the `QueryProtocol` runtime check. Some `TestEventFiltering` tests rely on raw `execute_write.call_count == 3` without verifying which 3 calls were made, making them slightly fragile.
realtime_updater.py	Cosmetic-only change: multi-line `execute_write` call condensed to a single line. No behavioral change.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[_prune_orphan_nodes called] --> B[Build project_prefix & repo_abs]
    B --> C{For each prune_spec\nFile / Module / Folder}
    C --> D[fetch_all rows from graph]
    D --> E{path is valid string?}
    E -- No --> F[skip row]
    E -- Yes --> G{abs_path present\nand is string?}
    G -- Yes, not this repo --> F
    G -- No / same repo --> H{qualified_name present\nnon-empty?}
    H -- Yes, different project --> F
    H -- No / same project --> I{path exists on disk?}
    I -- Yes --> F
    I -- No --> J[add to orphans list]
    J --> K[execute_write DELETE query\nfor each orphan]
    K --> C
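
The decision sequence in the flowchart can be sketched as a stand-alone function. This is an illustration under assumptions, not the code in `graph_updater.py`: `find_orphans`, the row dictionaries, and the plain string-prefix checks are all hypothetical simplifications.

```python
import tempfile
from pathlib import Path


def find_orphans(rows: list[dict], repo_path: Path, project_prefix: str) -> list[str]:
    """Return relative paths that belong to this project but are gone from disk."""
    repo_abs = str(repo_path.resolve())
    orphans: list[str] = []
    for row in rows:
        path = row.get("path")
        if not isinstance(path, str) or not path:
            continue  # invalid path: skip row
        abs_path = row.get("absolute_path")
        if isinstance(abs_path, str) and not abs_path.startswith(repo_abs):
            continue  # File/Folder node from another repo
        qn = row.get("qualified_name")
        if isinstance(qn, str) and qn and not qn.startswith(project_prefix):
            continue  # Module node from another project
        if (repo_path / path).exists():
            continue  # still on disk, not an orphan
        orphans.append(path)
    return orphans


with tempfile.TemporaryDirectory() as tmp:
    repo = Path(tmp) / "proj"
    repo.mkdir()
    (repo / "kept.py").write_text("x = 1", encoding="utf-8")
    repo_abs = str(repo.resolve())
    rows = [
        {"path": "kept.py", "absolute_path": f"{repo_abs}/kept.py"},
        {"path": "gone.py", "absolute_path": f"{repo_abs}/gone.py"},
        {"path": "gone.py", "absolute_path": "/elsewhere/other/gone.py"},
        {"path": "legacy.py"},  # older node without absolute_path
        {"path": "mod.py", "qualified_name": "other.mod"},
    ]
    orphans = find_orphans(rows, repo, "proj.")
```

The `legacy.py` row shows the backwards-compatibility trade-off noted above: without `absolute_path` it skips the repo guard and is judged by path existence alone.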

Comments Outside Diff (1)

codebase_rag/tests/test_realtime_event_filtering.py, lines 701-724

Count-only assertions obscure which writes occurred

TestEventFiltering tests assert execute_write.call_count == 3 but never verify which three queries were issued. Compare this to TestCypherDeleteFileQuery.test_delete_file_only_targets_specific_path, which explicitly inspects each call.

The count-based style has two problems:

A regression that swaps one query for another leaves the count at 3 and passes silently.
When the count changes (e.g. a new step is added to dispatch), the failure message assert 4 == 3 gives no indication of what changed.

test_deleted_event_is_processed is the clearest example — it already checks that two factories are not called, so checking the three specific writes (CYPHER_DELETE_MODULE, CYPHER_DELETE_FILE, CYPHER_DELETE_CALLS) would be consistent and complete:

write_queries = [c.args[0] for c in mock_updater.ingestor.execute_write.call_args_list]
assert write_queries == [cs.CYPHER_DELETE_MODULE, cs.CYPHER_DELETE_FILE, cs.CYPHER_DELETE_CALLS]

The same pattern applies to test_modified_event_is_processed (line 701) and test_multiple_files_changed (line 825).
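
The first problem above can be reproduced in isolation. In this sketch `buggy_dispatch` is a hypothetical regression that issues the file delete twice and drops the calls delete; the constants are placeholders rather than the project's real queries.

```python
from unittest.mock import MagicMock

# Hypothetical placeholders for the project's query constants.
DELETE_MODULE = "DELETE_MODULE_QUERY"
DELETE_FILE = "DELETE_FILE_QUERY"
DELETE_CALLS = "DELETE_CALLS_QUERY"
EXPECTED = [DELETE_MODULE, DELETE_FILE, DELETE_CALLS]


def buggy_dispatch(ingestor: MagicMock, path: str) -> None:
    """Simulated regression: DELETE_FILE runs twice, DELETE_CALLS never runs."""
    ingestor.execute_write(DELETE_MODULE, {"path": path})
    ingestor.execute_write(DELETE_FILE, {"path": path})
    ingestor.execute_write(DELETE_FILE, {"path": path})


ingestor = MagicMock()
buggy_dispatch(ingestor, "gone.py")

# The count-only check is satisfied even though the wrong queries ran.
count_check_passes = ingestor.execute_write.call_count == 3

# The identity check compares the actual query sequence and catches the swap.
write_queries = [c.args[0] for c in ingestor.execute_write.call_args_list]
identity_check_passes = write_queries == EXPECTED
```

When the list comparison fails under pytest, the assertion diff also shows which query differs, which addresses the second problem.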

Reviews (3): Last reviewed commit: "fix: remove unused CYPHER_ALL_MODULE_PAT..."

greptile-apps · 2026-03-23T19:20:28Z

codebase_rag/tests/test_realtime_event_filtering.py

+    def test_modified_event_is_processed(
+        self, handler: CodeChangeEventHandler, mock_updater: MagicMock, temp_repo: Path
+    ) -> None:
+        f = temp_repo / "app.py"
+        f.write_text("x = 1", encoding="utf-8")
+        handler.dispatch(FileModifiedEvent(str(f)))
+        assert mock_updater.ingestor.execute_write.call_count == 3


Missing flush_all assertion for MODIFIED event

The MODIFIED event path in realtime_updater.dispatch() calls self.updater.ingestor.flush_all() at Step 5, just like the CREATED path. However, unlike test_created_event_is_processed (line 62) and test_multiple_files_changed (line 183), this test never asserts that flush_all was actually invoked. This means a regression that skips the flush for MODIFIED events would pass this test undetected.

Suggested change

     def test_modified_event_is_processed(
         self, handler: CodeChangeEventHandler, mock_updater: MagicMock, temp_repo: Path
     ) -> None:
         f = temp_repo / "app.py"
         f.write_text("x = 1", encoding="utf-8")
         handler.dispatch(FileModifiedEvent(str(f)))
         assert mock_updater.ingestor.execute_write.call_count == 3
+        mock_updater.ingestor.flush_all.assert_called_once()
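
To illustrate what the extra assertion buys, here is a toy sketch with a deliberately simplified `dispatch`. The function and its `flush_on_modified` switch are hypothetical and exist only to simulate a regression; the real handler is not shown in this thread.

```python
from unittest.mock import MagicMock


def dispatch(updater: MagicMock, event_type: str, flush_on_modified: bool = True) -> None:
    """Hypothetical, simplified stand-in for the handler's dispatch."""
    if event_type not in {"modified", "created", "deleted"}:
        return
    updater.ingestor.execute_write("CLEANUP_QUERY", {"path": "app.py"})
    if event_type == "modified" and not flush_on_modified:
        return  # simulated regression: the flush is skipped
    updater.ingestor.flush_all()


# Healthy path: the flush assertion holds.
good = MagicMock()
dispatch(good, "modified")
good.ingestor.flush_all.assert_called_once()

# Regressed path: writes still happen, but the flush assertion now fails.
regressed = MagicMock()
dispatch(regressed, "modified", flush_on_modified=False)
try:
    regressed.ingestor.flush_all.assert_called_once()
    regression_detected = False
except AssertionError:
    regression_detected = True
```

A test that checks only `execute_write.call_count` would pass in both cases, so the flush assertion is what separates them.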

greptile-apps · 2026-03-23T19:20:29Z

codebase_rag/tests/test_realtime_event_filtering.py

+    def test_deleted_event_is_processed(
+        self, handler: CodeChangeEventHandler, mock_updater: MagicMock, temp_repo: Path
+    ) -> None:
+        f = temp_repo / "gone.py"
+        handler.dispatch(FileDeletedEvent(str(f)))
+        assert mock_updater.ingestor.execute_write.call_count == 3
+        mock_updater.factory.definition_processor.process_file.assert_not_called()
+        mock_updater.factory.structure_processor.process_generic_file.assert_not_called()


Missing flush_all assertion for DELETED event

realtime_updater.dispatch() calls flush_all unconditionally at Step 5 for all three relevant event types (MODIFIED, CREATED, DELETED). While test_rapid_create_modify_delete (line 169) happens to cover this transitively, no standalone DELETE test verifies it. A regression dropping the flush_all call for DELETE events would slip past this test.

Suggested change

     def test_deleted_event_is_processed(
         self, handler: CodeChangeEventHandler, mock_updater: MagicMock, temp_repo: Path
     ) -> None:
         f = temp_repo / "gone.py"
         handler.dispatch(FileDeletedEvent(str(f)))
         assert mock_updater.ingestor.execute_write.call_count == 3
+        mock_updater.ingestor.flush_all.assert_called_once()
         mock_updater.factory.definition_processor.process_file.assert_not_called()
         mock_updater.factory.structure_processor.process_generic_file.assert_not_called()

gemini-code-assist

Code Review

This pull request adds a comprehensive suite of tests for the realtime updater's event filtering and non-code file handling. The tests are well-structured and cover a good range of scenarios. My feedback focuses on improving the consistency and completeness of assertions across the new test file to make the test suite even more robust.

gemini-code-assist · 2026-03-23T19:20:36Z

codebase_rag/tests/test_realtime_event_filtering.py

+        f = temp_repo / "app.py"
+        f.write_text("x = 1", encoding="utf-8")
+        handler.dispatch(FileModifiedEvent(str(f)))
+        assert mock_updater.ingestor.execute_write.call_count == 3


For consistency with other tests and to ensure the full event handling flow is tested, please also assert that flush_all is called. This aligns with the principle of prioritizing consistency with existing codebase patterns, especially in testing, to ensure a uniform and robust test suite.

Suggested change

         assert mock_updater.ingestor.execute_write.call_count == 3
+        mock_updater.ingestor.flush_all.assert_called_once()

References

When multiple valid refactoring patterns exist (e.g., using a flag vs. early continue), prioritize consistency with the existing codebase and previous review decisions over adopting a new, equally valid pattern.

gemini-code-assist · 2026-03-23T19:20:36Z

codebase_rag/tests/test_realtime_event_filtering.py

+        handler.dispatch(FileDeletedEvent(str(f)))
+        assert mock_updater.ingestor.execute_write.call_count == 3
+        mock_updater.factory.definition_processor.process_file.assert_not_called()
+        mock_updater.factory.structure_processor.process_generic_file.assert_not_called()


To ensure the test is comprehensive and consistent with other deletion tests, please also assert that flush_all is called after the event is dispatched. This aligns with the principle of prioritizing consistency with existing codebase patterns, especially in testing, to ensure a uniform and robust test suite.

Suggested change

         mock_updater.factory.structure_processor.process_generic_file.assert_not_called()
+        mock_updater.ingestor.flush_all.assert_called_once()

gemini-code-assist · 2026-03-23T19:20:36Z

codebase_rag/tests/test_realtime_event_filtering.py

+        mock_updater.factory.structure_processor.process_generic_file.assert_called_once_with(
+            f, "readme.md"
+        )


This test is a bit minimal. For better coverage and consistency with other created event tests (like test_created_event_is_processed), please also assert the number of database writes and that flush_all is called. This aligns with the principle of prioritizing consistency with existing codebase patterns, especially in testing, to ensure a uniform and robust test suite.

         mock_updater.factory.structure_processor.process_generic_file.assert_called_once_with(
             f, "readme.md"
         )
+        assert mock_updater.ingestor.execute_write.call_count == 3
+        mock_updater.ingestor.flush_all.assert_called_once()

gemini-code-assist · 2026-03-23T19:20:36Z

codebase_rag/tests/test_realtime_event_filtering.py

+        mock_updater.factory.structure_processor.process_generic_file.assert_called_once_with(
+            f, "config.json"
+        )


Similar to the markdown test, this test could be more comprehensive. Please add assertions for the number of database writes and the call to flush_all to ensure the full event handling process is verified. This aligns with the principle of prioritizing consistency with existing codebase patterns, especially in testing, to ensure a uniform and robust test suite.

         mock_updater.factory.structure_processor.process_generic_file.assert_called_once_with(
             f, "config.json"
         )
+        assert mock_updater.ingestor.execute_write.call_count == 3
+        mock_updater.ingestor.flush_all.assert_called_once()

gemini-code-assist · 2026-03-23T19:20:36Z

codebase_rag/tests/test_realtime_event_filtering.py

+        delete_file_calls = [
+            c
+            for c in mock_updater.ingestor.execute_write.call_args_list
+            if c.args[0] == cs.CYPHER_DELETE_FILE
+        ]
+        assert len(delete_file_calls) == 1
+        assert delete_file_calls[0].args[1] == {
+            cs.KEY_PATH: "notes.md",
+        }
+        mock_updater.factory.structure_processor.process_generic_file.assert_not_called()


This test's assertions are not as comprehensive as other deletion tests in this file (e.g., test_delete_file_only_targets_specific_path). To improve consistency and test coverage, please verify all expected database operations for a file deletion event. This includes checking for CYPHER_DELETE_MODULE and CYPHER_DELETE_FILE calls with the correct path, the total number of writes, and the call to flush_all. This aligns with the principle of prioritizing consistency with existing codebase patterns, especially in testing, to ensure a uniform and robust test suite.

+        assert mock_updater.ingestor.execute_write.call_count == 3
+        mock_updater.ingestor.flush_all.assert_called_once()
         delete_file_calls = [
             c
             for c in mock_updater.ingestor.execute_write.call_args_list
             if c.args[0] == cs.CYPHER_DELETE_FILE
         ]
         assert len(delete_file_calls) == 1
         assert delete_file_calls[0].args[1] == {
             cs.KEY_PATH: "notes.md",
         }
+        delete_module_calls = [
+            c
+            for c in mock_updater.ingestor.execute_write.call_args_list
+            if c.args[0] == cs.CYPHER_DELETE_MODULE
+        ]
+        assert len(delete_module_calls) == 1
+        assert delete_module_calls[0].args[1] == {cs.KEY_PATH: "notes.md"}
         mock_updater.factory.structure_processor.process_generic_file.assert_not_called()

codecov-commenter · 2026-03-23T19:24:19Z

Codecov Report

❌ Patch coverage is 99.13420% with 2 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
codebase_rag/tests/test_graph_updater_pruning.py	97.80%	2 Missing ⚠️

…ules

vitali87 · 2026-03-23T22:51:49Z

@greptile

vitali87 · 2026-03-23T22:51:50Z

/gemini review

gemini-code-assist

Code Review

This pull request adds a comprehensive set of tests for the realtime updater, covering event filtering, non-code file handling, and various event sequences. The changes also refine the orphan node pruning logic to be project-specific by checking qualified names.

My main feedback is regarding the updated pruning logic in graph_updater.py. The change to only prune Module nodes has removed the capability to clean up orphan Folder nodes, which could lead to stale data in the graph. I've left a suggestion to reintroduce folder pruning with the necessary adjustments.

gemini-code-assist · 2026-03-23T22:53:45Z

codebase_rag/graph_updater.py

        prune_specs: list[tuple[str, str, str]] = [
-            (cs.CYPHER_ALL_FILE_PATHS, cs.CYPHER_DELETE_FILE, "File"),
-            (cs.CYPHER_ALL_MODULE_PATHS, cs.CYPHER_DELETE_MODULE, "Module"),
-            (cs.CYPHER_ALL_FOLDER_PATHS, cs.CYPHER_DELETE_FOLDER, "Folder"),
+            (
+                cs.CYPHER_ALL_MODULE_PATHS_INTERNAL,
+                cs.CYPHER_DELETE_MODULE,
+                "Module",
+            ),
        ]

        for query_all, delete_query, label in prune_specs:
            rows = self.ingestor.fetch_all(query_all)
-            orphans = [
-                r["path"]
-                for r in rows
-                if r.get("path")
-                and not (self.repo_path / r["path"]).exists()
-            ]
+            orphans = []
+            for r in rows:
+                path = r.get("path")
+                qn = r.get("qualified_name", "")
+                if not isinstance(path, str) or not path:
+                    continue
+                if isinstance(qn, str) and qn and not qn.startswith(project_prefix):
+                    continue
+                if not (self.repo_path / path).exists():
+                    orphans.append(path)


This change removes the pruning logic for orphan File and Folder nodes, restricting it to only Module nodes. While the hash-cache mechanism handles deleted files, orphan Folder nodes (e.g., from a deleted empty directory) will no longer be removed from the graph. This can lead to stale data accumulating over time.

Consider reintroducing pruning for Folder nodes. The new orphan detection logic, which checks for a project-prefixed qualified_name, is specific to modules, so you'll need to adapt it for folders.

        prune_specs: list[tuple[str, str, str]] = [
            (
                cs.CYPHER_ALL_MODULE_PATHS_INTERNAL,
                cs.CYPHER_DELETE_MODULE,
                "Module",
            ),
            (
                cs.CYPHER_ALL_FOLDER_PATHS,
                cs.CYPHER_DELETE_FOLDER,
                "Folder",
            ),
        ]

        for query_all, delete_query, label in prune_specs:
            rows = self.ingestor.fetch_all(query_all)
            orphans = []
            for r in rows:
                path = r.get("path")
                if not isinstance(path, str) or not path:
                    continue
                # The project prefix check is only applicable to Modules
                if label == "Module":
                    qn = r.get("qualified_name", "")
                    if isinstance(qn, str) and qn and not qn.startswith(project_prefix):
                        continue
                if not (self.repo_path / path).exists():
                    orphans.append(path)
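The orphan detection in the suggestion above can be tried in isolation. The following is a minimal sketch, not the project's implementation: `find_orphans`, `demo`, the sample rows, and the `"proj."` prefix are hypothetical names chosen for illustration, and plain dicts stand in for the rows returned by `fetch_all`.

```python
import tempfile
from pathlib import Path


def find_orphans(rows, repo_path, label, project_prefix):
    """Return row paths that no longer exist under repo_path."""
    orphans = []
    for row in rows:
        path = row.get("path")
        # Skip rows without a usable string path.
        if not isinstance(path, str) or not path:
            continue
        # Assumption: only Module rows carry a qualified_name to scope by project.
        if label == "Module":
            qn = row.get("qualified_name", "")
            if isinstance(qn, str) and qn and not qn.startswith(project_prefix):
                continue
        if not (repo_path / path).exists():
            orphans.append(path)
    return orphans


def demo():
    """Build a throwaway repo with one surviving file and collect orphans."""
    with tempfile.TemporaryDirectory() as tmp:
        repo = Path(tmp)
        (repo / "kept.py").write_text("x = 1\n")
        module_rows = [
            {"path": "kept.py", "qualified_name": "proj.kept"},   # still on disk
            {"path": "gone.py", "qualified_name": "proj.gone"},   # deleted file
            {"path": "ext.py", "qualified_name": "other.ext"},    # other project
            {"path": None, "qualified_name": "proj.broken"},      # no path
        ]
        folder_rows = [{"path": "old_dir"}]                       # deleted folder
        return (
            find_orphans(module_rows, repo, "Module", "proj."),
            find_orphans(folder_rows, repo, "Folder", "proj."),
        )
```

In this sketch the module belonging to another project is left alone even though its path is missing, while the deleted folder is still reported, which is the behavior the review comment asks to preserve.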

vitali87 · 2026-03-23T23:46:50Z

@greptile

vitali87 · 2026-03-23T23:46:51Z

/gemini review

gemini-code-assist

Code Review

This pull request introduces a comprehensive set of tests for the realtime updater and refines the orphan node pruning logic. The changes to make pruning project-specific are a solid improvement, and the new tests are well-structured, covering various event filtering scenarios, non-code file handling, and edge cases. I have one minor suggestion to remove an unused constant. Overall, this is a valuable contribution that enhances the robustness and test coverage of the codebase.
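The event filtering and non-code file behavior that these tests cover can be pictured with a small sketch. Everything here is an assumption made for illustration: `should_update`, `node_labels`, and the `CODE_SUFFIXES` set are hypothetical and are not taken from the repository. Only the event names and the File/Module distinction come from the PR description.

```python
from pathlib import PurePosixPath

# Event names come from the PR summary; this set is a hypothetical stand-in
# for whatever filter the realtime updater actually applies.
RELEVANT_EVENTS = frozenset({"modified", "created", "deleted"})

# Assumed, illustrative list of source-code extensions.
CODE_SUFFIXES = frozenset({".py", ".js", ".ts"})


def should_update(event_type: str) -> bool:
    """Return True only for events that change file contents or existence."""
    return event_type.lower() in RELEVANT_EVENTS


def node_labels(path: str) -> list[str]:
    """Every file gets a File node; only code files also get a Module node."""
    labels = ["File"]
    if PurePosixPath(path).suffix in CODE_SUFFIXES:
        labels.append("Module")
    return labels
```

Under these assumptions, `should_update("closed_no_write")` is false and `node_labels("notes.md")` yields only `File`, matching the cases listed in the PR summary.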

gemini-code-assist · 2026-03-23T23:48:06Z

codebase_rag/constants.py

+CYPHER_ALL_MODULE_PATHS = (
+    "MATCH (m:Module) RETURN m.path AS path, m.qualified_name AS qualified_name"
+)


This constant CYPHER_ALL_MODULE_PATHS appears to be unused after its usage in graph_updater.py was replaced with CYPHER_ALL_MODULE_PATHS_INTERNAL. To improve maintainability, it would be best to remove this dead code.

vitali87 · 2026-03-23T23:53:45Z

@greptile

sonarqubecloud · 2026-03-23T23:58:33Z

Quality Gate passed

Issues
1 New issue
0 Accepted issues

Measures
0 Security Hotspots
100.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

test: add event filtering and non-code file tests for realtime updater

bc89241

github-project-automation bot added this to @vitali87's graph code Mar 23, 2026

gemini-code-assist bot reviewed Mar 23, 2026

fix: format PR 405 files and remove unused import

f6670b7

greptile-apps bot reviewed Mar 23, 2026

gemini-code-assist bot reviewed Mar 23, 2026

vitali87 added 2 commits March 23, 2026 20:49

fix: scope orphan pruning to current project and exclude external modules

dd22616

fix: type narrowing for ResultValue in orphan pruning

aa354ac

vitali87 force-pushed the test/realtime-updater-coverage branch from 9b9c079 to aa354ac Compare March 23, 2026 19:56

fix: update pruning tests to match scoped module-only pruning

11ac4e6

gemini-code-assist bot reviewed Mar 23, 2026

greptile-apps bot reviewed Mar 23, 2026

fix: restore File/Folder pruning scoped by absolute_path

1f159d8

gemini-code-assist bot reviewed Mar 23, 2026

fix: remove unused CYPHER_ALL_MODULE_PATHS constant

21ebacb

vitali87 merged commit 993017b into main Mar 24, 2026
17 checks passed

github-project-automation bot moved this to Done in @vitali87's graph code Mar 24, 2026

-        assert mock_updater.ingestor.execute_write.call_count == 3
+        calls = mock_updater.ingestor.execute_write.call_args_list
+        assert len(calls) == 3
+        assert calls[0].args[0] == cs.CYPHER_DELETE_MODULE
+        assert calls[0].args[1] == {cs.KEY_PATH: "app.py"}
+        assert calls[1].args[0] == cs.CYPHER_DELETE_FILE
+        assert calls[1].args[1] == {cs.KEY_PATH: "app.py"}
+        assert calls[2].args[0] == cs.CYPHER_DELETE_CALLS

-        assert mock_updater.ingestor.execute_write.call_count == 6
+        from collections import Counter
+        queries_called = [
+            c.args[0] for c in mock_updater.ingestor.execute_write.call_args_list
+        ]
+        counts = Counter(queries_called)
+        assert len(queries_called) == 6
+        assert counts[cs.CYPHER_DELETE_MODULE] == 2
+        assert counts[cs.CYPHER_DELETE_FILE] == 2
+        assert counts[cs.CYPHER_DELETE_CALLS] == 2
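The two diffs above replace a bare `call_count` check with assertions on which queries were issued. A self-contained sketch of the order-independent variant follows. The string constants are hypothetical stand-ins for the `cs.CYPHER_*` values, and `simulate_deletes` is an invented helper that mimics the three writes per deleted file. Calling the delete-calls query without parameters is also an assumption.

```python
from collections import Counter
from unittest.mock import MagicMock

# Hypothetical stand-ins for cs.CYPHER_DELETE_MODULE and friends.
DELETE_MODULE = "DELETE_MODULE"
DELETE_FILE = "DELETE_FILE"
DELETE_CALLS = "DELETE_CALLS"
KEY_PATH = "path"


def simulate_deletes(ingestor, paths):
    """Issue the three writes per deleted file that the tests expect."""
    for p in paths:
        ingestor.execute_write(DELETE_MODULE, {KEY_PATH: p})
        ingestor.execute_write(DELETE_FILE, {KEY_PATH: p})
        ingestor.execute_write(DELETE_CALLS)  # assumed to take no parameters


def query_counts(ingestor):
    """Count calls per query, ignoring the order in which they were made."""
    return Counter(c.args[0] for c in ingestor.execute_write.call_args_list)


ingestor = MagicMock()
simulate_deletes(ingestor, ["a.py", "b.py"])
counts = query_counts(ingestor)

# Paths passed to the file-delete query, in call order.
deleted_files = [
    c.args[1][KEY_PATH]
    for c in ingestor.execute_write.call_args_list
    if c.args[0] == DELETE_FILE
]
```

Counting per query keeps the test valid if two files are processed in a different order, whereas indexing into `call_args_list` (as in the single-file case) also pins the sequence of writes.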


greptile-apps bot commented Mar 23, 2026 (edited)

Greptile Summary

Confidence Score: 4/5

Comments Outside Diff (1)
