feat(meta-rules): weight graduation by applicability#231
Conversation
There was a problem hiding this comment.
Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.
📝 Walkthrough
WalkthroughThis PR introduces applicability-weighted scoring for meta-rules. The ChangesApplicability-weighted meta-rule scoring
Possibly related PRs
Suggested labels
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes 🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 OpenGrep (1.22.0)OpenGrep fatal error (exit code 2): �[32m✔�[39m �[1mOpengrep OSS�[0m �[1m Loading rules from local config...�[0m Comment |
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
Gradata/tests/test_meta_rules.py (1)
162-200: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick winAdd a round-trip assertion for
final_scoreonce persisted.This test now covers
applicability_observed_count; addingfinal_scorehere will prevent regressions in storage contract for ranking behavior.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@Gradata/tests/test_meta_rules.py` around lines 162 - 200, The test_sqlite_roundtrip currently asserts persisted fields like applicability_observed_count but misses verifying MetaRule.final_score; update the test to include a round-trip assertion that after calling save_meta_rules(db_path, metas) and load_meta_rules(db_path) the loaded MetaRule objects have the expected final_score values (and maintain sort order by final_score if that's the intended ranking). Locate the MetaRule instances created in test_sqlite_roundtrip and add assertions comparing loaded[0].final_score and loaded[1].final_score to the expected computed scores (or at least that loaded[0].final_score >= loaded[1].final_score) so storage and ranking behavior for final_score is validated after save/load via save_meta_rules and load_meta_rules.Gradata/src/gradata/enhancements/meta_rules_storage.py (1)
40-55:⚠️ Potential issue | 🟠 Major | 🏗️ Heavy liftPersist/load
MetaRule.final_scoreas part of the schema update.
MetaRulenow carriesfinal_score, but this file only stores/loadsapplicability_observed_count. The computed rank signal is lost after round-trip (Line 124+ insert, Line 191+ select). This breaks data continuity and misses the stated migration objective.Suggested patch (schema + save + load)
_CREATE_TABLE_SQL = """ CREATE TABLE IF NOT EXISTS meta_rules ( @@ context_weights TEXT, applicability_observed_count INTEGER, + final_score REAL, tenant_id TEXT, visibility TEXT DEFAULT 'private' ); """ @@ _ADD_APPLICABILITY_OBSERVED_COUNT_SQL = ( "ALTER TABLE meta_rules ADD COLUMN applicability_observed_count INTEGER" ) +_ADD_FINAL_SCORE_SQL = "ALTER TABLE meta_rules ADD COLUMN final_score REAL" @@ for stmt in ( @@ _ADD_APPLICABILITY_OBSERVED_COUNT_SQL, + _ADD_FINAL_SCORE_SQL, "ALTER TABLE meta_rules ADD COLUMN tenant_id TEXT", @@ scope, examples, context_weights, applies_when, never_when, - transfer_scope, source, applicability_observed_count, + transfer_scope, source, applicability_observed_count, final_score, tenant_id, visibility) - VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, 'private')""", + VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, 'private')""", ( @@ meta.source, meta.applicability_observed_count, + meta.final_score, _tid, ), ) @@ applicability_expr = ( "applicability_observed_count" if "applicability_observed_count" in existing_cols else "NULL AS applicability_observed_count" ) + final_score_expr = "final_score" if "final_score" in existing_cols else "NULL AS final_score" @@ - transfer_scope, {source_expr}, {applicability_expr} + transfer_scope, {source_expr}, {applicability_expr}, {final_score_expr} @@ - applicability_observed_count=row[14], + applicability_observed_count=row[14], + final_score=row[15], )Also applies to: 124-130, 184-195, 222-223
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@Gradata/src/gradata/enhancements/meta_rules_storage.py` around lines 40 - 55, The schema and persistence code in meta_rules_storage.py omitted MetaRule.final_score so it isn't round-tripped; update the _CREATE_TABLE_SQL to add a final_score REAL column, update the save/insert logic (the block that inserts into meta_rules—refer to the INSERT code around where applicability_observed_count is written) to include final_score in the INSERT columns and values, and update the load/select logic (the code that SELECTs rows and constructs MetaRule objects) to read final_score from the result set and assign it to MetaRule.final_score so the computed rank persists across migrations and reloads.Gradata/src/gradata/enhancements/meta_rules.py (1)
470-471:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winUpdate stale return-contract wording in
discover_meta_rulesdocstring.Line 470 still states sorting by confidence, but Line 519 now sorts by
meta_rule_final_score. This can mislead callers reading API docs.Also applies to: 519-519
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@Gradata/src/gradata/enhancements/meta_rules.py` around lines 470 - 471, Update the docstring for discover_meta_rules to reflect the current return contract: replace the outdated "sorted by confidence" wording with a precise statement that results are sorted by meta_rule_final_score (and return an empty list when no cluster meets min_group_size); mention meta_rule_final_score and min_group_size in the docstring so callers know the sorting key and the empty-list behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@Gradata/src/gradata/enhancements/meta_rules_storage.py`:
- Around line 40-55: The schema and persistence code in meta_rules_storage.py
omitted MetaRule.final_score so it isn't round-tripped; update the
_CREATE_TABLE_SQL to add a final_score REAL column, update the save/insert logic
(the block that inserts into meta_rules—refer to the INSERT code around where
applicability_observed_count is written) to include final_score in the INSERT
columns and values, and update the load/select logic (the code that SELECTs rows
and constructs MetaRule objects) to read final_score from the result set and
assign it to MetaRule.final_score so the computed rank persists across
migrations and reloads.
In `@Gradata/src/gradata/enhancements/meta_rules.py`:
- Around line 470-471: Update the docstring for discover_meta_rules to reflect
the current return contract: replace the outdated "sorted by confidence" wording
with a precise statement that results are sorted by meta_rule_final_score (and
return an empty list when no cluster meets min_group_size); mention
meta_rule_final_score and min_group_size in the docstring so callers know the
sorting key and the empty-list behavior.
In `@Gradata/tests/test_meta_rules.py`:
- Around line 162-200: The test_sqlite_roundtrip currently asserts persisted
fields like applicability_observed_count but misses verifying
MetaRule.final_score; update the test to include a round-trip assertion that
after calling save_meta_rules(db_path, metas) and load_meta_rules(db_path) the
loaded MetaRule objects have the expected final_score values (and maintain sort
order by final_score if that's the intended ranking). Locate the MetaRule
instances created in test_sqlite_roundtrip and add assertions comparing
loaded[0].final_score and loaded[1].final_score to the expected computed scores
(or at least that loaded[0].final_score >= loaded[1].final_score) so storage and
ranking behavior for final_score is validated after save/load via
save_meta_rules and load_meta_rules.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 781bd30e-1f14-43c5-9c3b-8f646d2cbb9e
📒 Files selected for processing (5)
Gradata/src/gradata/_migrations/006_meta_rule_applicability.pyGradata/src/gradata/_migrations/__init__.pyGradata/src/gradata/enhancements/meta_rules.pyGradata/src/gradata/enhancements/meta_rules_storage.pyGradata/tests/test_meta_rules.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
- GitHub Check: pytest (py3.12)
- GitHub Check: pytest (py3.11)
- GitHub Check: pytest windows-latest / py3.12
- GitHub Check: pytest macos-latest / py3.11
- GitHub Check: pytest ubuntu-latest / py3.12
- GitHub Check: pytest macos-latest / py3.12
- GitHub Check: pytest ubuntu-latest / py3.11
- GitHub Check: pytest windows-latest / py3.11
🧰 Additional context used
📓 Path-based instructions (2)
Gradata/src/**/*.py
📄 CodeRabbit inference engine (Gradata/AGENTS.md)
Gradata/src/**/*.py: Prefersentence-transformersfor local embeddings,google-genaifor Gemini embeddings,cryptographyfor AES-GCM encrypted system.db,bm25sfor BM25 rule ranking, andmem0aifor external memory adapters — guard all optional dependency imports withtry / except ImportErrorat the call site, never at module level
Maintain strict layering: Layer 0 (Primitives: _types.py, _db.py, _events.py, _paths.py, _file_lock.py; Patterns: contrib/patterns/) must never import from Layer 1 (Enhancements: enhancements/, rules/) or Layer 2 (Public API: brain.py, cli.py, daemon.py, mcp_server.py)
Never use bareexcept: pass— use typed exceptions or at minimumlogger.warning(...)withexc_info=Trueto avoid silent failure in a memory product
Never import from out-of-scope sibling directories../Sprites/or../Hausgem/withingradata/*code — that is a layering bug
Never leak private-sibling paths into public docs/code — no references to../Sprites/,../Hausgem/, email addresses, OneDrive paths, or Sprites-specific examples from insidegradata/*
Use atomic-write helper when writing JSON files to prevent corruption from mid-write crashes
Files:
Gradata/src/gradata/_migrations/__init__.pyGradata/src/gradata/_migrations/006_meta_rule_applicability.pyGradata/src/gradata/enhancements/meta_rules_storage.pyGradata/src/gradata/enhancements/meta_rules.py
Gradata/tests/**/*.py
📄 CodeRabbit inference engine (Gradata/AGENTS.md)
Gradata/tests/**/*.py: SetBRAIN_DIRenvironment variable viatmp_pathin conftest.py for test isolation — ensure_paths.pymodule cache refreshes when callingBrain.init()directly inside tests
Add unit tests intests/test_*.pyfor every CI push without LLM calls (deterministic); mark integration tests with@pytest.mark.integrationand skip them by default (they hit real LLM APIs)
Files:
Gradata/tests/test_meta_rules.py
🔇 Additional comments (4)
Gradata/src/gradata/enhancements/meta_rules.py (1)
35-35: LGTM!Also applies to: 85-91, 100-197, 503-518, 766-767, 807-830
Gradata/src/gradata/_migrations/006_meta_rule_applicability.py (1)
1-85: LGTM!Gradata/src/gradata/_migrations/__init__.py (1)
98-98: LGTM!Gradata/tests/test_meta_rules.py (1)
32-39: LGTM!Also applies to: 178-179, 199-199, 205-255
Summary
GRADATA_APPLICABILITY_WEIGHTdefaulting to 0.5applicability_observed_countandfinal_scorefor meta rulesTest Plan
python3 -m pytest Gradata/tests/test_meta_rules.py -qCloses: ba83b6dd-8301-48cb-bb7d-e6b313346460
Paperclip: GRA-1296