Skip to content

Conversation

@naaa760
Copy link

@naaa760 naaa760 commented Nov 17, 2025

fix: #22

  • Add a new RepositoryAnalysisAgent that analyzes GitHub repositories to automatically generate personalized Watchflow rule recommendations. The agent uses LangGraph for multi-step analysis including repository structure examination, contributing guidelines parsing, and pattern-based rule generation with confidence scoring.

Key Features:

  • Repository Analysis: Examines workflows, contributors, branch protection, and code patterns
  • Contributing Guidelines Parsing: Uses LLM to extract requirements from CONTRIBUTING.md files
  • Rule Recommendations: Generates valid Watchflow YAML rules with confidence scores and reasoning
  • API Integration: Adds POST /api/v1/rules/recommend endpoint with caching and rate limiting
  • Agent Factory: Integrated into the existing agent system via get_agent("repository_analysis")

Files Added:

  • src/agents/repository_analysis_agent/ - Complete agent implementation
  • src/api/recommendations.py - REST API endpoints
  • Updated src/agents/factory.py and src/main.py for integration

Acceptance Criteria:

Analyzes repositories and generates valid rule recommendations
Recommendations include confidence scores and clear reasoning
API endpoint accepts repository identifiers and returns structured data
Integrates seamlessly with existing Watchflow architecture
All recommendations are valid Watchflow rule YAML
Comprehensive error handling and validation

@naaa760 naaa760 requested a review from dkargatzis as a code owner November 17, 2025 08:23
@dkargatzis
Copy link
Member

Hello @naaa760 - thanks for the PR, it adds really cool functionality for better / smother UX. Could we please address the pre-commit and test issues? Also please give an example of the expected outcome - what are the rules that this endpoint suggests for a high volume repo.

@naaa760
Copy link
Author

naaa760 commented Nov 29, 2025

@dkargatzis
please check!

@dkargatzis
Copy link
Member

@naaa760 I guess we still need to instruct the agent / pass that context to the agent?

@codecov-commenter
Copy link

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

❌ Patch coverage is 50.09901% with 252 lines in your changes missing coverage. Please review.

❌ Your patch status has failed because the patch coverage (50.0%) is below the target coverage (80.0%). You can increase the patch coverage or adjust the target coverage.
❌ Your project status has failed because the head coverage (32.0%) is below the target coverage (80.0%). You can increase the head coverage or adjust the target coverage.
❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

@@           Coverage Diff           @@
##            main     #27     +/-   ##
=======================================
+ Coverage   24.7%   32.0%   +7.2%     
=======================================
  Files         79      85      +6     
  Lines       4497    5000    +503     
=======================================
+ Hits        1115    1604    +489     
- Misses      3382    3396     +14     

Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a518bcb...2f1c0b9. Read the comment docs.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@dkargatzis
Copy link
Member

/gemini review

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a new Repository Analysis Agent for recommending Watchflow rules, along with a corresponding API endpoint and new diff-aware validators. The implementation is extensive and adds significant new capabilities.

My review has identified a few key areas for improvement:

  • Inconsistent Documentation: The documentation for the new diff-aware validators is duplicated and contains conflicting information, which could confuse users.
  • Incomplete Agent Logic: The repository analysis agent's implementation appears incomplete. The analysis of CONTRIBUTING.md is not fully implemented, and the core rule recommendations are hardcoded instead of being dynamically generated based on repository analysis, which doesn't align with the feature's description.
  • Inconsistent Validator Implementation: There's a critical issue in src/rules/validators.py where a new validator uses an inconsistent method for file path matching, which could lead to incorrect behavior.
  • Testing and Utilities: I've also included some suggestions to improve test case assertions and make the new caching utility safer for future use.

Overall, this is a great feature addition, but the agent's core logic needs to be completed to match its intended design, and the inconsistencies in documentation and implementation should be resolved.

Comment on lines +982 to +987
@staticmethod
def _matches_any(path: str, patterns: list[str]) -> bool:
if not path:
return False
posix_path = PurePosixPath(path)
return any(posix_path.match(pattern) for pattern in patterns)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

This _matches_any static method inside RequiredFieldInDiffCondition is inconsistent with the more powerful, custom _matches_any helper function defined at the module level (line 69). This implementation uses PurePosixPath.match, which does not handle ** for recursive directory matching in the same way as the custom glob logic. This will lead to inconsistent and potentially incorrect file matching behavior compared to other new validators like DiffPatternCondition. This class should use the shared _matches_any helper from the module to ensure consistent glob pattern matching across all validators. Please remove this static method so the module-level one is used.

prompt = CONTRIBUTING_GUIDELINES_ANALYSIS_PROMPT.format(content=content)
await llm.ainvoke(prompt)

# TODO: Parse JSON response and create ContributingGuidelinesAnalysis

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

This TODO indicates that the LLM response from analyzing CONTRIBUTING.md is not being parsed and used. The function currently falls back to creating a ContributingGuidelinesAnalysis object with just the raw content. This is a significant gap in the implementation, as the structured analysis of contribution guidelines is a key part of the agent's described functionality.

Comment on lines +141 to +351
async def generate_rule_recommendations(state: RepositoryAnalysisState) -> dict[str, Any]:
"""
Generate Watchflow rule recommendations based on repository analysis.
"""
try:
logger.info(f" Generating rule recommendations for {state.repository_full_name}")

recommendations = []

features = state.repository_features
contributing = state.contributing_analysis

# Diff-aware: enforce filter handling in core RAG/query code
recommendations.append(
RuleRecommendation(
yaml_content="""description: "Block merges when PRs change filter validation logic without failing on invalid inputs"
enabled: true
severity: "high"
event_types: ["pull_request"]
parameters:
file_patterns:
- "packages/core/src/**/vector-query.ts"
- "packages/core/src/**/graph-rag.ts"
- "packages/core/src/**/filters/*.ts"
require_patterns:
- "throw\\\\s+new\\\\s+Error"
- "raise\\\\s+ValueError"
forbidden_patterns:
- "return\\\\s+.*filter\\\\s*$"
how_to_fix: "Ensure invalid filters raise descriptive errors instead of silently returning unfiltered results."
""",
confidence=0.85,
reasoning="Filter handling regressions were flagged in historical fixes; enforce throws on invalid input.",
source_patterns=["pr_history"],
category="quality",
estimated_impact="high",
)
)

# Diff-aware: enforce test updates when core code changes
recommendations.append(
RuleRecommendation(
yaml_content="""description: "Require regression tests when modifying tool schema validation or client tool execution"
enabled: true
severity: "medium"
event_types: ["pull_request"]
parameters:
source_patterns:
- "packages/core/src/**/tool*.ts"
- "packages/core/src/agent/**"
- "packages/client/**"
test_patterns:
- "packages/core/tests/**"
- "tests/**"
min_test_files: 1
rationale: "Tool invocation changes have previously caused regressions in clientTools streaming."
""",
confidence=0.8,
reasoning="Core tool changes often broke client tools; require at least one related test update.",
source_patterns=["pr_history"],
category="quality",
estimated_impact="medium",
)
)

# Diff-aware: ensure agent descriptions exist
recommendations.append(
RuleRecommendation(
yaml_content="""description: "Ensure every agent exposes a user-facing description for UI profiles"
enabled: true
severity: "low"
event_types: ["pull_request"]
parameters:
file_patterns:
- "packages/core/src/agent/**"
required_text:
- "description"
message: "Add or update the agent description so downstream UIs can render capabilities."
""",
confidence=0.75,
reasoning="Agent profile UIs require descriptions; ensure new/updated agents include them.",
source_patterns=["pr_history"],
category="process",
estimated_impact="low",
)
)

# Diff-aware: preserve URL handling for supported providers
recommendations.append(
RuleRecommendation(
yaml_content="""description: "Block merges when URL or asset handling changes bypass provider capability checks"
enabled: true
severity: "high"
event_types: ["pull_request"]
parameters:
file_patterns:
- "packages/core/src/agent/message-list/**"
- "packages/core/src/llm/**"
require_patterns:
- "isUrlSupportedByModel"
forbidden_patterns:
- "downloadAssetsFromMessages\\(messages\\)"
how_to_fix: "Preserve remote URLs for providers that support them natively; only download assets for unsupported providers."
""",
confidence=0.8,
reasoning="Past URL handling bugs; ensure capability checks remain intact.",
source_patterns=["pr_history"],
category="quality",
estimated_impact="high",
)
)

# Legacy structural signals retained for completeness
if features.has_workflows:
recommendations.append(
RuleRecommendation(
yaml_content="""description: "Require CI checks to pass"
enabled: true
severity: "high"
event_types:
- pull_request
conditions:
- type: "ci_checks_passed"
parameters:
required_checks: []
actions:
- type: "block_merge"
parameters:
message: "All CI checks must pass before merging"
""",
confidence=0.9,
reasoning="Repository has CI workflows configured, so requiring checks to pass is a standard practice",
source_patterns=["has_workflows"],
category="quality",
estimated_impact="high",
)
)

if features.has_codeowners:
recommendations.append(
RuleRecommendation(
yaml_content="""description: "Require CODEOWNERS approval for changes"
enabled: true
severity: "medium"
event_types:
- pull_request
conditions:
- type: "codeowners_approved"
parameters: {}
actions:
- type: "require_approval"
parameters:
message: "CODEOWNERS must approve changes to owned files"
""",
confidence=0.8,
reasoning="CODEOWNERS file exists, indicating ownership requirements for code changes",
source_patterns=["has_codeowners"],
category="process",
estimated_impact="medium",
)
)

if contributing.requires_tests:
recommendations.append(
RuleRecommendation(
yaml_content="""description: "Require test coverage for code changes"
enabled: true
severity: "medium"
event_types:
- pull_request
conditions:
- type: "test_coverage_threshold"
parameters:
minimum_coverage: 80
actions:
- type: "block_merge"
parameters:
message: "Test coverage must be at least 80%"
""",
confidence=0.7,
reasoning="Contributing guidelines mention testing requirements",
source_patterns=["requires_tests"],
category="quality",
estimated_impact="medium",
)
)

if features.contributor_count > 10:
recommendations.append(
RuleRecommendation(
yaml_content="""description: "Require at least one approval for pull requests"
enabled: true
severity: "medium"
event_types:
- pull_request
conditions:
- type: "minimum_approvals"
parameters:
count: 1
actions:
- type: "block_merge"
parameters:
message: "Pull requests require at least one approval"
""",
confidence=0.6,
reasoning="Repository has multiple contributors, indicating collaborative development",
source_patterns=["contributor_count"],
category="process",
estimated_impact="medium",
)
)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The rule recommendations, especially the core 'diff-aware' ones, appear to be hardcoded rather than dynamically generated based on the repository analysis. The analysis data from repository_features and contributing_analysis is only used for a few legacy rules, but the main new recommendations are added unconditionally. This doesn't align with the PR description which suggests 'pattern-based rule generation'. The prompts defined in prompts.py for dynamic analysis and rule generation also seem to be unused here. This implementation should be updated to dynamically generate rules based on the actual analysis results from previous steps to fulfill the feature's goal.

Comment on lines +268 to +299
## Diff-Aware Validators

Watchflow supports advanced validators that inspect actual PR diffs to enforce code-level patterns:

### diff_pattern
Enforce regex requirements or prohibitions within file patches.

```yaml
parameters:
file_patterns: ["packages/core/src/**/vector-query.ts"]
require_patterns: ["throw\\s+new\\s+Error"]
forbid_patterns: ["silent.*skip"]
```

### related_tests
Require test file updates when core code changes.

```yaml
parameters:
file_patterns: ["packages/core/src/**"]
require_test_updates: true
min_test_files: 1
```

### required_field_in_diff
Ensure new additions include required fields (e.g., agent descriptions).

```yaml
parameters:
file_patterns: ["packages/core/src/agent/**"]
required_text: "description"
```

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

There appears to be duplicated and inconsistent documentation for Diff-Aware Validators. This section is added here, but a similar, slightly different version is also added earlier in the file (starting around line 114). The two versions use conflicting parameter names and formats, which will be confusing for users. For example:

  • The other section uses forbidden_patterns, while this one uses forbid_patterns.
  • The related_tests example here uses require_test_updates: true, while the other uses source_patterns, test_patterns, and min_test_files.
  • The required_text parameter is a string here, but a list in the other example.

Please consolidate these into a single, consistent section to avoid confusion.

@@ -0,0 +1,86 @@
# Repository Analysis Agent

the Repository Analysis Agent analyzes GitHub repositories to generate personalized Watchflow rule recommendations based on repository structure, contributing guidelines, and development patterns.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Minor grammatical fix: The sentence should start with a capital letter for better readability.

Suggested change
the Repository Analysis Agent analyzes GitHub repositories to generate personalized Watchflow rule recommendations based on repository structure, contributing guidelines, and development patterns.
The Repository Analysis Agent analyzes GitHub repositories to generate personalized Watchflow rule recommendations based on repository structure, contributing guidelines, and development patterns.


## Overview

this agent performs comprehensive analysis of repositories and provides actionable governance rule recommendations with confidence scores and reasoning.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Minor grammatical fix: The sentence should start with a capital letter for better readability.

Suggested change
this agent performs comprehensive analysis of repositories and provides actionable governance rule recommendations with confidence scores and reasoning.
This agent performs comprehensive analysis of repositories and provides actionable governance rule recommendations with confidence scores and reasoning.

Comment on lines +109 to +112
if isinstance(result, dict):
state = RepositoryAnalysisState(**result)
else:
state = result

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The logic to handle the result from self.graph.ainvoke can be simplified. The ainvoke method on a compiled graph returns the final state object directly, not a dictionary that needs to be converted. You can remove the isinstance check and directly assign the result to the state variable.

            state = result

Comment on lines +95 to +97
result = await agent.execute("test-owner/test-repo")

assert isinstance(result, object)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The assertion assert isinstance(result, object) in this error handling test is too weak, as almost everything in Python is an object. The test should be more specific to be meaningful. It should verify that the agent's execution correctly reports a failure.

Suggested change
result = await agent.execute("test-owner/test-repo")
assert isinstance(result, object)
result = await agent.execute("test-owner/test-repo")
assert not result.success
assert "analysis failed" in result.message

Comment on lines +119 to +136
@router.get("/v1/rules/recommend/{repository_full_name}")
async def get_cached_recommendations(repository_full_name: str) -> JSONResponse:
"""
Get cached recommendations for a repository.
Args:
repository_full_name: Full repository name (owner/repo)
Returns:
Cached analysis results or 404 if not found
"""
cache_key = f"repo_analysis:{repository_full_name}"
cached_result = await get_cache(cache_key)

if not cached_result:
raise HTTPException(status_code=404, detail="No cached analysis found for repository")

return JSONResponse(content=cached_result)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The get_cached_recommendations endpoint is a great addition for retrieving cached results. However, it's defined with a path that could be more consistent with REST principles. Since it's retrieving a specific resource (the analysis for a repository), using query parameters might be more idiomatic.

Suggested change
@router.get("/v1/rules/recommend/{repository_full_name}")
async def get_cached_recommendations(repository_full_name: str) -> JSONResponse:
"""
Get cached recommendations for a repository.
Args:
repository_full_name: Full repository name (owner/repo)
Returns:
Cached analysis results or 404 if not found
"""
cache_key = f"repo_analysis:{repository_full_name}"
cached_result = await get_cache(cache_key)
if not cached_result:
raise HTTPException(status_code=404, detail="No cached analysis found for repository")
return JSONResponse(content=cached_result)
@router.get("/v1/rules/recommend", response_model=RepositoryAnalysisResponse)
async def get_cached_recommendations(repository_full_name: str) -> RepositoryAnalysisResponse:
"""
Get cached recommendations for a repository.
Args:
repository_full_name: Full repository name (owner/repo)
Returns:
Cached analysis results or 404 if not found
"""
cache_key = f"repo_analysis:{repository_full_name}"
cached_result = await get_cache(cache_key)
if not cached_result:
raise HTTPException(status_code=404, detail="No cached analysis found for repository")
return RepositoryAnalysisResponse(**cached_result)

Comment on lines +133 to +134
if ttl and ttl != _GLOBAL_CACHE.ttl:
_GLOBAL_CACHE.ttl = ttl

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Modifying the ttl of the global cache instance _GLOBAL_CACHE within the set_cache helper can lead to unpredictable behavior. If another part of the application calls this function with a different ttl, it will alter the cache's expiration policy for all subsequent calls. It would be safer to either not allow overriding the TTL here or to use a different caching mechanism if variable TTLs are required.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat: implement repository analysis agent with recommendation API

3 participants