This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
Trigent (A Rich Issue MCP for GitHub Triaging at Scale) is an MCP server that provides enriched GitHub issue data to help AI agents triage thousands of issues in upstream projects such as JupyterLab. It enriches raw issue data with semantic embeddings and computed metrics (reactions, comments, age, activity scores, and quartile rankings) so that agents can make better-informed triage decisions.
The system consists of several Python modules under trigent/:
- `trigent/pull.py`: Data pulling module that fetches raw issues from GitHub repositories using intelligent paging
  - Uses the `gh` CLI for GitHub API access with weekly chunking based on `updatedAt` timestamps
  - Implements incremental updates to avoid refetching unchanged issues
  - Uses TinyDB for persistent storage and direct issue comparison for updates
  - Merges new/updated issues with existing data while preserving all information
  - Stores data directly in TinyDB database files
- `trigent/enrich.py`: Data enrichment module that processes raw issue data
  - Adds embeddings for semantic search (via Mistral API)
  - Computes metrics: reactions, comments, age, activity scores
  - Assigns quartiles for all metrics using pandas `qcut()` with descriptive labels (Bottom25%, Bottom50%, Top50%, Top25%)
  - Updates TinyDB database with enriched data
- `trigent/mcp_server.py`: FastMCP server providing database access tools
  - Serves enriched issue data to AI agents
  - Tools: `get_issue`, `find_similar_issues`, `find_cross_referenced_issues`, `get_issue_metrics`
- `trigent/cli.py`: CLI orchestration module
  - Unified `trigent` command with subcommands
  - Orchestrates the entire workflow from pull to triaging
- `trigent/config.py`: Configuration management and caching
- `trigent/database.py`: Database utilities and operations
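
The merge step described for the pull module can be illustrated with a minimal sketch. This is not the project's actual implementation: the function name `merge_issues`, the plain-dict issue shape, and the `embedding` field are assumptions made for illustration. It shows one way to overlay freshly fetched fields on stored issues while keeping fields that the fetch did not return.

```python
from typing import Any

Issue = dict[str, Any]


def merge_issues(existing: list[Issue], fetched: list[Issue]) -> list[Issue]:
    """Merge freshly fetched issues into stored ones, keyed by issue number.

    Hypothetical helper for illustration; not taken from trigent/pull.py.
    """
    # Copy each stored issue so the caller's data is not mutated.
    merged: dict[int, Issue] = {issue["number"]: dict(issue) for issue in existing}
    for issue in fetched:
        number = issue["number"]
        if number in merged:
            # Overlay fetched fields, but keep stored fields the fetch did not
            # return (for example a previously computed embedding).
            merged[number].update(issue)
        else:
            merged[number] = dict(issue)
    return sorted(merged.values(), key=lambda issue: issue["number"])


stored = [{"number": 1, "title": "Old title", "embedding": [0.1, 0.2]}]
incoming = [{"number": 1, "title": "New title"}, {"number": 2, "title": "Another"}]
result = merge_issues(stored, incoming)
```

In this sketch an updated issue keeps its old enrichment fields; whether those should instead be recomputed after an update is a design decision the sketch does not settle.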
# Install the package
pip install -e .
# 1. Initial repository setup (pulls data and enriches it)
trigent pull jupyterlab/jupyterlab --start-date 2025-01-01
# 2. Keep repository up to date (incremental updates)
trigent update jupyterlab/jupyterlab
# 3. Start MCP server for AI agent access
trigent serve jupyterlab/jupyterlab
# 4. Export data for analysis
trigent export jupyterlab/jupyterlab --csv --viz
# 5. Show collection statistics
trigent stats # Show all collections
trigent stats jupyterlab/jupyterlab # Show specific repo
# 6. Clean repository data
trigent clean jupyterlab/jupyterlab

# Install with development dependencies
pip install -e ".[dev]"
# Configure Mistral API key in config.toml
cp config.toml.example config.toml
# Edit config.toml and add your Mistral API key

# Lint, format, and type check
ruff check trigent/ && ruff format trigent/ && mypy trigent/

- `trigent/cli.py`: Main CLI entry point with simplified commands
- `trigent/pull.py`: Python module for fetching raw issues from GitHub
- `trigent/enrich.py`: Python enrichment pipeline with embeddings/metrics
- `trigent/mcp_server.py`: FastMCP server for database access
- `trigent/database.py`: Qdrant vector database operations
- `trigent/config.py`: Configuration management and API key handling
- `config.toml`: User configuration file (API keys, Qdrant settings)
- `pyproject.toml`: Project configuration

- Python 3.12+: Core language with modern type hints (updated requirement)
- pandas, numpy: Data processing and quartile calculations
- requests: HTTP client for Mistral API
- FastMCP: Minimal server for database access
- scikit-learn: Machine learning utilities for k-nearest neighbors
- diskcache: Persistent caching for API responses
- toml: Configuration file parsing
- ipython, ipdb: Interactive development and debugging
- gh CLI: GitHub issue fetching (external dependency)
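
To make the role of embeddings and k-nearest neighbors concrete, here is a dependency-free sketch of a similarity lookup. It deliberately swaps scikit-learn for plain Python so it runs anywhere, and it is not the code behind `find_similar_issues`: the function names and the `embedding` field on each issue are assumptions for illustration.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_issues(target: dict, issues: list[dict], k: int = 3) -> list[tuple[int, float]]:
    """Return (issue number, similarity) for the k issues closest to target.

    Hypothetical helper; assumes every issue dict has "number" and "embedding".
    """
    scored = [
        (other["number"], cosine_similarity(target["embedding"], other["embedding"]))
        for other in issues
        if other["number"] != target["number"]  # skip the target itself
    ]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


issues = [
    {"number": 1, "embedding": [1.0, 0.0]},
    {"number": 2, "embedding": [0.9, 0.1]},
    {"number": 3, "embedding": [0.0, 1.0]},
]
neighbors = nearest_issues(issues[0], issues, k=2)
```

A brute-force scan like this is linear in the number of issues per query; a library-backed nearest-neighbor index would be the natural choice at larger scale.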
- Unified Python: All components integrated in a single Python package with clean module separation
- Intelligent Paging: GitHub issues fetched via the `gh` CLI with weekly chunking and incremental updates
- State Management: Pull module tracks last fetch timestamps to enable efficient incremental updates
- Issue Merging: Smart merge logic updates existing issues while preserving all data integrity
- Enriched Data: Pandas-based processing adds embeddings and quartiles (UMAP removed)
- MCP Server: FastMCP provides database access tools for AI agents
- Simplified CLI: Streamlined commands that combine operations (e.g., `pull` does fetch + enrich)
- Direct Integration: No subprocess calls between internal modules; all use direct Python imports
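
The weekly chunking mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the pull module's code: `weekly_windows` and `search_qualifier` are hypothetical names, and how the resulting qualifier is handed to the `gh` CLI is left out. The `updated:START..END` range form is GitHub's search syntax for filtering by update date.

```python
from datetime import date, timedelta


def weekly_windows(start: date, end: date) -> list[tuple[date, date]]:
    """Split the inclusive range [start, end] into windows of at most seven days."""
    windows = []
    cursor = start
    while cursor <= end:
        # Each window covers seven calendar days, clipped to the overall end date.
        window_end = min(cursor + timedelta(days=6), end)
        windows.append((cursor, window_end))
        cursor = window_end + timedelta(days=1)
    return windows


def search_qualifier(window: tuple[date, date]) -> str:
    """Format a window as a GitHub search qualifier on the update date."""
    first, last = window
    return f"updated:{first.isoformat()}..{last.isoformat()}"


windows = weekly_windows(date(2025, 1, 1), date(2025, 1, 20))
queries = [search_qualifier(w) for w in windows]
```

For an incremental update, the same helper could be started from the last recorded fetch date instead of the initial start date, so only recent windows are requested.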
Trigent/
├── trigent/                  # Main Python package
│   ├── __init__.py
│   ├── __main__.py           # Entry point for python -m trigent
│   ├── cli.py                # CLI orchestration with all subcommands
│   ├── clean.py              # Clean command implementation
│   ├── update.py             # Update command implementation
│   ├── stats.py              # Stats command implementation
│   ├── pull.py               # GitHub issue fetching via gh CLI
│   ├── enrich.py             # Data enrichment with embeddings/metrics
│   ├── database.py           # Qdrant operations and utilities
│   ├── config.py             # Configuration management and caching
│   ├── export/               # Export command + CSV/viz subdirectory
│   │   ├── __init__.py
│   │   ├── command.py        # Export command entry point
│   │   ├── csv.py            # CSV export functionality
│   │   └── visualize.py      # Visualization export
│   └── serve/                # Serve command + MCP server subdirectory
│       ├── __init__.py
│       ├── __main__.py       # Entry point for python -m trigent.serve
│       ├── command.py        # Serve command entry point
│       └── mcp_server.py     # FastMCP server implementation
├── data/                     # Data storage directory
│   └── issues-{repo}.db      # TinyDB database files (e.g., issues-jupyterlab-jupyterlab.db)
├── dcache/                   # Diskcache directory for API response caching
├── example/                  # Example implementations and agents
├── config.toml               # Configuration file (API keys, settings)
├── config.toml.example       # Example configuration template
├── pyproject.toml            # Python project configuration
├── README.md                 # Project documentation
├── CLAUDE.md                 # Development instructions for Claude Code
└── uv.lock                   # Dependency lock file

To test database functionality, load the database the same way as the MCP server:
from trigent.database import load_issues

def _get_repo_name(repo=None):
    """Get repository name, defaulting to jupyterlab/jupyterlab."""
    return repo or "jupyterlab/jupyterlab"

# Load exactly like MCP server
repo = _get_repo_name()
issues = load_issues(repo)

# Find specific issue
issue_3224 = next((i for i in issues if i["number"] == 3224), None)

Note: The database must be populated first by running:
- `trigent pull jupyterlab/jupyterlab --mode create` (to fetch raw issues in create mode)
- `trigent enrich jupyterlab/jupyterlab` (to add embeddings and metrics)
- Subsequent updates use: `trigent pull jupyterlab/jupyterlab --mode update`