akhilsinghcodes/jidra

JIDRA: Enterprise Codebase Context Backend for LLM Workflows

JIDRA = Java/Scala/TypeScript/Python/Go Integrated Graph Reduction & Analysis

JIDRA is a structured context backend that reduces LLM input tokens by 68-95% for code-native queries by giving Claude a pre-analyzed call graph instead of raw source files. It supports multiple languages: Scala (~90% resolution), Java (~85% resolution), TypeScript (~65% resolution with the default tree-sitter backend, ~80% with the ts-morph sidecar), Python (~68.5% resolution), and Go (tree-sitter-based, best-effort resolution).

What This Means

Real Claude Code sessions, same question, same model (claude-sonnet-4-6 1M):

Without JIDRA:  833,782 input tokens  ($0.2298)  — Claude read files manually
With JIDRA:     227,095 input tokens  ($0.2275)  — Claude used graph tools
Reduction:          72.8% fewer input tokens, same answer quality

At Opus pricing ($15/M input):
Without JIDRA:  $12.51/query
With JIDRA:      $3.41/query
Savings:         $9.10/query  →  $4,550/year at 500 queries
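
The Opus figures above follow directly from the measured input-token counts. A minimal sketch of the arithmetic, assuming a flat $15 per million input tokens and ignoring output tokens and caching (the function name is illustrative and not part of JIDRA):

```python
def input_cost(tokens: int, usd_per_million: float) -> float:
    """Cost in USD of `tokens` input tokens at a flat per-million price."""
    return tokens * usd_per_million / 1_000_000

OPUS_INPUT_PRICE = 15.0  # USD per million input tokens, as quoted above

without_jidra = input_cost(833_782, OPUS_INPUT_PRICE)
with_jidra = input_cost(227_095, OPUS_INPUT_PRICE)
saved_per_query = without_jidra - with_jidra

print(f"without: ${without_jidra:.2f}")              # $12.51
print(f"with:    ${with_jidra:.2f}")                 # $3.41
print(f"saved:   ${saved_per_query:.2f} per query")  # $9.10
print(f"yearly:  ${saved_per_query * 500:,.0f} at 500 queries")  # $4,550
```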

JIDRA connects to Claude Code as an MCP server, and setup is a single command (jidra up, shown under Screenshots).

This project is intentionally focused and graph-driven.

Pitch (TL;DR)

  • Multi-language → Scala, Java, TypeScript, Python, Go (auto-detected)
  • Index once → Get a deterministic call graph (AST-based, language-optimized)
  • Reduce noise → Remove phantom edges (Java: runtime validation, TS/Python/Scala/Go: static analysis)
  • Generate context → 68-95% smaller prompt-ready context for Claude/Codex/Gemini
  • Trace execution → See likely business flow with uncertainty markers
  • Reduce LLM cost → Proven token reduction on code-native workflows (measured on real projects)

Real Proof — Claude Code sessions, same question, same model (claude-sonnet-4-6):

| Session | Input tokens | Output tokens | Cost |
|---|---:|---:|---:|
| Without JIDRA (filesystem tools only) | 833,782 | 5,161 | $0.2298 |
| With JIDRA (MCP graph tools) | 227,095 | 1,784 | $0.2275 |
| Reduction | 72.8% | 65.4% | ~same |

Same cost today at Sonnet pricing because output tokens dominate, but with ~607k fewer input tokens per query. At Opus pricing ($15/M input vs $3/M) that gap is $9.10 saved per query.

Screenshots

jidra up is the one-command setup flow — prompts, a live spinner while parsing, and a styled summary panel when it's done:

jidra up — full run: banner, prompts, pipeline progress, and ready panel

Live progress while indexing is in flight:

jidra up — live parsing spinner

jidra up writes its output (graph + visualization) under jidra/output/<repo-slug>-<branch>/, never into the target repo:

jidra output directory structure

The generated graph_visualization.html — interactive dark-theme call graph with a node inspector, search, and callers/callees navigation:

Interactive graph visualization — dark theme, node inspector, physics/filter controls

Click any node to inspect its module, signature, and file location; "Show Neighbors" highlights its direct call relationships:

Graph node inspector and Show Neighbors highlighting

Search by method or class name to jump straight to a node:

Graph search box with live results dropdown

The same view also exports the full graph as Graphviz DOT or pretty-printed JSON, in-place:

Graph visualization JSON export overlay

jidra up also offers to index your docs/ directory and README.md and link doc chunks to the classes/methods they describe, rendered as its own interactive doc graph:

JIDRA doc graph — documents linked to source classes

Every index, reindex, and doc-index run is recorded to a local telemetry dashboard (jidra history --html, served from ~/.jidra/telemetry/telemetry.html):

Interactive graph visualization overview

Graph visualization JSON export tab

JIDRA telemetry dashboard: index history, elapsed time, doc chunk charts

JIDRA telemetry: full index events and doc index events tables

Web UI (jidra ui)

Everything above also runs as a single-page app — one process, no separate static HTML reports to regenerate. Pick a repo once, then move between seven tabs without re-entering anything:

cd ui && npm install && npm run build   # one-time, builds ui/dist
jidra ui --port 7474                    # serves the React app + FastAPI backend

| Tab | What it does |
|---|---|
| IDX | Run the index/validate/doc-index pipeline, trigger incremental reindex, install git hooks, all from one form with a live streaming output log |
| GRF | Interactive call graph: search-to-jump, depth control, node inspector with callers/callees, endpoint filter, JSON export |
| SQL | Query graph.db / telemetry.db directly against a live schema browser |
| MCP | Call any MCP tool by hand: fill the JSON schema, see the raw result, browse the session log |
| TRC | Hit the underlying graph functions directly (trace / context / flow / route / flow-doc / error-doc) without going through MCP tool-call plumbing |
| DOC | Doc-to-code linkage graph: which README/spec chunks reference which classes, click through to see the actual linked classes |
| HIST | Telemetry dashboard across all indexed repos: stat cards, growth/elapsed charts, full index/reindex/doc-index event tables |

Pick a repository — the same picker drives every tab:

JIDRA UI — repository picker

IDX — pipeline configuration as label/control rows, live output log, plus incremental reindex and git-hook install:

JIDRA UI: index pipeline configuration

JIDRA UI: index pipeline running, live output log

GRF — full interactive graph, fuzzy search-to-jump, and a node inspector with callers/callees:

JIDRA UI — graph overview

SQL — schema browser plus a real query editor against graph.db:

JIDRA UI — SQL editor and schema browser

MCP — call any tool by hand and watch the session log fill in:

JIDRA UI — MCP tool call (jidra_explore) with live result

TRC — run flow/trace/context directly against the graph, no MCP plumbing in the way:

JIDRA UI — TRC flow result for a selected method

DOC — doc-to-code linkage graph; click a chunk or doc to see exactly which classes it links to:

JIDRA UI — doc graph overview

HIST — telemetry is top-level, not scoped to one repo: stat cards and charts across everything you've indexed:

JIDRA UI: telemetry dashboard, stat cards and charts

JIDRA UI: telemetry event tables

What JIDRA Does

  • Indexes Scala, Java, TypeScript, Python, and Go source into deterministic call graphs
  • Validates with language-specific strategies (Spring Actuator for Java, compiler-resolved for Scala, static analysis for TS/Python/Go)
  • Searches the graph by keyword or natural language (FTS5-backed jidra_search / jidra_explore)
  • Surfaces framework structure as first-class data — HTTP endpoints, React/Vue/Angular components & hooks (jidra_get_endpoints, jidra_get_components, jidra_get_framework_summary)
  • Answers impact-analysis questions — what breaks if I change this file (jidra_get_file_dependents / _dependencies)
  • Resolves interface→implementation and class surface — list every concrete implementation of an interface/abstract class in one call (jidra_get_implementations), or every method and field of a class (jidra_get_class_members)
  • Generates noise-free context (68-95% smaller depending on language), auto-scaled to repo size (budget tiers)
  • Traces method/function execution with uncertainty markers
  • Stays fresh automatically via git hooks + an in-daemon file watcher (no manual reindex)
  • Exports as JSON, MCP tools, or interactive HTML
  • Integrates with Claude/Codex/Gemini via MCP (shared-daemon proxy mode shares one in-memory graph across editor windows)
  • Reduces LLM token costs by 68-95% (proven on real projects)

Benchmarked vs CodeGraph (agent-in-loop)

JIDRA was evaluated against the CodeGraph MCP server using a real coding agent: the same LLM was given a navigation task and exactly one backend's tools, then scored on correctness, tool calls, tokens, and hallucinations. On a Spring Boot Java repo (7 tasks, Haiku 4.5):

| Config | Backend | Correct | Avg tool calls | Avg tokens | Hallucinations |
|---|---|---:|---:|---:|---:|
| A (runtime Actuator) | JIDRA | 7/7 | 3.0 | ~18.9k | 0/7 |
| A (runtime Actuator) | CodeGraph | 7/7 | 4.9 | ~65.2k | 0/7 |
| B (static only) | JIDRA | 7/7 | 3.9 | ~26.0k | 0/7 |
| B (static only) | CodeGraph | 6/7 | 5.0 | ~68.3k | 0/7 |

JIDRA used ~3.5x fewer tokens than CodeGraph in Config A (~18.9k vs ~65.2k) and ~2.6x fewer in Config B (~26.0k vs ~68.3k), with equal or better correctness. Runtime Actuator grounding (Config A) cut JIDRA's usage by a further ~27% in tokens and ~1 tool call per task versus static analysis (Config B) by resolving dependency-injection ambiguity up front, something CodeGraph's pure-static approach cannot match. The CodeGraph 7/7→6/7 difference between configs is agent run-to-run variance (CodeGraph never uses JIDRA's graph, so the Actuator setting cannot affect it), not a real effect. Full methodology, per-task data, and limitations are in FINDINGS_jidra_vs_codegraph.md.

What JIDRA Does NOT Do (By Design)

  • Autonomous agent loops - Claude already does this; we provide context
  • Multi-service distributed reasoning - Requires service mesh, not code analysis
  • Interactive debugging sessions - Single-shot context generation (not loops)
  • Config-driven behavior analysis - YAML/JSON parsing planned for v2.0
  • Full semantic Java correctness - AST + Actuator validation is best-effort

Bottom line: JIDRA is infrastructure FOR agents, not a replacement agent.

Project Layout

jidra/
├── README.md
├── pyproject.toml
├── pytest.ini
├── requirements.txt
├── LICENSE
├── .gitignore
├── docs/
│   ├── MCP_VERIFICATION_RESULTS.md
│   ├── actuator_incremental_plan.md
│   ├── incremental_reindex_plan.md
│   ├── quick_test_script.sh
│   ├── testing_incremental_reindex.md
│   ├── assets/
│   └── archive/
├── evals/
│   ├── agent_eval.py
│   ├── agent_eval_py.py
│   ├── agent_eval_ts.py
│   ├── analyze_session_logs.py
│   ├── compare_chat.py
│   ├── eval_chat.py
│   ├── eval_queries.yaml
│   ├── migrate_structure.py
│   └── validate_jidra_analysis.py
├── examples/
│   └── sample-java/
├── experiments/
│   ├── compare_graph_json.py
│   ├── enrichment_agent.py
│   ├── enrichment_judge.py
│   ├── enrichment_ui.py
│   ├── method_prompt.py
│   └── token_count.py
├── sidecar/
│   ├── scala/
│   └── typescript/
├── src/
│   └── jidra/
│       ├── cli.py
│       ├── cost_calculator.py
│       ├── context_builder.py
│       ├── daemon.py
│       ├── doc_indexer.py
│       ├── engine.py
│       ├── extractor.py
│       ├── flow_stitcher.py
│       ├── graph_rag.py
│       ├── graph_validator.py
│       ├── mcp_server.py
│       ├── scala_extractor.py
│       ├── session_log.py
│       ├── ts_filters.py
│       └── ...
├── tests/
│   ├── fixtures/
│   ├── conftest.py
│   ├── test_budget.py
│   ├── test_context_builder.py
│   ├── test_continuous_sync.py
│   ├── test_cost_calculator.py
│   ├── test_daemon.py
│   ├── test_engine.py
│   ├── test_file_deps.py
│   ├── test_flow_stitcher.py
│   ├── test_frameworks.py
│   ├── test_go_extractor.py
│   ├── test_graph_health.py
│   ├── test_incremental_index.py
│   ├── test_index_cache.py
│   ├── test_module_partitioning.py
│   ├── test_py_extractor.py
│   ├── test_search.py
│   ├── test_session_log.py
│   ├── test_smithy.py
│   └── test_ts_treesitter.py
├── ui/
│   ├── index.html
│   ├── package.json
│   ├── package-lock.json
│   ├── tsconfig.json
│   ├── vite.config.ts
│   └── src/
└── validations/
    ├── hallucination_test.py
    └── run_validation.py

Installation

This project is released under the MIT License (see LICENSE).

From project root:

cd scripts/jidra
pip install -e .

If you use the local venv:

./venv/bin/pip install -e .

Quick Start

Optional: configure your project package prefixes

Some features (like error-doc choosing the first "project" stack frame as an anchor) can use package prefixes to distinguish your code from third-party libraries.

Provide the prefixes as a comma-separated list. If unset, JIDRA treats any package as project code for anchoring.

Clone repository

git clone https://github.com/akhilsinghcodes/jidra.git
cd jidra

1) Build graph

python -m jidra.cli index \
  --codebase /path/to/java/repo \
  --output /tmp/graph.db

When output is a directory, JIDRA writes a single graph.db SQLite file. Main and test source are kept as separate rows via a variant column (main / test / validated) rather than separate files.

2) Trace method flow

python -m jidra.cli trace \
  --graph /tmp/graph.db \
  --method com.example.Controller.search

3) Build method context

python -m jidra.cli context \
  --graph /tmp/graph.db \
  --method com.example.Controller.search

4) Generate prompt text

python -m jidra.cli prompt \
  --graph /tmp/graph.db \
  --method com.example.Controller.search \
  --target codex

5) Diagnose with LLM

python -m jidra.cli diagnose \
  --graph /tmp/graph.db \
  --method com.example.Controller.search \
  --target codex \
  --llm-profile local

Storage: SQLite (graph.db)

JIDRA persists the code graph in a single SQLite database, graph.db, instead of the JSONL files (graph.jsonl, graph_test.jsonl, graph_validated.jsonl) used in earlier versions. One file, three logical graphs:

  • variant column (main / test / validated) replaces the three separate JSONL files — production code, test code, and the Spring-Actuator-filtered graph all live in the same tables, distinguished by a column instead of a filename.
  • module_id column replaces per-module JSONL files + modules_index.json for multi-module repos — one graph.db, scoped rows, no index file to keep in sync.
  • Real incremental updates: reindexing now runs scoped SQL DELETE/INSERT against just the changed file_path rows, instead of loading the entire graph into memory and rewriting the whole file on every change. Large repos reindex proportionally to what changed, not to total codebase size.
  • No compression step: SQLite's on-disk format made the --compress/.jsonl.zst path unnecessary, so it (and the zstandard dependency) was removed entirely.
  • Inspectable with standard tools: sqlite3 graph.db ".tables" / SELECT * FROM methods WHERE variant='validated' work directly — no custom JSONL parsing required to poke at the data.
  • Full-text search index: a methods_fts FTS5 virtual table (kept in sync by triggers) backs jidra_search / jidra_explore. The methods table also carries a framework_role column for endpoint/component queries.
  • In-place migration: the schema is versioned (schema_version); opening an older 2.0 database transparently upgrades it to 2.1 (creates + backfills the FTS index, adds framework_role) on first connect() — no rebuild required.
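
The scoped DELETE/INSERT update described above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration against a simplified, hypothetical methods table (only the variant, module_id, and file_path columns are named in this README; the other columns, the function name, and the sample rows are assumptions):

```python
import sqlite3

# Hypothetical, simplified schema: the real graph.db tables carry more columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE methods (
           id INTEGER PRIMARY KEY,
           name TEXT NOT NULL,
           file_path TEXT NOT NULL,
           variant TEXT NOT NULL,      -- 'main' / 'test' / 'validated'
           module_id TEXT NOT NULL
       )"""
)
INSERT_SQL = "INSERT INTO methods (name, file_path, variant, module_id) VALUES (?, ?, ?, ?)"
conn.executemany(INSERT_SQL, [
    ("OrderService.create", "src/OrderService.java", "main", "orders"),
    ("OrderService.cancel", "src/OrderService.java", "main", "orders"),
    ("PaymentService.charge", "src/PaymentService.java", "main", "payments"),
    ("OrderServiceTest.testCreate", "test/OrderServiceTest.java", "test", "orders"),
])

def reindex_file(conn, file_path, variant, module_id, method_names):
    """Replace only the rows that belong to one changed file."""
    with conn:  # one transaction: delete stale rows, then insert fresh ones
        conn.execute(
            "DELETE FROM methods WHERE file_path = ? AND variant = ?",
            (file_path, variant),
        )
        conn.executemany(
            INSERT_SQL,
            [(name, file_path, variant, module_id) for name in method_names],
        )

# OrderService.java was edited: cancel() was removed and refund() was added.
reindex_file(conn, "src/OrderService.java", "main", "orders",
             ["OrderService.create", "OrderService.refund"])

main_methods = [row[0] for row in conn.execute(
    "SELECT name FROM methods WHERE variant = 'main' ORDER BY name")]
test_count = conn.execute(
    "SELECT COUNT(*) FROM methods WHERE variant = 'test'").fetchone()[0]
print(main_methods)  # ['OrderService.create', 'OrderService.refund', 'PaymentService.charge']
print(test_count)    # 1
```

Rows for untouched files and for the test variant are left alone, which is why the work scales with the size of the change rather than the size of the graph.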

jidra up writes graph.db (plus the validation report and visualization HTML) to JIDRA's own jidra/output/<repo-slug>-<branch>/ directory rather than into the target repo — the repo you're analyzing only ever gets a .mcp.json (or nothing, if you register the MCP server via claude mcp add / codex mcp add instead).

Graph Selection Behavior

For trace, context, trace-route, prompt, diagnose:

  • --graph provided: used directly
  • --graph omitted: selected by --graph-type (main default)
    • main -> jidra/output/graph.db (variant="main")
    • test -> jidra/output/graph.db (variant="test")

Method Selectors

Supported method selectors:

  • method id
  • full signature
  • full class + method (com.example.Class.method)
  • short class + method (Class.method)
  • bare method name (if unique)

Ambiguous selector output includes candidate ids you can use directly.
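
One way to picture selector resolution is a cascade from the most specific form to the loosest. The sketch below is an assumption about ordering and data shape, not JIDRA's actual resolver; the Method fields and the id values are hypothetical:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Method:
    id: str          # hypothetical id format
    class_fqn: str   # e.g. "com.example.SearchController"
    name: str        # e.g. "search"
    signature: str   # e.g. "com.example.SearchController.search(String)"

def resolve(selector: str, methods: list[Method]) -> tuple[str, list[str]]:
    """Return (status, candidate ids); status is matched, ambiguous, or unmatched."""
    def short_class(m: Method) -> str:
        return m.class_fqn.rsplit(".", 1)[-1]

    # Try the most specific selector forms first, then fall back to looser ones.
    strategies = [
        lambda m: m.id == selector,
        lambda m: m.signature == selector,
        lambda m: f"{m.class_fqn}.{m.name}" == selector,
        lambda m: f"{short_class(m)}.{m.name}" == selector,
        lambda m: m.name == selector,
    ]
    for matches in strategies:
        hits = [m.id for m in methods if matches(m)]
        if len(hits) == 1:
            return "matched", hits
        if len(hits) > 1:
            return "ambiguous", hits  # candidate ids can be reused as selectors
    return "unmatched", []

methods = [
    Method("m1", "com.example.SearchController", "search",
           "com.example.SearchController.search(String)"),
    Method("m2", "com.example.admin.SearchController", "search",
           "com.example.admin.SearchController.search(String)"),
    Method("m3", "com.example.OrderController", "create",
           "com.example.OrderController.create(Order)"),
]
print(resolve("com.example.SearchController.search", methods))  # ('matched', ['m1'])
print(resolve("SearchController.search", methods))              # ('ambiguous', ['m1', 'm2'])
print(resolve("create", methods))                               # ('matched', ['m3'])
```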

Command Reference

validate

Purpose: validate static call graph against a running Spring Boot app's actuator beans, filtering out phantom edges to uninstantiated classes.

jidra validate \
  [--graph <path>] \
  [--graph-type main|test] \
  [--actuator-url <url>] \
  [--codebase <path>] \
  [--port 8080] \
  [--timeout 120] \
  [--output <path>] \
  [--report <path>] \
  [--no-filter] \
  [--skip-build]

Behavior:

  • --actuator-url provided: connect directly to running app (e.g., http://localhost:8080)
  • --codebase provided: auto-build Docker image, run container, query actuator, cleanup
  • Must provide one of --actuator-url or --codebase
  • Fetches /actuator/beans to extract confirmed bean class names
  • Filters graph: removes edges to non-bean classes, removes CallSites pointing to non-beans
  • Upgrades unresolved CallSites where receiver type matches a confirmed bean
  • Outputs: graph.db with variant="validated" rows (filtered graph) + JSON report

Filtering logic:

  1. Extract confirmed bean class names from actuator response
  2. Remove ResolvedCallEdge where callee method's class is not a confirmed bean
  3. Remove CallSite records where all resolved_candidates point to non-beans
  4. Upgrade CallSite with status unresolved_receiver if receiver type matches a bean
  5. Keep all class/method nodes for context (not all classes are beans)
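
The filtering steps above can be sketched as follows. The payload shape follows Spring Boot's /actuator/beans response (contexts, then beans, then a type per bean). The edge and call-site dictionaries are simplified assumptions: receiver_type, the resolved_by_bean status, and the class_of helper are hypothetical names, not JIDRA's schema.

```python
def extract_bean_classes(beans_payload: dict) -> set[str]:
    """Step 1: collect bean class names from an /actuator/beans response."""
    classes = set()
    for context in beans_payload.get("contexts", {}).values():
        for bean in context.get("beans", {}).values():
            if "type" in bean:
                classes.add(bean["type"])
    return classes

def class_of(method_fqn: str) -> str:
    """'com.example.Foo.bar' -> 'com.example.Foo' (hypothetical id format)."""
    return method_fqn.rsplit(".", 1)[0]

def filter_graph(edges, callsites, beans):
    # Step 2: drop resolved edges whose callee class is not a confirmed bean.
    kept_edges = [e for e in edges if class_of(e["callee"]) in beans]

    kept_sites, upgraded = [], 0
    for site in callsites:
        candidates = site.get("resolved_candidates", [])
        if site["status"] == "unresolved_receiver":
            # Step 4: an unresolved receiver whose type is a bean gets upgraded.
            if site.get("receiver_type") in beans:
                site = {**site, "status": "resolved_by_bean"}
                upgraded += 1
            kept_sites.append(site)
        elif candidates and all(class_of(c) not in beans for c in candidates):
            # Step 3: every candidate points to a non-bean, so drop the call site.
            continue
        else:
            kept_sites.append(site)
    return kept_edges, kept_sites, upgraded

payload = {"contexts": {"application": {"beans": {
    "orderService": {"type": "com.example.OrderService", "dependencies": []},
}}}}
beans = extract_bean_classes(payload)
edges = [
    {"caller": "com.example.OrderController.create", "callee": "com.example.OrderService.save"},
    {"caller": "com.example.OrderController.create", "callee": "com.example.LegacyService.save"},
]
callsites = [
    {"status": "resolved", "resolved_candidates": ["com.example.LegacyService.save"]},
    {"status": "unresolved_receiver", "receiver_type": "com.example.OrderService"},
]
kept_edges, kept_sites, upgraded = filter_graph(edges, callsites, beans)
print(len(kept_edges), len(kept_sites), upgraded)  # 1 1 1
```

Class and method nodes are never touched here, matching step 5: only edges and call sites are filtered.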

Example: Direct URL

jidra validate \
  --actuator-url http://localhost:8080 \
  --graph /path/to/graph.db \
  --output /path/to/output \
  --report /path/to/report.json

Example: Auto Docker build+run (does a clean Java build unless --skip-build is passed)

jidra validate \
  --codebase /path/to/java/repo \
  --graph /path/to/graph.db \
  --port 8080 \
  --timeout 120

jidra automatically:

  1. Detects ./gradlew or pom.xml (gradle or maven)
  2. Runs ./gradlew clean build -x test or ./mvnw clean package -DskipTests
  3. Builds Docker image
  4. Runs container and queries actuator
  5. Cleans up Docker resources

To skip the Java build (if you've already done gradle build manually):

jidra validate --codebase /path/to/java/repo --graph /path/to/graph.db --skip-build

Example: Debug mode (report removals, don't filter)

jidra validate \
  --actuator-url http://localhost:8080 \
  --graph /path/to/graph.db \
  --no-filter \
  --report /tmp/validation_debug.json

Report output shape:

{
  "total_classes": 412,
  "confirmed_beans": 87,
  "unconfirmed_classes_sample": ["com.example.Dto", ...],
  "edges_before": 1843,
  "edges_after": 1201,
  "edges_removed": 642,
  "callsites_upgraded": 14,
  "removed_edges_sample": [
    {"caller": "...", "callee": "..."}
  ]
}

flow-doc

Purpose: generate deterministic flow investigation markdown from indexed graph data (no LLM calls).

jidra flow-doc \
  [--graph <path>] \
  [--graph-type main|test] \
  --method <selector> \
  --output <markdown-path> \
  [--depth 4] \
  [--top-n 8] \
  [--max-subflows 8] \
  [--mind-map] \
  [--max-nodes 200] \
  [--include-details] \
  [--include-utility]

Behavior:

  • Normal mode (no --mind-map): prioritized flow slices using top_n and max_subflows.
  • --mind-map mode: recursive resolved-edge traversal using depth + max_nodes; it does not use top_n/max_subflows for traversal.
  • --include-details: in --mind-map mode, appends legacy detailed expanded sections that still use prioritized slicing (top_n/max_subflows).
  • Output is deterministic for the same graph + method + flags.
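
To illustrate how depth and max_nodes bound a mind-map traversal, here is a minimal sketch. It uses a breadth-first walk over an adjacency map rather than the recursive traversal JIDRA describes, and the graph shape and function name are assumptions. Sorting callees is what keeps the output deterministic for the same inputs.

```python
from collections import deque

def mind_map(graph: dict[str, list[str]], root: str, depth: int, max_nodes: int):
    """Breadth-first walk over resolved edges, bounded by depth and node count."""
    seen = {root}
    order = [(root, 0)]
    queue = deque([(root, 0)])
    while queue:
        node, level = queue.popleft()
        if level == depth:
            continue  # do not expand below the depth limit
        for callee in sorted(graph.get(node, [])):  # sorted for stable output
            if callee in seen:
                continue
            if len(seen) >= max_nodes:
                return order  # node budget exhausted
            seen.add(callee)
            order.append((callee, level + 1))
            queue.append((callee, level + 1))
    return order

graph = {"search": ["validate", "query"], "query": ["execute"], "execute": ["log"]}
full = mind_map(graph, "search", depth=2, max_nodes=200)
capped = mind_map(graph, "search", depth=2, max_nodes=3)
print(full)    # [('search', 0), ('query', 1), ('validate', 1), ('execute', 2)]
print(capped)  # [('search', 0), ('query', 1), ('validate', 1)]
```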

Examples:

python -m jidra.cli flow-doc \
  --method SearchController.search \
  --output flow_docs/verify_SearchController_search.md \
  --depth 10 \
  --top-n 10 \
  --max-subflows 10 \
  --show-agents

python -m jidra.cli flow-doc \
  --method SearchController.search \
  --output flow_docs/mindmap_SearchController_search.md \
  --mind-map \
  --depth 6 \
  --max-nodes 120

error-doc

Purpose: generate deterministic error investigation markdown from a Java stack trace text file and indexed graph.

jidra error-doc \
  --stack-trace <stack-trace.txt> \
  --output <markdown-path> \
  [--graph <path>] \
  [--graph-type main|test] \
  [--depth 6] \
  [--max-nodes 200] \
  [--mind-map]

Stack frame parsing:

  • Parses lines in format: at package.Class.method(File.java:123).

Frame-to-method matching:

  • class full name
  • method name
  • file name
  • line in method [start_line, end_line]

Match semantics:

  • matched: exactly one graph method candidate.
  • ambiguous: multiple candidates (reported as ambiguity).
  • unmatched: no candidate.
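
A rough sketch of the frame parsing and match classification described above. The regular expression and the method dictionaries are assumptions for illustration, and the sketch assumes all four criteria (class, method, file, line range) must hold together, which this README does not state explicitly:

```python
import re

# Matches lines such as: "at com.example.Foo.bar(Foo.java:123)"
FRAME_RE = re.compile(
    r"^\s*at\s+(?P<cls>[\w.$]+)\.(?P<method>[\w$<>]+)"
    r"\((?P<file>[^:()]+):(?P<line>\d+)\)\s*$"
)

def parse_frames(stack_trace: str) -> list[dict]:
    """Extract class, method, file, and line from each parsable frame line."""
    frames = []
    for raw in stack_trace.splitlines():
        m = FRAME_RE.match(raw)
        if m:
            frames.append({
                "class": m["cls"], "method": m["method"],
                "file": m["file"], "line": int(m["line"]),
            })
    return frames

def match_frame(frame: dict, methods: list[dict]) -> tuple[str, list[dict]]:
    """Classify a frame as matched, ambiguous, or unmatched against graph methods."""
    hits = [
        m for m in methods
        if m["class"] == frame["class"]
        and m["name"] == frame["method"]
        and m["file"] == frame["file"]
        and m["start_line"] <= frame["line"] <= m["end_line"]
    ]
    status = "matched" if len(hits) == 1 else "ambiguous" if hits else "unmatched"
    return status, hits

trace = """java.lang.IllegalStateException: boom
    at com.example.app.health.HealthIndicator.doHealthCheck(HealthIndicator.java:42)
    at org.example.lib.Runner.run(Runner.java:10)"""
methods = [{"class": "com.example.app.health.HealthIndicator", "name": "doHealthCheck",
            "file": "HealthIndicator.java", "start_line": 30, "end_line": 60}]
results = [(f["method"], match_frame(f, methods)[0]) for f in parse_frames(trace)]
print(results)  # [('doHealthCheck', 'matched'), ('run', 'unmatched')]
```

Frames without a file and line, such as native-method frames, do not match this pattern and are simply skipped in the sketch.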

Anchor + focused map:

  • primary failure anchor: first matched/ambiguous project frame.
  • focused flow map: generated via deterministic flow-doc mind-map traversal around anchor.
  • upstream/downstream behavior:
    • downstream-focused when anchor has meaningful downstream callees.
    • upstream-focused fallback when downstream is weak.

Examples:

python -m jidra.cli error-doc \
  --stack-trace examples/error_1.txt \
  --output flow_docs/error_doc_verify_clean.md \
  --mind-map \
  --depth 6 \
  --max-nodes 80

Determinism and Limits

  • Static analysis only; runtime dispatch is not guaranteed.
  • Unresolved calls may remain in outputs.
  • External library frames/methods may be unmatched.
  • Graph quality directly affects output quality.
  • No runtime correctness claims; output is investigation guidance.

Example Output Snippet

## Suggested Debug Locations
| priority | location | reason |
|---:|---|---|
| 1 | `com.example.app.health.HealthIndicator#doHealthCheck(Health.Builder)` | failing project frame |
| 2 | `org.opensearch.client.opensearch.cluster.OpenSearchClusterClient#health:360` | caller frame above failure |
| 3 | `this.client.cluster().health` | unresolved external call near failure |

index

jidra index --codebase <path> --output <path-or-dir> [--ts-backend auto|treesitter|tsmorph]

Builds the graph (graph.db) from source code (auto-detects language):

  • Scala: SemanticDB-based extraction (Docker sidecar) — compiler-resolved call edges, ~90% resolution
  • Java: tree-sitter-based AST extraction + call resolution, ~85% resolution
  • TypeScript: in-process tree-sitter extraction by default (no Docker), ~65% resolution. Pass --ts-backend tsmorph to use the Docker ts-morph sidecar instead (~80% resolution via the TypeScript compiler). auto (default) uses tree-sitter and falls back to the sidecar only if tree-sitter-typescript isn't installed.
  • Python: libcst/AST-based extraction + symbol table type inference, ~68.5% resolution
  • Go: tree-sitter-based AST extraction (in-process, no Docker/compiler) + local symbol-table call resolution, best-effort — no interface-satisfaction (structural typing) resolution

Language detection is automatic via manifest files (build.sbt, pom.xml, package.json, pyproject.toml, go.mod, etc.). Multiple languages in the same repo are detected and merged into a single graph automatically.

Note: a directory that contains source files but no manifest (e.g. loose .py/.ts files with no requirements.txt/pyproject.toml/package.json) is not recognized as that language and will index to an empty graph. Add the appropriate manifest so the codebase is detected. This also affects auto-sync (below): the watcher/hooks will reindex but find nothing to extract.
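
The manifest rule explains why loose source files index to an empty graph. A minimal sketch of manifest-based detection, limited to the manifest names mentioned in this README (the real list is longer, and whether JIDRA searches recursively is an assumption):

```python
import tempfile
from pathlib import Path

# Manifest -> language, restricted to the files named in this README.
MANIFESTS = {
    "build.sbt": "scala",
    "pom.xml": "java",
    "package.json": "typescript",
    "pyproject.toml": "python",
    "requirements.txt": "python",
    "go.mod": "go",
}

def detect_languages(codebase: str) -> set[str]:
    """Return the languages whose manifest appears anywhere under `codebase`."""
    found = set()
    for path in Path(codebase).rglob("*"):
        if path.is_file() and path.name in MANIFESTS:
            found.add(MANIFESTS[path.name])
    return found

with tempfile.TemporaryDirectory() as repo:
    Path(repo, "go.mod").write_text("module example.com/demo\n")
    Path(repo, "script.py").write_text("print('hi')\n")  # loose file, no Python manifest
    detected = detect_languages(repo)

print(detected)  # {'go'}
```

The loose script.py is ignored because no Python manifest sits next to it, which mirrors the empty-graph behavior in the note above.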

reindex

jidra reindex [--graph <path>] [--codebase <path>] [--changed-files <f1> <f2> ...]

Incrementally updates an existing graph.db after files change (fingerprint-based; falls back to a full rebuild when needed). --changed-files is a hint used by the git hooks. This is the command the git hooks and the in-daemon file watcher call under the hood.
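
Fingerprint-based change detection can be sketched as a comparison of stored and current content hashes. SHA-256, the .java suffix filter, and the function names are assumptions for illustration; this README does not say which fingerprint JIDRA uses:

```python
import hashlib
import tempfile
from pathlib import Path

def fingerprint(path: Path) -> str:
    """Content hash of one file (SHA-256 is an assumption)."""
    return hashlib.sha256(path.read_bytes()).hexdigest()

def changed_files(root: Path, previous: dict[str, str], suffixes=(".java",)):
    """Compare current fingerprints with the stored ones.

    Returns (changed_or_new, deleted, current), with paths relative to `root`.
    """
    current = {
        str(p.relative_to(root)): fingerprint(p)
        for p in sorted(root.rglob("*"))
        if p.is_file() and p.suffix in suffixes
    }
    changed = [f for f, h in current.items() if previous.get(f) != h]
    deleted = [f for f in previous if f not in current]
    return changed, deleted, current

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "A.java").write_text("class A {}")
    (root / "B.java").write_text("class B {}")
    _, _, stored = changed_files(root, {})                     # first full index
    (root / "A.java").write_text("class A { void f() {} }")    # edit
    (root / "B.java").unlink()                                 # delete
    changed, deleted, _ = changed_files(root, stored)

print(changed, deleted)  # ['A.java'] ['B.java']
```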

hooks

jidra hooks install   [--repo <path>] [--graph <path>]
jidra hooks uninstall [--repo <path>]

Installs post-commit / post-merge / post-checkout git hooks that auto-reindex the graph when the working tree changes, so it never goes stale. Hook bodies are wrapped in delimited # BEGIN JIDRA / # END JIDRA blocks, so they compose with other hook managers (Husky, lefthook) and uninstall removes only JIDRA's block. When running the MCP server in --mode proxy, the shared daemon also runs a debounced filesystem watcher that hot-reloads the graph on save — so on most setups you get fresh graphs with no manual reindex at all.
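
The delimited-block approach is what lets install and uninstall coexist with other hook managers. A minimal sketch operating on hook file text; the function names are hypothetical, and the command placed inside the block is an assumption rather than JIDRA's real hook body:

```python
BEGIN, END = "# BEGIN JIDRA", "# END JIDRA"

def uninstall_block(hook_text: str) -> str:
    """Remove only the lines between the JIDRA markers, markers included."""
    kept, inside = [], False
    for line in hook_text.splitlines(keepends=True):
        if line.strip() == BEGIN:
            inside = True
        elif line.strip() == END:
            inside = False
        elif not inside:
            kept.append(line)
    return "".join(kept)

def install_block(hook_text: str, command: str) -> str:
    """Append a delimited JIDRA block, replacing any previous one."""
    base = uninstall_block(hook_text)
    if not base.strip():
        base = "#!/bin/sh\n"  # brand-new hook file
    if not base.endswith("\n"):
        base += "\n"
    return f"{base}{BEGIN}\n{command}\n{END}\n"

husky_hook = "#!/bin/sh\nnpx lint-staged\n"
installed = install_block(husky_hook, "jidra reindex")
reinstalled = install_block(installed, "jidra reindex")
print(installed)
print(uninstall_block(installed) == husky_hook)  # True
print(reinstalled == installed)                  # True
```

Installing twice yields the same file, and uninstalling restores the other manager's content byte for byte.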

ui

jidra ui [--host 127.0.0.1] [--port 7474] [--reload]

Serves the React web UI (see Web UI above) plus its FastAPI backend on one port. Requires ui/dist to exist — build it once with cd ui && npm install && npm run build. --reload enables uvicorn auto-reload for backend development; it does not rebuild the frontend (run npm run dev in ui/ separately for frontend hot-reload during UI development).

trace

jidra trace \
  [--graph <path>] \
  [--graph-type main|test] \
  --method <selector> \
  [--max-depth 5] \
  [--business-only] \
  [--output <file-or-dir>]

  • --business-only filters support/metrics/logging from flow output
  • root node is always preserved

context

jidra context \
  [--graph <path>] \
  [--graph-type main|test] \
  --method <selector> \
  [--max-chars 12000] \
  [--max-tokens <int>] \
  [--business-only] \
  [--output <file-or-dir>]

Includes:

  • method signature/source
  • endpoint metadata
  • resolved callee summary
  • unresolved call summary

Context output is deduped/grouped for prompt readiness.

trace-route

jidra trace-route \
  [--graph <path>] \
  [--graph-type main|test] \
  --route <path> \
  [--max-depth 5] \
  [--output <file-or-dir>]

prompt

jidra prompt \
  [--graph <path>] \
  [--graph-type main|test] \
  --method <selector> \
  [--max-chars 12000] \
  [--max-tokens <int>] \
  [--business-only|--no-business-only] \
  [--target claude|codex|generic] \
  [--output <file-or-dir>]

Default: --business-only is enabled.

diagnose

jidra diagnose \
  [--graph <path>] \
  [--graph-type main|test] \
  --method <selector> \
  [--target claude|codex|generic] \
  [--model <model>] \
  [--max-chars 12000] \
  [--max-tokens <int>] \
  [--business-only|--no-business-only] \
  [--llm-profile local|enterprise] \
  [--config <path-to-config.yaml>] \
  [--show-prompt] \
  [--quiet] \
  [--output <file-or-dir>]

Behavior:

  • No --output + interactive TTY + not --quiet: ANSI-readable report
  • No --output + non-TTY or --quiet: JSON printed
  • With --output: JSON written to file
  • --show-prompt: includes prompt text in result JSON
  • --max-chars: controls method context/source size sent into prompt construction
  • --max-tokens: overrides model output token limit for this run (when omitted, config profile default is used)
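
The three output rules above reduce to a small decision function. This is a sketch of the rules as written here, with hypothetical names for the function and the returned mode labels:

```python
from typing import Optional

def choose_output_mode(output: Optional[str], quiet: bool, is_tty: bool) -> str:
    """Pick how a diagnose result is emitted, following the rules listed above."""
    if output is not None:
        return "json-file"     # --output given: JSON written to file
    if is_tty and not quiet:
        return "ansi-report"   # interactive terminal: human-readable report
    return "json-stdout"       # piped output or --quiet: JSON printed

print(choose_output_mode(None, quiet=False, is_tty=True))       # ansi-report
print(choose_output_mode(None, quiet=True, is_tty=True))        # json-stdout
print(choose_output_mode(None, quiet=False, is_tty=False))      # json-stdout
print(choose_output_mode("out.json", quiet=False, is_tty=True)) # json-file
```

In a real CLI the is_tty argument would typically come from sys.stdout.isatty().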

Output Naming

When --output is a directory:

  • trace: trace_<graph_type>_<method>.json
  • trace + business-only: trace_business_<graph_type>_<method>.json
  • context: context_<graph_type>_<method>.json
  • context + business-only: context_business_<graph_type>_<method>.json
  • trace-route: trace_route_<graph_type>_<route_or_entry>.json
  • prompt: prompt_<target>_<graph_type>_<method>.txt
  • diagnose: diagnose_<target>_<graph_type>_<method>.json

Names are normalized to lowercase snake-style safe parts.
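
As an illustration of the naming patterns above, the sketch below assembles a file name from its parts. The normalization rule (lowercase, with every run of non-alphanumeric characters collapsed to one underscore) is an assumption; JIDRA's exact rule may differ, for example in how it treats camelCase:

```python
import re

def safe_part(text: str) -> str:
    """Lowercase and collapse anything outside [a-z0-9] into single underscores."""
    return re.sub(r"[^a-z0-9]+", "_", text.lower()).strip("_")

def output_name(command, graph_type, subject, target=None, business_only=False):
    """Build <command>[_business][_<target>]_<graph_type>_<subject>.<ext>."""
    parts = [command]
    if business_only and command in ("trace", "context"):
        parts.append("business")
    if target is not None:  # prompt and diagnose put the target first
        parts.append(target)
    parts += [graph_type, subject]
    ext = "txt" if command == "prompt" else "json"
    return "_".join(safe_part(p) for p in parts) + "." + ext

print(output_name("trace", "main", "SearchController.search", business_only=True))
# trace_business_main_searchcontroller_search.json
print(output_name("prompt", "main", "SearchController.search", target="codex"))
# prompt_codex_main_searchcontroller_search.txt
print(output_name("trace-route", "test", "/api/search"))
# trace_route_test_api_search.json
```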

LLM Configuration

JIDRA uses jidra/config.yaml.

Example:

llm:
  provider: litellm
  profile: local

  profiles:
    local:
      api_base: "http://localhost:4000"
      api_key_env: "LITELLM_PROXY_API_KEY"
      default_model: "ollama/gemma4:e4b"
      timeout_seconds: 120
      temperature: 0.2
      max_tokens: 1200

    enterprise:
      api_base: "https://your-enterprise-litellm.example.com"
      api_key_env: "ENTERPRISE_LITELLM_API_KEY"
      default_model: "gpt-4o-mini"
      timeout_seconds: 120
      temperature: 0.2
      max_tokens: 2000

Rules:

  • Default profile comes from llm.profile
  • CLI override: --llm-profile
  • If api_key_env is set, env var is read
  • Missing config falls back to safe local defaults
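
The profile rules can be sketched as a small resolver. The config is written as a plain dict mirroring the YAML example (to avoid a YAML dependency), and resolve_profile is a hypothetical helper, not JIDRA's llm_client API. The fallback for a missing config is left out because its defaults are not documented here:

```python
import os

CONFIG = {  # mirrors the YAML example above
    "llm": {
        "provider": "litellm",
        "profile": "local",
        "profiles": {
            "local": {"api_base": "http://localhost:4000",
                      "api_key_env": "LITELLM_PROXY_API_KEY",
                      "default_model": "ollama/gemma4:e4b", "max_tokens": 1200},
            "enterprise": {"api_base": "https://your-enterprise-litellm.example.com",
                           "api_key_env": "ENTERPRISE_LITELLM_API_KEY",
                           "default_model": "gpt-4o-mini", "max_tokens": 2000},
        },
    }
}

def resolve_profile(config, cli_profile=None, cli_max_tokens=None, env=os.environ):
    """Apply the rules above: CLI profile wins, API key comes from the named env var."""
    llm = config["llm"]
    name = cli_profile or llm["profile"]   # --llm-profile overrides llm.profile
    profile = dict(llm["profiles"][name])  # copy so the config stays untouched
    key_env = profile.get("api_key_env")
    profile["api_key"] = env.get(key_env) if key_env else None
    if cli_max_tokens is not None:         # --max-tokens overrides the profile default
        profile["max_tokens"] = cli_max_tokens
    profile["name"] = name
    return profile

default = resolve_profile(CONFIG, env={})
override = resolve_profile(CONFIG, cli_profile="enterprise", cli_max_tokens=500,
                           env={"ENTERPRISE_LITELLM_API_KEY": "secret"})
print(default["name"], default["default_model"], default["api_key"])
# local ollama/gemma4:e4b None
print(override["name"], override["max_tokens"], override["api_key"])
# enterprise 500 secret
```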

Diagnose Output Shape

diagnose returns JSON with:

{
  "method": "...",
  "analysis": "...",
  "llm": {
    "provider": "litellm",
    "profile": "local",
    "model": "...",
    "usage": {
      "input_tokens": 0,
      "output_tokens": 0,
      "total_tokens": 0,
      "reasoning_tokens": 0
    },
    "latency_seconds": 0.0,
    "limits": {
      "max_chars": 12000,
      "max_tokens": null
    }
  },
  "context_summary": {
    "business_flow_count": 0,
    "unresolved_count": 0
  }
}

If provider usage is unavailable, token counts are estimated and "estimated": true is added under llm.usage.

Context/Token Limits

  • --max-chars (context, prompt, diagnose):
    • default 12000
    • passed directly to context building to constrain context payload size
  • --max-tokens (context, prompt, diagnose):
    • optional CLI override
    • primarily used by diagnose to cap LLM output tokens
    • if omitted, profile default from jidra/config.yaml is used

Troubleshooting

jidra --help works but diagnose fails

Likely LLM connectivity issue:

  • verify LiteLLM endpoint in config
  • verify API key/env key
  • verify network access to endpoint

No methods matched selector

Use a stronger selector:

  • class+method or exact method id from ambiguity output

no_flow_root:/route

No endpoint matched that route in graph. Validate route annotations and graph source set.

pip install -e . fails

Check Python/venv and package index/network availability.

Cost/ROI Calculator

JIDRA includes a cost calculator that measures actual token savings from your real codebase — not estimates.

# Graph-wide averages
jidra cost-roi --model claude-opus-4-7 --queries 1000

# Specific method — reads real source files, no API calls
jidra cost-roi --method SearchController.search --model claude-opus-4-7 --queries 1000

# Specific method — real Claude API calls, exact token counts (requires ANTHROPIC_API_KEY)
jidra cost-roi \
  --method SearchController.search \
  --codebase /path/to/java-repo \
  --model claude-opus-4-7 \
  --queries 1000 \
  --offline false

See COST_ROI_CALCULATOR.md for full usage and how the measurement works.

Validation

Two scripts in validations/ let you prove JIDRA's value on your own codebase.

Token & Cost Validation (run_validation.py)

Measures real token savings via Claude API calls — traditional raw source vs JIDRA context.

ANTHROPIC_API_KEY=... python validations/run_validation.py \
    --graph /path/to/.jidra/graph.db \
    --codebase /path/to/your-repo \
    --methods "OrderController.createOrder,PaymentService.charge" \
    --model claude-opus-4-7 \
    --output validations/results.json

Hallucination & Consistency Validation (hallucination_test.py)

Tests whether JIDRA reduces hallucinations and model drift across 5 dimensions:

  1. Call graph accuracy — does the model correctly name what a method calls?
  2. Caller tracing — does the model correctly name what calls a method?
  3. Change impact — does the model correctly identify what breaks if a method changes?
  4. Unit test generation — do generated tests reference real symbols?
  5. Consistency/drift — does the model give the same answer twice in separate sessions?

# Pass your own methods inline
ANTHROPIC_API_KEY=... python validations/hallucination_test.py \
    --graph /path/to/.jidra/graph.db \
    --codebase /path/to/your-repo \
    --methods "OrderController.createOrder,PaymentService.charge"

# Pass methods via file (one per line)
ANTHROPIC_API_KEY=... python validations/hallucination_test.py \
    --graph /path/to/.jidra/graph.db \
    --codebase /path/to/your-repo \
    --methods-file my_methods.txt

# Auto-discover endpoints from the graph
ANTHROPIC_API_KEY=... python validations/hallucination_test.py \
    --graph /path/to/.jidra/graph.db \
    --codebase /path/to/your-repo \
    --auto-discover --discover-limit 5

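# Run specific tests only (e.g. unit test gen + drift)
ANTHROPIC_API_KEY=... python validations/hallucination_test.py \
    --graph /path/to/.jidra/graph.db \
    --codebase /path/to/your-repo \
    --methods "OrderController.createOrder,PaymentService.charge" \
    --tests 4,5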

Methods file format (my_methods.txt):

# One selector per line — ClassName.methodName or fully qualified
OrderController.createOrder
PaymentService.charge
com.example.search.SearchController.search

Aggregate output:

hallucination_rate    traditional=0.42  jidra=0.08  improvement=+81.0%
fabrication_rate      traditional=0.35  jidra=0.06  improvement=+82.9%
drift_score           traditional=0.28  jidra=0.04  improvement=+85.7%
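
The improvement column is consistent with a relative reduction of each rate. A short check of that arithmetic, assuming improvement = (traditional - jidra) / traditional (the formula is inferred from the numbers above, not taken from the script):

```python
def improvement(traditional: float, jidra: float) -> float:
    """Relative reduction, in percent, of a rate where lower is better."""
    return (traditional - jidra) / traditional * 100

rows = [
    ("hallucination_rate", 0.42, 0.08),
    ("fabrication_rate", 0.35, 0.06),
    ("drift_score", 0.28, 0.04),
]
for name, trad, jid in rows:
    print(f"{name:<20} improvement=+{improvement(trad, jid):.1f}%")
# hallucination_rate   improvement=+81.0%
# fabrication_rate     improvement=+82.9%
# drift_score          improvement=+85.7%
```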

Testing

# All tests
python -m pytest tests/ -v

# Cost calculator only
python -m pytest tests/test_cost_calculator.py -v

# Unit tests only (no graph file needed)
python -m pytest tests/test_cost_calculator.py -v -k "not real and not missing"

See tests/README.md for the full test structure.

Development Notes

  • cli.py handles command orchestration only.
  • llm_client.py owns provider, config, and usage-metrics behavior.
  • graph extraction and graph format are intentionally unchanged.

About

A CLI and MCP server for multi-language codebase graph indexing, call tracing, and LLM-assisted diagnosis. It builds a static call graph via tree-sitter, traces method and route flows, and generates deterministic flow and error investigation docs.
