AGENTS

This playbook is for agentic contributors working in this repository. Follow these rules unless a task explicitly overrides them. Keep edits in ASCII.

Scope and Environment

Scope: entire repository (root + examples/*).
Runtime: Python 3.12, uv-managed .venv (do not swap toolchains).
Workspace: uv workspace members include examples/* (demo apps and notebooks).
Platform: macOS; SurrealDB used for examples; keep commands portable.
No Cursor rules (.cursor/rules, .cursorrules) or Copilot instructions found.

Setup and Dependencies

Create/activate env with uv: uv sync --all-packages (or uv sync inside the venv).
If env already present, prefer uv run ... to ensure deps resolve via uv.
Lockfile: uv.lock is source of truth; avoid hand-editing.
Additional sources: pyproject.toml pins workspace extras (logfire from Git, surrealfs-py via Git in knowledge-graph example).

Core Commands (root package `kaig`)

Format imports: uv run ruff check --fix --select I.
Format code: uv run ruff format (80 cols, 4-space indent, double quotes, trailing commas on multi-line).
Lint: uv run ruff check (keep clean; prefer fixing warnings rather than ignoring).
Type check all: uv run basedpyright src/ examples/ --level error.
Package-specific type check: uv run basedpyright -p examples/<pkg> (e.g., examples/knowledge-graph).
Tests (all): uv run pytest.
Tests (single file): uv run pytest src/kaig/tests/test_db.py (adjust path).
Tests (single test): uv run pytest src/kaig/tests/test_db.py -k test_name.
Keep tests deterministic; avoid hitting external services. Mock SurrealDB/LLM when possible.

Justfile Shortcuts (run with `just <recipe>`)

format: runs import sort then formatter.
lint: ruff check + basedpyright for root and selected examples.
knowledge-graph-db / kg-db: start SurrealDB locally for the knowledge-graph example (surreal start -u root -p root rocksdb:databases/knowledge-graph).
knowledge-graph DB / kg DB: run FastAPI ingestion server with DB_NAME env, e.g. just kg test_db (wraps uv run --env-file .env -- fastapi run examples/knowledge-graph/src/knowledge_graph/server.py --port 8080).
knowledge-graph-agent DB / kg-agent DB: start chat agent UI, e.g. just kg-agent test_db (wraps uv run --env-file .env uvicorn knowledge_graph.agent:app --host 127.0.0.1 --port 7932).

Repository Layout (anchors for navigation)

src/kaig: core library (DB, embeddings, LLM utilities, definitions, tests under src/kaig/tests).
docs/: assets and SurrealQL references.
examples/knowledge-graph: ingestion + chat example (FastAPI + data-flow executor + PydanticAI agent).
examples/demo-graph, examples/demo-simple, examples/demo-ingest-throttled, examples/notebooks: other workspace members; follow their local pyproject settings.
Justfile: authoritative task runner; prefer recipes over custom scripts when available.

Knowledge Graph Example Quickstart (from repo root)

Start DB: just knowledge-graph-db (or surreal start -u root -p root rocksdb:databases/knowledge-graph).
Run server + ingestion worker: DB_NAME=test_db just knowledge-graph test_db or the underlying uv command from README.
Run chat agent: DB_NAME=test_db just knowledge-graph-agent test_db.
SurrealQL helper queries live in examples/knowledge-graph/surql/ (schema + retrieval).
Flow-based ingestion uses stamp fields; updating handler code changes hashes, so watch the flow tables to gauge progress.
To inspect flow status, run the SurrealQL snippet from examples/knowledge-graph/README.md under a Surreal client.
Local tmp directories (examples/knowledge-graph/tmp, root tmp/) may hold artifacts; do not commit transient outputs.

Style and Implementation Guidelines

Imports: absolute only; no wildcards; group stdlib/third-party/local (ruff handles ordering).
Prefer from collections.abc import Iterable, Callable, ... over typing.List etc.
Typing: be explicit; avoid Any; use | None for optional; mark callables precisely; consider Protocols instead of duck-typing comments.
Public interfaces: annotate return types; keep function signatures narrow and clear.
Dataclasses vs Pydantic: favor Pydantic BaseModel for structured data exchange and validation; keep validators minimal and explicit.
Error handling: raise specific exceptions with actionable messages; avoid bare except; never silently pass; when useful, log via logfire. Re-raise with from to preserve context.
Logging/tracing: prefer structured logs; avoid print; ensure secrets are redacted.
Formatting: f-strings; double quotes; trailing commas on multiline collections/calls; keep lines at or under 80 chars.
Naming: snake_case for variables/functions; PascalCase for classes; UPPER_SNAKE for constants; module filenames snake_case.
Mutability: prefer immutable data where practical; copy before mutating shared inputs.
File paths: use pathlib.Path; avoid hard-coded relative string paths.
Timeouts/retries: set explicit timeouts for network/DB calls; avoid unbounded loops.
Async: when adding async code, use asyncio best practices; avoid blocking calls; close sessions/clients.
Database (SurrealDB): parameterize NS/DB/user/pass via env; never hard-code credentials; keep schema changes in .surql files.
Vector/LLM: prefer dependency injection for models/embedders to keep tests hermetic; mock external calls in tests.
CLI/servers: guard entrypoints with if __name__ == "__main__" when adding scripts; ensure uvicorn/fastapi commands reference module paths (not local file assumptions).
Documentation: update README snippets when changing APIs; keep docstrings concise and informative.
Comments: add only when behavior is non-obvious; keep them up to date.

Testing Guidance

Add tests under src/kaig/tests or relevant example packages; name files test_*.py.
Structure tests for determinism; seed randomness; isolate filesystem writes under tmp/ or pytest tmp_path.
Avoid networked dependencies; mock SurrealDB/LLM/embedding calls.
Use pytest markers sparingly; default to unit-style tests; no skipped tests unless necessary.
When debugging flows, prefer targeted tests with small fixtures rather than full DB spins.
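
A deterministic, network-free test along these lines might look like the sketch below. The Embedder protocol, FakeEmbedder, and embed_all are assumed names for illustration; the real kaig embedding interface may differ.

```python
import random
from collections.abc import Sequence
from typing import Protocol


class Embedder(Protocol):
    """Anything that turns text into a vector (hypothetical interface)."""

    def embed(self, text: str) -> Sequence[float]: ...


class FakeEmbedder:
    """Deterministic stand-in: same seed and text give the same vector."""

    def __init__(self, dim: int = 4, seed: int = 0) -> None:
        self._dim = dim
        self._seed = seed

    def embed(self, text: str) -> list[float]:
        # A string seed keeps results stable across processes and runs.
        rng = random.Random(f"{self._seed}:{text}")
        return [rng.random() for _ in range(self._dim)]


def embed_all(
    texts: Sequence[str],
    embedder: Embedder,
) -> list[list[float]]:
    """Embed each text with whatever embedder the caller injects."""
    return [list(embedder.embed(text)) for text in texts]


def test_embed_all_is_deterministic() -> None:
    first = embed_all(["a", "b"], FakeEmbedder(seed=1))
    second = embed_all(["a", "b"], FakeEmbedder(seed=1))
    assert first == second
    assert len(first) == 2
    assert len(first[0]) == 4
```

Because embed_all receives its embedder as an argument, the test swaps in the fake without patching and never calls an external API.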

Git and Workflow Notes

Do not rewrite existing user changes. Avoid destructive git commands (no hard reset/checkout).
Only commit when explicitly asked. Keep branches clean; respect repo commit style by inspecting git log if needed.
Secrets: never commit .env, credentials, tokens, or SurrealDB passwords. Use env vars and .env locally only.

Observability and Telemetry

logfire is available (via dev deps). If adding logs, use structured logging and keep PII out.
Prefer surfacing operational metrics through structured events rather than ad-hoc prints.
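
As a rough sketch of structured, redacted events, the snippet below uses the standard library logging module as a stand-in so that it stays dependency-free; in this repository the same idea would go through logfire. SENSITIVE_KEYS, redact, and log_event are hypothetical helpers.

```python
import json
import logging
from collections.abc import Mapping

# Assumed list of field names to mask; extend it for your own payloads.
SENSITIVE_KEYS = frozenset({"password", "token", "api_key", "authorization"})


def redact(fields: Mapping[str, object]) -> dict[str, object]:
    """Return a copy of fields with sensitive values masked."""
    return {
        key: "***" if key.lower() in SENSITIVE_KEYS else value
        for key, value in fields.items()
    }


def log_event(logger: logging.Logger, event: str, **fields: object) -> str:
    """Emit one JSON line per event and return it for inspection."""
    payload = json.dumps(
        {"event": event, **redact(fields)},
        sort_keys=True,
        default=str,
    )
    logger.info(payload)
    return payload
```

For example, log_event(logging.getLogger("ingest"), "chunked", rows=3, token="secret") emits a JSON line in which token is masked and rows is kept as a number.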

SurrealQL Artifacts

Surreal queries live in docs/surql-intro.surql and examples/knowledge-graph/surql/*. Keep queries in .surql files and load them via helpers (DB.execute / async_execute).
When modifying schema or retrieval queries, keep them idempotent and documented in README sections.
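
Keeping queries in .surql files and passing their text to an execute helper could be sketched as follows. load_surql and run_surql are hypothetical, and the execute parameter is a stand-in for whichever helper (DB.execute or similar) the caller has; its real signature is not assumed here.

```python
from collections.abc import Callable
from pathlib import Path


def load_surql(directory: Path, name: str) -> str:
    """Read <directory>/<name>.surql, failing with a clear message."""
    path = directory / f"{name}.surql"
    if not path.is_file():
        raise FileNotFoundError(f"No SurrealQL file at {path}")
    return path.read_text(encoding="utf-8")


def run_surql(
    directory: Path,
    name: str,
    execute: Callable[[str], object],
) -> object:
    """Load a query by name and hand its text to the given executor."""
    return execute(load_surql(directory, name))
```

With this shape a test can pass a list's append method as execute and assert on the query text without starting SurrealDB.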

Configuration Defaults

Knowledge-graph server requires DB_NAME env; it defaults Surreal connection to ws://localhost:8000/rpc, user root, pass root, namespace kaig.
LLM defaults (knowledge-graph): provider openai, model gpt-5-mini-2025-08-07, temperature 1; embedder uses OpenAI text-embedding-3-small with vector type F32.
Supply API keys via env (e.g., OPENAI_API_KEY); do not hard-code credentials. Use .env locally and keep it untracked.
Keep Surreal schema definitions in .surql files; re-run init if tables/edges change.
Avoid clearing DB automatically unless you know it is safe (db.clear() is commented out by default).

Knowledge-Graph Flow Notes

Flow executor registers handlers via @exe.flow(...) and writes handler hashes into stamp fields to avoid re-processing.
Current flows: chunk operates on document rows lacking a flow_chunked stamp; infer_concepts operates on chunk rows lacking concepts_inferred.
Flow eligibility: stamp is NONE and dependencies satisfied; this makes runs restart-safe and incremental.
When updating a flow handler, expect hash changes; use the flow status SurrealQL snippet (README) to see processed vs pending records.
Background ingestion loop starts with FastAPI server startup (ingestion_loop task) and stops gracefully on shutdown.
Upload route (/upload) schedules ingestion via background task; keep handlers async-safe and best-effort idempotent.
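
The stamp rule described above can be illustrated with an in-memory sketch. It is not the executor's implementation: handler_hash, is_eligible, and run_flow are hypothetical, records are plain dicts with no SurrealDB rows behind them, and hashing the handler's bytecode is an assumed simplification of however the real hash is computed.

```python
import hashlib
from collections.abc import Iterable, Sequence
from types import FunctionType

Record = dict[str, object]


def handler_hash(handler: FunctionType) -> str:
    """Fingerprint a handler by its bytecode (a simplification)."""
    return hashlib.sha256(handler.__code__.co_code).hexdigest()[:16]


def is_eligible(
    record: Record,
    stamp_field: str,
    requires: Sequence[str] = (),
) -> bool:
    """Eligible when unstamped and every required stamp is present."""
    if record.get(stamp_field) is not None:
        return False
    return all(record.get(name) is not None for name in requires)


def run_flow(
    records: Iterable[Record],
    stamp_field: str,
    handler: FunctionType,
    requires: Sequence[str] = (),
) -> int:
    """Process eligible records once and return how many were handled."""
    stamp = handler_hash(handler)
    processed = 0
    for record in records:
        if not is_eligible(record, stamp_field, requires):
            continue
        handler(record)
        # Stamp only after the handler succeeds, so a crash is retried.
        record[stamp_field] = stamp
        processed += 1
    return processed
```

Running run_flow twice over the same records handles each record only once, which is what makes a restart safe: already stamped rows are skipped.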

Data and Assets

Images and docs live under docs/assets/; keep large binaries out of git unless essential.
Temp artifacts: tmp/, examples/knowledge-graph/tmp, databases/, uploads/ should stay untracked; clean them before sharing branches.
.surql files are canonical for DB schema and queries; do not duplicate queries inline unless necessary for tests.

Packaging and Imports

Root pyproject.toml defines workspace members; use uv workspace semantics rather than ad-hoc editable installs (pip install -e .).
Dev dependencies live under [dependency-groups].dev; run uv sync --all-packages to pull workspace extras.
Example packages (e.g., knowledge-graph) depend on the root package via workspace source; avoid circular imports across packages.

Review Checklist

Lint + format + type-check before sharing changes: run just format then just lint or the underlying uv commands.
Confirm tests relevant to your changes pass (single file/test commands above).
Validate knowledge-graph flows still stamp correctly after handler edits (check flow status query if unsure).
Ensure new endpoints/CLI entrypoints include parameter validation and do not hard-code secrets or ports.
Update documentation (README/AGENTS/SurrealQL files) when adjusting APIs, env vars, or schema.

Safety and Cleanup

Clean temp outputs before PRs; ignore transient folders (tmp/, uploads/, databases/).
If running servers, stop them after tests to avoid port conflicts (8080 for FastAPI, 7932 for agent UI by default).
When adding new commands, prefer just recipes or document the uv invocation clearly.
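
To check whether a server is still holding one of the default ports, a small probe such as the one below may help. port_in_use and DEFAULT_PORTS are hypothetical helpers; the port numbers are the defaults named above.

```python
import socket

# Defaults taken from this document; adjust if you pass other ports.
DEFAULT_PORTS = {"fastapi": 8080, "agent-ui": 7932}


def port_in_use(
    port: int,
    host: str = "127.0.0.1",
    timeout: float = 0.5,
) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value otherwise.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    for name, port in DEFAULT_PORTS.items():
        state = "in use" if port_in_use(port) else "free"
        print(f"{name} ({port}): {state}")
```

The explicit timeout keeps the probe bounded, in line with the timeout rule in the style guidelines.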

Quick Reference (copy/paste)

Install deps: uv sync --all-packages
Format imports: uv run ruff check --fix --select I
Format code: uv run ruff format
Lint: uv run ruff check
Type check (all): uv run basedpyright src/ examples/ --level error
Type check (pkg): uv run basedpyright -p examples/knowledge-graph
Tests (all): uv run pytest
Tests (file): uv run pytest src/kaig/tests/test_db.py
Tests (single): uv run pytest src/kaig/tests/test_db.py -k test_name
Start KG DB: just kg-db
Run KG server: DB_NAME=test_db just kg test_db
Run KG agent: DB_NAME=test_db just kg-agent test_db
