Ainfera Routing

The Inference of AI Agents. Agent-native inference routing — constrained optimization across 50+ models, not a billing gateway.

Ainfera Routing is the control plane for production agent workloads: pick the highest-quality model that satisfies hard budget and latency limits, dispatch, capture outcomes, and compound routing intelligence from real traffic.

Why this exists

Gateways like LiteLLM, OpenRouter, and Portkey optimize single calls for human teams. Ainfera Routing targets autonomous agents: hard caps, workflow-aware traces, identity-bound receipts, and an outcome loop that only compounds when traffic flows through Ainfera.

Quick links

Resource	URL
Methodology (Notion)	Ainfera Routing — Methodology v1.0
API	api.ainfera.ai
Python SDK	ainfera-ai/sdk
Specs	ainfera-ai/specs
MCP	mcp.ainfera.ai

Starter routing templates

Copy a template into your tenant routing policy or agent config:

Template	Goal	File
Latency-first	Minimize p95 under quality floor	`templates/latency-first.yaml`
Cost-first	Minimize cost under quality floor	`templates/cost-first.yaml`
Compliance-first	Strict veto + no fallback	`templates/compliance-first.yaml`

Policy shape is documented in schema/routing-policy.schema.json.

Minimal curl

export AINFERA_API_KEY="ai_infera_..."

curl -sS https://api.ainfera.ai/v1/inference \
  -H "Authorization: Bearer $AINFERA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "ainfera-inference",
    "messages": [{"role": "user", "content": "Route this agent call"}],
    "max_tokens": 256
  }'

Use model: "ainfera-inference" to let Ainfera route across the live catalog. (ainfera-mithril / ainfera-auto are legacy aliases and still resolve identically.) See docs for Agent Cards, workflows, and audit verification.

Production E2E smoke

Full L1–L5 verification against live api.ainfera.ai (~3 min):

git clone https://github.com/ainfera-ai/routing
cd routing
./scripts/ainfera-e2e.sh          # human-readable
MODE=json ./scripts/ainfera-e2e.sh   # agent-parseable

Spec: scripts/AINFERA-E2E.md

Outcome capture (methodology §16): docs/outcome-capture.md

Repo map (ainfera-ai org)

Repo	Role
routing (this repo)	Public methodology, templates, onboarding
api	L2 routing runtime + `/v1/inference`
sdk	Python client
specs	ATS, AAMC, API contracts (CC-BY 4.0)
ainfera-mcp	MCP reference adapter
Framework adapters	`ainfera-mcp`, `ainfera-hermes`, `ainfera-openclaw`, `ainfera-letta`, `ainfera-langgraph`, `ainfera-langchain`, `ainfera-llamaindex`, `ainfera-crewai`, `ainfera-google-adk`

License

Apache 2.0 — see LICENSE.

Name		Name	Last commit message	Last commit date
Latest commit History 19 Commits
.github/workflows		.github/workflows
ainfera_routing		ainfera_routing
docs		docs
schema		schema
scripts		scripts
templates		templates
tests		tests
.gitignore		.gitignore
AGENTS.md		AGENTS.md
LICENSE		LICENSE
README.md		README.md
STRATEGY.md		STRATEGY.md
pyproject.toml		pyproject.toml
uv.lock		uv.lock

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Ainfera Routing

Why this exists

Quick links

Starter routing templates

Minimal curl

Production E2E smoke

Repo map (ainfera-ai org)

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Ainfera Routing

Why this exists

Quick links

Starter routing templates

Minimal curl

Production E2E smoke

Repo map (ainfera-ai org)

License

About

Resources

License

Code of conduct

Contributing

Security policy

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages