Merged
4 changes: 2 additions & 2 deletions .github/workflows/ci.yml
@@ -426,7 +426,7 @@ jobs:
       ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
     run: |
       source venv/bin/activate
-      cd demos/shared/test_runner && sh run_demo_tests.sh use_cases/preference_based_routing
+      cd demos/shared/test_runner && sh run_demo_tests.sh llm_routing/preference_based_routing

# ──────────────────────────────────────────────
# E2E: demo — currency conversion
@@ -476,4 +476,4 @@ jobs:
       GROQ_API_KEY: ${{ secrets.GROQ_API_KEY }}
     run: |
       source venv/bin/activate
-      cd demos/shared/test_runner && sh run_demo_tests.sh samples_python/currency_exchange
+      cd demos/shared/test_runner && sh run_demo_tests.sh advanced/currency_exchange
4 changes: 2 additions & 2 deletions README.md
@@ -45,7 +45,7 @@ Plano pulls rote plumbing out of your framework so you can stay focused on what
 
 Plano handles **orchestration, model management, and observability** as modular building blocks - letting you configure only what you need (edge proxying for agentic orchestration and guardrails, or LLM routing from your services, or both together) to fit cleanly into existing architectures. Below is a simple multi-agent travel agent built with Plano that showcases all three core capabilities.
 
-> 📁 **Full working code:** See [`demos/use_cases/travel_agents/`](demos/use_cases/travel_agents/) for complete weather and flight agents you can run locally.
+> 📁 **Full working code:** See [`demos/agent_orchestration/travel_agents/`](demos/agent_orchestration/travel_agents/) for complete weather and flight agents you can run locally.



@@ -113,7 +113,7 @@ async def chat(request: Request):
     days = 7
 
     # Your agent logic: fetch data, call APIs, run tools
-    # See demos/use_cases/travel_agents/ for the full implementation
+    # See demos/agent_orchestration/travel_agents/ for the full implementation
     weather_data = await get_weather_data(request, messages, days)
 
     # Stream the response back through Plano
16 changes: 7 additions & 9 deletions cli/planoai/templates/coding_agent_routing.yaml
@@ -1,13 +1,6 @@
-version: v0.1
+version: v0.3.0
 
-listeners:
-  egress_traffic:
-    address: 0.0.0.0
-    port: 12000
-    message_format: openai
-    timeout: 30s
-
-llm_providers:
+model_providers:
   # OpenAI Models
   - model: openai/gpt-5-2025-08-07
     access_key: $OPENAI_API_KEY
@@ -39,5 +32,10 @@ model_aliases:
   arch.claude.code.small.fast:
     target: claude-haiku-4-5
 
+listeners:
+  - type: model
+    name: model_listener
+    port: 12000
+
 tracing:
   random_sampling: 100
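Taken together, the template migration above drops the v0.1 top-level `listeners` mapping and expresses the same gateway as a `model_providers` list plus a `listeners` list. A minimal sketch of the resulting v0.3.0 template shape, using only field names that appear in this diff (the full schema may carry additional keys):

```
version: v0.3.0

model_providers:
  # OpenAI Models
  - model: openai/gpt-5-2025-08-07
    access_key: $OPENAI_API_KEY

listeners:
  - type: model
    name: model_listener
    port: 12000

tracing:
  random_sampling: 100
```

Note that the listener no longer carries `address`, `message_format`, or `timeout`; per the `convert_legacy_listeners` change later in this PR, unset listener fields appear to be filled from defaults when the Envoy template is rendered.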
31 changes: 21 additions & 10 deletions cli/planoai/templates/conversational_state_v1_responses.yaml
@@ -1,25 +1,36 @@
-version: v0.1
+version: v0.3.0
 
-listeners:
-  egress_traffic:
-    address: 0.0.0.0
-    port: 12000
-    message_format: openai
-    timeout: 30s
-
-llm_providers:
+agents:
+  - id: assistant
+    url: http://localhost:10510
+
+model_providers:
   # OpenAI Models
   - model: openai/gpt-5-mini-2025-08-07
     access_key: $OPENAI_API_KEY
     default: true
 
-# Anthropic Models
+  # Anthropic Models
   - model: anthropic/claude-sonnet-4-20250514
     access_key: $ANTHROPIC_API_KEY
 
+listeners:
+  - type: agent
+    name: conversation_service
+    port: 8001
+    router: plano_orchestrator_v1
+    agents:
+      - id: assistant
+        description: |
+          A conversational assistant that maintains context across multi-turn
+          conversations. It can answer follow-up questions, remember previous
+          context, and provide coherent responses in ongoing dialogues.
+
+# State storage configuration for v1/responses API
+# Manages conversation state for multi-turn conversations
+state_storage:
+  # Type: memory | postgres
+  type: memory
 
 tracing:
   random_sampling: 100
16 changes: 7 additions & 9 deletions cli/planoai/templates/preference_aware_routing.yaml
@@ -1,13 +1,6 @@
-version: v0.1.0
+version: v0.3.0
 
-listeners:
-  egress_traffic:
-    address: 0.0.0.0
-    port: 12000
-    message_format: openai
-    timeout: 30s
-
-llm_providers:
+model_providers:
 
   - model: openai/gpt-4o-mini
     access_key: $OPENAI_API_KEY
@@ -25,5 +18,10 @@ llm_providers:
   - name: code generation
     description: generating new code snippets, functions, or boilerplate based on user prompts or requirements
 
+listeners:
+  - type: model
+    name: model_listener
+    port: 12000
+
 tracing:
   random_sampling: 100
5 changes: 4 additions & 1 deletion cli/planoai/utils.py
@@ -154,7 +154,10 @@ def convert_legacy_listeners(
             )
             listener["model_providers"] = model_providers or []
             model_provider_set = True
-            llm_gateway_listener = listener
+            # Merge user listener values into defaults for the Envoy template
+            llm_gateway_listener = {**llm_gateway_listener, **listener}
+        elif listener.get("type") == "prompt":
+            prompt_gateway_listener = {**prompt_gateway_listener, **listener}
     if not model_provider_set:
         listeners.append(llm_gateway_listener)
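The new merge lines rely on dict-unpacking precedence: in `{**defaults, **overrides}`, keys from the right-hand dict win, so user-supplied listener fields override the defaults while unspecified defaults survive into the rendered template. A standalone sketch of that behavior, with illustrative values rather than the repo's actual defaults:

```python
# Default listener values (illustrative, not the repo's actual defaults).
default_listener = {"address": "0.0.0.0", "port": 12000, "timeout": "30s"}

# A user-supplied listener entry as it might appear in plano_config.yaml.
user_listener = {"port": 8001, "name": "conversation_service"}

# Right-hand keys win: port is overridden, address/timeout are preserved,
# and new keys (name) are added.
merged = {**default_listener, **user_listener}
print(merged)
# → {'address': '0.0.0.0', 'port': 8001, 'timeout': '30s', 'name': 'conversation_service'}
```

This is why the old `llm_gateway_listener = listener` was a bug worth fixing: plain assignment discarded the defaults entirely instead of layering the user's values on top of them.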
2 changes: 1 addition & 1 deletion cli/test/test_init.py
@@ -26,7 +26,7 @@ def test_init_template_builtin_writes_config(tmp_path, monkeypatch):
     config_path = tmp_path / "config.yaml"
     assert config_path.exists()
     config_text = config_path.read_text(encoding="utf-8")
-    assert "llm_providers:" in config_text
+    assert "model_providers:" in config_text
 
 
 def test_init_refuses_overwrite_without_force(tmp_path, monkeypatch):
2 changes: 1 addition & 1 deletion config/docker-compose.dev.yaml
@@ -8,7 +8,7 @@ services:
       - "12000:12000"
       - "19901:9901"
     volumes:
-      - ${PLANO_CONFIG_FILE:-../demos/samples_python/weather_forecast/plano_config.yaml}:/app/plano_config.yaml
+      - ${PLANO_CONFIG_FILE:-../demos/getting_started/weather_forecast/plano_config.yaml}:/app/plano_config.yaml
      - /etc/ssl/cert.pem:/etc/ssl/cert.pem
      - ./envoy.template.yaml:/app/envoy.template.yaml
      - ./plano_config_schema.yaml:/app/plano_config_schema.yaml
@@ -1,13 +1,11 @@
-version: v0.1.0
+version: v0.3.0
 
 listeners:
-  ingress_traffic:
-    address: 0.0.0.0
+  - type: prompt
+    name: prompt_listener
     port: 10000
-    message_format: openai
-    timeout: 30s
 
-llm_providers:
+model_providers:
 - model: openai/gpt-4o-mini
   access_key: $OPENAI_API_KEY
   default: true
25 changes: 25 additions & 0 deletions demos/advanced/currency_exchange/docker-compose.yaml
@@ -0,0 +1,25 @@
+services:
+  anythingllm:
+    image: mintplexlabs/anythingllm
+    restart: always
+    ports:
+      - "3001:3001"
+    cap_add:
+      - SYS_ADMIN
+    environment:
+      - STORAGE_DIR=/app/server/storage
+      - LLM_PROVIDER=generic-openai
+      - GENERIC_OPEN_AI_BASE_PATH=http://host.docker.internal:10000/v1
+      - GENERIC_OPEN_AI_MODEL_PREF=gpt-4o-mini
+      - GENERIC_OPEN_AI_MODEL_TOKEN_LIMIT=128000
+      - GENERIC_OPEN_AI_API_KEY=sk-placeholder
+    extra_hosts:
+      - "host.docker.internal:host-gateway"
+
+  jaeger:
+    build:
+      context: ../../shared/jaeger
+    ports:
+      - "16686:16686"
+      - "4317:4317"
+      - "4318:4318"
@@ -1,13 +1,11 @@
-version: v0.1.0
+version: v0.3.0
 
 listeners:
-  egress_traffic:
-    address: 0.0.0.0
+  - type: model
+    name: model_listener
     port: 12000
-    message_format: openai
-    timeout: 30s
 
-llm_providers:
+model_providers:
 - model: openai/gpt-4o-mini
   access_key: $OPENAI_API_KEY
   default: true
@@ -20,3 +18,6 @@ model_aliases:
     target: gpt-4o-mini
   arch.reason.v1:
     target: o3
+
+tracing:
+  random_sampling: 100
@@ -1,18 +1,16 @@
-version: v0.1.0
+version: v0.3.0
 
 listeners:
-  ingress_traffic:
-    address: 0.0.0.0
+  - type: prompt
+    name: prompt_listener
     port: 10000
-    message_format: openai
-    timeout: 30s
 
 endpoints:
   rag_energy_source_agent:
     endpoint: host.docker.internal:18083
     connect_timeout: 0.005s
 
-llm_providers:
+model_providers:
 - access_key: $OPENAI_API_KEY
   model: openai/gpt-4o-mini
   default: true
28 changes: 28 additions & 0 deletions demos/advanced/multi_turn_rag/docker-compose.yaml
@@ -0,0 +1,28 @@
+services:
+  rag_energy_source_agent:
+    build:
+      context: .
+      dockerfile: Dockerfile
+    ports:
+      - "18083:80"
+    healthcheck:
+      test: ["CMD", "curl", "http://localhost:80/healthz"]
+      interval: 5s
+      retries: 20
+
+  anythingllm:
+    image: mintplexlabs/anythingllm
+    restart: always
+    ports:
+      - "3001:3001"
+    cap_add:
+      - SYS_ADMIN
+    environment:
+      - STORAGE_DIR=/app/server/storage
+      - LLM_PROVIDER=generic-openai
+      - GENERIC_OPEN_AI_BASE_PATH=http://host.docker.internal:10000/v1
+      - GENERIC_OPEN_AI_MODEL_PREF=gpt-4o-mini
+      - GENERIC_OPEN_AI_MODEL_TOKEN_LIMIT=128000
+      - GENERIC_OPEN_AI_API_KEY=sk-placeholder
+    extra_hosts:
+      - "host.docker.internal:host-gateway"
@@ -1,13 +1,11 @@
-version: v0.1.0
+version: v0.3.0
 
 listeners:
-  ingress_traffic:
-    address: 0.0.0.0
+  - type: prompt
+    name: prompt_listener
     port: 10000
-    message_format: openai
-    timeout: 30s
 
-llm_providers:
+model_providers:
 - access_key: $OPENAI_API_KEY
   model: openai/gpt-4o
 
25 changes: 25 additions & 0 deletions demos/advanced/stock_quote/docker-compose.yaml
@@ -0,0 +1,25 @@
+services:
+  anythingllm:
+    image: mintplexlabs/anythingllm
+    restart: always
+    ports:
+      - "3001:3001"
+    cap_add:
+      - SYS_ADMIN
+    environment:
+      - STORAGE_DIR=/app/server/storage
+      - LLM_PROVIDER=generic-openai
+      - GENERIC_OPEN_AI_BASE_PATH=http://host.docker.internal:10000/v1
+      - GENERIC_OPEN_AI_MODEL_PREF=gpt-4o-mini
+      - GENERIC_OPEN_AI_MODEL_TOKEN_LIMIT=128000
+      - GENERIC_OPEN_AI_API_KEY=sk-placeholder
+    extra_hosts:
+      - "host.docker.internal:host-gateway"
+
+  jaeger:
+    build:
+      context: ../../shared/jaeger
+    ports:
+      - "16686:16686"
+      - "4317:4317"
+      - "4318:4318"
@@ -37,7 +37,7 @@ Plano acts as a **framework-agnostic proxy and data plane** that:
 
 ```bash
 # From the demo directory
-cd demos/use_cases/multi_agent_with_crewai_langchain
+cd demos/agent_orchestration/multi_agent_crewai_langchain
 
 # Build and start all services
 docker-compose up -d
```
@@ -7,7 +7,7 @@ This demo shows how you can use Plano gateway to manage keys and route to upstre
 ```sh
 sh run_demo.sh
 ```
-1. Navigate to http://localhost:18080/
+1. Navigate to http://localhost:3001/
 
 Following screen shows an example of interaction with Plano gateway showing dynamic routing. You can select between different LLMs using "override model" option in the chat UI.
 
@@ -32,7 +32,7 @@ $ curl --header 'Content-Type: application/json' \
   "messages": {
     "role": "assistant",
     "tool_calls": null,
-    "content": "Hello! How can I assist you today? Let's chat about anything you'd like. 😊"
+    "content": "Hello! How can I assist you today? Let's chat about anything you'd like."
   },
   "finish_reason": "stop"
 }
@@ -47,11 +47,7 @@
 ```
 
 # Observability
-Plano gateway publishes stats endpoint at http://localhost:19901/stats. In this demo we are using prometheus to pull stats from Plano and we are using grafana to visualize the stats in dashboard. To see grafana dashboard follow instructions below,
-
-1. Navigate to http://localhost:3000/ to open grafana UI (use admin/grafana as credentials)
-1. From grafana left nav click on dashboards and select "Intelligent Gateway Overview" to view Plano gateway stats
-1. For tracing you can head over to http://localhost:16686/ to view recent traces.
+For tracing you can head over to http://localhost:16686/ to view recent traces.
 
 Following is a screenshot of tracing UI showing call received by Plano gateway and making upstream call to LLM,
@@ -37,13 +37,3 @@ services:
       - "16686:16686"
       - "4317:4317"
       - "4318:4318"
-
-  prometheus:
-    build:
-      context: ../../shared/prometheus
-
-  grafana:
-    build:
-      context: ../../shared/grafana
-    ports:
-      - "3000:3000"