Description
🚀 Describe the new functionality needed
Related discussion: #3926
Overview
This document proposes a multi-phase evolution of how a Llama Stack instance manages MCP connectors and exposes them to agents and models via the Responses API.
- Phase 1 introduces a static configuration model, where connectors are manually defined.
- Phase 2 introduces a dynamic registry-based model, enabling discovery of connectors from one or more MCP registries while preserving backward compatibility with static entries.
- Phase 3 introduces a dedicated API for CRUD operations on connectors as well as on MCP registries.
Client side usage
Client side usage would be similar to how it is documented in https://platform.openai.com/docs/guides/tools-connectors-mcp#quickstart
from openai import OpenAI

# Initialize the OpenAI client, pointing it at the Llama Stack server
client = OpenAI(base_url="http://localhost:8321/v1", api_key="fake")

resp = client.responses.create(
    model="Qwen/Qwen3-32B",
    tools=[
        {
            "type": "mcp",
            "server_label": "Dropbox",
            "connector_id": "<connector_id>",
            "authorization": "<oauth access token>",
            "require_approval": "never",
        },
    ],
    input="Summarize the Q2 earnings report.",
)
print(resp.output_text)

Static Connector Configuration
In this mode, the Llama Stack instance administrator must manually define the allowed MCP connectors in run.yaml.
EDIT: this section has been split into its own issue #4186
Example config
apis: ...
providers:
  agents:
  - provider_id: meta-reference
    provider_type: inline::meta-reference
    config:
      persistence:
        agent_state:
          namespace: agents
          backend: kv_default
        responses:
          table_name: responses
          backend: sql_default
          max_write_queue_size: 10000
          num_writers: 4
      connectors:
      - connector_id: connector_github
        url: https://api.github.com/mcp
      - connector_id: connector_gitlab
        url: https://gitlab.com/mcp
      - connector_id: connector_slack
        url: https://slack.com/mcp
...

Operational notes
- Any new connector addition requires updating run.yaml and restarting the server.
- The server loads these connectors into memory at startup and exposes them to agents.
- This configuration model is suitable for small or tightly controlled deployments where server persistence is not required.
- If users cannot view the server configuration, connector names must be communicated to them out of band (for example, through documentation).
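The startup behaviour described in these notes can be sketched roughly as follows. This is only an illustration under stated assumptions, not the Llama Stack implementation: `load_connectors` and `resolve_connector` are hypothetical names, the config is an already-parsed list standing in for the `connectors` section of run.yaml, and rejecting duplicate IDs is an assumed policy.

```python
# Hypothetical sketch; none of these names exist in Llama Stack.
from typing import Dict, List


def load_connectors(entries: List[dict]) -> Dict[str, str]:
    """Build an in-memory connector_id -> URL map from parsed config entries."""
    connectors: Dict[str, str] = {}
    for entry in entries:
        connector_id = entry["connector_id"]
        # Assumed policy: a duplicate ID is a configuration error.
        if connector_id in connectors:
            raise ValueError(f"duplicate connector_id: {connector_id}")
        connectors[connector_id] = entry["url"]
    return connectors


def resolve_connector(connectors: Dict[str, str], connector_id: str) -> str:
    """Return the MCP server URL for a connector_id, or raise if it is unknown."""
    try:
        return connectors[connector_id]
    except KeyError:
        raise KeyError(f"unknown connector_id: {connector_id}") from None


# Stand-in for the parsed `connectors` section of run.yaml.
CONFIG = [
    {"connector_id": "connector_github", "url": "https://api.github.com/mcp"},
    {"connector_id": "connector_gitlab", "url": "https://gitlab.com/mcp"},
]

# Loaded once at startup; changing CONFIG means restarting the server.
CONNECTORS = load_connectors(CONFIG)
```

With such a map in place, a request carrying `"connector_id": "connector_github"` would be resolved to its URL on the server side, so the client never needs to know the server address.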
Dynamic Registry Integration
In dynamic mode, the administrator may configure one or more MCP registries.
The server will query these registries at runtime to discover and construct connector strings automatically.
Each registry must follow the official Anthropic MCP Registry API definition as described at:
Official MCP Registry Reference
Example configuration
apis: ...
providers:
  agents:
  - provider_id: meta-reference
    provider_type: inline::meta-reference
    config:
      persistence:
        agent_state:
          namespace: agents
          backend: kv_default
        responses:
          table_name: responses
          backend: sql_default
          max_write_queue_size: 10000
          num_writers: 4
      connectors:
      - name: connector_github
        url: https://api.github.com/mcp
      mcp_registries:
      - name: registry_internal
        id: internal
        url: https://registry.internal.dev/
      - name: registry_public
        id: public
        url: https://registry.public.dev/
...

Each dynamically discovered server produces a unique connector_id in the form mcp::<registry_id>::<server_name>.
This convention ensures uniqueness across registries and provides users with a stable, self-describing identifier. Administrators must ensure all registry_id values are unique within the run.yaml.
Operational notes
- Any new registry addition requires updating run.yaml and restarting the server.
- Users should be told about this pattern (via documentation) so they can construct the connector_id themselves from the configured registry IDs and the server names listed in each registry.
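The `mcp::<registry_id>::<server_name>` convention and the uniqueness requirement on registry IDs could be expressed along these lines. This is a sketch under assumptions: all function names are hypothetical, the server name used in the usage note is made up, and the rule that a registry ID may not itself contain `::` is an assumption added here to keep parsing unambiguous.

```python
# Hypothetical helpers; not part of Llama Stack or the MCP Registry API.
from typing import Iterable, Tuple

PREFIX = "mcp"
SEP = "::"


def check_unique_registry_ids(registry_ids: Iterable[str]) -> None:
    """Raise if the same registry ID is configured more than once."""
    seen = set()
    for registry_id in registry_ids:
        if registry_id in seen:
            raise ValueError(f"duplicate registry id: {registry_id}")
        seen.add(registry_id)


def build_connector_id(registry_id: str, server_name: str) -> str:
    """Compose mcp::<registry_id>::<server_name>."""
    if not registry_id or not server_name:
        raise ValueError("registry_id and server_name must be non-empty")
    # Assumed rule: the separator may not appear inside a registry ID.
    if SEP in registry_id:
        raise ValueError(f"registry id must not contain {SEP!r}")
    return f"{PREFIX}{SEP}{registry_id}{SEP}{server_name}"


def parse_connector_id(connector_id: str) -> Tuple[str, str]:
    """Split a connector_id back into (registry_id, server_name)."""
    prefix, sep, rest = connector_id.partition(SEP)
    if prefix != PREFIX or not sep:
        raise ValueError(f"not a registry connector id: {connector_id}")
    registry_id, sep, server_name = rest.partition(SEP)
    if not sep or not registry_id or not server_name:
        raise ValueError(f"malformed connector id: {connector_id}")
    return registry_id, server_name
```

For example, `build_connector_id("internal", "io.github.example/server")` yields `mcp::internal::io.github.example/server`, and `parse_connector_id` reverses it, which is the same lookup a user would do by hand from the documentation.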
Connector/Registry Management API
Allow runtime inspection and modification (CRUD) of connectors and registries via a new API and provider. The full spec is still to be fleshed out, but it would retain MCP registries as a first-class concept and be built around them.
- Allow registration/removal/update of connectors without server restart
- Allow registration/removal/update of MCP registries without server restart
- Allow listing of constructed connector_ids
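Since the spec is not yet defined, the following is only a rough sketch of the kind of in-memory state such an API might manage for connectors. Every name here (`Connector`, `ConnectorStore`, and its methods) is hypothetical, and the error-handling choices are assumptions rather than proposed behaviour.

```python
# Hypothetical sketch; the real API and provider are still to be specified.
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Connector:
    connector_id: str
    url: str


class ConnectorStore:
    """In-memory connector table that can change without a server restart."""

    def __init__(self) -> None:
        self._connectors: Dict[str, Connector] = {}

    def register(self, connector_id: str, url: str) -> Connector:
        # Assumed policy: registering an existing ID is an error.
        if connector_id in self._connectors:
            raise ValueError(f"connector already exists: {connector_id}")
        connector = Connector(connector_id, url)
        self._connectors[connector_id] = connector
        return connector

    def update(self, connector_id: str, url: str) -> Connector:
        if connector_id not in self._connectors:
            raise KeyError(f"unknown connector: {connector_id}")
        self._connectors[connector_id].url = url
        return self._connectors[connector_id]

    def remove(self, connector_id: str) -> None:
        if connector_id not in self._connectors:
            raise KeyError(f"unknown connector: {connector_id}")
        del self._connectors[connector_id]

    def list_ids(self) -> List[str]:
        """Return all known connector IDs in sorted order."""
        return sorted(self._connectors)
```

A real provider would presumably persist this state and apply the same operations to MCP registries; both are left out of the sketch.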
💡 Why is this needed? What if we don't build it?
Background and motivation
Currently, Llama Stack's OpenAI Responses implementation does not cover connectors. Users therefore have to know each remote MCP server's URL in order to use it in their queries. Abstracting that information away from users has several benefits: applications do not need updating when a server URL or transport protocol changes, and the user plane can be separated from the platform where the server itself may be running (potentially useful in enterprise settings).
This feature would allow Llama Stack to cater to an MCP-as-a-service paradigm that is likely around the corner. If we don't build it, Llama Stack could be less favorably positioned in the market as such use cases become more common. Anthropic's MCP Registry API is well positioned to become the industry standard, and Llama Stack should give its users the ability to integrate with it.
Other thoughts
Alternative approaches:
Lead with API design if community need for it is more urgent
Issue tracking
Based on discussions, this work has been broken down into the following sub-issues for tracking purposes: