Feature Description
Add streaming support to MCP, enabling servers and clients to send and receive partial, incremental responses instead of waiting for full message completion.
Problem Statement
The current MCP interaction model is strictly request–response and requires full payload completion before delivery. This prevents low-latency feedback, progressive rendering, real-time tool output, and efficient handling of long-running model executions. It limits MCP’s usability for interactive applications and makes it less competitive with existing LLM APIs that support token or chunk streaming.
Proposed Solution
Extend MCP to support streaming semantics at both the transport level and the message-schema level. This includes:
- A standardized way to emit partial responses (chunks/events) for model outputs.
- Stream lifecycle signaling (start, delta, end, error).
- Backward-compatible handling for non-streaming clients.
- Clear guarantees around ordering, termination, and cancellation.
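To make the lifecycle and ordering points concrete, here is a minimal sketch of how a receiver might validate a stream. It is illustrative only: the `StreamEvent` fields (`stream_id`, `seq`, `kind`, `data`), the four event kinds, and the `StreamAssembler` class are hypothetical names chosen for this example, not part of any existing MCP schema. The sketch assumes each stream carries a per-stream sequence number starting at 0, begins with exactly one `start`, and terminates with exactly one `end` or `error`.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical event kinds mirroring the lifecycle named in the proposal.
START, DELTA, END, ERROR = "start", "delta", "end", "error"


@dataclass
class StreamEvent:
    stream_id: str   # ties the event to one in-flight request (assumed field)
    seq: int         # per-stream sequence number, starting at 0 (assumed field)
    kind: str        # one of START, DELTA, END, ERROR
    data: str = ""   # chunk text for DELTA, message for ERROR


@dataclass
class StreamAssembler:
    """Receiver-side check of ordering and termination for one stream."""
    stream_id: str
    next_seq: int = 0
    chunks: List[str] = field(default_factory=list)
    started: bool = False
    closed: bool = False
    error: Optional[str] = None

    def feed(self, ev: StreamEvent) -> None:
        if self.closed:
            raise ValueError("event received after stream termination")
        if ev.stream_id != self.stream_id:
            raise ValueError("event belongs to a different stream")
        if ev.seq != self.next_seq:
            raise ValueError(
                f"out-of-order event: expected seq {self.next_seq}, got {ev.seq}"
            )
        if ev.kind == START:
            if self.started:
                raise ValueError("duplicate start event")
            self.started = True
        elif not self.started:
            raise ValueError("stream must begin with a start event")
        elif ev.kind == DELTA:
            self.chunks.append(ev.data)
        elif ev.kind == END:
            self.closed = True
        elif ev.kind == ERROR:
            self.error = ev.data
            self.closed = True
        else:
            raise ValueError(f"unknown event kind: {ev.kind!r}")
        self.next_seq += 1

    def result(self) -> str:
        """Full payload, i.e. what a non-streaming client would receive."""
        if not self.closed:
            raise ValueError("stream has not terminated")
        if self.error is not None:
            raise ValueError(f"stream failed: {self.error}")
        return "".join(self.chunks)
```

One possible route to backward compatibility under these assumptions: a server or intermediary buffers the stream with the same logic and hands non-streaming clients the value of `result()` as a single ordinary response.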
Use Case
- Interactive chat UIs that render tokens as they are generated.
- Long-running tool calls that progressively emit logs or intermediate results.
- Agent frameworks that need early partial outputs to drive downstream actions.
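The agent-framework case can be illustrated with a small sketch in which a consumer acts on partial output and cancels the rest of the stream. Everything here is a stand-in: `token_stream` and `consume_until` are hypothetical helpers, and a plain Python generator plays the role of a streamed MCP response; a real transport would need its own cancellation message.

```python
from typing import Generator, Iterable, List


def token_stream(tokens: Iterable[str], log: List[str]) -> Generator[str, None, None]:
    """Hypothetical producer: yields chunks one at a time.

    The finally block runs whether the stream is exhausted or closed early,
    standing in for server-side cleanup on cancellation.
    """
    try:
        for token in tokens:
            yield token
    finally:
        log.append("producer stopped")


def consume_until(stream: Generator[str, None, None], marker: str) -> str:
    """Read chunks until the accumulated text contains marker, then cancel."""
    seen: List[str] = []
    for chunk in stream:
        seen.append(chunk)
        if marker in "".join(seen):
            stream.close()  # early cancellation: remaining chunks are never produced
            break
    return "".join(seen)
```

With a fully buffered response, the consumer could not make this decision until the last chunk had been generated; with streaming, it can stop as soon as the partial output is sufficient.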
Additional Context
Most modern LLM APIs expose streaming as a baseline capability. Lack of streaming in MCP forces higher latency, poorer UX, and unnecessary buffering. Adding this at the protocol level avoids incompatible custom implementations and establishes MCP as a viable foundation for real-time, interactive AI systems.