Problem
The ITK test runner and v1.0 agents use incompatible SSE wire formats, which means streaming interop tests are not actually validating SSE event structure.
Test runner (pydantic SDK, a2a-sdk==0.3.25)
Expects flat events with a kind discriminator and lowercase state enums:
data: {"kind": "status-update", "taskId": "tsk-1", "status": {"state": "working"}}
data: {"kind": "artifact-update", "taskId": "tsk-1", "artifact": {"parts": [{"kind": "text", "text": "hello"}]}}
Python v1.0 agent (protobuf SDK, a2a-sdk==1.0.0a0)
Emits a StreamResponse wrapper with a oneof payload and SCREAMING_SNAKE_CASE enums:
data: {"result": {"statusUpdate": {"taskId": "tsk-1", "status": {"state": "TASK_STATE_WORKING"}}}}
data: {"result": {"artifactUpdate": {"taskId": "tsk-1", "artifact": {"parts": [{"text": "hello"}]}}}}
Differences
| Aspect | pydantic SDK (v0.3) | protobuf SDK (v1.0) |
|---|---|---|
| Event wrapper | flat, kind discriminator | StreamResponse.result oneof |
| State enum | working, completed | TASK_STATE_WORKING, TASK_STATE_COMPLETED |
| Role enum | agent, user | ROLE_AGENT, ROLE_USER |
| Part format | {"kind": "text", "text": "..."} | {"text": "..."} (no kind) |
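
To make the mapping in the table concrete, here is a minimal sketch that converts one v1.0-style payload into the flat v0.3-style shape. It covers only the values that appear in the examples and table above. The helper name `to_flat_event`, the lookup tables, and the assumption that a status may carry a `message` with a `role` are illustrative and not taken from either SDK.

```python
import copy

# Only the values listed in the table; anything else passes through unchanged.
STATE_MAP = {"TASK_STATE_WORKING": "working", "TASK_STATE_COMPLETED": "completed"}
ROLE_MAP = {"ROLE_AGENT": "agent", "ROLE_USER": "user"}
# oneof field name -> flat "kind" discriminator, as seen in the sample events.
KIND_MAP = {"statusUpdate": "status-update", "artifactUpdate": "artifact-update"}


def to_flat_event(payload: dict) -> dict:
    """Convert one v1.0-style payload to the flat v0.3-style shape (hypothetical helper)."""
    result = payload.get("result")
    if not isinstance(result, dict) or len(result) != 1:
        raise ValueError("expected a 'result' object with exactly one oneof field")
    (field, body), = result.items()
    if field not in KIND_MAP or not isinstance(body, dict):
        raise ValueError(f"unsupported oneof field: {field!r}")

    # Copy so the caller's payload is not mutated.
    flat = {"kind": KIND_MAP[field], **copy.deepcopy(body)}

    status = flat.get("status")
    if isinstance(status, dict):
        if "state" in status:
            status["state"] = STATE_MAP.get(status["state"], status["state"])
        # Assumption: a status may carry a message that has a role.
        message = status.get("message")
        if isinstance(message, dict) and "role" in message:
            message["role"] = ROLE_MAP.get(message["role"], message["role"])

    artifact = flat.get("artifact")
    if isinstance(artifact, dict) and isinstance(artifact.get("parts"), list):
        # Text parts gain the explicit kind that the flat format expects.
        artifact["parts"] = [
            {"kind": "text", **part}
            if isinstance(part, dict) and "text" in part and "kind" not in part
            else part
            for part in artifact["parts"]
        ]
    return flat
```

Applied to the two protobuf sample events above, this yields the two pydantic sample events. A real shim would need the full enum and part-type sets from the spec.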
Impact
- No single SSE format satisfies both consumers. An agent emitting the pydantic format works for the test runner but not for v1.0 agents receiving streaming callbacks in a multi-hop chain. An agent emitting the protobuf format interoperates with v1.0 agents, but the test runner may fail to parse its events.
- Streaming tests pass for the wrong reason. The v10-core-streaming tests pass because token verification checks the final concatenated result, not individual SSE events. The ITK is not validating that SSE events conform to the v1.0 wire format.
- Agents must hide streaming capability. When adding the Elixir agent (PR #475, "feat: add Elixir A2A SDK (actioncard/a2a-elixir) to ITK"), we had to intentionally not advertise capabilities.streaming=true to avoid parse failures during multi-hop callbacks where different SDKs handle the downstream response.
Suggested fixes
- Upgrade the ITK test runner to use a2a-sdk>=1.0.0a0 (protobuf SDK), which aligns with v1.0 agents natively
- Add format detection in the test runner's SSE parser to handle both pydantic and protobuf event formats
- Separate v0.3 and v1.0 streaming tests so each uses the matching SDK for parsing
- Add SSE event structure assertions to streaming tests; currently only the final result is checked
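
As a rough illustration of the format-detection and structure-assertion ideas, the sketch below classifies a decoded event by shape and checks v1.0 events against the properties listed under Differences. It assumes the payload shapes shown in the examples above. `detect_wire_format` and `assert_v1_event` are hypothetical names, not existing functions in testlib.py or either SDK.

```python
# Only the oneof fields that appear in the sample events above.
V1_ONEOF_FIELDS = ("statusUpdate", "artifactUpdate")


def detect_wire_format(event: dict) -> str:
    """Classify a decoded SSE payload as 'v0.3' (flat) or 'v1.0' (wrapped)."""
    if "kind" in event:
        return "v0.3"
    result = event.get("result")
    if isinstance(result, dict) and any(f in result for f in V1_ONEOF_FIELDS):
        return "v1.0"
    raise ValueError("unrecognized SSE event shape")


def assert_v1_event(event: dict) -> None:
    """Assert that one event matches the v1.0 shape described in this issue."""
    assert detect_wire_format(event) == "v1.0", "event is not in the v1.0 wrapper format"
    result = event["result"]
    assert len(result) == 1, "oneof should carry exactly one field"
    if "statusUpdate" in result:
        state = result["statusUpdate"]["status"]["state"]
        assert state.startswith("TASK_STATE_"), f"unexpected state enum: {state!r}"
    if "artifactUpdate" in result:
        for part in result["artifactUpdate"]["artifact"]["parts"]:
            assert "kind" not in part, "v1.0 parts should not carry a kind field"
```

Calling `assert_v1_event` on every event in a v1.0 streaming test would make the test fail when an agent emits the flat format, even if the final concatenated text is correct.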
Reproduction
- Run the v10-core-streaming test (Python v1.0 + Go v1.0)
- Capture the actual SSE events emitted by each agent
- Compare against what testlib.py can parse
- Note: tests pass but individual events may not parse correctly
Context
Discovered while adding the Elixir v1.0 agent to the ITK (PR #475). The Elixir agent implements SSE streaming but cannot advertise it due to this format mismatch.
cc @kdziedzic70