feat(dashscope): add native Responses API support #23945

rusherman wants to merge 3 commits into BerriAI:main
Conversation
Greptile Summary

This PR adds native Responses API support for DashScope (Alibaba Cloud) by implementing `DashScopeResponsesAPIConfig`. Key observations:
Confidence Score: 3/5
| Filename | Overview |
|---|---|
| litellm/llms/dashscope/responses/transformation.py | New DashScope Responses API config extending OpenAIResponsesAPIConfig. Correctly overrides supports_native_websocket() to return False and simplifies URL construction. params.pop("metadata", None) on line 72 is dead code (metadata is not in _SUPPORTED_OPTIONAL_PARAMS, so the dict comprehension already excludes it). litellm.api_base takes priority over DASHSCOPE_API_BASE in get_complete_url, which can silently mis-route calls when a global api_base is set for another provider. |
| tests/test_litellm/llms/dashscope/responses/test_dashscope_responses_transformation.py | Mock-only test suite covering provider registration, URL construction, auth, and param filtering. test_get_complete_url_default lacks monkeypatch.setattr(litellm, "api_base", None), making it flaky when litellm.api_base is set by another test. sys.path.insert(0, os.path.abspath("../../../../..")) is CWD-relative (not __file__-relative), which may silently fail to add the repo root when pytest is invoked from a different directory. |
| litellm/utils.py | Adds a DASHSCOPE branch to ProviderConfigManager._get_python_responses_api_config(), consistent with every other provider registered in this dispatch table. |
| litellm/_lazy_imports_registry.py | Registers DashScopeResponsesAPIConfig in the LLM config name list and import map, correctly matching the pattern used by all other Responses API configs. |
| litellm/__init__.py | Adds a TYPE_CHECKING import for DashScopeResponsesAPIConfig, consistent with the pattern used by VolcEngineResponsesAPIConfig and ManusResponsesAPIConfig. |
Sequence Diagram
sequenceDiagram
participant Caller
participant ProviderConfigManager
participant DashScopeConfig as DashScopeResponsesAPIConfig
participant OpenAIConfig as OpenAIResponsesAPIConfig
participant DashScope
Caller->>ProviderConfigManager: get_provider_responses_api_config(provider=DASHSCOPE)
ProviderConfigManager-->>Caller: DashScopeResponsesAPIConfig()
Caller->>DashScopeConfig: get_complete_url(api_base, litellm_params)
Note over DashScopeConfig: api_base → litellm.api_base<br/>→ DASHSCOPE_API_BASE → _DEFAULT_API_BASE
DashScopeConfig-->>Caller: resolved endpoint URL
Caller->>DashScopeConfig: validate_environment(headers, model, litellm_params)
Note over DashScopeConfig: Resolves DASHSCOPE_API_KEY,<br/>builds Authorization header
DashScopeConfig-->>Caller: auth headers dict
Caller->>DashScopeConfig: map_openai_params(optional_params, model, drop_params)
Note over DashScopeConfig: Filters to _SUPPORTED_OPTIONAL_PARAMS whitelist
DashScopeConfig-->>Caller: filtered_params
Caller->>DashScopeConfig: transform_responses_api_request(...)
DashScopeConfig->>OpenAIConfig: super().transform_responses_api_request(...)
OpenAIConfig-->>Caller: request_body
Caller->>DashScope: POST /compatible-mode/v1/responses
DashScope-->>Caller: HTTP Response
Caller->>DashScopeConfig: transform_response_api_response(raw_response)
DashScopeConfig->>OpenAIConfig: super().transform_response_api_response(...)
OpenAIConfig-->>Caller: ResponsesAPIResponse
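The dispatch step at the top of the diagram can be sketched as a minimal registry. This is a standalone illustration, not LiteLLM's actual `ProviderConfigManager` code; the enum members and class names mirror the PR, everything else is simplified:

```python
from enum import Enum


class LlmProviders(str, Enum):
    OPENAI = "openai"
    DASHSCOPE = "dashscope"


class OpenAIResponsesAPIConfig:
    def supports_native_websocket(self) -> bool:
        # The OpenAI base config reports native WebSocket support
        return True


class DashScopeResponsesAPIConfig(OpenAIResponsesAPIConfig):
    def supports_native_websocket(self) -> bool:
        # DashScope compatible-mode is HTTP-only
        return False


# Simplified stand-in for ProviderConfigManager._get_python_responses_api_config()
_RESPONSES_API_CONFIGS = {
    LlmProviders.OPENAI: OpenAIResponsesAPIConfig,
    LlmProviders.DASHSCOPE: DashScopeResponsesAPIConfig,
}


def get_provider_responses_api_config(provider: LlmProviders):
    return _RESPONSES_API_CONFIGS[provider]()
```

Each provider branch in the real dispatch table returns its config class the same way; the PR adds the `DASHSCOPE` entry.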
Last reviewed commit: "fix(dashscope): fix ..."
    class DashScopeResponsesAPIConfig(OpenAIResponsesAPIConfig):
        """Responses API configuration for DashScope (Alibaba Cloud)."""

        @property
        def custom_llm_provider(self) -> LlmProviders:
            return LlmProviders.DASHSCOPE
Missing `supports_native_websocket()` override
DashScopeResponsesAPIConfig inherits supports_native_websocket() from OpenAIResponsesAPIConfig, which returns True. DashScope's compatible-mode endpoint almost certainly does not expose a native WebSocket interface for the Responses API.
When supports_native_websocket() returns True, llm_http_handler.py skips the ManagedResponsesWebSocketHandler fallback and attempts to open a direct wss:// connection — which will fail against DashScope's HTTP-only compatible-mode endpoint.
Every other non-OpenAI OpenAIResponsesAPIConfig subclass in the codebase explicitly overrides this method to return False (e.g. VolcEngineResponsesAPIConfig, DatabricksResponsesAPIConfig, HostedVLLMResponsesAPIConfig). DashScope needs the same:
Suggested change:

    class DashScopeResponsesAPIConfig(OpenAIResponsesAPIConfig):
        """Responses API configuration for DashScope (Alibaba Cloud)."""

        @property
        def custom_llm_provider(self) -> LlmProviders:
            return LlmProviders.DASHSCOPE

        def supports_native_websocket(self) -> bool:
            """DashScope compatible-mode does not expose a native WebSocket endpoint."""
            return False
    _SUPPORTED_OPTIONAL_PARAMS: List[str] = [
        "instructions",
        "max_output_tokens",
        "metadata",
        "previous_response_id",
        "reasoning",
        "store",
        "stream",
        "temperature",
        "text",
        "tools",
        "tool_choice",
        "top_p",
        # LiteLLM request plumbing helpers
        "extra_headers",
        "extra_query",
        "extra_body",
        "timeout",
    ]
`metadata` included in filter list, then immediately discarded
metadata is listed in _SUPPORTED_OPTIONAL_PARAMS (so it passes the if key in _SUPPORTED_OPTIONAL_PARAMS filter in map_openai_params), but is then unconditionally popped in both map_openai_params and get_supported_openai_params. This is logically contradictory and misleading — if the intent is "never send metadata to the provider", it should not be in the module-level constant that semantically describes "what DashScope supports".
The same inconsistency exists in the analogous VolcEngineResponsesAPIConfig, but there it is at least a class attribute, so a subclass could potentially change the behaviour. Here it is a module-level constant.
Consider removing "metadata" from _SUPPORTED_OPTIONAL_PARAMS entirely and keeping only the params.pop("metadata", None) guard in map_openai_params as a defensive measure, matching the comment intent ("LiteLLM metadata is internal-only"):
Suggested change:

    _SUPPORTED_OPTIONAL_PARAMS: List[str] = [
        "instructions",
        "max_output_tokens",
        "previous_response_id",
        "reasoning",
        "store",
        "stream",
        "temperature",
        "text",
        "tools",
        "tool_choice",
        "top_p",
        # LiteLLM request plumbing helpers
        "extra_headers",
        "extra_query",
        "extra_body",
        "timeout",
    ]
    def get_complete_url(
        self,
        api_base: Optional[str],
        litellm_params: dict,
    ) -> str:
        """Construct DashScope Responses API endpoint."""
        base_url = (
            api_base
            or litellm.api_base
            or get_secret_str("DASHSCOPE_API_BASE")
            or _DEFAULT_API_BASE
        )

        base_url = base_url.rstrip("/")

        if base_url.endswith("/responses"):
            return base_url
        if base_url.endswith("/v1"):
            return f"{base_url}/responses"
        if base_url.endswith("/compatible-mode/v1"):
            return f"{base_url}/responses"
        return f"{base_url}/compatible-mode/v1/responses"
Fallback path appends DashScope-specific route to arbitrary base URLs
When api_base is provided but doesn't match any of the known suffixes (/responses, /v1, /compatible-mode/v1), the final branch appends /compatible-mode/v1/responses to whatever was given. This means a proxy user who sets api_base="https://my-proxy.example.com" expecting standard /v1/responses routing would instead get https://my-proxy.example.com/compatible-mode/v1/responses, which is likely wrong for non-DashScope-specific proxies.
The parent OpenAIResponsesAPIConfig.get_complete_url simply appends /responses to the stripped base — a reasonable default. Consider whether a plain {base_url}/responses fallback (matching the parent) would be safer for the last branch than the DashScope-specific path:
    # Instead of:
    return f"{base_url}/compatible-mode/v1/responses"

    # Consider:
    return f"{base_url}/responses"

This would match the behaviour a user would expect when providing a fully-qualified custom endpoint base.
    if base_url.endswith("/responses"):
        return base_url
    if base_url.endswith("/v1"):
        return f"{base_url}/responses"
    if base_url.endswith("/compatible-mode/v1"):
        return f"{base_url}/responses"
    return f"{base_url}/responses"
Redundant branches — all three produce the same result
The two intermediate checks (/v1 and /compatible-mode/v1) are dead code: they each return f"{base_url}/responses", which is identical to the final fallback. The only branch that meaningfully changes behavior is the first one. The three clauses can be collapsed:
Suggested change:

    if base_url.endswith("/responses"):
        return base_url
    return f"{base_url}/responses"
If you do want to keep explicit documentation of the known DashScope-specific suffixes, that's fine — but a comment explaining the intent would be clearer than three separate branches.
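The collapsed logic can be checked in isolation. The sketch below uses a hypothetical standalone helper (`build_responses_url`), not the PR's actual method; only the default base URL is taken from the PR:

```python
from typing import Optional

_DEFAULT_API_BASE = "https://dashscope.aliyuncs.com/compatible-mode/v1"


def build_responses_url(api_base: Optional[str]) -> str:
    """Collapsed endpoint construction: only the '/responses' check matters."""
    base_url = (api_base or _DEFAULT_API_BASE).rstrip("/")
    if base_url.endswith("/responses"):
        return base_url
    return f"{base_url}/responses"


# The default and every known suffix reduce to the same append:
assert build_responses_url(None) == "https://dashscope.aliyuncs.com/compatible-mode/v1/responses"
assert build_responses_url("https://proxy.example.com/v1") == "https://proxy.example.com/v1/responses"
assert build_responses_url("https://x.example.com/v1/responses") == "https://x.example.com/v1/responses"
```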
    params = {
        key: value
        for key, value in dict(response_api_optional_params).items()
        if key in _SUPPORTED_OPTIONAL_PARAMS
    }
    # LiteLLM metadata is internal-only; don't send to provider
    params.pop("metadata", None)
    return params
params.pop("metadata", None) is unreachable dead code
"metadata" is not present in _SUPPORTED_OPTIONAL_PARAMS, so the dict comprehension on lines 65–68 already excludes it unconditionally. By the time execution reaches line 71, params can never contain a "metadata" key, making the .pop() a no-op.
Either add "metadata" to _SUPPORTED_OPTIONAL_PARAMS (to document that LiteLLM may pass it) and rely solely on the .pop() to strip it before sending — matching the VolcEngineResponsesAPIConfig pattern — or remove the .pop() entirely and rely solely on the comprehension filter.
Suggested change:

    params = {
        key: value
        for key, value in dict(response_api_optional_params).items()
        if key in _SUPPORTED_OPTIONAL_PARAMS
    }
    return params
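That the `pop()` is a no-op can be demonstrated in isolation (standalone sketch; the whitelist is abbreviated from the PR's constant):

```python
# Abbreviated stand-in for the module-level whitelist (no "metadata" entry)
_SUPPORTED_OPTIONAL_PARAMS = ["temperature", "top_p", "stream"]

optional_params = {"temperature": 0.2, "metadata": {"run": "abc"}, "stream": True}

# The comprehension filter already drops anything outside the whitelist
params = {k: v for k, v in optional_params.items() if k in _SUPPORTED_OPTIONAL_PARAMS}
assert "metadata" not in params

# ...so this pop can never remove anything:
removed = params.pop("metadata", None)
assert removed is None
assert params == {"temperature": 0.2, "stream": True}
```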
    def test_get_complete_url_default(self):
        """Default URL should point to DashScope compatible-mode endpoint."""
        config = DashScopeResponsesAPIConfig()
        url = config.get_complete_url(api_base=None, litellm_params={})
        assert url == "https://dashscope.aliyuncs.com/compatible-mode/v1/responses"
Missing test for `supports_native_websocket` and `DASHSCOPE_API_BASE` env-var fallback
The supports_native_websocket() override is the most critical behavioral change in this PR (it prevents llm_http_handler from attempting a WebSocket connection to DashScope's HTTP-only endpoint), but no test asserts it returns False. A regression here would silently break all WebSocket paths.
Similarly, get_complete_url reads from get_secret_str("DASHSCOPE_API_BASE") as an intermediate fallback, but there is no test exercising that code path.
Consider adding:
    def test_supports_native_websocket_is_false(self):
        """DashScope compatible-mode must not expose a native WebSocket."""
        config = DashScopeResponsesAPIConfig()
        assert config.supports_native_websocket() is False

    def test_get_complete_url_env_base(self, monkeypatch):
        """DASHSCOPE_API_BASE env var should be used when api_base is None."""
        config = DashScopeResponsesAPIConfig()
        monkeypatch.setattr(litellm, "api_base", None)
        monkeypatch.setenv("DASHSCOPE_API_BASE", "https://env-base.example.com/v1")
        url = config.get_complete_url(api_base=None, litellm_params={})
        assert url == "https://env-base.example.com/v1/responses"

From the test file header:

    import sys

    import pytest
sys.path.insert uses a CWD-relative path, not file-relative
os.path.abspath("../../../../..") resolves the path relative to the current working directory at runtime, not relative to the test file's location. When pytest is invoked from the repo root (/github/berriai/litellm), this call computes ../../../../.. from the repo root — i.e., five levels above the repo root — which is an incorrect and useless path insertion.
The safer, widely-used pattern (as seen in test_message_sanitization.py) anchors the path to __file__:
    sys.path.insert(0, os.path.abspath(os.path.join(os.path.dirname(__file__), "../../../../..")))
This guarantees the repo root is always added to sys.path regardless of where pytest is invoked.
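The difference is easy to verify (standalone sketch; `repo_root_from` is a hypothetical helper, not code from the PR):

```python
import os

# os.path.abspath resolves relative paths against the current working
# directory, not against the file that contains the call:
assert os.path.abspath(".") == os.getcwd()


def repo_root_from(test_file_path: str, levels_up: int = 5) -> str:
    """Walk `levels_up` directories above the given file's directory; this is
    what anchoring to __file__ achieves (real code would apply
    os.path.abspath first)."""
    path = os.path.dirname(test_file_path)
    for _ in range(levels_up):
        path = os.path.dirname(path)
    return path


# Five levels above tests/test_litellm/llms/dashscope/responses/ is the repo root:
assert repo_root_from("/repo/tests/a/b/c/d/test_x.py", levels_up=5) == "/repo"
```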
    base_url = (
        api_base
        or litellm.api_base
        or get_secret_str("DASHSCOPE_API_BASE")
        or _DEFAULT_API_BASE
    )
litellm.api_base takes priority over DASHSCOPE_API_BASE, shadowing provider-specific config
The resolution order is:
    base_url = (
        api_base
        or litellm.api_base                       # ← global, set by the user for any provider
        or get_secret_str("DASHSCOPE_API_BASE")   # ← DashScope-specific
        or _DEFAULT_API_BASE
    )

`litellm.api_base` is a global setting that users typically configure for OpenAI or a proxy, not for DashScope. If a user has set `litellm.api_base = "https://api.openai.com/v1"` for a different provider earlier in the same process, every DashScope Responses API call will silently route to `https://api.openai.com/v1/responses` instead of the DashScope endpoint, even if `DASHSCOPE_API_BASE` is set.
VolcEngineResponsesAPIConfig.get_complete_url follows the same ordering, but other providers (e.g. OpenAIResponsesAPIConfig) similarly inherit the global. This is a systemic LiteLLM pattern, but it is worth documenting (via a comment) that litellm.api_base takes precedence over the environment variable, since it will surprise users who configure per-provider base URLs via env vars.
If the intent is to give per-provider env vars higher precedence than the global, swap the order:
    base_url = (
        api_base
        or get_secret_str("DASHSCOPE_API_BASE")
        or litellm.api_base
        or _DEFAULT_API_BASE
    )

DashScope (Alibaba Cloud) provides an OpenAI-compatible `/v1/responses` endpoint. This adds a `DashScopeResponsesAPIConfig` so that LiteLLM can route Responses API calls to DashScope natively, enabling `previous_response_id` sticky routing and server-side context caching.

Changes:
- New `litellm/llms/dashscope/responses/transformation.py`
- Register in `ProviderConfigManager._get_python_responses_api_config()`
- Add lazy import entry in `_lazy_imports_registry.py`
- Unit tests covering registration, URL construction, auth, and params
- Add supports_native_websocket() returning False (DashScope has no native WS)
- Remove metadata from _SUPPORTED_OPTIONAL_PARAMS (was contradictory)
- Add DashScopeResponsesAPIConfig to TYPE_CHECKING block in __init__.py
- Fix get_complete_url fallback to use /responses for custom proxy URLs

- Fix get_error_class return type to BaseLLMException (was Exception)
- Remove redundant URL branches in get_complete_url (dead code)
- Use BaseLLMException directly instead of OpenAIError wrapper
Force-pushed 55d5b05 to 8815635
    def test_get_complete_url_default(self):
        """Default URL should point to DashScope compatible-mode endpoint."""
        config = DashScopeResponsesAPIConfig()
        url = config.get_complete_url(api_base=None, litellm_params={})
        assert url == "https://dashscope.aliyuncs.com/compatible-mode/v1/responses"
litellm.api_base not guarded against in URL default test
test_get_complete_url_default calls get_complete_url(api_base=None, …) but does not patch litellm.api_base to None. The implementation resolves the base URL as:
    base_url = (
        api_base                                  # None — falls through
        or litellm.api_base                       # ← picked up if set globally!
        or get_secret_str("DASHSCOPE_API_BASE")
        or _DEFAULT_API_BASE
    )

If any other test (or module-level code) has set `litellm.api_base` to a non-None value earlier in the test session, the assertion `url == "https://dashscope.aliyuncs.com/compatible-mode/v1/responses"` will fail because `base_url` will resolve to whatever `litellm.api_base` holds rather than `_DEFAULT_API_BASE`. Compare with `test_validate_environment_uses_api_key`, which correctly patches `litellm.api_key`.
Suggested change:

    def test_get_complete_url_default(self, monkeypatch):
        """Default URL should point to DashScope compatible-mode endpoint."""
        config = DashScopeResponsesAPIConfig()
        monkeypatch.setattr(litellm, "api_base", None)
        url = config.get_complete_url(api_base=None, litellm_params={})
        assert url == "https://dashscope.aliyuncs.com/compatible-mode/v1/responses"
Summary
- `DashScopeResponsesAPIConfig` extending `OpenAIResponsesAPIConfig` for DashScope (Alibaba Cloud) Responses API support
- Uses the `/v1/responses` endpoint, enabling `previous_response_id` sticky routing and server-side context caching
- Registered in `ProviderConfigManager` and `_lazy_imports_registry.py`

Changes
- `litellm/llms/dashscope/responses/__init__.py`
- `litellm/llms/dashscope/responses/transformation.py`
- `litellm/utils.py` — add DASHSCOPE branch in `_get_python_responses_api_config()`
- `litellm/_lazy_imports_registry.py` — add lazy import entry
- `tests/test_litellm/llms/dashscope/responses/test_dashscope_responses_transformation.py`

Implementation Notes
DashScope is fully OpenAI-compatible, so the config is intentionally minimal — only overriding:
- `custom_llm_provider` → `LlmProviders.DASHSCOPE`
- `validate_environment()` → reads `DASHSCOPE_API_KEY`
- `get_complete_url()` → `https://dashscope.aliyuncs.com/compatible-mode/v1/responses`
- `get_supported_openai_params()` → parameter whitelist
- `map_openai_params()` → filter to supported set

All other methods (transform_request, transform_response, streaming, etc.) are inherited from `OpenAIResponsesAPIConfig`.

Test Plan
- `test_provider_config_registration` — registry returns correct config
- `test_get_complete_url_default` — default URL construction
- `test_get_complete_url_custom_base` — custom base URL variants
- `test_validate_environment_uses_api_key` — API key from params
- `test_validate_environment_from_env` — API key from env var
- `test_validate_environment_raises_without_key` — missing key error
- `test_supported_params` — parameter whitelist correctness
- `test_map_openai_params_filters_unsupported` — unsupported param filtering
- `test_extra_headers_merged` — custom header merging

All 10 tests passing locally.
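The `validate_environment()` behavior described above (reads `DASHSCOPE_API_KEY`) can be sketched as follows. This is a simplified stand-in, not the PR's actual method: the helper name is hypothetical, the real config also merges caller-supplied headers, and the exact exception type it raises is an assumption here:

```python
import os
from typing import Optional


def build_dashscope_auth_headers(api_key: Optional[str] = None) -> dict:
    """Resolve the key from the argument or DASHSCOPE_API_KEY, then build the header."""
    key = api_key or os.environ.get("DASHSCOPE_API_KEY")
    if not key:
        # Hypothetical error type; the real config may raise a LiteLLM-specific exception
        raise ValueError("DashScope API key not found: set DASHSCOPE_API_KEY")
    return {"Authorization": f"Bearer {key}"}


headers = build_dashscope_auth_headers(api_key="sk-test")
# → {"Authorization": "Bearer sk-test"}
```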