
add forward_headers passthrough to remote::model-context-protocol#2

Closed
skamenan7 wants to merge 17 commits into main from feat/5152-mcp-tool-passthrough

Conversation

@skamenan7
Owner

@skamenan7 skamenan7 commented Mar 23, 2026

Note: this branch is stacked on llamastack#5134 (inference + safety passthrough, not yet merged). The only new code here is in the last commit — providers/remote/tool_runtime/model_context_protocol/ and tests/integration/tool_runtime/test_passthrough_mcp.py. Reviews of this PR should focus there; the rest is carried over from llamastack#5134 pending its merge.

Adds forward_headers and extra_blocked_headers to MCPProviderConfig, wiring per-request header forwarding into list_runtime_tools and invoke_tool. This lets deployers map keys from X-LlamaStack-Provider-Data to outbound HTTP headers so request-scoped auth tokens (MaaS API keys, tenant IDs, etc.) reach the downstream MCP server without the caller passing them via authorization= on every tool call.

Follows the same forward_headers pattern introduced for inference and safety passthrough in llamastack#5134. Authorization-mapped values are split out and passed via the authorization= param — prepare_mcp_headers() rejects Authorization in the headers dict directly, so it flows through the dedicated param instead.

What changed

  • MCPProviderConfig: added forward_headers: dict[str, str] | None and extra_blocked_headers: list[str] with config-time validation via validate_forward_headers_config() from providers/utils/forward_headers.py
  • MCPProviderDataValidator: added model_config = ConfigDict(extra="allow") so deployer-defined keys survive Pydantic parsing (key names are operator-configured at deploy time and can't be declared as typed fields)
  • ModelContextProtocolImpl: new _get_forwarded_headers_and_auth() reads the allowlist from provider data, splits Authorization for the authorization= param, returns non-auth headers separately. Both list_runtime_tools and invoke_tool merge forwarded headers with the legacy mcp_headers URI-keyed path (kept for backward compat). Explicit authorization= from the caller wins over forwarded auth.
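The auth-splitting and precedence rules above can be sketched as below. This is a hypothetical illustration, not the actual implementation: the helper names `split_auth` and `effective_auth` are invented here, and the real `_get_forwarded_headers_and_auth()` also consults the configured allowlist and provider data.

```python
from __future__ import annotations


def split_auth(forwarded_headers: dict[str, str]) -> tuple[dict[str, str], str | None]:
    """Separate Authorization (case-insensitively) from other forwarded headers."""
    auth_token: str | None = None
    non_auth: dict[str, str] = {}
    for name, value in forwarded_headers.items():
        if name.lower() == "authorization":
            # prepare_mcp_headers() rejects Authorization inside the headers dict,
            # so it must travel via the dedicated authorization= param instead.
            auth_token = value
        else:
            non_auth[name] = value
    return non_auth, auth_token


def effective_auth(explicit: str | None, forwarded: str | None) -> str | None:
    # Explicit authorization= from the caller wins over forwarded auth.
    return explicit if explicit is not None else forwarded
```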

Config example:

providers:
  tool_runtime:
  - provider_type: remote::model-context-protocol
    config:
      forward_headers:
        maas_api_token: Authorization   # bare token, "Bearer " prepended by prepare_mcp_headers
        tenant_id: X-Tenant-ID
        team_id: X-Team-ID
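To make the mapping concrete, here is a minimal simulation of what the allowlist above yields at request time. The payload values are made up; the point is the default-deny behavior (`secret_internal` is dropped) and the missing-key soft-skip (`team_id` is absent from provider data, so X-Team-ID is simply not sent).

```python
import json

# Allowlist from the config above: provider-data key -> outbound header name
forward_headers = {
    "maas_api_token": "Authorization",
    "tenant_id": "X-Tenant-ID",
    "team_id": "X-Team-ID",
}

# Hypothetical X-LlamaStack-Provider-Data payload: secret_internal is not in
# the allowlist, and team_id is deliberately missing.
provider_data = json.loads(
    '{"maas_api_token": "sk-abc", "tenant_id": "acme", "secret_internal": "nope"}'
)

# Default-deny: only allowlisted keys map to outbound headers; missing keys are skipped
outbound = {
    header: provider_data[key]
    for key, header in forward_headers.items()
    if key in provider_data
}
```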

Test plan

Unit/integration tests covering config validation, header forwarding, auth splitting, default-deny enforcement, missing-key soft-skip, and wiring through list_runtime_tools and invoke_tool:

uv run pytest tests/integration/tool_runtime/test_passthrough_mcp.py -v

Local testing with mock MCP server — ran a mock MCP server that captures all inbound headers and exposes them at /debug/last-headers. Started llama-stack with the forward_headers config above, then verified via the agent API that:

  • maas_api_token from X-LlamaStack-Provider-Data arrived as Authorization: Bearer <token> on the downstream server
  • tenant_id arrived as X-Tenant-ID
  • Unlisted keys (secret_internal) did not appear in any downstream header (default-deny confirmed)
  • Requests with missing or no provider data worked without crashing

Note: invoke_tool and list_runtime_tools are internal methods called by the agents layer and not exposed as HTTP endpoints (changed in upstream PRs llamastack#4997 and llamastack#5246), so full e2e requires a model server via the agent API.

Checklist

  • forward_headers / extra_blocked_headers optional with backward-compatible defaults — configs without them parse correctly
  • Default-deny: unlisted keys are silently dropped, nothing escapes the allowlist
  • Three-state config tested: None, empty dict, populated dict
  • MCPProviderDataValidator uses extra="allow" (intentional — deployer key names can't be pre-declared)
  • MCPProviderConfig uses extra="forbid"
  • Reuses shared build_forwarded_headers() and validate_forward_headers_config() from providers/utils/forward_headers.py — no duplication
  • Existing mcp_headers URI-keyed path preserved for backward compat
  • No breaking changes

Summary by Sourcery

Add configurable per-request header forwarding support to remote passthrough providers, including MCP tool runtime, inference, and safety, using a shared forward_headers utility with stricter validation and default-deny behavior.

New Features:

  • Allow MCP remote tool runtime to forward selected request-scoped headers and auth tokens from provider data to downstream MCP servers via forward_headers configuration.
  • Enable inference and safety passthrough providers to forward whitelisted headers from X-LlamaStack-Provider-Data using a common forward_headers policy and extra_blocked_headers for operator overrides.

Enhancements:

  • Refine passthrough inference auth handling so the OpenAI client relies solely on composed request headers, avoiding sentinel API keys and ensuring static credentials override forwarded Authorization values.
  • Harden header forwarding with centralized validation, case-insensitive blocking of security-sensitive and operator-defined headers, value sanitization, and size limits.
  • Relax provider-data validators for passthrough and MCP so deployer-defined keys are preserved while keeping provider configs strict via extra='forbid'.
  • Document forward_headers and extra_blocked_headers options for MCP tool runtime, inference passthrough, and safety passthrough providers.

Tests:

  • Add comprehensive unit tests for the shared forward_headers utilities, passthrough inference adapter behavior, safety passthrough config and headers, and MCP provider config and wiring through list_runtime_tools and invoke_tool.

@sourcery-ai

sourcery-ai bot commented Mar 23, 2026

Reviewer's Guide

Adds configurable per-request header forwarding to the remote model-context-protocol tool runtime, reusing a shared forward_headers utility and aligning MCP behavior with the existing inference/safety passthrough patterns while preserving backward compatibility with legacy mcp_headers and auth handling.

Sequence diagram for MCP tool invocation with forwarded headers

sequenceDiagram
    actor Agent
    participant ModelContextProtocolImpl
    participant ForwardHeadersUtils as ForwardHeadersUtils_build_forwarded_headers
    participant LegacyHeaders as ModelContextProtocolImpl_get_headers_from_request
    participant MCPClient as invoke_mcp_tool
    participant MCPServer as Downstream_MCP_server

    Agent->>ModelContextProtocolImpl: invoke_tool(tool_name, kwargs, authorization?)
    activate ModelContextProtocolImpl

    ModelContextProtocolImpl->>ModelContextProtocolImpl: _get_forwarded_headers_and_auth()
    activate ForwardHeadersUtils
    ModelContextProtocolImpl->>ForwardHeadersUtils: build_forwarded_headers(provider_data, config.forward_headers)
    ForwardHeadersUtils-->>ModelContextProtocolImpl: forwarded_headers
    deactivate ForwardHeadersUtils

    ModelContextProtocolImpl->>ModelContextProtocolImpl: split Authorization from forwarded_headers
    ModelContextProtocolImpl-->>ModelContextProtocolImpl: forwarded_headers_no_auth, forwarded_auth

    ModelContextProtocolImpl->>LegacyHeaders: get_headers_from_request(endpoint)
    activate LegacyHeaders
    LegacyHeaders-->>ModelContextProtocolImpl: legacy_headers (from mcp_headers)
    deactivate LegacyHeaders

    ModelContextProtocolImpl-->>ModelContextProtocolImpl: merged_headers = forwarded_headers_no_auth + legacy_headers
    ModelContextProtocolImpl-->>ModelContextProtocolImpl: effective_auth = authorization param or forwarded_auth

    ModelContextProtocolImpl->>MCPClient: invoke_mcp_tool(endpoint, tool_name, kwargs, headers=merged_headers, authorization=effective_auth)
    activate MCPClient
    MCPClient->>MCPServer: HTTP request with merged_headers and Authorization
    MCPServer-->>MCPClient: HTTP response
    MCPClient-->>ModelContextProtocolImpl: result
    deactivate MCPClient

    ModelContextProtocolImpl-->>Agent: tool result
    deactivate ModelContextProtocolImpl

Sequence diagram for inference passthrough with forwarded headers and API key precedence

sequenceDiagram
    actor Client
    participant InferenceAdapter as PassthroughInferenceAdapter
    participant ForwardHeadersUtils as ForwardHeadersUtils_build_forwarded_headers
    participant OpenAI as AsyncOpenAI_client
    participant Downstream as Downstream_inference_service

    Client->>InferenceAdapter: openai_completion(...)
    activate InferenceAdapter

    InferenceAdapter->>InferenceAdapter: _get_openai_client()

    InferenceAdapter->>InferenceAdapter: _build_request_headers()
    InferenceAdapter->>ForwardHeadersUtils: build_forwarded_headers(provider_data, config.forward_headers)
    activate ForwardHeadersUtils
    ForwardHeadersUtils-->>InferenceAdapter: forwarded_headers
    deactivate ForwardHeadersUtils

    InferenceAdapter->>InferenceAdapter: _get_passthrough_api_key_or_none(provider_data?)
    InferenceAdapter-->>InferenceAdapter: api_key_or_none

    alt static_or_per_request_api_key_present
        InferenceAdapter-->>InferenceAdapter: drop forwarded Authorization variants
        InferenceAdapter-->>InferenceAdapter: headers[Authorization] = Bearer api_key
    end

    InferenceAdapter-->>InferenceAdapter: request_headers

    InferenceAdapter->>OpenAI: create AsyncOpenAI(base_url, api_key="", default_headers=request_headers)
    activate OpenAI

    InferenceAdapter->>OpenAI: chat.completions.create(...)
    OpenAI->>Downstream: HTTP request with request_headers
    Downstream-->>OpenAI: HTTP response
    OpenAI-->>InferenceAdapter: completion
    deactivate OpenAI

    InferenceAdapter-->>Client: completion
    deactivate InferenceAdapter

Class diagram for forward_headers configs and utilities

classDiagram
    class MCPProviderDataValidator {
        +ConfigDict model_config
        +dict~str, dict~str, str~~ mcp_headers
    }

    class MCPProviderConfig {
        +ConfigDict model_config
        +dict~str, str~ forward_headers
        +list~str~ extra_blocked_headers
        +sample_run_config(forward_headers, extra_blocked_headers, _kwargs) dict~str, Any~
        +validate_forward_headers() MCPProviderConfig
    }

    class PassthroughImplConfig {
        +HttpUrl base_url
        +SecretStr auth_credential
        +dict~str, str~ forward_headers
        +list~str~ extra_blocked_headers
        +validate_forward_headers() PassthroughImplConfig
        +sample_run_config(base_url, api_key, forward_headers, extra_blocked_headers, kwargs) dict~str, Any~
    }

    class PassthroughProviderDataValidator_inference {
        +ConfigDict model_config
        +HttpUrl passthrough_url
        +SecretStr passthrough_api_key
    }

    class PassthroughSafetyConfig {
        +HttpUrl base_url
        +SecretStr api_key
        +dict~str, str~ forward_headers
        +list~str~ extra_blocked_headers
        +validate_forward_headers() PassthroughSafetyConfig
        +sample_run_config(base_url, api_key, forward_headers, extra_blocked_headers, kwargs) dict~str, Any~
    }

    class PassthroughProviderDataValidator_safety {
        +ConfigDict model_config
        +SecretStr passthrough_api_key
    }

    class PassthroughInferenceAdapter {
        +PassthroughImplConfig config
        +initialize() None
        +_get_openai_client() AsyncOpenAI
        +_build_request_headers() dict~str, str~
        +_get_passthrough_api_key_or_none(provider_data) str
        +_get_passthrough_url() str
    }

    class PassthroughSafetyAdapter {
        +PassthroughSafetyConfig config
        +_get_api_key() str
        +_build_forward_headers() dict~str, str~
        +_build_request_headers() dict~str, str~
    }

    class ModelContextProtocolImpl {
        +MCPProviderConfig config
        +_get_forwarded_headers_and_auth() tuple~dict~str, str~, str~
        +get_headers_from_request(mcp_endpoint_uri) dict~str, str~
        +list_runtime_tools(mcp_endpoint, authorization) ListToolDefsResponse
        +invoke_tool(tool_name, kwargs, authorization) Any
    }

    class ForwardHeadersUtils {
        +CORE_BLOCKED_FORWARD_HEADERS : frozenset~str~
        +normalize_header_name(header_name) str
        +is_valid_header_name(header_name) bool
        +get_effective_blocked_forward_headers(extra_blocked_headers) frozenset~str~
        +validate_forward_headers_config(forward_headers, extra_blocked_headers) None
        +build_forwarded_headers(provider_data, forward_headers) dict~str, str~
    }

    MCPProviderConfig ..> ForwardHeadersUtils : uses
    PassthroughImplConfig ..> ForwardHeadersUtils : uses
    PassthroughSafetyConfig ..> ForwardHeadersUtils : uses

    PassthroughInferenceAdapter --> PassthroughImplConfig : has
    PassthroughSafetyAdapter --> PassthroughSafetyConfig : has
    ModelContextProtocolImpl --> MCPProviderConfig : has

    PassthroughInferenceAdapter ..> PassthroughProviderDataValidator_inference : reads provider_data
    PassthroughSafetyAdapter ..> PassthroughProviderDataValidator_safety : reads provider_data
    ModelContextProtocolImpl ..> MCPProviderDataValidator : reads provider_data

    PassthroughImplConfig --|> RemoteInferenceProviderConfig
    PassthroughSafetyAdapter --|> Safety
    PassthroughInferenceAdapter --|> Inference
    ModelContextProtocolImpl --|> NeedsRequestProviderData

File-Level Changes

Change Details Files
Introduce shared forward_headers utility and apply it across inference, safety, and MCP providers for consistent, policy-enforced header forwarding.
  • Added providers/utils/forward_headers.py with validation helpers, blocked-header policy, header-name normalization, and safe header-value extraction/sanitization.
  • Refactored inference and safety passthrough adapters to use build_forwarded_headers() instead of bespoke logic, including case-insensitive coalescing of Authorization variants and max-size enforcement.
  • Centralized forward_headers validation via validate_forward_headers_config() and get_effective_blocked_forward_headers(), replacing ad-hoc blocked-header constants in safety config.
src/llama_stack/providers/utils/forward_headers.py
src/llama_stack/providers/remote/inference/passthrough/passthrough.py
src/llama_stack/providers/remote/safety/passthrough/passthrough.py
src/llama_stack/providers/remote/safety/passthrough/config.py
src/llama_stack/providers/remote/inference/passthrough/config.py
tests/unit/providers/safety/passthrough/test_headers.py
tests/unit/providers/safety/passthrough/test_config.py
tests/unit/providers/inference/test_passthrough_forward_headers.py
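The centralized blocked-header policy described above can be sketched as follows. This is a sketch under stated assumptions: the blocklist contents are illustrative (the real CORE_BLOCKED_FORWARD_HEADERS may differ), and the function bodies are reconstructed from the described behavior, not copied from forward_headers.py.

```python
from __future__ import annotations

# Illustrative blocklist; the real CORE_BLOCKED_FORWARD_HEADERS may differ.
CORE_BLOCKED_FORWARD_HEADERS = frozenset(
    {"host", "content-length", "x-llamastack-provider-data"}
)


def get_effective_blocked_forward_headers(
    extra_blocked_headers: list[str] | None,
) -> frozenset[str]:
    """Core blocklist plus operator-defined additions, normalized to lowercase."""
    return CORE_BLOCKED_FORWARD_HEADERS | frozenset(
        h.lower() for h in (extra_blocked_headers or [])
    )


def validate_forward_headers_config(
    forward_headers: dict[str, str] | None,
    extra_blocked_headers: list[str] | None = None,
) -> None:
    """Reject config entries that map provider data onto blocked outbound headers."""
    if not forward_headers:
        return
    blocked = get_effective_blocked_forward_headers(extra_blocked_headers)
    for key, header in forward_headers.items():
        if header.lower() in blocked:  # case-insensitive blocking
            raise ValueError(
                f"forward_headers: {key!r} -> {header!r} targets a blocked header"
            )
```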
Add forward_headers and extra_blocked_headers support to the MCP tool runtime configuration and provider data handling, including auth splitting and legacy-path merging.
  • Extended MCPProviderConfig with forward_headers and extra_blocked_headers fields plus a model_validator that delegates to validate_forward_headers_config(), and made config extra="forbid".
  • Updated MCPProviderDataValidator to use extra="allow" so deployer-defined provider-data keys survive parsing alongside legacy mcp_headers.
  • Implemented ModelContextProtocolToolRuntimeImpl._get_forwarded_headers_and_auth() to build forwarded headers from provider data, split Authorization into a separate auth token, warn on configured-but-missing keys, and return (non_auth_headers, auth_token).
  • Updated list_runtime_tools and invoke_tool to merge forwarded headers with legacy mcp_headers, and to prefer explicit authorization= over any forwarded auth token.
  • Added integration tests that validate MCPProviderConfig, provider-data behavior, and list_runtime_tools/invoke_tool wiring for header forwarding and auth precedence.
src/llama_stack/providers/remote/tool_runtime/model_context_protocol/config.py
src/llama_stack/providers/remote/tool_runtime/model_context_protocol/model_context_protocol.py
tests/integration/tool_runtime/test_passthrough_mcp.py
Tighten and document passthrough inference and safety configs to support header forwarding and more flexible per-request auth while maintaining backward compatibility.
  • Extended PassthroughImplConfig and PassthroughSafetyConfig with forward_headers and extra_blocked_headers, plus model_validator-based validation using validate_forward_headers_config().
  • Adjusted sample_run_config helpers and docs to expose the new fields, default api_key to an empty string when unset, and propagate extra_blocked_headers into sample configs.
  • Changed PassthroughProviderDataValidator (inference) and PassthroughProviderDataValidator (safety) to allow extra keys so forward_headers-mapped provider data is preserved, and made passthrough_url/passthrough_api_key optional.
  • Reworked PassthroughInferenceAdapter to build outbound headers via _build_request_headers(), prefer static auth_credential over provider-data API key, fall back to provider-data API key if static is empty/None, and avoid sending a spurious Authorization from the OpenAI SDK by setting api_key="" when relying solely on headers.
  • Updated PassthroughSafetyAdapter header-building logic to reuse build_forwarded_headers(), log when forward_headers is configured but yields no matches, and ensure static api_key overwrites any forwarded Authorization in a case-insensitive way.
  • Added/updated unit tests to cover config validation, provider-data preservation, header assembly behavior (including precedence and case-collision), size and character sanitization, and ContextVar isolation between concurrent requests.
src/llama_stack/providers/remote/inference/passthrough/config.py
src/llama_stack/providers/remote/inference/passthrough/__init__.py
src/llama_stack/providers/remote/inference/passthrough/passthrough.py
src/llama_stack/providers/remote/safety/passthrough/config.py
src/llama_stack/providers/remote/safety/passthrough/passthrough.py
docs/docs/providers/inference/remote_passthrough.mdx
docs/docs/providers/safety/remote_passthrough.mdx
tests/unit/providers/inference/test_passthrough_forward_headers.py
tests/unit/providers/safety/passthrough/test_headers.py
tests/unit/providers/safety/passthrough/test_config.py
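The case-insensitive static-credential override described above can be sketched as below; the helper name `apply_static_api_key` is invented here for illustration and is not the adapter's actual method.

```python
from __future__ import annotations


def apply_static_api_key(headers: dict[str, str], api_key: str | None) -> dict[str, str]:
    """Static credential overwrites any forwarded Authorization, case-insensitively."""
    out = dict(headers)
    if api_key:
        # Drop every casing variant of Authorization before setting the static one
        for name in [n for n in out if n.lower() == "authorization"]:
            del out[name]
        out["Authorization"] = f"Bearer {api_key}"
    return out
```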
Update documentation for the remote MCP and passthrough providers to describe the new forward_headers and extra_blocked_headers configuration options.
  • Extended remote::model-context-protocol provider docs with a configuration table describing forward_headers and extra_blocked_headers semantics and security guarantees.
  • Updated inference and safety remote_passthrough docs to list forward_headers/extra_blocked_headers in the config tables and to clarify behavior for Authorization and security-sensitive headers.
  • Adjusted the inference remote_passthrough sample config to allow an empty PASSTHROUGH_API_KEY env value (":=") to support header-only auth setups.
docs/docs/providers/tool_runtime/remote_model-context-protocol.mdx
docs/docs/providers/inference/remote_passthrough.mdx
docs/docs/providers/safety/remote_passthrough.mdx


@sourcery-ai sourcery-ai bot left a comment


Hey - I've found 2 issues, and left some high level feedback:

  • The warning about forward_headers being configured but yielding no matching provider-data keys is now duplicated in multiple adapters (inference, safety, MCP); consider centralizing this into a small helper around build_forwarded_headers so the log message and behavior stay consistent in one place.
  • In build_forwarded_headers, you recompute len(sanitized.encode()) when logging the oversize warning; you could compute it once into a local variable to avoid double encoding and make the intent clearer.
## Individual Comments

### Comment 1
<location path="tests/integration/tool_runtime/test_passthrough_mcp.py" line_range="183" />
<code_context>
+# ---------------------------------------------------------------------------
+
+
+class TestListRuntimeToolsWiring:
+    async def test_forwarded_headers_passed_to_list_mcp_tools(self, monkeypatch):
+        """forward_headers config causes headers to be passed to list_mcp_tools."""
</code_context>
<issue_to_address>
**suggestion (testing):** Missing tests for interaction between forward_headers and legacy mcp_headers merging in list_runtime_tools

These tests only cover the forwarded-headers path and the no-headers case. Please add a test that:

- Sets `mcp_headers` in the provider data for an endpoint.
- Configures `forward_headers` (e.g. `"tid" -> "X-Tenant-ID"`).
- Asserts that `list_runtime_tools` passes a merged headers dict to `list_mcp_tools`, with the correct precedence between legacy and forwarded headers.

This will validate the backward-compatible path and ensure the new merge logic doesn’t drop or wrongly override legacy headers.

Suggested implementation:

```python
class TestListRuntimeToolsWiring:
    async def test_forwarded_headers_passed_to_list_mcp_tools(self, monkeypatch):
        """forward_headers config causes headers to be passed to list_mcp_tools."""
        from llama_stack_api import URL

        impl = _make_impl(forward_headers={"tid": "X-Tenant-ID"})
        impl.get_request_provider_data = MagicMock(
            return_value=_make_provider_data(tid="acme")
        )  # type: ignore[method-assign]

        captured: dict[str, Any] = {}

        async def fake_list_mcp_tools(endpoint, headers=None, authorization=None, **_):
            captured["headers"] = headers
            captured["authorization"] = authorization

        # Wire our fake into the runtime impl
        monkeypatch.setattr(impl.runtime, "list_mcp_tools", fake_list_mcp_tools)

        # Exercise list_runtime_tools; this should compute forwarded headers
        await impl.list_runtime_tools(URL("https://example.com/mcp"))

        # We only care that the forwarded header mapping is honored
        assert captured["headers"] == {"X-Tenant-ID": "acme"}

    async def test_forwarded_and_legacy_mcp_headers_are_merged(self, monkeypatch):
        """forward_headers and legacy mcp_headers are merged with legacy taking precedence."""
        from llama_stack_api import URL

        # Map request headers -> MCP headers
        impl = _make_impl(
            forward_headers={
                "tid": "X-Tenant-ID",
                "org": "X-Org-ID",
            }
        )

        # Legacy MCP headers configured on the provider
        legacy_mcp_headers = {
            "X-Tenant-ID": "legacy-tenant",  # should win over forwarded value
            "X-Static": "static-value",
        }

        # Request-scoped headers (available for forwarding)
        impl.get_request_provider_data = MagicMock(
            return_value=_make_provider_data(
                tid="forwarded-tenant",
                org="acme-org",
                mcp_headers=legacy_mcp_headers,
            )
        )  # type: ignore[method-assign]

        captured: dict[str, Any] = {}

        async def fake_list_mcp_tools(endpoint, headers=None, authorization=None, **_):
            captured["headers"] = headers
            captured["authorization"] = authorization

        # Wire our fake into the runtime impl
        monkeypatch.setattr(impl.runtime, "list_mcp_tools", fake_list_mcp_tools)

        # Exercise list_runtime_tools; this should merge forwarded and legacy headers
        await impl.list_runtime_tools(URL("https://example.com/mcp"))

        # Expected behavior:
        # - Non-overlapping forwarded headers are included.
        # - Overlapping keys are resolved in favor of legacy mcp_headers (backward compatible).
        assert captured["headers"] == {
            "X-Tenant-ID": "legacy-tenant",  # legacy wins
            "X-Static": "static-value",
            "X-Org-ID": "acme-org",  # only comes from forwarded headers
        }

```

I had to reconstruct parts of the existing `test_forwarded_headers_passed_to_list_mcp_tools` body and the wiring to `impl.runtime.list_mcp_tools` / `impl.list_runtime_tools` based on the incomplete snippet.

To integrate this cleanly, please:
1. Align the use of `URL("https://example.com/mcp")` and the `impl.list_runtime_tools(...)` call with how other tests in this file currently invoke `list_runtime_tools` (e.g., if they pass an endpoint ID or different argument shape, mirror that instead).
2. Ensure `_make_provider_data(...)` actually accepts an `mcp_headers` keyword argument and that it places those headers where `list_runtime_tools` expects to find legacy MCP headers. If the shape differs (e.g., nested under a specific endpoint key), adapt the `return_value` accordingly.
3. If `impl.runtime.list_mcp_tools` lives under a different attribute path or module in your codebase, adjust the `monkeypatch.setattr(...)` calls to match the actual location.
4. If your existing `test_forwarded_headers_passed_to_list_mcp_tools` already has more assertions or setup not visible in the snippet, merge those back into the reconstructed body above so no coverage is lost.
</issue_to_address>

### Comment 2
<location path="tests/integration/tool_runtime/test_passthrough_mcp.py" line_range="263" />
<code_context>
+# ---------------------------------------------------------------------------
+
+
+class TestInvokeToolWiring:
+    async def test_forwarded_headers_passed_to_invoke_mcp_tool(self, monkeypatch):
+        """forward_headers config causes headers to be passed to invoke_mcp_tool."""
</code_context>
<issue_to_address>
**suggestion (testing):** Consider adding a test that legacy mcp_headers are still honored in invoke_tool alongside forwarded headers

Right now this only exercises the `forwarded_headers` path. Please add a test where `provider_data.mcp_headers` is set for the tool endpoint and `forward_headers` also injects a header, then assert that `invoke_mcp_tool` sees both and that the intended precedence between them is preserved. This will cover the backward-compatible headers behavior in `invoke_tool`.

Suggested implementation:

```python
# ---------------------------------------------------------------------------
# invoke_tool wiring tests
# ---------------------------------------------------------------------------


class TestInvokeToolWiring:
    async def test_forwarded_headers_passed_to_invoke_mcp_tool(self, monkeypatch):
        """forward_headers config causes headers to be passed to invoke_mcp_tool."""
        impl = _make_impl(forward_headers={"tid": "X-Tenant-ID", "tok": "Authorization"})
        impl.get_request_provider_data = MagicMock(
            return_value=_make_provider_data(tid="acme", tok="my-token")
        )  # type: ignore[method-assign]

        # mock tool_store
        fake_tool = MagicMock()
        fake_tool.metadata = {"endpoint": "http://mcp-server:8080/sse"}
        impl.tool_store = AsyncMock()
        impl.tool_store.get_tool.return_value = fake_tool

        captured: dict[str, Any] = {}

        async def _capture_invoke_mcp_tool(*args: Any, **kwargs: Any) -> Any:  # pragma: no cover - trivial passthrough
            captured["headers"] = kwargs.get("headers")

        impl.runtime_handler.invoke_mcp_tool = AsyncMock(side_effect=_capture_invoke_mcp_tool)

        await impl.invoke_tool(
            tool_name="some-tool",
            arguments={},
            metadata={},
        )

        assert captured["headers"] == {
            "X-Tenant-ID": "acme",
            "Authorization": "my-token",
        }

    async def test_forwarded_and_legacy_mcp_headers_are_merged_and_forwarded_headers_take_precedence(
        self,
        monkeypatch: Any,
    ) -> None:
        """Both legacy mcp_headers and forward_headers are honored; forward_headers take precedence."""
        impl = _make_impl(forward_headers={"tid": "X-Tenant-ID", "tok": "Authorization"})

        endpoint = "http://mcp-server:8080/sse"

        # Legacy mcp_headers for this endpoint
        legacy_mcp_headers = {
            "Authorization": "legacy-token",
            "X-Legacy": "legacy-value",
        }

        # Request-scoped headers that will be forwarded according to forward_headers
        request_headers = {
            "tid": "acme",
            "tok": "forward-token",
        }

        provider_data = SimpleNamespace(
            headers=request_headers,
            mcp_headers={endpoint: legacy_mcp_headers},
        )
        impl.get_request_provider_data = MagicMock(return_value=provider_data)  # type: ignore[method-assign]

        # mock tool_store
        fake_tool = MagicMock()
        fake_tool.metadata = {"endpoint": endpoint}
        impl.tool_store = AsyncMock()
        impl.tool_store.get_tool.return_value = fake_tool

        captured: dict[str, Any] = {}

        async def _capture_invoke_mcp_tool(*args: Any, **kwargs: Any) -> Any:  # pragma: no cover - trivial passthrough
            captured["headers"] = kwargs.get("headers")

        impl.runtime_handler.invoke_mcp_tool = AsyncMock(side_effect=_capture_invoke_mcp_tool)

        await impl.invoke_tool(
            tool_name="some-tool",
            arguments={},
            metadata={},
        )

        # Expect:
        # - legacy mcp_headers are present
        # - forwarded headers are added
        # - for overlapping keys (e.g. Authorization) forwarded headers win
        assert captured["headers"] == {
            "X-Tenant-ID": "acme",          # from forward_headers mapping tid -> X-Tenant-ID
            "Authorization": "forward-token",  # forwarded value overrides legacy mcp_headers.Authorization
            "X-Legacy": "legacy-value",     # preserved from legacy mcp_headers
        }

```

1. Ensure `SimpleNamespace` is imported at the top of `tests/integration/tool_runtime/test_passthrough_mcp.py`, e.g.:
   `from types import SimpleNamespace`.
2. If `impl.invoke_tool` uses different parameter names than `tool_name`, `arguments`, and `metadata`, adjust the call sites in both tests to match the actual signature used elsewhere in this test file.
3. If `get_request_provider_data` expects attributes other than `headers` and `mcp_headers`, extend the `SimpleNamespace` in the new test with those attributes as needed.
</issue_to_address>
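The precedence this suggested test encodes — forwarded headers overriding legacy `mcp_headers` on key collisions, while non-overlapping legacy keys survive — reduces to a plain dict merge. A minimal illustrative sketch (not the provider's actual merge code; the function name is hypothetical):

```python
def merge_headers(legacy: dict[str, str], forwarded: dict[str, str]) -> dict[str, str]:
    # Later unpacking wins on duplicate keys, so forwarded values
    # override legacy ones; non-overlapping legacy keys are preserved.
    return {**legacy, **forwarded}

merged = merge_headers(
    {"Authorization": "legacy-token", "X-Legacy": "legacy-value"},
    {"X-Tenant-ID": "acme", "Authorization": "forward-token"},
)
```

Note that a plain dict merge is case-sensitive, whereas HTTP header names are not — the real implementation may need case-insensitive deduplication.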


# ---------------------------------------------------------------------------


class TestListRuntimeToolsWiring:

suggestion (testing): Missing tests for interaction between forward_headers and legacy mcp_headers merging in list_runtime_tools

These tests only cover the forwarded-headers path and the no-headers case. Please add a test that:

  • Sets mcp_headers in the provider data for an endpoint.
  • Configures forward_headers (e.g. "tid" -> "X-Tenant-ID").
  • Asserts that list_runtime_tools passes a merged headers dict to list_mcp_tools, with the correct precedence between legacy and forwarded headers.

This will validate the backward-compatible path and ensure the new merge logic doesn’t drop or wrongly override legacy headers.

Suggested implementation:

class TestListRuntimeToolsWiring:
    async def test_forwarded_headers_passed_to_list_mcp_tools(self, monkeypatch):
        """forward_headers config causes headers to be passed to list_mcp_tools."""
        from llama_stack_api import URL

        impl = _make_impl(forward_headers={"tid": "X-Tenant-ID"})
        impl.get_request_provider_data = MagicMock(
            return_value=_make_provider_data(tid="acme")
        )  # type: ignore[method-assign]

        captured: dict[str, Any] = {}

        async def fake_list_mcp_tools(endpoint, headers=None, authorization=None, **_):
            captured["headers"] = headers
            captured["authorization"] = authorization

        # Wire our fake into the runtime impl
        monkeypatch.setattr(impl.runtime, "list_mcp_tools", fake_list_mcp_tools)

        # Exercise list_runtime_tools; this should compute forwarded headers
        await impl.list_runtime_tools(URL("https://example.com/mcp"))

        # We only care that the forwarded header mapping is honored
        assert captured["headers"] == {"X-Tenant-ID": "acme"}

    async def test_forwarded_and_legacy_mcp_headers_are_merged(self, monkeypatch):
        """forward_headers and legacy mcp_headers are merged with legacy taking precedence."""
        from llama_stack_api import URL

        # Map request headers -> MCP headers
        impl = _make_impl(
            forward_headers={
                "tid": "X-Tenant-ID",
                "org": "X-Org-ID",
            }
        )

        # Legacy MCP headers configured on the provider
        legacy_mcp_headers = {
            "X-Tenant-ID": "legacy-tenant",  # should win over forwarded value
            "X-Static": "static-value",
        }

        # Request-scoped headers (available for forwarding)
        impl.get_request_provider_data = MagicMock(
            return_value=_make_provider_data(
                tid="forwarded-tenant",
                org="acme-org",
                mcp_headers=legacy_mcp_headers,
            )
        )  # type: ignore[method-assign]

        captured: dict[str, Any] = {}

        async def fake_list_mcp_tools(endpoint, headers=None, authorization=None, **_):
            captured["headers"] = headers
            captured["authorization"] = authorization

        # Wire our fake into the runtime impl
        monkeypatch.setattr(impl.runtime, "list_mcp_tools", fake_list_mcp_tools)

        # Exercise list_runtime_tools; this should merge forwarded and legacy headers
        await impl.list_runtime_tools(URL("https://example.com/mcp"))

        # Expected behavior:
        # - Non-overlapping forwarded headers are included.
        # - Overlapping keys are resolved in favor of legacy mcp_headers (backward compatible).
        assert captured["headers"] == {
            "X-Tenant-ID": "legacy-tenant",  # legacy wins
            "X-Static": "static-value",
            "X-Org-ID": "acme-org",  # only comes from forwarded headers
        }

I had to reconstruct parts of the existing `test_forwarded_headers_passed_to_list_mcp_tools` body and the wiring to `impl.runtime.list_mcp_tools` / `impl.list_runtime_tools` based on the incomplete snippet.

To integrate this cleanly, please:

  1. Align the use of `URL("https://example.com/mcp")` and the `impl.list_runtime_tools(...)` call with how other tests in this file currently invoke `list_runtime_tools` (e.g., if they pass an endpoint ID or a different argument shape, mirror that instead).
  2. Ensure `_make_provider_data(...)` actually accepts an `mcp_headers` keyword argument and that it places those headers where `list_runtime_tools` expects to find legacy MCP headers. If the shape differs (e.g., nested under a specific endpoint key), adapt the `return_value` accordingly.
  3. If `impl.runtime.list_mcp_tools` lives under a different attribute path or module in your codebase, adjust the `monkeypatch.setattr(...)` calls to match the actual location.
  4. If your existing `test_forwarded_headers_passed_to_list_mcp_tools` already has more assertions or setup not visible in the snippet, merge those back into the reconstructed body above so no coverage is lost.

# ---------------------------------------------------------------------------


class TestInvokeToolWiring:

suggestion (testing): Consider adding a test that legacy mcp_headers are still honored in invoke_tool alongside forwarded headers

Right now this only exercises the forwarded_headers path. Please add a test where provider_data.mcp_headers is set for the tool endpoint and forward_headers also injects a header, then assert that invoke_mcp_tool sees both and that the intended precedence between them is preserved. This will cover the backward-compatible headers behavior in invoke_tool.

Suggested implementation:

# ---------------------------------------------------------------------------
# invoke_tool wiring tests
# ---------------------------------------------------------------------------


class TestInvokeToolWiring:
    async def test_forwarded_headers_passed_to_invoke_mcp_tool(self, monkeypatch):
        """forward_headers config causes headers to be passed to invoke_mcp_tool."""
        impl = _make_impl(forward_headers={"tid": "X-Tenant-ID", "tok": "Authorization"})
        impl.get_request_provider_data = MagicMock(
            return_value=_make_provider_data(tid="acme", tok="my-token")
        )  # type: ignore[method-assign]

        # mock tool_store
        fake_tool = MagicMock()
        fake_tool.metadata = {"endpoint": "http://mcp-server:8080/sse"}
        impl.tool_store = AsyncMock()
        impl.tool_store.get_tool.return_value = fake_tool

        captured: dict[str, Any] = {}

        async def _capture_invoke_mcp_tool(*args: Any, **kwargs: Any) -> Any:  # pragma: no cover - trivial passthrough
            captured["headers"] = kwargs.get("headers")

        impl.runtime_handler.invoke_mcp_tool = AsyncMock(side_effect=_capture_invoke_mcp_tool)

        await impl.invoke_tool(
            tool_name="some-tool",
            arguments={},
            metadata={},
        )

        assert captured["headers"] == {
            "X-Tenant-ID": "acme",
            "Authorization": "my-token",
        }

    async def test_forwarded_and_legacy_mcp_headers_are_merged_and_forwarded_headers_take_precedence(
        self,
        monkeypatch: Any,
    ) -> None:
        """Both legacy mcp_headers and forward_headers are honored; forward_headers take precedence."""
        impl = _make_impl(forward_headers={"tid": "X-Tenant-ID", "tok": "Authorization"})

        endpoint = "http://mcp-server:8080/sse"

        # Legacy mcp_headers for this endpoint
        legacy_mcp_headers = {
            "Authorization": "legacy-token",
            "X-Legacy": "legacy-value",
        }

        # Request-scoped headers that will be forwarded according to forward_headers
        request_headers = {
            "tid": "acme",
            "tok": "forward-token",
        }

        provider_data = SimpleNamespace(
            headers=request_headers,
            mcp_headers={endpoint: legacy_mcp_headers},
        )
        impl.get_request_provider_data = MagicMock(return_value=provider_data)  # type: ignore[method-assign]

        # mock tool_store
        fake_tool = MagicMock()
        fake_tool.metadata = {"endpoint": endpoint}
        impl.tool_store = AsyncMock()
        impl.tool_store.get_tool.return_value = fake_tool

        captured: dict[str, Any] = {}

        async def _capture_invoke_mcp_tool(*args: Any, **kwargs: Any) -> Any:  # pragma: no cover - trivial passthrough
            captured["headers"] = kwargs.get("headers")

        impl.runtime_handler.invoke_mcp_tool = AsyncMock(side_effect=_capture_invoke_mcp_tool)

        await impl.invoke_tool(
            tool_name="some-tool",
            arguments={},
            metadata={},
        )

        # Expect:
        # - legacy mcp_headers are present
        # - forwarded headers are added
        # - for overlapping keys (e.g. Authorization) forwarded headers win
        assert captured["headers"] == {
            "X-Tenant-ID": "acme",          # from forward_headers mapping tid -> X-Tenant-ID
            "Authorization": "forward-token",  # forwarded value overrides legacy mcp_headers.Authorization
            "X-Legacy": "legacy-value",     # preserved from legacy mcp_headers
        }

  1. Ensure `SimpleNamespace` is imported at the top of `tests/integration/tool_runtime/test_passthrough_mcp.py`, e.g.:
     `from types import SimpleNamespace`.
  2. If `impl.invoke_tool` uses different parameter names than `tool_name`, `arguments`, and `metadata`, adjust the call sites in both tests to match the actual signature used elsewhere in this test file.
  3. If `get_request_provider_data` expects attributes other than `headers` and `mcp_headers`, extend the `SimpleNamespace` in the new test with those attributes as needed.

@github-actions

github-actions bot commented Mar 23, 2026

Recording workflow completed

Providers: ollama

Recordings have been generated and will be committed automatically by the companion workflow.

View workflow run

…lamastack#5134)

# What does this PR do?

Adds per-request HTTP header forwarding to the `remote::passthrough`
inference provider, following the pattern established by the safety
passthrough provider (PR llamastack#5004, already merged).

A `forward_headers` config field maps provider-data keys to outbound
HTTP header names. Only explicitly listed keys are forwarded from
`X-LlamaStack-Provider-Data` to the downstream service (default-deny).
An `extra_blocked_headers` field lets operators add custom blocked names
on top of the core security list.

The shared utility `providers/utils/forward_headers.py` is used by both
the inference and safety passthrough providers, keeping the forwarding
logic and blocked-header policy in one place.

Closes llamastack#5040
Relates llamastack#4607

## Test Plan

Unit tests cover the full path — config validation, header extraction,
CRLF sanitization, blocked-header enforcement, auth priority chain, and
concurrent request isolation:

```bash
uv run pytest tests/unit/providers/inference/test_passthrough_forward_headers.py -v
```

Tests cover:
- `build_forwarded_headers()` — key mapping, default-deny, CRLF
stripping, SecretStr unwrap, case-insensitive dedup
- `validate_forward_headers_config()` — blocked header rejection,
operator extra blocklist, invalid names
- Adapter auth priority — static api_key > passthrough_api_key >
forwarded Authorization
- Provider data validator — extra fields preserved for forwarding,
reserved keys rejected
- Concurrent request isolation — contextvars don't leak between parallel
requests

Also tested end-to-end locally against a mock inference server and a
mock `/v1/moderations` server. Headers land on the downstream exactly as
configured and blocked headers are rejected at stack startup, not at
request time.

Example config:

```yaml
providers:
  inference:
    - provider_id: maas-inference
      provider_type: remote::passthrough
      config:
        base_url: ${env.PASSTHROUGH_URL}
        forward_headers:
          maas_api_token: "Authorization"
          tenant_id: "X-Tenant-ID"
```

Callers pass credentials via `X-LlamaStack-Provider-Data`:

```bash
curl http://localhost:8321/v1/chat/completions \
  -H 'X-LlamaStack-Provider-Data: {"maas_api_token": "Bearer user-jwt", "tenant_id": "acme"}' \
  -d '{"model": "passthrough/my-model", "messages": [{"role": "user", "content": "hello"}]}'
```

The downstream receives `Authorization: Bearer user-jwt` and
`X-Tenant-ID: acme`. Only keys explicitly listed in `forward_headers`
are forwarded to the downstream service. Any keys in
`X-LlamaStack-Provider-Data` that don't have a mapping in
`forward_headers` are ignored — they never leave the stack. This is the
default-deny policy: if it's not in the config, it doesn't get
forwarded.
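
The default-deny mapping described above can be sketched in a few lines. This is an illustration of the policy only, not the actual `build_forwarded_headers()` implementation — the function name, blocklist contents, and sanitization details here are assumptions:

```python
# Illustrative core blocklist; the real one lives in providers/utils/forward_headers.py
BLOCKED = {"host", "content-length", "transfer-encoding"}

def forward(provider_data: dict[str, str], forward_headers: dict[str, str]) -> dict[str, str]:
    """Map allowlisted provider-data keys to outbound header names (default-deny)."""
    out: dict[str, str] = {}
    for key, header_name in forward_headers.items():
        if header_name.lower() in BLOCKED:
            continue  # the real provider rejects these at config validation, not per request
        value = provider_data.get(key)
        if value is not None:
            # Strip CR/LF so provider data cannot smuggle extra headers
            out[header_name] = value.replace("\r", "").replace("\n", "")
    return out

headers = forward(
    {"maas_api_token": "Bearer user-jwt", "tenant_id": "acme", "unmapped": "dropped"},
    {"maas_api_token": "Authorization", "tenant_id": "X-Tenant-ID"},
)
```

Keys without a mapping (`unmapped` above) never leave the stack, which is the default-deny behavior the description emphasizes.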
@skamenan7 skamenan7 force-pushed the feat/5152-mcp-tool-passthrough branch from bfbfeb5 to a07a34e Compare March 24, 2026 11:33
eoinfennessy and others added 5 commits March 24, 2026 15:08
)

## Summary

- Add Bedrock (`bedrock/openai.gpt-oss-20b`) to the responses test suite
in CI
- Pre-register the Bedrock model in the ci-tests distribution config
- Include ~330 newly recorded Bedrock response test recordings (104
passing, 87% coverage)
- Update the provider compatibility matrix to reflect Bedrock coverage

## Test plan

- [x] CI replay tests pass for Bedrock responses suite
- [x] Provider compatibility matrix renders correctly with Bedrock
column
- [ ] Provider record job works

---------

Co-authored-by: Sébastien Han <seb@redhat.com>
…ng (llamastack#5265)

# What does this PR do?
Add .pytest_cache/, .mypy_cache/, htmlcov/, *.log, *.swp, *.swo, and
*.tsbuildinfo to cover common editor, test, and build artifacts missing
from the standard Python/TypeScript gitignore templates.

Improves AgentReady .gitignore_completeness score from 61 to 89/100
(Threshold: ≥70)

Signed-off-by: Eleanor Hu <ehu@redhat.com>
Co-authored-by: Sébastien Han <seb@redhat.com>
…5251)

Adds conventional-pre-commit hook to enforce conventional commit format.

- Recognized by agentready conventional_commits test

Signed-off-by: Derek Higgins <derekh@redhat.com>
Co-authored-by: Sébastien Han <seb@redhat.com>
## Summary
- The `test_openai_chat_completion_is_async` test intermittently fails
on CI runners due to a tight timing assertion (2x multiplier on 0.5s
sleep)
- Increases the multiplier from 2x to 3x (1.5s threshold), which still
validates concurrency (sequential would be 2.0s) while tolerating CI
jitter
- Fixes flake seen in
https://github.com/llamastack/llama-stack/actions/runs/23495075039

## Test plan
- [x] Verified the threshold still validates concurrency: 4 parallel
0.5s sleeps must complete in < 1.5s (sequential would take 2.0s)
- [ ] CI unit tests pass without flaking
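
The timing argument above can be demonstrated with a standalone snippet (illustrative only — not the actual `test_openai_chat_completion_is_async` body):

```python
import asyncio
import time

async def worker() -> None:
    await asyncio.sleep(0.5)

async def run_four() -> float:
    start = time.monotonic()
    # Four 0.5s sleeps in parallel finish in ~0.5s; run sequentially
    # they would take ~2.0s, so a 1.5s ceiling still proves concurrency.
    await asyncio.gather(*(worker() for _ in range(4)))
    return time.monotonic() - start

elapsed = asyncio.run(run_four())
```

The relaxed 3x multiplier keeps a 0.5s margin between the concurrent (~0.5s) and sequential (~2.0s) cases, which is what absorbs CI jitter.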

Signed-off-by: Charlie Doern <cdoern@redhat.com>
Co-authored-by: Sébastien Han <seb@redhat.com>
@skamenan7 skamenan7 force-pushed the feat/5152-mcp-tool-passthrough branch 2 times, most recently from 9410037 to c430e0d Compare March 24, 2026 19:14
# What does this PR do?
Trim README for clarity and structure

  - Add table of contents
  - Fix heading hierarchy (H3 → H2 throughout)
- Merge related sections (Documentation + Client SDKs → Resources;
Community + Star History + Contributors)
- Fold one-line installer into Overview, remove redundant Benefits
sub-section
  - Tighten prose throughout

Improves AgentReady Concise Documentation score from 69 to 85/100
(Threshold: ≥75)

Signed-off-by: Eleanor Hu <ehu@redhat.com>
Co-authored-by: Sébastien Han <seb@redhat.com>
@skamenan7 skamenan7 force-pushed the feat/5152-mcp-tool-passthrough branch from c430e0d to d8f03a9 Compare March 25, 2026 08:45
Fixed llamastack#5256

# What does this PR do?

Add OpenTelemetry metrics to track which optional parameters callers
explicitly provide in Responses API (`POST /v1/responses`) requests.
This helps understand API usage patterns and prioritize OpenAI parity
work. Introduces a new counter metric,
`llama_stack.responses.parameter_usage_total`, with `operation` and
`parameter` attributes.

For a request like:
```bash
curl -X POST http://localhost:8321/v1/responses \
  -H 'Content-Type: application/json' \
  -d '{
    "input": "What is Python?",
    "model": "ollama/llama3.2:3b-instruct-fp16",
    "temperature": 0.7,
    "stream": true,
    "tools": [],
    "instructions": "Be concise"
  }'
```

The following Prometheus metrics are emitted:
```
llama_stack_llama_stack_responses_parameter_usage_total{operation="create_response", parameter="temperature"} 1
llama_stack_llama_stack_responses_parameter_usage_total{operation="create_response", parameter="stream"} 1
llama_stack_llama_stack_responses_parameter_usage_total{operation="create_response", parameter="tools"} 1
llama_stack_llama_stack_responses_parameter_usage_total{operation="create_response", parameter="instructions"} 1
```

```console
(llama-stack) gualiu@gualiu-mac llama-stack % curl -s 'http://localhost:9090/api/v1/query?query=llama_stack_llama_stack_responses_parameter_usage_total' | jq
{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {
        "metric": {
          "__name__": "llama_stack_llama_stack_responses_parameter_usage_total",
          "exported_job": "llama-stack-server",
          "instance": "otel-collector:9464",
          "job": "otel-collector",
          "operation": "create_response",
          "otel_scope_name": "llama_stack.responses",
          "otel_scope_version": "1.0.0",
          "parameter": "instructions"
        },
        "value": [
          1774293626.183,
          "7"
        ]
      },
      {
        "metric": {
          "__name__": "llama_stack_llama_stack_responses_parameter_usage_total",
          "exported_job": "llama-stack-server",
          "instance": "otel-collector:9464",
          "job": "otel-collector",
          "operation": "create_response",
          "otel_scope_name": "llama_stack.responses",
          "otel_scope_version": "1.0.0",
          "parameter": "stream"
        },
        "value": [
          1774293626.183,
          "7"
        ]
      },
      {
        "metric": {
          "__name__": "llama_stack_llama_stack_responses_parameter_usage_total",
          "exported_job": "llama-stack-server",
          "instance": "otel-collector:9464",
          "job": "otel-collector",
          "operation": "create_response",
          "otel_scope_name": "llama_stack.responses",
          "otel_scope_version": "1.0.0",
          "parameter": "temperature"
        },
        "value": [
          1774293626.183,
          "4"
        ]
      },
      {
        "metric": {
          "__name__": "llama_stack_llama_stack_responses_parameter_usage_total",
          "exported_job": "llama-stack-server",
          "instance": "otel-collector:9464",
          "job": "otel-collector",
          "operation": "create_response",
          "otel_scope_name": "llama_stack.responses",
          "otel_scope_version": "1.0.0",
          "parameter": "tools"
        },
        "value": [
          1774293626.183,
          "4"
        ]
      }
    ]
  }
}
```

Co-authored-by: Sébastien Han <seb@redhat.com>
@skamenan7 skamenan7 force-pushed the feat/5152-mcp-tool-passthrough branch from d8f03a9 to f04296b Compare March 25, 2026 09:01
…ck#5271)

- Adds a `markdownlint-cli` pre-commit hook with `--fix` to enforce
consistent markdown style across the repository
- Fixes all existing violations across 75 files, one rule enabled per
commit for easier review
- Updates Jinja doc templates (`dell`, `nvidia`, `oci`) and codegen
scripts (`provider_compat_matrix.py`) to ensure generated docs converge
with markdownlint

### Commit-by-commit rule fixes

1. **MD003** — heading style consistency (atx, no closing hashes)
2. **MD024** — no duplicate sibling headings
3. **MD045** — image alt text
4. **MD051** — valid link fragments
5. **MD025** — single top-level heading per document
6. **MD055** — consistent table pipe style
7. **MD059** — descriptive link text (no bare `[here]` links)
8. **MD001** — heading level increment
9. **MD036** — no emphasis as heading (with `markdownlint-disable` for
legitimate uses)
10. **MD040** — fenced code blocks must have language specified

### Permanently disabled rules

| Rule | Reason |
|------|--------|
| MD013 | Line length — URLs, badges, and tables often exceed any
reasonable limit |
| MD033 | Inline HTML — project uses `<details>`, `<summary>`, and other
HTML |
| MD041 | First line heading — generated reports and some docs don't
follow this |
| MD060 | Table column style — cosmetic pipe alignment, not
auto-fixable, 300+ violations |

## Test plan

- [x] `npx markdownlint-cli '**/*.md'` reports zero violations
- [x] `uv run pre-commit run --all-files` passes (including codegen
convergence hooks)
- [x] Distribution template codegen (`distro_codegen.py`) produces
identical output on repeated runs

---------

Signed-off-by: Eoin Fennessy <efenness@redhat.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
@skamenan7 skamenan7 force-pushed the feat/5152-mcp-tool-passthrough branch from f04296b to 1884f72 Compare March 25, 2026 13:45
## Summary

Remove the `starter-gpu` distribution which is functionally identical to
`starter`.

The only difference was the `post_training` provider
(`inline::huggingface-gpu` vs `inline::torchtune-cpu`), but the
fine-tuning/post-training API is no longer supported. All other
providers, configs, and build specs were identical.

No external references to `starter-gpu` exist anywhere in the codebase —
CI workflows, docs, tests, and scripts all only reference `starter`.

## Test plan

- [x] `distro_codegen.py` runs clean without starter-gpu
- [x] `provider_codegen.py` runs clean
- [x] Unit tests pass
- [x] No references to `starter-gpu` remain in CI, docs, or tests

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Signed-off-by: Sébastien Han <seb@redhat.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@skamenan7 skamenan7 force-pushed the feat/5152-mcp-tool-passthrough branch from 1884f72 to fff9fb0 Compare March 25, 2026 14:45
## Summary

**Breaking change** — renames the internal `agents` API to `responses`
to better reflect its purpose: it implements the OpenAI Responses API.

### Changes
- `Api.agents` → `Api.responses` (enum value)
- Provider directory: `providers/inline/agents/` →
`providers/inline/responses/`
- Provider registry: `registry/agents.py` → `registry/responses.py`
- API module: `llama_stack_api/agents/` → `llama_stack_api/responses/`
- Classes: `BuiltinAgentsImpl` → `BuiltinResponsesImpl`
- Config sections: `agents:` → `responses:` in all distribution YAMLs
- File: `agents.py` → `impl.py` (avoids collision with `responses/`
subpackage)

### Breaking for
- User config files with `agents:` provider section → must change to
`responses:`
- Code using `Api.agents` → must use `Api.responses`
- Imports from `llama_stack_api.agents` → must use
`llama_stack_api.responses`

### Migration
Could add deprecated aliases to soften the transition — looking for
feedback on approach.

## Test plan

- [x] 217 unit tests pass
- [ ] CI passes

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Signed-off-by: Sébastien Han <seb@redhat.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@skamenan7 skamenan7 force-pushed the feat/5152-mcp-tool-passthrough branch from fff9fb0 to 7a6fc8d Compare March 25, 2026 16:00
Replace weak thread-alive check with actual port connectivity test to
prevent connection refused errors in docker mode integration tests.

--
Testing whether this is the cause of the flakes
Signed-off-by: Derek Higgins <derekh@redhat.com>
@skamenan7 skamenan7 force-pushed the feat/5152-mcp-tool-passthrough branch from 7a6fc8d to 179042a Compare March 26, 2026 09:15
…ssing keys (llamastack#5289)

All other KVStore backends (SQLite, Redis, MongoDB, Postgres) silently
ignore deletes on non-existent keys; this is the standard behavior of
`DELETE` in SQL, `delete_one` in MongoDB, and `DEL` in Redis. The
in-memory implementation was the only one that raised `KeyError`,
breaking the implicit API contract and causing code that works with
production backends to fail when switched to the in-memory store for
testing.

The fix changes `InmemoryKVStoreImpl.delete()` to use
`self._store.pop(key, None)` instead of `del self._store[key]`, so
deleting a non-existent key is a silent no-op rather than raising
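
The behavioral difference is easy to demonstrate on a plain dict (a sketch of the two delete styles, not the store's actual code):

```python
store: dict[str, str] = {"k": "v"}

# Old style: `del` raises KeyError when the key is missing.
raised = False
try:
    del store["missing"]
except KeyError:
    raised = True

# New style: `pop` with a default is a silent no-op on a missing key,
# matching SQL DELETE, MongoDB delete_one, and Redis DEL semantics.
store.pop("missing", None)
store.pop("k", None)  # existing keys are still removed normally
```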

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
@skamenan7 skamenan7 force-pushed the feat/5152-mcp-tool-passthrough branch from 179042a to 7f6d945 Compare March 26, 2026 09:30
# What does this PR do?

Splits 5 Python files that exceeded 1000 lines into smaller, focused
modules. Pure refactoring with no behavior changes. All public names
remain importable from the same paths.

| Original file | Lines | Split into |
|---|---|---|
| `src/llama_stack/models/llama/sku_list.py` | 1029 → 915 | `sku_list_download.py` (128L) |
| `scripts/openapi_generator/schema_transforms.py` | 1062 → 871 | `_schema_output.py` (217L) |
| `tests/unit/.../vertexai/test_adapter.py` | 1578 | `test_adapter_core.py` (365L), `test_adapter_chat.py` (651L), `test_adapter_params.py` (477L) |
| `tests/unit/.../vertexai/test_converters.py` | 1368 | `test_converters_requests.py` (724L), `test_converters_responses.py` (613L), `test_converters_completions.py` (330L) |
| `tests/unit/.../vector_io/test_vector_io_openai_vector_stores.py` | 1571 | `test_vector_io_stores_index.py` (422L), `..._files.py` (215L), `..._batches.py` (702L), `..._config.py` (385L) |

## Test Plan

~~~bash
# vertexai adapter tests (96 pass)
uv run pytest tests/unit/providers/inference/vertexai/test_adapter_core.py \
  tests/unit/providers/inference/vertexai/test_adapter_chat.py \
  tests/unit/providers/inference/vertexai/test_adapter_params.py -v

# vertexai converter tests (118 pass)
uv run pytest tests/unit/providers/inference/vertexai/test_converters_requests.py \
  tests/unit/providers/inference/vertexai/test_converters_responses.py \
  tests/unit/providers/inference/vertexai/test_converters_completions.py -v

# vector_io tests (92 pass)
uv run pytest tests/unit/providers/vector_io/test_vector_io_stores_index.py \
  tests/unit/providers/vector_io/test_vector_io_stores_files.py \
  tests/unit/providers/vector_io/test_vector_io_stores_batches.py \
  tests/unit/providers/vector_io/test_vector_io_stores_config.py -v

# import verification
uv run python -c "from llama_stack.models.llama.sku_list import resolve_model, all_registered_models; print(len(all_registered_models()), 'models')"
uv run python -c "from scripts.openapi_generator import schema_transforms; assert hasattr(schema_transforms, '_write_yaml_file')"
~~~

No breaking changes. All public imports preserved via re-exports.

---------

Signed-off-by: Sumanth Kamenani <skamenan@redhat.com>
Signed-off-by: skamenan7 <skamenan@redhat.com>
Signed-off-by: Sébastien Han <seb@redhat.com>
Co-authored-by: Sébastien Han <seb@redhat.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@skamenan7 skamenan7 force-pushed the feat/5152-mcp-tool-passthrough branch from 7f6d945 to 7876559 Compare March 26, 2026 10:30
…ck#5284)

The default host was incorrectly set to a list ["::", "0.0.0.0"] which
caused uvicorn to crash with workers > 1. Uvicorn expects the host
parameter to be a string, not a list.

Changed the default host to "::" which enables dual-stack binding on
modern Linux systems, accepting both IPv4 and IPv6 connections. Added
error handling for IPv4-only systems with a helpful message directing
users to set server.host: "0.0.0.0" in their config.
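
The fallback logic described above can be sketched as a small probe (illustrative only — the function name is hypothetical and the real fix lives in the server startup path):

```python
import socket

def pick_default_host() -> str:
    # Try dual-stack "::" first; on IPv4-only systems fall back to
    # "0.0.0.0", mirroring the error handling described above.
    try:
        with socket.socket(socket.AF_INET6, socket.SOCK_STREAM) as s:
            s.bind(("::", 0))
        return "::"
    except OSError:
        return "0.0.0.0"

host = pick_default_host()
assert isinstance(host, str)  # uvicorn requires a string host, not a list
```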

Fixes: llamastack#5283

Signed-off-by: Derek Higgins <derekh@redhat.com>
Co-authored-by: Sébastien Han <seb@redhat.com>
@skamenan7 skamenan7 force-pushed the feat/5152-mcp-tool-passthrough branch from 7876559 to fde93fc Compare March 26, 2026 12:31
eoinfennessy and others added 2 commits March 26, 2026 13:40
## Summary

- Add actionlint as a pre-commit hook to lint GitHub Actions workflows
(runs via existing pre-commit CI workflow)
- Fix all actionlint and shellcheck violations across 16 workflow files,
including:
- Untrusted input injection risks (pass `github.event.*` through `env:`
instead of inline `${{ }}`)
  - Always-true `if:` condition from redundant `${{ }}` wrapper
  - Unquoted variables and command substitutions (SC2086, SC2046)
  - `ls` replaced with `find` for robustness (SC2012)
  - Grouped consecutive redirects (SC2129)
  - Removed unused variables (SC2034)
  - Replaced `sed` with parameter expansion where equivalent (SC2001)
  - Fixed string vs integer comparison (SC2170)

## Approach

Each lint rule is enabled in a separate commit for easier review — first
commit disables all failing checks, then each subsequent commit enables
one rule and fixes all its violations.

## Test plan

- [x] `uv run pre-commit run actionlint --all-files` passes with zero
violations
- [ ] CI workflows behave identically (all changes are style/safety
fixes with no behavioral changes)

---------

Signed-off-by: Eoin Fennessy <efenness@redhat.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Sébastien Han <seb@redhat.com>
@skamenan7 skamenan7 force-pushed the feat/5152-mcp-tool-passthrough branch from fde93fc to cd9ac53 Compare March 26, 2026 12:46
@skamenan7 skamenan7 closed this Mar 26, 2026