feat(bedrock): add AWS SigV4 and STS web identity authentication#3

Closed
skamenan7 wants to merge 13 commits into main from
fix/4730-bedrock-sts-auth-v2

Conversation


@skamenan7 skamenan7 commented Mar 30, 2026

What does this PR do?

The Bedrock inference provider previously depended on a pre-signed bearer token (AWS_BEARER_TOKEN_BEDROCK). That works for short-lived manual setups, but it is a poor fit for environments that already rely on AWS identity, like Kubernetes/OpenShift with IRSA, GitHub Actions with OIDC, EC2, ECS, and Lambda.

This PR adds AWS SigV4 support for the Bedrock OpenAI-compatible endpoint using standard AWS credential sources. When no api_key is configured, requests are signed with SigV4 via botocore. Static credentials, profiles, IAM roles, and web identity are supported. When aws_role_arn is configured, the explicit assume-role and web-identity path uses RefreshableBotoSession to refresh temporary credentials automatically. Bearer token mode is unchanged: if api_key is set in config or passed via x-llamastack-provider-data, it takes precedence.
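The automatic-refresh behavior can be pictured with a small stand-in. This is a toy sketch, not the provider's actual API: the class and parameter names here are hypothetical, and in the real flow `fetch` would be a call to STS assume-role-with-web-identity via botocore's refreshable-credentials machinery.

```python
import time


class RefreshingCredentialCache:
    """Toy stand-in for the refresh pattern RefreshableBotoSession builds on.

    Hypothetical names; the real provider delegates this to botocore.
    """

    def __init__(self, fetch, ttl_seconds=3600, refresh_margin=300):
        self._fetch = fetch            # e.g. an assume-role call in the real flow
        self._ttl = ttl_seconds
        self._margin = refresh_margin  # refresh this many seconds before expiry
        self._creds = None
        self._expiry = 0.0             # monotonic deadline for the cached creds

    def get(self):
        now = time.monotonic()
        if self._creds is None or now >= self._expiry - self._margin:
            self._creds = self._fetch()     # re-fetch temporary credentials
            self._expiry = now + self._ttl  # schedule the next refresh
        return self._creds
```

Callers always go through `get()`, so expired temporary credentials are replaced transparently instead of failing a request.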

This also updates the endpoint URL from the legacy bedrock-mantle hostname to bedrock-runtime.<region>.amazonaws.com/openai/v1.

Closes llamastack#4730

Auth modes

Bearer token, unchanged:

providers:
  inference:
    - provider_type: remote::bedrock
      config:
        api_key: ${env.AWS_BEARER_TOKEN_BEDROCK}
        region_name: us-west-2

SigV4 via AWS credentials, new:

providers:
  inference:
    - provider_type: remote::bedrock
      config:
        region_name: us-west-2
        # api_key intentionally omitted

STS web identity / IRSA, new:

providers:
  inference:
    - provider_type: remote::bedrock
      config:
        region_name: us-west-2
        aws_role_arn: ${env.AWS_ROLE_ARN}
        aws_web_identity_token_file: ${env.AWS_WEB_IDENTITY_TOKEN_FILE}

Per-request bearer override still works on a SigV4-mode server. If x-llamastack-provider-data includes {"aws_bearer_token_bedrock": "<token>"}, that request uses the bearer path. Empty, whitespace-only, or null values fall back to SigV4.

Implementation notes

  • BedrockSigV4Auth is an httpx.Auth implementation that removes the OpenAI SDK placeholder Authorization header and replaces it with a SigV4 signature generated through botocore.
  • The signed httpx.AsyncClient is built once in initialize() and reused, so get_extra_client_params() can stay synchronous and the adapter does not create a new client per request.
  • asyncio.to_thread() is used so signing and credential resolution do not block the event loop.
  • asyncio.shield() is used around signing and shutdown cleanup so cancellation does not interrupt those operations halfway through.
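The pattern in the last two notes can be sketched as follows. The signer here is a toy HMAC stand-in for the blocking botocore SigV4 call; the function names are illustrative, not the adapter's:

```python
import asyncio
import hashlib
import hmac


def toy_sign(secret: bytes, payload: bytes) -> str:
    # Stand-in for the blocking botocore signing call.
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()


async def sign_off_loop(secret: bytes, payload: bytes) -> str:
    # to_thread keeps the blocking work off the event loop; shield keeps a
    # cancelled caller from aborting the signature midway through.
    return await asyncio.shield(asyncio.to_thread(toy_sign, secret, payload))
```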

Test plan

Unit tests:

uv run pytest \
  tests/unit/providers/inference/bedrock/ \
  tests/unit/providers/inference/test_bedrock_adapter.py \
  tests/unit/providers/inference/test_bedrock_config.py \
  tests/unit/providers/inference/test_bedrock_sts.py \
  tests/unit/providers/safety/test_bedrock_safety_adapter.py \
  -v --tb=short

Focused SigV4 unit tests:

uv run pytest tests/unit/providers/inference/bedrock/test_sigv4_auth.py -x --tb=short

Local live validation in SigV4 mode:

Server config with api_key intentionally omitted:

providers:
  inference:
    - provider_type: remote::bedrock
      config:
        region_name: us-west-2

Validated locally for:

  • non-streaming requests
  • streaming requests
  • bearer override on a SigV4-mode server
  • empty, whitespace, and null bearer values falling back to SigV4
  • malformed provider-data handling
  • invalid bearer rejection without leaking token data
  • concurrent mixed-auth request isolation

Summary by Sourcery

Add AWS SigV4-based authentication and STS web identity support to the Bedrock inference and safety providers, updating configuration, networking utilities, and error handling to support cloud-native credential flows while preserving legacy bearer token mode.

New Features:

  • Introduce SigV4 httpx.Auth implementation for Bedrock OpenAI-compatible endpoints, enabling authentication via standard AWS credential sources and STS web identity.
  • Add configurable AWS role, web identity, and session settings to Bedrock provider configs to support IRSA and other federated identity setups.
  • Expose per-request provider data handling improvements so Bedrock can switch between bearer token and SigV4 modes dynamically.

Enhancements:

  • Update Bedrock inference adapter to use the new bedrock-runtime endpoint URL and to reuse a shared SigV4 httpx client when operating in SigV4 mode.
  • Improve network client construction and reuse by fingerprinting network configuration to avoid unnecessary httpx client recreation.
  • Refine Bedrock authentication error handling to return sanitized InternalServerError responses without leaking provider-specific details.
  • Extend refreshable boto session utility to handle explicit STS assume-role-with-web-identity flows and optional static credentials.
  • Tighten type checking and config validation for Bedrock adapters and request provider data handling.

Documentation:

  • Document new Bedrock configuration options for AWS credentials, STS roles, and web identity, including sample configurations for the inference and safety providers.

Tests:

  • Add extensive unit tests for SigV4 authentication behavior, credential chains, async signing, and OpenAI SDK integration.
  • Add tests for new Bedrock config STS fields, Bedrock inference adapter error handling, and safety provider initialization with STS settings.
  • Update remote inference provider config tests to cover providers without API key requirements.


sourcery-ai bot commented Mar 30, 2026

Reviewer's Guide

Adds AWS SigV4 and STS web identity auth support to the Bedrock inference/safety providers, switches Bedrock to the new bedrock-runtime OpenAI-compatible endpoint, tightens request-provider-data validation, and introduces shared SigV4/http client utilities with extensive unit tests.

Sequence diagram for Bedrock SigV4 vs bearer authentication flow

sequenceDiagram
    actor ClientApp
    participant LlamaServer
    participant RequestProviderDataContext
    participant BedrockInferenceAdapter
    participant OpenAIMixinClient
    participant BedrockSigV4Auth
    participant AWSSTS
    participant AWSBedrockRuntime

    ClientApp->>LlamaServer: HTTP request with x_llamastack_provider_data
    LlamaServer->>RequestProviderDataContext: request_provider_data_context(headers, user)
    activate RequestProviderDataContext
    RequestProviderDataContext-->>LlamaServer: provider_data stored in contextvar
    deactivate RequestProviderDataContext

    LlamaServer->>BedrockInferenceAdapter: openai_chat_completion(params)
    activate BedrockInferenceAdapter

    BedrockInferenceAdapter->>BedrockInferenceAdapter: _should_use_sigv4()
    BedrockInferenceAdapter->>BedrockInferenceAdapter: _bedrock_config.has_bearer_token()
    alt Bearer token configured in config or provider data
        BedrockInferenceAdapter-->>BedrockInferenceAdapter: use_sigv4 = False
        BedrockInferenceAdapter->>OpenAIMixinClient: get_api_key() -> bearer token
        BedrockInferenceAdapter->>OpenAIMixinClient: get_extra_client_params() -> {}
    else Use SigV4 via AWS credentials
        BedrockInferenceAdapter-->>BedrockInferenceAdapter: use_sigv4 = True
        BedrockInferenceAdapter->>BedrockInferenceAdapter: initialize() builds _sigv4_http_client
        BedrockInferenceAdapter->>OpenAIMixinClient: get_api_key() -> placeholder NOTUSED
        BedrockInferenceAdapter->>OpenAIMixinClient: get_extra_client_params() -> http_client=_sigv4_http_client
        OpenAIMixinClient->>BedrockSigV4Auth: async_auth_flow(httpx_request)
        activate BedrockSigV4Auth
        BedrockSigV4Auth->>BedrockSigV4Auth: _get_credentials()
        alt Role assumption with web identity
            BedrockSigV4Auth->>AWSSTS: assume_role_with_web_identity
            AWSSTS-->>BedrockSigV4Auth: temporary credentials
        else Direct credentials or instance role
            BedrockSigV4Auth-->>BedrockSigV4Auth: load from env or profile
        end
        BedrockSigV4Auth->>BedrockSigV4Auth: _sign_request(request) using SigV4Auth
        BedrockSigV4Auth-->>OpenAIMixinClient: signed request
        deactivate BedrockSigV4Auth
    end

    OpenAIMixinClient->>AWSBedrockRuntime: HTTPS POST /openai/v1/chat/completions
    AWSBedrockRuntime-->>OpenAIMixinClient: response or auth error

    OpenAIMixinClient-->>BedrockInferenceAdapter: result or AuthenticationError
    alt AuthenticationError
        BedrockInferenceAdapter->>BedrockInferenceAdapter: _handle_auth_error(msg, error, use_sigv4)
        BedrockInferenceAdapter-->>LlamaServer: InternalServerError (generic message)
    else Success
        BedrockInferenceAdapter-->>LlamaServer: OpenAIChatCompletion or stream
    end

    LlamaServer-->>ClientApp: HTTP response

Updated class diagram for Bedrock config, adapter, and SigV4 utilities

classDiagram
    class RemoteInferenceProviderConfig

    class BedrockBaseConfig {
        +SecretStr aws_access_key_id
        +SecretStr aws_secret_access_key
        +SecretStr aws_session_token
        +str aws_role_arn
        +str aws_web_identity_token_file
        +str aws_role_session_name
        +str region_name
        +str profile_name
        +int total_max_attempts
        +str retry_mode
        +float connect_timeout
        +float read_timeout
        +int session_ttl
    }

    class BedrockConfig {
        +SecretStr auth_credential
        +str region_name
        +bool has_bearer_token()
        +dict sample_run_config(**kwargs)
    }

    class OpenAIMixin {
        +AsyncOpenAI client
        +dict get_extra_client_params()
        +str get_api_key()
    }

    class BedrockInferenceAdapter {
        +BedrockConfig config
        +str provider_data_api_key_field
        +httpx_AsyncClient _sigv4_http_client
        +BedrockConfig _bedrock_config
        +str get_base_url()
        +bool _should_use_sigv4()
        +httpx_AsyncClient _build_sigv4_http_client()
        +initialize()
        +shutdown()
        +str get_api_key()
        +dict get_extra_client_params()
        +Iterable~str~ list_provider_model_ids()
        +bool check_model_availability(model)
        +openai_chat_completion(params)
        +_handle_auth_error(error_msg, original_error, use_sigv4)
    }

    class RefreshableBotoSession {
        +str region_name
        +str aws_access_key_id
        +str aws_secret_access_key
        +str aws_session_token
        +str profile_name
        +str sts_arn
        +str web_identity_token_file
        +str session_name
        +int session_ttl
        -__get_session_credentials()
        +Session refreshable_session()
    }

    class BedrockSigV4Auth {
        +str _region
        +str _service
        +str _aws_access_key_id
        +str _aws_secret_access_key
        +str _aws_session_token
        +str _profile_name
        +str _aws_role_arn
        +str _aws_web_identity_token_file
        +str _aws_role_session_name
        +int _session_ttl
        -object _session
        -object _lock
        +_get_credentials()
        +_sign_request(request)
        +auth_flow(request)
        +async_auth_flow(request)
    }

    class NeedsRequestProviderData {
        +ProviderSpec __provider_spec__
        +get_request_provider_data()
    }

    class RequestProviderDataContext {
        +dict provider_data
        +__enter__()
        +__exit__(exc_type, exc_val, exc_tb)
    }

    RemoteInferenceProviderConfig <|-- BedrockBaseConfig
    BedrockBaseConfig <|-- BedrockConfig
    OpenAIMixin <|-- BedrockInferenceAdapter
    NeedsRequestProviderData <|-- BedrockInferenceAdapter

    BedrockInferenceAdapter --> BedrockConfig
    BedrockInferenceAdapter ..> BedrockSigV4Auth : uses
    BedrockSigV4Auth ..> RefreshableBotoSession : uses for STS
    BedrockInferenceAdapter ..> RequestProviderDataContext : uses via context

File-Level Changes

Change Details Files
Add SigV4-based auth path to the Bedrock inference adapter with runtime bearer-token override and updated endpoint URL.
  • Introduce lazy BedrockConfig type-checking and helper property on the adapter.
  • Change base URL to bedrock-runtime.<region>.amazonaws.com/openai/v1 with a default region fallback.
  • Implement _should_use_sigv4() to choose between bearer token and SigV4 per request, honoring per-request provider data overrides including empty/whitespace semantics.
  • Build and cache a shared httpx.AsyncClient using BedrockSigV4Auth in initialize(), and expose it via get_extra_client_params() only when SigV4 is active.
  • Return a placeholder api_key when in SigV4 mode to satisfy OpenAI client validation while letting SigV4 overwrite the Authorization header.
  • Override list_provider_model_ids() and check_model_availability() to reflect Bedrock’s lack of /v1/models support.
  • Add graceful shutdown of the shared SigV4 http client using asyncio.shield().
  • Refactor authentication-error handling into _handle_auth_error(), mapping both SigV4 and bearer failures into sanitized InternalServerError messages without leaking credential details, and adjust logging accordingly.
  • Log whether SigV4 is used on each chat-completions call.
src/llama_stack/providers/remote/inference/bedrock/bedrock.py
Extend Bedrock configuration to support AWS credential chain and STS web-identity, and re-alias api_key to a generic auth_credential field.
  • Make BedrockConfig inherit from shared BedrockBaseConfig instead of RemoteInferenceProviderConfig.
  • Rename underlying auth field to auth_credential with api_key alias and add has_bearer_token() helper that trims whitespace.
  • Change region_name to be optional with a default pulled from AWS_DEFAULT_REGION and us-east-2 fallback in sample config.
  • Expose aws_role_arn and aws_web_identity_token_file in sample_run_config and tests, and ensure they are read from environment variables.
  • Add tests to confirm STS-related fields are correctly populated and included in sample config output.
src/llama_stack/providers/remote/inference/bedrock/config.py
tests/unit/providers/inference/test_bedrock_config.py
docs/docs/providers/inference/remote_bedrock.mdx
src/llama_stack/providers/utils/bedrock/config.py
Enhance RefreshableBotoSession to support static credentials and web-identity role assumption, and wire it into Bedrock client creation.
  • Generalize RefreshableBotoSession constructor to accept explicit AWS credentials, web-identity token file, and role session name, using DEFAULT_SESSION_TTL.
  • Update internal session construction to pass through static or profile-based credentials and to use assume_role_with_web_identity when a web identity token file is configured, otherwise assume_role.
  • Use RefreshableBotoSession from BedrockSigV4Auth and create_bedrock_client to obtain refreshable sessions, including STS and web identity fields.
  • Adjust session TTL default consumption to respect DEFAULT_SESSION_TTL when unset.
src/llama_stack/providers/utils/bedrock/refreshable_boto_session.py
src/llama_stack/providers/utils/bedrock/client.py
Introduce BedrockSigV4Auth httpx.Auth implementation plus exhaustive unit tests for signing behavior, auth-mode selection, and error handling.
  • Implement BedrockSigV4Auth which resolves credentials via boto3 or RefreshableBotoSession, strips OpenAI’s placeholder Bearer header, and signs requests with SigV4Auth using the correct service name (bedrock) and stable headers only.
  • Ensure x-amz-security-token is included for STS/web-identity credentials, and that missing credentials raise a clear RuntimeError.
  • Support both sync and async auth flows, using asyncio.to_thread and asyncio.shield to keep credential resolution/signing off the event loop and cancellation-safe.
  • Add broad test coverage for SigV4 signing (including mock transports, host header/port behavior, STS tokens, credential refresh semantics, and replacement of Bearer headers).
  • Add focused tests around BedrockInferenceAdapter’s auth-mode decision logic, per-request overrides, placeholder api_key behavior, SigV4 client caching, and sanitized InternalServerError mapping.
  • Add OpenAI SDK integration tests that check base URL, service name, and Authorization header shape in SigV4 mode, plus STS token propagation.
  • Add STS-specific tests that validate SigV4Auth initialization, use of RefreshableBotoSession, and that the adapter passes STS/web-identity config fields through to the auth layer.
src/llama_stack/providers/utils/bedrock/sigv4_auth.py
tests/unit/providers/inference/bedrock/test_sigv4_auth.py
tests/unit/providers/inference/bedrock/test_openai_sdk_integration.py
tests/unit/providers/inference/test_bedrock_sts.py
tests/unit/providers/inference/test_bedrock_adapter.py
Improve reusable HTTP client/network configuration utilities to support caching and reuse, and update OpenAIMixin to use the new helpers.
  • Expose build_network_client_kwargs() (renamed from _build_network_client_kwargs) for reuse and adjust all call sites.
  • Introduce a stable fingerprint computation for NetworkConfig and store it on httpx clients so that _merge_network_config_into_client can detect when rebuilding is unnecessary.
  • Update client merge logic to propagate the network fingerprint through DefaultAsyncHttpxClient wrappers.
  • Switch OpenAIMixin.client() to use the new build_network_client_kwargs helper when constructing/merging network config.
src/llama_stack/providers/utils/inference/http_client.py
src/llama_stack/providers/utils/inference/openai_mixin.py
tests/unit/providers/utils/inference/test_remote_inference_provider_config.py
Harden request provider-data handling and mixin wiring to avoid malformed JSON and type errors.
  • Add type-checking import of ProviderSpec and a provider_spec attribute on NeedsRequestProviderData to make the contract explicit.
  • Tighten parse_request_provider_data() to reject non-JSON-object payloads (including null and non-dict JSON), logging an error and returning None.
  • Update RequestProviderDataContext to enforce provider_data must be a dict, logging and dropping invalid values.
  • Adjust request_provider_data_context signature to accept a User instead of auth_attributes and propagate it into the provider-data context.
src/llama_stack/core/request_headers.py
Extend Bedrock safety provider docs and tests to cover the new auth options and STS integration.
  • Document STS role, web-identity, and session-name fields for the safety provider config alongside existing credential-chain settings.
  • Add a safety adapter unit test to ensure initialize() constructs both bedrock-runtime and control-plane clients using the shared create_bedrock_client helper and passes through STS/web-identity config.
docs/docs/providers/safety/remote_bedrock.mdx
tests/unit/providers/safety/test_bedrock_safety_adapter.py
Tighten adapter factory typing and error reporting for remote Bedrock inference.
  • Change get_adapter_impl to raise TypeError when given a non-BedrockConfig config instead of asserting, improving error messages in misconfiguration cases.
src/llama_stack/providers/remote/inference/bedrock/__init__.py

Assessment against linked issues

Issue Objective Addressed Explanation
llamastack#4730 Implement STS-based authentication (via AWS SigV4 and standard AWS credential sources, including web identity/assume-role) for the AWS Bedrock inference provider so that it can operate without long-lived API keys.
llamastack#4730 Document and expose configuration options for using STS/SigV4 with the AWS Bedrock inference provider (including role ARN and web identity token file) so users can enable this auth mode.


@skamenan7 skamenan7 force-pushed the fix/4730-bedrock-sts-auth-v2 branch 2 times, most recently from 742fe2a to 4302e2f on March 30, 2026 at 22:15

github-actions bot commented Mar 30, 2026

Recording workflow completed

Providers: ollama

Recordings have been generated and will be committed automatically by the companion workflow.

View workflow run

gyliu513 and others added 2 commits March 31, 2026 08:09
…k#5374)

# What does this PR do?

Follow up for llamastack#5340 (review)

Change `rerank()` to use `_get_api_key_from_config_or_provider_data()` instead of `get_api_key()`, so API keys passed via the `x-llamastack-provider-data` header are honoured.

@leseb ^^
…tack#5357)

Splitting llamastack#5206 into two stages. CI integration tests and recordings are
causing issues when API Spec and core implementation changes are pushed
together.

**Stage 1** (this PR): Introduce API types required for reasoning
support in the Responses API:
 - `ReasoningItem`, `ReasoningContent`, `ReasoningSummary` output types
  - summary field on OpenAIResponseReasoning
  - An internal `AssistantMessageWithReasoning` type


**Stage 2**: After this PR is merged, will create a clean version of
llamastack#5206 with core implementation for reasoning propagation.

---------

Co-authored-by: Charlie Doern <cdoern@redhat.com>
@skamenan7 skamenan7 force-pushed the fix/4730-bedrock-sts-auth-v2 branch from 03048ec to 59a252b on March 31, 2026 at 11:31
gyliu513 and others added 11 commits March 31, 2026 12:08
Adds SigV4 request signing and STS web identity token authentication
to the Bedrock inference provider, enabling use of IAM roles and
federated identity without long-lived static credentials.

- add sigv4_auth.py with SigV4Signer for signing httpx requests
- add STS web identity token refresh via refreshable boto session
- support per-request bearer token override in SigV4 mode
- expose auth_method config field (sigv4, sts_web_identity, api_key)
- sanitize auth failures to avoid leaking credentials in error messages
- avoid calling get_request_provider_data() during initialize()
- fix httpx client leaks by managing client lifecycle properly
- export build_network_client_kwargs / network_config_fingerprint
- add comprehensive unit tests: sigv4 signing, STS, SDK integration

Signed-off-by: skamenan7 <skamenan@redhat.com>
- remove session_ttl from boto3.Session() args in client.py — Session
  does not accept this parameter; caused TypeError for static-key deployments
- add ConfigDict(extra='forbid') to BedrockProviderDataValidator
- remove type: ignore[attr-defined] from sigv4_auth.py — _session typed as Any
- remove redundant network field redeclaration from BedrockConfig
- delete stale tests/unit/providers/test_bedrock.py with wrong bedrock-mantle URLs
- async tests in TestAsyncAuthFlow now run — asyncio_mode=auto handles them
remove docstrings that restate the method name and reduce inline comments
to only the non-obvious parts: the signing name mismatch, the lazy boto3
import, the placeholder api_key trick, the asyncio.shield rationale, and
the per-request sigv4 bypass check.

Signed-off-by: skamenan7 <skamenan@redhat.com>
Keep the lazy optional-dependency boundary in the Bedrock adapter while making the botocore signing dependency explicit in sigv4_auth.py.

Signed-off-by: skamenan7 <skamenan@redhat.com>
Made-with: Cursor
- remove extra="forbid" from BedrockProviderDataValidator; the
  __authenticated_user key injected by request_headers.py would
  silently fail Pydantic validation and break per-request bearer
  token overrides for authenticated users

- consolidate the 3600 session TTL default into a single
  DEFAULT_SESSION_TTL constant in utils/bedrock/config.py and
  reference it from sigv4_auth.py, refreshable_boto_session.py,
  and client.py

- shield asyncio.to_thread(_sign_request) in async_auth_flow so
  a rolling-restart cancellation cannot abort mid-sign and leave
  the connection in an inconsistent auth state

Signed-off-by: skamenan7 <skamenan@redhat.com>
- remove dead else branch in _handle_auth_error; the if-branch always
  raises so the else was unreachable
- remove duplicate copyright/license header from test_bedrock_safety_adapter.py
- drop the TGIImplConfig import and parametrize row from
  test_remote_inference_provider_config.py; the import was broken
  (TGIImplConfig doesn't exist at that path) and the TGI case is
  unrelated to this PR

Signed-off-by: skamenan7 <skamenan@redhat.com>
- catch PermissionDeniedError (403) alongside AuthenticationError (401)
  in openai_chat_completion; SigV4 AccessDenied and SignatureDoesNotMatch
  from AWS come back as 403, not 401, and were bypassing the sanitized
  error path and leaking raw provider messages

- add boto3 to pip_packages in the inference Bedrock provider registry
  spec; sigv4_auth.py imports botocore at module level so provider-scoped
  installs without boto3 would fail at runtime (safety spec already had it)

- make aws_role_arn the first branch in create_bedrock_client so that
  static credentials plus a role ARN correctly triggers assume-role
  instead of silently ignoring the role (matches inference adapter behavior)

- fix broken import in test_network_config.py; _build_network_client_kwargs
  was renamed to build_network_client_kwargs (public) in this PR but the
  test file was not updated, causing an ImportError at collection time

Signed-off-by: skamenan7 <skamenan@redhat.com>
- update vllm.py to import build_network_client_kwargs (public name);
  the private _build_network_client_kwargs was renamed in http_client.py
  in this PR but the vllm provider was missed, breaking unit test collection

- regenerate distribution configs after adding boto3 to the bedrock
  inference provider pip_packages

Signed-off-by: skamenan7 <skamenan@redhat.com>
…errors

- client.py: the aws_role_arn branch was not passing botocore.Config so
  total_max_attempts, retry_mode, connect_timeout, and read_timeout were
  silently ignored for the safety provider when role assumption is used;
  factor the config build to the top of create_bedrock_client and pass it
  through all three branches (role, static creds, credential chain)

- bedrock.py: RuntimeError (missing AWS creds) and OSError (unreadable
  web identity token file) from SigV4 credential resolution were falling
  through to the generic except Exception branch and surfacing as raw
  exception messages; add an explicit catch that converts these to a
  sanitized InternalServerError when in SigV4 mode

Signed-off-by: skamenan7 <skamenan@redhat.com>
pass Session args explicitly instead of via **dict unpacking — mypy
was unable to narrow the dict[str, str | None] type after filtering
and flagged the **kwargs call as incompatible with Session's signature

Signed-off-by: skamenan7 <skamenan@redhat.com>
@skamenan7 skamenan7 force-pushed the fix/4730-bedrock-sts-auth-v2 branch from 879668c to 4822ec1 on March 31, 2026 at 12:16
@skamenan7 skamenan7 closed this Mar 31, 2026

Successfully merging this pull request may close these issues.

Add support for AWS Security Token Service (STS) for AWS Bedrock inference provider