feat(bedrock): add AWS SigV4 and STS web identity authentication #3
Closed
Reviewer's Guide
Adds AWS SigV4 and STS web identity auth support to the Bedrock inference/safety providers, switches Bedrock to the new bedrock-runtime OpenAI-compatible endpoint, tightens request-provider-data validation, and introduces shared SigV4/HTTP client utilities with extensive unit tests.
Sequence diagram for Bedrock SigV4 vs bearer authentication flow
sequenceDiagram
actor ClientApp
participant LlamaServer
participant RequestProviderDataContext
participant BedrockInferenceAdapter
participant OpenAIMixinClient
participant BedrockSigV4Auth
participant AWSSTS
participant AWSBedrockRuntime
ClientApp->>LlamaServer: HTTP request with x_llamastack_provider_data
LlamaServer->>RequestProviderDataContext: request_provider_data_context(headers, user)
activate RequestProviderDataContext
RequestProviderDataContext-->>LlamaServer: provider_data stored in contextvar
deactivate RequestProviderDataContext
LlamaServer->>BedrockInferenceAdapter: openai_chat_completion(params)
activate BedrockInferenceAdapter
BedrockInferenceAdapter->>BedrockInferenceAdapter: _should_use_sigv4()
BedrockInferenceAdapter->>BedrockInferenceAdapter: _bedrock_config.has_bearer_token()
alt Bearer token configured in config or provider data
BedrockInferenceAdapter-->>BedrockInferenceAdapter: use_sigv4 = False
BedrockInferenceAdapter->>OpenAIMixinClient: get_api_key() -> bearer token
BedrockInferenceAdapter->>OpenAIMixinClient: get_extra_client_params() -> {}
else Use SigV4 via AWS credentials
BedrockInferenceAdapter-->>BedrockInferenceAdapter: use_sigv4 = True
BedrockInferenceAdapter->>BedrockInferenceAdapter: initialize() builds _sigv4_http_client
BedrockInferenceAdapter->>OpenAIMixinClient: get_api_key() -> placeholder NOTUSED
BedrockInferenceAdapter->>OpenAIMixinClient: get_extra_client_params() -> http_client=_sigv4_http_client
OpenAIMixinClient->>BedrockSigV4Auth: async_auth_flow(httpx_request)
activate BedrockSigV4Auth
BedrockSigV4Auth->>BedrockSigV4Auth: _get_credentials()
alt Role assumption with web identity
BedrockSigV4Auth->>AWSSTS: assume_role_with_web_identity
AWSSTS-->>BedrockSigV4Auth: temporary credentials
else Direct credentials or instance role
BedrockSigV4Auth-->>BedrockSigV4Auth: load from env or profile
end
BedrockSigV4Auth->>BedrockSigV4Auth: _sign_request(request) using SigV4Auth
BedrockSigV4Auth-->>OpenAIMixinClient: signed request
deactivate BedrockSigV4Auth
end
OpenAIMixinClient->>AWSBedrockRuntime: HTTPS POST /openai/v1/chat/completions
AWSBedrockRuntime-->>OpenAIMixinClient: response or auth error
OpenAIMixinClient-->>BedrockInferenceAdapter: result or AuthenticationError
alt AuthenticationError
BedrockInferenceAdapter->>BedrockInferenceAdapter: _handle_auth_error(msg, error, use_sigv4)
BedrockInferenceAdapter-->>LlamaServer: InternalServerError (generic message)
else Success
BedrockInferenceAdapter-->>LlamaServer: OpenAIChatCompletion or stream
end
LlamaServer-->>ClientApp: HTTP response
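The bearer-versus-SigV4 branch in the diagram above can be sketched in plain Python. This is a minimal illustration, not the adapter's code: the function names are hypothetical, and the choice to let a per-request token win over a configured key is an assumption (the PR only states that a bearer token from either source takes precedence over SigV4). The `aws_bearer_token_bedrock` field name comes from the PR description.

```python
from typing import Any, Mapping, Optional

# Field name taken from the PR description; all other names are illustrative.
PROVIDER_DATA_TOKEN_FIELD = "aws_bearer_token_bedrock"


def _clean_token(value: Any) -> Optional[str]:
    """Return a usable token, or None for missing, null, empty, or whitespace-only values."""
    if not isinstance(value, str):
        return None
    stripped = value.strip()
    return stripped or None


def resolve_bearer_token(
    config_api_key: Optional[str],
    provider_data: Optional[Mapping[str, Any]],
) -> Optional[str]:
    """Pick a bearer token: per-request provider data first (assumed), then config."""
    per_request = _clean_token((provider_data or {}).get(PROVIDER_DATA_TOKEN_FIELD))
    return per_request or _clean_token(config_api_key)


def should_use_sigv4(
    config_api_key: Optional[str],
    provider_data: Optional[Mapping[str, Any]],
) -> bool:
    """SigV4 is the fallback whenever no usable bearer token is available."""
    return resolve_bearer_token(config_api_key, provider_data) is None
```

With this shape, a request carrying an empty or whitespace-only token falls back to SigV4 instead of sending a blank `Authorization` header.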
Updated class diagram for Bedrock config, adapter, and SigV4 utilities
classDiagram
class RemoteInferenceProviderConfig
class BedrockBaseConfig {
+SecretStr aws_access_key_id
+SecretStr aws_secret_access_key
+SecretStr aws_session_token
+str aws_role_arn
+str aws_web_identity_token_file
+str aws_role_session_name
+str region_name
+str profile_name
+int total_max_attempts
+str retry_mode
+float connect_timeout
+float read_timeout
+int session_ttl
}
class BedrockConfig {
+SecretStr auth_credential
+str region_name
+bool has_bearer_token()
+dict sample_run_config(**kwargs)
}
class OpenAIMixin {
+AsyncOpenAI client
+dict get_extra_client_params()
+str get_api_key()
}
class BedrockInferenceAdapter {
+BedrockConfig config
+str provider_data_api_key_field
+httpx_AsyncClient _sigv4_http_client
+BedrockConfig _bedrock_config
+str get_base_url()
+bool _should_use_sigv4()
+httpx_AsyncClient _build_sigv4_http_client()
+initialize()
+shutdown()
+str get_api_key()
+dict get_extra_client_params()
+Iterable~str~ list_provider_model_ids()
+bool check_model_availability(model)
+openai_chat_completion(params)
+_handle_auth_error(error_msg, original_error, use_sigv4)
}
class RefreshableBotoSession {
+str region_name
+str aws_access_key_id
+str aws_secret_access_key
+str aws_session_token
+str profile_name
+str sts_arn
+str web_identity_token_file
+str session_name
+int session_ttl
-__get_session_credentials()
+Session refreshable_session()
}
class BedrockSigV4Auth {
+str _region
+str _service
+str _aws_access_key_id
+str _aws_secret_access_key
+str _aws_session_token
+str _profile_name
+str _aws_role_arn
+str _aws_web_identity_token_file
+str _aws_role_session_name
+int _session_ttl
-object _session
-object _lock
+_get_credentials()
+_sign_request(request)
+auth_flow(request)
+async_auth_flow(request)
}
class NeedsRequestProviderData {
+ProviderSpec __provider_spec__
+get_request_provider_data()
}
class RequestProviderDataContext {
+dict provider_data
+__enter__()
+__exit__(exc_type, exc_val, exc_tb)
}
RemoteInferenceProviderConfig <|-- BedrockBaseConfig
BedrockBaseConfig <|-- BedrockConfig
OpenAIMixin <|-- BedrockInferenceAdapter
NeedsRequestProviderData <|-- BedrockInferenceAdapter
BedrockInferenceAdapter --> BedrockConfig
BedrockInferenceAdapter ..> BedrockSigV4Auth : uses
BedrockSigV4Auth ..> RefreshableBotoSession : uses for STS
BedrockInferenceAdapter ..> RequestProviderDataContext : uses via context
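The `_session`, `_lock`, and `_session_ttl` members in the class diagram suggest a cached credential source that is refreshed under a lock. The sketch below shows that general pattern with the standard library only. It is not `RefreshableBotoSession` or `BedrockSigV4Auth`: the class names, the `refresh_margin` parameter, and the injected `fetch` callable (standing in for an STS call) are all assumptions made for illustration. The 3600-second default mirrors the session TTL default mentioned in the commit messages.

```python
import threading
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class TemporaryCredentials:
    """Hypothetical container for short-lived credentials."""
    access_key: str
    secret_key: str
    session_token: str


class CachedCredentialProvider:
    """Cache credentials and refetch them shortly before the TTL elapses."""

    def __init__(
        self,
        fetch: Callable[[], TemporaryCredentials],
        session_ttl: float = 3600,
        refresh_margin: float = 60,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._session_ttl = session_ttl
        self._refresh_margin = refresh_margin
        self._clock = clock
        self._lock = threading.Lock()
        self._cached: Optional[TemporaryCredentials] = None
        self._expires_at = 0.0

    def get(self) -> TemporaryCredentials:
        # The lock keeps concurrent callers from triggering duplicate refreshes.
        with self._lock:
            now = self._clock()
            if self._cached is None or now >= self._expires_at - self._refresh_margin:
                self._cached = self._fetch()
                self._expires_at = now + self._session_ttl
            return self._cached
```

Injecting the clock keeps the refresh logic testable without sleeping.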
742fe2a to 4302e2f
Recording workflow completed. Providers: ollama. Recordings have been generated and will be committed automatically by the companion workflow.
…k#5374)
# What does this PR do?
Follow up for llamastack#5340 (review). Change `rerank()` to use `_get_api_key_from_config_or_provider_data()` instead of `get_api_key()`, so API keys passed via the `x-llamastack-provider-data` header are honoured. @leseb ^^
…tack#5357)
Splitting llamastack#5206 into two stages. CI integration tests and recordings are causing issues when API spec and core implementation changes are pushed together.
**Stage 1** (this PR): Introduce API types required for reasoning support in the Responses API:
- `ReasoningItem`, `ReasoningContent`, `ReasoningSummary` output types
- `summary` field on `OpenAIResponseReasoning`
- An internal `AssistantMessageWithReasoning` type
**Stage 2**: After this PR is merged, will create a clean version of llamastack#5206 with core implementation for reasoning propagation.
---------
Co-authored-by: Charlie Doern <cdoern@redhat.com>
03048ec to 59a252b
Adds SigV4 request signing and STS web identity token authentication to the Bedrock inference provider, enabling use of IAM roles and federated identity without long-lived static credentials.
- add sigv4_auth.py with SigV4Signer for signing httpx requests
- add STS web identity token refresh via refreshable boto session
- support per-request bearer token override in SigV4 mode
- expose auth_method config field (sigv4, sts_web_identity, api_key)
- sanitize auth failures to avoid leaking credentials in error messages
- avoid calling get_request_provider_data() during initialize()
- fix httpx client leaks by managing client lifecycle properly
- export build_network_client_kwargs / network_config_fingerprint
- add comprehensive unit tests: sigv4 signing, STS, SDK integration
Signed-off-by: skamenan7 <skamenan@redhat.com>
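For readers unfamiliar with SigV4, the sketch below shows only the signing-key derivation step of the AWS Signature Version 4 scheme, using the standard library. It is not the code in sigv4_auth.py, which delegates signing to botocore; the function names here are hypothetical, and a full signature additionally requires a canonical request and a string to sign.

```python
import hashlib
import hmac


def _hmac_sha256(key: bytes, message: str) -> bytes:
    """One HMAC-SHA256 step of the derivation chain."""
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def derive_signing_key(secret_key: str, date_stamp: str, region: str, service: str) -> bytes:
    """Derive the SigV4 signing key.

    date_stamp is the UTC date as YYYYMMDD. The key is scoped to one day,
    one region, and one service signing name.
    """
    k_date = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")


def sign_string(signing_key: bytes, string_to_sign: str) -> str:
    """Return the hex signature that goes into the Authorization header."""
    return hmac.new(signing_key, string_to_sign.encode("utf-8"), hashlib.sha256).hexdigest()
```

A call such as `derive_signing_key(secret, "20240101", "us-east-1", "bedrock")` yields a 32-byte key; the `"bedrock"` signing name used here is an assumption for illustration.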
- remove session_ttl from boto3.Session() args in client.py; Session does not accept this parameter, which caused a TypeError for static-key deployments
- add ConfigDict(extra='forbid') to BedrockProviderDataValidator
- remove type: ignore[attr-defined] from sigv4_auth.py; _session is typed as Any
- remove redundant network field redeclaration from BedrockConfig
- delete stale tests/unit/providers/test_bedrock.py with wrong bedrock-mantle URLs
- async tests in TestAsyncAuthFlow now run; asyncio_mode=auto handles them
remove docstrings that restate the method name and reduce inline comments to only the non-obvious parts: the signing name mismatch, the lazy boto3 import, the placeholder api_key trick, the asyncio.shield rationale, and the per-request sigv4 bypass check. Signed-off-by: skamenan7 <skamenan@redhat.com>
Keep the lazy optional-dependency boundary in the Bedrock adapter while making the botocore signing dependency explicit in sigv4_auth.py. Signed-off-by: skamenan7 <skamenan@redhat.com> Made-with: Cursor
- remove extra="forbid" from BedrockProviderDataValidator; the __authenticated_user key injected by request_headers.py would silently fail Pydantic validation and break per-request bearer token overrides for authenticated users
- consolidate the 3600 session TTL default into a single DEFAULT_SESSION_TTL constant in utils/bedrock/config.py and reference it from sigv4_auth.py, refreshable_boto_session.py, and client.py
- shield asyncio.to_thread(_sign_request) in async_auth_flow so a rolling-restart cancellation cannot abort mid-sign and leave the connection in an inconsistent auth state
Signed-off-by: skamenan7 <skamenan@redhat.com>
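The shielding described in the last bullet can be illustrated with a small standard-library sketch. This is not the PR's `async_auth_flow`: `blocking_sign`, `_sign_and_apply`, and the header handling are hypothetical stand-ins. It shows that when the caller is cancelled, the shielded work still runs to completion and applies its result.

```python
import asyncio
import time


def blocking_sign(payload: str) -> str:
    """Stand-in for a blocking signing call (hypothetical)."""
    time.sleep(0.05)
    return f"signed:{payload}"


async def _sign_and_apply(headers: dict, payload: str) -> None:
    # Run the blocking call in a worker thread so the event loop stays free.
    signature = await asyncio.to_thread(blocking_sign, payload)
    headers["Authorization"] = signature


async def sign_request(headers: dict, payload: str) -> None:
    # shield() lets the inner work finish even if this coroutine is cancelled.
    await asyncio.shield(_sign_and_apply(headers, payload))


async def demo() -> dict:
    headers: dict = {}
    task = asyncio.create_task(sign_request(headers, "body"))
    await asyncio.sleep(0.01)  # let signing start
    task.cancel()              # simulate cancellation during a restart
    try:
        await task
    except asyncio.CancelledError:
        pass
    await asyncio.sleep(0.2)   # give the shielded work time to finish
    return headers


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Without the `shield()` call, cancelling `task` would also cancel `_sign_and_apply`, and the header would never be set even though the worker thread keeps running.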
- remove dead else branch in _handle_auth_error; the if-branch always raises so the else was unreachable
- remove duplicate copyright/license header from test_bedrock_safety_adapter.py
- drop the TGIImplConfig import and parametrize row from test_remote_inference_provider_config.py; the import was broken (TGIImplConfig doesn't exist at that path) and the TGI case is unrelated to this PR
Signed-off-by: skamenan7 <skamenan@redhat.com>
- catch PermissionDeniedError (403) alongside AuthenticationError (401) in openai_chat_completion; SigV4 AccessDenied and SignatureDoesNotMatch from AWS come back as 403, not 401, and were bypassing the sanitized error path and leaking raw provider messages
- add boto3 to pip_packages in the inference Bedrock provider registry spec; sigv4_auth.py imports botocore at module level so provider-scoped installs without boto3 would fail at runtime (safety spec already had it)
- make aws_role_arn the first branch in create_bedrock_client so that static credentials plus a role ARN correctly triggers assume-role instead of silently ignoring the role (matches inference adapter behavior)
- fix broken import in test_network_config.py; _build_network_client_kwargs was renamed to build_network_client_kwargs (public) in this PR but the test file was not updated, causing an ImportError at collection time
Signed-off-by: skamenan7 <skamenan@redhat.com>
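The first bullet's error handling can be sketched as follows. The three exception classes are local stand-ins defined only so the example is self-contained (the adapter catches the OpenAI SDK's exceptions), and `call_with_sanitized_auth_errors` is a hypothetical helper, not `_handle_auth_error` itself. The idea is that both 401 and 403 failures take the same path and the raw provider message never reaches the caller.

```python
import logging
from typing import Callable, TypeVar

T = TypeVar("T")
logger = logging.getLogger("bedrock_auth_sketch")


class AuthenticationError(Exception):
    """Stand-in for an HTTP 401 error raised by the client SDK."""


class PermissionDeniedError(Exception):
    """Stand-in for an HTTP 403 error raised by the client SDK."""


class InternalServerError(Exception):
    """Stand-in for the generic error returned to the caller."""


def call_with_sanitized_auth_errors(send: Callable[[], T], use_sigv4: bool) -> T:
    """Run send() and replace auth failures with a generic message."""
    try:
        return send()
    except (AuthenticationError, PermissionDeniedError) as exc:
        mode = "SigV4" if use_sigv4 else "bearer token"
        # Log only the exception type; the message may contain provider details.
        logger.warning("Bedrock %s authentication failed: %s", mode, type(exc).__name__)
        raise InternalServerError(
            f"Bedrock authentication failed ({mode} mode). "
            "Check the provider's credential configuration."
        ) from None
```

Raising with `from None` drops the exception chain, so the original message does not resurface in a traceback shown to the client.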
- update vllm.py to import build_network_client_kwargs (public name); the private _build_network_client_kwargs was renamed in http_client.py in this PR but the vllm provider was missed, breaking unit test collection
- regenerate distribution configs after adding boto3 to the bedrock inference provider pip_packages
Signed-off-by: skamenan7 <skamenan@redhat.com>
…errors
- client.py: the aws_role_arn branch was not passing botocore.Config, so total_max_attempts, retry_mode, connect_timeout, and read_timeout were silently ignored for the safety provider when role assumption is used; factor the config build to the top of create_bedrock_client and pass it through all three branches (role, static creds, credential chain)
- bedrock.py: RuntimeError (missing AWS creds) and OSError (unreadable web identity token file) from SigV4 credential resolution were falling through to the generic except Exception branch and surfacing as raw exception messages; add an explicit catch that converts these to a sanitized InternalServerError when in SigV4 mode
Signed-off-by: skamenan7 <skamenan@redhat.com>
pass Session args explicitly instead of via **dict unpacking — mypy was unable to narrow the dict[str, str | None] type after filtering and flagged the **kwargs call as incompatible with Session's signature Signed-off-by: skamenan7 <skamenan@redhat.com>
879668c to 4822ec1
What does this PR do?
The Bedrock inference provider previously depended on a pre-signed bearer token (`AWS_BEARER_TOKEN_BEDROCK`). That works for short-lived manual setups, but it is a poor fit for environments that already rely on AWS identity, like Kubernetes/OpenShift with IRSA, GitHub Actions with OIDC, EC2, ECS, and Lambda.
This PR adds AWS SigV4 support for the Bedrock OpenAI-compatible endpoint using standard AWS credential sources. When no `api_key` is configured, requests are signed with SigV4 via botocore. Static credentials, profiles, IAM roles, and web identity are supported. When `aws_role_arn` is configured, the explicit assume-role and web-identity path uses `RefreshableBotoSession` to refresh temporary credentials automatically. Bearer token mode is unchanged: if `api_key` is set in config or passed via `x-llamastack-provider-data`, it takes precedence.
This also updates the endpoint URL from the legacy `bedrock-mantle` hostname to `bedrock-runtime.<region>.amazonaws.com/openai/v1`.
Closes llamastack#4730
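As a small illustration of the endpoint change described above, the helper below builds the new base URL from a region name. The function name is hypothetical and the `https` scheme is an assumption based on the HTTPS call shown in the reviewer's sequence diagram; the adapter's own `get_base_url()` may differ.

```python
def bedrock_openai_base_url(region: str) -> str:
    """Build the OpenAI-compatible Bedrock base URL for a region (hypothetical helper)."""
    region = region.strip()
    if not region:
        raise ValueError("region must be a non-empty string")
    return f"https://bedrock-runtime.{region}.amazonaws.com/openai/v1"
```

For example, `bedrock_openai_base_url("us-east-1")` produces a URL on the `bedrock-runtime.us-east-1.amazonaws.com` host with the `/openai/v1` path.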
Auth modes
Bearer token, unchanged:
SigV4 via AWS credentials, new:
STS web identity / IRSA, new:
Per-request bearer override still works on a SigV4-mode server. If `x-llamastack-provider-data` includes `{"aws_bearer_token_bedrock": "<token>"}`, that request uses the bearer path. Empty, whitespace-only, or null values fall back to SigV4.
Implementation notes
- `BedrockSigV4Auth` is an `httpx.Auth` implementation that removes the OpenAI SDK placeholder `Authorization` header and replaces it with a SigV4 signature generated through botocore.
- `httpx.AsyncClient` is built once in `initialize()` and reused, so `get_extra_client_params()` can stay synchronous and the adapter does not create a new client per request.
- `asyncio.to_thread()` is used so signing and credential resolution do not block the event loop.
- `asyncio.shield()` is used around signing and shutdown cleanup so cancellation does not interrupt those operations halfway through.
Test plan
Unit tests:
Focused SigV4 unit tests:
Local live validation in SigV4 mode:
Server config with `api_key` intentionally omitted:
Validated locally for:
Summary by Sourcery
Add AWS SigV4-based authentication and STS web identity support to the Bedrock inference and safety providers, updating configuration, networking utilities, and error handling to support cloud-native credential flows while preserving legacy bearer token mode.
New Features:
Enhancements:
Documentation:
Tests: