Add support for S3 request signing #2280

Open
wants to merge 1 commit into
base: main
from

Conversation

adutra
Contributor

@adutra adutra commented Aug 6, 2025

@singhpk234
Contributor

Thank you so much for this @adutra !

When ready, please let us know (it would be really helpful if you could share a description that walks us through your thought process). Happy to help with reviews :)

Member

@snazy snazy left a comment

This looks like a good first step.

Since clients will have to call the backend to sign a request for every small data file chunk they access, this implementation will be quite slow for users. Until we have a faster implementation (signed access rules), I think we should label this feature as "experimental".

* S3 request signer that creates presigned URLs for S3 operations using AWS credentials obtained
* from the storage integration.
*/
public class S3RequestSignerImpl implements S3RequestSigner {
Member

Suggested change
- public class S3RequestSignerImpl implements S3RequestSigner {
+ class S3RequestSignerImpl implements S3RequestSigner {

private final AwsCredentialsStorageIntegration storageIntegration;
private final StorageCredentialCache storageCredentialCache;

public S3RequestSignerImpl(
Member

Suggested change
- public S3RequestSignerImpl(
+ S3RequestSignerImpl(

.method(method)
.headers(signingRequest.headers());

// FIXME is this correct?
Member

nope.
See ListObjects + ListObjectsV2 REST spec.

Be careful: Not all GET /{bucket}/ requests are list-objects requests!
The "v1" list-objects API has no required parameter, so you cannot just check for the presence of a request parameter. It could also be a get-bucket-cors, get-bucket-encryption, ...-policy, ...-website, or some other request.
Nice, no?

Maybe it's okay to assume that every client uses the v2 api, which has a unique, required request param.
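
A minimal sketch of the v2-only check suggested here, assuming the signer sees the HTTP method and the full request URI. ListObjectsV2 requires the `list-type=2` query parameter, so its presence on a GET distinguishes a v2 listing from the other bucket-level GET operations. The class and method names are hypothetical and are not part of this PR; the sketch also does not verify that the path addresses a bucket rather than an object.

```java
import java.net.URI;
import java.util.Arrays;

/** Hypothetical helper, not taken from the PR. */
final class ListObjectsV2Detector {

  /**
   * Returns true only for GET requests whose query string contains the
   * parameter list-type=2, which the ListObjectsV2 REST API requires.
   * A v1 listing (no required parameter) is deliberately NOT detected.
   */
  static boolean isListObjectsV2(String method, URI uri) {
    if (!"GET".equalsIgnoreCase(method)) {
      return false;
    }
    String query = uri.getRawQuery();
    if (query == null || query.isEmpty()) {
      return false;
    }
    // Match the exact key=value pair rather than a substring of another parameter.
    return Arrays.stream(query.split("&")).anyMatch("list-type=2"::equals);
  }
}
```

With this check, `GET /my-bucket/?cors` and a bare `GET /my-bucket/` both fall through as "not a v2 listing", which is the conservative outcome for the ambiguity described above.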

@adutra adutra force-pushed the request-signing branch 14 times, most recently from 6bf40a9 to 8d1f354 Compare August 8, 2025 13:08
@adutra adutra marked this pull request as ready for review August 8, 2025 13:09
@adutra adutra force-pushed the request-signing branch 2 times, most recently from df10af7 to 6a26337 Compare August 8, 2025 14:14
@adutra
Contributor Author

adutra commented Aug 8, 2025

Ready for review. The test failures are unrelated to this change.

cc @singhpk234 @snazy @dimas-b @metadaddy

mapOf(
"IcebergErrorResponse" to "org.apache.iceberg.rest.responses.ErrorResponse",
"S3SignRequest" to "org.apache.polaris.service.aws.sign.model.PolarisS3SignRequest",
"SignS3Request200Response" to
Contributor Author

FYI: The name of the logical response type is a bit weird because it is inlined, instead of being a $ref in the OpenAPI spec.

@@ -146,11 +148,13 @@ public abstract class PolarisRestCatalogIntegrationBase extends CatalogTests<RES
private RESTCatalog restCatalog;
private String currentCatalogName;
private Map<String, String> restCatalogConfig;
private URI externalCatalogBase;
private URI externalCatalogBaseLocation;
Contributor Author

@adutra adutra Aug 8, 2025

FYI I refactored this class a bit, notably to facilitate subclassing, but there is no functional change in this class.

Member

Minor request here, but you could pull this and the new IntegrationTest helpers out into a separate PR and reduce the size of this PR a bit. I think that would also be a very fast review.

Contributor Author

Yup good idea!

#2384

@@ -39,9 +39,6 @@ val distributionElements by
}

dependencies {
implementation(project(":polaris-core"))
Contributor Author

Not directly referenced in this module.

Member

This is another change I would try to pull out into another PR. It's not related to this changeset, correct?

Contributor Author

Correct:

#2385

@@ -297,12 +298,6 @@ public CreateNamespaceResponse createNamespace(CreateNamespaceRequest request) {
}
}

private static boolean isStaticFacade(CatalogEntity catalog) {
Contributor Author

Refactored to become a method of CatalogEntity.

Contributor Author

Member

Thanks! It definitely helps to keep this sort of refactor out of the larger functional change

* <p>The returned provider is not meant to be vended directly to clients, but rather used with
* STS, unless credential subscoping is disabled.
*/
default AwsCredentialsProvider awsSystemCredentials() {
Contributor Author

Renamed to awsSystemCredentials because these are not "just" credentials for STS, they are server credentials.

prefix,
catalog -> {
S3SignResponse response = catalog.signS3Request(s3SignRequest, tableIdentifier);
return Response.status(Response.Status.OK).entity(response).build();
Contributor Author

In theory the response should include Cache-Control headers. This is left for a later improvement.

Member

... but that would require clients to respect those 🤷

@@ -101,6 +110,32 @@ public FileIO loadFileIO(
properties.putAll(accessConfig.get().credentials());
properties.putAll(accessConfig.get().extraProperties());
properties.putAll(accessConfig.get().internalProperties());
} else {
// If no subscoped creds were produced, use system-wide AWS or GCP credentials if available.
Contributor Author

This bit is not directly related to remote signing, but I realized that, when credentials subscoping is disabled, the server is unable to use the credentials configured in application.properties.

This change allows that.

That being said, the server would still be unable to use e.g. a custom S3 endpoint, simply because this property is not exposed in application.properties.

IMHO each property that can be configured in AccessConfig for a specific catalog should also be configurable by a system-wide property in application.properties.
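
A minimal sketch of the fallback proposed here, assuming plain string maps stand in for the per-catalog AccessConfig and for the system-wide settings read from application.properties. The class name, the method name, and the `s3.endpoint` key are hypothetical and are not taken from this PR.

```java
import java.util.Map;
import java.util.Optional;

/** Hypothetical illustration of "catalog value first, system-wide value second". */
final class PropertyFallbackSketch {

  /**
   * Looks up a property in the catalog-specific configuration and falls back
   * to the system-wide configuration when the catalog does not define it.
   */
  static Optional<String> resolve(
      String key, Map<String, String> catalogProperties, Map<String, String> systemProperties) {
    String value = catalogProperties.get(key);
    if (value == null) {
      value = systemProperties.get(key);
    }
    return Optional.ofNullable(value);
  }
}
```

Under this scheme a custom S3 endpoint set only at the system level would still reach the server-side FileIO, while a catalog that sets its own endpoint keeps precedence.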

@metadaddy

Looks good. When I get an opportunity, I'll try setting this up with Backblaze B2 as S3-compatible storage and Trino as the Iceberg REST catalog client. Thanks!

Member

@snazy snazy left a comment

Overall LGTM.
Not an "always boring" change though, therefore a few more comments.

Comment on lines +107 to +110
polarisEventListener,
storageIntegrationProvider,
prefixParser,
uriInfo);
Member

(Not related to this PR)
The steadily increasing number of parameters passed to these types worries me.

Contributor Author

Agreed. We could look into making it a CDI bean.

Member

@snazy snazy left a comment

I think it's very, very close :)

// TODO authorize based on the request's method?

try {
authorizeBasicTableLikeOperationOrThrow(
Member

Hm, this is probably worth mentioning in the changelog.

@singhpk234
Contributor

Hey @adutra, thank you for the change. I was reading the description, and it turns out we are introducing new privileges. IMHO this requires a broader set of eyes and more feedback, as it is no longer just an implementation of the IRC specification that can be contained in the PR.

Can you please send your proposal to the dev list (if you already have one, can you please link it to the PR)?
I only see this thread presently - https://lists.apache.org/thread/qvzwc3qxlfrk9vr7yfbx6zxfhz9lhlbc

snazy
snazy previously approved these changes Aug 14, 2025
@github-project-automation github-project-automation bot moved this from PRs In Progress to Ready to merge in Basic Kanban Board Aug 14, 2025
import org.immutables.value.Value;

/**
* Request for S3 signing requests.
Contributor

Suggested change
- * Request for S3 signing requests.
+ * Request for S3 signing.

Comment on lines 151 to 168
if (storageConfig.getAwsPartition().equals("aws-us-gov") && region == null) {
throw new IllegalArgumentException(
String.format(
"AWS region must be set when using partition %s", storageConfig.getAwsPartition()));
}

AccessConfig.Builder accessConfig = AccessConfig.builder();
if (region != null) {
accessConfig.put(StorageAccessProperty.CLIENT_REGION, region);
}

URI endpointUri = storageConfig.getEndpointUri();
if (endpointUri != null) {
accessConfig.put(StorageAccessProperty.AWS_ENDPOINT, endpointUri.toString());
}
if (Boolean.TRUE.equals(storageConfig.getPathStyleAccess())) {
accessConfig.put(StorageAccessProperty.AWS_PATH_STYLE_ACCESS, Boolean.TRUE.toString());
}
Contributor

Can we put these in a helper, to be used both here and in the method above?
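
One possible shape for such a helper, sketched with a plain Map instead of AccessConfig.Builder so that it stands alone. The class name, the method signature, and the string keys (assumed here to mirror Iceberg's `client.region`, `s3.endpoint`, and `s3.path-style-access` properties) are illustrative assumptions, not the PR's code.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper collecting the region/endpoint/path-style logic in one place. */
final class S3ClientPropertiesSketch {

  /**
   * Validates the partition/region combination and returns the client
   * properties that both call sites would otherwise build by hand.
   */
  static Map<String, String> clientProperties(
      String partition, String region, URI endpointUri, Boolean pathStyleAccess) {
    // Same guard as in the quoted snippet: GovCloud requires an explicit region.
    if ("aws-us-gov".equals(partition) && region == null) {
      throw new IllegalArgumentException(
          String.format("AWS region must be set when using partition %s", partition));
    }
    Map<String, String> properties = new LinkedHashMap<>();
    if (region != null) {
      properties.put("client.region", region);
    }
    if (endpointUri != null) {
      properties.put("s3.endpoint", endpointUri.toString());
    }
    if (Boolean.TRUE.equals(pathStyleAccess)) {
      properties.put("s3.path-style-access", Boolean.TRUE.toString());
    }
    return properties;
  }
}
```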

// no need to materialize the catalog here, as we only need the catalog entity
}

public PolarisS3SignResponse signS3Request(
Contributor

Can I have a sign privilege without the table read privilege? A client may contact the sign endpoint directly.

Contributor Author

Not anymore, I refactored this bit to comply with the design doc.

For a read request you need TABLE_REMOTE_SIGN + TABLE_READ_DATA.

For a write request you need TABLE_REMOTE_SIGN + TABLE_WRITE_DATA.


LOGGER.debug("Requesting s3 signing for {}: {}", tableIdentifier, s3SignRequest);

// TODO authorize based on the request's method?
Contributor

Shouldn't we? SIGN_S3_REQUEST shouldn't be a blanket approval for deletes, should it?

Contributor Author

See my previous comment. For a DELETE method you now need TABLE_REMOTE_SIGN + TABLE_WRITE_DATA.
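
The privilege rules stated in these replies can be sketched as a method-to-privileges mapping. This is an illustration only: the enum is a stand-in for the privilege names mentioned in the thread, the class and method names are hypothetical, and classifying GET and HEAD as reads and PUT and POST as writes is an assumption (the thread only spells out "read", "write", and DELETE).

```java
import java.util.EnumSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical sketch of method-based authorization for remote signing. */
final class RemoteSigningPrivilegesSketch {

  /** Stand-in for the privilege names mentioned in the review thread. */
  enum Privilege {
    TABLE_REMOTE_SIGN,
    TABLE_READ_DATA,
    TABLE_WRITE_DATA
  }

  /** Returns the privileges a caller would need to have a request with this method signed. */
  static Set<Privilege> requiredFor(String httpMethod) {
    switch (httpMethod.toUpperCase(Locale.ROOT)) {
      case "GET":
      case "HEAD":
        // Assumed to be read requests.
        return EnumSet.of(Privilege.TABLE_REMOTE_SIGN, Privilege.TABLE_READ_DATA);
      case "PUT":
      case "POST":
      case "DELETE":
        // Writes, including deletes, need the write privilege in addition to signing.
        return EnumSet.of(Privilege.TABLE_REMOTE_SIGN, Privilege.TABLE_WRITE_DATA);
      default:
        throw new IllegalArgumentException("Unsupported HTTP method: " + httpMethod);
    }
  }
}
```

Because TABLE_REMOTE_SIGN never appears alone in the result, holding only the sign privilege is not enough to get any request signed.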

.method(method)
.headers(signingRequest.headers());

AwsCredentials credentials = storageConfiguration.awsSystemCredentials().resolveCredentials();
Contributor

How are the credentials refreshed?

Contributor Author

These are the server credentials; they should not need refreshing.

@adutra
Contributor Author

adutra commented Aug 18, 2025

FYI: I was requested to produce a design doc for this feature:

https://docs.google.com/document/d/1ygdia7u4bUHUt6n8XhZo48aKoIyyrCvKqan3XP25iB8/edit?usp=sharing

@adutra adutra force-pushed the request-signing branch 2 times, most recently from 55c0ce9 to 1bc4526 Compare August 19, 2025 08:40
adutra added a commit to adutra/polaris that referenced this pull request Aug 19, 2025
This change promotes CatalogConfig and RestCatalogConfig to top-level, public annotations and introduces a few "hooks" in PolarisRestCatalogIntegrationBase that can be overridden by subclasses.

This change is preparatory work for apache#2280 (S3 remote signing).

// Must be done after the authorization check, as the auth check creates the catalog entity
throwIfRemoteSigningNotEnabled(callContext.getRealmConfig(), catalogEntity);

// TODO S3 location access checks
Contributor Author

FYI, this is left for a follow-up PR.

}

String prefix = prefixParser.catalogNameToPrefix(callContext.getRealmContext(), catalogName);
URI signerUri = uriInfo.getBaseUri().resolve("api/");
Contributor Author

FYI, resolving the signer URI in case of HTTP proxies is left for a follow-up PR.
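
For context, a small sketch of how `java.net.URI.resolve` behaves for the `resolve("api/")` call above. The class name, method name, and example hosts are hypothetical. Whether the base URI ends in a slash changes the result, which is one reason a proxy-rewritten base path needs care.

```java
import java.net.URI;

/** Hypothetical demo of relative URI resolution as used for the signer URI. */
final class SignerUriSketch {

  /** Mirrors the quoted line: resolves the relative reference "api/" against the base URI. */
  static URI signerUri(URI baseUri) {
    return baseUri.resolve("api/");
  }

  public static void main(String[] args) {
    // Base ends with a slash: "api/" is appended to the base path.
    System.out.println(signerUri(URI.create("https://polaris.example.com:8181/")));
    // Base path with a trailing slash: the prefix is preserved.
    System.out.println(signerUri(URI.create("https://example.com/polaris/")));
    // Base path WITHOUT a trailing slash: the last segment is replaced.
    System.out.println(signerUri(URI.create("https://example.com/polaris")));
  }
}
```

The three calls yield `https://polaris.example.com:8181/api/`, `https://example.com/polaris/api/`, and `https://example.com/api/` respectively, following standard relative-reference resolution.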

@adutra adutra force-pushed the request-signing branch 4 times, most recently from 22a07f6 to 7fefc85 Compare August 19, 2025 09:43
adutra added a commit that referenced this pull request Aug 19, 2025
@adutra
Contributor Author

adutra commented Aug 19, 2025

FYI I just rebased and synchronized this PR with the design doc.

bmlyr pushed a commit to bmlyr/polaris that referenced this pull request Aug 19, 2025
Development

Successfully merging this pull request may close these issues.

[FEATURE REQUEST] On-Premise S3 & Remote Signing
5 participants