
[ENHANCEMENT] Support non-AWS S3 storage that does not have STS #2207

@lastranget

Description

Describe the bug

I'm trying to set up a Polaris catalog that points to our on-premises Pure FlashBlade S3 storage instance.

I get the following error when I try to create a table via the Spark SQL shell:

org.apache.iceberg.exceptions.RESTException: Unable to process: Failed to get subscoped credentials: (Service: Sts, Status Code: 400, Request ID: null) (SDK Attempt Count: 1)

This is similar to the errors reported in #1146, but those errors include additional detail about the actual issue after "subscoped credentials:", whereas here the error message is left incomplete.

Based on #1913, I believe external S3 providers should currently be supported.

The full stack trace is as follows:

spark-sql ()> CREATE NAMESPACE ICE_NS;
Time taken: 1.11 seconds
spark-sql ()> USE NAMESPACE ICE_NS;
Time taken: 0.071 seconds
spark-sql (ICE_NS)> CREATE TABLE PEOPLE (id int, name string) USING iceberg;
25/07/29 17:57:42 ERROR SparkSQLDriver: Failed in [CREATE TABLE PEOPLE (id int, name string) USING iceberg]
org.apache.iceberg.exceptions.RESTException: Unable to process: Failed to get subscoped credentials: (Service: Sts, Status Code: 400, Request ID: null) (SDK Attempt Count: 1)
	at org.apache.iceberg.rest.ErrorHandlers$DefaultErrorHandler.accept(ErrorHandlers.java:248)
	at org.apache.iceberg.rest.ErrorHandlers$TableErrorHandler.accept(ErrorHandlers.java:123)
	at org.apache.iceberg.rest.ErrorHandlers$TableErrorHandler.accept(ErrorHandlers.java:107)
	at org.apache.iceberg.rest.HTTPClient.throwFailure(HTTPClient.java:215)
	at org.apache.iceberg.rest.HTTPClient.execute(HTTPClient.java:299)
	at org.apache.iceberg.rest.BaseHTTPClient.post(BaseHTTPClient.java:88)
	at org.apache.iceberg.rest.RESTSessionCatalog$Builder.create(RESTSessionCatalog.java:771)
	at org.apache.iceberg.CachingCatalog$CachingTableBuilder.lambda$create$0(CachingCatalog.java:264)
	at org.apache.iceberg.shaded.com.github.benmanes.caffeine.cache.BoundedLocalCache.lambda$doComputeIfAbsent$14(BoundedLocalCache.java:2406)
	at java.base/java.util.concurrent.ConcurrentHashMap.compute(Unknown Source)
	at org.apache.iceberg.shaded.com.github.benmanes.caffeine.cache.BoundedLocalCache.doComputeIfAbsent(BoundedLocalCache.java:2404)
	at org.apache.iceberg.shaded.com.github.benmanes.caffeine.cache.BoundedLocalCache.computeIfAbsent(BoundedLocalCache.java:2387)
	at org.apache.iceberg.shaded.com.github.benmanes.caffeine.cache.LocalCache.computeIfAbsent(LocalCache.java:108)
	at org.apache.iceberg.shaded.com.github.benmanes.caffeine.cache.LocalManualCache.get(LocalManualCache.java:62)
	at org.apache.iceberg.CachingCatalog$CachingTableBuilder.create(CachingCatalog.java:260)
	at org.apache.iceberg.spark.SparkCatalog.createTable(SparkCatalog.java:246)
	at org.apache.polaris.spark.SparkCatalog.createTable(SparkCatalog.java:153)
	at org.apache.spark.sql.connector.catalog.TableCatalog.createTable(TableCatalog.java:223)
	at org.apache.spark.sql.execution.datasources.v2.CreateTableExec.run(CreateTableExec.scala:44)
	at org.apache.spark.sql.execution.datasources.v2.V2CommandExec.result$lzycompute(V2CommandExec.scala:43)
	at org.apache.spark.sql.execution.datasources.v2.V2CommandExec.result(V2CommandExec.scala:43)
	at org.apache.spark.sql.execution.datasources.v2.V2CommandExec.executeCollect(V2CommandExec.scala:49)
	at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.$anonfun$applyOrElse$1(QueryExecution.scala:107)
	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$6(SQLExecution.scala:125)
	at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:201)
	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:108)
	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:900)
	at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:66)
	at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:107)
	at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:98)
	at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDownWithPruning$1(TreeNode.scala:461)
	at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(origin.scala:76)
	at org.apache.spark.sql.catalyst.trees.TreeNode.transformDownWithPruning(TreeNode.scala:461)
	at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDownWithPruning(LogicalPlan.scala:32)
	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning(AnalysisHelper.scala:267)
	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning$(AnalysisHelper.scala:263)
	at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:32)
	at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:32)
	at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:437)
	at org.apache.spark.sql.execution.QueryExecution.eagerlyExecuteCommands(QueryExecution.scala:98)
	at org.apache.spark.sql.execution.QueryExecution.commandExecuted$lzycompute(QueryExecution.scala:85)
	at org.apache.spark.sql.execution.QueryExecution.commandExecuted(QueryExecution.scala:83)
	at org.apache.spark.sql.Dataset.<init>(Dataset.scala:220)
	at org.apache.spark.sql.Dataset$.$anonfun$ofRows$2(Dataset.scala:100)
	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:900)
	at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:97)
	at org.apache.spark.sql.SparkSession.$anonfun$sql$4(SparkSession.scala:691)
	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:900)
	at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:682)
	at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:713)
	at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:744)
	at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:651)
	at org.apache.spark.sql.hive.thriftserver.SparkSQLDriver.run(SparkSQLDriver.scala:68)
	at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.processCmd(SparkSQLCLIDriver.scala:501)
	at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.$anonfun$processLine$1(SparkSQLCLIDriver.scala:619)
	at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.$anonfun$processLine$1$adapted(SparkSQLCLIDriver.scala:613)
	at scala.collection.Iterator.foreach(Iterator.scala:943)
	at scala.collection.Iterator.foreach$(Iterator.scala:943)
	at scala.collection.AbstractIterator.foreach(Iterator.scala:1431)
	at scala.collection.IterableLike.foreach(IterableLike.scala:74)
	at scala.collection.IterableLike.foreach$(IterableLike.scala:73)
	at scala.collection.AbstractIterable.foreach(Iterable.scala:56)
	at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.processLine(SparkSQLCLIDriver.scala:613)
	at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:310)
	at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
	at java.base/java.lang.reflect.Method.invoke(Unknown Source)
	at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
	at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:1034)
	at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:199)
	at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:222)
	at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:91)
	at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1125)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1134)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)


To Reproduce

I have a Docker Compose file to initialize the Polaris server:


services:

  polaris:
    image: apache/polaris:latest
    platform: linux/amd64
    ports:
      - "8181:8181"
      - "8182:8182"
    environment:
      AWS_ACCESS_KEY_ID: <pure-s3-access-key>
      AWS_SECRET_ACCESS_KEY: <pure-s3-secret-key>
      AWS_REGION: us-east-2
      AWS_ENDPOINT_URL_S3: <pure-s3-endpoint-url>
      AWS_ENDPOINT_URL_STS: <same pure s3 endpoint url as immediately above>
      POLARIS_BOOTSTRAP_CREDENTIALS: default-realm,root,secret
      # polaris.features."SUPPORTED_CATALOG_STORAGE_TYPES": "[\"FILE\",\"S3\",\"GCS\",\"AZURE\"]"
      polaris.features.DROP_WITH_PURGE_ENABLED: true # allow dropping tables from the SQL client
      polaris.realm-context.realms: default-realm
      polaris.features."SKIP_CREDENTIAL_SUBSCOPING_INDIRECTION": false
      polaris.features."SUPPORTED_CATALOG_STORAGE_TYPES": "[\"S3\"]" 

FYI, I get the same truncated error without the SKIP_CREDENTIAL_SUBSCOPING_INDIRECTION: false setting, and I also get the error when I pin the Docker image version to 1.0.1-incubating-rc0 manually (I've tried this on 1.0.0-incubating as well).
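
Since the 400 comes from the STS call itself, one way to narrow this down independently of Polaris is to call AssumeRole directly against the endpoint with the AWS CLI. This is just a diagnostic sketch, reusing the dummy role ARN from the catalog creation script below, and it assumes the FlashBlade endpoint is meant to speak the STS protocol at all:

# Diagnostic sketch: call AssumeRole directly to see the raw STS error body
AWS_ACCESS_KEY_ID=<pure-s3-access-key> \
AWS_SECRET_ACCESS_KEY=<pure-s3-secret-key> \
aws sts assume-role \
  --endpoint-url <pure-s3-endpoint-url> \
  --region us-east-2 \
  --role-arn arn:aws:iam::000000000000:role/dummy-polaris-role \
  --role-session-name polaris-debug

If this also returns a 400, at least the full error body from the endpoint should be visible here.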

I then run two scripts to configure the catalog and roles:


ACCESS_TOKEN=$(curl -X POST \
  http://localhost:8181/api/catalog/v1/oauth/tokens \
  -d 'grant_type=client_credentials&client_id=root&client_secret=secret&scope=PRINCIPAL_ROLE:ALL' \
  | jq -r '.access_token')

curl -i -X POST \
  -H "Authorization: Bearer $ACCESS_TOKEN" \
  http://localhost:8181/api/management/v1/catalogs \
  -H "Content-Type: application/json" \
  --data '{
    "name": "polariscatalog",
    "type": "INTERNAL",
    "properties": {
      "default-base-location": "s3://polaris-txl25-1",
      "s3.endpoint": "<pure-s3-endpoint>",
      "s3.path-style-access": "true",
      "s3.access-key-id": "<pure-s3-access-key>",
      "s3.secret-access-key": "<pure-s3-secret-key>",
      "s3.region": "us-east-2"
    },
    "storageConfigInfo": {
      "roleArn": "arn:aws:iam::000000000000:role/dummy-polaris-role",
      "storageType": "S3",
      "allowedLocations": [
        "s3://polaris-txl25-1/*"
      ]
    }
  }'

and

ACCESS_TOKEN=$(curl -X POST \
  http://localhost:8181/api/catalog/v1/oauth/tokens \
  -d 'grant_type=client_credentials&client_id=root&client_secret=secret&scope=PRINCIPAL_ROLE:ALL' \
  | jq -r '.access_token')

# Create a catalog admin role
curl -X PUT http://localhost:8181/api/management/v1/catalogs/polariscatalog/catalog-roles/catalog_admin/grants \
  -H "Authorization: Bearer $ACCESS_TOKEN" \
  -H "Content-Type: application/json" \
  --data '{"grant":{"type":"catalog", "privilege":"CATALOG_MANAGE_CONTENT"}}'

# Create a data engineer role
curl -X POST http://localhost:8181/api/management/v1/principal-roles \
  -H "Authorization: Bearer $ACCESS_TOKEN" \
  -H "Content-Type: application/json" \
  --data '{"principalRole":{"name":"data_engineer"}}'

# Connect the roles
curl -X PUT http://localhost:8181/api/management/v1/principal-roles/data_engineer/catalog-roles/polariscatalog \
  -H "Authorization: Bearer $ACCESS_TOKEN" \
  -H "Content-Type: application/json" \
  --data '{"catalogRole":{"name":"catalog_admin"}}'

# Give root the data engineer role
curl -X PUT http://localhost:8181/api/management/v1/principals/root/principal-roles \
  -H "Authorization: Bearer $ACCESS_TOKEN" \
  -H "Content-Type: application/json" \
  --data '{"principalRole": {"name":"data_engineer"}}'

These scripts create the catalog and then set up a role for the user.
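
To double-check what Polaris actually stored, both the catalog and the grant chain can be read back with GETs on the corresponding management endpoints; a quick sanity-check sketch:

# Confirm the catalog properties (e.g. that s3.endpoint survived creation)
curl -s http://localhost:8181/api/management/v1/catalogs/polariscatalog \
  -H "Authorization: Bearer $ACCESS_TOKEN" | jq .

# Confirm the role assignments
curl -s http://localhost:8181/api/management/v1/principal-roles/data_engineer/catalog-roles/polariscatalog \
  -H "Authorization: Bearer $ACCESS_TOKEN" | jq .
curl -s http://localhost:8181/api/management/v1/principals/root/principal-roles \
  -H "Authorization: Bearer $ACCESS_TOKEN" | jq .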

Then I use the following Dockerfile to create a Spark environment that includes my spark-submit script:

FROM docker-hub/spark:3.5.6

ENV AWS_ACCESS_KEY_ID=<pure-s3-access-key>
ENV AWS_SECRET_ACCESS_KEY=<pure-s3-secret-key>
ENV AWS_ENDPOINT_URL=<pure-s3-endpoint>
ENV AWS_REGION=us-east-2

COPY txl25-polaris-sql.sh /opt/spark/bin/txl25-polaris-sql.sh

USER root
RUN apt-get update && apt-get install -y libcap2-bin libcap-dev less
USER spark

Then, from within the running Spark Docker container (on host networking so that it can talk to Polaris), I execute the following:

./spark-sql \
  --packages org.apache.polaris:polaris-spark-3.5_2.12:1.0.0-incubating,org.apache.iceberg:iceberg-aws-bundle:1.9.0,io.delta:delta-spark_2.12:3.3.1 \
  --conf spark.driver.extraJavaOptions="-Divy.cache.dir=/tmp -Divy.home=/tmp" \
  --conf spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions,io.delta.sql.DeltaSparkSessionExtension \
  --conf spark.sql.catalog.lorna.warehouse=polariscatalog \
  --conf spark.sql.catalog.lorna.header.X-Iceberg-Access-Delegation=vended-credentials \
  --conf spark.sql.catalog.lorna=org.apache.polaris.spark.SparkCatalog \
  --conf spark.sql.catalog.lorna.uri=http://localhost:8181/api/catalog \
  --conf spark.sql.catalog.lorna.credential='root:secret' \
  --conf spark.sql.catalog.lorna.scope='PRINCIPAL_ROLE:ALL' \
  --conf spark.sql.catalog.lorna.token-refresh-enabled=true \
  --conf spark.sql.catalog.lorna.io-impl=org.apache.iceberg.io.ResolvingFileIO \
  --conf spark.hadoop.fs.s3.impl=org.apache.hadoop.fs.s3a.S3AFileSystem \
  --conf spark.sql.catalog.lorna.s3.region=us-east-2 \
  --conf spark.sql.catalog.lorna.s3.endpoint=<pure-s3-endpoint>
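
One thing I notice writing this up: the catalog properties set s3.path-style-access to true, but I don't pass the client-side equivalent to Spark. In case it matters, that flag would presumably be:

--conf spark.sql.catalog.lorna.s3.path-style-access=true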

From within the Spark SQL shell, I execute the following:


USE lorna;
CREATE NAMESPACE ICE_NS;
USE NAMESPACE ICE_NS;
CREATE TABLE PERSON (id int, name string) USING iceberg;

And then the error reported above occurs.

Actual Behavior

No response

Expected Behavior

I would expect the system to be able to interact with my custom S3 endpoint, but at the very least the error message should be fully formed so as to give a better clue as to why this is failing.

Additional context

I do get these warnings when selecting my Spark catalog, so maybe this is relevant:

spark-sql (default)> USE lorna;
25/07/29 18:23:08 WARN AuthManagers: Inferring rest.auth.type=oauth2 since property credential was provided. Please explicitly set rest.auth.type to avoid this warning.
25/07/29 18:23:08 WARN OAuth2Manager: Iceberg REST client is missing the OAuth2 server URI configuration and defaults to http://localhost:8181/api/catalog/v1/oauth/tokens. This automatic fallback will be removed in a future Iceberg release.It is recommended to configure the OAuth2 endpoint using the 'oauth2-server-uri' property to be prepared. This warning will disappear if the OAuth2 endpoint is explicitly configured. See https://github.com/apache/iceberg/issues/10537
25/07/29 18:23:10 WARN ObjectStore: Failed to get database global_temp, returning NoSuchObjectException
Time taken: 2.29 seconds
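
The OAuth2Manager warning suggests setting the token endpoint explicitly; using the fallback URI the warning itself names, the extra flag would be:

--conf spark.sql.catalog.lorna.oauth2-server-uri=http://localhost:8181/api/catalog/v1/oauth/tokens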

Here's the last bit of my Polaris Docker logs:

2025-07-29 18:47:05,312 INFO  [io.qua.htt.access-log] [,default-realm] [,,,] (executor-thread-1) 172.19.0.1 - - [29/Jul/2025:18:47:05 +0000] "POST /api/catalog/v1/oauth/tokens HTTP/1.1" 200 753
2025-07-29 18:47:05,451 INFO  [io.qua.htt.access-log] [,default-realm] [,,,] (executor-thread-1) 172.19.0.1 - root [29/Jul/2025:18:47:05 +0000] "GET /api/catalog/v1/config?warehouse=polariscatalog HTTP/1.1" 200 2351
2025-07-29 18:47:05,532 INFO  [io.qua.htt.access-log] [,default-realm] [,,,] (executor-thread-1) 172.19.0.1 - root [29/Jul/2025:18:47:05 +0000] "GET /api/catalog/v1/config?warehouse=polariscatalog HTTP/1.1" 200 2351
2025-07-29 18:47:15,943 INFO  [org.apa.pol.ser.exc.IcebergExceptionMapper] [,default-realm] [,,,] (executor-thread-1) Handling runtimeException Namespace does not exist: ice_ns
2025-07-29 18:47:15,961 INFO  [io.qua.htt.access-log] [,default-realm] [,,,] (executor-thread-1) 172.19.0.1 - root [29/Jul/2025:18:47:15 +0000] "GET /api/catalog/v1/polariscatalog/namespaces/ice_ns HTTP/1.1" 404 101
2025-07-29 18:47:16,066 INFO  [org.apa.pol.ser.cat.ice.IcebergCatalogHandler] [,default-realm] [,,,] (executor-thread-1) Initializing non-federated catalog
2025-07-29 18:47:16,109 INFO  [io.qua.htt.access-log] [,default-realm] [,,,] (executor-thread-1) 172.19.0.1 - root [29/Jul/2025:18:47:16 +0000] "POST /api/catalog/v1/polariscatalog/namespaces HTTP/1.1" 200 96
2025-07-29 18:48:00,501 INFO  [org.apa.pol.ser.cat.ice.IcebergCatalogHandler] [,default-realm] [,,,] (executor-thread-1) Initializing non-federated catalog
2025-07-29 18:48:00,525 INFO  [io.qua.htt.access-log] [,default-realm] [,,,] (executor-thread-1) 172.19.0.1 - root [29/Jul/2025:18:48:00 +0000] "GET /api/catalog/v1/polariscatalog/namespaces?pageToken= HTTP/1.1" 200 50
2025-07-29 18:48:06,053 INFO  [org.apa.pol.ser.cat.ice.IcebergCatalogHandler] [,default-realm] [,,,] (executor-thread-1) Initializing non-federated catalog
2025-07-29 18:48:06,060 INFO  [io.qua.htt.access-log] [,default-realm] [,,,] (executor-thread-1) 172.19.0.1 - root [29/Jul/2025:18:48:06 +0000] "GET /api/catalog/v1/polariscatalog/namespaces/ice_ns HTTP/1.1" 200 96
2025-07-29 18:48:26,456 INFO  [org.apa.pol.ser.exc.IcebergExceptionMapper] [,default-realm] [,,,] (executor-thread-1) Handling runtimeException Table does not exist: ice_ns.person
2025-07-29 18:48:26,459 INFO  [io.qua.htt.access-log] [,default-realm] [,,,] (executor-thread-1) 172.19.0.1 - root [29/Jul/2025:18:48:26 +0000] "GET /api/catalog/v1/polariscatalog/namespaces/ice_ns/tables/person?snapshots=all HTTP/1.1" 404 100
2025-07-29 18:48:26,485 INFO  [org.apa.pol.ser.exc.IcebergExceptionMapper] [,default-realm] [,,,] (executor-thread-4) Handling runtimeException Generic table does not exist: ice_ns.person
2025-07-29 18:48:26,487 INFO  [io.qua.htt.access-log] [,default-realm] [,,,] (executor-thread-4) 172.19.0.1 - root [29/Jul/2025:18:48:26 +0000] "GET /api/catalog/polaris/v1/polariscatalog/namespaces/ice_ns/generic-tables/person HTTP/1.1" 404 108
2025-07-29 18:48:26,614 INFO  [org.apa.pol.ser.cat.ice.IcebergCatalogHandler] [,default-realm] [,,,] (executor-thread-4) Initializing non-federated catalog
2025-07-29 18:48:26,640 INFO  [org.apa.ice.BaseMetastoreCatalog] [,default-realm] [,,,] (executor-thread-4) Table properties set at catalog level through catalog properties: {}
2025-07-29 18:48:26,650 INFO  [org.apa.ice.BaseMetastoreCatalog] [,default-realm] [,,,] (executor-thread-4) Table properties enforced at catalog level through catalog properties: {}
2025-07-29 18:48:27,087 INFO  [org.apa.pol.ser.exc.IcebergExceptionMapper] [,default-realm] [,,,] (executor-thread-4) Handling runtimeException Failed to get subscoped credentials: (Service: Sts, Status Code: 400, Request ID: null) (SDK Attempt Count: 1)
2025-07-29 18:48:27,088 INFO  [io.qua.htt.access-log] [,default-realm] [,,,] (executor-thread-4) 172.19.0.1 - root [29/Jul/2025:18:48:27 +0000] "POST /api/catalog/v1/polariscatalog/namespaces/ice_ns/tables HTTP/1.1" 422 183
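
Since the interesting part of the STS response appears to be swallowed before it reaches the logs, turning up AWS SDK logging inside the Polaris container might surface the raw 400 body. A sketch for the compose file, assuming the standard Quarkus env-var mapping applies to this image (the double underscores stand in for the quoted logger category in quarkus.log.category."software.amazon.awssdk".level):

    environment:
      QUARKUS_LOG_CATEGORY__SOFTWARE_AMAZON_AWSSDK__LEVEL: DEBUG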

System information

Tested with Polaris Docker images 1.0.1-incubating-rc0 and 1.0.0-incubating, using a Spark Docker image versioned 3.5.6.

Labels

enhancement (New feature or request)