Iceberg REST Catalog: Support for vended credentials for Azure #23238

@c-thiel

I am currently extending the integration tests for our Iceberg REST catalog implementation.

The S3 integration with Trino works nicely, but I can't get vended credentials up and running for Azure.

My configuration looks as follows:

    CREATE CATALOG test_azure USING iceberg
    WITH (
        "iceberg.catalog.type" = 'rest',
        "iceberg.rest-catalog.uri" = 'https://api.tabular.io/ws/',
        "iceberg.rest-catalog.warehouse" = 'cth-azure',
        "iceberg.rest-catalog.security" = 'OAUTH2',
        "iceberg.rest-catalog.oauth2.token" = '{token}',
        "iceberg.rest-catalog.vended-credentials-enabled" = 'true',
        "fs.native-azure.enabled" = 'true'
    )

I can use all endpoints as usual, but data operations fail with the following error:

TrinoExternalError(type=EXTERNAL, name=ICEBERG_FILESYSTEM_ERROR, message="Failed checking new table's location: abfss://<filesystem>@<storage-account-name>.dfs.core.windows.net/0191b3ad-6c11-7fc2-bb2a-12a7648dc115/my_table-a90cff2771dd44d3aa55235cf4f77a47", query_id=20240902_165932_00260_3zngw)

I believe the config attribute returned for the table,
adls.sas-token.<storage-account-name>.dfs.core.windows.net: "skoid=....",
is not being used. Instead, Trino tries to load credentials from well-known locations:

2024-09-02T16:59:32.697Z        INFO    Query-20240902_165932_00258_3zngw-817   com.azure.identity.ChainedTokenCredential       Azure Identity => Attempted credential EnvironmentCredential is unavailable.
2024-09-02T16:59:32.697Z        INFO    Query-20240902_165932_00258_3zngw-817   com.azure.identity.ChainedTokenCredential       Azure Identity => Attempted credential WorkloadIdentityCredential is unavailable.
2024-09-02T16:59:32.698Z        WARN    ForkJoinPool.commonPool-worker-15       com.microsoft.aad.msal4j.ConfidentialClientApplication  [Correlation ID: aac0fdfd-a2cf-4916-8da2-704b4a23630c] Execution of class com.microsoft.aad.msal4j.AcquireTokenByClientCredentialSupplier failed: java.util.concurrent.ExecutionException: com.azure.identity.CredentialUnavailableException: ManagedIdentityCredential authentication unavailable. Connection to IMDS endpoint cannot be established, Connection refused.
2024-09-02T16:59:32.698Z        INFO    ForkJoinPool.commonPool-worker-15       com.azure.identity.ChainedTokenCredential       Azure Identity => Attempted credential ManagedIdentityCredential is unavailable.
2024-09-02T16:59:32.698Z        INFO    ForkJoinPool.commonPool-worker-16       com.azure.identity.ChainedTokenCredential       Azure Identity => Attempted credential SharedTokenCacheCredential is unavailable.
2024-09-02T16:59:32.698Z        INFO    ForkJoinPool.commonPool-worker-16       com.azure.identity.ChainedTokenCredential       Azure Identity => Attempted credential IntelliJCredential is unavailable.
2024-09-02T16:59:32.701Z        INFO    ForkJoinPool.commonPool-worker-16       com.azure.identity.ChainedTokenCredential       Azure Identity => Attempted credential AzureCliCredential is unavailable.
2024-09-02T16:59:32.705Z        INFO    ForkJoinPool.commonPool-worker-16       com.azure.identity.ChainedTokenCredential       Azure Identity => Attempted credential AzurePowerShellCredential is unavailable.
2024-09-02T16:59:32.706Z        INFO    ForkJoinPool.commonPool-worker-16       com.azure.identity.ChainedTokenCredential       Azure Identity => Attempted credential AzureDeveloperCliCredential is unavailable.
2024-09-02T16:59:32.707Z        ERROR   ForkJoinPool.commonPool-worker-16       com.azure.core.implementation.AccessTokenCache  {"az.sdk.message":"Failed to acquire a new access token.","exception":"EnvironmentCredential authentication unavailable. Environment variables are not fully configured.To mitigate this issue, please refer to the troubleshooting guidelines here at https://aka.ms/azsdk/java/identity/environmentcredential/troubleshoot\r\nWorkloadIdentityCredential authentication unavailable. The workload options are not fully configured. See the troubleshooting guide for more information. https://aka.ms/azsdk/java/identity/workloadidentitycredential/troubleshoot\r\nManaged Identity authentication is not available.\r\nSharedTokenCacheCredential authentication unavailable. No accounts were found in the cache.\r\nIntelliJ Authentication not available. Please log in with Azure Tools for IntelliJ plugin in the IDE. Fore more details refer to the troubleshooting guidelines here at https://aka.ms/azsdk/java/identity/intellijcredential/troubleshoot\r\nAzureCliCredential authentication unavailable. Azure CLI not installed.To mitigate this issue, please refer to the troubleshooting guidelines here at https://aka.ms/azsdk/java/identity/azclicredential/troubleshoot\r\nEncountered error when deserializing response from Azure Power Shell.\r\nAzureDeveloperCliCredential authentication unavailable. Azure Developer CLI not installed.To mitigate this issue, please refer to the troubleshooting guidelines here at https://aka.ms/azsdk/java/identity/azdevclicredential/troubleshootTo mitigate this issue, please refer to the troubleshooting guidelines here at https://aka.ms/azure-identity-java-default-azure-credential-troubleshoot"}
2024-09-02T16:59:32.707Z        INFO    dispatcher-query-86     io.trino.event.QueryMonitor     TIMELINE: Query 20240902_165932_00258_3zngw :: FAILED (ICEBERG_FILESYSTEM_ERROR) :: elapsed 601ms :: planning 0ms :: waiting 0ms :: scheduling 601ms :: running 0ms :: finishing 601ms :: begin 2024-09-02T16:59:32.105Z :: end 2024-09-02T16:59:32.706Z

It would be great if someone could check why the returned SAS token is not being used. The Iceberg Java library supports it, and Spark respects the token as well.
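
To make the expected behavior concrete, here is a minimal sketch of the lookup, not Trino's or Iceberg's actual code. It assumes the table config returned by the catalog is a flat string map and that SAS tokens live under keys starting with `adls.sas-token.`, as in the attribute quoted above. The function name `extract_sas_tokens`, the account name `myaccount`, and the token value are hypothetical.

```python
# Assumption: vended SAS tokens appear in the table config under this key
# prefix, followed by the storage account host name.
SAS_TOKEN_PREFIX = "adls.sas-token."


def extract_sas_tokens(table_config: dict[str, str]) -> dict[str, str]:
    """Return a map of storage-account host -> SAS token (hypothetical helper)."""
    return {
        key[len(SAS_TOKEN_PREFIX):]: value
        for key, value in table_config.items()
        if key.startswith(SAS_TOKEN_PREFIX) and value
    }


# Illustrative config only; the account name and token are made up.
config = {
    "adls.sas-token.myaccount.dfs.core.windows.net": "skoid=example",
    "some.other.property": "x",
}
tokens = extract_sas_tokens(config)
```

If a token is found for the host in the table location, the file system client would be expected to authenticate with it rather than fall back to the default Azure credential chain seen in the log above.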

Let me know if I can support this in any way!
