
Conversation


@goktugkose goktugkose commented Oct 9, 2025

This commit introduces BigQuery Metastore to Trino.

Description

BigQuery Metastore can be reached from external engines such as Spark and Flink, but an integration with Trino has not been provided yet. This PR introduces a BigQuery Metastore integration to Trino.

Relates to: Google Official Documentation
Inspired by: apache/iceberg#12808

The following configuration options can be used to interact with BigQuery Metastore.

```properties
iceberg.catalog.type=bigquery_metastore
iceberg.bqms-catalog.project-id=<gcp-project-id>
iceberg.bqms-catalog.location=<gcp-project-location>
iceberg.bqms-catalog.list-all-tables=<flag-for-listing-tables>
iceberg.bqms-catalog.warehouse=<warehouse-path-in-gs-format>
iceberg.bqms-catalog.json-key-file-path=<key-file-path>
fs.native-gcs.enabled=true
gcs.project-id=<gcs-bucket-project-id>
gcs.json-key-file-path=<key-file-path-for-gcs>
```
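
For illustration, a filled-in catalog properties file might look like the following. All values here are hypothetical placeholders; connector.name=iceberg is the standard entry point for the Trino Iceberg connector, and the fs.native-gcs/gcs.* options belong to Trino's native GCS file system support.

```properties
# Hypothetical example values; adjust to your GCP project and bucket.
connector.name=iceberg
iceberg.catalog.type=bigquery_metastore
iceberg.bqms-catalog.project-id=my-gcp-project
iceberg.bqms-catalog.location=us-central1
iceberg.bqms-catalog.list-all-tables=true
iceberg.bqms-catalog.warehouse=gs://my-bucket/warehouse
iceberg.bqms-catalog.json-key-file-path=/etc/trino/gcp-service-account.json
fs.native-gcs.enabled=true
gcs.project-id=my-gcp-project
gcs.json-key-file-path=/etc/trino/gcp-service-account.json
```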

Additional context and related issues

TODO: An iceberg.bqms-catalog.bq_connection option will be added.

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
(X) Release notes are required. Please propose a release note for me.
( ) Release notes are required, with the following suggested text:

Summary by Sourcery

Add BigQuery Metastore catalog support to the Trino Iceberg plugin

New Features:

  • Introduce BIGQUERY_METASTORE catalog type in the Iceberg plugin
  • Implement TrinoBigQueryMetastoreCatalog for namespace and table operations via BigQuery Metastore
  • Add BigQueryMetastoreIcebergTableOperations and provider for managing Iceberg table metadata in BigQuery
  • Expose configuration options for project ID, location, warehouse path, credentials, and list-all-tables flag

Build:

  • Add Google BigQuery API, google-cloud-bigquery, and iceberg-bigquery dependencies to both plugin and root POMs

Tests:

  • Add unit tests for IcebergBigQueryMetastoreCatalogConfig mapping
  • Provide a smoke integration test for the BigQuery Metastore connector


cla-bot bot commented Oct 9, 2025

Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: Salih Göktuğ Köse.
This is most likely caused by a git client misconfiguration; please make sure to:

  1. Check if your git client is configured with an email to sign commits: git config --list | grep email
  2. If not, set it up using git config --global user.email [email protected]
  3. Make sure that the git commit email is configured in your GitHub account settings, see https://github.com/settings/emails


sourcery-ai bot commented Oct 9, 2025

Reviewer's Guide

This PR integrates BigQuery Metastore into the Trino Iceberg plugin by adding Google BigQuery client dependencies, defining configuration and Guice modules, registering a new catalog type, and implementing both the TrinoCatalog and IcebergTableOperations to map catalog operations to BigQuery Metastore API calls, alongside utility classes and tests.

Class diagram for new BigQuery Metastore integration classes

```mermaid
classDiagram
    class IcebergBigQueryMetastoreCatalogConfig {
        +String projectID
        +String location
        +String listAllTables
        +String warehouse
        +String jsonKeyFilePath
        +setProjectID(String)
        +setLocation(String)
        +setListAllTables(String)
        +setWarehouse(String)
        +setJsonKeyFilePath(String)
        +getProjectID()
        +getLocation()
        +getListAllTables()
        +getWarehouse()
        +getJsonKeyFilePath()
    }
    class TrinoBigQueryMetastoreCatalogFactory {
        +create(ConnectorIdentity): TrinoCatalog
        -IcebergTableOperationsProvider tableOperationsProvider
        -BigQueryMetastoreClientImpl bigQueryMetastoreClient
        -CatalogName catalogName
        -TypeManager typeManager
        -TrinoFileSystemFactory fileSystemFactory
        -ForwardingFileIoFactory fileIoFactory
        -String projectID
        -String gcpLocation
        -String listAllTables
        -String warehouse
        -boolean isUniqueTableLocation
    }
    class TrinoBigQueryMetastoreCatalog {
        +namespaceExists(ConnectorSession, String): boolean
        +listNamespaces(ConnectorSession): List<String>
        +dropNamespace(ConnectorSession, String)
        +createNamespace(ConnectorSession, String, Map<String, Object>, TrinoPrincipal)
        +listTables(ConnectorSession, Optional<String>): List<TableInfo>
        +listIcebergTables(ConnectorSession, Optional<String>): List<SchemaTableName>
        +dropTable(ConnectorSession, SchemaTableName)
        +loadTable(ConnectorSession, SchemaTableName): BaseTable
        +defaultTableLocation(ConnectorSession, SchemaTableName): String
        -BigQueryMetastoreClientImpl client
        -String projectID
        -String gcpLocation
        -String warehouse
        -boolean listAllTables
    }
    class BigQueryMetastoreIcebergTableOperations {
        +commitNewTable(TableMetadata)
        +commitToExistingTable(TableMetadata, TableMetadata)
        +getRefreshedLocation(boolean): String
        -BigQueryMetastoreClientImpl client
        -String projectId
        -String datasetId
        -String tableId
        -TableReference tableReference
        -String tableName
    }
    class BigQueryMetastoreIcebergTableOperationsProvider {
        +createTableOperations(TrinoCatalog, ConnectorSession, String, String, Optional<String>, Optional<String>): IcebergTableOperations
        -TrinoFileSystemFactory fileSystemFactory
        -ForwardingFileIoFactory fileIoFactory
        -BigQueryMetastoreClientImpl bqmsClient
        -String projectId
    }
    class BigQueryMetastoreIcebergUtil {
        +createExternalCatalogTableOptions(String, Map<String, String>): ExternalCatalogTableOptions
        +createExternalCatalogDatasetOptions(String, Map<String, Object>): ExternalCatalogDatasetOptions
    }
    class IcebergBigQueryMetastoreModule {
        +setup(Binder)
        +createBigQueryMetastoreClient(IcebergBigQueryMetastoreCatalogConfig): BigQueryMetastoreClientImpl
    }
    IcebergBigQueryMetastoreCatalogConfig <.. IcebergBigQueryMetastoreModule
    IcebergBigQueryMetastoreCatalogConfig <.. TrinoBigQueryMetastoreCatalogFactory
    TrinoBigQueryMetastoreCatalogFactory <|.. TrinoBigQueryMetastoreCatalog
    BigQueryMetastoreIcebergTableOperationsProvider <|.. BigQueryMetastoreIcebergTableOperations
    BigQueryMetastoreIcebergTableOperationsProvider <.. TrinoBigQueryMetastoreCatalog
    BigQueryMetastoreIcebergUtil <.. BigQueryMetastoreIcebergTableOperations
    BigQueryMetastoreIcebergUtil <.. TrinoBigQueryMetastoreCatalog
```

Class diagram for updated CatalogType enum

```mermaid
classDiagram
    class CatalogType {
        <<enum>>
        JDBC
        NESSIE
        SNOWFLAKE
        BIGQUERY_METASTORE
    }
```

File-Level Changes

Change: Add Google BigQuery and GCS filesystem dependencies
  • Add com.google.api, google-api-services-bigquery, google-cloud-bigquery, and google-cloud-core dependencies
  • Enable trino-filesystem-gcs and jakarta.inject-api
  • Include the iceberg-bigquery module and remove an unused Hadoop test dependency
Files: plugin/trino-iceberg/pom.xml, pom.xml

Change: Register the new BIGQUERY_METASTORE catalog
  • Extend the CatalogType enum with BIGQUERY_METASTORE
  • Bind IcebergBigQueryMetastoreModule in IcebergCatalogModule
Files: plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/CatalogType.java, plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/catalog/IcebergCatalogModule.java

Change: Introduce configuration for the BigQuery Metastore catalog
  • Create IcebergBigQueryMetastoreCatalogConfig with @Config annotations (see the sketch after this list)
  • Add a unit test for default and explicit config mappings
Files: plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/catalog/bigquery/IcebergBigQueryMetastoreCatalogConfig.java, plugin/trino-iceberg/src/test/java/io/trino/plugin/iceberg/catalog/bigquery/TestIcebergBigQueryMetastoreCatalogConfig.java

Change: Add a Guice module for the BigQuery Metastore integration
  • Bind the config, table operations provider, and catalog factory
  • Provide a BigQueryMetastoreClientImpl singleton from the JSON key and project/location
Files: plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/catalog/bigquery/IcebergBigQueryMetastoreModule.java

Change: Implement the TrinoCatalog factory
  • Create TrinoBigQueryMetastoreCatalogFactory, injecting the config and dependencies
  • Instantiate TrinoBigQueryMetastoreCatalog with the client and session parameters
Files: plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/catalog/bigquery/TrinoBigQueryMetastoreCatalogFactory.java

Change: Implement TrinoBigQueryMetastoreCatalog
  • Extend AbstractTrinoCatalog to handle namespace/table operations via BigQueryMetastoreClientImpl
  • Add caching for table metadata and conversion utilities
Files: plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/catalog/bigquery/TrinoBigQueryMetastoreCatalog.java

Change: Implement BigQuery-based IcebergTableOperations
  • Define BigQueryMetastoreIcebergTableOperations for commit and refresh logic
  • Handle the metadata location, etag checks, and snapshot summary parameters
Files: plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/catalog/bigquery/BigQueryMetastoreIcebergTableOperations.java

Change: Provide a TableOperationsProvider for BigQuery Metastore
  • Implement BigQueryMetastoreIcebergTableOperationsProvider
  • Instantiate operations with the file IO, session, and project ID
Files: plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/catalog/bigquery/BigQueryMetastoreIcebergTableOperationsProvider.java

Change: Create a utility for BigQuery external catalog options
  • Add BigQueryMetastoreIcebergUtil to build ExternalCatalogTableOptions and ExternalCatalogDatasetOptions
Files: plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/catalog/bigquery/BigQueryMetastoreIcebergUtil.java

Change: Add connector smoke tests for BigQuery Metastore
  • Implement TestIcebergBigQueryMetastoreCatalogConnectorSmokeTest, which requires a real GCP environment
  • Disable BaseTrinoCatalogTest for BigQuery Metastore because the client class is final
Files: plugin/trino-iceberg/src/test/java/io/trino/plugin/iceberg/catalog/bigquery/TestIcebergBigQueryMetastoreCatalogConnectorSmokeTest.java, plugin/trino-iceberg/src/test/java/io/trino/plugin/iceberg/catalog/bigquery/TestTrinoBigQueryMetastoreCatalog.java
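
To make the property-to-config mapping concrete, here is a hedged sketch of what an airlift-style config class for two of these properties could look like. The method names are taken from the class diagram above; the exact code in the PR may differ.

```java
import io.airlift.configuration.Config;

// Sketch only: models how iceberg.bqms-catalog.* properties map onto setters,
// following the usual Trino/airlift configuration pattern.
public class IcebergBigQueryMetastoreCatalogConfig
{
    private String projectId;
    private String location;

    public String getProjectID()
    {
        return projectId;
    }

    @Config("iceberg.bqms-catalog.project-id")
    public IcebergBigQueryMetastoreCatalogConfig setProjectID(String projectId)
    {
        this.projectId = projectId;
        return this;
    }

    public String getLocation()
    {
        return location;
    }

    @Config("iceberg.bqms-catalog.location")
    public IcebergBigQueryMetastoreCatalogConfig setLocation(String location)
    {
        this.location = location;
        return this;
    }

    // list-all-tables, warehouse, and json-key-file-path follow the same pattern
}
```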


@github-actions github-actions bot added the iceberg Iceberg connector label Oct 9, 2025

@sourcery-ai sourcery-ai bot left a comment


Hey there - I've reviewed your changes - here's some feedback:

  • In TrinoBigQueryMetastoreCatalog.toDatasetReference you call Arrays.stream(namespace.split("\\.")).toList().getFirst(), which won’t compile; use namespace.split("\\.")[0] or stream.findFirst() instead and simplify your namespace splitting logic (see the sketch after this list).
  • The config keys in IcebergBigQueryMetastoreCatalogConfig (iceberg.bqms-catalog.*) don’t match the constants in TrinoBigQueryMetastoreCatalog (gcp.bigquery.*), please unify property names or remove unused constants to avoid confusion.
  • Add precondition checks in TrinoBigQueryMetastoreCatalogFactory to fail fast when mandatory settings (project‐id, location, warehouse) are missing so users get a clear error at startup rather than NPEs later.
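
A minimal sketch of the namespace-splitting simplification suggested in the first bullet (assuming a non-empty, dot-separated namespace string; this is illustrative, not the PR's actual code):

```java
import java.util.Arrays;
import java.util.Optional;

class NamespaceSplitExample
{
    // Direct array access: first dot-separated component of e.g. "dataset.table".
    static String firstPart(String namespace)
    {
        return namespace.split("\\.", 2)[0];
    }

    // Stream-based alternative that avoids List.getFirst() entirely.
    static Optional<String> firstPartOptional(String namespace)
    {
        return Arrays.stream(namespace.split("\\.")).findFirst();
    }
}
```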
## Individual Comments

### Comment 1
<location> `plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/catalog/bigquery/BigQueryMetastoreIcebergTableOperations.java:170-174` </location>
<code_context>
+    }
+
+    /** Update table properties with concurrent update detection using etag. */
+    private void updateTable(
+            String oldMetadataLocation, String newMetadataLocation, TableMetadata metadata)
+    {
+        Table table = client.load(tableReference);
+        if (table.getEtag().isEmpty()) {
+            throw new ValidationException(
+                    "Etag of legacy table %s is empty, manually update the table via the BigQuery API or"
</code_context>

<issue_to_address>
**issue (bug_risk):** Etag check may not be robust for null values.

Using `isEmpty()` directly on `getEtag()` can cause a NullPointerException if the value is null. Please use a null-safe check such as `Strings.isNullOrEmpty(table.getEtag())`.
</issue_to_address>
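
For illustration, a null-safe variant of the check might look like this (a sketch; Strings is Guava's com.google.common.base.Strings, and the Table/ValidationException types follow the snippet above):

```java
import com.google.api.services.bigquery.model.Table;
import com.google.common.base.Strings;
import org.apache.iceberg.exceptions.ValidationException;

class EtagCheckExample
{
    // Guards against both null and empty etags before a concurrent-update check.
    static void requireEtag(Table table, String tableName)
    {
        if (Strings.isNullOrEmpty(table.getEtag())) {
            throw new ValidationException("Etag of legacy table %s is empty", tableName);
        }
    }
}
```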

### Comment 2
<location> `plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/catalog/bigquery/IcebergBigQueryMetastoreModule.java:63-68` </location>
<code_context>
+        if (config.getLocation() != null) {
+            optionsBuilder.setLocation(config.getLocation());
+        }
+        if (config.getJsonKeyFilePath() != null) {
+            try (FileInputStream fs = new FileInputStream(config.getJsonKeyFilePath())) {
+                optionsBuilder.setCredentials(ServiceAccountCredentials.fromStream(fs));
+            }
+            catch (Exception e) {
+                throw new CredentialException("Unable to locate GCP Service Account JSON file");
+            }
</code_context>

<issue_to_address>
**suggestion (bug_risk):** Credential exception handling may mask underlying IO errors.

Consider logging or including the original exception details in the CredentialException to aid debugging and avoid losing information about the underlying error.

```suggestion
            try (FileInputStream fs = new FileInputStream(config.getJsonKeyFilePath())) {
                optionsBuilder.setCredentials(ServiceAccountCredentials.fromStream(fs));
            }
            catch (Exception e) {
                throw new CredentialException("Unable to locate GCP Service Account JSON file: " + e.getMessage(), e);
            }
```
</issue_to_address>

### Comment 3
<location> `plugin/trino-iceberg/src/test/java/io/trino/plugin/iceberg/catalog/bigquery/TestIcebergBigQueryMetastoreCatalogConnectorSmokeTest.java:128-83` </location>
<code_context>
+
+public class TestIcebergBigQueryMetastoreCatalogConfig
+{
+    @Test
+    public void testDefaults()
+    {
</code_context>

<issue_to_address>
**suggestion (testing):** Missing tests for error scenarios with invalid configuration/environment variables.

Please add tests to cover cases where required environment variables are missing or invalid, to confirm the connector handles configuration errors appropriately.

Suggested implementation:

```java
    @Nested
    class ConfigurationErrorTests
    {
        @Test
        public void testMissingRequiredEnvironmentVariable()
        {
            // Simulate missing environment variable
            String originalValue = System.getenv("BIGQUERY_PROJECT_ID");
            try {
                // Unset the environment variable (using reflection hack for test only)
                TestUtils.unsetEnv("BIGQUERY_PROJECT_ID");
                IcebergConfig config = new IcebergConfig();
                config.setBigQueryProjectId(null); // explicitly set to null

                Exception exception = assertThrows(IllegalArgumentException.class, () -> {
                    new IcebergBigQueryMetastoreCatalog(config);
                });
                assertTrue(exception.getMessage().contains("BIGQUERY_PROJECT_ID"));
            }
            finally {
                // Restore environment variable
                if (originalValue != null) {
                    TestUtils.setEnv("BIGQUERY_PROJECT_ID", originalValue);
                }
            }
        }

        @Test
        public void testInvalidEnvironmentVariableValue()
        {
            String originalValue = System.getenv("BIGQUERY_PROJECT_ID");
            try {
                TestUtils.setEnv("BIGQUERY_PROJECT_ID", "!!!invalid_project_id!!!");
                IcebergConfig config = new IcebergConfig();
                config.setBigQueryProjectId("!!!invalid_project_id!!!");

                Exception exception = assertThrows(RuntimeException.class, () -> {
                    new IcebergBigQueryMetastoreCatalog(config);
                });
                assertTrue(exception.getMessage().contains("invalid project id"));
            }
            finally {
                if (originalValue != null) {
                    TestUtils.setEnv("BIGQUERY_PROJECT_ID", originalValue);
                }
            }
        }
    }

```

- You will need to implement `TestUtils.setEnv` and `TestUtils.unsetEnv` methods to manipulate environment variables for testing purposes. This can be done using reflection (see various Java testing guides for details).
- Ensure that `IcebergBigQueryMetastoreCatalog` and `IcebergConfig` throw appropriate exceptions when required environment variables are missing or invalid.
- Adjust exception types and messages in the assertions to match your actual implementation.
- If your connector uses a different configuration mechanism, adapt the test setup accordingly.
</issue_to_address>



Member

@ebyhr ebyhr left a comment


I don't think we want to add this catalog. See #26219

@goktugkose
Author

I don't think we want to add this catalog. See #26219

Hi,

I saw that #26219 integrates BigLake Metastore via its REST API. However, that feature is not generally available at the moment. Also, organizations that depend heavily on Google Cloud services may benefit from better performance compared to the REST solution.


goktugkose and others added 2 commits October 9, 2025 14:45
This commit introduces BigQuery Metastore to Trino.
@goktugkose goktugkose force-pushed the add-bigquery-metastore-to-trino branch from 813e578 to 7e852f3 on October 9, 2025 at 11:47

cla-bot bot commented Oct 9, 2025

Thank you for your pull request and welcome to the Trino community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file. Continue to work with us on the review and improvements in this PR, and submit the signed CLA to [email protected]. Photos, scans, or digitally-signed PDF files are all suitable. Processing may take a few days. The CLA needs to be on file before we merge your changes. For more information, see https://github.com/trinodb/cla

@ebyhr
Member

ebyhr commented Oct 9, 2025

that feature is not generally available at the moment.

I know it’s currently in the preview phase, but I expect it will eventually become generally available.

better performance compared to REST solution

Could you share the benchmark result that shows this catalog is faster than the REST catalog approach?

@goktugkose
Author

goktugkose commented Oct 10, 2025

that feature is not generally available at the moment.

I know it’s currently in the preview phase, but I expect it will eventually become generally available.

better performance compared to REST solution

Could you share the benchmark result that shows this catalog is faster than the REST catalog approach?

Sorry for the late response 🙏
I conducted a benchmark study comparing the REST catalog integration with this PR. To provide a fair comparison, I merged the two implementations and ran them together on the Trino development server. I used a simple dataset containing 100 rows and examined two scenarios: DDL and query operations. The report is below.

Some other notes:

  • The BigLake REST API has a limitation that prevents me from using a bucket in the EU region.
  • After creating artifacts using the method in this PR, I can see the datasets and tables in the BigQuery UI. However, I was not able to see them in the BigQuery UI when I used the REST catalog, even though, as far as I know, the REST catalog should provide the same functionality.
  • Typically, basic DDL operations respond within 2–5 seconds via the REST API, though occasionally response times can extend up to 40 seconds even without any changes.
```text
================================================================================
Populating Table (Schema: test_schema)
================================================================================

Catalog: rest
--------------------------------------------------------------------------------
  Creating table schema... ✓ (2.672s)
  Inserting data... ✓ (100 rows, 16.200s, 6.2 rows/sec)
  Verifying data... ✓ (100 rows)

Catalog: bqms
--------------------------------------------------------------------------------
  Creating table schema... ✓ (1.858s)
  Inserting data... ✓ (100 rows, 19.584s, 5.1 rows/sec)
  Verifying data... ✓ (100 rows)

================================================================================
TIMING COMPARISON
================================================================================

Catalog         CREATE (s)      INSERT (s)      TOTAL (s)       Rows/sec       
--------------------------------------------------------------------------------
rest            2.672           16.200          18.872          6.2            
bqms            1.858           19.584          21.441          5.1            

REST is 13.6% faster overall
================================================================================

================================================================================
Starting Benchmark Suite
================================================================================
Catalogs: rest, bqms
Schema: test_schema
Table: test_table
Iterations per query: 3
================================================================================


1. Full Table Scan
--------------------------------------------------------------------------------
  Running 1. Full Table Scan on rest... ✓ (avg: 0.394s)
  Running 1. Full Table Scan on bqms... ✓ (avg: 0.226s)

2. Count All Rows
--------------------------------------------------------------------------------
  Running 2. Count All Rows on rest... ✓ (avg: 0.398s)
  Running 2. Count All Rows on bqms... ✓ (avg: 0.219s)

3. Simple Filter (Survived)
--------------------------------------------------------------------------------
  Running 3. Simple Filter (Survived) on rest... ✓ (avg: 0.396s)
  Running 3. Simple Filter (Survived) on bqms... ✓ (avg: 0.202s)

4. Multiple Filters
--------------------------------------------------------------------------------
  Running 4. Multiple Filters on rest... ✓ (avg: 0.408s)
  Running 4. Multiple Filters on bqms... ✓ (avg: 0.197s)

5. Aggregation by Class
--------------------------------------------------------------------------------
  Running 5. Aggregation by Class on rest... ✓ (avg: 0.404s)
  Running 5. Aggregation by Class on bqms... ✓ (avg: 0.211s)

6. Complex Aggregation
--------------------------------------------------------------------------------
  Running 6. Complex Aggregation on rest... ✓ (avg: 0.386s)
  Running 6. Complex Aggregation on bqms... ✓ (avg: 0.211s)

7. String Operations (LIKE)
--------------------------------------------------------------------------------
  Running 7. String Operations (LIKE) on rest... ✓ (avg: 0.456s)
  Running 7. String Operations (LIKE) on bqms... ✓ (avg: 0.195s)

8. String Operations (UPPER)
--------------------------------------------------------------------------------
  Running 8. String Operations (UPPER) on rest... ✓ (avg: 0.390s)
  Running 8. String Operations (UPPER) on bqms... ✓ (avg: 0.197s)

9. Self Join
--------------------------------------------------------------------------------
  Running 9. Self Join on rest... ✓ (avg: 0.464s)
  Running 9. Self Join on bqms... ✓ (avg: 0.249s)

10. Complex Filter with Aggregation
--------------------------------------------------------------------------------
  Running 10. Complex Filter with Aggregation on rest... ✓ (avg: 0.436s)
  Running 10. Complex Filter with Aggregation on bqms... ✓ (avg: 0.197s)

11. Subquery
--------------------------------------------------------------------------------
  Running 11. Subquery on rest... ✓ (avg: 0.443s)
  Running 11. Subquery on bqms... ✓ (avg: 0.263s)

12. Name Length Analysis
--------------------------------------------------------------------------------
  Running 12. Name Length Analysis on rest... ✓ (avg: 0.424s)
  Running 12. Name Length Analysis on bqms... ✓ (avg: 0.206s)


================================================================================
BENCHMARK RESULTS
================================================================================

Query                                         Catalog    Avg Time (s)    Min/Max (s)          Status    
---------------------------------------------------------------------------------------------------------
1. Full Table Scan                            rest       0.394           0.367 / 0.442        ✓         
                                              bqms       0.226           0.214 / 0.250        ✓         

10. Complex Filter with Aggregation           rest       0.436           0.410 / 0.473        ✓         
                                              bqms       0.197           0.190 / 0.201        ✓         

11. Subquery                                  rest       0.443           0.432 / 0.460        ✓         
                                              bqms       0.263           0.248 / 0.282        ✓         

12. Name Length Analysis                      rest       0.424           0.391 / 0.483        ✓         
                                              bqms       0.206           0.180 / 0.234        ✓         

2. Count All Rows                             rest       0.398           0.393 / 0.404        ✓         
                                              bqms       0.219           0.205 / 0.244        ✓         

3. Simple Filter (Survived)                   rest       0.396           0.384 / 0.414        ✓         
                                              bqms       0.202           0.183 / 0.225        ✓         

4. Multiple Filters                           rest       0.408           0.399 / 0.416        ✓         
                                              bqms       0.197           0.167 / 0.222        ✓         

5. Aggregation by Class                       rest       0.404           0.389 / 0.423        ✓         
                                              bqms       0.211           0.184 / 0.235        ✓         

6. Complex Aggregation                        rest       0.386           0.371 / 0.414        ✓         
                                              bqms       0.211           0.182 / 0.245        ✓         

7. String Operations (LIKE)                   rest       0.456           0.390 / 0.500        ✓         
                                              bqms       0.195           0.188 / 0.200        ✓         

8. String Operations (UPPER)                  rest       0.390           0.363 / 0.422        ✓         
                                              bqms       0.197           0.184 / 0.213        ✓         

9. Self Join                                  rest       0.464           0.404 / 0.566        ✓         
                                              bqms       0.249           0.211 / 0.297        ✓         


================================================================================
SUMMARY
================================================================================

Total execution time (BQMS): 2.574s
Total execution time (REST): 4.999s

Query-by-query comparison:
  1. Full Table Scan                             BQMS is 74.0% faster
  2. Count All Rows                              BQMS is 81.7% faster
  3. Simple Filter (Survived)                    BQMS is 95.6% faster
  4. Multiple Filters                            BQMS is 106.7% faster
  5. Aggregation by Class                        BQMS is 91.4% faster
  6. Complex Aggregation                         BQMS is 82.6% faster
  7. String Operations (LIKE)                    BQMS is 133.2% faster
  8. String Operations (UPPER)                   BQMS is 97.8% faster
  9. Self Join                                   BQMS is 86.4% faster
  10. Complex Filter with Aggregation            BQMS is 121.6% faster
  11. Subquery                                   BQMS is 68.7% faster
  12. Name Length Analysis                       BQMS is 106.2% faster

================================================================================
Starting DDL Benchmark Suite
================================================================================
Catalogs: rest, bqms
Schema: test_schema
Iterations per operation: 1
================================================================================

CREATE TABLE
--------------------------------------------------------------------------------
  CREATE TABLE on rest... ✓ (avg: 2.406s)
  CREATE TABLE on bqms... ✓ (avg: 1.695s)

DROP TABLE
--------------------------------------------------------------------------------
  DROP TABLE on rest... ✓ (avg: 0.763s)
  DROP TABLE on bqms... ✓ (avg: 0.673s)

ADD COLUMN
--------------------------------------------------------------------------------
  ADD COLUMN on rest... ✓ (avg: 1.226s)
  ADD COLUMN on bqms... ✓ (avg: 1.260s)

DROP COLUMN
--------------------------------------------------------------------------------
  DROP COLUMN on rest... ✓ (avg: 1.247s)
  DROP COLUMN on bqms... ✓ (avg: 1.370s)


================================================================================
DDL BENCHMARK RESULTS
================================================================================

Operation            Catalog    Avg Time (s)    Min/Max (s)          Status    
--------------------------------------------------------------------------------
CREATE TABLE         rest       2.406           2.406 / 2.406        ✓         
                     bqms       1.695           1.695 / 1.695        ✓         

DROP TABLE           rest       0.763           0.763 / 0.763        ✓         
                     bqms       0.673           0.673 / 0.673        ✓         

ADD COLUMN           rest       1.226           1.226 / 1.226        ✓         
                     bqms       1.260           1.260 / 1.260        ✓         

DROP COLUMN          rest       1.247           1.247 / 1.247        ✓         
                     bqms       1.370           1.370 / 1.370        ✓         


================================================================================
SUMMARY
================================================================================

DDL Operation Performance Comparison:

  CREATE TABLE          BQMS is 42.0% faster
  DROP TABLE            BQMS is 13.4% faster
  ADD COLUMN            REST is 2.8% faster
  DROP COLUMN           REST is 9.9% faster
```
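
For what it's worth, the "X% faster" figures above are consistent with computing (slower - faster) / faster, i.e. the gap relative to the faster time; a quick sanity check (a sketch, using numbers copied from the report):

```java
public class SpeedupCheck
{
    // Percentage by which `faster` beats `slower`, relative to the faster time.
    static double pctFaster(double slower, double faster)
    {
        return (slower - faster) / faster * 100;
    }

    public static void main(String[] args)
    {
        // Full Table Scan: report says BQMS is 74.0% faster (the rounded averages give ~74.3%).
        System.out.printf("Full Table Scan: %.1f%%%n", pctFaster(0.394, 0.226));
        // Populate totals: report says REST is 13.6% faster overall.
        System.out.printf("Populate total:  %.1f%%%n", pctFaster(21.441, 18.872));
    }
}
```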


@ebyhr
Member

ebyhr commented Oct 13, 2025

@talatuyarer @rambleraptor Do you have any idea why the Iceberg REST catalog endpoint (#26219) is slower than this PR?
