Add SQLAlchemy-based SqlClient implementations for metricflow tests#1966
Conversation
Force-pushed: c5c977c to 648f98f
plypaul left a comment
Thanks Tom! Left a few comments.
    """
    start = time.perf_counter()

    if sql_bind_parameter_set.param_dict:
We actually received a request to support bind parameters again. Outside the scope of the PR, but would it be straightforward to enable this case?
Nothing about bind parameters is straightforward. 😛 But given that, this client shouldn't be a major source of problems for you.
We use sa_text() everywhere, so you can pass bind parameters through to the execute() call. Within a test context this shouldn't be too difficult to get working. The issue is we have to render the inline parameter using the SqlAlchemy format, which works for some-but-not-all dialects even within SqlAlchemy. Databricks, for example, caused a bunch of problems back in the day, but we were able to work around that.
More broadly, you'll have to be cautious about how those bind parameters get specified and rendered in the final query text. The rendering is probably not the same for every dialect within SqlAlchemy (and our Redshift shim might cause weird issues, although I'd expect it to work). It also may not be the same for SqlAlchemy vs JDBC/DBAPI/ArrowFlightSQL/etc. - engines do have divergence in input and output formats between their SqlAlchemy implementations and their direct-access APIs.
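The sa_text()-plus-execute() path described above can be sketched as follows. This is an illustrative example only, not the metricflow client itself: an in-memory SQLite engine stands in for the real test engines, and the dialect-specific rendering quirks mentioned above are exactly what such a sketch glosses over.

```python
import sqlalchemy as sa

# Hypothetical sketch: passing named bind parameters through sa.text()
# to execute(). SQLite is used only so the example is self-contained.
engine = sa.create_engine("sqlite:///:memory:")

with engine.connect() as conn:
    # :name placeholders are rendered into the dialect's native
    # paramstyle at execution time.
    stmt = sa.text("SELECT :a + :b AS total")
    total = conn.execute(stmt, {"a": 2, "b": 3}).scalar_one()
    print(total)  # -> 5
```

For inline (non-bound) rendering, the same statement would instead have to be compiled with literal binds per dialect, which is where the cross-dialect divergence bites.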
The engine tests currently execute via calls through dbt adapters managed by an AdapterBackedSqlClient instance (and its corresponding DDL-enabled class). In order to allow for ongoing development and testing in a world where dbt-core depends on metricflow, we need to remove the pass-through dependency on dbt-core that our reliance on dbt-adapters imposes on us. This is the first step in removing metricflow's test package dependencies on dbt adapters: a SQLAlchemy-based client that has many of the same test configuration advantages as the AdapterBackedSqlClient without the dbt dependencies. The base functionality is in place, and all tests run against all engines pass (and all environments build), but the client itself is not currently in use, so there may be runtime flaws that we have not yet detected. This is not totally desirable, but I'm too lazy to adjust the robot's approach to avoid what amounts to a dead code commit. Subsequent changes will enable this client across all of the engines, and we will fix up any issues we uncover as we go.
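The shape of such a client can be sketched as below. This is a hypothetical illustration (class and attribute names are made up, not metricflow's), showing a thin SQLAlchemy-backed query runner with the perf_counter timing hook visible in the review snippet above; SQLite keeps the sketch self-contained.

```python
import time
from typing import Any, Optional, Sequence

import sqlalchemy as sa


# Hypothetical sketch, not the actual metricflow client: real engine URLs
# would come from test configuration rather than a hard-coded string.
class SqlAlchemyQueryRunner:
    def __init__(self, url: str) -> None:
        self._engine = sa.create_engine(url)
        self.last_query_seconds: float = 0.0

    def query(self, stmt: str, params: Optional[dict] = None) -> Sequence[Any]:
        start = time.perf_counter()
        with self._engine.connect() as conn:
            rows = conn.execute(sa.text(stmt), params or {}).fetchall()
        self.last_query_seconds = time.perf_counter() - start
        return rows


runner = SqlAlchemyQueryRunner("sqlite:///:memory:")
rows = runner.query("SELECT :x AS x", {"x": 41})
print(rows[0].x)  # -> 41
```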
This removes the dbt-duckdb dependency from the base dev-env and moves all tests over to use the SqlAlchemySqlClient instead of the AdapterBackedSqlClient. It includes one necessary update to error messaging snapshots, as the dbt adapters use a custom exception wrapper that does dbt-specific formatting. With this we are ready to proceed with moving over the engine-specific clients.
This change moves Trino over with the minimum required updates for making things work. In particular: 1. Trino uses a catalog rather than a database, and in their sqlalchemy dialect they override the standard URL class to always be a string url. Since that was a nuisance I chose to simply overload the database element in the standard SqlAlchemy URL format and URL class to be the catalog for the Trino case. This works in our tests, and probably works in general, but it might not be robust in all scenarios. 2. For some reason semi-colons in sql expressions were causing syntax errors with the sqlalchemy Trino client. I guess the dbt-trino adapter (or the base dbt adapter) was removing trailing semicolons on execute.
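The catalog-in-the-database-slot workaround from point 1 can be illustrated with SQLAlchemy's standard URL builder. Host, port, username, and catalog values here are made up for illustration.

```python
from sqlalchemy.engine import URL

# Sketch of the workaround described above: the Trino catalog is placed
# in the "database" slot of a standard SqlAlchemy URL.
url = URL.create(
    drivername="trino",
    username="mf_test",
    host="localhost",
    port=8080,
    database="memory",  # actually the Trino catalog, not a database
)
print(url.render_as_string())  # -> trino://mf_test@localhost:8080/memory
```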
This includes a bug fix for what turned out to be a dead code branch, which the dbt adapter integration did not use but we do. Now the httppath parameter is parsed correctly and we can use SqlAlchemy with Databricks.
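The kind of parsing involved can be sketched with the standard library alone. The URL below is entirely made up, and the parameter name and helper shape are illustrative, not the actual client code.

```python
from urllib.parse import parse_qs, urlparse

# Hedged sketch: pull an httppath-style parameter out of an engine URL's
# query string. The workspace host and warehouse path are fabricated.
url = (
    "databricks://token:secret@example.cloud.databricks.com"
    "?httppath=/sql/1.0/warehouses/abc123"
)
params = parse_qs(urlparse(url).query)
http_path = params["httppath"][0]
print(http_path)  # -> /sql/1.0/warehouses/abc123
```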
Snowflake has some strange ideas about how to render case-insensitive database object names, so historically we've had these capitalized names in our snapshots. Snowflake's SQLAlchemy connector hews to the SQLAlchemy standard of rendering case-insensitive object names in lower-case, so this change updates our snapshots accordingly.
Making Redshift work with SqlAlchemy was a bit trickier than expected, as the "official" SqlAlchemy client for Redshift hasn't been updated in 2 years and does not work with SqlAlchemy 2.x. As a hack around this problem, this commit adds a bare-bones custom dialect override for psycopg2 that clobbers some class property to allow for backwards compatibility with Redshift. Since we only ever pass in sql query strings as sa_text() and always qualify our relation names with the schema inline in the queries, this works, but if we ever need to relax those constraints we'll either need to find a more full-featured dialect for Redshift or else switch to using the redshift-connector package directly.
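A bare-bones dialect override of this kind might look like the following. This is a hypothetical sketch: which property the actual change clobbers is not shown in this PR excerpt, so this picks a plausible one (Redshift reports an ancient PostgreSQL version string, so the sketch pins a fixed version tuple instead of parsing the server's).

```python
from sqlalchemy.dialects.postgresql.psycopg2 import PGDialect_psycopg2


# Hypothetical compatibility shim, not the actual change: subclass the
# stock psycopg2 dialect and override a property that trips up Redshift.
class RedshiftCompatDialect(PGDialect_psycopg2):
    supports_statement_cache = True

    def _get_server_version_info(self, connection):
        # Return a fixed version tuple rather than parsing Redshift's
        # PostgreSQL-8-era version string.
        return (8, 0, 2)


dialect = RedshiftCompatDialect()
print(dialect._get_server_version_info(None))  # -> (8, 0, 2)
```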
BigQuery has some special connection requirements, and does not support the EXPLAIN keyword, so the dry run configuration is also special. This updates the client accordingly.
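The dialect-dependent dry-run dispatch this implies can be sketched in a few lines of stdlib-only Python. The function and dialect names are illustrative, not the actual client's: most engines can validate a statement by prefixing EXPLAIN, while BigQuery needs a separate path (its API exposes a dry-run job option rather than an EXPLAIN keyword).

```python
# Hedged sketch of per-dialect dry-run handling; not metricflow's code.
def dry_run_statement(dialect: str, stmt: str) -> str:
    if dialect == "bigquery":
        # Placeholder: a real client would submit the query with the
        # engine's dry-run option instead of rewriting the SQL.
        return stmt
    return f"EXPLAIN {stmt}"


print(dry_run_statement("duckdb", "SELECT 1"))    # -> EXPLAIN SELECT 1
print(dry_run_statement("bigquery", "SELECT 1"))  # -> SELECT 1
```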
Force-pushed: 648f98f to 5a5ef7e

Add SQLAlchemy-based SqlClient implementations for metricflow tests
The engine tests currently execute via calls through dbt adapters
managed by an AdapterBackedSqlClient instance (and its corresponding
DDL-enabled class).
In order to allow for ongoing development and testing in a world where
dbt-core depends on metricflow we need to remove the pass-through
dependency on dbt-core that our reliance on dbt-adapters imposes on
us.
This is the first step in removing metricflow's test package dependencies
on dbt adapters - moving test execution to a SQLAlchemy-based client
that has many of the same test configuration advantages as the
AdapterBackedSqlClient without the dbt dependencies.
The full changeset associated with this (PR #1966) includes the addition
of the client plus independent changes to cut over each engine, as follows:
This includes one necessary update to error messaging snapshots, as the dbt
adapters use a custom exception wrapper that does dbt-specific formatting.
No special changes required.
Trino uses a catalog rather than a database, and in their
sqlalchemy dialect they override the standard URL class to always
be a string url. Since that was a nuisance I chose to simply overload
the database element in the standard SqlAlchemy URL format and URL class
to be the catalog for the Trino case. This works in our tests, and probably
works in general, but it might not be robust in all scenarios.
In addition, semi-colons in sql expressions were causing syntax
errors with the sqlalchemy Trino client, so these were removed from the
affected test cases.
This includes a bug fix for extracting the httppath parameter
from the engine URL. The original issue never emerged because it was in
a dead code branch, but the SqlAlchemy client uses it.
Snowflake's SQLAlchemy connector hews to the SQLAlchemy standard
of rendering case-insensitive object names in lower-case, so this
change updates our snapshots accordingly.
Making Redshift work with SqlAlchemy was a bit trickier than expected,
as the "official" SqlAlchemy client for Redshift hasn't been updated
in 2 years and does not work with SqlAlchemy 2.x.
As a hack around this problem, this change adds a bare-bones custom
dialect override for psycopg2 that clobbers some class property to
allow for backwards compatibility with Redshift. Since we only ever
pass in sql query strings as sa_text() and always qualify our
relation names with the schema inline in the queries, this works, but
if we ever need to relax those constraints we'll either need to find
a more full-featured dialect for Redshift or else switch to using the
redshift-connector package directly.
BigQuery has some special connection requirements, and does not support
the EXPLAIN keyword, so the dry run configuration is also special.
This updates the client accordingly.