fix(athena): partition tmp table in incremental to reduce batch scan cost by dtaniwaki · Pull Request #1711 · dbt-labs/dbt-adapters

dtaniwaki · 2026-03-04T07:10:59Z

resolves #1744
docs dbt-labs/docs.getdbt.com/#

Thank you for maintaining this project! I'd appreciate your review on this fix to the Athena adapter's incremental materialization.

Problem

In safe_create_table_as, the temporary=True branch always created the tmp table (__dbt_tmp) without partitioning (skip_partitioning=True), bypassing TOO_MANY_OPEN_PARTITIONS handling entirely.

This caused an O(N) scan cost for batch incremental inserts: since the tmp table had no partitions, every batch in batch_incremental_insert performed a full scan of __dbt_tmp. With N batches, the total cost was (N+1) × full scan.
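As a rough illustration of the arithmetic above, the following sketch models the total scan volume for the unpartitioned tmp table. It assumes, for simplicity, that scanning __dbt_tmp costs about the same as the scan that builds it; the function name and parameters are hypothetical and not part of the adapter.

```python
def unpartitioned_total_scan(full_scan_bytes: int, n_batches: int) -> int:
    """Estimate bytes scanned when __dbt_tmp has no partitions.

    Assumption: one scan builds the tmp table, and every batch insert
    then reads the whole tmp table again because nothing can be pruned.
    """
    build_cost = full_scan_bytes
    per_batch_cost = full_scan_bytes  # no partitions, so no pruning
    return build_cost + n_batches * per_batch_cost


# Example: a 100 GB scan split into 20 batches.
total_gb = unpartitioned_total_scan(100, 20)
print(total_gb)  # (20 + 1) x 100 GB
```

With 20 batches the model gives 21 full scans, which is the (N+1) × full scan figure stated above.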

Solution

Unified the temporary=True and temporary=False code paths in safe_create_table_as so that tmp tables now go through run_query_with_partitions_limit_catching and fall back to create_table_as_with_partitions on TOO_MANY_OPEN_PARTITIONS.
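The control flow described above can be sketched in Python. This is only a model: the real logic lives in the adapter's macros, and the two callables and the string sentinel below are hypothetical stand-ins for run_query_with_partitions_limit_catching and create_table_as_with_partitions.

```python
from typing import Callable

# Assumption for this sketch: the partition-limit error is reported as a sentinel string.
TOO_MANY_OPEN_PARTITIONS = "TOO_MANY_OPEN_PARTITIONS"


def safe_create_table_as_sketch(
    run_ctas_catching_limit: Callable[[], str],
    create_with_partition_batches: Callable[[], str],
) -> str:
    """Model of the unified path: the same steps apply to tmp and non-tmp tables."""
    outcome = run_ctas_catching_limit()
    if outcome == TOO_MANY_OPEN_PARTITIONS:
        # The single CTAS hit the open-partition limit; build the table in batches instead.
        return create_with_partition_batches()
    return outcome


calls = []


def failing_ctas() -> str:
    calls.append("ctas")
    return TOO_MANY_OPEN_PARTITIONS


def batched_create() -> str:
    calls.append("batched")
    return "OK"


result = safe_create_table_as_sketch(failing_ctas, batched_create)
```

In this model the temporary flag no longer selects a different branch, which is the point of the change: a tmp table gets the same fallback as any other table.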

With this change, __dbt_tmp is created with partitions (when partitioned_by is configured), so each batch benefits from partition pruning. Total scan cost drops from (N+1) × full scan to roughly 2 × full scan.

Models without partitioned_by are unaffected — they produce an unpartitioned CTAS as before.
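To make the effect of partition pruning concrete, here is a toy simulation that counts rows read by batch inserts. It is not Athena code: rows are (partition_key, value) tuples, each batch names the partition keys it loads, and both function names are hypothetical.

```python
from collections import defaultdict


def rows_read_unpartitioned(rows, batches):
    """Each batch must filter the whole tmp table, so it reads every row."""
    read = 0
    for batch in batches:
        wanted = set(batch)
        read += len(rows)  # full scan per batch
        _selected = [r for r in rows if r[0] in wanted]
    return read


def rows_read_partitioned(rows, batches):
    """Each batch reads only the partitions it names (pruning)."""
    by_partition = defaultdict(list)
    for key, value in rows:
        by_partition[key].append(value)
    read = 0
    for batch in batches:
        for key in batch:
            read += len(by_partition[key])
    return read


# 6 partitions with 10 rows each, loaded in 3 batches of 2 partitions.
rows = [(day, i) for day in range(6) for i in range(10)]
batches = [[0, 1], [2, 3], [4, 5]]

unpartitioned = rows_read_unpartitioned(rows, batches)
partitioned = rows_read_partitioned(rows, batches)
```

In this toy case the unpartitioned path reads the table once per batch, while the partitioned path reads each row once across all batches. Adding the scan that builds the tmp table gives the roughly 2 × full scan total mentioned above.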

Checklist

I have read the contributing guide and understand what's expected of me
I have run this code in development and it appears to resolve the stated issue
This PR includes tests, or tests are not required/relevant for this PR
This PR has no interface changes (e.g. macros, cli, logs, json artifacts, config files, adapter interface, etc) or this PR has already received feedback and approval from Product or DX

nicor88 · 2026-03-11T06:40:55Z

@dtaniwaki, what you propose here goes against the original implementation decision: when we create a temporary table, we always write unpartitioned data, and then write to the final target location as a partitioned table.
We are aware that the first unpartitioned write leads to a full scan. This is because, as you wrote, we want to bypass TOO_MANY_OPEN_PARTITIONS entirely. Therefore, I discourage proceeding with your approach, as it can lead to other hidden issues.

dtaniwaki · 2026-03-11T08:55:16Z

@nicor88 I see. Then how can we avoid massive full-scan queries against huge data with a massive number of partitions? I mistakenly created a model in this situation and wasted a lot of money...

nicor88 · 2026-03-13T08:04:56Z

@dtaniwaki how about introducing the approach that you suggested, but controlling it via a "config" flag? The flag can be false by default and properly documented, and in your case you can set it to "true".

Doing so, we avoid any regression, and you and other users are covered for those edge cases. I believe that's the best compromise. Think about a good descriptive name for such a configuration flag.
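One possible reading of this suggestion, sketched in Python. The flag name partition_tmp_table is purely hypothetical (no name has been chosen in this thread), and the helper is a model of the decision, not the adapter's actual code.

```python
def should_skip_partitioning(config: dict, temporary: bool) -> bool:
    """Decide whether a CTAS should be written without partitions.

    Hypothetical flag: "partition_tmp_table", default False, so the
    existing behaviour (unpartitioned tmp table) is kept unless a
    model opts in.
    """
    if not temporary:
        # Non-temporary tables are not affected by the flag.
        return False
    return not config.get("partition_tmp_table", False)


default_model = {}
opted_in_model = {"partition_tmp_table": True}

print(should_skip_partitioning(default_model, temporary=True))   # tmp stays unpartitioned
print(should_skip_partitioning(opted_in_model, temporary=True))  # tmp is partitioned
```

Defaulting the flag to false keeps current behaviour for every existing model, which is how the regression risk mentioned above is avoided.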

cla-bot bot added the cla:yes label (The PR author has signed the CLA) Mar 4, 2026

dtaniwaki marked this pull request as ready for review March 4, 2026 07:37

dtaniwaki requested a review from a team as a code owner March 4, 2026 07:37

dtaniwaki force-pushed the feat/athena-incremental-tmp-partition branch 2 times, most recently from bdd91c6 to 6ee0b04 Compare March 4, 2026 08:36

dtaniwaki added a commit to dtaniwaki/dbt-adapters that referenced this pull request Mar 6, 2026

Merge PR dbt-labs#1711: fix(athena): partition tmp table in increment…

b7091d4

…al to reduce batch scan cost

This was referenced Mar 11, 2026

[Bug] dbt-athena: TOO_MANY_OPEN_PARTITIONS crashes with TypeError for unpartitioned models #1742

Open

fix(athena): handle unpartitioned models in create_table_as_with_partitions #1743

Open

dtaniwaki force-pushed the feat/athena-incremental-tmp-partition branch from 6ee0b04 to 7cde3fb Compare March 11, 2026 05:13

fix(athena): partition tmp table in incremental to reduce batch scan …

589e727

…cost Signed-off-by: Daisuke Taniwaki <daisuketaniwaki@gmail.com>

dtaniwaki force-pushed the feat/athena-incremental-tmp-partition branch from 7cde3fb to 589e727 Compare March 11, 2026 05:14

dtaniwaki added a commit to dtaniwaki/dbt-adapters that referenced this pull request Mar 12, 2026

Merge PR dbt-labs#1711: fix(athena): partition tmp table in increment…

b0026cc

…al to reduce batch scan cost

dtaniwaki wants to merge 1 commit into dbt-labs:main from dtaniwaki:feat/athena-incremental-tmp-partition
