-
Notifications
You must be signed in to change notification settings - Fork 286
Description
Summary
Add a profile credentials option (drop_without_lock or similar) to skip the global reentrant lock that currently serializes all DROP statement execution in dbt-redshift.
Background
dbt-redshift wraps every drop_relation() call in a fresh_transaction() context manager that acquires a process-wide RLock. This was introduced to prevent the race condition where DROP ... CASCADE on two tables concurrently can raise:
table was dropped by a concurrent transaction
The lock is correct and safe for the general case. However, it forces all DROP statements to execute serially, which negates the parallelism configured via threads. For users whose projects contain no downstream views (i.e., no CASCADE side-effects to worry about), this lock is unnecessary overhead.
Proposed Change
Add a boolean profile credential field drop_without_lock (default False) that, when True, skips acquiring the lock in drop_relation().
Files to change
dbt-redshift/src/dbt/adapters/redshift/connections.py
Add to RedshiftCredentials dataclass (~line 190):
# Drop behavior
drop_without_lock: bool = FalseAdd to _connection_keys() (~line 228):
"drop_without_lock",dbt-redshift/src/dbt/adapters/redshift/impl.py
Modify drop_relation() (lines 147–163):
def drop_relation(self, relation):
"""
In Redshift, DROP TABLE ... CASCADE should not be used
inside a transaction. Redshift doesn't prevent the CASCADE
part from conflicting with concurrent transactions. If we do
attempt to drop two tables with CASCADE at once, we'll often
get the dreaded:
table was dropped by a concurrent transaction
So, we need to lock around calls to the underlying
drop_relation() function.
https://docs.aws.amazon.com/redshift/latest/dg/r_DROP_TABLE.html
Users with no downstream views (no CASCADE side-effects) can opt out of
the lock via `drop_without_lock: true` in their profile for better parallelism.
"""
if self.config.credentials.drop_without_lock:
return super().drop_relation(relation)
with self.connections.fresh_transaction():
return super().drop_relation(relation)The fresh_transaction() method in connections.py (lines 600–620) that holds the RLock remains unchanged — this option simply bypasses it.
Safety
- Default is
False— no change in behavior for existing users. - Safe only for projects with no downstream views (unbound views, no
CREATE VIEW ... AS SELECT FROM <table>). Users must opt in explicitly. - Should be documented with a clear warning about the concurrent transaction risk.
Related
- The lock was introduced specifically due to
CASCADEsemantics. Issue [EPIC] Setup dbt-adapter packaging and release process #2 (skip CASCADE) pairs naturally with this: if a user sets both options, they get maximum parallelization with no risk.