fix(databricks): lazily create memtable volume at first use #11962
NickCrews wants to merge 1 commit into ibis-project:main
@VENKATASAI1994 what do you think of this alternate formulation?
Looks good now and works. I have tested this with read-only access.
The failing check is for codecov, and I'm fine merging with it failing; as I said, I don't see an easy way to test this. @deepyaman or @gforsyth, perhaps you have the bandwidth to review this and help unblock @VENKATASAI1994 (and several others who have chimed in on #11598)? I think the new implementation is pretty easy to understand, and I think I described the rationale and the changes pretty well in the top comment of this PR. Thanks in advance if you do have time!
Fixes #11598
Previously, the memtable volume was created eagerly in _post_connect() immediately after connecting, which caused read-only users to fail with a PermissionDenied error even when they never intended to use memtables.
This change defers volume creation until the first memtable is actually registered via _register_in_memory_table(), allowing read-only users to connect and query existing data without requiring CREATE VOLUME privilege.
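To make the eager-versus-lazy difference concrete, here is a minimal sketch of the pattern. It is not the actual ibis Databricks backend: `FakeBackend`, `PermissionDenied`, the `can_create_volume` flag, and the `_ensure_memtable_volume()` helper are hypothetical stand-ins, and only `_post_connect()` and `_register_in_memory_table()` are names taken from the description above.

```python
class PermissionDenied(Exception):
    """Hypothetical stand-in for the error a read-only user hits on CREATE VOLUME."""


class FakeBackend:
    """Hypothetical, simplified backend; not the real ibis Databricks class."""

    def __init__(self, can_create_volume: bool):
        # Assumption: a boolean flag models whether the user has CREATE VOLUME.
        self._can_create_volume = can_create_volume
        self._memtable_volume_created = False
        self._post_connect()

    def _post_connect(self):
        # Nothing is created here, so connecting requires no CREATE VOLUME
        # privilege. The eager version would have created the volume here.
        pass

    def _ensure_memtable_volume(self):
        # Hypothetical helper: create the volume once, on first use.
        if self._memtable_volume_created:
            return
        if not self._can_create_volume:
            raise PermissionDenied("CREATE VOLUME not allowed")
        self._memtable_volume_created = True

    def _register_in_memory_table(self, name: str) -> str:
        # The volume is only needed once a memtable is actually registered.
        self._ensure_memtable_volume()
        return f"registered {name}"
```

With this shape, a read-only user can construct the backend and query existing tables; the permission error only surfaces if they try to register a memtable.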
This is a redo of #11956, where I take a more complete approach:

- This changes how the memtable_volume param is interpreted. Before, it was interpreted as just the suffix of a larger generated path, e.g. if you passed foo then the memtables would actually be created at /Volumes/the_current_catalog_at_init_time/the_current_database_at_init_time/foo. After this change, you just get the plain foo, so the caller has much more control over what they want the volume to be.

I didn't add tests because I couldn't find a way to easily create a read-only connection. Perhaps we could create a new role in our Databricks account that we use for testing? But mostly I treat this as a refactor, so if existing tests don't begin failing, that's good enough for me.
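For reference, the change in how memtable_volume is interpreted can be sketched roughly like this. The function names `old_volume_path` and `new_volume_path` are hypothetical and only illustrate the before and after behavior described in this PR; they are not functions in ibis.

```python
def old_volume_path(memtable_volume: str, catalog: str, database: str) -> str:
    # Before: the param was only a suffix; the catalog and database that were
    # current at init time were prepended to build the full path.
    return f"/Volumes/{catalog}/{database}/{memtable_volume}"


def new_volume_path(memtable_volume: str) -> str:
    # After: the param is used as given, so the caller decides the location.
    return memtable_volume
```

So passing foo used to resolve to a path under the catalog and database that were current when the connection was created, whereas now the value is taken at face value.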