Trino s3 point queries taking unusually high time on hive connector #26937
Unanswered
p-chaturvedi asked this question in Q&A
I am using Trino to query data in S3 through a Hive catalog. In my use case partitioning is not possible, and the files I need to access are encrypted text files, so other connectors such as Iceberg are not a good fit. The silver lining is that I always have the full S3 path before querying.
The aws s3 cp command takes about 5-6 seconds on my bucket, while the equivalent Trino query takes about 5-6 minutes. The apparent culprit is Trino doing a full ListObjects even when the full path is provided. If that is true, what is the reasoning behind it, and can it be optimised? Are there any tips for optimising Trino queries for my use case? I want to use Trino even without partitioning because I can still do cross-catalog joins and keep the decryption logic in one central place.
Example Trino query I use:
SELECT content FROM hive.test.data WHERE "$path" = 's3://<bucket>/<folder>/<filename>'
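One way the listing hypothesis could be checked, as a sketch rather than a confirmed diagnosis: EXPLAIN ANALYZE executes the statement and prints the plan with runtime statistics, so the input rows and data size reported for the table scan should indicate whether only the one file was actually read. Whether the time spent listing objects is visible in that output is an assumption I have not verified.

```sql
-- Hypothetical diagnostic, not part of the original setup.
-- EXPLAIN ANALYZE runs the query and reports per-operator runtime statistics.
-- The path literal is the same placeholder used in the example query.
EXPLAIN ANALYZE
SELECT content
FROM hive.test.data
WHERE "$path" = 's3://<bucket>/<folder>/<filename>';
```

If the scan reports a single file's worth of input but the wall time is still minutes, that would be consistent with the time going to file enumeration rather than to reading data.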