-
Notifications
You must be signed in to change notification settings - Fork 1.6k
Improved experience when remote object store URL does not end in / #17364
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@alamb My apologies, I have resubmitted this fix. Could you please review it again? Thank you. |
Do you have time to help me take a look? @alamb |
Sorry @xiedeyantu -- I have been on vacation. I am looking at this PR now |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @xiedeyantu -- this looks like a great start.
I left some comments about how to improve this PR's code, but in the process of reviewing this PR I ended up making many of the changes locally for your consideration here:
Since this PR makes changes to the core datafusion crate, I thought we should add some additional tests in the datafusion crate. I was writing up how this might look but ended up writing them directly. Please take a look:
Simplify implementation, add tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @xiedeyantu , this is a nice improvement I think
I also tested the reproducer from #17364 locally with this PR and it works great: DataFusion CLI v49.0.2
> CREATE EXTERNAL TABLE nyc_taxi_rides
STORED AS PARQUET LOCATION 's3://altinity-clickhouse-data/nyc_taxi_rides/data/tripdata_parquet';
0 row(s) fetched.
Elapsed 2.268 seconds.
> select * from nyc_taxi_rides limit 1;
+-------------+----+-----------+----------------------+----------------------+-----------------+---------------+------------------+-----------------+--------------+--------------------+-------------------+------------------+--------------+-------------+-------+---------+------------+--------------+-----------------------+--------------+--------------------+---------------------+-------+-------+
| pickup_date | id | vendor_id | pickup_datetime | dropoff_datetime | passenger_count | trip_distance | pickup_longitude | pickup_latitude | rate_code_id | store_and_fwd_flag | dropoff_longitude | dropoff_latitude | payment_type | fare_amount | extra | mta_tax | tip_amount | tolls_amount | improvement_surcharge | total_amount | pickup_location_id | dropoff_location_id | junk1 | junk2 |
+-------------+----+-----------+----------------------+----------------------+-----------------+---------------+------------------+-----------------+--------------+--------------------+-------------------+------------------+--------------+-------------+-------+---------+------------+--------------+-----------------------+--------------+--------------------+---------------------+-------+-------+
| 2009-01-31 | 0 | VTS | 2009-01-31T14:25:00Z | 2009-01-31T14:42:00Z | 4 | 6.12 | 0.0 | 0.0 | | | 0.0 | 0.0 | CASH | 16.5 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 16.5 | 0 | 0 | | |
+-------------+----+-----------+----------------------+----------------------+-----------------+---------------+------------------+-----------------+--------------+--------------------+-------------------+------------------+--------------+-------------+-------+---------+------------+--------------+-----------------------+--------------+--------------------+---------------------+-------+-------+
1 row(s) fetched.
Elapsed 0.579 seconds. |
Thank you for your help and guidance! @alamb |
Thank you for sticking with it! |
Which issue does this PR close?
/
(retry as paths) #16302Rationale for this change
It would be automatically add a / to the path if the first one was not found and try again.
What changes are included in this PR?
Are these changes tested?
Are there any user-facing changes?