Basic read/write support for ORC #2236


Open: mccormickt12 wants to merge 5 commits into main


@mccormickt12 mccormickt12 commented Jul 22, 2025

Related to #20
For reference, see https://github.com/apache/iceberg-python/pull/790/files

Rationale for this change

Support basic reading and writing of ORC data

Are these changes tested?

Unit tests added

Are there any user-facing changes?

From what I understand, there are some discrepancies between pyarrow versions with respect to their ORC support, so I need to do a bit more investigation.

Next Steps

This PR adds only the most basic read/write support for ORC. ORC support in pyarrow also seems to be limited (I'm still trying to understand exactly what's missing). I want to enhance the pyarrow ORC library so that it can be properly consumed here in pyiceberg.

@kevinjqliu
Contributor

I'm trying to finish up some of the remaining items so we can release 0.10: https://github.com/apache/iceberg-python/milestone/10

Will take a look at this PR right after :)

@mccormickt12
Author

mccormickt12 commented Aug 7, 2025

> I'm trying to finish up some of the remaining items so we can release 0.10: https://github.com/apache/iceberg-python/milestone/10
>
> Will take a look at this PR right after :)

@kevinjqliu Feel free to take a look. I'm not going to try to merge this PR; it's more for visibility and early review. I will split it into separate read and write PRs.
