Open
Labels
enhancement: Any new improvement worthy of an entry in the changelog
Description
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
Arrow has added run-end encoded (REE) support in apache/arrow#14176. Similar to dictionary arrays, REE arrays allow repeated values to be encoded in a space-efficient manner that also allows fast processing.
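For readers unfamiliar with the layout, here is a minimal sketch of run-end encoding in plain Rust. It is not the arrow-rs API: `ree_encode` and `ree_get` are hypothetical helpers, run ends are assumed to be `i32`, and nulls and slicing offsets are ignored. Each entry in `run_ends` is the exclusive logical end index of its run, so a logical index can be mapped to a physical one with a binary search.

```rust
/// Run-end encode a slice into (run_ends, values).
/// run_ends[i] is the exclusive logical end index of run i.
/// Hypothetical helper for illustration only, not part of arrow-rs.
fn ree_encode<T: PartialEq + Clone>(input: &[T]) -> (Vec<i32>, Vec<T>) {
    let mut run_ends: Vec<i32> = Vec::new();
    let mut values: Vec<T> = Vec::new();
    for (i, item) in input.iter().enumerate() {
        // Does this element continue the run that is currently open?
        let extends_run = values.last().map_or(false, |last| last == item);
        if extends_run {
            // Same value as the previous element: move the run end forward.
            if let Some(end) = run_ends.last_mut() {
                *end = (i + 1) as i32;
            }
        } else {
            // New value: start a new run that ends just after this element.
            values.push(item.clone());
            run_ends.push((i + 1) as i32);
        }
    }
    (run_ends, values)
}

/// Look up the value at a logical index without decoding the array.
/// Binary-searches for the first run whose end is greater than the index.
fn ree_get<'a, T>(run_ends: &[i32], values: &'a [T], logical_index: usize) -> Option<&'a T> {
    let physical = run_ends.partition_point(|&end| (end as usize) <= logical_index);
    values.get(physical)
}
```

With this sketch, encoding `["a", "a", "a", "b", "c", "c"]` stores three run ends and three values instead of six values, and `ree_get` returns `None` for an out-of-range index.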
Describe the solution you'd like
Implement REE in arrow-rs. Some likely candidates:
- Support in DataType
- Support in ArrayData
- New REE array
- Support REE in IPC
- Support REE in cast kernels
- Support REE in compute kernels
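To illustrate why native REE support in compute kernels could pay off, the sketch below sums a run-end encoded column by visiting each run once instead of each logical element. It is a plain-Rust illustration under stated assumptions, not an arrow-rs kernel: `ree_sum` is a hypothetical name, run ends are assumed to be `i32`, values are `i64`, and nulls, offsets, and overflow are ignored.

```rust
/// Sum the logical elements of a run-end encoded column.
/// Each run contributes value * run_length, so the cost is proportional
/// to the number of runs rather than the number of logical elements.
/// Hypothetical helper for illustration only, not part of arrow-rs.
fn ree_sum(run_ends: &[i32], values: &[i64]) -> i64 {
    let mut prev_end: i64 = 0;
    let mut total: i64 = 0;
    for (&end, &value) in run_ends.iter().zip(values) {
        // Run length is the distance between consecutive run ends.
        let run_len = end as i64 - prev_end;
        total += value * run_len;
        prev_end = end as i64;
    }
    total
}
```

For run ends `[3, 4, 6]` and values `[10, 20, 5]`, the logical column is `10, 10, 10, 20, 5, 5`, and the sum is computed from three multiplications rather than six additions.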
Describe alternatives you've considered
Remaining tasks:
- arrow-row: Add support for REE #7649
- arrow-select: Implement concat for RunArrays #7487
- arrow-data: Add REE support for build_extend and build_extend_nulls #7671
- Implement PartialEq for RunArray #7691
- Reduce repetition in tests for arrow-row/src/run.rs #7692
- Improve performance of RunArray --> Row conversion #7693
- Potential Optimization for interleave/take on RunEndEncoded arrays #7710
- Implemented casting for RunEnd Encoding #7713
Additional context
Among other things, @brancz is working to improve aggregation performance in DataFusion using RunArrays, see