feat: Add maintain_order parameter to merge_sorted #27263

Open
jonathanchang31 wants to merge 3 commits into pola-rs:main from jonathanchang31:feat/merge-sorted-maintain-order

@jonathanchang31

Summary

Closes #27114.

Adds a maintain_order: bool parameter to merge_sorted(). When set to True, the output is guaranteed to have left-biased ordering for equal keys: rows from the left frame always appear before rows from the right frame when their keys match.
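
To make the tie-breaking rule concrete, here is a minimal pure-Python sketch of a left-biased two-way merge. It is not the Polars implementation: the function name `merge_left_biased` and the `(key, tag)` row layout are made up for illustration, and the tag only exists so the origin of each row is visible in the output.

```python
from typing import List, Tuple

Row = Tuple[int, str]  # (key, tag); the tag records which frame the row came from


def merge_left_biased(left: List[Row], right: List[Row]) -> List[Row]:
    """Merge two key-sorted row lists, emitting left rows first on equal keys."""
    out: List[Row] = []
    i = j = 0
    while i < len(left) and j < len(right):
        # Using `<=` rather than `<` is what makes the merge left-biased on ties.
        if left[i][0] <= right[j][0]:
            out.append(left[i])
            i += 1
        else:
            out.append(right[j])
            j += 1
    # One side is exhausted; the remainder of the other is already sorted.
    out.extend(left[i:])
    out.extend(right[j:])
    return out


left = [(1, "L"), (2, "L"), (2, "L"), (4, "L")]
right = [(2, "R"), (3, "R"), (4, "R")]
merged = merge_left_biased(left, right)
print(merged)
# [(1, 'L'), (2, 'L'), (2, 'L'), (2, 'R'), (3, 'R'), (4, 'L'), (4, 'R')]
```

For keys 2 and 4, every `"L"` row precedes every `"R"` row, which is the guarantee `maintain_order=True` is described as providing.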

  • Threads maintain_order through the full stack: Python API → PyO3 bindings → DSL/IR plan → streaming engine → in-memory engine
  • Core streaming fix: at chunk boundaries, find_mergeable() now holds back right-side rows whose keys equal the left side's maximum key (it uses gt_eq instead of gt). This prevents right-side ties from being emitted before left-side ties that arrive in later morsels
  • The in-memory engine already produces left-biased output, so the flag is only load-bearing in the streaming path
  • Defaults to False for backward compatibility
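
The chunk-boundary behaviour described in the streaming bullet can be sketched in pure Python. This is only a model of the idea, not the Rust code in this PR: `split_ready` and `streaming_merge` are hypothetical names, the way bounds are tracked is an assumption, and the `strict` flag stands in for the `gt_eq` versus `gt` comparison. Chunks are assumed to be non-empty and sorted by key.

```python
import math
from typing import List, Tuple

Row = Tuple[int, str]  # (key, source tag)


def split_ready(buf: List[Row], bound: float, strict: bool) -> Tuple[List[Row], List[Row]]:
    """Split a sorted buffer into (ready, held) against the other side's bound.

    strict=True also holds back rows whose key equals the bound (the gt_eq idea);
    strict=False only holds back rows above it (the gt idea).
    """
    ready = [r for r in buf if (r[0] < bound if strict else r[0] <= bound)]
    return ready, buf[len(ready):]


def streaming_merge(left_chunks, right_chunks, maintain_order: bool) -> List[Row]:
    out: List[Row] = []
    left_buf: List[Row] = []
    right_buf: List[Row] = []
    left_it, right_it = iter(left_chunks), iter(right_chunks)
    while True:
        l, r = next(left_it, None), next(right_it, None)
        # `*_hi` is the smallest key that side may still send later;
        # an exhausted side can send nothing more, so its bound is +inf.
        left_hi = math.inf if l is None else l[-1][0]
        right_hi = math.inf if r is None else r[-1][0]
        left_buf += l or []
        right_buf += r or []
        left_ready, left_buf = split_ready(left_buf, right_hi, strict=False)
        right_ready, right_buf = split_ready(right_buf, left_hi, strict=maintain_order)
        # Python's sort is stable, so left rows stay ahead of right rows on ties.
        out += sorted(left_ready + right_ready, key=lambda row: row[0])
        if l is None and r is None:
            return out


left_chunks = [[(1, "L"), (2, "L")], [(2, "L"), (5, "L")]]
right_chunks = [[(2, "R"), (3, "R")]]
strict = streaming_merge(left_chunks, right_chunks, maintain_order=True)
loose = streaming_merge(left_chunks, right_chunks, maintain_order=False)
print(strict)  # [(1, 'L'), (2, 'L'), (2, 'L'), (2, 'R'), (3, 'R'), (5, 'L')]
print(loose)   # [(1, 'L'), (2, 'L'), (2, 'R'), (2, 'L'), (3, 'R'), (5, 'L')]
```

Both outputs are sorted by key. In the non-strict run, `(2, 'R')` is emitted in the first round, before the second left chunk delivers another `(2, 'L')`, so the tie order is broken. Holding back right rows that tie with the left maximum avoids that.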

Test plan

  • 3 new test functions (6 cases) covering both streaming and in-memory engines:
    • Basic left-biased ordering with overlapping keys
    • All keys identical
    • DataFrame.merge_sorted() API path
  • All 41 existing merge_sorted tests pass with no regressions
  • Rust cargo check --features merge_sorted passes

@github-actions bot added the enhancement, python, rust, and first-contribution labels Apr 10, 2026
@github-actions bot added the changes-dsl label Apr 10, 2026
@codecov

codecov bot commented Apr 10, 2026

Codecov Report

❌ Patch coverage is 87.87879% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 81.59%. Comparing base (880651f) to head (ef3dafe).
⚠️ Report is 4 commits behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| crates/polars-plan/src/plans/visitor/hash.rs | 0.00% | 2 Missing ⚠️ |
| ...rates/polars-python/src/lazyframe/visitor/nodes.rs | 0.00% | 2 Missing ⚠️ |
Additional details and impacted files
@@           Coverage Diff           @@
##             main   #27263   +/-   ##
=======================================
  Coverage   81.58%   81.59%           
=======================================
  Files        1820     1820           
  Lines      251036   251083   +47     
  Branches     3149     3149           
=======================================
+ Hits       204808   204867   +59     
+ Misses      45420    45408   -12     
  Partials      808      808           

Labels

  • changes-dsl: Do not merge if this label is present and red.
  • enhancement: New feature or an improvement of an existing feature
  • first-contribution: First contribution by user
  • python: Related to Python Polars
  • rust: Related to Rust Polars

Development

Successfully merging this pull request may close these issues.

Add maintain_order: bool to merge_sorted()

1 participant