@adriangb adriangb commented Sep 3, 2025

Summary

  • Fixed issue where distinct_count was not properly handled when merging statistics
  • Set distinct_count to Precision::Absent during merge operations, as the actual distinct count cannot be accurately determined from merged statistics (the same value may appear in both inputs, so the input counts cannot simply be added)
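
To illustrate the reasoning above, here is a minimal sketch of why a merged distinct count is unknowable from the two input counts alone. The `Precision` enum, `ColumnStats` struct, and the `add`, `merge`, and `true_distinct` functions below are hypothetical stand-ins written for this example; they are not DataFusion's actual types or merge implementation.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a precision-tagged statistic (assumption, not the real type).
#[derive(Debug, Clone, PartialEq)]
enum Precision {
    Exact(usize),
    Inexact(usize),
    Absent,
}

/// Hypothetical per-column statistics with only the fields needed for this sketch.
#[derive(Debug, Clone, PartialEq)]
struct ColumnStats {
    null_count: Precision,
    distinct_count: Precision,
}

/// Additive statistics such as null counts can be summed across inputs.
fn add(a: &Precision, b: &Precision) -> Precision {
    match (a, b) {
        (Precision::Exact(x), Precision::Exact(y)) => Precision::Exact(x + y),
        (
            Precision::Exact(x) | Precision::Inexact(x),
            Precision::Exact(y) | Precision::Inexact(y),
        ) => Precision::Inexact(x + y),
        _ => Precision::Absent,
    }
}

/// Merge two sets of column statistics. The distinct count is dropped because
/// the overlap between the two inputs is unknown.
fn merge(a: &ColumnStats, b: &ColumnStats) -> ColumnStats {
    ColumnStats {
        null_count: add(&a.null_count, &b.null_count),
        distinct_count: Precision::Absent,
    }
}

/// Ground truth for the example: count distinct values across both inputs.
fn true_distinct(a: &[i64], b: &[i64]) -> usize {
    a.iter().chain(b.iter()).collect::<HashSet<_>>().len()
}
```

Two inputs that each report 3 distinct values can merge to anywhere between 3 (identical values) and 6 (disjoint values) distinct values, so neither summing nor taking the maximum is exact; in this sketch `merge` reports `Precision::Absent` instead.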

Test plan

  • Added test test_try_merge_distinct_count_absent to verify distinct_count becomes Absent after merge
  • All existing tests continue to pass

🤖 Generated with Claude Code

When merging statistics, the distinct count cannot be accurately
determined from the merged data, so it should be set to Absent
rather than attempting to combine the values.

Added test to verify distinct_count becomes Absent after merge.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
@github-actions github-actions bot added the common Related to common crate label Sep 3, 2025
@crepererum crepererum left a comment

thank you

@crepererum crepererum merged commit 71d3d29 into apache:main Sep 4, 2025
28 checks passed
@adriangb adriangb deleted the drop-distinct-count branch September 4, 2025 13:31
destrex271 pushed a commit to destrex271/datafusion that referenced this pull request Sep 5, 2025