@Tpt Tpt commented Oct 20, 2025

  • introduce a new "result table" intermediate table storing all already emitted results
  • use existing physical operators to deduplicate the output of both the static and recursive terms, and to remove already emitted results from the recursive term output
  • add a simple test of a transitive closure on a cyclic graph
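
To make the approach above concrete, here is a minimal sketch of the evaluation loop on a transitive closure, using plain tuples instead of Arrow batches. The names `recursive_term`, `dedup_against` and `transitive_closure` are hypothetical and do not come from this PR; the linear scans stand in for the existing distinct and anti-join operators and are deliberately naive.

```rust
/// Hypothetical stand-in for the recursive term: extends every path in the
/// working table by one edge.
fn recursive_term(working: &[(u32, u32)], edges: &[(u32, u32)]) -> Vec<(u32, u32)> {
    let mut out = Vec::new();
    for &(src, mid) in working {
        for &(from, to) in edges {
            if mid == from {
                out.push((src, to));
            }
        }
    }
    out
}

/// Deduplicates a batch and drops rows already present in the result table.
/// Linear scans keep the sketch short; they are not meant to be efficient.
fn dedup_against(batch: Vec<(u32, u32)>, result_table: &[(u32, u32)]) -> Vec<(u32, u32)> {
    let mut distinct = Vec::new();
    for row in batch {
        if !distinct.contains(&row) && !result_table.contains(&row) {
            distinct.push(row);
        }
    }
    distinct
}

/// Evaluates the closure: the static term seeds the working table, and each
/// iteration only keeps rows that were never emitted before.
fn transitive_closure(edges: &[(u32, u32)]) -> Vec<(u32, u32)> {
    let mut result_table: Vec<(u32, u32)> = Vec::new();
    let mut working = dedup_against(edges.to_vec(), &result_table);
    while !working.is_empty() {
        // Record what is emitted in this iteration.
        result_table.extend_from_slice(&working);
        let next = recursive_term(&working, edges);
        working = dedup_against(next, &result_table);
    }
    result_table
}
```

Calling `transitive_closure(&[(1, 2), (2, 3), (3, 1)])` terminates even though the graph is cyclic, because once every reachable pair is in the result table the working table becomes empty.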

This is very naive and slow. It would be much better to build the result table as a hash table inside `RecursiveQueryExec` and use it for deduplication, removing the need for any extra operator. I guess it will require significant coding because we need an incremental hash map (build and probe steps are interleaved).
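
As a rough illustration of what "build and probe steps are interleaved" could mean, the sketch below uses a standard `HashSet` whose `insert` both probes and builds in a single call. `ResultTable` and `filter_new` are hypothetical names, and rows are plain tuples; an implementation inside `RecursiveQueryExec` would have to hash rows of Arrow batches instead, which is not shown here.

```rust
use std::collections::HashSet;

/// Hypothetical result table: remembers every row emitted so far.
struct ResultTable {
    seen: HashSet<(u32, u32)>,
}

impl ResultTable {
    fn new() -> Self {
        Self { seen: HashSet::new() }
    }

    /// Returns the rows of `batch` that were never emitted before, in input
    /// order, and records them as emitted. `HashSet::insert` returns `true`
    /// only when the value was absent, so one call is both probe and build.
    fn filter_new(&mut self, batch: Vec<(u32, u32)>) -> Vec<(u32, u32)> {
        batch
            .into_iter()
            .filter(|row| self.seen.insert(*row))
            .collect()
    }
}
```

With this shape, the same call deduplicates within a batch and against earlier iterations, so no separate distinct or anti-join step is needed in the sketch.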

Edit: made it a draft; I want to experiment with a hash table first.

@github-actions github-actions bot added logical-expr Logical plan and expressions core Core DataFusion crate sqllogictest SQL Logic Tests (.slt) physical-plan Changes to the physical-plan crate labels Oct 20, 2025
@Tpt Tpt force-pushed the tpt/distinct-cte branch from d05f004 to 5c7c816 Compare October 20, 2025 20:02
@Tpt Tpt marked this pull request as draft October 21, 2025 05:02
@Tpt Tpt force-pushed the tpt/distinct-cte branch from 5c7c816 to a517468 Compare October 21, 2025 18:46
@Tpt Tpt changed the title Naive deduplicating recursive CTE implementation feat: Naive deduplicating recursive CTE implementation Oct 21, 2025
@Tpt Tpt marked this pull request as ready for review October 21, 2025 19:29
@Tpt Tpt marked this pull request as draft October 22, 2025 19:23
Labels

core Core DataFusion crate logical-expr Logical plan and expressions physical-plan Changes to the physical-plan crate sqllogictest SQL Logic Tests (.slt)

Development

Successfully merging this pull request may close these issues.

Support deduplicating UNION in recursive CTE
