Description
Describe the bug
This is a follow-up to this comment.
Currently, DataFusion computes bounds for all queries that contain a HashJoinExec node whenever the option enable_dynamic_filter_pushdown is set to true (default). It might make sense to compute these bounds only when we explicitly know there is a consumer that will use them.
One way to achieve this could be during physical planning: while traversing the plan, check whether there is any scan/leaf node that is “interested in” or supports dynamic filters (determined by gather_filters_for_pushdown). This might just require adding some logic to the filter pushdown optimization rule itself I think?
Then, only if there is at least one interested consumer that accepts the DynamicFilterPhysicalExpr, set a flag on HashJoinExec to build the bounds accumulator; otherwise, skip bounds computation entirely.
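To make the idea concrete, here is a rough sketch of the traversal on a toy plan tree. Everything in it is hypothetical and simplified: the `Plan` enum, the `accepts_dynamic_filters` and `compute_bounds` fields, and both functions are made up for illustration and are not DataFusion's actual types or APIs. It also assumes the dynamic filter is only useful to scans under the join's probe side, and treats any accepting scan in that subtree as an interested consumer.

```rust
// Hypothetical, simplified plan model; these are NOT DataFusion's real types.
#[derive(Debug)]
enum Plan {
    /// Leaf scan. `accepts_dynamic_filters` stands in for whatever the real
    /// scan would report during filter pushdown.
    Scan { accepts_dynamic_filters: bool },
    /// Hash join. `compute_bounds` is the hypothetical flag proposed above.
    HashJoin {
        build: Box<Plan>,
        probe: Box<Plan>,
        compute_bounds: bool,
    },
    /// Any other operator, modeled as simply forwarding filters to children.
    Passthrough { children: Vec<Plan> },
}

/// Returns true if any scan under `plan` would accept a dynamic filter.
/// Simplification: column-level routing of filters is ignored.
fn has_interested_consumer(plan: &Plan) -> bool {
    match plan {
        Plan::Scan { accepts_dynamic_filters } => *accepts_dynamic_filters,
        Plan::HashJoin { build, probe, .. } => {
            has_interested_consumer(build) || has_interested_consumer(probe)
        }
        Plan::Passthrough { children } => children.iter().any(has_interested_consumer),
    }
}

/// Walks the plan and enables bounds computation only on joins whose probe
/// side contains at least one interested consumer.
fn mark_joins(plan: &mut Plan) {
    match plan {
        Plan::Scan { .. } => {}
        Plan::HashJoin { build, probe, compute_bounds } => {
            // Handle nested joins first, then decide for this join.
            mark_joins(build);
            mark_joins(probe);
            *compute_bounds = has_interested_consumer(probe);
        }
        Plan::Passthrough { children } => children.iter_mut().for_each(mark_joins),
    }
}
```

With this model, a join whose probe side only contains scans that reject dynamic filters ends up with `compute_bounds` set to false after `mark_joins`, so the bounds accumulator would never be built for it. In the real code base the check would presumably live inside the filter pushdown rule rather than in a separate pass.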
To Reproduce
No response
Expected behavior
No response
Additional context
No response