
Only compute bounds/dynamic filters if a consumer asks for it #17527

@LiaCastaneda


Describe the bug

Just a follow-up to this comment.

Currently, DataFusion computes bounds for every query that contains a HashJoinExec node whenever the option enable_dynamic_filter_pushdown is set to true (the default). It might make sense to compute these bounds only when we know there is a consumer that will use them.

One way to achieve this could be during physical planning: while traversing the plan, check whether any scan/leaf node is “interested in” or supports dynamic filters (as determined by gather_filters_for_pushdown). I think this might only require adding some logic to the filter pushdown optimization rule itself.

Then, only if at least one interested consumer accepts the DynamicFilterPhysicalExpr, set a flag on HashJoinExec to build the bounds accumulator; otherwise, skip bounds computation entirely.
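
To make the idea concrete, here is a rough, self-contained sketch of the traversal. It does not use DataFusion's real types: `Plan`, `accepts_dynamic_filters`, `compute_bounds`, `has_interested_consumer` and `mark_joins` are all hypothetical names made up for illustration. It also assumes that the join's dynamic filter targets the probe side, so only that subtree is checked for consumers.

```rust
// Hypothetical, simplified plan model. These are NOT DataFusion's real types.
#[derive(Debug)]
enum Plan {
    /// Leaf scan. `accepts_dynamic_filters` stands in for whatever the
    /// filter pushdown rule learns about this node.
    Scan { accepts_dynamic_filters: bool },
    /// Hash join. `compute_bounds` is the proposed flag (hypothetical).
    HashJoin {
        build: Box<Plan>,
        probe: Box<Plan>,
        compute_bounds: bool,
    },
    /// Any other single-input operator that lets filters pass through.
    Passthrough { input: Box<Plan> },
}

/// Returns true if any leaf below `plan` would accept a dynamic filter.
fn has_interested_consumer(plan: &Plan) -> bool {
    match plan {
        Plan::Scan { accepts_dynamic_filters } => *accepts_dynamic_filters,
        Plan::HashJoin { build, probe, .. } => {
            has_interested_consumer(build) || has_interested_consumer(probe)
        }
        Plan::Passthrough { input } => has_interested_consumer(input),
    }
}

/// Walks the plan and turns on bounds computation only for joins whose
/// probe side contains at least one interested consumer (assumption: the
/// dynamic filter is pushed to the probe side).
fn mark_joins(plan: &mut Plan) {
    match plan {
        Plan::Scan { .. } => {}
        Plan::Passthrough { input } => mark_joins(input),
        Plan::HashJoin { build, probe, compute_bounds } => {
            // Handle nested joins first, then decide for this join.
            mark_joins(build);
            mark_joins(probe);
            *compute_bounds = has_interested_consumer(probe);
        }
    }
}
```

With this shape, a join over a probe-side scan that accepts dynamic filters ends up with `compute_bounds == true`, while a join whose probe side has no such scan ends up with `false` and could skip building the bounds accumulator. In the real code base the check would presumably live inside the filter pushdown rule rather than in a separate pass.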



Labels: bug
