Merged
6 changes: 6 additions & 0 deletions datafusion/physical-plan/src/joins/cross_join.rs
@@ -175,6 +175,12 @@ impl CrossJoinExec {
/// Returns a new `ExecutionPlan` that computes the same join as this one,
/// with the left and right inputs swapped using the specified
/// `partition_mode`.
///
/// # Notes:
///
/// This function should be called BEFORE inserting any repartitioning
/// operators on the join's children. Check [`super::HashJoinExec::swap_inputs`]
/// for more details.
pub fn swap_inputs(&self) -> Result<Arc<dyn ExecutionPlan>> {
let new_join =
CrossJoinExec::new(Arc::clone(&self.right), Arc::clone(&self.left));
15 changes: 15 additions & 0 deletions datafusion/physical-plan/src/joins/hash_join/exec.rs
@@ -638,6 +638,21 @@ impl HashJoinExec {
///
/// This function is public so other downstream projects can use it to
/// construct `HashJoinExec` with right side as the build side.
///
/// When calling this function directly, note the following:
///
/// Hash join execution may require specific input partitioning (for example,
/// the left child may have a single partition while the right child has multiple).
///
/// Calling this function on join nodes whose children have already been repartitioned
/// (e.g., after a `RepartitionExec` has been inserted) may break the partitioning
/// requirements of the hash join. Therefore, ensure you call this function
/// before inserting any repartitioning operators on the join's children.
///
/// In DataFusion's default SQL interface, this function is used by the `JoinSelection`
/// physical optimizer rule to determine a good join order. That rule runs
/// before the `EnforceDistribution` rule (the rule that may insert
/// `RepartitionExec` operators).
pub fn swap_inputs(
&self,
partition_mode: PartitionMode,
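The ordering constraint described in the `HashJoinExec::swap_inputs` doc comment can be illustrated with a toy model. This is a minimal sketch and does not use DataFusion's API: `ToyJoin`, `enforce_distribution`, and `satisfies_requirement` are hypothetical names, each child is reduced to its partition count, and the only requirement modeled is the doc comment's example (the left child has a single partition).

```rust
// Toy model only (NOT DataFusion's API): each join child is represented
// by nothing more than its number of output partitions.
#[derive(Debug, Clone, Copy, PartialEq)]
struct ToyJoin {
    left_partitions: usize,
    right_partitions: usize,
}

impl ToyJoin {
    /// Swap the two children, analogous in spirit to `swap_inputs`.
    fn swap_inputs(self) -> ToyJoin {
        ToyJoin {
            left_partitions: self.right_partitions,
            right_partitions: self.left_partitions,
        }
    }

    /// Hypothetical stand-in for a distribution-enforcing pass: the left
    /// child is collapsed to one partition and the right child is spread
    /// over `target` partitions.
    fn enforce_distribution(self, target: usize) -> ToyJoin {
        ToyJoin {
            left_partitions: 1,
            right_partitions: target,
        }
    }

    /// Assumed requirement, taken from the doc comment's example:
    /// the left child must have a single partition.
    fn satisfies_requirement(self) -> bool {
        self.left_partitions == 1
    }
}

/// Recommended order: swap first, then repartition.
fn swap_then_enforce(join: ToyJoin, target: usize) -> ToyJoin {
    join.swap_inputs().enforce_distribution(target)
}

/// Problematic order: repartition first, then swap.
fn enforce_then_swap(join: ToyJoin, target: usize) -> ToyJoin {
    join.enforce_distribution(target).swap_inputs()
}

fn main() {
    let join = ToyJoin { left_partitions: 4, right_partitions: 4 };

    let good = swap_then_enforce(join, 8);
    let bad = enforce_then_swap(join, 8);

    println!("swap then enforce: {:?} ok={}", good, good.satisfies_requirement());
    println!("enforce then swap: {:?} ok={}", bad, bad.satisfies_requirement());
}
```

In this sketch, swapping after the repartitioning step moves the multi-partition child to the left, so the assumed requirement no longer holds. Swapping first lets the later pass establish it.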
6 changes: 6 additions & 0 deletions datafusion/physical-plan/src/joins/nested_loop_join.rs
@@ -348,6 +348,12 @@ impl NestedLoopJoinExec {

/// Returns a new `ExecutionPlan` that runs NestedLoopsJoins with the left
/// and right inputs swapped.
///
/// # Notes:
///
/// This function should be called BEFORE inserting any repartitioning
/// operators on the join's children. Check [`super::HashJoinExec::swap_inputs`]
/// for more details.
pub fn swap_inputs(&self) -> Result<Arc<dyn ExecutionPlan>> {
let left = self.left();
let right = self.right();
5 changes: 5 additions & 0 deletions datafusion/physical-plan/src/joins/sort_merge_join/exec.rs
@@ -304,6 +304,11 @@ impl SortMergeJoinExec {
))
}

/// # Notes:
///
/// This function should be called BEFORE inserting any repartitioning
/// operators on the join's children. Check [`super::super::HashJoinExec::swap_inputs`]
/// for more details.
pub fn swap_inputs(&self) -> Result<Arc<dyn ExecutionPlan>> {
let left = self.left();
let right = self.right();