@jiaqizho
Contributor

Fixes #ISSUE_Number

What does this PR do?

Type of Change

  • Bug fix (non-breaking change)
  • New feature (non-breaking change)
  • Breaking change (fix or feature with breaking changes)
  • Documentation update

Breaking Changes

Test Plan

  • Unit tests added/updated
  • Integration tests added/updated
  • Passed make installcheck
  • Passed make -C src/test installcheck-cbdb-parallel

Impact

Performance:

User-facing changes:

Dependencies:

Checklist

Additional Context

CI Skip Instructions


@my-ship-it
Contributor

Just tried:

-- State transition function: processes one input value and updates the state
CREATE OR REPLACE FUNCTION avg_trans_fn(
    state numeric[],  -- current state: [sum, count]
    next_val numeric  -- new input value
) RETURNS numeric[] AS $$
BEGIN
    IF next_val IS NOT NULL THEN
        state[1] := COALESCE(state[1], 0) + next_val;
        state[2] := COALESCE(state[2], 0) + 1;
    END IF;
    RETURN state;
END;
$$ LANGUAGE plpgsql;

-- Combine function: merges two partial aggregate states
CREATE OR REPLACE FUNCTION avg_combine_fn(
    state1 numeric[],
    state2 numeric[]
) RETURNS numeric[] AS $$
BEGIN
    IF state1 IS NULL THEN
        RETURN state2;
    ELSIF state2 IS NULL THEN
        RETURN state1;
    ELSE
        RETURN ARRAY[state1[1] + state2[1], state1[2] + state2[2]];
    END IF;
END;
$$ LANGUAGE plpgsql;

-- Final function: computes the final result from the state
CREATE OR REPLACE FUNCTION avg_final_fn(
    state numeric[]
) RETURNS numeric AS $$
BEGIN
    IF state[2] = 0 THEN
        RETURN NULL;
    ELSE
        RETURN state[1] / state[2];
    END IF;
END;
$$ LANGUAGE plpgsql;

CREATE AGGREGATE my_avg(numeric) (
    sfunc = avg_trans_fn,         -- state transition function
    stype = numeric[],            -- state type
    combinefunc = avg_combine_fn, -- combine function
    finalfunc = avg_final_fn,     -- final function
    initcond = '{0,0}'            -- initial state
);

CREATE TABLE tbl1(v1 int, v2 numeric);
CREATE TABLE tbl2(v1 int, v2 numeric);

The agg pushdown doesn't work in ORCA but works in the Postgres query optimizer.
image
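
For anyone following along, here is a rough sketch of why the `combinefunc` is what makes `my_avg` a candidate for multi-stage (and therefore pushed-down) aggregation. It mirrors the three SQL functions above in Python purely for illustration; `partial_state`, `my_avg_one_stage`, `my_avg_two_stage`, and the sample values are hypothetical and not part of the PR.

```python
from fractions import Fraction

# Mirrors initcond = '{0,0}': the state is (sum, count).
INIT = (Fraction(0), 0)


def avg_trans_fn(state, next_val):
    """Fold one input value into the state; NULL (None) inputs are skipped."""
    if next_val is None:
        return state
    return (state[0] + next_val, state[1] + 1)


def avg_combine_fn(state1, state2):
    """Merge two partial states, treating a missing state as 'no rows seen'."""
    if state1 is None:
        return state2
    if state2 is None:
        return state1
    return (state1[0] + state2[0], state1[1] + state2[1])


def avg_final_fn(state):
    """Turn the (sum, count) state into the final average, or None if empty."""
    if state[1] == 0:
        return None
    return state[0] / state[1]


def partial_state(values):
    """Hypothetical helper: run the transition function over one chunk of rows."""
    state = INIT
    for value in values:
        state = avg_trans_fn(state, value)
    return state


def my_avg_one_stage(values):
    """One-stage plan: a single aggregate sees every row."""
    return avg_final_fn(partial_state(values))


def my_avg_two_stage(chunks):
    """Two-stage plan: partial states per chunk, then combine, then finalize."""
    merged = None
    for chunk in chunks:
        merged = avg_combine_fn(merged, partial_state(chunk))
    return avg_final_fn(merged if merged is not None else INIT)


values = [Fraction(1), Fraction(2), None, Fraction(6)]
assert my_avg_one_stage(values) == my_avg_two_stage([values[:2], values[2:]])
```

Because the combine step only adds sums and counts, the split point between chunks does not affect the result, which is the property a planner relies on when it places the partial stage lower in the plan.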

@jiaqizho
Contributor Author


With no data in tbl1 and tbl2, ORCA chooses the one-stage agg, but the pushdown path does exist.

We can set optimizer_force_multistage_agg to force ORCA to choose the multi-stage path (this won't create the no-pushdown path, though, because of the distribution enforcement).

image

This also works with my_avg:

image

@jiaqizho force-pushed the push-local-agg-to-join branch from 2e02282 to 21c03c6 on June 25, 2025 10:14
@jiaqizho requested a review from my-ship-it on June 26, 2025 01:44
Contributor

@my-ship-it left a comment

LGTM

@gongxun0928 force-pushed the push-local-agg-to-join branch from 21c03c6 to 95dae30 on August 14, 2025 08:09
Unlike CXformUtils::PexprPushGbBelowJoin (which pushes down the one-stage agg),
the partial agg is pushed down only when all three rules hold:
 1. The group-by key set is in (contains all) the join key set.
 2. The output of the join's outer child contains the scalar from the group-by.
   - This scalar is not a group-by key; it is the output of the agg.
 3. The output of the group-by contains all of the join scalars used from the outer child.

A partial agg cannot be pushed through the join when:
(1) the group-by key set is not in (contains all) the join key set, or
(2) the Gb uses columns from both join children, or
(3) the Gb hides columns required for the join scalar child.

Unlike CXformUtils::PexprPushGbBelowJoin, pushing down a partial agg does **NOT REQUIRE**
a unique key in the join keys of the inner child.
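
A minimal sketch of why the partial variant can tolerate duplicate keys on the inner side, assuming an inner join on a single key, a group-by on that same key, and a sum/count style state with a combine step. It is not ORCA code; the function names and the sample rows are hypothetical.

```python
from collections import defaultdict


def agg_after_join(outer, inner):
    """One-stage plan: join first, then accumulate (sum, count) per key."""
    states = defaultdict(lambda: [0, 0])
    for key, value in outer:
        for inner_key in inner:
            if inner_key == key:
                states[key][0] += value
                states[key][1] += 1
    return {key: tuple(state) for key, state in states.items()}


def partial_agg_below_join(outer, inner):
    """Pushed-down plan: partial agg on the outer child, join, then combine."""
    # Partial (local) stage below the join, grouped by the join key.
    partial = defaultdict(lambda: [0, 0])
    for key, value in outer:
        partial[key][0] += value
        partial[key][1] += 1

    # Join the partial states with the inner child. Each matching inner row
    # emits one copy of the state, and the final stage combines the copies.
    final = defaultdict(lambda: [0, 0])
    for inner_key in inner:
        if inner_key in partial:
            final[inner_key][0] += partial[inner_key][0]
            final[inner_key][1] += partial[inner_key][1]
    return {key: tuple(state) for key, state in final.items()}


# Key 1 appears twice on the inner side, so it is not a unique key.
outer_rows = [(1, 10), (1, 20), (2, 5)]
inner_keys = [1, 1, 2, 3]
assert agg_after_join(outer_rows, inner_keys) == partial_agg_below_join(outer_rows, inner_keys)
```

In this sketch the final combine stage above the join re-merges the duplicated partial states, so sums and counts match the join-then-aggregate plan even though the inner key repeats.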

Also, we should **NEVER** push down the multi-stage agg produced by `CXformSplitDQA` (the DISTINCT case).
@jiaqizho force-pushed the push-local-agg-to-join branch from 95dae30 to 1c382e7 on August 15, 2025 06:43
@jiaqizho merged commit 7752ebd into apache:main on Aug 15, 2025
26 checks passed