Incorrect result with cum_sum in dynamic_group_by ternary expression #24566

@kdn36

Checks

  • I have checked that this issue has not already been reported.
  • I have confirmed this bug exists on the latest version of Polars.

Reproducible example

import polars as pl


df = pl.select(pl.date_range(pl.date(2023, 1, 1), pl.date(2023, 1, 5))).with_row_index()

out = df.group_by_dynamic(
    index_column="date",
    period="3d",
    every="1d",
).agg(
    [
        pl.col("index"),
        pl.when(pl.col("date") >= pl.col("date"))
        .then(pl.col("index").cast(pl.Int64).cum_sum())
        .last()
        .alias("cum_sum_last"),
    ],
)
print(out)

Log output

shape: (5, 3)
┌────────────┬───────────┬──────────────┐
│ date       ┆ index     ┆ cum_sum_last │
│ ---        ┆ ---       ┆ ---          │
│ date       ┆ list[u32] ┆ i64          │
╞════════════╪═══════════╪══════════════╡
│ 2023-01-01 ┆ [0, 1, 2] ┆ 3            │
│ 2023-01-02 ┆ [1, 2, 3] ┆ 1            │
│ 2023-01-03 ┆ [2, 3, 4] ┆ 3            │
│ 2023-01-04 ┆ [3, 4]    ┆ 3            │
│ 2023-01-05 ┆ [4]       ┆ 3            │
└────────────┴───────────┴──────────────┘

Issue description

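The cum_sum_last column is incorrect for every window except the first: the output above shows 1, 3, 3, 3 for the last four windows, where the last cumulative sum of each window should be 6, 9, 7, 4. Internal note: the groups do not appear to be updated correctly as the expression is evaluated (see the unroll() path).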

Expected behavior

Expected result:

shape: (5, 3)
┌────────────┬───────────┬──────────────┐
│ date       ┆ index     ┆ cum_sum_last │
│ ---        ┆ ---       ┆ ---          │
│ date       ┆ list[u32] ┆ i64          │
╞════════════╪═══════════╪══════════════╡
│ 2023-01-01 ┆ [0, 1, 2] ┆ 3            │
│ 2023-01-02 ┆ [1, 2, 3] ┆ 6            │
│ 2023-01-03 ┆ [2, 3, 4] ┆ 9            │
│ 2023-01-04 ┆ [3, 4]    ┆ 7            │
│ 2023-01-05 ┆ [4]       ┆ 4            │
└────────────┴───────────┴──────────────┘

This is the result we get with the following expression (note the change in the when predicate):

out = df.group_by_dynamic(
    index_column="date",
    period="3d",
    every="1d",
).agg(
    [
        pl.col("index"),
        pl.when([True])
        .then(pl.col("index").cast(pl.Int64).cum_sum())
        .last()
        .alias("cum_sum_last"),
    ],
)
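
As a cross-check that does not depend on Polars, the expected cum_sum_last values can be derived by hand. The sketch below assumes left-closed, right-open windows of three days starting on each date, which is consistent with the index lists shown in the output above. The helper name expected_cum_sum_last is hypothetical and exists only for this illustration.

```python
from datetime import date, timedelta
from itertools import accumulate

# Same data as the report: five consecutive days with row index 0..4.
dates = [date(2023, 1, 1) + timedelta(days=i) for i in range(5)]
index = list(range(5))


def expected_cum_sum_last(dates, index, period_days=3):
    """Return the last cumulative sum of `index` for each window.

    Assumption: one window per date (every="1d" on daily data), each
    covering [start, start + period_days), i.e. left-closed, right-open.
    """
    result = []
    for start in dates:
        end = start + timedelta(days=period_days)
        # Collect the index values whose date falls inside this window.
        window = [i for d, i in zip(dates, index) if start <= d < end]
        # The last element of the running sum is what .cum_sum().last() should give.
        result.append(list(accumulate(window))[-1])
    return result


expected = expected_cum_sum_last(dates, index)
print(expected)  # [3, 6, 9, 7, 4]
```

These values match the cum_sum_last column of the expected result, and only the first of them matches the actual output.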

Installed versions

--------Version info---------
Polars:              1.33.1
Index type:          UInt32
Platform:            Linux-6.6.87.2-microsoft-standard-WSL2-x86_64-with-glibc2.39
Python:              3.13.5 (main, Jul 18 2025, 09:47:32) [GCC 13.3.0]
LTS CPU:             False

Metadata

Labels

  • A-rolling
  • P-high (Priority: high)
  • accepted (Ready for implementation)
  • bug (Something isn't working)
  • python (Related to Python Polars)
