Skip to content

Conversation

@kosiew
Copy link
Contributor

@kosiew kosiew commented Oct 21, 2025

Which issue does this PR close?

Rationale for this change

The current implementation of the nvl2 function in DataFusion eagerly evaluates all its arguments, which can lead to unnecessary computation and incorrect behavior when handling expressions that should only be conditionally evaluated. This PR introduces lazy evaluation for nvl2, aligning its behavior with other conditional expressions like coalesce and improving both performance and correctness.

This change also introduces a simplification rule that rewrites nvl2 expressions into equivalent CASE statements, allowing for better optimization during query planning and execution.

What changes are included in this PR?

  • Refactored nvl2 implementation in datafusion/functions/src/core/nvl2.rs:

    • Added support for short-circuit (lazy) evaluation using short_circuits().
    • Implemented simplify() method to rewrite expressions into CASE form.
    • Introduced return_field_from_args() for correct nullability and type inference.
    • Replaced the previous eager nvl2_func() logic with an optimized, more declarative approach.
  • Added comprehensive unit tests:

    • test_nvl2_short_circuit in dataframe_functions.rs verifies correct short-circuit behavior.
    • test_create_physical_expr_nvl2 in expr_api/mod.rs validates physical expression creation and output correctness.

Are these changes tested?

✅ Yes, multiple new tests are included:

  • test_nvl2_short_circuit ensures nvl2 does not evaluate unnecessary branches.
  • test_create_physical_expr_nvl2 checks the correctness of evaluation and type coercion behavior.

All existing and new tests pass successfully.

Are there any user-facing changes?

Yes, but they are non-breaking and performance-enhancing:

  • nvl2 now evaluates lazily, meaning only the required branch is computed based on the nullity of the test expression.
  • Expression simplification will yield more optimized query plans.

There are no API-breaking changes. However, users may observe improved performance and reduced computation for expressions involving nvl2.

…ullability handling, and marked the evaluator as short-circuiting to avoid unreachable execution paths.

Added a dataframe regression test that exercises nvl2 with a potentially failing branch to confirm lazy evaluation behaviour.
Handle NVL2 execution when simplifier is skipped using null
masks to select between branch values. Add a regression test
for expr_api to validate SessionContext::create_physical_expr
with NVL2, ensuring successful evaluation without prior
simplification.
@github-actions github-actions bot added core Core DataFusion crate functions Changes to functions implementation labels Oct 21, 2025
@kosiew kosiew marked this pull request as ready for review October 21, 2025 08:50

fn invoke_with_args(&self, args: ScalarFunctionArgs) -> Result<ColumnarValue> {
nvl2_func(&args.args)
let [test, if_non_null, if_null] = take_function_args(self.name(), args.args)?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In #17357 the author chose to make invoke_with_args return an internal error instead of retaining the implementation. Would we want to do the same here?

I'm a bit on the fence myself. On the one hand, this is effectively dead code for most users. On the other hand, raising an error here may cause breakage for users who have customised their optimiser passes and are not doing simplification. No idea if anyone actually does that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point 🤔

Personally I'm of the mind to remove this impl and have it return error; part of the benefit of this PR is reducing the amount of code we'd need to maintain.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make invoke_with_args return an internal error instead of retaining the implementation.

The new fallback evaluator is exercised directly in test_create_physical_expr_nvl2, which builds physical expressions without running the simplifier. Returning an error here would regress those non-simplified code paths.

It isn’t just about satisfying a unit test, it's about preserving a supported API surface. SessionContext::create_physical_expr explicitly states that it performs coercion and rewrite passes but does not run the expression simplifier, so any expression handed directly to that API must still execute correctly without being rewritten to a CASE statement first.
The test_create_physical_expr_nvl2 fixture exercises exactly that public workflow by building a physical expression through SessionContext::create_physical_expr and evaluating it without simplification.
If we changed invoke_with_args to return an error, that flow would regress for library users in the same way it would fail for the test.

Rather than removing or rewriting the test, I think we should keep it to guard this behavior; it’s effectively documenting that nvl2 continues to work for consumers who rely on the non-simplifying physical-expr builder, which the function implementation currently supports.

I recommend keeping the implementation so those tests—and any downstream consumers that bypass simplification—continue to work.

Copy link
Contributor

@pepijnve pepijnve Oct 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change in coalesce (and now indirectly also nvl/ifnull) already broke this though. If unsimplified execution is desirable, perhaps nvl should be restored too because to not have arbitrary behaviour depending on the used UDF. In other words, I think you have to be consistent about this. Either all physical exprs should work or you shouldn’t bother with this. Cherry picking is a bit pointless in my opinion.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any expression handed directly to that API must still execute correctly without being rewritten to a CASE statement first.

One subtlety here is that there is a change in semantics before and after simplification. nvl2(1, 1, 1 / 0) will fail pre simplification but will work correctly once simplified due to the switch from eager to lazy evaluation. I think I would prefer a clear failure over a subtle difference in behaviour.

If we do want to keep the invoke_with_args implementations, one option could be to consider #17997 (or some variant of that idea) so that it can also be implemented lazily.

Regarding code maintenance/duplication, nvl2 is an instance of the ExpressionOrExpression evaluation method from CaseExpr. Perhaps a slightly modified version of CaseExpr::expr_or_expr could be made so that nvl and nvl2 could call that? I think what I'm trying to say is that maybe code reuse via simplify is maybe not the best idea.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In #17357 the author chose to make invoke_with_args return an internal error instead of retaining the implementation. Would we want to do the same here?

I amended invoke_with_args to return internal_err for consistency and also to reduce code.


fn invoke_with_args(&self, args: ScalarFunctionArgs) -> Result<ColumnarValue> {
nvl2_func(&args.args)
let [test, if_non_null, if_null] = take_function_args(self.name(), args.args)?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point 🤔

Personally I'm of the mind to remove this impl and have it return error; part of the benefit of this PR is reducing the amount of code we'd need to maintain.

kosiew and others added 2 commits October 22, 2025 11:13
…cution without being simplified to a CASE expression, removing the eager evaluation helpers that previously enforced eager semantics.

Updated the expr_api integration test to assert that unsimplified nvl2 evaluation now fails with the expected internal error message.
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alamb
Copy link
Contributor

alamb commented Oct 23, 2025

This is one we probably should run the extended tests on too.
I started it locally with

INCLUDE_SQLITE=true cargo test --profile release-nonlto --test sqllogictests

And will report back

@alamb
Copy link
Contributor

alamb commented Oct 24, 2025

Extended tests passed for me. Thanks again @kosiew @pepijnve and @Jefffrey

@alamb alamb added this pull request to the merge queue Oct 24, 2025
Merged via the queue into apache:main with commit 22c4214 Oct 24, 2025
28 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

core Core DataFusion crate functions Changes to functions implementation

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Implement lazy evaluation for nvl2

4 participants