Contributor

@alamb alamb commented Jun 23, 2025

Which issue does this PR close?

Rationale for this change

Following @berkaysynnada's suggestion in #14837: having parallel implementations of functions allows sync and async UDFs to drift apart.

Let's try to make the differences as small as possible.

What changes are included in this PR?

  1. Avoid duplication in AsyncScalarUDFImpl and instead use the same structures as ScalarUDFImpl
  2. Update examples to match
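
The shape of this change can be sketched with a supertrait. This is a minimal, standard-library-only illustration and not the DataFusion code: `ScalarUdfLike`, `AsyncScalarUdfLike`, `AddOne`, `describe`, and `block_on` are hypothetical names invented here, and the argument and return types are simplified stand-ins. The idea is that the async trait declares only the async entry point and inherits everything else from the sync trait, so the two cannot drift apart.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical stand-in for `ScalarUDFImpl`: metadata shared by every UDF.
trait ScalarUdfLike {
    fn name(&self) -> &str;
    fn return_type(&self) -> &str;
}

/// Hypothetical stand-in for `AsyncScalarUDFImpl`. With `ScalarUdfLike` as a
/// supertrait, it only declares what is new: the async entry point.
trait AsyncScalarUdfLike: ScalarUdfLike {
    async fn invoke_async(&self, args: Vec<i64>) -> Vec<i64>;
}

struct AddOne;

impl ScalarUdfLike for AddOne {
    fn name(&self) -> &str {
        "add_one"
    }
    fn return_type(&self) -> &str {
        "Int64"
    }
}

impl AsyncScalarUdfLike for AddOne {
    async fn invoke_async(&self, args: Vec<i64>) -> Vec<i64> {
        args.into_iter().map(|v| v + 1).collect()
    }
}

/// Code written against the async trait can still reach the shared metadata.
fn describe<U: AsyncScalarUdfLike>(udf: &U) -> String {
    format!("{}() -> {}", udf.name(), udf.return_type())
}

/// Waker that does nothing; enough to drive futures that never suspend.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Tiny busy-polling executor so the sketch needs no async runtime.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}
```

With this layout, `describe(&AddOne)` reads the name and return type through the sync trait, while `block_on(AddOne.invoke_async(vec![1, 2]))` runs the async path. The real traits use `#[async_trait]` and DataFusion's own argument types, which this sketch does not try to reproduce.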

Are these changes tested?

Are there any user-facing changes?

The API is changed, but it has not been released yet, so this is not an external API change.

@github-actions github-actions bot added the logical-expr, physical-expr, core, and physical-plan labels Jun 23, 2025
  async fn invoke_async_with_args(
      &self,
-     args: AsyncScalarFunctionArgs,
+     args: ScalarFunctionArgs,
Contributor Author

It is interesting here that invoke_async_with_args has a copy of the config_options 🤔 -- I think that is something that @Omega359 has tried to get into normal scalar functions for a while.

Contributor

It makes sense to me. We allow custom config, so giving a UDF access to the config options makes it more flexible.
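
As a rough illustration of that flexibility, here is a sketch of a function whose behavior is driven by a config value passed alongside its arguments. Everything here is an assumption made for the example: `ConfigOptionsLike`, its `rows_per_request` field, `ArgsLike`, and `plan_requests` are hypothetical stand-ins, not DataFusion's `ConfigOptions` or `ScalarFunctionArgs`.

```rust
/// Hypothetical stand-in for DataFusion's `ConfigOptions`.
struct ConfigOptionsLike {
    /// Hypothetical setting: how many rows to send per remote request.
    rows_per_request: usize,
}

/// Hypothetical stand-in for `ScalarFunctionArgs`.
struct ArgsLike {
    values: Vec<i64>,
}

/// Splits the input into request-sized groups. The group size comes from the
/// config, so a user can tune it per session instead of recompiling the UDF.
fn plan_requests(args: &ArgsLike, config: &ConfigOptionsLike) -> Vec<Vec<i64>> {
    args.values
        // `chunks` panics on a size of zero, so clamp to at least one row.
        .chunks(config.rows_per_request.max(1))
        .map(|chunk| chunk.to_vec())
        .collect()
}
```

The same function gives different groupings for different sessions purely from the config argument, which is the kind of flexibility the comment above describes.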

@github-actions github-actions bot added the documentation label Jun 23, 2025
@@ -35,34 +35,7 @@ use std::sync::Arc;
///
/// The name is chosen to mirror ScalarUDFImpl
#[async_trait]
pub trait AsyncScalarUDFImpl: Debug + Send + Sync {
/// the function cast as any
fn as_any(&self) -> &dyn Any;
Contributor Author

The point of this PR is to remove this duplication from AsyncScalarUDFImpl and use ScalarUDFImpl instead.

@alamb alamb marked this pull request as ready for review June 23, 2025 19:43
Contributor Author

alamb commented Jun 23, 2025

I feel like there may be some more duplication we can remove in the PhysicalExpr layer too, but I don't have time to pursue that at the moment. Maybe @goldmedal can give it a look.

@alamb alamb requested review from goldmedal and berkaysynnada June 23, 2025 20:23
Contributor

@goldmedal goldmedal left a comment

Thanks @alamb. Looks good to me 👍

Contributor Author

alamb commented Jun 26, 2025

Thanks again for the review @goldmedal

@alamb alamb merged commit b405380 into apache:main Jun 26, 2025
28 checks passed
@alamb alamb deleted the alamb/simplify_async_udfs branch June 26, 2025 15:25
adriangb pushed a commit to pydantic/datafusion that referenced this pull request Jun 27, 2025
* Simplify AsyncScalarUdfImpl so it extends ScalarUdfImpl

* Update one example

* Update one example

* prettier
Labels
core, documentation, logical-expr, physical-expr, physical-plan
Development

Successfully merging this pull request may close these issues.

Update AsyncScalarUDFImpl API to match ScalarUDFImpl API
2 participants