Feat: [datafusion-spark] Implement ceil function. #18174
base: main
use datafusion_functions::utils::make_scalar_function;

/// <https://spark.apache.org/docs/latest/api/sql/index.html#ceil>
/// Difference between spark: There is no second optional argument to control the rounding behaviour.
I have highlighted the difference from Spark here.
This is... interesting
I did not realise our ceil accepts two arguments 🤯
> select ceil(50.5123, 0);
+------------------------+
| ceil(Float64(50.5123)) |
+------------------------+
| 51.0 |
+------------------------+
1 row(s) fetched.
Elapsed 0.006 seconds.
> select ceil(50.5123, 1);
+------------------------+
| ceil(Float64(50.5123)) |
+------------------------+
| 51.0 |
+------------------------+
1 row(s) fetched.
Elapsed 0.002 seconds.
I'll try to understand what's happening here 🤔
Raised #18175
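For context on what a second argument could mean, here is a minimal sketch of a scale-aware ceil on plain `f64` values. It assumes the second argument is a number of decimal places to keep, which is how the doc comment above describes Spark's optional argument. `ceil_with_scale` is a hypothetical helper, not DataFusion's or Spark's implementation, and the floating-point arithmetic here is only an approximation of what a decimal-based implementation would do.

```rust
/// Hypothetical helper (not part of DataFusion or Spark):
/// round `x` up to `scale` decimal places using plain f64 arithmetic.
/// Assumption: `scale` counts digits after the decimal point.
fn ceil_with_scale(x: f64, scale: i32) -> f64 {
    // Shift the digits to keep into the integer part, round up, shift back.
    let factor = 10f64.powi(scale);
    (x * factor).ceil() / factor
}
```

Under this sketch, `ceil_with_scale(50.5123, 0)` gives `51.0` and `ceil_with_scale(50.5123, 1)` gives approximately `50.6`, so the two queries above would return different results if the second argument were honoured rather than ignored.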
}

fn return_type(&self, _arg_types: &[DataType]) -> Result<DataType> {
    Ok(Int64)
I guess this is another difference; our ceil returns float I believe. So Spark requires integer?
After some research while implementing decimal support, I figured this should return a float as well 😅. Thanks for raising this!
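To make the trade-off between the two return types concrete, here is a small plain-Rust sketch. Both function names are hypothetical and neither is taken from this PR. It relies only on Rust's standard `f64::ceil` and on the fact that an `as` cast from float to integer saturates at the integer bounds and maps NaN to 0.

```rust
/// Hypothetical: ceil that keeps the floating-point type.
fn ceil_as_float(x: f64) -> f64 {
    x.ceil()
}

/// Hypothetical: ceil that returns a 64-bit integer.
/// The `as` cast saturates at i64::MIN / i64::MAX and turns NaN into 0,
/// so out-of-range and NaN inputs silently lose information.
fn ceil_as_int(x: f64) -> i64 {
    x.ceil() as i64
}
```

For ordinary inputs the two agree (`50.5123` becomes `51.0` or `51`), but the integer variant cannot represent NaN, infinities, or values beyond the `i64` range, which is one reason the choice of return type matters here.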
    Ok(Arc::new(array))
}
Int64 => Ok(Arc::clone(&args[0])),
_ => {
Will decimal also be supported?
Here is Comet's ceil implementation for reference. It supports decimal types.
https://github.com/apache/datafusion-comet/blob/main/native/spark-expr/src/math_funcs/ceil.rs
Decimal should be supported as well; right now it would be coerced to a float.
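As a rough illustration of why coercing to a float is avoidable, here is a sketch of ceil computed directly on a decimal stored as an unscaled `i128` plus a scale, so the value 50.5123 is represented as `505123` with scale 4. `decimal_ceil` is a hypothetical helper; it is not the Comet implementation linked above nor code from this PR, and it assumes the scale is small enough that `10^scale` fits in an `i128`.

```rust
/// Hypothetical helper: ceil of a decimal given as `unscaled / 10^scale`,
/// returning the resulting whole number as an i128.
/// Assumption: `scale` <= 38, so 10^scale does not overflow i128.
fn decimal_ceil(unscaled: i128, scale: u32) -> i128 {
    let divisor = 10i128.pow(scale);
    // Euclidean division floors toward negative infinity for a positive
    // divisor, and the remainder is always in 0..divisor.
    let quotient = unscaled.div_euclid(divisor);
    let remainder = unscaled.rem_euclid(divisor);
    // Any non-zero fractional part pushes the result up by one.
    if remainder > 0 { quotient + 1 } else { quotient }
}
```

Because everything stays in integer arithmetic, there is no rounding error from an intermediate float: `decimal_ceil(505123, 4)` is `51` and `decimal_ceil(-505123, 4)` is `-50`.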
Modify slt result assertions to return a float value.
0c1cd43 to d306988
I wonder if we address #18175 then will we still need a separate
I was not aware of the attempts made with #15958 and on having a look the return type has been kept the same as the input type (and for floor it is i128) so I am now doubtful whether my conclusion on what it should return is correct. Would love to have the thoughts of @shehabgamin and @andygrove for some clarity.
@codetyri0n You can port over the implementation from Sail if you'd like: It should be a straightforward port over. Further, we have some additional tests not covered by the Spark test suite here:
Which issue does this PR close?
datafusion-spark Spark Compatible Functions #15914 and [EPIC] Migrate to functions in datafusion-spark crate datafusion-comet#2084
Closes #15916
What changes are included in this PR?
Are these changes tested?
Are there any user-facing changes?