[SPARK-52848][SQL] Avoid cast to `Double` in casting TIME/TIMESTAMP to DECIMAL #51539

MaxGekk · 2025-07-17T16:28:09Z

What changes were proposed in this pull request?

In this PR, I propose to simplify casting TIME/TIMESTAMP to DECIMAL and to avoid the intermediate cast to Double.

Why are the changes needed?

To avoid unnecessary arithmetic operations and to make the code easier to maintain.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

By running the affected test suites:

$ build/sbt "test:testOnly *CastWithAnsiOnSuite"
$ build/sbt "test:testOnly *CastWithAnsiOffSuite"
$ build/sbt "sql/testOnly org.apache.spark.sql.SQLQueryTestSuite -- -z cast.sql"

Was this patch authored or co-authored using generative AI tooling?

No.

MaxGekk · 2025-07-17T18:43:17Z

@uros-db Please review this PR.

uros-db · 2025-07-17T18:51:02Z

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala

+    case TimestampType => buildCast[Long](_, t => changePrecision(
+        // 19 digits is enough to represent any TIMESTAMP value in Long.
+        // 6 digits of scale is for microseconds precision of TIMESTAMP values.
+        Decimal.apply(t, 19, 6), target))
+    case _: TimeType => buildCast[Long](_, t => changePrecision(
+      // 14 digits is enough to cover the full range of TIME value [0, 24:00) which is
+      // [0, 24 * 60 * 60 * 1000 * 1000 * 1000) = [0, 86400000000000).
+      // 9 digits of scale is for nanoseconds precision of TIME values.
+      Decimal.apply(t, precision = 14, scale = 9), target))


Are we really sure that we always want to use fixed precision and scale here?

We have to use a fixed scale at least, to get the correct decimal. And the precision should guarantee that we cover the full input range.

Yeah, that part is alright. However, as a consequence of this, we also get fixed precision/scale in error messages (e.g. #51539 (comment)). Let's continue the discussion over there.
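The fixed precision and scale discussed above can be illustrated outside Spark. The following is a minimal sketch, not the code from `Cast.scala`: it uses `java.math.BigDecimal` in place of Spark's `Decimal`, and the object and method names are hypothetical. It assumes, as the diff comments state, that TIMESTAMP values are microsecond counts in a `Long` and TIME values are nanosecond counts in a `Long`.

```scala
import java.math.BigDecimal

// Hypothetical helper names; java.math.BigDecimal stands in for Spark's Decimal.
object UnscaledLongToDecimal {
  // Microseconds -> seconds: treat the Long as the unscaled value with scale 6.
  def timestampMicrosToSeconds(micros: Long): BigDecimal =
    BigDecimal.valueOf(micros, 6)

  // Nanoseconds -> seconds: treat the Long as the unscaled value with scale 9.
  def timeNanosToSeconds(nanos: Long): BigDecimal =
    BigDecimal.valueOf(nanos, 9)

  def main(args: Array[String]): Unit = {
    // 23:59:59.123456 expressed as nanoseconds since midnight.
    println(timeNanosToSeconds(86399123456000L).toPlainString) // 86399.123456000

    // Any Long has at most 19 decimal digits, hence precision 19 for TIMESTAMP.
    println(Long.MaxValue.toString.length) // 19

    // The exclusive upper bound of TIME in nanoseconds has 14 digits.
    println((24L * 60 * 60 * 1000 * 1000 * 1000).toString.length) // 14

    // A Double cannot hold every Long exactly: 2^53 + 1 rounds to 2^53.
    println(9007199254740993L.toDouble.toLong) // 9007199254740992
  }
}
```

No arithmetic is performed on the value itself; only the position of the decimal point is fixed, which is why the scale has to match the unit of the source type.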

uros-db · 2025-07-17T18:51:32Z

sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastWithAnsiOnSuite.scala

@@ -810,7 +810,7 @@ class CastWithAnsiOnSuite extends CastSuiteBase with QueryErrorsBase {
        ),
        condition = "NUMERIC_VALUE_OUT_OF_RANGE.WITH_SUGGESTION",
        parameters = Map(
-          "value" -> "86399.123456",
+          "value" -> "86399.123456000",


Related to the previous comment: don't these error messages become a bit counter-intuitive for users?

It seems it is counter-intuitive independently from the scale. No doubt there is room for improving the error. We could print the original value of the source type, like TIME'23:59:59.123456', instead of 86399.123456, or maybe print both together.

> It seems it is counter-intuitive independently from the scale.

This is a good argument. And yes, source type is likely the best fit here, at least from the user perspective I think.
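To make the renderings under discussion concrete, here is a minimal sketch of the three candidate forms of the same TIME value. It is an illustration only and does not reflect how Spark builds its error parameters: `java.math.BigDecimal` and `java.time.LocalTime` stand in for Spark internals, and all names are hypothetical.

```scala
import java.math.BigDecimal
import java.time.LocalTime

// Hypothetical names; a sketch of the renderings discussed in this thread.
object ErrorValueRendering {
  // Seconds with the fixed scale of 9: trailing zeros are kept.
  def fixedScaleSeconds(nanosOfDay: Long): String =
    BigDecimal.valueOf(nanosOfDay, 9).toPlainString

  // Seconds with trailing zeros stripped.
  def trimmedSeconds(nanosOfDay: Long): String =
    BigDecimal.valueOf(nanosOfDay, 9).stripTrailingZeros.toPlainString

  // The value in terms of the source type, as suggested in the review.
  def timeLiteral(nanosOfDay: Long): String =
    s"TIME'${LocalTime.ofNanoOfDay(nanosOfDay)}'"

  def main(args: Array[String]): Unit = {
    val nanos = 86399123456000L // 23:59:59.123456
    println(fixedScaleSeconds(nanos)) // 86399.123456000
    println(trimmedSeconds(nanos))    // 86399.123456
    println(timeLiteral(nanos))       // TIME'23:59:59.123456'
  }
}
```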

uros-db

Nice trick with using both changePrecision and Decimal without double, although some error messages look a bit weird. If we're fine with this, then LGTM. Otherwise, I have no concerns regarding this PR.

MaxGekk added 2 commits July 17, 2025 17:52

Regen golden files

6ee0c78

Add comments

57aa895

github-actions bot added the SQL label Jul 17, 2025

MaxGekk changed the title ~~[WIP][SQL] Avoid cast to Double in casting of TIME/TIMESTAMP to DECIMAL~~ [WIP][SPARK-52848][SQL] Avoid cast to Double in casting TIME/TIMESTAMP to DECIMAL Jul 17, 2025

MaxGekk changed the title ~~[WIP][SPARK-52848][SQL] Avoid cast to Double in casting TIME/TIMESTAMP to DECIMAL~~ [SPARK-52848][SQL] Avoid cast to Double in casting TIME/TIMESTAMP to DECIMAL Jul 17, 2025

MaxGekk marked this pull request as ready for review July 17, 2025 18:42

uros-db reviewed Jul 17, 2025

uros-db approved these changes Jul 17, 2025
