feat(snowflake)!: Transpilation support for TO_DECIMAL, TO_NUMBER,NUMERIC #7315

Open
fivetran-ashashankar wants to merge 3 commits into main from RD-1069319_TO_NUMBER_DECIMAL_NUMERIC

@fivetran-ashashankar
Collaborator

No description provided.

@fivetran-ashashankar fivetran-ashashankar changed the title feat(snowflake)!: Transpilation support for TO_DECIMAL, TO_NUMBER, TO… feat(snowflake)!: Transpilation support for TO_DECIMAL, TO_NUMBER,NUMERIC Mar 17, 2026
@@ -894,9 +894,9 @@ def eliminate_join_marks(expression: exp.Expr) -> exp.Expr:
if not left_join_table:
Collaborator Author

Running make style reformatted transforms.py.

@@ -4181,6 +4181,84 @@ def strtok_sql(self, expression: exp.Strtok) -> str:

return self.function_fallback_sql(expression)

Collaborator Author

@fivetran-ashashankar fivetran-ashashankar Mar 17, 2026

Snowflake: TO_NUMBER(expr) defaults to NUMBER(38, 0) -> truncates decimals -> BIGINT
Oracle/others: TO_NUMBER(expr) defaults to NUMBER -> keeps decimals -> DOUBLE
I'm working on handling this difference. Should I use is_snowflake here?
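
To make the question concrete, here is a minimal sketch of a flag-based default, assuming the only inputs are a boolean like the is_snowflake mentioned above and that no precision or scale was passed. The function name default_to_number_type is hypothetical and is not part of sqlglot; the two target types are the ones listed in the comment above.

```python
def default_to_number_type(is_snowflake: bool) -> str:
    """Pick the cast target for TO_NUMBER(expr) with no precision or scale.

    Hypothetical helper for illustration only.
    Snowflake defaults to NUMBER(38, 0), which this sketch maps to BIGINT;
    other dialects keep decimals, which this sketch maps to DOUBLE.
    """
    return "BIGINT" if is_snowflake else "DOUBLE"
```

Whether the branch belongs in the generator or in the Snowflake parser is a separate design choice that this sketch does not settle.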

@github-actions
Contributor

github-actions bot commented Mar 17, 2026

SQLGlot Integration Test Results

Comparing:

  • this branch (sqlglot:RD-1069319_TO_NUMBER_DECIMAL_NUMERIC, sqlglot version: RD-1069319_TO_NUMBER_DECIMAL_NUMERIC)
  • baseline (main, sqlglot version: 0.0.1.dev1)

⚠️ Limited to dialects: duckdb

By Dialect

| dialect | main | sqlglot:RD-1069319_TO_NUMBER_DECIMAL_NUMERIC | transitions | links |
| --- | --- | --- | --- | --- |
| duckdb -> duckdb | 4003/4004 passed (100.0%) | 4003/4004 passed (100.0%) | No change | full result / delta |

Overall

main: 4004 total, 4003 passed (pass rate: 100.0%), sqlglot version: 0.0.1.dev1

sqlglot:RD-1069319_TO_NUMBER_DECIMAL_NUMERIC: 4004 total, 4003 passed (pass rate: 100.0%), sqlglot version: RD-1069319_TO_NUMBER_DECIMAL_NUMERIC

Transitions:
No change

…_NUMERIC . 3.9 support. string concatenation ("\\" + c) instead of f-string interpolation
Comment on lines 897 to +932
@@ -927,9 +927,9 @@ def eliminate_join_marks(expression: exp.Expr) -> exp.Expr:

         if query_from.alias_or_name in new_joins:
             only_old_joins = old_joins.keys() - new_joins.keys()
-            assert len(only_old_joins) >= 1, (
-                "Cannot determine which table to use in the new FROM clause"
-            )
+            assert (
+                len(only_old_joins) >= 1
+            ), "Cannot determine which table to use in the new FROM clause"
Collaborator

Using the latest main with a fresh venv should solve this.

Comment on lines +4190 to +4196
# Parse arguments: format_arg could be format string or precision
if format_arg and format_arg.is_string:
    format_string = format_arg.this
    precision, scale = precision_arg, scale_arg
else:
    format_string = None
    precision, scale = format_arg, precision_arg
Collaborator

This doesn't handle cases like:

WITH t AS (SELECT '$9,999.99' AS f) SELECT TO_DECIMAL('$3,741.72', f, 6, 2) FROM t
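
One possible way to classify the arguments without requiring the format to be a string literal is sketched below. It rests on an assumption that is still an open question in this review: that precision and scale are always numeric literals. Under that assumption, any second argument that is not a numeric literal is treated as the format, whether it is a string literal or a column such as f. The Arg class and split_to_decimal_args are hypothetical stand-ins and not sqlglot APIs.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Arg:
    """Hypothetical stand-in for a parsed SQL argument (not a sqlglot class)."""

    sql: str
    is_number_literal: bool = False


def split_to_decimal_args(
    rest: Sequence[Arg],
) -> Tuple[Optional[Arg], Optional[Arg], Optional[Arg]]:
    """Return (format, precision, scale) for the arguments after the input expression.

    Assumption: precision and scale are always numeric literals, so a leading
    argument that is not a numeric literal must be the format.
    """
    args = list(rest)
    fmt = None
    if args and not args[0].is_number_literal:
        fmt = args.pop(0)
    if len(args) > 2:
        raise ValueError("expected at most precision and scale after the format")
    precision = args[0] if len(args) > 0 else None
    scale = args[1] if len(args) > 1 else None
    return fmt, precision, scale
```

For the query above, split_to_decimal_args([Arg("f"), Arg("6", True), Arg("2", True)]) yields f as the format, 6 as the precision and 2 as the scale, even though the value of f is unknown at transpile time.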

precision, scale = format_arg, precision_arg

# Handle hexadecimal format
if format_string and format_string.lower() in ("xxx", "xxxx"):
Collaborator

@geooo109 geooo109 Mar 18, 2026

This doesn't cover all the cases, right?

For example:

Snowflake:
SELECT TO_DECIMAL('ae5', '0XX')  
> 2789

Transpiled Duckdb:
SELECT CAST('ae5' AS BIGINT)
> Conversion Error:
Could not convert string 'ae5' to INT64
LINE 1: SELECT CAST('ae5' AS BIGINT);

Also, again, what happens if we don't have the value of the format?

WITH t AS (SELECT '0XX' AS f) SELECT TO_DECIMAL('ae5', f) from t;

We should generate the solution on top of the query, right, instead of handling the values at parse time?
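
As a reference for what the generated query would need to compute, here is a minimal Python sketch of the hexadecimal case. It only models formats made of X digits with optional leading zeros, such as 'XXX' or '0XX', and it is not a model of Snowflake's full format syntax. The name to_decimal_hex_reference is hypothetical. The point is that the format is inspected when the value is evaluated, so it works the same whether the format came from a literal or from a column.

```python
def to_decimal_hex_reference(value: str, fmt: str) -> int:
    """Evaluate TO_DECIMAL(value, fmt) for hexadecimal formats only.

    Hypothetical reference helper. The format is checked at evaluation time,
    which is what a query-side solution would have to do.
    """
    digits = fmt.upper().lstrip("0")
    if digits and set(digits) == {"X"}:
        # Hexadecimal format: interpret the input string as base 16.
        return int(value, 16)
    raise NotImplementedError(f"format {fmt!r} is outside this sketch")
```

Here to_decimal_hex_reference('ae5', '0XX') returns 2789, which matches the Snowflake result quoted above, while a list of accepted literals such as ("xxx", "xxxx") would miss '0XX'.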

# Parse arguments: format_arg could be format string or precision
if format_arg and format_arg.is_string:
    format_string = format_arg.this
    precision, scale = precision_arg, scale_arg
Collaborator

No need to re-assign precision and scale.

Comment on lines +4211 to +4228
if format_string:
    chars_to_remove = {",": ",", "$": "$", "£": "£", "€": "€", "¥": "¥"}
    chars = [c for symbol, c in chars_to_remove.items() if symbol in format_string]

    if chars:
        if len(chars) == 1:
            pattern = chars[0]
        else:
            # Escape special regex characters (Python 3.9 compatible)
            regex_special = r".^$*+?{}[]\|()"
            escaped = [c if c not in regex_special else "\\" + c for c in chars]
            pattern = "[" + "".join(escaped) + "]"
        this = exp.RegexpReplace(
            this=this,
            expression=exp.Literal.string(pattern),
            replacement=exp.Literal.string(""),
            modifiers=exp.Literal.string("g"),
        )
Collaborator

This should be done on the query side.

Check what we do in this PR: https://github.com/tobymao/sqlglot/pull/7283/changes#diff-a286b0dbb51576bf61912639b01aa715baa497e89392148823a841e1ef138d5dR4095

Also, let's ensure that we cover all the formats. For example, do we cover SELECT TO_NUMBER('-123.45', 'MI999.99')?
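
To illustrate two gaps in the stripping approach, here is a small sketch that uses Python's re module as a stand-in for the target engine's regex dialect, which is an assumption. STRIPPABLE and build_strip_pattern are hypothetical names. The sketch always emits a character class and escapes with re.escape, because a lone unescaped $ is an end-of-string anchor and would strip nothing. It also shows that a format such as 'MI999.99' produces no pattern at all, so sign handling needs separate treatment.

```python
import re
from typing import Optional

# Characters the diff above strips before casting (taken from chars_to_remove).
STRIPPABLE = ",$£€¥"


def build_strip_pattern(format_string: str) -> Optional[str]:
    """Build a regex character class for the strippable symbols in the format.

    Hypothetical helper. Returns None when the format contains none of them.
    """
    chars = [c for c in STRIPPABLE if c in format_string]
    if not chars:
        return None
    # Always use a character class so that "$" is never a bare anchor.
    return "[" + "".join(re.escape(c) for c in chars) + "]"
```

With this helper, re.sub(build_strip_pattern("$9,999.99"), "", "$3,741.72") gives "3741.72", whereas re.sub("$", "", "$3,741.72") leaves the string unchanged.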

Comment on lines +4241 to +4249
prec_val = int(precision.to_py())

if scale and not isinstance(scale, exp.Literal):
    self.unsupported(
        "TO_NUMBER with non-literal scale is not supported. Using DOUBLE instead of DECIMAL."
    )
    return self.sql(exp.cast(this, exp.DataType.Type.DOUBLE))

scale_val = int(scale.to_py()) if scale else 0
Collaborator

Do we need to call to_py and cast to int?

The precision and the scale will always be literals in Snowflake, right? (Can you check the rest of the dialects to see whether we always have a literal here and not an identifier?)

If so, we can use them directly in the cast and avoid the back and forth.

2 participants