Skip to content

[SPARK-52832][SQL] Fix JDBC dialect identifier quoting #51533

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 9 commits into
base: master
Choose a base branch
from

Conversation

urosstan-db
Copy link
Contributor

@urosstan-db urosstan-db commented Jul 17, 2025

What changes were proposed in this pull request?

  • Quote identifier escapes double quotes in base JDBCDialect
  • Quote identifier in MySQLDialect escapes backticks with another backticks
  • Quote identifier in DatabricksDialect escapes backticks with another backticks

Why are the changes needed?

Resolving a bug when JDBC connector needs to quote an identifier when identifier contains double quotes/backticks.

Does this PR introduce any user-facing change?

Fix identifier quoting in JDBC dialect when identifier contains double quotes/backticks.

How was this patch tested?

Using JDBCSuite

Was this patch authored or co-authored using generative AI tooling?

No

@github-actions github-actions bot added the SQL label Jul 17, 2025
@urosstan-db urosstan-db changed the title [SPARK-52832][CORE] Fix oracle dialect identifier quoting [SPARK-52832][SQL] Fix oracle dialect identifier quoting Jul 17, 2025
val mySQLDialect = JdbcDialects.get("jdbc:mysql://127.0.0.1/db")
val postgresDialect = JdbcDialects.get("jdbc:postgresql://127.0.0.1/db")
val derbyDialect = JdbcDialects.get("jdbc:derby:db")
val oracleDialect = JdbcDialects.get("jdbc:oracle:thin:@//localhost:1521/orcl")

val columns = Seq("abc", "key")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we put "col\"" here to test all dialects?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but many of them currently does not handle escaping properly (except the ones where quotes can't be part of identifiers).
I would rather fix one by one, or in smaller batches than to scope up PR and make some mistake.
I am fine also with making default implementation like Oracle's implementation, since if certain dialect use double quotes for quoting, but use some other escape character, we won't break anything, since currently escaping does not work at all.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yea I think we should make it the default, as it won't break anything.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, I checked almost all dialects, it seems we are fine with this rule of escaping. The only one that may not work are dialects out of Spark core project, but those don't work already.

@urosstan-db urosstan-db changed the title [SPARK-52832][SQL] Fix oracle dialect identifier quoting [SPARK-52832][SQL] Fix JDBC dialect identifier quoting Jul 17, 2025
…dentifier-quoting' into SPARK-52832-fix-oracle-dialect-identifier-quoting

# Conflicts:
#	sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCSuite.scala
assertResult(Seq("\"abc\"", "\"key\"", "\"double_quote\"\"\"", "\"back`\""))(postgresColumns)
assertResult(Seq("\"abc\"", "\"key\"", "\"double_quote\"\"\"", "\"back`\""))(derbyColumns)
assertResult(Seq("\"abc\"", "\"key\"", "\"double_quote\"\"\"", "\"back`\""))(oracleColumns)
assertResult(Seq("`abc`", "`key`", "`double_quote\"`", "`back```"))(databricksColumns)
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assertResult is used because it gives better error message, it accepts expected and actual parameters, so it can tell what is expected values in error message.a

@@ -286,6 +286,11 @@ private case class OracleDialect() extends JdbcDialect with SQLConfHelper with N
case _ => super.classifyException(e, condition, messageParameters, description, isRuntime)
}
}

override def quoteIdentifier(colName: String): String = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can remove it now as it's the same as default.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants