-
Notifications
You must be signed in to change notification settings - Fork 28.7k
[SPARK-52832][SQL] Fix JDBC dialect identifier quoting #51533
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
[SPARK-52832][SQL] Fix JDBC dialect identifier quoting #51533
Conversation
val mySQLDialect = JdbcDialects.get("jdbc:mysql://127.0.0.1/db") | ||
val postgresDialect = JdbcDialects.get("jdbc:postgresql://127.0.0.1/db") | ||
val derbyDialect = JdbcDialects.get("jdbc:derby:db") | ||
val oracleDialect = JdbcDialects.get("jdbc:oracle:thin:@//localhost:1521/orcl") | ||
|
||
val columns = Seq("abc", "key") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we put "col\""
here to test all dialects?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but many of them currently does not handle escaping properly (except the ones where quotes can't be part of identifiers).
I would rather fix one by one, or in smaller batches than to scope up PR and make some mistake.
I am fine also with making default implementation like Oracle's implementation, since if certain dialect use double quotes for quoting, but use some other escape character, we won't break anything, since currently escaping does not work at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yea I think we should make it the default, as it won't break anything.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, I checked almost all dialects, it seems we are fine with this rule of escaping. The only one that may not work are dialects out of Spark core project, but those don't work already.
…dentifier-quoting' into SPARK-52832-fix-oracle-dialect-identifier-quoting # Conflicts: # sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCSuite.scala
…2-fix-oracle-dialect-identifier-quoting
assertResult(Seq("\"abc\"", "\"key\"", "\"double_quote\"\"\"", "\"back`\""))(postgresColumns) | ||
assertResult(Seq("\"abc\"", "\"key\"", "\"double_quote\"\"\"", "\"back`\""))(derbyColumns) | ||
assertResult(Seq("\"abc\"", "\"key\"", "\"double_quote\"\"\"", "\"back`\""))(oracleColumns) | ||
assertResult(Seq("`abc`", "`key`", "`double_quote\"`", "`back```"))(databricksColumns) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assertResult
is used because it gives better error message, it accepts expected and actual parameters, so it can tell what is expected values in error message.a
@@ -286,6 +286,11 @@ private case class OracleDialect() extends JdbcDialect with SQLConfHelper with N | |||
case _ => super.classifyException(e, condition, messageParameters, description, isRuntime) | |||
} | |||
} | |||
|
|||
override def quoteIdentifier(colName: String): String = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can remove it now as it's the same as default.
What changes were proposed in this pull request?
JDBCDialect
MySQLDialect
escapes backticks with another backticksDatabricksDialect
escapes backticks with another backticksWhy are the changes needed?
Resolving a bug when JDBC connector needs to quote an identifier when identifier contains double quotes/backticks.
Does this PR introduce any user-facing change?
Fix identifier quoting in JDBC dialect when identifier contains double quotes/backticks.
How was this patch tested?
Using JDBCSuite
Was this patch authored or co-authored using generative AI tooling?
No