Refactored DataFrame JDBC API plus DataSource handling #1487
Merged
Commits (26 in total; this view shows changes from 6 commits)
142c0eb  Refactored DataFrame JDBC API for enhanced DataSource handling (zaleslaw)
de3a97a  Refactored schema extraction to use `readSqlTable` and `readSqlQuery`… (zaleslaw)
477da67  Refactored and modularized schema extraction utilities into a dedicat… (zaleslaw)
448eab6  Refactor: Replace `DataFrame` with `DataFrameSchema` for schema-relat… (zaleslaw)
e02a6be  Update logging levels in validation utilities to debug and minor sche… (zaleslaw)
7f66bf4  Refactor: support custom `PreparedStatement` configuration, unify que… (zaleslaw)
b53e9f1  Refactor: enhance `DbType` with batch size and query timeout properti… (zaleslaw)
08d927e  Refactor: centralize `makeCommonSqlToKTypeMapping` in `DbType`, strea… (zaleslaw)
9f75a84  Refactored query execution logic by introducing `readDataFrameFromDat… (zaleslaw)
565e969  Refactored ResultSet-processing utilities to use mutable lists for im… (zaleslaw)
cc8c861  Add `configureStatement` missed parameters (zaleslaw)
ecbbe53  Refactored JDBC utilities: added comprehensive error handling in `rea… (zaleslaw)
3310382  Update the exception type in the ` read from non-existing table` test… (zaleslaw)
23f7c1b  Renamed schema extraction functions from `getSchemaFor*` to `from*` f… (zaleslaw)
3377e98  Rename `fromSqlTable` and `fromSqlQuery` to `readSqlTable` and `readS… (zaleslaw)
e4e84c2  Update `GenerateDataSchemaTask` to use `DataFrameSchema` methods for … (zaleslaw)
3aea719  Refactor: improve code consistency, update parameter documentation, s… (zaleslaw)
5489889  Replace `DEFAULT_LIMIT` with nullable `limit` parameter, defaulting t… (zaleslaw)
328c46f  Add `validateLimit` utility to ensure limit parameter is null or posi… (zaleslaw)
e007fa5  Add `validateLimit` calls across all JDBC read methods to enforce lim… (zaleslaw)
56d93f5  Clarify "limit" parameter documentation and rename `readDataFrameFrom… (zaleslaw)
f0e4d69  Refactor JDBC data handling: relocate and centralize `buildSchemaByTa… (zaleslaw)
6f2e2de  Ktlint with Junie (zaleslaw)
1c772a2  Linter with Junie, part 2 (zaleslaw)
7009a61  Refactor and enhance JDBC: update references for improved consistency… (zaleslaw)
f928057  Add `DataFrameSchema.Companion` class to core API (zaleslaw)
1 change: 1 addition & 0 deletions
core/src/main/kotlin/org/jetbrains/kotlinx/dataframe/schema/DataFrameSchema.kt
51 changes: 51 additions & 0 deletions
dataframe-jdbc/src/main/kotlin/org/jetbrains/kotlinx/dataframe/io/DbConnectionConfig.kt
@@ -0,0 +1,51 @@
package org.jetbrains.kotlinx.dataframe.io

/**
 * Represents the configuration for an internally managed JDBC database connection.
 *
 * This class defines connection parameters used by the library to create a `Connection`
 * when the user does not provide one explicitly.
 * It is designed for safe, read-only access by default.
 *
 * @property url The JDBC URL of the database, e.g., `"jdbc:postgresql://localhost:5432/mydb"`.
 * Must follow the standard format: `jdbc:subprotocol:subname`.
 *
 * @property user The username used for authentication.
 * Optional, default is an empty string.
 *
 * @property password The password used for authentication.
 * Optional, default is an empty string.
 *
 * @property readOnly If `true` (default), the library will create the connection in read-only mode.
 * This enables the following behavior:
 * - `Connection.setReadOnly(true)`
 * - `Connection.setAutoCommit(false)`
 * - automatic `rollback()` at the end of execution
 *
 * If `false`, the connection will be created with JDBC defaults (usually read-write),
 * but the library will still reject any queries that appear to modify data
 * (e.g. contain `INSERT`, `UPDATE`, `DELETE`, etc.).
 *
 * Note: Connections created using this configuration are managed entirely by the library.
 * Users do not have access to the underlying `Connection` instance and cannot commit or close it manually.
 *
 * ### Examples:
 *
 * ```kotlin
 * // Safe read-only connection (default)
 * val config = DbConnectionConfig("jdbc:sqlite::memory:")
 * val df = DataFrame.readSqlQuery(config, "SELECT * FROM books")
 *
 * // Use default JDBC connection settings (still protected against mutations)
 * val config = DbConnectionConfig(
 *     url = "jdbc:sqlite::memory:",
 *     readOnly = false
 * )
 * ```
 */
public data class DbConnectionConfig(
    val url: String,
    val user: String = "",
    val password: String = "",
    val readOnly: Boolean = true,
)
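To make the `readOnly` contract in the KDoc concrete, here is a minimal sketch of how a library-managed connection could apply the three listed settings and reject statements that look like mutations. `ConnectionSettings`, `withManagedConnection`, `mutatingKeywords`, and `looksLikeMutation` are hypothetical names that do not appear in this PR, and the keywords beyond `INSERT`, `UPDATE`, and `DELETE` are an assumption; the library's actual validation may well differ.

```kotlin
import java.sql.Connection
import java.sql.DriverManager

// Hypothetical stand-in with the same fields as DbConnectionConfig in the diff.
data class ConnectionSettings(
    val url: String,
    val user: String = "",
    val password: String = "",
    val readOnly: Boolean = true,
)

// Hypothetical helper: opens a connection, applies the read-only settings from the KDoc,
// runs the block, and always rolls back (in read-only mode) and closes the connection.
fun <T> withManagedConnection(settings: ConnectionSettings, block: (Connection) -> T): T =
    DriverManager.getConnection(settings.url, settings.user, settings.password).use { connection ->
        if (settings.readOnly) {
            connection.isReadOnly = true   // Connection.setReadOnly(true)
            connection.autoCommit = false  // Connection.setAutoCommit(false)
        }
        try {
            block(connection)
        } finally {
            // Discard anything the block may have changed; runs even if the block throws.
            if (settings.readOnly) connection.rollback()
        }
    }

// Assumed keyword list; the KDoc only names INSERT, UPDATE, DELETE "etc.".
val mutatingKeywords = setOf("INSERT", "UPDATE", "DELETE", "DROP", "ALTER", "TRUNCATE")

// Naive check: splits the query into word tokens and looks for a mutating keyword.
fun looksLikeMutation(sql: String): Boolean =
    sql.uppercase().split(Regex("\\W+")).any { it in mutatingKeywords }
```

A token-based check like this one is deliberately conservative: it also rejects harmless queries whose string literals contain a keyword, which is consistent with the KDoc's wording "appear to modify data".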
@@ -1,14 +1,13 @@
package org.jetbrains.kotlinx.dataframe.io.db

import org.jetbrains.kotlinx.dataframe.io.DbConnectionConfig
import org.jetbrains.kotlinx.dataframe.io.TableColumnMetadata
import org.jetbrains.kotlinx.dataframe.io.TableMetadata
import org.jetbrains.kotlinx.dataframe.io.getSchemaForAllSqlTables
import org.jetbrains.kotlinx.dataframe.io.readAllSqlTables
import org.jetbrains.kotlinx.dataframe.schema.ColumnSchema
import java.sql.Connection
import java.sql.DatabaseMetaData
import java.sql.DriverManager
import java.sql.PreparedStatement
import java.sql.ResultSet
import kotlin.reflect.KType

@@ -40,6 +39,10 @@ public abstract class DbType(public val dbTypeInJdbcUrl: String) {
     */
    public open val tableTypes: List<String>? = listOf("TABLE", "BASE TABLE")


    public open val defaultFetchSize: Int = 1000
    public open val defaultQueryTimeout: Int? = null // null = no timeout

    /**
     * Returns a [ColumnSchema] produced from [tableColumnMetadata].
     */
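The `defaultFetchSize` and `defaultQueryTimeout` properties are consumed by the `configureReadStatement` method added further down in this file. As a rough illustration of that step, the sketch below applies the same three settings to a `PreparedStatement`. `applyReadSettings` and its parameters are hypothetical names chosen for this example, not part of the PR.

```kotlin
import java.sql.PreparedStatement
import java.sql.ResultSet

// Hypothetical helper mirroring the settings applied in configureReadStatement.
// queryTimeoutSeconds = null means "leave the driver's default timeout untouched".
fun applyReadSettings(
    statement: PreparedStatement,
    fetchSize: Int = 1000,
    queryTimeoutSeconds: Int? = null,
) {
    // Hint to the driver about how many rows to fetch per round trip.
    statement.fetchSize = fetchSize

    // JDBC expresses the query timeout in seconds; only set it when a value is given.
    queryTimeoutSeconds?.let { statement.queryTimeout = it }

    // Rows are only read front to back.
    statement.fetchDirection = ResultSet.FETCH_FORWARD
}
```

The `?.let` form is one way to avoid a `!!` on a nullable value: Kotlin cannot smart-cast an `open` property after a null check, because a subclass may override it with a getter that returns a different value on each call.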
@@ -70,14 +73,70 @@ public abstract class DbType(public val dbTypeInJdbcUrl: String) {
     */
    public abstract fun convertSqlTypeToKType(tableColumnMetadata: TableColumnMetadata): KType?


    /**
     * Builds a SELECT query for reading from a table.
     *
     * @param [tableName] the name of the table to query.
     * @param [limit] the maximum number of rows to retrieve. If 0 or negative, no limit is applied.
     * @return the SQL query string.
     */
    public open fun buildSelectTableQueryWithLimit(tableName: String, limit: Int): String {
        val quotedTableName = quoteIdentifier(tableName)
        return if (limit > 0) {
            buildSqlQueryWithLimit("SELECT * FROM $quotedTableName", limit)
        } else {
            "SELECT * FROM $quotedTableName"
        }
    }

    /**
     * Configures a [PreparedStatement] for optimal read performance.
     * This method is called automatically before statement execution.
     *
     * @param [statement] the prepared statement to configure.
     */
    public open fun configureReadStatement(
        statement: PreparedStatement
    ) {
        // Set fetch size for better streaming performance
        statement.fetchSize = defaultFetchSize


        if (defaultQueryTimeout != null) {
            statement.queryTimeout = defaultQueryTimeout!!
        }


        // Set the fetch direction (forward-only for read-only operations)
        statement.fetchDirection = ResultSet.FETCH_FORWARD
    }

    /**
     * Quotes an identifier (table or column name) according to database-specific rules.
     *
     * Examples:
     * - PostgreSQL: "tableName" or "schema"."table"
     * - MySQL: `tableName` or `schema`.`table`
     * - MS SQL: [tableName] or [schema].[table]
     * - SQLite/H2: no quotes for simple names
     *
     * @param [name] the identifier to quote (can contain dots for schema.table).
     * @return the quoted identifier.
     */
    public open fun quoteIdentifier(name: String): String {
        // Default: no quoting (works for SQLite, H2, simple names)
        return name
    }

    /**
     * Constructs a SQL query with a limit clause.
     *
     * @param sqlQuery The original SQL query.
     * @param limit The maximum number of rows to retrieve from the query. Default is 1.
     * @return A new SQL query with the limit clause added.
     */
    public open fun sqlQueryLimit(sqlQuery: String, limit: Int = 1): String = "$sqlQuery LIMIT $limit"
    public open fun buildSqlQueryWithLimit(sqlQuery: String, limit: Int = 1): String = "$sqlQuery LIMIT $limit"

    /**
     * Creates a database connection using the provided configuration.
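The query-building methods in this hunk are all `open`, so a database-specific subclass can change quoting or the limit clause independently. The sketch below shows how the pieces could compose, using a trimmed-down stand-in rather than the real `DbType`. `QueryDialect` and `DoubleQuotedDialect` are hypothetical names, and the double-quote rule follows the PostgreSQL example from the `quoteIdentifier` KDoc; it is not taken from the library's PostgreSQL implementation.

```kotlin
// Hypothetical, trimmed-down stand-in for the query-building part of DbType.
open class QueryDialect {
    // Default: no quoting, as in the diff.
    open fun quoteIdentifier(name: String): String = name

    // Default limit clause, as in the diff.
    open fun buildSqlQueryWithLimit(sqlQuery: String, limit: Int = 1): String = "$sqlQuery LIMIT $limit"

    // A limit of 0 or less means "no limit", matching the KDoc in the diff.
    open fun buildSelectTableQueryWithLimit(tableName: String, limit: Int): String {
        val base = "SELECT * FROM ${quoteIdentifier(tableName)}"
        return if (limit > 0) buildSqlQueryWithLimit(base, limit) else base
    }
}

// Hypothetical dialect that double-quotes every dot-separated part of an identifier
// and escapes embedded double quotes by doubling them.
class DoubleQuotedDialect : QueryDialect() {
    override fun quoteIdentifier(name: String): String =
        name.split(".").joinToString(".") { part ->
            "\"" + part.replace("\"", "\"\"") + "\""
        }
}

fun main() {
    println(QueryDialect().buildSelectTableQueryWithLimit("books", 10))
    println(DoubleQuotedDialect().buildSelectTableQueryWithLimit("public.books", 0))
}
```

Splitting on dots assumes that dots never occur inside a single identifier; a real implementation would need a policy for such names.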
4 changes: 2 additions & 2 deletions
dataframe-jdbc/src/main/kotlin/org/jetbrains/kotlinx/dataframe/io/db/Sqlite.kt
20 changes: 20 additions & 0 deletions
dataframe-jdbc/src/main/kotlin/org/jetbrains/kotlinx/dataframe/io/db/TableColumnMetadata.kt
@@ -0,0 +1,20 @@
package org.jetbrains.kotlinx.dataframe.io.db

/**
 * Represents a column in a database table to keep all required meta-information.
 *
 * @property [name] the name of the column.
 * @property [sqlTypeName] the SQL data type of the column.
 * @property [jdbcType] the JDBC data type of the column produced from [java.sql.Types].
 * @property [size] the size of the column.
 * @property [javaClassName] the class name in Java.
 * @property [isNullable] true if column could contain nulls.
 */
public data class TableColumnMetadata(
    val name: String,
    val sqlTypeName: String,
    val jdbcType: Int,
    val size: Int,
    val javaClassName: String,
    val isNullable: Boolean = false,
)
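`DbType.convertSqlTypeToKType` receives this metadata and returns a `KType?`. As a hedged illustration of what such a conversion could look like, the sketch below maps a handful of `java.sql.Types` constants to Kotlin types and carries `isNullable` into the result. `ColumnInfo` and `toKTypeOrNull` are hypothetical names, and the mapping is illustrative only; it is not the library's `makeCommonSqlToKTypeMapping`.

```kotlin
import java.sql.Types
import kotlin.reflect.KType
import kotlin.reflect.typeOf

// Hypothetical stand-in with the same fields as TableColumnMetadata in the diff.
data class ColumnInfo(
    val name: String,
    val sqlTypeName: String,
    val jdbcType: Int,
    val size: Int,
    val javaClassName: String,
    val isNullable: Boolean = false,
)

// Illustrative mapping only; returns null for JDBC types this sketch does not cover.
fun ColumnInfo.toKTypeOrNull(): KType? =
    when (jdbcType) {
        Types.INTEGER -> if (isNullable) typeOf<Int?>() else typeOf<Int>()
        Types.BIGINT -> if (isNullable) typeOf<Long?>() else typeOf<Long>()
        Types.BOOLEAN, Types.BIT -> if (isNullable) typeOf<Boolean?>() else typeOf<Boolean>()
        Types.CHAR, Types.VARCHAR, Types.LONGVARCHAR ->
            if (isNullable) typeOf<String?>() else typeOf<String>()
        else -> null
    }
```

Returning `null` for unknown types leaves room for a caller to fall back to another strategy, for example one based on `javaClassName`.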
14 changes: 14 additions & 0 deletions
dataframe-jdbc/src/main/kotlin/org/jetbrains/kotlinx/dataframe/io/db/TableMetadata.kt
@@ -0,0 +1,14 @@
package org.jetbrains.kotlinx.dataframe.io.db

/**
 * Represents a table metadata to store information about a database table,
 * including its name, schema name, and catalogue name.
 *
 * NOTE: we need to extract both, [schemaName] and [catalogue]
 * because the different databases have different implementations of metadata.
 *
 * @property [name] the name of the table.
 * @property [schemaName] the name of the schema the table belongs to (optional).
 * @property [catalogue] the name of the catalogue the table belongs to (optional).
 */
public data class TableMetadata(val name: String, val schemaName: String?, val catalogue: String?)
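The note about extracting both the schema and the catalogue can be illustrated with the standard JDBC metadata call. The sketch below reads the `TABLE_CAT`, `TABLE_SCHEM`, and `TABLE_NAME` columns that `DatabaseMetaData.getTables` returns and builds a qualified name from whichever parts are present. `TableInfo`, `listTables`, and `qualifiedName` are hypothetical names for this example and do not come from the PR.

```kotlin
import java.sql.DatabaseMetaData

// Hypothetical stand-in with the same fields as TableMetadata in the diff.
data class TableInfo(val name: String, val schemaName: String?, val catalogue: String?)

// Reads all tables of the given types; a driver may return null for the
// catalogue column, the schema column, or both.
fun listTables(
    metaData: DatabaseMetaData,
    tableTypes: List<String>? = listOf("TABLE", "BASE TABLE"),
): List<TableInfo> =
    metaData.getTables(null, null, "%", tableTypes?.toTypedArray()).use { rs ->
        buildList {
            while (rs.next()) {
                add(
                    TableInfo(
                        name = rs.getString("TABLE_NAME"),
                        schemaName = rs.getString("TABLE_SCHEM"),
                        catalogue = rs.getString("TABLE_CAT"),
                    ),
                )
            }
        }
    }

// Joins only the parts that the driver actually reported.
fun TableInfo.qualifiedName(): String =
    listOfNotNull(catalogue, schemaName, name).joinToString(".")
```

Keeping both fields nullable lets the same structure serve drivers that report a database as the catalogue as well as drivers that report only schemas.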