Commit a376b5c

updating jdbc docs with DuckDB examples (using korro) as well as fixing some grammar issues
1 parent afd8847 commit a376b5c

8 files changed: +186 -18 lines changed

docs/StardustDocs/d.tree

Lines changed: 1 addition & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -202,6 +202,7 @@
202202
<toc-element topic="SQLite.md"/>
203203
<toc-element topic="H2.md"/>
204204
<toc-element topic="MariaDB.md"/>
205+
<toc-element topic="DuckDB.md"/>
205206
<toc-element topic="Custom-SQL-Source.md"/>
206207
</toc-element>
207208
<toc-element topic="Integrations.md"/>

docs/StardustDocs/topics/dataSources/Data-Sources.md

Lines changed: 1 addition & 0 deletions
@@ -28,6 +28,7 @@ Below you'll find a list of supported sources along with instructions on how to
2828
- [SQLite](SQLite.md)
2929
- [H2](H2.md)
3030
- [MariaDB](MariaDB.md)
31+
- [DuckDB](DuckDB.md)
3132
- [Custom SQL Source](Custom-SQL-Source.md)
3233
- [Custom integrations with unsupported data sources](Integrations.md)
3334

Lines changed: 107 additions & 0 deletions
@@ -0,0 +1,107 @@
1+
# DuckDB
2+
3+
<web-summary>
4+
Work with DuckDB databases in Kotlin — read tables and queries into DataFrames using JDBC.
5+
</web-summary>
6+
7+
<card-summary>
8+
Use Kotlin DataFrame to query and transform DuckDB data directly via JDBC.
9+
</card-summary>
10+
11+
<link-summary>
12+
Read DuckDB data into Kotlin DataFrame with JDBC support.
13+
</link-summary>
14+
15+
<!---IMPORT org.jetbrains.kotlinx.dataframe.samples.io.DuckDb-->
16+
17+
Kotlin DataFrame supports reading from [DuckDB](https://duckdb.org/) databases using JDBC.
18+
19+
This requires the [`dataframe-jdbc` module](Modules.md#dataframe-jdbc),
20+
which is included by default in the general [`dataframe` artifact](Modules.md#dataframe-general)
21+
and in [`%use dataframe`](SetupKotlinNotebook.md#integrate-kotlin-dataframe) for Kotlin Notebook.
22+
23+
You’ll also need [the official DuckDB JDBC driver](https://duckdb.org/docs/stable/clients/java):
24+
25+
<tabs>
26+
<tab title="Gradle project">
27+
28+
```kotlin
29+
dependencies {
30+
implementation("org.duckdb:duckdb_jdbc:$version")
31+
}
32+
```
33+
34+
</tab>
35+
<tab title="Kotlin Notebook">
36+
37+
```kotlin
38+
USE {
39+
dependencies("org.duckdb:duckdb_jdbc:$version")
40+
}
41+
```
42+
43+
</tab>
44+
</tabs>
45+
46+
The latest driver version published to Maven Central can be found
47+
[here](https://mvnrepository.com/artifact/org.duckdb/duckdb_jdbc).
48+
49+
## Read
50+
51+
A [`DataFrame`](DataFrame.md) instance can be loaded from a database in several ways:
52+
a user can read data from a SQL table by a given name ([`readSqlTable`](readSqlDatabases.md)),
53+
as the result of a user-defined SQL query ([`readSqlQuery`](readSqlDatabases.md)),
54+
or from a given `ResultSet` ([`readResultSet`](readSqlDatabases.md)).
55+
It is also possible to load all data from non-system tables, each into a separate `DataFrame` ([
56+
`readAllSqlTables`](readSqlDatabases.md)).
57+
58+
See [](readSqlDatabases.md) for more details.
59+
60+
<!---FUN readSqlTable-->
61+
62+
```kotlin
63+
val url = "jdbc:duckdb:/testDatabase"
64+
val username = "duckdb"
65+
val password = "password"
66+
67+
val dbConfig = DbConnectionConfig(url, username, password)
68+
69+
val tableName = "Customer"
70+
71+
val df = DataFrame.readSqlTable(dbConfig, tableName)
72+
```
73+
74+
<!---END-->
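
The Read section above also mentions `readSqlQuery`, which the sample does not demonstrate on a plain DuckDB database. Below is a minimal sketch, assuming the DuckDB JDBC driver and the `dataframe-jdbc` module are on the classpath. The helper name `readAdultCustomers`, the table schema, and the inserted rows are hypothetical and exist only for illustration; the `readSqlQuery` call uses the same `connection` and `sqlQuery` parameters as the Iceberg sample in this commit.

```kotlin
import org.jetbrains.kotlinx.dataframe.AnyFrame
import org.jetbrains.kotlinx.dataframe.DataFrame
import org.jetbrains.kotlinx.dataframe.api.print
import org.jetbrains.kotlinx.dataframe.io.readSqlQuery
import java.sql.DriverManager

// Hypothetical helper: builds a tiny in-memory table and reads part of it back.
fun readAdultCustomers(): AnyFrame =
    // "jdbc:duckdb:" opens an in-memory database, so no database file is needed
    DriverManager.getConnection("jdbc:duckdb:").use { connection ->
        connection.createStatement().use { statement ->
            // Illustrative schema and rows (not part of the documented samples)
            statement.execute("CREATE TABLE Customer (id INTEGER, name VARCHAR, age INTEGER)")
            statement.execute(
                "INSERT INTO Customer VALUES (1, 'Alice', 31), (2, 'Bob', 17), (3, 'Carol', 45)",
            )
        }

        // The query result is read into a DataFrame before the connection is closed
        DataFrame.readSqlQuery(
            connection = connection,
            sqlQuery = "SELECT id, name FROM Customer WHERE age >= 18 ORDER BY id",
        )
    }

fun main() {
    readAdultCustomers().print()
}
```

Because the database lives in memory, the sketch leaves nothing behind on disk, which makes it convenient for trying out queries before pointing the URL at a real database file.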
75+
76+
### Extensions
77+
78+
DuckDB has a special trick up its sleeve: it has support
79+
for [extensions](https://duckdb.org/docs/stable/extensions/overview).
80+
These can be installed, loaded, and used to connect to a different database via DuckDB.
81+
See [Core Extensions](https://duckdb.org/docs/stable/core_extensions/overview) for a list of available extensions.
82+
83+
For example, let's load a dataframe
84+
from [Apache Iceberg via DuckDB](https://duckdb.org/docs/stable/core_extensions/iceberg/overview.html),
85+
as Iceberg is an unsupported data source in DataFrame at the moment:
86+
87+
<!---FUN readIcebergExtension-->
88+
89+
```kotlin
90+
// Creating an in-memory DuckDB database
91+
val connection = DriverManager.getConnection("jdbc:duckdb:")
92+
val df = connection.use { connection ->
93+
// install and load Iceberg
94+
connection.createStatement().execute("INSTALL iceberg; LOAD iceberg;")
95+
96+
// query a table from Iceberg using a specific SQL query
97+
DataFrame.readSqlQuery(
98+
connection = connection,
99+
sqlQuery = "SELECT * FROM iceberg_scan('data/iceberg/lineitem_iceberg', allow_moved_paths = true);",
100+
)
101+
}
102+
```
103+
104+
<!---END-->
105+
106+
As you can see, the process is very similar to reading from any other JDBC database,
107+
just without needing explicit DataFrame support.

docs/StardustDocs/topics/dataSources/sql/SQL.md

Lines changed: 1 addition & 0 deletions
@@ -29,6 +29,7 @@ Kotlin DataFrame provides out-of-the-box support for the most common SQL databas
2929
- [SQLite](SQLite.md)
3030
- [H2](H2.md)
3131
- [MariaDB](MariaDB.md)
32+
- [DuckDB](DuckDB.md)
3233

3334
You can also define a [Custom SQL Source](Custom-SQL-Source.md)
3435
to work with any other JDBC-compatible database.

docs/StardustDocs/topics/readSqlDatabases.md

Lines changed: 25 additions & 17 deletions
@@ -44,71 +44,79 @@ Also, there are a few **extension functions** available on `Connection`,
4444

4545

4646
**NOTE:** This is an experimental module, and for now,
47-
we only support four databases: MS SQL, MariaDB, MySQL, PostgreSQL, and SQLite.
47+
we only support these databases: MS SQL, MariaDB, MySQL, PostgreSQL, SQLite, and DuckDB.
4848

4949
Moreover, since release 0.15, you can register a custom SQL database; read more in our [guide](readSqlFromCustomDatabase.md).
5050

5151
Additionally, support for JSON and date-time types is limited.
5252
Please take this into consideration when using these functions.
5353

54-
## Getting started with reading from SQL database in Gradle Project
54+
## Getting started with reading from SQL database in a Gradle Project
5555

56-
In the first, you need to add a dependency
56+
First, you need to add a dependency
5757

5858
```kotlin
5959
implementation("org.jetbrains.kotlinx:dataframe-jdbc:$dataframe_version")
6060
```
6161

62-
after that, you need to add a dependency for a JDBC driver for the used database, for example
62+
After that, you need to add the dependency for the database's JDBC driver. For example:
6363

6464
For **MariaDB**:
6565

6666
```kotlin
6767
implementation("org.mariadb.jdbc:mariadb-java-client:$version")
6868
```
6969

70-
Maven Central version could be found [here](https://mvnrepository.com/artifact/org.mariadb.jdbc/mariadb-java-client).
70+
The Maven Central version can be found [here](https://mvnrepository.com/artifact/org.mariadb.jdbc/mariadb-java-client).
7171

7272
For **PostgreSQL**:
7373

7474
```kotlin
7575
implementation("org.postgresql:postgresql:$version")
7676
```
7777

78-
Maven Central version could be found [here](https://mvnrepository.com/artifact/org.postgresql/postgresql).
78+
The Maven Central version can be found [here](https://mvnrepository.com/artifact/org.postgresql/postgresql).
7979

8080
For **MySQL**:
8181

8282
```kotlin
8383
implementation("com.mysql:mysql-connector-j:$version")
8484
```
8585

86-
Maven Central version could be found [here](https://mvnrepository.com/artifact/com.mysql/mysql-connector-j).
86+
The Maven Central version can be found [here](https://mvnrepository.com/artifact/com.mysql/mysql-connector-j).
8787

8888
For **SQLite**:
8989

9090
```kotlin
9191
implementation("org.xerial:sqlite-jdbc:$version")
9292
```
9393

94-
Maven Central version could be found [here](https://mvnrepository.com/artifact/org.xerial/sqlite-jdbc).
94+
The Maven Central version can be found [here](https://mvnrepository.com/artifact/org.xerial/sqlite-jdbc).
9595

9696
For **MS SQL**:
9797

9898
```kotlin
9999
implementation("com.microsoft.sqlserver:mssql-jdbc:$version")
100100
```
101101

102-
Maven Central version could be found [here](https://mvnrepository.com/artifact/com.microsoft.sqlserver/mssql-jdbc).
102+
The Maven Central version can be found [here](https://mvnrepository.com/artifact/com.microsoft.sqlserver/mssql-jdbc).
103103

104-
In the second, be sure that you can establish a connection to the database.
104+
For **DuckDB**:
105105

106-
For this, usually, you need to have three things: a URL to a database, a username, and a password.
106+
```kotlin
107+
implementation("org.duckdb:duckdb_jdbc:$version")
108+
```
109+
110+
The Maven Central version can be found [here](https://mvnrepository.com/artifact/org.duckdb/duckdb_jdbc).
111+
112+
Next, be sure that you can establish a connection to the database.
113+
114+
For this, you usually need three things: a URL to the database, a username, and a password.
107115

108-
Call one of the following functions to collect data from a database and transform it to the dataframe.
116+
Call one of the following functions to collect data from the database and transform it to a dataframe.
109117

110-
For example, if you have a local PostgreSQL database named as `testDatabase` with table `Customer`,
111-
you could read first 100 rows and print the data just copying the code below:
118+
For example, if you have a local PostgreSQL database named `testDatabase` with a table `Customer`,
119+
you can read the first 100 rows and print the data by just copying the code below:
112120

113121
```kotlin
114122
import org.jetbrains.kotlinx.dataframe.io.DbConnectionConfig
@@ -127,7 +135,7 @@ val df = DataFrame.readSqlTable(dbConfig, tableName, 100)
127135
df.print()
128136
```
129137

130-
Find a full example project [here](https://github.com/zaleslaw/KotlinDataFrame-SQL-Examples/).
138+
You can find a full example project [here](https://github.com/zaleslaw/KotlinDataFrame-SQL-Examples/).
131139

132140
## Getting Started with Notebooks
133141

@@ -317,7 +325,7 @@ Note that reading from the `ResultSet` could potentially change its state.
317325

318326
The `dbType: DbType` parameter specifies the type of our database (e.g., PostgreSQL, MySQL, etc.),
319327
supported by a library.
320-
Currently, the following classes are available: `H2, MsSql, MariaDb, MySql, PostgreSql, Sqlite`.
328+
Currently, the following classes are available: `H2, MsSql, MariaDb, MySql, PostgreSql, Sqlite, DuckDb`.
321329

322330
Users can also pass objects describing their custom databases; see the [guide](readSqlFromCustomDatabase.md) for more information.
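
To illustrate how the `dbType` parameter might be passed for DuckDB, here is a hedged sketch that reads a `ResultSet` produced by plain JDBC. It assumes the `DuckDb` object lives in `org.jetbrains.kotlinx.dataframe.io.db` alongside the other database types and that `readResultSet` accepts a `ResultSet` followed by a `DbType`; please check both against the library version you use. The helper name `readViaResultSet` and the query are hypothetical.

```kotlin
import org.jetbrains.kotlinx.dataframe.AnyFrame
import org.jetbrains.kotlinx.dataframe.DataFrame
import org.jetbrains.kotlinx.dataframe.io.db.DuckDb // assumed package, see note above
import org.jetbrains.kotlinx.dataframe.io.readResultSet
import java.sql.DriverManager

// Hypothetical helper: runs a constant query and converts its ResultSet to a DataFrame.
fun readViaResultSet(): AnyFrame =
    // In-memory DuckDB database, as in the Iceberg sample from this commit
    DriverManager.getConnection("jdbc:duckdb:").use { connection ->
        connection.createStatement().use { statement ->
            statement.executeQuery("SELECT 42 AS answer, 'duck' AS label").use { resultSet ->
                // A bare ResultSet does not say which database produced it,
                // so the database type is passed explicitly
                DataFrame.readResultSet(resultSet, DuckDb)
            }
        }
    }
```

As noted earlier in this file, reading from the `ResultSet` could potentially change its state, so the sketch consumes it exactly once and closes it right away.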
323331

@@ -525,7 +533,7 @@ This function reads the schema from a `ResultSet` object provided by the user.
525533

526534
The `dbType: DbType` parameter specifies the type of our database (e.g., PostgreSQL, MySQL, etc.),
527535
supported by a library.
528-
Currently, the following classes are available: `H2, MariaDb, MySql, PostgreSql, Sqlite`.
536+
Currently, the following classes are available: `H2, MsSql, MariaDb, MySql, PostgreSql, Sqlite, DuckDb`.
529537

530538
Users can also pass objects describing their custom databases; see the [guide](readSqlFromCustomDatabase.md) for more information.
531539

docs/StardustDocs/topics/schemas/gradle/Gradle-Plugin.md

Lines changed: 1 addition & 1 deletion
@@ -184,7 +184,7 @@ dataframes {
184184
Find full example code [here](https://github.com/zaleslaw/KotlinDataFrame-SQL-Examples/blob/master/src/main/kotlin/Example_3_Import_schema_via_Gradle.kt).
185185

186186
**NOTE:** This is an experimental functionality and, for now,
187-
we only support four databases: MariaDB, MySQL, PostgreSQL, and SQLite.
187+
we only support these databases: MariaDB, MySQL, PostgreSQL, SQLite, MS SQL, and DuckDB.
188188

189189
Additionally, support for JSON and date-time types is limited.
190190
Please take this into consideration when using these functions.

tests/build.gradle.kts

Lines changed: 2 additions & 0 deletions
@@ -66,12 +66,14 @@ korro {
6666
include("docs/StardustDocs/topics/write.md")
6767
include("docs/StardustDocs/topics/rename.md")
6868
include("docs/StardustDocs/topics/guides/*.md")
69+
include("docs/StardustDocs/topics/dataSources/sql/*.md")
6970
}
7071

7172
samples = fileTree(project.projectDir) {
7273
include("src/test/kotlin/org/jetbrains/kotlinx/dataframe/samples/*.kt")
7374
include("src/test/kotlin/org/jetbrains/kotlinx/dataframe/samples/api/*.kt")
7475
include("src/test/kotlin/org/jetbrains/kotlinx/dataframe/samples/guides/*.kt")
76+
include("src/test/kotlin/org/jetbrains/kotlinx/dataframe/samples/io/*.kt")
7577
}
7678

7779
groupSamples {
Lines changed: 48 additions & 0 deletions
@@ -0,0 +1,48 @@
1+
package org.jetbrains.kotlinx.dataframe.samples.io
2+
3+
import org.jetbrains.kotlinx.dataframe.DataFrame
4+
import org.jetbrains.kotlinx.dataframe.io.DbConnectionConfig
5+
import org.jetbrains.kotlinx.dataframe.io.readSqlQuery
6+
import org.jetbrains.kotlinx.dataframe.io.readSqlTable
7+
import org.junit.Ignore
8+
import org.junit.Test
9+
import java.sql.DriverManager
10+
11+
class DuckDb {
12+
13+
@Ignore
14+
@Test
15+
fun readSqlTable() {
16+
// SampleStart
17+
val url = "jdbc:duckdb:/testDatabase"
18+
val username = "duckdb"
19+
val password = "password"
20+
21+
val dbConfig = DbConnectionConfig(url, username, password)
22+
23+
val tableName = "Customer"
24+
25+
val df = DataFrame.readSqlTable(dbConfig, tableName)
26+
// SampleEnd
27+
}
28+
29+
// source: https://duckdb.org/docs/stable/core_extensions/iceberg/overview.html
30+
@Ignore
31+
@Test
32+
fun readIcebergExtension() {
33+
// SampleStart
34+
// Creating an in-memory DuckDB database
35+
val connection = DriverManager.getConnection("jdbc:duckdb:")
36+
val df = connection.use { connection ->
37+
// install and load Iceberg
38+
connection.createStatement().execute("INSTALL iceberg; LOAD iceberg;")
39+
40+
// query a table from Iceberg using a specific SQL query
41+
DataFrame.readSqlQuery(
42+
connection = connection,
43+
sqlQuery = "SELECT * FROM iceberg_scan('data/iceberg/lineitem_iceberg', allow_moved_paths = true);",
44+
)
45+
}
46+
// SampleEnd
47+
}
48+
}
