Commit cd931a9

Merge pull request #74 from shiva-cirus/feature/mariadb
Mariadb Plugins
2 parents db1afd4 + c6fb1bd commit cd931a9

27 files changed: 2980 additions, 0 deletions

database-commons/src/test/java/io/cdap/plugin/db/batch/DatabasePluginTestBase.java

Lines changed: 10 additions & 0 deletions
@@ -57,6 +57,16 @@ public static Schema getSchemaWithInvalidTypeMapping(String columnName, Schema.T
     );
   }
 
+  protected static void assertRuntimeFailure(ApplicationId appId, ETLBatchConfig etlConfig,
+                                             ArtifactSummary datapipelineArtifact, String failureMessage, int runCount)
+    throws Exception {
+    AppRequest<ETLBatchConfig> appRequest = new AppRequest<>(datapipelineArtifact, etlConfig);
+    ApplicationManager appManager = deployApplication(appId, appRequest);
+    final WorkflowManager workflowManager = appManager.getWorkflowManager(SmartWorkflow.NAME);
+    workflowManager.start();
+    workflowManager.waitForRuns(ProgramRunStatus.FAILED, runCount, 3, TimeUnit.MINUTES);
+  }
+
   protected static void assertDeploymentFailure(ApplicationId appId, ETLBatchConfig etlConfig,
                                                 ArtifactSummary datapipelineArtifact, String failureMessage)
     throws Exception {

mariadb-plugin/docs/Mariadb-action.md

Lines changed: 69 additions & 0 deletions
@@ -0,0 +1,69 @@
# MariaDB Action


Description
-----------
Action that runs a MariaDB command.


Use Case
--------
The action can be used whenever you want to run a MariaDB command before or after a data pipeline.
For example, you may want to run a SQL UPDATE command on a database before the pipeline source pulls data from its tables.


Properties
----------
**Driver Name:** Name of the JDBC driver to use.

**Database Query:** Database query to execute.

**Host:** Host that MariaDB is running on.

**Port:** Port that MariaDB is running on.

**Database:** MariaDB database name.

**Username:** User identity for connecting to the specified database.

**Password:** Password to use to connect to the specified database.

**Connection Arguments:** A list of arbitrary string key/value pairs. These arguments
are passed to the JDBC driver as connection arguments, for drivers that need additional configuration.

**Auto Reconnect:** Whether the driver should try to re-establish stale or dead connections.

**Use SSL:** Turns on SSL encryption. The connection will fail if SSL is not available.

**Keystore URL:** URL to the client certificate KeyStore (if not specified, use defaults). Must be accessible at the
same location on the host where CDAP Master is running and on all hosts on which at least one HDFS, MapReduce, or YARN
daemon role is running.

**Keystore Password:** Password for the client certificate KeyStore.

**Truststore URL:** URL to the trusted root certificate KeyStore (if not specified, use defaults). Must be accessible at
the same location on the host where CDAP Master is running and on all hosts on which at least one HDFS, MapReduce, or
YARN daemon role is running.

**Truststore Password:** Password for the trusted root certificate KeyStore.

**Use Compression:** Use zlib compression when communicating with the server. Select this option for WAN
connections.

**Use ANSI Quotes:** Treats " as an identifier quote character and not as a string quote character.

**SQL_MODE:** Override the default SQL_MODE session variable used by the server.


Example
-------
To execute a query against a MariaDB database named "prod" that is running on "localhost",
port 3306, configure the plugin with:

```
Driver Name: "mariadb"
Database Query: "UPDATE table_name SET price = 20 WHERE ID = 6"
Host: "localhost"
Port: 3306
Database: "prod"
```
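
The Host, Port, Database, and Connection Arguments properties together identify a JDBC endpoint. As a rough illustration, and not the plugin's actual code (which this excerpt does not show), the sketch below assembles a MariaDB Connector/J style URL from those values. The class `MariadbUrlSketch`, the method `buildUrl`, and the sample argument are hypothetical names chosen for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical helper, not part of the plugin: builds a MariaDB JDBC URL from plugin-style properties. */
final class MariadbUrlSketch {

  private MariadbUrlSketch() { }

  /**
   * Returns jdbc:mariadb://host:port/database, followed by ?key=value&... when arguments are given.
   * Values are appended as-is; a real implementation would likely need to URL-encode them.
   */
  static String buildUrl(String host, int port, String database, Map<String, String> connectionArguments) {
    String base = "jdbc:mariadb://" + host + ":" + port + "/" + database;
    if (connectionArguments.isEmpty()) {
      return base;
    }
    StringJoiner query = new StringJoiner("&", "?", "");
    for (Map.Entry<String, String> argument : connectionArguments.entrySet()) {
      query.add(argument.getKey() + "=" + argument.getValue());
    }
    return base + query;
  }

  public static void main(String[] args) {
    // Sample connection argument, chosen only for illustration.
    Map<String, String> arguments = new LinkedHashMap<>();
    arguments.put("connectTimeout", "5000");
    System.out.println(buildUrl("localhost", 3306, "prod", arguments));
  }
}
```

A `LinkedHashMap` keeps the arguments in insertion order, which makes the resulting URL predictable when comparing configurations.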
Lines changed: 113 additions & 0 deletions
@@ -0,0 +1,113 @@
# MariaDB Batch Sink


Description
-----------
Writes records to a MariaDB table. Each record will be written to a row in the table.


Use Case
--------
This sink is used whenever you need to write to a MariaDB table.
Suppose you periodically build a recommendation model for products on your online store.
The model is stored in a FileSet and you want to export the contents
of the FileSet to a MariaDB table where it can be served to your users.

Column names are auto-detected from the input schema.

Properties
----------
**Reference Name:** Name used to uniquely identify this sink for lineage, annotating metadata, etc.

**Driver Name:** Name of the JDBC driver to use.

**Host:** Host that MariaDB is running on.

**Port:** Port that MariaDB is running on.

**Database:** MariaDB database name.

**Table Name:** Name of the table to export to.

**Username:** User identity for connecting to the specified database.

**Password:** Password to use to connect to the specified database.

**Connection Arguments:** A list of arbitrary string key/value pairs. These arguments
are passed to the JDBC driver as connection arguments, for drivers that need additional configuration.

**Auto Reconnect:** Whether the driver should try to re-establish stale or dead connections.

**Use SSL:** Turns on SSL encryption. The connection will fail if SSL is not available.

**Keystore URL:** URL to the client certificate KeyStore (if not specified, use defaults). Must be accessible at the
same location on the host where CDAP Master is running and on all hosts on which at least one HDFS, MapReduce, or YARN
daemon role is running.

**Keystore Password:** Password for the client certificate KeyStore.

**Truststore URL:** URL to the trusted root certificate KeyStore (if not specified, use defaults). Must be accessible at
the same location on the host where CDAP Master is running and on all hosts on which at least one HDFS, MapReduce, or
YARN daemon role is running.

**Truststore Password:** Password for the trusted root certificate KeyStore.

**Use Compression:** Use zlib compression when communicating with the server. Select this option for WAN
connections.

**SQL_MODE:** Override the default SQL_MODE session variable used by the server.


Data Types Mapping
----------
+--------------------------------+-----------------------+------------------------------------+
| MariaDB Data Type              | CDAP Schema Data Type | Comment                            |
+--------------------------------+-----------------------+------------------------------------+
| TINYINT                        | int                   |                                    |
| BOOLEAN, BOOL                  | boolean               |                                    |
| SMALLINT                       | int                   |                                    |
| MEDIUMINT                      | int                   |                                    |
| INT, INTEGER                   | int                   |                                    |
| BIGINT                         | long                  |                                    |
| DECIMAL, DEC, NUMERIC, FIXED   | decimal               |                                    |
| FLOAT                          | float                 |                                    |
| DOUBLE, DOUBLE PRECISION, REAL | decimal               |                                    |
| BIT                            | boolean               |                                    |
| CHAR                           | string                |                                    |
| VARCHAR                        | string                |                                    |
| BINARY                         | bytes                 |                                    |
| CHAR BYTE                      | bytes                 |                                    |
| VARBINARY                      | bytes                 |                                    |
| TINYBLOB                       | bytes                 |                                    |
| BLOB                           | bytes                 |                                    |
| MEDIUMBLOB                     | bytes                 |                                    |
| LONGBLOB                       | bytes                 |                                    |
| TINYTEXT                       | string                |                                    |
| TEXT                           | string                |                                    |
| MEDIUMTEXT                     | string                |                                    |
| LONGTEXT                       | string                |                                    |
| JSON                           | string                | Alias for LONGTEXT in MariaDB      |
| ENUM                           | string                | Mapped to string by default        |
| SET                            | string                |                                    |
| DATE                           | date                  |                                    |
| TIME                           | time_micros           |                                    |
| DATETIME                       | timestamp_micros      |                                    |
| TIMESTAMP                      | timestamp_micros      |                                    |
| YEAR                           | date                  |                                    |
+--------------------------------+-----------------------+------------------------------------+

Example
-------
To write output records to the "users" table of a MariaDB database named "prod" that is running on "localhost",
port 3306, as the "root" user with the "root" password, configure the plugin with:

```
Reference Name: "snk1"
Driver Name: "mariadb"
Host: "localhost"
Port: 3306
Database: "prod"
Table Name: "users"
Username: "root"
Password: "root"
```
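
The sink documentation says that column names are taken from the input schema. One common way a JDBC sink could use those names is to build a parameterized INSERT statement with one placeholder per field. The sketch below shows that idea under the assumption that field names map one-to-one to column names; `InsertStatementSketch` and `buildInsert` are hypothetical and are not taken from the plugin source.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical illustration, not plugin code: derives an INSERT statement from input field names. */
final class InsertStatementSketch {

  private InsertStatementSketch() { }

  /** Builds "INSERT INTO table (c1, c2) VALUES (?, ?)" with one placeholder per field. */
  static String buildInsert(String table, List<String> fieldNames) {
    if (fieldNames.isEmpty()) {
      throw new IllegalArgumentException("At least one field is required");
    }
    String columns = String.join(", ", fieldNames);
    String placeholders = String.join(", ", Collections.nCopies(fieldNames.size(), "?"));
    return "INSERT INTO " + table + " (" + columns + ") VALUES (" + placeholders + ")";
  }

  public static void main(String[] args) {
    // Field names as they might appear in the input schema of the "users" example.
    System.out.println(buildInsert("users", Arrays.asList("id", "name", "email")));
  }
}
```

Each record would then bind its field values to the placeholders in the same order as the field list.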
Lines changed: 145 additions & 0 deletions
@@ -0,0 +1,145 @@
# MariaDB Batch Source


Description
-----------
Reads from a MariaDB instance using a configurable SQL query.
Outputs one record for each row returned by the query.


Use Case
--------
The source is used whenever you need to read from a MariaDB instance. For example, you may want
to create daily snapshots of a database table by using this source and writing to
a TimePartitionedFileSet.


Properties
----------
**Reference Name:** Name used to uniquely identify this source for lineage, annotating metadata, etc.

**Driver Name:** Name of the JDBC driver to use.

**Host:** Host that MariaDB is running on.

**Port:** Port that MariaDB is running on.

**Database:** MariaDB database name.

**Import Query:** The SELECT query to use to import data from the specified table.
You can specify an arbitrary number of columns to import, or import all columns using \*. The query should
contain the '$CONDITIONS' string. For example, 'SELECT * FROM table WHERE $CONDITIONS'.
The '$CONDITIONS' string will be replaced by 'splitBy' field limits specified by the bounding query.
The '$CONDITIONS' string is not required if numSplits is set to one.

**Bounding Query:** Query that returns the minimum and maximum values of the 'splitBy' field.
For example, 'SELECT MIN(id),MAX(id) FROM table'. Not required if numSplits is set to one.

**Split-By Field Name:** Name of the field used to generate splits. Not required if numSplits is set to one.

**Number of Splits to Generate:** Number of splits to generate.

**Username:** User identity for connecting to the specified database.

**Password:** Password to use to connect to the specified database.

**Connection Arguments:** A list of arbitrary string key/value pairs. These arguments
are passed to the JDBC driver as connection arguments, for drivers that need additional configuration.

**Auto Reconnect:** Whether the driver should try to re-establish stale or dead connections.

**Schema:** The schema of records output by the source. This will be used in place of whatever schema comes
back from the query. However, it must match the schema that comes back from the query,
except it can mark fields as nullable and can contain a subset of the fields.

**Use SSL:** Turns on SSL encryption. The connection will fail if SSL is not available.

**Keystore URL:** URL to the client certificate KeyStore (if not specified, use defaults). Must be accessible at the
same location on the host where CDAP Master is running and on all hosts on which at least one HDFS, MapReduce, or YARN
daemon role is running.

**Keystore Password:** Password for the client certificate KeyStore.

**Truststore URL:** URL to the trusted root certificate KeyStore (if not specified, use defaults). Must be accessible at
the same location on the host where CDAP Master is running and on all hosts on which at least one HDFS, MapReduce, or
YARN daemon role is running.

**Truststore Password:** Password for the trusted root certificate KeyStore.

**Use Compression:** Use zlib compression when communicating with the server. Select this option for WAN
connections.

**Use ANSI Quotes:** Treats " as an identifier quote character and not as a string quote character.

**SQL_MODE:** Override the default SQL_MODE session variable used by the server.
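
To make the interaction between Import Query, Bounding Query, Split-By Field Name, and Number of Splits more concrete, the sketch below shows one plausible way the '$CONDITIONS' placeholder could be expanded into per-split range predicates. It is a simplified illustration for an integer split field only; the plugin's real split logic is not shown in this excerpt, and the class `SplitQuerySketch`, the method `splitQueries`, and the exact predicate shape are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration, not plugin code: expands $CONDITIONS into one range predicate per split. */
final class SplitQuerySketch {

  private SplitQuerySketch() { }

  /**
   * min and max stand for the values a bounding query such as
   * 'SELECT MIN(id),MAX(id) FROM table' would return.
   * May produce fewer queries than numSplits when the range is small.
   * No overflow guard: assumes max - min is well below Long.MAX_VALUE.
   */
  static List<String> splitQueries(String importQuery, String splitBy, long min, long max, int numSplits) {
    if (numSplits < 1 || min > max) {
      throw new IllegalArgumentException("numSplits must be >= 1 and min must be <= max");
    }
    long valueCount = max - min + 1;
    // Ceiling division so that the splits cover the whole range.
    long step = Math.max(1, (valueCount + numSplits - 1) / numSplits);
    List<String> queries = new ArrayList<>();
    for (long lower = min; lower <= max; lower += step) {
      long upper = Math.min(max, lower + step - 1);
      String condition = splitBy + " >= " + lower + " AND " + splitBy + " <= " + upper;
      // String.replace is a literal replacement, so the '$' needs no escaping.
      queries.add(importQuery.replace("$CONDITIONS", condition));
    }
    return queries;
  }

  public static void main(String[] args) {
    for (String query : splitQueries("SELECT * FROM table WHERE $CONDITIONS", "id", 1, 100, 4)) {
      System.out.println(query);
    }
  }
}
```

With a single split the whole range ends up in one predicate, which is consistent with the note that '$CONDITIONS', the bounding query, and the split field are not required when numSplits is one.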


Data Types Mapping
----------

+--------------------------------+-----------------------+------------------------------------+
| MariaDB Data Type              | CDAP Schema Data Type | Comment                            |
+--------------------------------+-----------------------+------------------------------------+
| TINYINT                        | int                   |                                    |
| BOOLEAN, BOOL                  | boolean               |                                    |
| SMALLINT                       | int                   |                                    |
| MEDIUMINT                      | int                   |                                    |
| INT, INTEGER                   | int                   |                                    |
| BIGINT                         | long                  |                                    |
| DECIMAL, DEC, NUMERIC, FIXED   | decimal               |                                    |
| FLOAT                          | float                 |                                    |
| DOUBLE, DOUBLE PRECISION, REAL | decimal               |                                    |
| BIT                            | boolean               |                                    |
| CHAR                           | string                |                                    |
| VARCHAR                        | string                |                                    |
| BINARY                         | bytes                 |                                    |
| CHAR BYTE                      | bytes                 |                                    |
| VARBINARY                      | bytes                 |                                    |
| TINYBLOB                       | bytes                 |                                    |
| BLOB                           | bytes                 |                                    |
| MEDIUMBLOB                     | bytes                 |                                    |
| LONGBLOB                       | bytes                 |                                    |
| TINYTEXT                       | string                |                                    |
| TEXT                           | string                |                                    |
| MEDIUMTEXT                     | string                |                                    |
| LONGTEXT                       | string                |                                    |
| JSON                           | string                | Alias for LONGTEXT in MariaDB      |
| ENUM                           | string                | Mapped to string by default        |
| SET                            | string                |                                    |
| DATE                           | date                  |                                    |
| TIME                           | time_micros           |                                    |
| DATETIME                       | timestamp_micros      |                                    |
| TIMESTAMP                      | timestamp_micros      |                                    |
| YEAR                           | date                  |                                    |
+--------------------------------+-----------------------+------------------------------------+


Example
------
To read data from a MariaDB database named "prod" that is running on "localhost", port 3306,
as the "root" user with the "root" password, configure the plugin with:


```
Reference Name: "src1"
Driver Name: "mariadb"
Host: "localhost"
Port: 3306
Database: "prod"
Import Query: "select id, name, email, phone from users;"
Number of Splits to Generate: 1
Username: "root"
Password: "root"
```

For example, if the 'id' column is a primary key of type int and the other columns are
non-nullable varchars, output records will have this schema:

+----------------+---------------------+
| Field Name     | Type                |
+----------------+---------------------+
| id             | int                 |
| name           | string              |
| email          | string              |
| phone          | string              |
+----------------+---------------------+
