docs/integrations/data-ingestion/clickpipes/mysql/faq.md (4 additions, 0 deletions)

You have several options to resolve these issues:

3. **Configure server certificate** - Update your server's SSL certificate to include all connection hostnames and use a trusted Certificate Authority.

4. **Skip certificate verification** - Use this option for self-hosted MySQL or MariaDB, whose default configurations provision a self-signed certificate that ClickPipes can't validate ([MySQL](https://dev.mysql.com/doc/refman/8.4/en/creating-ssl-rsa-files-using-mysql.html#creating-ssl-rsa-files-using-mysql-automatic), [MariaDB](https://mariadb.com/kb/en/securing-connections-for-client-and-server/#enabling-tls-for-mariadb-server)). Relying on this certificate encrypts data in transit but runs the risk of server impersonation. We recommend properly signed certificates for production environments; this option is useful for testing on a one-off instance or for connecting to legacy infrastructure.

### Do you support schema changes? {#do-you-support-schema-changes}
Please refer to the [ClickPipes for MySQL: Schema Changes Propagation Support](./schema-changes) page for more information.

docs/integrations/data-ingestion/clickpipes/mysql/index.md (16 additions, 19 deletions)

---
sidebar_label: 'Ingesting Data from MySQL to ClickHouse'
description: 'Describes how to seamlessly connect your MySQL to ClickHouse Cloud.'
slug: /integrations/clickpipes/mysql
title: 'Ingesting data from MySQL to ClickHouse (using CDC)'
---

import BetaBadge from '@theme/badges/BetaBadge';
import ch_permissions from '@site/static/images/integrations/data-ingestion/clickpipes/postgres/ch-permissions.jpg'
import Image from '@theme/IdealImage';

# Ingesting data from MySQL to ClickHouse (using CDC)

<BetaBadge/>

You can use ClickPipes to ingest data from your source MySQL database into ClickHouse Cloud. The source MySQL database can be hosted on-premises or in the cloud using services like Amazon RDS, Google Cloud SQL, and others.

## Prerequisites {#prerequisites}

To get started, you first need to ensure that your MySQL database is correctly configured for binlog replication. The configuration steps depend on how you're deploying MySQL, so please follow the relevant guide below:

1. [Amazon RDS MySQL](./mysql/source/rds)

Once your source MySQL database is set up, you can continue creating your ClickPipe.
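
Whichever guide you follow, you can run a quick sanity check from a SQL client before creating the ClickPipe. This is only a sketch of the basics; the per-deployment guides above remain the authoritative list of required settings:

```sql
-- Binary logging must be enabled for CDC
SHOW VARIABLES LIKE 'log_bin';

-- Row-based logging is what CDC pipelines typically require
SELECT @@binlog_format;
```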

## Create your ClickPipe {#create-your-clickpipe}

Make sure you are logged in to your ClickHouse Cloud account. If you don't have an account yet, you can sign up [here](https://cloud.clickhouse.com/).

### Add your source MySQL database connection {#add-your-source-mysql-database-connection}

4. Fill in the connection details for your source MySQL database, which you configured in the prerequisites step.

:::info
Before you start adding your connection details, make sure that you have whitelisted the ClickPipes IP addresses in your firewall rules. You can find the [list of ClickPipes IP addresses](../index.md#list-of-static-ips) on the linked page.
For more information, refer to the source MySQL setup guides linked at [the top of this page](#prerequisites).
:::

<Image img={mysql_connection_details} alt="Fill in connection details" size="lg" border/>

#### (Optional) Set up SSH tunneling {#optional-set-up-ssh-tunneling}

You can specify SSH tunneling details if your source MySQL database is not publicly accessible.

4. Click on "Verify Connection" to verify the connection.

:::note
Make sure to whitelist [ClickPipes IP addresses](../clickpipes#list-of-static-ips) in your firewall rules for the SSH bastion host so that ClickPipes can establish the SSH tunnel.
:::

Once the connection details are filled in, click `Next`.

You can configure the advanced settings if needed. A brief description of each setting is provided below:

- **Snapshot number of tables in parallel**: The number of tables fetched in parallel during the initial snapshot. This is useful when you have a large number of tables and want to control how many are fetched at once.

### Configure the tables {#configure-the-tables}

5. Here you can select the destination database for your ClickPipe. You can either select an existing database or create a new one.

Finally, please refer to the ["ClickPipes for MySQL FAQ"](/integrations/clickpipes/mysql/faq) page for more information about common issues and how to resolve them.

## What's next? {#whats-next}

[//]: # "TODO Write a MySQL-specific migration guide and best practices similar to the existing one for PostgreSQL. The current migration guide points to the MySQL table engine, which is not ideal."

Once you've set up your ClickPipe to replicate data from MySQL to ClickHouse Cloud, you can focus on how to query and model your data for optimal performance. For common questions around MySQL CDC and troubleshooting, see the [MySQL FAQs page](/integrations/data-ingestion/clickpipes/mysql/faq.md).

docs/integrations/data-ingestion/clickpipes/mysql/schema-changes.md

description: 'Page describing schema change types detectable by ClickPipes in the source tables'
---

ClickPipes for MySQL can detect schema changes in the source tables and, in some cases, automatically propagate the changes to the destination tables. The way each DDL operation is handled is documented below:

[//]: # "TODO Extend this page with behavior on rename, data type changes, and truncate + guidance on how to handle incompatible schema changes."

| Adding a new column (`ALTER TABLE ADD COLUMN ...`) | Propagated automatically. The new column(s) will be populated for all rows replicated after the schema change |
| Adding a new column with a default value (`ALTER TABLE ADD COLUMN ... DEFAULT ...`) | Propagated automatically. The new column(s) will be populated for all rows replicated after the schema change, but existing rows will not show the default value without a full table refresh |
| Dropping an existing column (`ALTER TABLE DROP COLUMN ...`) | Detected, but **not** propagated. The dropped column(s) will be populated with `NULL` for all rows replicated after the schema change |
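
To make the behaviors above concrete, here is what those DDL operations look like in MySQL; the table and column names are purely illustrative:

```sql
-- Propagated automatically; the new column is populated for rows replicated after the change
ALTER TABLE orders ADD COLUMN delivery_notes TEXT;

-- Propagated automatically, but existing rows will not show the default without a full table refresh
ALTER TABLE orders ADD COLUMN status VARCHAR(20) DEFAULT 'pending';

-- Detected but not propagated; the column is populated with NULL for rows replicated after the change
ALTER TABLE orders DROP COLUMN legacy_flag;
```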

docs/integrations/data-ingestion/clickpipes/mysql/source/aurora.md (48 additions, 36 deletions)

import Image from '@theme/IdealImage';

# Aurora MySQL source setup guide

This step-by-step guide shows you how to configure Amazon Aurora MySQL to replicate data into ClickHouse Cloud using the [MySQL ClickPipe](../index.md). For common questions around MySQL CDC, see the [MySQL FAQs page](/integrations/data-ingestion/clickpipes/mysql/faq.md).

The binary log is a set of log files that contain information about data modifications made to a MySQL server instance, and binary log files are required for replication. To configure binary log retention in Aurora MySQL, you must [enable binary logging](#enable-binlog-logging) and [increase the binlog retention interval](#binlog-retention-interval).

### 1. Enable binary logging via automated backup {#enable-binlog-logging}

The automated backups feature determines whether binary logging is turned on or off for MySQL. Automated backups can be configured for your instance in the RDS Console by navigating to **Modify** > **Additional configuration** > **Backup** and selecting the **Enable automated backups** checkbox (if not selected already).

<Image img={rds_backups} alt="Enabling automated backups in Aurora" size="lg" border/>

We recommend setting the **Backup retention period** to a reasonably long value, depending on the replication use case.

### 2. Increase the binlog retention interval {#binlog-retention-interval}

:::warning
If ClickPipes tries to resume replication and the required binlog files have been purged due to the configured binlog retention value, the ClickPipe will enter an errored state and a resync is required.
:::
By default, Aurora MySQL purges the binary log as soon as possible (i.e., _lazy purging_). We recommend increasing the binlog retention interval to at least **72 hours** to ensure the availability of binary log files for replication under failure scenarios. To set an interval for binary log retention ([`binlog retention hours`](https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/mysql-stored-proc-configuring.html#mysql_rds_set_configuration-usage-notes.binlog-retention-hours)), use the [`mysql.rds_set_configuration`](https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/mysql-stored-proc-configuring.html#mysql_rds_set_configuration) procedure.
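
For example, the following call sets retention to 72 hours, matching the minimum recommended above, and the companion procedure lets you confirm the value currently in effect:

```sql
-- Keep binary logs for at least 72 hours (3 days)
CALL mysql.rds_set_configuration('binlog retention hours', 72);

-- Verify the configured retention value
CALL mysql.rds_show_configuration;
```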

[//]: # "NOTE Most CDC providers recommend the maximum retention period for Aurora RDS (7 days/168 hours). Since this has an impact on disk usage, we conservatively recommend a minimum of 3 days/72 hours."

If this configuration isn't set or is set to a low interval, it can lead to gaps in the binary logs, compromising ClickPipes' ability to resume replication.

## Configure binlog settings {#binlog-settings}

To find the parameter group, click on your MySQL instance in the RDS Console and navigate to the **Configuration** tab.

:::tip
If you have a MySQL cluster, the parameters below can be found in the [DB cluster](https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/USER_WorkingWithParamGroups.CreatingCluster.html) parameter group instead of the DB instance group.
:::

<Image img={aurora_config} alt="Where to find parameter group in Aurora" size="lg" border/>

<br/>

Click the parameter group link, which will take you to its dedicated page. You should see an **Edit** button in the top right.

Then, click on **Save Changes** in the top-right corner. You may need to reboot your instance for the changes to take effect; if so, you will see `Pending reboot` next to the parameter group link in the **Configuration** tab of the Aurora instance.
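
After the reboot, you can optionally confirm the binlog-related values from a SQL client. Row-based logging is the usual requirement for CDC tools, so treat this as a sanity check rather than the authoritative list of required parameters:

```sql
-- ROW format (with a FULL row image) is what CDC replication typically expects
SELECT @@binlog_format, @@binlog_row_image;
```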

## Enable GTID mode (recommended) {#gtid-mode}

:::tip
The MySQL ClickPipe also supports replication without GTID mode. However, enabling GTID mode is recommended for better performance and easier troubleshooting.
:::

[Global Transaction Identifiers (GTIDs)](https://dev.mysql.com/doc/refman/8.0/en/replication-gtids.html) are unique IDs assigned to each committed transaction in MySQL. They simplify binlog replication and make troubleshooting more straightforward. We **recommend** enabling GTID mode so that the MySQL ClickPipe can use GTID-based replication.

GTID-based replication is supported for Amazon Aurora MySQL v2 (MySQL 5.7) and v3 (MySQL 8.0), as well as Aurora Serverless v2. To enable GTID mode for your Aurora MySQL instance, follow these steps:

1. In the RDS Console, click on your MySQL instance.
2. Click on the **Configuration** tab.
3. Click on the parameter group link.
4. Click on the **Edit** button in the top-right corner.
5. Set `enforce_gtid_consistency` to `ON`.
6. Set `gtid-mode` to `ON`.
7. Click on **Save Changes** in the top-right corner.
8. Reboot your instance for the changes to take effect.
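
Once the instance is back up, a quick check from a SQL client confirms that GTID-based replication is in effect; both values should report `ON`:

```sql
-- Expect ON / ON once GTID mode is fully enabled
SELECT @@gtid_mode, @@enforce_gtid_consistency;
```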

## Configure a database user {#configure-database-user}

Connect to your Aurora MySQL instance as an admin user and execute the following commands:
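
The following is a purely illustrative sketch, not the exact statements this guide prescribes: a replication user for a CDC pipeline typically needs replication privileges plus read access to the replicated schemas, and the user name, host pattern, password, and schema below are placeholders:

```sql
-- Placeholder credentials: substitute your own user name, host pattern, and password
CREATE USER 'clickpipes_user'@'%' IDENTIFIED BY 'some_secure_password';

-- Replication privileges are global, so they are granted ON *.*
GRANT REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'clickpipes_user'@'%';

-- Read access to the schema(s) being replicated (placeholder schema name)
GRANT SELECT ON mydb.* TO 'clickpipes_user'@'%';
```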

### IP-based access control {#ip-based-access-control}

To restrict traffic to your Aurora MySQL instance, add the [documented static NAT IPs](../../index.md#list-of-static-ips) to the **Inbound rules** of your Aurora security group.

<Image img={security_group_in_rds_mysql} alt="Where to find security group in Aurora MySQL?" size="lg" border/>

<Image img={edit_inbound_rules} alt="Edit inbound rules for the above security group" size="lg" border/>

### Private access via AWS PrivateLink {#private-access-via-aws-privatelink}

To connect to your Aurora MySQL instance through a private network, you can use AWS PrivateLink. Follow the [AWS PrivateLink setup guide for ClickPipes](/knowledgebase/aws-privatelink-setup-for-clickpipes) to set up the connection.

## What's next? {#whats-next}

Now that your Amazon Aurora MySQL instance is configured for binlog replication and can connect securely to ClickHouse Cloud, you can [create your first MySQL ClickPipe](/integrations/clickpipes/mysql/#create-your-clickpipe). For common questions around MySQL CDC, see the [MySQL FAQs page](/integrations/data-ingestion/clickpipes/mysql/faq.md).