Commit 2ab93fc

add explicit anchor tags
1 parent 06891b0 commit 2ab93fc

1 file changed

+7
-7
lines changed

docs/integrations/data-ingestion/clickpipes/mysql/parallel_initial_load.md

@@ -14,12 +14,12 @@ This document explains parallelized snapshot/initial load in the MySQL ClickPipe
1414
Please reach out to us via a support ticket to enable this feature for your ClickHouse organization.
1515
:::
1616

17-
## Overview
17+
## Overview {#overview}
1818

1919
Initial load is the first phase of a CDC ClickPipe, where the ClickPipe syncs the historical data of the tables in the source database over to ClickHouse before starting CDC. Often, developers do this in a single-threaded manner.
2020
However, the MySQL ClickPipe can parallelize this process, which can significantly speed up the initial load.
2121

22-
### Partition key column
22+
### Partition key column {#partition-key-column}
2323

2424
Once we've enabled the feature flag, you should see the below setting in the ClickPipe table picker (both during creation and editing of a ClickPipe):
2525
<img src={partition_key} alt="Partition key column" />
@@ -36,18 +36,18 @@ Let's talk about the below settings:
3636

3737
<img src={snapshot_params} alt="Snapshot parameters" />
3838

39-
#### Snapshot number of rows per partition
39+
#### Snapshot number of rows per partition {#snapshot-number-of-rows-per-partition}
4040
This setting controls how many rows constitute a partition. The ClickPipe will read the source table in chunks of this size, and each chunk will be processed in parallel. The default value is 100,000 rows per partition.
4141

42-
#### Initial load parallelism
42+
#### Initial load parallelism {#initial-load-parallelism}
4343
This setting controls how many partitions will be processed in parallel. The default value is 4, which means that the ClickPipe will read 4 partitions of the source table in parallel. This can be increased to speed up the initial load, but it is recommended to keep it to a reasonable value depending on your source instance specs to avoid overwhelming the source database. The ClickPipe will automatically adjust the number of partitions based on the size of the source table and the number of rows per partition.
4444

45-
#### Snapshot number of tables in parallel
45+
#### Snapshot number of tables in parallel {#snapshot-number-of-tables-parallel}
4646
Although not strictly part of parallel snapshotting, this setting controls how many tables are processed in parallel during the initial load. The default value is 1. Note that this multiplies with the partition parallelism: with an initial load parallelism of 4 and 2 tables in parallel, the ClickPipe will read up to 8 partitions in parallel.
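The way the three settings above interact can be illustrated with a small arithmetic sketch. This is not ClickPipes code: the function names `partition_count` and `max_concurrent_partition_reads` are hypothetical, and the sketch assumes partitions are simply the row count divided by the rows-per-partition setting, rounded up, which the text implies but does not spell out.

```python
# Illustrative arithmetic only; these helpers are hypothetical and are not
# part of ClickPipes. Defaults mirror the values stated in the docs above.

def partition_count(total_rows: int, rows_per_partition: int = 100_000) -> int:
    """Number of chunks a table is split into (assumed: ceiling division)."""
    if total_rows < 0:
        raise ValueError("total_rows must be non-negative")
    if rows_per_partition <= 0:
        raise ValueError("rows_per_partition must be positive")
    # Integer ceiling division avoids floating-point rounding on large tables.
    return -(-total_rows // rows_per_partition)


def max_concurrent_partition_reads(parallelism: int = 4,
                                   tables_in_parallel: int = 1) -> int:
    """Upper bound on partitions being read from the source at once."""
    if parallelism <= 0 or tables_in_parallel <= 0:
        raise ValueError("both settings must be positive")
    return parallelism * tables_in_parallel


if __name__ == "__main__":
    # A 1,000,001-row table with the default partition size.
    print(partition_count(1_000_001))
    # Parallelism 4 with 2 tables in parallel, as in the example above.
    print(max_concurrent_partition_reads(4, 2))
```

When sizing these settings, the second number is the one to compare against what the source instance can comfortably serve, since each concurrent partition read is a separate connection to MySQL.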
4747

48-
### Monitoring parallel snapshot in MySQL
48+
### Monitoring parallel snapshots in MySQL {#monitoring-parallel-snapshot-mysql}
4949
You can run **SHOW PROCESSLIST** in MySQL to see the parallel snapshot in action. The ClickPipe will create multiple connections to the source database, each reading a different partition of the source table. If you see **SELECT** queries with different ranges, it means that the ClickPipe is reading the source tables in parallel. You can also see the COUNT(*) and the partitioning query there.
5050

51-
### Limitations
51+
### Limitations {#limitations}
5252
- The snapshot parameters cannot be edited after pipe creation. If you want to change them, you will have to create a new ClickPipe.
5353
- When adding tables to an existing ClickPipe, you cannot change the snapshot parameters. The ClickPipe will use the existing parameters for the new tables.
