Commit 0677941

feedback
1 parent fccb102 commit 0677941

3 files changed, +4 -3 lines changed

docs/integrations/data-ingestion/clickpipes/mysql/controlling_sync.md

Lines changed: 1 addition & 0 deletions
@@ -28,6 +28,7 @@ Sync interval can be set to any positive integer value, but it is recommended to
 The pull batch size is the number of records that the ClickPipe will pull from the source database in one batch. Records mean inserts, updates and deletes done on the tables that are part of the pipe.

 The default is **100,000** records.
+A safe maximum is 10 million.

 ### An exception: Long-running transactions on source
 When a transaction is run on the source database, the ClickPipe waits until it receives the COMMIT of the transaction before it moves forward. This **overrides** both the sync interval and the pull batch size.
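
To make the interaction between these settings concrete, here is a toy model of the decision described in the hunk above. It is only a sketch: the function name, the 60-second sync interval, and the rule that a batch closes as soon as *either* threshold is reached are assumptions made for illustration, not ClickPipes internals. The only behavior taken from the text is the 100,000-record default and the fact that an open transaction overrides both settings.

```python
def should_close_batch(
    records_pulled: int,
    seconds_elapsed: float,
    in_open_transaction: bool,
    pull_batch_size: int = 100_000,   # default stated in the docs
    sync_interval_s: float = 60.0,    # hypothetical value, for illustration only
) -> bool:
    """Toy model: decide whether the current batch may be closed."""
    # An uncommitted source transaction overrides both settings:
    # the pipe keeps waiting until it has seen the COMMIT.
    if in_open_transaction:
        return False
    # Assumption: otherwise the batch closes when either limit is reached.
    return records_pulled >= pull_batch_size or seconds_elapsed >= sync_interval_s
```

Under this model, a long-running transaction can make a batch both larger and later than the configured limits suggest, which is why the docs call it out as an exception.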

docs/integrations/data-ingestion/clickpipes/mysql/parallel_initial_load.md

Lines changed: 2 additions & 2 deletions
@@ -27,7 +27,7 @@ Once we've enabled the feature flag, you should see the below setting in the Cli
 The MySQL ClickPipe uses a column on your source table to logically partition the source tables. This column is called the **partition key column**. It is used to divide the source table into partitions, which can then be processed in parallel by the ClickPipe.

 :::warning
-The partition key column must be indexed in the source table to see a good performance boost.
+The partition key column must be indexed in the source table to see a good performance boost. This can be seen by running `SHOW INDEX FROM <table_name>` in MySQL.
 :::

 ### Logical partitioning
@@ -37,7 +37,7 @@ Let's talk about the below settings:
 <img src={snapshot_params} alt="Snapshot parameters" />

 #### Snapshot number of rows per partition
-This setting controls how many rows constitute a partition. The ClickPipe will read the source table in chunks of this size, and each chunk will be processed in parallel. The default value is 100,000 rows per partition.
+This setting controls how many rows constitute a partition. The ClickPipe will read the source table in chunks of this size, and chunks will be processed in parallel based on the initial load parallelism set. The default value is 100,000 rows per partition.

 #### Initial load parallelism
 This setting controls how many partitions will be processed in parallel. The default value is 4, which means that the ClickPipe will read 4 partitions of the source table in parallel. This can be increased to speed up the initial load, but it is recommended to keep it to a reasonable value depending on your source instance specs to avoid overwhelming the source database. The ClickPipe will automatically adjust the number of partitions based on the size of the source table and the number of rows per partition.
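
The reworded sentence is easier to follow with numbers. The sketch below estimates how the two settings combine, using the defaults quoted in the hunk (100,000 rows per partition, parallelism of 4). The function name is hypothetical, and the "waves" figure assumes partitions are processed in lock-step groups, which is a simplification; the real scheduler may start a new partition as soon as any worker is free.

```python
import math


def plan_initial_load(
    total_rows: int,
    rows_per_partition: int = 100_000,  # default from the docs
    parallelism: int = 4,               # default from the docs
) -> tuple[int, int]:
    """Return (number of partitions, number of parallel waves) for a table."""
    if total_rows < 0 or rows_per_partition <= 0 or parallelism <= 0:
        raise ValueError("total_rows must be >= 0; other arguments must be > 0")
    # Each partition holds up to rows_per_partition rows; the last may be smaller.
    partitions = math.ceil(total_rows / rows_per_partition)
    # Simplifying assumption: at most `parallelism` partitions run at a time.
    waves = math.ceil(partitions / parallelism)
    return partitions, waves


partitions, waves = plan_initial_load(1_050_000)
print(partitions, waves)  # 11 partitions, processed in 3 waves of up to 4
```

Raising the parallelism shortens the load only while the source database can serve that many concurrent range reads, which is the trade-off the "Initial load parallelism" paragraph warns about.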

docs/integrations/data-ingestion/clickpipes/postgres/parallel_initial_load.md

Lines changed: 1 addition & 1 deletion
@@ -25,7 +25,7 @@ Let's talk about the below settings:
 <img src={snapshot_params} alt="Snapshot parameters" />

 #### Snapshot number of rows per partition
-This setting controls how many rows constitute a partition. The ClickPipe will read the source table in chunks of this size, and each chunk will be processed in parallel. The default value is 100,000 rows per partition.
+This setting controls how many rows constitute a partition. The ClickPipe will read the source table in chunks of this size, and chunks will be processed in parallel based on the initial load parallelism set. The default value is 100,000 rows per partition.

 #### Initial load parallelism
 This setting controls how many partitions will be processed in parallel. The default value is 4, which means that the ClickPipe will read 4 partitions of the source table in parallel. This can be increased to speed up the initial load, but it is recommended to keep it to a reasonable value depending on your source instance specs to avoid overwhelming the source database. The ClickPipe will automatically adjust the number of partitions based on the size of the source table and the number of rows per partition.
