Member

@WenyXu WenyXu commented Nov 11, 2025

I hereby agree to the terms of the GreptimeDB CLA.

Refer to a related PR or issue link (optional)

What's changed and what's your intention?

This PR introduces parallelism improvements for both COPY DATABASE operations and CLI export functionality to enhance performance when handling multiple tables.

Changes

1. Support parallel table operations in COPY DATABASE

  • Added a parallelism option to COPY DATABASE operations (defaults to the total number of CPU cores)
  • Tables are now processed in parallel instead of sequentially
  • Enhanced logging to show progress with table count (e.g., "Copy table(1/10): ...")
  • Added common-stat dependency to get system CPU core count
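The parallel table processing described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual implementation: the names effective_parallelism and copy_tables are hypothetical, the worker-thread pool built on std::thread::scope is an assumed stand-in for whatever concurrency mechanism the PR uses, and std::thread::available_parallelism is substituted for the common-stat dependency.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Mutex;
use std::thread;

/// Hypothetical helper: use the requested parallelism if it is positive,
/// otherwise fall back to the number of available CPU cores.
/// (Assumption: the PR reads the core count via common-stat instead.)
fn effective_parallelism(requested: Option<usize>) -> usize {
    requested.filter(|&n| n > 0).unwrap_or_else(|| {
        thread::available_parallelism()
            .map(|n| n.get())
            .unwrap_or(1)
    })
}

/// Hypothetical helper: runs `copy_one` for every table using at most
/// `parallelism` worker threads and returns the progress lines that were
/// recorded. Line order depends on thread scheduling.
fn copy_tables<F>(tables: &[String], parallelism: usize, copy_one: F) -> Vec<String>
where
    F: Fn(&str) + Sync,
{
    let total = tables.len();
    // Shared cursor: each worker claims the next unprocessed table index.
    let next = AtomicUsize::new(0);
    let log = Mutex::new(Vec::with_capacity(total));
    // Never spawn more workers than there are tables, and always at least one.
    let workers = parallelism.max(1).min(total.max(1));

    thread::scope(|scope| {
        for _ in 0..workers {
            scope.spawn(|| loop {
                let idx = next.fetch_add(1, Ordering::Relaxed);
                if idx >= total {
                    break;
                }
                let table = &tables[idx];
                // Progress line in the "Copy table(i/n): ..." shape from the PR.
                log.lock()
                    .unwrap()
                    .push(format!("Copy table({}/{}): {}", idx + 1, total, table));
                copy_one(table);
            });
        }
    });

    log.into_inner().unwrap()
}
```

With this shape, a call such as copy_tables(&tables, effective_parallelism(None), |t| { /* copy t */ }) processes every table exactly once while keeping at most one in-flight table per worker.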

2. Add parallelism parameter to CLI export

  • Added a new --parallelism flag to the export command (default: 8)
  • Renamed internal parallelism field to export_jobs for clarity
  • The parallelism parameter is now passed to SQL queries via WITH clause
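As a rough sketch of how a parallelism value might be forwarded into the SQL statement through the WITH clause: the function name build_copy_database_sql and its parameters are hypothetical and are not taken from the CLI's source. The statement shape follows the COPY DATABASE example in this PR description.

```rust
/// Hypothetical helper: renders a COPY DATABASE statement that carries the
/// parallelism setting in its WITH clause. Single quotes in `location` are
/// doubled so the string literal stays well-formed; the database name and
/// format are assumed to be trusted identifiers.
fn build_copy_database_sql(
    database: &str,
    location: &str,
    format: &str,
    parallelism: usize,
) -> String {
    let location = location.replace('\'', "''");
    format!(
        "COPY DATABASE {} TO '{}' WITH (format='{}', parallelism={});",
        database, location, format, parallelism
    )
}
```

Calling build_copy_database_sql("my_db", "file:///path/to/backup/", "parquet", 16) yields the same statement as the COPY DATABASE example in this description.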

Example Usage

COPY DATABASE with custom parallelism

COPY DATABASE my_db TO 'file:///path/to/backup/' WITH (format='parquet', parallelism=16);

CLI export with both parameters

greptime cli export --database my_db --export-jobs 4 --parallelism 16

PR Checklist

Please convert it to a draft if some of the following conditions are not met.

  • I have written the necessary rustdoc comments.
  • I have added the necessary unit tests and integration tests.
  • This PR requires documentation updates.
  • API changes are backward compatible.
  • Schema or data changes are backward compatible.

@github-actions github-actions bot added size/S, docs-not-required, docs-required and removed docs-not-required labels Nov 11, 2025
@github-actions github-actions bot added size/M and removed size/S labels Nov 13, 2025
@WenyXu WenyXu marked this pull request as ready for review November 13, 2025 03:39
@WenyXu WenyXu requested a review from a team as a code owner November 13, 2025 03:39
Signed-off-by: WenyXu <[email protected]>
@WenyXu WenyXu changed the title from "feat: support parallel table operations in COPY DATABASE and export" to "feat: support parallel table operations in COPY DATABASE" Nov 13, 2025
Contributor

@killme2008 killme2008 left a comment

LGTM

@WenyXu WenyXu requested a review from fengjiachun November 17, 2025 03:19
Collaborator

@fengjiachun fengjiachun left a comment

Rest LGTM

@WenyXu WenyXu added this pull request to the merge queue Nov 17, 2025
Merged via the queue into GreptimeTeam:main with commit 6adc348 Nov 17, 2025
43 checks passed
@WenyXu WenyXu deleted the feat/paralall-copy-database branch November 17, 2025 12:53
Labels

docs-required, size/M

3 participants