… molt-redux-phase-2
…draft diagram, improved readability
Migration doc rework phase 2
…figure Molt Fetch, updated sidebar
ryanluu12345
left a comment
Very good refactor. It's clear to me when and why I should use each approach. I also like that the content goes deep where it needs to and links out where it's part of a larger flow.
I'm going to defer to the rest of the team's review and @Jeremyyang920's stamp here.
- This approach does not utilize [continuous replication]({% link molt/migration-considerations-replication.md %}).
- [Rollback]({% link molt/migration-considerations-rollback.md %}) is manual, but in most cases it's simple, as the source database is preserved and write traffic begins on the target all at once.
Should we specify which sub-section within rollback they should focus on? It's a bit vague what the manual rollback actually involves. Would this be configuring reverse replication, or just copying rows back? It's unclear right now.
## Step 8: Stop forward replication
Before you can cut over traffic to the target, the changes to the source database need to finish being written to the target. Once the source is no longer receiving write traffic, MOLT Replicator will take some seconds to finish replicating the final changes. This is known as _drainage_.
I think it's important to also call out that what folks should be watching here is the replication lag, to confirm that most of the data has been moved over. That is the metric that indicates completion of "drainage".
You can probably point them to the source and target lag metrics. Here are a few they can look at that Replicator exposes:
source_lag_seconds_histogram
target_lag_seconds_histogram
source_commit_to_apply_lag_seconds
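To make the "watch the lag" suggestion concrete, here is a minimal sketch of a drainage check. It assumes the metrics are collected in Prometheus text format and that `source_commit_to_apply_lag_seconds` is exposed as a plain gauge; neither is confirmed in this thread. The function names, the 1-second threshold, the "three consecutive scrapes" rule, and the `schema` label in the usage example are all hypothetical choices for illustration, not documented Replicator behavior.

```python
from typing import Dict, List


def parse_metrics(text: str) -> Dict[str, List[float]]:
    """Parse Prometheus text-format samples into {metric_name: [values]}.

    Labels are ignored, so every series of a metric lands in one list.
    """
    samples: Dict[str, List[float]] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        if "{" in line:
            name = line[: line.index("{")]
            # The value is the first field after the closing brace of the labels.
            fields = line[line.rindex("}") + 1:].split()
        else:
            name, *fields = line.split()
        if not fields:
            continue
        samples.setdefault(name, []).append(float(fields[0]))
    return samples


def is_drained(
    scrapes: List[str],
    metric: str = "source_commit_to_apply_lag_seconds",  # assumed to be a gauge
    threshold_seconds: float = 1.0,  # hypothetical threshold
    required_consecutive: int = 3,  # hypothetical stability window
) -> bool:
    """Return True when the most recent scrapes all report lag at or below the threshold."""
    if len(scrapes) < required_consecutive:
        return False
    for text in scrapes[-required_consecutive:]:
        values = parse_metrics(text).get(metric)
        # A missing metric counts as "not drained" rather than as zero lag.
        if not values or max(values) > threshold_seconds:
            return False
    return True
```

For example, with `high = 'source_commit_to_apply_lag_seconds{schema="public"} 12.5'` and `low = 'source_commit_to_apply_lag_seconds{schema="public"} 0.2'`, `is_drained([high, low, low, low])` is true and `is_drained([low, low, high])` is false. Requiring several consecutive low readings guards against treating a single quiet scrape as the end of drainage. The two `*_histogram` metrics would need bucket-based handling that this sketch does not attempt.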
@@ -0,0 +1,335 @@
A [*Phased Bulk Load Migration*]({% link molt/migration-approach-phased-bulk-load.md %}) involves [migrating data to CockroachDB]({% link molt/migration-overview.md %}) in several phases. Data can be sliced per tenant, per service, per region, or per table to suit the needs of the migration. In this approach, you stop application traffic to the source database _only_ for the tables in a particular slice of data. You then migrate that phase of data to the target cluster using [MOLT Fetch]({% link molt/molt-fetch.md %}) during a **downtime window**. Application traffic is then cut over to those target tables after schema finalization and data verification. This process is repeated for each phase of data.
I recognize this is an entirely different pattern from the one covered before, but it shares a lot of steps with a simple monolithic, bulk-load-only migration. This is more of a nit, but is there any way we can reuse content from the above? I see we already use references, but I'm wondering if there is more we can do to consolidate so we don't have to maintain as much content.
Overall, looks solid! I wish I could've given this a more comprehensive review, but it's hard to do that without turning this into a full review of the whole Migrations docs, since it's hard to see the delta here.
I like the addition of diagrams and concrete examples that help users visualize their migration strategy, as well as the walkthroughs of specific migration strategies. Good stuff! I mainly have small comments about wording and a question for my TL and manager. It also seems that two flags may have been dropped, and those flags are necessary for forward replication to pg.
This is a chunky PR, so here's a high-level description of the changes that have been made, so you can stay oriented.
Note that the line count that GitHub provides for this PR (10000+ lines added) is very misleading because of how GitHub counts .svg files: they are basically just images, but GitHub interprets each one as many hundreds of new lines. The real number is far lower than the one provided.
Phase 1: New guidance documentation
This is all of the content under the Migrate > Migration Considerations section of the nav. That includes:
I also moved Migration Strategies to Migrate > Migration Best Practices. It's mostly the same, but I moved some of its contents into the other guidance docs.
These pages will need the most significant tech review (though probably not from eng, probably from people in the field?) as I want to make sure that all of this strategic guidance is accurate.
Phase 2: Re-organizing the MOLT tooling docs
This is all of the content under Migrate > MOLT Tools, primarily under Fetch and Replicator. I broke up the contents by task, concept, and reference documentation. I also merged from main to include all of the Replicator userscript documentation. I did a lot of refining of the main Fetch and Replicator guides, including a new graphic in the Fetch guide. All of it is more readable and navigable now.
Phase 3: Source-db specific, step-by-step walkthroughs broken up by strategy
This is the content under Migrate > Common Migration Approaches. Mainly:
Note that each of these pages is an overview that links to three source-DB-specific walkthroughs.
These pages rework the content that was originally under "Migration Flows", but they are more rigorously task-based, are broken up by source DB type, try to reduce duplicated information, include some graphics, and offer greater specificity about things like application downtime.
Other changes
A lot of files required changes to links, due to the other changes here. Various includes were removed/renamed. Therefore, many random files have minor, invisible changes (like links).