Context / Problem
An application can enter the INOPERABLE state for many reasons, some of which are transient failures such as timeouts, network failures, and the system running out of memory.
In many of these cases the data up to an earlier epoch is still valid; only later epochs are incomplete.
Today the only fix is removing and re-registering the app.
Node operators could use a tool to rewind the application to a known-good epoch and let it sync and re-process from there. This would also be useful in development environments.
To be clear, it would only roll back the node state for that particular application, without touching L1. Any later claim already on chain would be synced until the first unprocessed epoch.
Suggested Solution
- New CLI command
  `cartesi-rollups-cli app rollback <app-name|address> <epoch> [--force]`
  - `<epoch>`: Target epoch (must be less than current epoch).
  - `--force`: Skips interactive “are you sure?” prompt (for CI).
- Pre-flight checks
  - Verify the app is in DISABLED or INOPERABLE state.
  - Confirm `epoch N` exists.
- Rollback procedure
  - Start a single DB transaction:
    - Delete rows after epoch N in all application-scoped tables.
    - Reset specific counters and pointers (`processed inputs`, `last_input_check_box`, etc).
  - Commit
- Safety features
  - Should ask for confirmation.
  - Should warn the operator about doing a backup first.
- Documentation
  - Add “Recovery scenarios” page enumerating cases where rollback is safe or makes sense.
Deliverables
- New sub-command implementation in `cmd/cli/app/rollback.go`.
- Helper package `internal/rollback` with unit tests simulating:
  - Happy path.
  - Attempt to rollback past finalised epoch (should error).
- Integration test under `tests/cli/rollback_test.go` that:
  - Seeds a sample app with 5 epochs.
  - Runs `app rollback sample-app 3`.
  - Asserts DB state now ends at epoch 3 and node resumes processing.
- Docs update (`docs/recovery/rollback.md`).
Acceptance Criteria
| # | Scenario | Expected outcome |
|---|---|---|
| 1 | Roll back a disabled app from epoch 5 → 3 | Rows after epoch 3 removed; current_epoch = 3 |
| 2 | Rollback attempt while app ENABLED | CLI aborts with “must disable first” error |
| 3 | Run rollback, then re-enable app | Node processes epochs 4-5 anew without entering INOPERABLE |