
CLI: app rollback command to rewind an app to a chosen epoch for recovery #683

@vfusco

Description

Context / Problem

An application can enter the INOPERABLE state for many reasons, some of which are transient failures such as timeouts, network failures, and the system running out of memory.

In many of these cases the data up to an earlier epoch is still valid; only later epochs are incomplete.
Today the only fix is removing and re-registering the app.

Node operators could use a tool to rewind the application to a known-good epoch and let it sync and re-process from there. This would also be useful in development environments.

To be clear, this would only roll back the node state for that particular application, without touching L1. Any later claim already on chain would be synced until the first unprocessed epoch.


Suggested Solution

  1. New CLI command
     cartesi-rollups-cli app rollback <app-name|address> <epoch> [--force]
  • <epoch> Target epoch (must be less than current epoch).
  • --force Skips interactive “are you sure?” prompt (for CI).
  2. Pre-flight checks
  • Verify the app is in DISABLED or INOPERABLE state.
  • Confirm epoch N exists.
  3. Rollback procedure
  • Start a single DB transaction:
    1. Delete rows after epoch N in all application-scoped tables.
    2. Reset specific counters and pointers (processed inputs, last_input_check_box, etc).
  • Commit.
  4. Safety features
  • Should ask for confirmation.
  • Should warn the operator about doing a backup first.
  5. Documentation
  • Add “Recovery scenarios” page enumerating cases where rollback is safe or makes sense.

Deliverables

  • New sub-command implementation in cmd/cli/app/rollback.go.
  • Helper package internal/rollback with unit tests simulating:
    • Happy path.
    • Attempt to roll back past a finalised epoch (should error).
  • Integration test under tests/cli/rollback_test.go that:
    1. Seeds a sample app with 5 epochs.
    2. Runs app rollback sample-app 3.
    3. Asserts DB state now ends at epoch 3 and node resumes processing.
  • Docs update (docs/recovery/rollback.md).
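
As a rough illustration of what the helper and its happy-path test would exercise, the sketch below models the rollback on an in-memory store instead of the real database. Everything here is an assumption made for the example: the `Row` and `Store` types, the `Seed` and `Rollback` functions, the "input" and "output" table names, and the choice to derive the processed-inputs counter from the remaining rows. The real schema and counters (such as last_input_check_box) are not modelled. The new state is built on a copy and swapped in at the end to imitate a single transaction followed by a commit.

```go
package main

import (
	"errors"
	"fmt"
)

// Row is a hypothetical stand-in for any application-scoped row tagged with an epoch.
type Row struct {
	Epoch   uint64
	Payload string
}

// Store is a hypothetical in-memory stand-in for the node database.
type Store struct {
	Tables          map[string][]Row
	CurrentEpoch    uint64
	ProcessedInputs uint64
}

// Seed creates a store with one input and one output row per epoch, 1..epochs.
func Seed(epochs uint64) *Store {
	s := &Store{Tables: map[string][]Row{}, CurrentEpoch: epochs, ProcessedInputs: epochs}
	for e := uint64(1); e <= epochs; e++ {
		s.Tables["input"] = append(s.Tables["input"], Row{Epoch: e, Payload: fmt.Sprintf("input-%d", e)})
		s.Tables["output"] = append(s.Tables["output"], Row{Epoch: e, Payload: fmt.Sprintf("output-%d", e)})
	}
	return s
}

// Rollback drops every row after the target epoch and resets the counters.
// The store is only modified after the whole new state has been computed,
// so a failure leaves it untouched.
func Rollback(s *Store, target uint64) error {
	if target >= s.CurrentEpoch {
		return errors.New("target epoch must be less than current epoch")
	}
	next := make(map[string][]Row, len(s.Tables))
	for name, rows := range s.Tables {
		kept := make([]Row, 0, len(rows))
		for _, r := range rows {
			if r.Epoch <= target {
				kept = append(kept, r)
			}
		}
		next[name] = kept
	}
	// "Commit": swap in the new state and reset counters and pointers.
	s.Tables = next
	s.CurrentEpoch = target
	s.ProcessedInputs = uint64(len(next["input"]))
	return nil
}

func main() {
	s := Seed(5) // sample app with 5 epochs
	if err := Rollback(s, 3); err != nil {
		fmt.Println("rollback failed:", err)
		return
	}
	fmt.Println("current epoch:", s.CurrentEpoch, "inputs kept:", s.ProcessedInputs)
}
```

A real implementation would presumably run the deletes and counter resets inside one database transaction rather than copying maps, but the assertions a test makes afterwards (no rows beyond the target epoch, counters reset) would have the same shape.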

Acceptance Criteria

| # | Scenario | Expected outcome |
|---|----------|------------------|
| 1 | Roll back a disabled app from epoch 5 → 3 | Rows after epoch 3 removed; current_epoch = 3 |
| 2 | Rollback attempt while app ENABLED | CLI aborts with “must disable first” error |
| 3 | Run rollback, then re-enable app | Node processes epochs 4-5 anew without entering INOPERABLE |
