fix(re-create): force rescan if inbound vote monitoring fails using a context that can timeout #4278

ws4charlie · 2025-09-29T20:16:34Z

Description

This PR reintroduces the following changes with an additional enhancement:

bring back fix: force rescan if inbound vote monitoring fails #4183
bring back fix: add timeout to monitoring routine #4196

How Has This Been Tested?

Note

Force a rescan when inbound vote monitoring fails by propagating errors via a channel and resetting last scanned height, using a timeout-bound context; update interfaces, observers, and tests accordingly.

Inbound vote monitoring & rescan:
- Introduce ErrTxMonitor and monitor error channel to propagate monitoring failures.
- In Observer.PostVoteInbound, pass monitorErrCh and spawn handler handleMonitoringError(...) to log and ForceSaveLastBlockScanned(inboundHeight-1) to trigger rescan.
- Add MonitoringErrHandlerRoutineTimeout and use zctx.CopyWithTimeout to bound monitoring duration.
Observer core changes:
- Add forceResetLastScanned state; change WithLastBlockScanned(block, forceReset) signature and behavior to avoid overwriting when reset is pending.
- Add ForceSaveLastBlockScanned to persist lower heights and enforce rescan.
- Update usages across BTC/EVM/Solana observers to new API.
Zetacore client:
- PostVoteInbound(ctx, gasLimit, retryGasLimit, msg, monitorErrCh); monitoring goroutine uses same ctx and reports errors via channel.
- MonitorVoteInboundResult accepts monitorErrCh and resends on OOG; improved logging.
Context utilities:
- Add CopyWithTimeout(from, to, timeout) plus tests.
Interfaces & mocks:
- Update ZetacoreWriter/PostVoteInbound signature; regenerate/update mocks and test helpers.
Tests:
- Adjust unit tests for new APIs; add case for last scanned reset scenario; add CopyWithTimeout tests.
Docs/Changelog:
- Add changelog entry: "force rescan if inbound vote monitoring fails".

Written by Cursor Bugbot for commit ca462b5.
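To illustrate the "avoid overwriting when reset is pending" behavior described in the note above, here is a minimal sketch. The `scanState` type and the exact rule for clearing the flag are assumptions made for illustration; only the names `WithLastBlockScanned`, `forceResetLastScanned`, and the `(uint64, bool)` signature returning the previous reset state come from the PR summary.

```go
package main

import (
	"fmt"
	"sync"
)

// scanState is a hypothetical stand-in for the observer's last-scanned bookkeeping.
type scanState struct {
	mu                    sync.Mutex
	lastBlockScanned      uint64
	forceResetLastScanned bool
}

// WithLastBlockScanned updates the last scanned height and reports whether a
// forced reset was pending before the call.
// Assumption: a pending reset causes the next ordinary update to be skipped once.
func (s *scanState) WithLastBlockScanned(block uint64, forceReset bool) bool {
	s.mu.Lock()
	defer s.mu.Unlock()

	wasPending := s.forceResetLastScanned
	switch {
	case forceReset:
		// A monitoring failure lowers the height and marks the reset as pending.
		s.lastBlockScanned = block
		s.forceResetLastScanned = true
	case wasPending:
		// An in-flight scan must not overwrite the lowered height; consume the flag.
		s.forceResetLastScanned = false
	default:
		s.lastBlockScanned = block
	}
	return wasPending
}

// LastBlockScanned returns the current last scanned height.
func (s *scanState) LastBlockScanned() uint64 {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.lastBlockScanned
}

func main() {
	s := &scanState{lastBlockScanned: 100}
	s.WithLastBlockScanned(49, true)   // monitoring failed for an inbound at height 50
	s.WithLastBlockScanned(101, false) // in-flight scan finishes; update is skipped
	fmt.Println(s.LastBlockScanned())
	s.WithLastBlockScanned(50, false) // the rescan proceeds normally from here
	fmt.Println(s.LastBlockScanned())
}
```

In this sketch the returned boolean lets a caller detect that its update was dropped because a rescan had been requested in the meantime.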

Summary by CodeRabbit

New Features
- Automatically triggers a rescan if inbound vote monitoring reports an issue, improving reliability.
- Adds timeout-backed monitoring for inbound votes to surface failures faster.
- Improves inbound scan range calculation to better catch up when the scanner falls behind.
Bug Fixes
- More robust last-scanned synchronization to reduce chances of stale or skipped blocks.
Documentation
- Changelog updated to describe the new auto-rescan behavior on inbound vote monitoring failure.

coderabbitai · 2025-09-29T20:16:42Z

Walkthrough

Introduces structured monitoring via ErrTxMonitor and a monitorErrCh wired through PostVoteInbound and MonitorVoteInboundResult. Adds observer logic to handle monitor errors and optionally force-reset last-scanned blocks. Updates WithLastBlockScanned signature and usages. Adds context CopyWithTimeout utility. Adjusts tests and mocks accordingly. Minor range-calculation helper and changelog entry.

Changes

Cohort / File(s)	Summary
Changelog `changelog.md`	Adds entry describing forced rescan on inbound vote monitoring failure.
Error type for monitoring `pkg/errors/monitor_error.go`	Adds ErrTxMonitor struct with fields and Error() for monitoring goroutine results.
Observer core: monitoring and last-scanned state `zetaclient/chains/base/observer.go`	Adds monitor error handling flow, monitorErrCh usage, forceResetLastScanned flag, new MonitoringErrHandlerRoutineTimeout, ForceSaveLastBlockScanned, and changes WithLastBlockScanned signature to `(uint64, bool)` returning previous reset state.
Inbound scan range helper `zetaclient/chains/base/confirmation.go`	Adds calcUnscannedBlockRange helper and illustrative comment; no exported API change.
Observer base tests `zetaclient/chains/base/confirmation_test.go`, `zetaclient/chains/base/observer_test.go`	Updates WithLastBlockScanned calls to pass a boolean; adds test for scan range when monitoring reset lowers lastScanned; aligns fast-path test.
Interfaces `zetaclient/chains/interfaces/zetacore.go`	Extends ZetacoreWriter.PostVoteInbound to include `monitorErrCh chan<- zetaerrors.ErrTxMonitor`.
Zetacore client: vote and monitor `zetaclient/zetacore/client_vote.go`, `zetaclient/zetacore/client_monitor.go`	PostVoteInbound and MonitorVoteInboundResult accept and propagate monitorErrCh; monitoring goroutine reports ErrTxMonitor on errors; logging updated; context usage aligned.
Context utilities `zetaclient/context/context.go`, `zetaclient/context/context_test.go`	Adds CopyWithTimeout(from, to, timeout) to copy AppContext with deadline; tests verify propagation and timeout behavior.
Chain observers (signature updates) `zetaclient/chains/bitcoin/observer/db.go`, `.../db_test.go`, `zetaclient/chains/evm/observer/observer.go`, `.../observer_test.go`, `zetaclient/chains/solana/observer/inbound.go`, `zetaclient/chains/sui/observer/observer_test.go`, `zetaclient/chains/ton/observer/observer_test.go`	Updates WithLastBlockScanned callsites to `(val, false)`; adjusts PostVoteInbound mocks to accept additional monitorErrCh argument.
Test mocks `zetaclient/testutils/mocks/zetacore_client.go`, `zetaclient/testutils/mocks/zetacore_client_opts.go`	Extends mocked PostVoteInbound to include monitorErrCh; updates function adapters and expectations.
Zetacore client tests `zetaclient/zetacore/tx_test.go`	Adds nil monitorErrCh argument to PostVoteInbound and MonitorVoteInboundResult calls.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant Obs as Observer
  participant ZC as ZetacoreClient
  participant Mon as Monitor Goroutine
  participant H as handleMonitoringError

  rect rgb(245,250,255)
    note over Obs: Post vote inbound
    Obs->>ZC: PostVoteInbound(ctx, gas, retryGas, msg, monitorErrCh)
    activate ZC
    ZC-->>Obs: (zetaTxHash, ballotIndex)
  end

  par Start monitoring
    ZC->>Mon: MonitorVoteInboundResult(ctx, zetaTxHash, retryGas, msg, monitorErrCh)
    activate Mon
    alt success
      Mon-->>ZC: nil
      Mon-->>monitorErrCh: ErrTxMonitor{Err:nil}
    else error/timeout
      Mon-->>monitorErrCh: ErrTxMonitor{Err, InboundBlockHeight, ZetaTxHash, BallotIndex}
    end
    deactivate Mon
  and Handle monitor result
    Obs->>H: handleMonitoringError(monitorErrCh)
    activate H
    alt ErrTxMonitor with InboundBlockHeight>0
      H->>Obs: WithLastBlockScanned(height, true)
      note right of Obs: Set forceResetLastScanned
    else no error or no height
      H-->>Obs: no state change
    end
    deactivate H
  end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

revert: force rescan if inbound vote monitoring fails #4250 — Reverts these exact monitoring and API signature changes, indicating direct linkage.
fix: force rescan if inbound vote monitoring fails #4183 — Touches the same monitoring error channel, ErrTxMonitor type, and observer reset wiring.
feat(zetaclient): minor code improvements #4014 — Modifies MonitorVoteInboundResult paths and retry/monitoring behavior in the same client area.

Suggested reviewers

lumtis
kingpinXD
renan061

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Title Check	✅ Passed	The provided title accurately summarizes the core change of forcing a rescan when inbound vote monitoring fails and also indicates the use of a timeout‐capable context for monitoring. However, the “(re-create)” prefix is unclear and the detail about using a timeout context may be too implementation-specific for a concise title. Refining these aspects will improve clarity and maintain the title’s focus on the primary behavior change.
Description Check	✅ Passed	The pull request description includes both the required “# Description” and “# How Has This Been Tested?” sections, provides a concise summary of the changes while referencing the original issue and related PRs, and uses the prescribed test checklist. Although it does not explicitly list any dependencies or elaborate on the motivation beyond reintroducing prior work, it adheres to the core structural requirements of the repository’s template.


codecov · 2025-09-29T20:24:33Z

Codecov Report

❌ Patch coverage is 51.32743% with 55 lines in your changes missing coverage. Please review.
✅ Project coverage is 65.98%. Comparing base (4af473c) to head (ca462b5).
⚠️ Report is 2 commits behind head on develop.

Files with missing lines	Patch %	Lines
zetaclient/chains/base/observer.go	55.55%	33 Missing and 3 partials ⚠️
pkg/errors/monitor_error.go	0.00%	6 Missing ⚠️
zetaclient/zetacore/client_vote.go	33.33%	6 Missing ⚠️
zetaclient/context/context.go	57.14%	2 Missing and 1 partial ⚠️
zetaclient/zetacore/client_monitor.go	50.00%	3 Missing ⚠️
zetaclient/chains/solana/observer/inbound.go	0.00%	1 Missing ⚠️

Additional details and impacted files

@@             Coverage Diff             @@
##           develop    #4278      +/-   ##
===========================================
- Coverage    66.05%   65.98%   -0.08%     
===========================================
  Files          454      455       +1     
  Lines        33566    33649      +83     
===========================================
+ Hits         22172    22203      +31     
- Misses       10433    10481      +48     
- Partials       961      965       +4

Files with missing lines	Coverage Δ
zetaclient/chains/base/confirmation.go	`100.00% <ø> (ø)`
zetaclient/chains/bitcoin/observer/db.go	`93.61% <100.00%> (ø)`
zetaclient/chains/evm/observer/observer.go	`70.54% <100.00%> (ø)`
zetaclient/chains/solana/observer/inbound.go	`36.36% <0.00%> (ø)`
zetaclient/context/context.go	`71.42% <57.14%> (-7.15%)`	⬇️
zetaclient/zetacore/client_monitor.go	`50.45% <50.00%> (+0.45%)`	⬆️
pkg/errors/monitor_error.go	`0.00% <0.00%> (ø)`
zetaclient/zetacore/client_vote.go	`51.56% <33.33%> (-2.10%)`	⬇️
zetaclient/chains/base/observer.go	`74.69% <55.55%> (-8.02%)`	⬇️

coderabbitai

Actionable comments posted: 3

🧹 Nitpick comments (4)

zetaclient/context/context.go (1)

43-50: Consider validating the timeout parameter.

The function does not validate whether timeout is positive. While goctx.WithTimeout handles non-positive durations by creating an already-expired context, explicitly documenting or validating this behavior would improve clarity and prevent subtle bugs.
zetaclient/context/context_test.go (2)
79-87: Timing assertions may be flaky in slow CI environments.

The upper bound assertion elapsed < timeout*2 (line 86) may fail in resource-constrained CI runners where goroutine scheduling can be delayed. Consider using a more generous upper bound (e.g., timeout + 1*time.Second) or leveraging test utilities that account for execution environment variability.

60-88: Add test coverage for the error path.

The test verifies the happy path where AppContext is present, but does not cover the scenario where FromContext fails (i.e., when the source context lacks AppContext). This path should be tested to ensure the function still returns a valid timeout context and cancel function.

Add a test case similar to:
func TestCopyWithTimeout_NoAppContext(t *testing.T) {
	// ARRANGE
	ctx1 := goctx.Background() // no AppContext
	timeout := 100 * time.Millisecond

	// ACT
	ctx2, cancel := context.CopyWithTimeout(ctx1, goctx.Background(), timeout)
	defer cancel()

	// ASSERT
	// Verify that AppContext is not present
	_, err := context.FromContext(ctx2)
	assert.ErrorIs(t, err, context.ErrNotSet)

	// Verify that timeout still works
	<-ctx2.Done()
	assert.ErrorIs(t, ctx2.Err(), goctx.DeadlineExceeded)
}
pkg/errors/monitor_error.go (1)
5-19: Review the nil Err pattern for semantic clarity.

The Error() method returns "monitoring completed without error" when Err is nil, which is semantically unusual—this type is being used to signal both success and failure on the monitoring channel. Consider whether:

The channel should send nil to signal success (idiomatic Go) and ErrTxMonitor only for errors, OR

A separate success signal type is warranted if additional metadata (block height, tx hash, ballot index) must accompany successful completion.

As implemented, calling .Error() on a success case produces an error-like string, which may confuse logging and error handling downstream.

If the current design is intentional and success metadata is required, document this dual-purpose behavior clearly in a comment above the type definition. Otherwise, refactor to separate success/failure signaling:
+// ErrTxMonitor represents an error from the monitoring goroutine.
+// When Err is nil, monitoring completed successfully but metadata is preserved.
+// Callers should check if Err is nil before treating this as an error.
 type ErrTxMonitor struct {
 	Err                error
 	InboundBlockHeight uint64
 	ZetaTxHash         string
 	BallotIndex        string
 }

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 4af473c and ca462b5.

📒 Files selected for processing (21)

changelog.md (1 hunks)
pkg/errors/monitor_error.go (1 hunks)
zetaclient/chains/base/confirmation.go (1 hunks)
zetaclient/chains/base/confirmation_test.go (2 hunks)
zetaclient/chains/base/observer.go (11 hunks)
zetaclient/chains/base/observer_test.go (1 hunks)
zetaclient/chains/bitcoin/observer/db.go (1 hunks)
zetaclient/chains/bitcoin/observer/db_test.go (1 hunks)
zetaclient/chains/evm/observer/observer.go (1 hunks)
zetaclient/chains/evm/observer/observer_test.go (1 hunks)
zetaclient/chains/interfaces/zetacore.go (2 hunks)
zetaclient/chains/solana/observer/inbound.go (1 hunks)
zetaclient/chains/sui/observer/observer_test.go (2 hunks)
zetaclient/chains/ton/observer/observer_test.go (1 hunks)
zetaclient/context/context.go (2 hunks)
zetaclient/context/context_test.go (2 hunks)
zetaclient/testutils/mocks/zetacore_client.go (3 hunks)
zetaclient/testutils/mocks/zetacore_client_opts.go (1 hunks)
zetaclient/zetacore/client_monitor.go (6 hunks)
zetaclient/zetacore/client_vote.go (3 hunks)
zetaclient/zetacore/tx_test.go (2 hunks)

🧰 Additional context used

📓 Path-based instructions (1)

**/*.go

⚙️ CodeRabbit configuration file

Review the Go code, point out issues relative to principles of clean code, expressiveness, and performance.

Files:

zetaclient/chains/solana/observer/inbound.go
zetaclient/chains/base/observer_test.go
zetaclient/chains/ton/observer/observer_test.go
zetaclient/context/context_test.go
zetaclient/chains/evm/observer/observer_test.go
zetaclient/chains/bitcoin/observer/db_test.go
zetaclient/chains/evm/observer/observer.go
zetaclient/context/context.go
zetaclient/chains/base/confirmation_test.go
zetaclient/testutils/mocks/zetacore_client_opts.go
zetaclient/chains/interfaces/zetacore.go
zetaclient/zetacore/tx_test.go
zetaclient/chains/sui/observer/observer_test.go
zetaclient/testutils/mocks/zetacore_client.go
pkg/errors/monitor_error.go
zetaclient/chains/base/confirmation.go
zetaclient/zetacore/client_monitor.go
zetaclient/zetacore/client_vote.go
zetaclient/chains/bitcoin/observer/db.go
zetaclient/chains/base/observer.go

🧬 Code graph analysis (10)

zetaclient/chains/base/observer_test.go (1)

zetaclient/chains/base/observer.go (1)

Observer (45-87)

zetaclient/context/context_test.go (3)

zetaclient/context/app.go (1)

New (38-45)

zetaclient/config/config_chain.go (1)

New (15-32)

zetaclient/context/context.go (3)

WithAppContext (15-17)

CopyWithTimeout (43-50)

FromContext (20-27)

zetaclient/chains/base/confirmation_test.go (1)

zetaclient/chains/base/observer.go (1)

Observer (45-87)

zetaclient/chains/interfaces/zetacore.go (1)

pkg/errors/monitor_error.go (1)

ErrTxMonitor (6-11)

zetaclient/chains/sui/observer/observer_test.go (1)

pkg/errors/monitor_error.go (1)

ErrTxMonitor (6-11)

zetaclient/testutils/mocks/zetacore_client.go (1)

pkg/errors/monitor_error.go (1)

ErrTxMonitor (6-11)

zetaclient/zetacore/client_monitor.go (3)

pkg/errors/monitor_error.go (1)

ErrTxMonitor (6-11)

pkg/retry/retry.go (1)

Retry (126-136)

zetaclient/logs/fields.go (1)

FieldZetaTx (25-25)

zetaclient/zetacore/client_vote.go (1)

pkg/errors/monitor_error.go (1)

ErrTxMonitor (6-11)

zetaclient/chains/bitcoin/observer/db.go (2)

pkg/chains/bitcoin.go (1)

IsBitcoinRegnet (46-48)

zetaclient/chains/bitcoin/observer/observer.go (1)

RegnetStartBlock (69-69)

zetaclient/chains/base/observer.go (9)

zetaclient/chains/evm/observer/observer.go (1)

Observer (59-73)

zetaclient/chains/bitcoin/observer/observer.go (1)

Observer (94-132)

zetaclient/chains/solana/observer/observer.go (1)

Observer (70-87)

zetaclient/chains/sui/observer/observer.go (1)

Observer (19-32)

zetaclient/chains/ton/observer/observer.go (1)

Observer (20-30)

pkg/errors/monitor_error.go (1)

ErrTxMonitor (6-11)

zetaclient/context/context.go (1)

CopyWithTimeout (43-50)

zetaclient/chains/interfaces/zetacore.go (1)

ZetacoreClient (66-124)

zetaclient/logs/fields.go (3)

FieldZetaTx (25-25)

FieldBallotIndex (35-35)

FieldBlock (16-16)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)

GitHub Check: build-zetanode
GitHub Check: analyze (go)
GitHub Check: build
GitHub Check: gosec
GitHub Check: lint
GitHub Check: build-and-test

🔇 Additional comments (22)

zetaclient/context/context.go (1)
44-47: Return the cancel function in the error path.

The error path returns only the context from goctx.WithTimeout, discarding the cancel function. This creates inconsistent return behavior and prevents callers from properly releasing resources associated with the timeout context.

Apply this diff to return both values:
 func CopyWithTimeout(from, to goctx.Context, timeout time.Duration) (goctx.Context, goctx.CancelFunc) {
 	app, err := FromContext(from)
 	if err != nil {
-		return goctx.WithTimeout(to, timeout)
+		ctxWithTimeout, cancel := goctx.WithTimeout(to, timeout)
+		return ctxWithTimeout, cancel
 	}
 	ctxWithTimeout, cancel := goctx.WithTimeout(to, timeout)
 	return WithAppContext(ctxWithTimeout, app), cancel
 }
Likely an incorrect or invalid review comment.
zetaclient/chains/base/confirmation_test.go (3)

70-79: LGTM! Test case validates scan range for reset scenario.

The new test case correctly validates the scan range calculation when lastScanned significantly lags behind lastBlock (50 vs 100), ensuring the observer handles forced resets appropriately.

86-86: LGTM! API signature updated correctly.

The call to WithLastBlockScanned now includes the second boolean parameter (false), aligning with the API evolution across the codebase.

144-144: LGTM! Consistent API usage.

The signature update is consistent with the broader API change to WithLastBlockScanned(block uint64, forceReset bool).

zetaclient/chains/ton/observer/observer_test.go (1)

247-247: LGTM! Mock signature updated for monitoring channel.

The mock expectation correctly accommodates the additional monitorErrCh parameter introduced to PostVoteInbound, enabling error monitoring for inbound vote processing.

zetaclient/chains/solana/observer/inbound.go (1)

57-57: LGTM! API signature updated correctly.

The call to WithLastBlockScanned includes the second boolean parameter (false), properly updating the last scanned block for metrics when no new signatures are found and the slot query succeeds.

zetaclient/chains/bitcoin/observer/db_test.go (1)

99-99: LGTM! Test correctly resets state with updated API.

The call to WithLastBlockScanned(0, false) properly resets the last scanned block to trigger the RPC code path, enabling validation of error handling.

zetaclient/testutils/mocks/zetacore_client_opts.go (1)

50-50: LGTM! Mock helper updated for monitoring channel.

The WithPostVoteInbound helper now correctly mocks the extended PostVoteInbound signature with five parameters, including the monitorErrCh channel for vote monitoring.

zetaclient/chains/evm/observer/observer_test.go (1)

241-241: LGTM: Test correctly updated for new signature.

The addition of the false parameter to WithLastBlockScanned aligns with the broader API change. In this test context, where the last block is intentionally reset to 0 to trigger RPC loading, passing false (no forced reset) is the correct behavior.

zetaclient/zetacore/tx_test.go (2)

240-245: LGTM: Test updated correctly for new signature.

The addition of nil for the monitorErrCh parameter is appropriate for this unit test, which focuses on the "already voted" scenario and doesn't require monitoring channel verification.

282-287: LGTM: Test updated correctly for new signature.

Passing nil for the monitorErrCh parameter is appropriate here, as this test validates the basic monitoring flow without requiring error channel assertions.

zetaclient/chains/interfaces/zetacore.go (2)

13-13: LGTM: Import alias added appropriately.

The zetaerrors alias avoids collision with the standard errors package and clearly identifies the custom error types.

49-54: LGTM: Interface signature extended correctly.

The addition of monitorErrCh chan<- zetaerrors.ErrTxMonitor to PostVoteInbound is well-typed (send-only channel) and consistently propagated across implementations and mocks according to the PR context.

zetaclient/chains/sui/observer/observer_test.go (2)

18-18: LGTM: Import alias added appropriately.

Consistent with the pattern used in other test files to avoid collision with the standard errors package.

605-615: LGTM: Test mock updated correctly for new signature.

The CatchInboundVotes test helper correctly:

Adds the monitorErrCh parameter to the callback signature (unused via _, which is appropriate for this test)

Updates the mock expectation to match the new 5-parameter signature

The test continues to capture inbound votes as intended without requiring monitoring channel verification.

zetaclient/chains/base/observer_test.go (1)

196-196: LGTM: Test correctly updated for new API signature.

The addition of the false parameter aligns with the updated WithLastBlockScanned signature across the codebase.

zetaclient/chains/base/confirmation.go (1)

98-98: LGTM: Example documentation accurately illustrates edge case.

The added example correctly demonstrates the scenario where lastScanned is significantly behind lastBlock, producing 41 unscanned blocks in the range [51, 91]. The parenthetical note provides helpful context about monitoring-driven resets.

zetaclient/chains/evm/observer/observer.go (1)

264-264: LGTM: Correct parameter added for initial block setup.

The false parameter appropriately indicates this is a normal initialization path rather than a forced reset scenario.

zetaclient/chains/bitcoin/observer/db.go (2)

50-50: LGTM: API signature correctly updated.

The false parameter is appropriate for this initialization path.

55-55: LGTM: Regtest initialization correctly updated.

Consistent usage of the false parameter for the regtest-specific initialization.
zetaclient/zetacore/client_vote.go (2)
159-159: LGTM: Structured error monitoring added.

The new monitorErrCh parameter enables structured error propagation from the monitoring goroutine, improving observability of vote monitoring failures.

193-210: Revise log and reevaluate monitoring context
In client_vote.go’s goroutine select, change
- case <-ctx.Done():
-     c.logger.Error().Msg("context cancelled: timeout")
+ case <-ctx.Done():
+     c.logger.Error().Err(ctx.Err()).Msg("context cancelled: failed to send monitor error")
to report the actual cancellation reason.
Confirm whether the monitor goroutine should share the parent ctx (and be cancelled when the request ends) or use a separate context that outlives the caller to avoid dropping monitoring errors.

changelog.md

zetaclient/chains/base/confirmation_test.go

zetaclient/chains/base/observer.go

lumtis

I'm thinking about changing our approach for this one.
Initially, we wanted a quick patch for the missed inbound issue.
This rescan approach was the simplest implementation for it.
But it has drawbacks, like rescanning blocks that you already know will not contain events, and blocking the scanning process if for some reason one inbound can't be processed.
In the end, I think the best approach would be to keep the current workflow for block scanning and report the events that failed to be observed in an internal tracker cache.
We'd have a separate workflow to iterate over these missed observations, the same as the current behavior with the tracker lists.
This way we ensure that most ZetaClients remain at the same pace of block observation and none falls behind.

ws4charlie · 2025-10-01T19:53:00Z

I'm thinking about changing our approach for this one. Initially, we wanted a quick patch for the missed inbound issue. This rescan approach was the simplest implementation for it. But it has drawbacks, like rescanning blocks that you already know will not contain events, and blocking the scanning process if for some reason one inbound can't be processed. In the end, I think the best approach would be to keep the current workflow for block scanning and report the events that failed to be observed in an internal tracker cache. We'd have a separate workflow to iterate over these missed observations, the same as the current behavior with the tracker lists. This way we ensure that most ZetaClients remain at the same pace of block observation and none falls behind.

I also think the previous solution could be improved. Creating zetaclient internal inbound trackers (like a backlog) is worth trying. I'll put this PR on draft and open it later.

ws4charlie · 2025-10-02T20:13:41Z

Closing this PR; it's replaced by #4295.

force rescan if inbound vote monitoring fails using a context that can timeout

11d8f2f

ws4charlie added the zetaclient, enhancement, SOLANA_TESTS, SUI_TESTS, and TON_TESTS labels Sep 29, 2025

ws4charlie and others added 2 commits September 29, 2025 15:34

add changelog entry

844d193

Merge branch 'develop' into force-rescan-if-inbound-vote-monitor-fail

ca462b5

ws4charlie marked this pull request as ready for review October 1, 2025 03:20

ws4charlie requested a review from a team as a code owner October 1, 2025 03:20

coderabbitai bot reviewed Oct 1, 2025

lumtis reviewed Oct 1, 2025

ws4charlie marked this pull request as draft October 1, 2025 19:55

ws4charlie closed this Oct 2, 2025
