
Conversation

@ws4charlie
Contributor

@ws4charlie ws4charlie commented Sep 29, 2025

Description

This PR reintroduces the following changes with additional enhancements:

  1. Bring back "fix: force rescan if inbound vote monitoring fails" (#4183)
  2. Bring back "fix: add timeout to monitoring routine" (#4196)

Closes #4245

How Has This Been Tested?

  • Tested CCTX in localnet
  • Tested in development environment
  • Go unit tests
  • Go integration tests
  • Tested via GitHub Actions

Note

Force a rescan when inbound vote monitoring fails by propagating errors via a channel and resetting last scanned height, using a timeout-bound context; update interfaces, observers, and tests accordingly.

  • Inbound vote monitoring & rescan:
    • Introduce ErrTxMonitor and monitor error channel to propagate monitoring failures.
    • In Observer.PostVoteInbound, pass monitorErrCh and spawn handler handleMonitoringError(...) to log and ForceSaveLastBlockScanned(inboundHeight-1) to trigger rescan.
    • Add MonitoringErrHandlerRoutineTimeout and use zctx.CopyWithTimeout to bound monitoring duration.
  • Observer core changes:
    • Add forceResetLastScanned state; change WithLastBlockScanned(block, forceReset) signature and behavior to avoid overwriting when reset is pending.
    • Add ForceSaveLastBlockScanned to persist lower heights and enforce rescan.
    • Update usages across BTC/EVM/Solana observers to new API.
  • Zetacore client:
    • PostVoteInbound(ctx, gasLimit, retryGasLimit, msg, monitorErrCh); monitoring goroutine uses same ctx and reports errors via channel.
    • MonitorVoteInboundResult accepts monitorErrCh and resends on OOG; improved logging.
  • Context utilities:
    • Add CopyWithTimeout(from, to, timeout) plus tests.
  • Interfaces & mocks:
    • Update ZetacoreWriter/PostVoteInbound signature; regenerate/update mocks and test helpers.
  • Tests:
    • Adjust unit tests for new APIs; add case for last scanned reset scenario; add CopyWithTimeout tests.
  • Docs/Changelog:
    • Add changelog entry: "force rescan if inbound vote monitoring fails".
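
The "Observer core changes" bullets describe a pending-reset flag that keeps an ordinary progress update from overwriting a forced lower height. A minimal sketch of that idea follows, assuming a simplified in-memory state. The `scanState` type, its fields, and the policy of clearing the flag after one skipped update are assumptions for illustration; the real observer also persists the height and may differ in detail.

```go
package main

import (
	"fmt"
	"sync"
)

// scanState is a hypothetical, in-memory stand-in for the observer's
// last-scanned bookkeeping; the real Observer also persists to a database.
type scanState struct {
	mu           sync.Mutex
	lastScanned  uint64
	resetPending bool // mirrors the idea of forceResetLastScanned
}

// WithLastBlockScanned records scan progress. A forced reset always applies
// and marks the reset as pending. A normal update is dropped once while a
// reset is pending, so an in-flight scan result cannot overwrite the lower
// height. It returns whether a reset was pending before the call.
func (s *scanState) WithLastBlockScanned(block uint64, forceReset bool) bool {
	s.mu.Lock()
	defer s.mu.Unlock()

	wasPending := s.resetPending
	switch {
	case forceReset:
		s.lastScanned = block
		s.resetPending = true
	case wasPending:
		// Assumed policy: skip this update and clear the flag, so the
		// next scan starts from the forced height.
		s.resetPending = false
	default:
		s.lastScanned = block
	}
	return wasPending
}

// LastScanned returns the current last-scanned height.
func (s *scanState) LastScanned() uint64 {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.lastScanned
}

func main() {
	s := &scanState{}
	s.WithLastBlockScanned(100, false) // normal progress
	s.WithLastBlockScanned(89, true)   // monitoring failed for block 90
	s.WithLastBlockScanned(101, false) // in-flight update is ignored
	fmt.Println(s.LastScanned())       // 89
	s.WithLastBlockScanned(90, false)  // rescan resumes from block 90
	fmt.Println(s.LastScanned())       // 90
}
```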

Written by Cursor Bugbot for commit ca462b5.

Summary by CodeRabbit

  • New Features

    • Automatically triggers a rescan if inbound vote monitoring reports an issue, improving reliability.
    • Adds timeout-backed monitoring for inbound votes to surface failures faster.
    • Improves inbound scan range calculation to better catch up when the scanner falls behind.
  • Bug Fixes

    • More robust last-scanned synchronization to reduce chances of stale or skipped blocks.
  • Documentation

    • Changelog updated to describe the new auto-rescan behavior on inbound vote monitoring failure.

@coderabbitai
Contributor

coderabbitai bot commented Sep 29, 2025

📝 Walkthrough

Introduces structured monitoring via ErrTxMonitor and a monitorErrCh wired through PostVoteInbound and MonitorVoteInboundResult. Adds observer logic to handle monitor errors and optionally force-reset last-scanned blocks. Updates WithLastBlockScanned signature and usages. Adds context CopyWithTimeout utility. Adjusts tests and mocks accordingly. Minor range-calculation helper and changelog entry.

Changes

| Cohort | File(s) | Summary |
| --- | --- | --- |
| Changelog | changelog.md | Adds entry describing forced rescan on inbound vote monitoring failure. |
| Error type for monitoring | pkg/errors/monitor_error.go | Adds ErrTxMonitor struct with fields and Error() for monitoring goroutine results. |
| Observer core: monitoring and last-scanned state | zetaclient/chains/base/observer.go | Adds monitor error handling flow, monitorErrCh usage, forceResetLastScanned flag, new MonitoringErrHandlerRoutineTimeout, ForceSaveLastBlockScanned, and changes WithLastBlockScanned signature to (uint64, bool) returning previous reset state. |
| Inbound scan range helper | zetaclient/chains/base/confirmation.go | Adds calcUnscannedBlockRange helper and illustrative comment; no exported API change. |
| Observer base tests | zetaclient/chains/base/confirmation_test.go, zetaclient/chains/base/observer_test.go | Updates WithLastBlockScanned calls to pass a boolean; adds test for scan range when monitoring reset lowers lastScanned; aligns fast-path test. |
| Interfaces | zetaclient/chains/interfaces/zetacore.go | Extends ZetacoreWriter.PostVoteInbound to include monitorErrCh chan<- zetaerrors.ErrTxMonitor. |
| Zetacore client: vote and monitor | zetaclient/zetacore/client_vote.go, zetaclient/zetacore/client_monitor.go | PostVoteInbound and MonitorVoteInboundResult accept and propagate monitorErrCh; monitoring goroutine reports ErrTxMonitor on errors; logging updated; context usage aligned. |
| Context utilities | zetaclient/context/context.go, zetaclient/context/context_test.go | Adds CopyWithTimeout(from, to, timeout) to copy AppContext with deadline; tests verify propagation and timeout behavior. |
| Chain observers (signature updates) | zetaclient/chains/bitcoin/observer/db.go, .../db_test.go, zetaclient/chains/evm/observer/observer.go, .../observer_test.go, zetaclient/chains/solana/observer/inbound.go, zetaclient/chains/sui/observer/observer_test.go, zetaclient/chains/ton/observer/observer_test.go | Updates WithLastBlockScanned callsites to (val, false); adjusts PostVoteInbound mocks to accept additional monitorErrCh argument. |
| Test mocks | zetaclient/testutils/mocks/zetacore_client.go, zetaclient/testutils/mocks/zetacore_client_opts.go | Extends mocked PostVoteInbound to include monitorErrCh; updates function adapters and expectations. |
| Zetacore client tests | zetaclient/zetacore/tx_test.go | Adds nil monitorErrCh argument to PostVoteInbound and MonitorVoteInboundResult calls. |

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant Obs as Observer
  participant ZC as ZetacoreClient
  participant Mon as Monitor Goroutine
  participant H as handleMonitoringError

  rect rgb(245,250,255)
    note over Obs: Post vote inbound
    Obs->>ZC: PostVoteInbound(ctx, gas, retryGas, msg, monitorErrCh)
    activate ZC
    ZC-->>Obs: (zetaTxHash, ballotIndex)
  end

  par Start monitoring
    ZC->>Mon: MonitorVoteInboundResult(ctx, zetaTxHash, retryGas, msg, monitorErrCh)
    activate Mon
    alt success
      Mon-->>ZC: nil
      Mon-->>monitorErrCh: ErrTxMonitor{Err:nil}
    else error/timeout
      Mon-->>monitorErrCh: ErrTxMonitor{Err, InboundBlockHeight, ZetaTxHash, BallotIndex}
    end
    deactivate Mon
  and Handle monitor result
    Obs->>H: handleMonitoringError(monitorErrCh)
    activate H
    alt ErrTxMonitor with InboundBlockHeight>0
      H->>Obs: WithLastBlockScanned(height, true)
      note right of Obs: Set forceResetLastScanned
    else no error or no height
      H-->>Obs: no state change
    end
    deactivate H
  end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested reviewers

  • lumtis
  • kingpinXD
  • renan061

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title Check | ✅ Passed | The provided title accurately summarizes the core change of forcing a rescan when inbound vote monitoring fails and also indicates the use of a timeout-capable context for monitoring. However, the "(re-create)" prefix is unclear and the detail about using a timeout context may be too implementation-specific for a concise title. Refining these aspects will improve clarity and maintain the title's focus on the primary behavior change. |
| Description Check | ✅ Passed | The pull request description includes both the required "# Description" and "# How Has This Been Tested?" sections, provides a concise summary of the changes while referencing the original issue and related PRs, and uses the prescribed test checklist. Although it does not explicitly list any dependencies or elaborate on the motivation beyond reintroducing prior work, it adheres to the core structural requirements of the repository's template. |

@ws4charlie ws4charlie added zetaclient Issues related to ZetaClient enhancement New feature or request SOLANA_TESTS Run make start-solana-test SUI_TESTS Run make start-sui-tests TON_TESTS Runs TON E2E tests labels Sep 29, 2025
@codecov

codecov bot commented Sep 29, 2025

Codecov Report

❌ Patch coverage is 51.32743% with 55 lines in your changes missing coverage. Please review.
✅ Project coverage is 65.98%. Comparing base (4af473c) to head (ca462b5).
⚠️ Report is 2 commits behind head on develop.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| zetaclient/chains/base/observer.go | 55.55% | 33 Missing and 3 partials ⚠️ |
| pkg/errors/monitor_error.go | 0.00% | 6 Missing ⚠️ |
| zetaclient/zetacore/client_vote.go | 33.33% | 6 Missing ⚠️ |
| zetaclient/context/context.go | 57.14% | 2 Missing and 1 partial ⚠️ |
| zetaclient/zetacore/client_monitor.go | 50.00% | 3 Missing ⚠️ |
| zetaclient/chains/solana/observer/inbound.go | 0.00% | 1 Missing ⚠️ |
Additional details and impacted files

@@             Coverage Diff             @@
##           develop    #4278      +/-   ##
===========================================
- Coverage    66.05%   65.98%   -0.08%     
===========================================
  Files          454      455       +1     
  Lines        33566    33649      +83     
===========================================
+ Hits         22172    22203      +31     
- Misses       10433    10481      +48     
- Partials       961      965       +4     
| Files with missing lines | Coverage Δ |
| --- | --- |
| zetaclient/chains/base/confirmation.go | 100.00% <ø> (ø) |
| zetaclient/chains/bitcoin/observer/db.go | 93.61% <100.00%> (ø) |
| zetaclient/chains/evm/observer/observer.go | 70.54% <100.00%> (ø) |
| zetaclient/chains/solana/observer/inbound.go | 36.36% <0.00%> (ø) |
| zetaclient/context/context.go | 71.42% <57.14%> (-7.15%) ⬇️ |
| zetaclient/zetacore/client_monitor.go | 50.45% <50.00%> (+0.45%) ⬆️ |
| pkg/errors/monitor_error.go | 0.00% <0.00%> (ø) |
| zetaclient/zetacore/client_vote.go | 51.56% <33.33%> (-2.10%) ⬇️ |
| zetaclient/chains/base/observer.go | 74.69% <55.55%> (-8.02%) ⬇️ |

@ws4charlie ws4charlie marked this pull request as ready for review October 1, 2025 03:20
@ws4charlie ws4charlie requested a review from a team as a code owner October 1, 2025 03:20
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

🧹 Nitpick comments (4)
zetaclient/context/context.go (1)

43-50: Consider validating the timeout parameter.

The function does not validate whether timeout is positive. While goctx.WithTimeout handles non-positive durations by creating an already-expired context, explicitly documenting or validating this behavior would improve clarity and prevent subtle bugs.

zetaclient/context/context_test.go (2)

79-87: Timing assertions may be flaky in slow CI environments.

The upper bound assertion elapsed < timeout*2 (line 86) may fail in resource-constrained CI runners where goroutine scheduling can be delayed. Consider using a more generous upper bound (e.g., timeout + 1*time.Second) or leveraging test utilities that account for execution environment variability.


60-88: Add test coverage for the error path.

The test verifies the happy path where AppContext is present, but does not cover the scenario where FromContext fails (i.e., when the source context lacks AppContext). This path should be tested to ensure the function still returns a valid timeout context and cancel function.

Add a test case similar to:

func TestCopyWithTimeout_NoAppContext(t *testing.T) {
	// ARRANGE
	ctx1 := goctx.Background() // no AppContext
	timeout := 100 * time.Millisecond

	// ACT
	ctx2, cancel := context.CopyWithTimeout(ctx1, goctx.Background(), timeout)
	defer cancel()

	// ASSERT
	// Verify that AppContext is not present
	_, err := context.FromContext(ctx2)
	assert.ErrorIs(t, err, context.ErrNotSet)

	// Verify that timeout still works
	<-ctx2.Done()
	assert.ErrorIs(t, ctx2.Err(), goctx.DeadlineExceeded)
}
pkg/errors/monitor_error.go (1)

5-19: Review the nil Err pattern for semantic clarity.

The Error() method returns "monitoring completed without error" when Err is nil, which is semantically unusual—this type is being used to signal both success and failure on the monitoring channel. Consider whether:

  1. The channel should send nil to signal success (idiomatic Go) and ErrTxMonitor only for errors, OR
  2. A separate success signal type is warranted if additional metadata (block height, tx hash, ballot index) must accompany successful completion.

As implemented, calling .Error() on a success case produces an error-like string, which may confuse logging and error handling downstream.

If the current design is intentional and success metadata is required, document this dual-purpose behavior clearly in a comment above the type definition. Otherwise, refactor to separate success/failure signaling:

+// ErrTxMonitor represents an error from the monitoring goroutine.
+// When Err is nil, monitoring completed successfully but metadata is preserved.
+// Callers should check if Err is nil before treating this as an error.
 type ErrTxMonitor struct {
 	Err                error
 	InboundBlockHeight uint64
 	ZetaTxHash         string
 	BallotIndex        string
 }
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 4af473c and ca462b5.

📒 Files selected for processing (21)
  • changelog.md (1 hunks)
  • pkg/errors/monitor_error.go (1 hunks)
  • zetaclient/chains/base/confirmation.go (1 hunks)
  • zetaclient/chains/base/confirmation_test.go (2 hunks)
  • zetaclient/chains/base/observer.go (11 hunks)
  • zetaclient/chains/base/observer_test.go (1 hunks)
  • zetaclient/chains/bitcoin/observer/db.go (1 hunks)
  • zetaclient/chains/bitcoin/observer/db_test.go (1 hunks)
  • zetaclient/chains/evm/observer/observer.go (1 hunks)
  • zetaclient/chains/evm/observer/observer_test.go (1 hunks)
  • zetaclient/chains/interfaces/zetacore.go (2 hunks)
  • zetaclient/chains/solana/observer/inbound.go (1 hunks)
  • zetaclient/chains/sui/observer/observer_test.go (2 hunks)
  • zetaclient/chains/ton/observer/observer_test.go (1 hunks)
  • zetaclient/context/context.go (2 hunks)
  • zetaclient/context/context_test.go (2 hunks)
  • zetaclient/testutils/mocks/zetacore_client.go (3 hunks)
  • zetaclient/testutils/mocks/zetacore_client_opts.go (1 hunks)
  • zetaclient/zetacore/client_monitor.go (6 hunks)
  • zetaclient/zetacore/client_vote.go (3 hunks)
  • zetaclient/zetacore/tx_test.go (2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.go

⚙️ CodeRabbit configuration file

Review the Go code, point out issues relative to principles of clean code, expressiveness, and performance.

Files:

  • zetaclient/chains/solana/observer/inbound.go
  • zetaclient/chains/base/observer_test.go
  • zetaclient/chains/ton/observer/observer_test.go
  • zetaclient/context/context_test.go
  • zetaclient/chains/evm/observer/observer_test.go
  • zetaclient/chains/bitcoin/observer/db_test.go
  • zetaclient/chains/evm/observer/observer.go
  • zetaclient/context/context.go
  • zetaclient/chains/base/confirmation_test.go
  • zetaclient/testutils/mocks/zetacore_client_opts.go
  • zetaclient/chains/interfaces/zetacore.go
  • zetaclient/zetacore/tx_test.go
  • zetaclient/chains/sui/observer/observer_test.go
  • zetaclient/testutils/mocks/zetacore_client.go
  • pkg/errors/monitor_error.go
  • zetaclient/chains/base/confirmation.go
  • zetaclient/zetacore/client_monitor.go
  • zetaclient/zetacore/client_vote.go
  • zetaclient/chains/bitcoin/observer/db.go
  • zetaclient/chains/base/observer.go
🧬 Code graph analysis (10)
zetaclient/chains/base/observer_test.go (1)
zetaclient/chains/base/observer.go (1)
  • Observer (45-87)
zetaclient/context/context_test.go (3)
zetaclient/context/app.go (1)
  • New (38-45)
zetaclient/config/config_chain.go (1)
  • New (15-32)
zetaclient/context/context.go (3)
  • WithAppContext (15-17)
  • CopyWithTimeout (43-50)
  • FromContext (20-27)
zetaclient/chains/base/confirmation_test.go (1)
zetaclient/chains/base/observer.go (1)
  • Observer (45-87)
zetaclient/chains/interfaces/zetacore.go (1)
pkg/errors/monitor_error.go (1)
  • ErrTxMonitor (6-11)
zetaclient/chains/sui/observer/observer_test.go (1)
pkg/errors/monitor_error.go (1)
  • ErrTxMonitor (6-11)
zetaclient/testutils/mocks/zetacore_client.go (1)
pkg/errors/monitor_error.go (1)
  • ErrTxMonitor (6-11)
zetaclient/zetacore/client_monitor.go (3)
pkg/errors/monitor_error.go (1)
  • ErrTxMonitor (6-11)
pkg/retry/retry.go (1)
  • Retry (126-136)
zetaclient/logs/fields.go (1)
  • FieldZetaTx (25-25)
zetaclient/zetacore/client_vote.go (1)
pkg/errors/monitor_error.go (1)
  • ErrTxMonitor (6-11)
zetaclient/chains/bitcoin/observer/db.go (2)
pkg/chains/bitcoin.go (1)
  • IsBitcoinRegnet (46-48)
zetaclient/chains/bitcoin/observer/observer.go (1)
  • RegnetStartBlock (69-69)
zetaclient/chains/base/observer.go (9)
zetaclient/chains/evm/observer/observer.go (1)
  • Observer (59-73)
zetaclient/chains/bitcoin/observer/observer.go (1)
  • Observer (94-132)
zetaclient/chains/solana/observer/observer.go (1)
  • Observer (70-87)
zetaclient/chains/sui/observer/observer.go (1)
  • Observer (19-32)
zetaclient/chains/ton/observer/observer.go (1)
  • Observer (20-30)
pkg/errors/monitor_error.go (1)
  • ErrTxMonitor (6-11)
zetaclient/context/context.go (1)
  • CopyWithTimeout (43-50)
zetaclient/chains/interfaces/zetacore.go (1)
  • ZetacoreClient (66-124)
zetaclient/logs/fields.go (3)
  • FieldZetaTx (25-25)
  • FieldBallotIndex (35-35)
  • FieldBlock (16-16)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: build-zetanode
  • GitHub Check: analyze (go)
  • GitHub Check: build
  • GitHub Check: gosec
  • GitHub Check: lint
  • GitHub Check: build-and-test
🔇 Additional comments (22)
zetaclient/context/context.go (1)

44-47: Return the cancel function in the error path.

The error path returns only the context from goctx.WithTimeout, discarding the cancel function. This creates inconsistent return behavior and prevents callers from properly releasing resources associated with the timeout context.

Apply this diff to return both values:

 func CopyWithTimeout(from, to goctx.Context, timeout time.Duration) (goctx.Context, goctx.CancelFunc) {
 	app, err := FromContext(from)
 	if err != nil {
-		return goctx.WithTimeout(to, timeout)
+		ctxWithTimeout, cancel := goctx.WithTimeout(to, timeout)
+		return ctxWithTimeout, cancel
 	}
 	ctxWithTimeout, cancel := goctx.WithTimeout(to, timeout)
 	return WithAppContext(ctxWithTimeout, app), cancel
 }

Likely an incorrect or invalid review comment.

zetaclient/chains/base/confirmation_test.go (3)

70-79: LGTM! Test case validates scan range for reset scenario.

The new test case correctly validates the scan range calculation when lastScanned significantly lags behind lastBlock (50 vs 100), ensuring the observer handles forced resets appropriately.


86-86: LGTM! API signature updated correctly.

The call to WithLastBlockScanned now includes the second boolean parameter (false), aligning with the API evolution across the codebase.


144-144: LGTM! Consistent API usage.

The signature update is consistent with the broader API change to WithLastBlockScanned(lastScanned uint64, forceReset bool).

zetaclient/chains/ton/observer/observer_test.go (1)

247-247: LGTM! Mock signature updated for monitoring channel.

The mock expectation correctly accommodates the additional monitorErrCh parameter introduced to PostVoteInbound, enabling error monitoring for inbound vote processing.

zetaclient/chains/solana/observer/inbound.go (1)

57-57: LGTM! API signature updated correctly.

The call to WithLastBlockScanned includes the second boolean parameter (false), properly updating the last scanned block for metrics when no new signatures are found and the slot query succeeds.

zetaclient/chains/bitcoin/observer/db_test.go (1)

99-99: LGTM! Test correctly resets state with updated API.

The call to WithLastBlockScanned(0, false) properly resets the last scanned block to trigger the RPC code path, enabling validation of error handling.

zetaclient/testutils/mocks/zetacore_client_opts.go (1)

50-50: LGTM! Mock helper updated for monitoring channel.

The WithPostVoteInbound helper now correctly mocks the extended PostVoteInbound signature with five parameters, including the monitorErrCh channel for vote monitoring.

zetaclient/chains/evm/observer/observer_test.go (1)

241-241: LGTM: Test correctly updated for new signature.

The addition of the false parameter to WithLastBlockScanned aligns with the broader API change. In this test context, where the last block is intentionally reset to 0 to trigger RPC loading, passing false (no forced reset) is the correct behavior.

zetaclient/zetacore/tx_test.go (2)

240-245: LGTM: Test updated correctly for new signature.

The addition of nil for the monitorErrCh parameter is appropriate for this unit test, which focuses on the "already voted" scenario and doesn't require monitoring channel verification.


282-287: LGTM: Test updated correctly for new signature.

Passing nil for the monitorErrCh parameter is appropriate here, as this test validates the basic monitoring flow without requiring error channel assertions.

zetaclient/chains/interfaces/zetacore.go (2)

13-13: LGTM: Import alias added appropriately.

The zetaerrors alias avoids collision with the standard errors package and clearly identifies the custom error types.


49-54: LGTM: Interface signature extended correctly.

The addition of monitorErrCh chan<- zetaerrors.ErrTxMonitor to PostVoteInbound is well-typed (send-only channel) and consistently propagated across implementations and mocks according to the PR context.

zetaclient/chains/sui/observer/observer_test.go (2)

18-18: LGTM: Import alias added appropriately.

Consistent with the pattern used in other test files to avoid collision with the standard errors package.


605-615: LGTM: Test mock updated correctly for new signature.

The CatchInboundVotes test helper correctly:

  1. Adds the monitorErrCh parameter to the callback signature (unused via _, which is appropriate for this test)
  2. Updates the mock expectation to match the new 5-parameter signature

The test continues to capture inbound votes as intended without requiring monitoring channel verification.

zetaclient/chains/base/observer_test.go (1)

196-196: LGTM: Test correctly updated for new API signature.

The addition of the false parameter aligns with the updated WithLastBlockScanned signature across the codebase.

zetaclient/chains/base/confirmation.go (1)

98-98: LGTM: Example documentation accurately illustrates edge case.

The added example correctly demonstrates the scenario where lastScanned is significantly behind lastBlock, producing 41 unscanned blocks in the range [51, 91]. The parenthetical note provides helpful context about monitoring-driven resets.
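
The arithmetic behind that example can be reproduced with a small sketch. `unscannedRange` is a hypothetical helper, not the PR's `calcUnscannedBlockRange`, and the confirmation count of 10 is an assumption chosen because it reproduces the numbers quoted above (lastBlock 100, lastScanned 50, range [51, 91]).

```go
package main

import "fmt"

// unscannedRange is a hypothetical helper. It returns the inclusive range
// [start, end] of blocks that are confirmed but not yet scanned, where a
// block counts as confirmed once `confirmations` blocks (including itself)
// exist at or above it. ok is false when nothing is left to scan.
func unscannedRange(lastBlock, lastScanned, confirmations uint64) (start, end uint64, ok bool) {
	if confirmations == 0 || lastBlock+1 < confirmations {
		return 0, 0, false
	}
	end = lastBlock + 1 - confirmations // highest confirmed block
	start = lastScanned + 1             // first block not yet scanned
	if start > end {
		return 0, 0, false
	}
	return start, end, true
}

func main() {
	// Assumed confirmation count: 10.
	start, end, ok := unscannedRange(100, 50, 10)
	fmt.Println(start, end, end-start+1, ok) // 51 91 41 true
}
```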

zetaclient/chains/evm/observer/observer.go (1)

264-264: LGTM: Correct parameter added for initial block setup.

The false parameter appropriately indicates this is a normal initialization path rather than a forced reset scenario.

zetaclient/chains/bitcoin/observer/db.go (2)

50-50: LGTM: API signature correctly updated.

The false parameter is appropriate for this initialization path.


55-55: LGTM: Regtest initialization correctly updated.

Consistent usage of the false parameter for the regtest-specific initialization.

zetaclient/zetacore/client_vote.go (2)

159-159: LGTM: Structured error monitoring added.

The new monitorErrCh parameter enables structured error propagation from the monitoring goroutine, improving observability of vote monitoring failures.


193-210: Revise log and reevaluate monitoring context

  • In client_vote.go’s goroutine select, change
    - case <-ctx.Done():
    -     c.logger.Error().Msg("context cancelled: timeout")
    + case <-ctx.Done():
    +     c.logger.Error().Err(ctx.Err()).Msg("context cancelled: failed to send monitor error")
    to report the actual cancellation reason.
  • Confirm whether the monitor goroutine should share the parent ctx (and be cancelled when the request ends) or use a separate context that outlives the caller to avoid dropping monitoring errors.

Member

@lumtis lumtis left a comment

I'm thinking about changing our approach for this one.
Initially, we were in a context where we wanted a quick patch for the missed inbound issue.
This rescan approach was the simplest implementation for it.
But it has drawbacks, such as rescanning blocks that you already know will not contain events, and blocking the scanning process if for some reason one inbound can't be processed.
In the end, I think the best approach would be to keep the current workflow for block scanning and report the events that failed to be observed in an internal tracker cache.
We would then have a separate workflow to iterate over these missed observations, the same as the current behavior with the tracker lists.
This way we ensure that most ZetaClients remain at the same pace of block observation and none falls behind.

@ws4charlie
Contributor Author

ws4charlie commented Oct 1, 2025

I'm thinking about changing our approach for this one. Initially, we were in a context where we wanted a quick patch for the missed inbound issue. This rescan approach was the simplest implementation for it. But it has drawbacks, such as rescanning blocks that you already know will not contain events, and blocking the scanning process if for some reason one inbound can't be processed. In the end, I think the best approach would be to keep the current workflow for block scanning and report the events that failed to be observed in an internal tracker cache. We would then have a separate workflow to iterate over these missed observations, the same as the current behavior with the tracker lists. This way we ensure that most ZetaClients remain at the same pace of block observation and none falls behind.

I also think the previous solution could be improved. Creating zetaclient-internal inbound trackers (like a backlog) is worth trying. I'll move this PR to draft and reopen it later.

@ws4charlie ws4charlie marked this pull request as draft October 1, 2025 19:55
@ws4charlie
Contributor Author

Closing this PR; it is replaced by #4295.

@ws4charlie ws4charlie closed this Oct 2, 2025

Labels

enhancement New feature or request SOLANA_TESTS Run make start-solana-test SUI_TESTS Run make start-sui-tests TON_TESTS Runs TON E2E tests zetaclient Issues related to ZetaClient

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Force rescan when inbound fails to be observed

3 participants