VarDiff: downward crush at shares_per_min=0 with deterministic ~0.0378 attractor #32

@dvb-projekt

Summary

When a miner has been temporarily silent (no accepted shares submitted in the current adjustment window) but starts from a high difficulty (e.g., via mining.suggest_difficulty, default_difficulty, or a downstream cache-restore path), the next VarDiff tick can crush the miner's difficulty all the way down to min_difficulty, then settle on a deterministic constant near 0.0378 on the following tick. Recovery from there takes ~10–15 minutes of stepwise upward adjustments — during which an ASIC-class miner is effectively non-contributing to block search.

I'm filing this against upstream: although the log evidence comes from a downstream fork (with a difficulty-resume cache), the math anomaly happens entirely inside upstream suggestedVardiff / maybeAdjustDifficulty, independent of how the miner got to the high diff in the first place.

Symptom (excerpted log)

ASIC miner (Avalon Q, ~90 TH/s rated) connects with effective starting diff 62 269. Pool is in steady-state throttle window (1m40s after restart). First VarDiff tick at ~T+2:30 after reconnect:

13:30:17  set difficulty miner=…AvalonQ requested_diff=62269 clamped_diff=62269
13:32:42  vardiff adjust miner=…AvalonQ shares_per_min=0 old_diff=62269 new_diff=0.001       ← crush to min_difficulty
13:33:28  vardiff adjust miner=…AvalonQ shares_per_min=0 old_diff=0.001 new_diff=0.03789442986798398   ← attractor
13:35:28  vardiff adjust miner=…AvalonQ shares_per_min=24.5    old_diff=0.0378  new_diff=0.815
13:37:28  vardiff adjust miner=…AvalonQ shares_per_min=526     old_diff=0.815   new_diff=18
13:39:28  vardiff adjust miner=…AvalonQ shares_per_min=11609   old_diff=18      new_diff=387
13:41:28  vardiff adjust miner=…AvalonQ shares_per_min=249940  old_diff=387     new_diff=8331
13:43:28  vardiff adjust miner=…AvalonQ shares_per_min=545450  old_diff=8331    new_diff=18182
13:44:16  vardiff adjust miner=…AvalonQ shares_per_min=577560  old_diff=18182   new_diff=36364

Steady-state diff before the restart was ~62k, and it took ~14 minutes to ramp back to ~36k.
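
To make the shape of the ramp explicit, the per-tick growth factor can be computed from the difficulty values in the log above. This is only a sketch over the logged numbers, not pool code; the helper name stepRatios is made up for illustration.

```go
package main

import "fmt"

// stepRatios returns diffs[i]/diffs[i-1] for each consecutive pair.
func stepRatios(diffs []float64) []float64 {
	ratios := make([]float64, 0, len(diffs))
	for i := 1; i < len(diffs); i++ {
		ratios = append(ratios, diffs[i]/diffs[i-1])
	}
	return ratios
}

func main() {
	// Difficulty after each tick from 13:33:28 to 13:44:16, rounded as logged.
	diffs := []float64{0.0378, 0.815, 18, 387, 8331, 18182, 36364}
	for i, r := range stepRatios(diffs) {
		fmt.Printf("recovery tick %d: x%.2f\n", i+1, r)
	}
}
```

On these numbers the first four recovery ticks each raise the difficulty by roughly 21 to 22 times and the last two by roughly 2 times, which is consistent with a per-tick limit on upward moves rather than a single jump back to the pre-restart value.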

Two distinct issues

1. Downward crush at shares_per_min=0

The log message at miner_rejects.go:859 computes accRate from snap.RollingHashrate only:

accRate := 0.0
if snap.RollingHashrate > 0 {
    accRate = (snap.RollingHashrate / hashPerShare) * 60
}

So shares_per_min=0 in the log just means the rolling-EMA hashrate hasn't been populated yet (which is normal in the initial-EMA-window phase). The actual decision in suggestedVardiff uses a fallback path (lines ~939-955) where rollingHashrate = (windowDifficulty * hashPerShare) / windowSeconds. If a couple of fractional partial shares ended up in windowDifficulty (or none at all during the steady-state throttle), the inferred hashrate is essentially zero, and targetDiff = rollingHashrate * 60 / targetShares collapses to min_difficulty.
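
The collapse can be reproduced on paper. The sketch below is not the upstream code: it assumes the conventional Bitcoin relation of 2^32 expected hashes per difficulty-1 share, and the names fallbackHashrate, targetDiff, clampDiff and the upper bound are illustrative. It shows that an empty window drives the raw target to zero, so the clamp to min_difficulty is what ends up being applied.

```go
package main

import "fmt"

// Assumed: expected hashes per difficulty-1 share (2^32).
const hashesPerDiff1 = 4294967296.0

// fallbackHashrate infers a hashrate from the summed difficulty of the
// shares accepted during a window of windowSeconds.
func fallbackHashrate(windowDifficulty, windowSeconds float64) float64 {
	if windowSeconds <= 0 {
		return 0
	}
	return windowDifficulty * hashesPerDiff1 / windowSeconds
}

// targetDiff is the difficulty at which hashrate would produce
// targetSharesPerMin shares per minute.
func targetDiff(hashrate, targetSharesPerMin float64) float64 {
	return hashrate * 60 / (targetSharesPerMin * hashesPerDiff1)
}

// clampDiff bounds d to the closed interval [lo, hi].
func clampDiff(d, lo, hi float64) float64 {
	if d < lo {
		return lo
	}
	if d > hi {
		return hi
	}
	return d
}

func main() {
	const minDiff, maxDiff = 0.001, 1e9 // maxDiff is a placeholder value

	// Healthy window: 15 shares at diff 62269 accepted in 60 s.
	healthy := fallbackHashrate(15*62269, 60)
	fmt.Println("healthy:", clampDiff(targetDiff(healthy, 15), minDiff, maxDiff))

	// Starved window: nothing accepted, so the inferred hashrate is 0.
	starved := fallbackHashrate(0, 60)
	fmt.Println("starved:", clampDiff(targetDiff(starved, 15), minDiff, maxDiff))
}
```

The point of the sketch is that a window with no accepted shares carries no information about the miner's hashrate, yet the arithmetic treats it as a measurement of zero.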

Suggested fix: While initialEMAWindowDone == false AND the connection age is below some grace threshold, defer any downward adjustment. Only allow upward adjustments during bootstrap (a miner submitting many shares at low diff is a clear signal; a miner submitting no shares at high diff during the warmup is not a clear signal — it could be stratum-side latency or throttle).

2. Deterministic numerical attractor at ~0.0378

After the crush described in issue 1, the next VarDiff tick (old_diff=0.001, shares_per_min=0) produces new_diff=0.03789442986798398. I observed this in two independent restarts of the same pool:

| Restart timestamp | New diff after attractor tick |
| --- | --- |
| 2026-05-21T11:35:05 | 0.0378777060693864 |
| 2026-05-21T13:33:28 | 0.03789442986798398 |

The two values differ by about 0.044 % (absolute difference ≈ 1.7e-5). Both restarts had shares_per_min=0 reported in the log and old_diff=0.001 (the floor). This suggests that the adjustment formula, when fed degenerate inputs (rolling EMA still bootstrapping, fallback rate computed from a tiny non-zero windowDifficulty / windowSeconds), has a numerical fixed point around this value rather than taking the expected fallback (no-op or stay at old_diff).

Suggested fix: Guard the targetDiff computation against the degenerate-input case. Something like:

if rollingHashrate <= 0 || !mc.initialEMAWindowDone.Load() {
    return currentDiff   // no adjustment while bootstrapping
}

…added near the top of the fallback block in suggestedVardiff.

Reproducer (without a resume-cache)

  1. target_shares_per_min=15, default_difficulty=62269, min_difficulty=0.001, hashrate_ema_tau_seconds=450 (anything > 60 should work — the larger the tau, the longer the bootstrap window).
  2. Connect a high-hashrate ASIC. It will start at diff 62 269 (via default_difficulty).
  3. Either:
    • (a) Add a brief network stall (firewall the miner for ~90 seconds), or
    • (b) Run with max_accepts_per_second and accept_steady_state_window set such that the stratum throttle starves the miner during the first VarDiff window.
  4. Observe: At ~T+2:30 the first VarDiff tick crushes to min_difficulty; at ~T+3:30 the next tick lands on ~0.0378; long ramp-back follows.

Smaller miners (~150 GH/s class) recover from this almost invisibly — at diff 0.001 they produce 10⁴ shares/sec, so the next tick already has plausible EMA input and converges in 1–2 iterations. Only ASIC-class miners with optimal diff > ~10 000 show user-visible damage.
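
The difference between the two miner classes follows from the expected share rate, hashrate / (difficulty × 2^32). The sketch below assumes that conventional Bitcoin relation; sharesPerSecond is an illustrative helper, not a pool function.

```go
package main

import "fmt"

// Assumed: expected hashes per difficulty-1 share (2^32).
const hashesPerDiff1 = 4294967296.0

// sharesPerSecond is the expected share rate for a miner with the given
// hashrate (hashes per second) working at difficulty diff.
func sharesPerSecond(hashrate, diff float64) float64 {
	return hashrate / (diff * hashesPerDiff1)
}

func main() {
	// Small miner parked at the min_difficulty floor.
	fmt.Printf("150 GH/s at diff 0.001: %.0f shares/s\n", sharesPerSecond(150e9, 0.001))
	// ASIC-class miner at the pre-restart difficulty.
	fmt.Printf("90 TH/s at diff 62269: %.1f shares/min\n", sharesPerSecond(90e12, 62269)*60)
}
```

Under that assumption a 150 GH/s miner at the 0.001 floor is expected to find on the order of 10⁴ shares per second, so a single window already contains plenty of data, whereas a 90 TH/s miner has to climb several orders of magnitude in difficulty before its share rate is back near the target.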

Code pointers

  • Log site: miner_rejects.go:859
  • Decision: miner_rejects.go:892 suggestedVardiff (the fallback block when rollingHashrate <= 0)
  • Outer loop: miner_rejects.go:834 maybeAdjustDifficulty
  • Bootstrap flag: miner_types.go MinerConn.initialEMAWindowDone

Environment

  • M45-goPool main HEAD (commit b555819 as of 2026-05-21), observed in a downstream fork compiled from this main; the math/log lines match upstream so the bug is upstream.
  • VarDiff constants left at defaults: step=2, damping=0.7, adjustment_window=60s, retarget_delay=30s.

Workaround for affected operators

For ASIC miners that submit a hashed-difficulty hint, lock_suggested_difficulty=true plus a mining.suggest_difficulty or password hint (d=65536) appears to apply a per-connection lock that prevents this crush from firing. Confirmed empirically: AvalonQ stable at diff=65 536 over 30+ minutes after connect, no VarDiff adjustments.
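
For reference, a client-side mining.suggest_difficulty request is a single JSON-RPC line. The sketch below builds one in Go; the request id of 1 and the helper name suggestDifficultyLine are arbitrary, and the parameter layout (a one-element array holding the difficulty) follows the common Stratum convention rather than anything confirmed for this pool.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// stratumRequest is a minimal JSON-RPC request as used by Stratum v1.
type stratumRequest struct {
	ID     int       `json:"id"`
	Method string    `json:"method"`
	Params []float64 `json:"params"`
}

// suggestDifficultyLine returns the JSON for a mining.suggest_difficulty
// request asking the pool for the given starting difficulty.
func suggestDifficultyLine(id int, diff float64) (string, error) {
	b, err := json.Marshal(stratumRequest{
		ID:     id,
		Method: "mining.suggest_difficulty",
		Params: []float64{diff},
	})
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	line, err := suggestDifficultyLine(1, 65536)
	if err != nil {
		panic(err)
	}
	fmt.Println(line)
}
```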
