Skip to content

patch: v2.12.1#398

Merged
panjf2000 merged 7 commits into
masterfrom
dev
May 25, 2026
Merged

patch: v2.12.1#398
panjf2000 merged 7 commits into
masterfrom
dev

Conversation

@panjf2000

Copy link
Copy Markdown
Owner

No description provided.

panjf2000 and others added 7 commits April 1, 2026 08:40
…ate (#395)

* fix: resolve data race on poolCommon.once field using atomic.Pointer

The data race occurred between Reboot() writing p.once = &sync.Once{}
and worker goroutines reading p.once.Do(...) concurrently.

Fix: change the once field from *sync.Once to atomic.Pointer[sync.Once]
so all reads (Load) and writes (Store) are atomic.

Agent-Logs-Url: https://github.com/panjf2000/ants/sessions/b2f67c6c-ed78-4758-9185-643b3e09d143

Co-authored-by: panjf2000 <7496278+panjf2000@users.noreply.github.com>

* fix: make Reboot() wait for all workers to exit before resetting pool

The root cause of the data race is that Reboot() resets pool state
(once, allDone) while workers are still shutting down and accessing
those same fields.

Fix: Reboot() now waits on the allDone channel (which is closed when
the last worker exits) before resetting pool state. This ensures no
worker goroutine is running when the fields are reassigned, eliminating
the data race without needing atomic wrappers.

Reverts the atomic.Pointer[sync.Once] approach in favor of this
fundamental fix.

Agent-Logs-Url: https://github.com/panjf2000/ants/sessions/7b1d6e25-546a-40ef-b31d-33e824186b31

Co-authored-by: panjf2000 <7496278+panjf2000@users.noreply.github.com>

* fix: also wait for purge/ticktock goroutines to exit in Reboot()

Ensures the old purge and ticktock goroutines have fully exited
(set their done flags) before Reboot() resets those flags and
starts new goroutines. This prevents a logic race where the old
goroutine could set purgeDone/ticktockDone=1 after Reboot() resets
them to 0.

Agent-Logs-Url: https://github.com/panjf2000/ants/sessions/480234c8-a50f-4dc3-9352-0b337ebfc5d3

Co-authored-by: panjf2000 <7496278+panjf2000@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: panjf2000 <7496278+panjf2000@users.noreply.github.com>
* fix: use atomic.Pointer for once field to prevent data race in Reboot

The data race occurred between Reboot() writing p.once = &sync.Once{}
and worker goroutines reading p.once to call once.Do(close(allDone)).

By using atomic.Pointer[sync.Once], all reads (Load) and writes (Store)
to the once field are properly synchronized.

Agent-Logs-Url: https://github.com/panjf2000/ants/sessions/4a1d1b13-0a7f-4167-ae12-429e7719ea32

Co-authored-by: panjf2000 <7496278+panjf2000@users.noreply.github.com>

* fix: use sync.WaitGroup to ensure all worker goroutines fully exit before Reboot

Revert the superficial atomic.Pointer fix. The real issue is that
Reboot() proceeded after allDone (running count hit 0) while worker
goroutines were still executing deferred cleanup code.

Add workerWg sync.WaitGroup to poolCommon. Each worker goroutine calls
wg.Add(1) before spawning and defer wg.Done() as the outermost defer
(runs last). Reboot() calls workerWg.Wait() after <-allDone to block
until every goroutine has fully exited before resetting pool state.

Agent-Logs-Url: https://github.com/panjf2000/ants/sessions/758f404a-fab6-4a4e-8f42-ea4fdfd2ab9a

Co-authored-by: panjf2000 <7496278+panjf2000@users.noreply.github.com>

* fix: call workerWg.Add(1) before addRunning(1) for defensive ordering

Agent-Logs-Url: https://github.com/panjf2000/ants/sessions/758f404a-fab6-4a4e-8f42-ea4fdfd2ab9a

Co-authored-by: panjf2000 <7496278+panjf2000@users.noreply.github.com>

* fix: move once.Do(close(allDone)) to Release() instead of Reboot()

This is the fundamental fix: Release() is the proper place to close
allDone when no workers are running, since it's the operation that
transitions the pool to CLOSED state. If workers are still running,
the last one to exit closes allDone in its defer as before.

This eliminates the race between Reboot() and running workers on
once.Do, and removes the unnecessary workerWg complexity.

Agent-Logs-Url: https://github.com/panjf2000/ants/sessions/0d78b737-d4ef-4dbd-b021-3e532757c1f9

Co-authored-by: panjf2000 <7496278+panjf2000@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: panjf2000 <7496278+panjf2000@users.noreply.github.com>
Copilot AI review requested due to automatic review settings May 25, 2026 01:18
@github-actions github-actions Bot added docs patch Release patch version labels May 25, 2026
@codecov

codecov Bot commented May 25, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 60.00000% with 14 lines in your changes missing coverage. Please review.
✅ Project coverage is 95.11%. Comparing base (d103005) to head (deeaf50).
⚠️ Report is 16 commits behind head on master.

Files with missing lines Patch % Lines
ants.go 60.00% 4 Missing and 4 partials ⚠️
multipool.go 33.33% 1 Missing and 1 partial ⚠️
multipool_func.go 33.33% 1 Missing and 1 partial ⚠️
multipool_func_generic.go 33.33% 1 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #398      +/-   ##
==========================================
- Coverage   96.81%   95.11%   -1.70%     
==========================================
  Files          14       14              
  Lines         784      798      +14     
==========================================
  Hits          759      759              
- Misses         19       26       +7     
- Partials        6       13       +7     
Flag Coverage Δ
unittests 95.11% <60.00%> (-1.70%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This patch release tightens shutdown/reboot behavior in the core pool implementation, improves resource cleanup during MultiPool construction failures, and applies a few small correctness/documentation tweaks.

Changes:

  • Adjusts pool shutdown/reboot coordination (closing allDone earlier; Reboot() now waits for worker/purge/ticktock completion).
  • Prevents resource leaks in NewMultiPool* constructors by releasing already-created pools on partial construction failure; replaces sentinel int constant with math.MaxInt32.
  • Minor refactors/docs updates (loop-queue binary search local vars; README sponsorship badge; WithNonblocking comment).

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
ants.go Changes pool lifecycle coordination in Release()/Reboot()/ReleaseContext() to avoid hangs and reduce races.
multipool.go Releases already-created pools on constructor error; uses math.MaxInt32 in LeastTasks selection.
multipool_func.go Same constructor cleanup + math.MaxInt32 update for func-based multipool.
multipool_func_generic.go Same constructor cleanup + math.MaxInt32 update for generic func-based multipool.
worker_loop_queue.go Refactors binarySearch locals for clarity/scope reduction.
options.go Fixes WithNonblocking comment to match actual error behavior (ErrPoolOverload).
README.md Updates DigitalOcean sponsorship badge/link format.
README_ZH.md Updates DigitalOcean sponsorship badge/link format (Chinese README).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread ants.go
Comment on lines +496 to +500
p.goPurge()
atomic.StoreInt32(&p.ticktockDone, 0)
p.goTicktock()
p.allDone = make(chan struct{})
p.once = &sync.Once{}
Comment thread ants.go
Comment on lines 466 to 469
// Reboot reboots a closed pool, it does nothing if the pool is not closed.
// If you intend to reboot a closed pool, use ReleaseTimeout() instead of
// Release() to ensure that all workers are stopped and resource are released
// before rebooting, otherwise you may run into data race.
Comment thread ants.go
Comment on lines +484 to +488
runtime.Gosched()
}
}
for atomic.LoadInt32(&p.ticktockDone) != 1 {
runtime.Gosched()
@panjf2000 panjf2000 merged commit 943d026 into master May 25, 2026
22 of 24 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

docs patch Release patch version

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants