fix(gRPC): connection pool leak when connection is closed and there are no more subsequent calls #1945
Merged
DMwangnima merged 1 commit into cloudwego:main on Apr 23, 2026
Codecov Report
@@ Coverage Diff @@
## main #1945 +/- ##
==========================================
+ Coverage 61.37% 62.82% +1.44%
==========================================
Files 388 394 +6
Lines 35063 30219 -4844
==========================================
- Hits 21521 18985 -2536
+ Misses 12247 9939 -2308
Partials 1295 1295
6ead596 to c69b3d6
217f0c3 to 4961d52
Previously a closed/broken transport stayed in the nphttp2 client pool
indefinitely, because only a later put() could overwrite the slot.
Restructure the pool into per-slot transportSlot with atomic.Pointer for
lock-free reads and sync.Mutex for serialized writes. Each slot registers
onClose/onGoAway callbacks that evict the transport via CAS-based
removeTransport as soon as it closes or receives GoAway.
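The per-slot layout described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `transport` type, its `onClose` field, and the method bodies are hypothetical stand-ins, and only the names `transportSlot` and `removeTransport` come from the commit message. It assumes Go 1.19+ for `atomic.Pointer`.

```go
package main

import (
	"sync"
	"sync/atomic"
)

// transport is a hypothetical stand-in for the pooled ClientTransport.
type transport struct {
	id      int
	onClose func(*transport) // invoked by Close, if set
}

// Close notifies the registered callback, mimicking a transport that
// reports its own shutdown to the pool.
func (t *transport) Close() {
	if t.onClose != nil {
		t.onClose(t)
	}
}

// transportSlot holds at most one transport.
type transportSlot struct {
	mu  sync.Mutex                // serializes writers
	cur atomic.Pointer[transport] // lock-free reads
}

// get returns the current transport without taking the lock.
func (s *transportSlot) get() *transport { return s.cur.Load() }

// put installs t and registers the eviction callback on it.
func (s *transportSlot) put(t *transport) {
	s.mu.Lock()
	defer s.mu.Unlock()
	t.onClose = s.removeTransport
	s.cur.Store(t)
}

// removeTransport clears the slot only if it still holds t, so a late
// callback from an old transport cannot evict its replacement.
func (s *transportSlot) removeTransport(t *transport) {
	s.cur.CompareAndSwap(t, nil)
}
```

Because eviction is a compare-and-swap on the pointer rather than an unconditional store, a stale close notification is harmless: it only succeeds when the slot still points at the transport that is closing.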
To avoid deadlock between transportSlot.mu and http2Client.mu,
http2Client.Close and handleGoAway now invoke onClose/onGoAway after
releasing t.mu and after the state has been set to closing/draining.
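The lock-ordering rule above can be illustrated with a small sketch. The `client` type below is a hypothetical stand-in for `http2Client`, reduced to a mutex, a state field, and one callback; it is meant to show the ordering (set state, unlock, then call back), not the actual implementation.

```go
package main

import "sync"

type state int

const (
	active state = iota
	closing
)

// client is a hypothetical stand-in for http2Client.
type client struct {
	mu      sync.Mutex
	state   state
	onClose func(c *client)
}

// IsActive takes c.mu, so a callback that calls it would deadlock if
// Close invoked the callback while still holding the lock.
func (c *client) IsActive() bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.state == active
}

// Close sets the state under the lock, releases the lock, and only
// then runs the callback.
func (c *client) Close() {
	c.mu.Lock()
	if c.state == closing {
		c.mu.Unlock()
		return
	}
	c.state = closing
	cb := c.onClose
	c.mu.Unlock()

	if cb != nil {
		cb(c) // safe: c.mu is no longer held, state is already closing
	}
}
```

With this ordering a callback that queries the transport observes the closing state instead of blocking on the mutex.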
Key changes:
- Per-slot atomic.Pointer + Mutex replaces the old []ClientTransport
slice with unsynchronized put/get.
- isNew + store-first-then-recheck eliminates the TOCTOU between
isClosed() and LoadOrStore when racing with Close().
- Get() fast-exits via select{case <-ctx.Done()} when NewStream fails
due to user context timeout, avoiding unnecessary singleflight entry.
- newTransport now closes the raw conn on TLS handshake failure.
- newHTTP2Client now calls t.Close(err) on Flush failure, preventing
half-initialized transport leaks.
- closeStreamTask.Tick removes itself when draining + 0 active streams,
allowing the http2Client to be GCed.
- Close() is idempotent via atomic CAS.
- Callback signatures now take (ctx, trans, err); a new ClientConfig +
NewClientTransportWithConfig expose the new shape. The deprecated
NewClientTransport is preserved as a thin adapter with documented
timing change.
ppzqh reviewed Apr 22, 2026
ppzqh reviewed Apr 22, 2026
4961d52 to 3158415
ppzqh approved these changes Apr 22, 2026
YangruiEmma approved these changes Apr 23, 2026
What type of PR is this?
fix
Check the PR title.
(Optional) Translate the PR title into Chinese.
(Optional) More detailed description for this PR(en: English/zh: Chinese).
en:
- Restructure the pool into per-slot atomic.Pointer + sync.Mutex, replacing the old unsynchronized []ClientTransport slice and fixing the connection leak where closed/broken transports stayed in the pool indefinitely
- Invoke onClose/onGoAway callbacks after releasing http2Client.mu to prevent deadlock when callbacks call IsActive()
- Add an isNew guard + isClosed() check in singleflight to prevent re-inserting closed transports after Clean()/Close()
- Call t.Close(err) on Flush failure during newHTTP2Client init
- Fix the closeStreamTask GC leak: remove the task from the global ticker when the transport is draining with zero active streams
- Add NewClientTransportWithConfig with new callback signatures (ctx, transport, err/reason); deprecate NewClientTransport
zh(optional):
(Optional) Which issue(s) this PR fixes:
(optional) The PR that updates user documentation: