Skip to content

Conversation

@arjan-bal
Copy link
Contributor

@arjan-bal arjan-bal commented Dec 17, 2025

This change adds a new tiered buffer pool that uses power-of-2 tier sizes. It reduces the lookup time for the relevant sizedBufferPool from $O(\log n)$ to $O(1)$, where n is the number of tiers. This creates constant-time lookups independent of the tier count, allowing users to add more tiers without performance overhead.

Benchmarks

Micro-benchmark that measures only the pool query performance, ignoring the allocation time:

func BenchmarkSearch(b *testing.B) {
	defaultBufferPoolSizes := make([]int, len(defaultBufferPoolSizeExponents))
	for i, exp := range defaultBufferPoolSizeExponents {
		defaultBufferPoolSizes[i] = 1 << exp
	}
	b.Run("pool=Tiered", func(b *testing.B) {
		p := NewTieredBufferPool(defaultBufferPoolSizes...).(*tieredBufferPool)
		for b.Loop() {
			for size := range 1 << 19 {
				// One for get, one for put.
				_ = p.getPool(size)
				_ = p.getPool(size)
			}
		}
	})

	b.Run("pool=BinaryTiered", func(b *testing.B) {
		p := NewBinaryTieredBufferPool(defaultBufferPoolSizeExponents...).(*binaryTieredBufferPool)
		for b.Loop() {
			for size := range 1 << 19 {
				_ = p.poolForGet(size)
				_ = p.poolForPut(size)
			}
		}
	})
}

With 5 tiers:

go test -bench=BenchmarkSearch -count=10 -benchmem | benchstat -col '/pool' -
goos: linux
goarch: amd64
pkg: google.golang.org/grpc/mem
cpu: Intel(R) Xeon(R) CPU @ 2.60GHz
          │   Tiered    │            BinaryTiered             │
          │   sec/op    │   sec/op     vs base                │
Search-48   5.353m ± 2%   2.036m ± 0%  -61.97% (p=0.000 n=10)

          │   Tiered   │          BinaryTiered          │
          │    B/op    │    B/op     vs base            │
Search-48   0.000 ± 0%   0.000 ± 0%  ~ (p=1.000 n=10) ¹
¹ all samples are equal

          │   Tiered   │          BinaryTiered          │
          │ allocs/op  │ allocs/op   vs base            │
Search-48   0.000 ± 0%   0.000 ± 0%  ~ (p=1.000 n=10) ¹
¹ all samples are equal

With 9 tiers:

go test -bench=BenchmarkSearch -count=10 -benchmem | benchstat -col '/pool' -
goos: linux
goarch: amd64
pkg: google.golang.org/grpc/mem
cpu: Intel(R) Xeon(R) CPU @ 2.60GHz
          │   Tiered    │            BinaryTiered             │
          │   sec/op    │   sec/op     vs base                │
Search-48   5.659m ± 0%   2.035m ± 0%  -64.04% (p=0.000 n=10)

          │   Tiered   │          BinaryTiered          │
          │    B/op    │    B/op     vs base            │
Search-48   0.000 ± 0%   0.000 ± 0%  ~ (p=1.000 n=10) ¹
¹ all samples are equal

          │   Tiered   │          BinaryTiered          │
          │ allocs/op  │ allocs/op   vs base            │
Search-48   0.000 ± 0%   0.000 ± 0%  ~ (p=1.000 n=10) ¹
¹ all samples are equal

RELEASE NOTES:

  • mem: Add faster tiered buffer pool. Use NewBinaryTieredBufferPool to create such pools.

@arjan-bal arjan-bal added this to the 1.79 Release milestone Dec 17, 2025
@arjan-bal arjan-bal added Type: Performance Performance improvements (CPU, network, memory, etc) Area: Transport Includes HTTP/2 client/server and HTTP server handler transports and advanced transport features. labels Dec 17, 2025
@arjan-bal arjan-bal requested review from dfawley and easwars December 17, 2025 10:25
@codecov
Copy link

codecov bot commented Dec 17, 2025

Codecov Report

❌ Patch coverage is 91.66667% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 83.45%. Comparing base (81a00ce) to head (36d34f7).
⚠️ Report is 16 commits behind head on master.

Files with missing lines Patch % Lines
mem/buffer_pool.go 91.66% 2 Missing and 2 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #8775      +/-   ##
==========================================
+ Coverage   83.22%   83.45%   +0.22%     
==========================================
  Files         418      419       +1     
  Lines       32385    32642     +257     
==========================================
+ Hits        26952    27240     +288     
+ Misses       4050     4026      -24     
+ Partials     1383     1376       -7     
Files with missing lines Coverage Δ
mem/buffer_pool.go 94.33% <91.66%> (-2.28%) ⬇️

... and 37 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@arjan-bal arjan-bal force-pushed the effecient-tiered-buffer-pool branch from bdccc9b to 1f26e66 Compare December 17, 2025 12:28
Copy link
Contributor

@easwars easwars left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought we decided to make it possible for the user to set the default buffer pool for the whole process through an experimental API, and get rid of the existing dial option and the server option. Was your plan to do that in a follow-up?


// Determine the maximum exponent we need to support.
// bits.Len64(math.MaxUint64) is 63.
const maxExponent = 63
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this assume a 64-bit machine?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It works for machines up to 64 bits, i.e. both 32 and 64 bits.

Comment on lines 149 to 151
if exp > maxExponent || exp < 0 {
continue
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we silently ignoring certain exponents that don't meet our criteria? Should we error out instead?

Copy link
Contributor Author

@arjan-bal arjan-bal Dec 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a comment explaining the reason to ignore the values.

  1. For values greater than 63, it's not possible to allocate such large slices. So ignoring them will not cause observable changes. I changed the param type to uint8 to help give linter errors if people pass constant values > 256.
  2. For values < 1, the buffer sizes would be fractional, e.g. $2^{-1}=0.5$. Since slices are of integer sizes, a sized buffer pool with a fractional size will never get used. I changed the param type to uint, making it impossible to pass negative values.

@easwars easwars assigned arjan-bal and unassigned easwars and dfawley Dec 17, 2025
@arjan-bal
Copy link
Contributor Author

I thought we decided to make it possible for the user to set the default buffer pool for the whole process through an experimental API, and get rid of the existing dial option and the server option. Was your plan to do that in a follow-up?

I was waiting for the author of #8770 to raise a PR for exposing a function to set the default buffer pool. See the second part of #8770 (comment).

get rid of the existing dial option and the server option

I'm not sure if we want to do this. Maybe people want to use different buffer pools for each channel.


In this PR, I'm just improving the existing buffer pool implementation.

@arjan-bal arjan-bal assigned easwars and unassigned arjan-bal Dec 18, 2025
@arjan-bal arjan-bal force-pushed the effecient-tiered-buffer-pool branch from b50277e to 8e7ecc4 Compare December 18, 2025 19:51
@arjan-bal arjan-bal force-pushed the effecient-tiered-buffer-pool branch from 8e7ecc4 to 36d34f7 Compare December 18, 2025 19:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Area: Transport Includes HTTP/2 client/server and HTTP server handler transports and advanced transport features. Type: Performance Performance improvements (CPU, network, memory, etc)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants