-
Notifications
You must be signed in to change notification settings - Fork 355
Add pooled ByteBuffer allocator with size classes #578
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Introduce PooledByteBufferAllocator with global buckets and per-thread caches and use it in HTTP/2 FrameFactory.
c0f67fd to
a7335a4
Compare
…lculation while preserving existing behaviour.
2eea338 to
ac81ac2
Compare
|
@arturobernalg I may be wrong but I was under impression that many (if not all) Java frameworks arrived at the same conclusion that memory pooling was counterproductive as of Java 8 given the efficiency of modern garbage collection algorithms. I will run the micro-benchmark locally and look at the results, but it may take me a certain while. Generally I see no problem with providing pluggable allocators as long as the simple one remains default and you are willing to maintain more complex ones. @rschmitt do you happen to have an opinion on this matter? |
|
@ok2c I'm going to ask one or two more qualified people for an opinion and get back to you. My understanding is that object pooling can outperform garbage collection, but it's harder to do than you'd think. (There's also the question of what "outperform" means. What are we measuring, tail latencies? CPU overhead? Heap footprint?) Pooled buffers also come with a lot of risks, like increased potential for memory leaks, or security vulnerabilities such as buffer over-reads. The Javadoc says that the I guess the main concern I have here is the effectiveness of adding buffer pooling retroactively, compared with the cost in code churn. Typically what I see is frameworks or applications that are designed from the ground up to be garbage-free or zero-copy or what have you. I think this proposal would be more persuasive if I knew what we were measuring and what our performance target is, and what the hotspots currently are for ephemeral garbage. Can they be addressed with a minimum of API churn? (I find it's very difficult to thread new parameters deep into HttpComponents; if we implemented pooling, I'd prefer to make it a purely internal optimization, and an implementation detail. We should be more hesitant to increase our API surface area.) Finally, I think it's a little late in the development cycle for httpcore 5.4 to be considering such a change. Any usage of pooling in the HTTP/2 or TLS or IOReactor implementation should probably be gated behind a system property and considered experimental. |
Covers mixed-route workloads with slow discard and expiry paths Test-only change, no impact on public API or defaults
|
@olegk @rschmitt I’ve added a small JMH benchmark that exercises the old and new pool under mixed routes with slow discard / expiry. To clarify the Netty reference: the allocator is conceptually closest to Netty 4.1s
|
|
I asked Aleksey Shipilëv for his thoughts:
Here, "special lifecycle" refers to things like finalizers, Cleaners, weak references, etc.; nothing that would apply to a simple byte buffer. Another interesting point that came up is that if you use heap (non-direct) byte buffers, and if the pool doesn't hold on to byte buffer references while they are leased out, then there is no risk of a memory leak: returning the buffer to the pool is purely an optimization. Since |
|
@rschmitt Thank you so much for such an informative summary. Please convey my gratitude to Aleksey. One thing bugs me is how big is big? How big should be byte buffers to justify pooling? If it is a couple of MB, then memory pooling may be useful in our case.
I think we have objects with "special lifecycle" in the HttpClient Caching module only but they are backed by files and not byte buffers. There is nothing else I can think of. However, I imagine the classic on async facade may actually qualify as a potential beneficiary of the pooled memory allocator, so I am leaning towards approving this change-set and letting @arturobernalg proceed with further experiments. What do you think? |
This is the first step towards a pluggable pooled ByteBuffer allocator.
The patch adds PooledByteBufferAllocator (power-of-two size buckets, global pool + per-thread caches) and switches the HTTP/2 FrameFactory and the benchmark to use ByteBufferAllocator. Behaviour is unchanged except for using pooled buffers for small control frames.
If this direction looks reasonable, I’ll follow up by threading the allocator through IOSession/SSLIOSession and the async codecs and add minimal metrics; otherwise I’ll keep it local to HTTP/2.
ByteBuffer allocator throughput (JMH)
@ok2c WDYT?