From 426007f70f9426dc9417b894a542ae5046635b2e Mon Sep 17 00:00:00 2001 From: fengyubiao Date: Tue, 15 Jul 2025 11:31:25 +0800 Subject: [PATCH 01/11] 1 --- pip/pip-434.md | 76 ++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 76 insertions(+) create mode 100644 pip/pip-434.md diff --git a/pip/pip-434.md b/pip/pip-434.md new file mode 100644 index 0000000000000..181b52eae8040 --- /dev/null +++ b/pip/pip-434.md @@ -0,0 +1,76 @@ +# PIP-434: Expose Netty channel configuration WRITE_BUFFER_WATER_MARK to pulsar conf + +# Background knowledge & Motivation + +As we discussed along the discussion: https://lists.apache.org/thread/6jfs02ovt13mnhn441txqy5m6knw6rr8 + +> Problem Statement: +> We've encountered a critical issue in our Apache Pulsar clusters where brokers experience Out-Of-Memory (OOM) errors and continuous restarts under specific load patterns. This occurs when Netty channel write buffers become full, leading to a buildup of unacknowledged responses in the broker's memory. + +> Background: +> Our clusters are configured with numerous namespaces, each containing approximately 8,000 to 10,000 topics. Our consumer applications are quite large, with each consumer using a regular expression (regex) pattern to subscribe to all topics within a namespace. + +> The problem manifests particularly during consumer application restarts. When a consumer restarts, it issues a getTopicsOfNamespace request. Due to the sheer number of topics, the response size is extremely large. This massive response overwhelms the socket output buffer, causing it to fill up rapidly. Consequently, the broker's responses get backlogged in memory, eventually leading to the broker's OOM and subsequent restart loop. + +> Solution we got: +> - Expose Netty channel configuration WRITE_BUFFER_WATER_MARK to pulsar conf +> - Stops receive requests continuously once the Netty channel is unwritable, users can use the new config to control the threshold that limits the max bytes that are pending write. + +# Goals + +## In Scope +- Expose Netty channel configuration WRITE_BUFFER_WATER_MARK to pulsar conf +- Stops receive requests continuously once the Netty channel is unwritable, users can use the new config to control the threshold that limits the max bytes that are pending write. + +## Out of Scope + +- This proposal is not in order to add a broker level memory limitation, it only focuses on addressing the OOM caused by the accumulation of a large number of responses in memory due to the channel granularity being unwritable. + +# Detailed Design + +### Configuration + +```shell +# It relates to configuration "WriteBufferHighWaterMark" of Netty Channel Config. If the number of bytes queued in the write buffer exceeds this value, channel writable state will start to return "false". +pulsarChannelWriteBufferHighWaterMark=64k +# It relates to configuration "WriteBufferLowWaterMark" of Netty Channel Config. If the number of bytes queued in the write buffer is smaller than this value, channel writable state will start to return "true". +pulsarChannelWriteBufferLowWaterMark=32k +``` + +### CLI + +### Metrics +| Name | Description | Attributes | Units| +|------------------------------------------------------|---------------------------------------------------------------------------------------------|--------------| --- | +| `pulsar_server_channel_write_buf_memory_used_bytes` | Counter. The number of replicators. | cluster | - | + + +# Monitoring + + +# Security Considerations +Nothing. + +# Backward & Forward Compatibility + +## Upgrade +Nothing. + +## Downgrade / Rollback +Nothing. + +## Pulsar Geo-Replication Upgrade & Downgrade/Rollback Considerations +Nothing. + +# Alternatives +Nothing. + +# General Notes + +# Links + + +* Mailing List discussion thread: +* Mailing List voting thread: From f459966b868629e62fe1ae5d8b02e025f52124a7 Mon Sep 17 00:00:00 2001 From: fengyubiao <9947090@qq.com> Date: Tue, 15 Jul 2025 11:41:41 +0800 Subject: [PATCH 02/11] Update pip-434.md --- pip/pip-434.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/pip/pip-434.md b/pip/pip-434.md index 181b52eae8040..d2fa6e0377106 100644 --- a/pip/pip-434.md +++ b/pip/pip-434.md @@ -72,5 +72,5 @@ Nothing. -* Mailing List discussion thread: +* Mailing List discussion thread: https://lists.apache.org/thread/hnbm9q3yvyf2wcbdggxmjzhr9boorqkn * Mailing List voting thread: From dcbe255f1ece911fd4b90caf3fd7ec3f32eb75c2 Mon Sep 17 00:00:00 2001 From: fengyubiao Date: Tue, 15 Jul 2025 11:48:55 +0800 Subject: [PATCH 03/11] 2 --- pip/pip-434.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/pip/pip-434.md b/pip/pip-434.md index d2fa6e0377106..d769b33ffe81e 100644 --- a/pip/pip-434.md +++ b/pip/pip-434.md @@ -1,4 +1,4 @@ -# PIP-434: Expose Netty channel configuration WRITE_BUFFER_WATER_MARK to pulsar conf +# PIP-434: Expose Netty channel configuration WRITE_BUFFER_WATER_MARK to pulsar conf and pause receive requests when channel is unwritable # Background knowledge & Motivation From 09281bcf559682db6cb2ef2a825328783da24bd4 Mon Sep 17 00:00:00 2001 From: fengyubiao <9947090@qq.com> Date: Wed, 16 Jul 2025 10:32:25 +0800 Subject: [PATCH 04/11] Update pip-434.md --- pip/pip-434.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/pip/pip-434.md b/pip/pip-434.md index d769b33ffe81e..18f6ec66bf3d6 100644 --- a/pip/pip-434.md +++ b/pip/pip-434.md @@ -35,6 +35,8 @@ As we discussed along the discussion: https://lists.apache.org/thread/6jfs02ovt1 pulsarChannelWriteBufferHighWaterMark=64k # It relates to configuration "WriteBufferLowWaterMark" of Netty Channel Config. If the number of bytes queued in the write buffer is smaller than this value, channel writable state will start to return "true". pulsarChannelWriteBufferLowWaterMark=32k +# Once the writer buffer is full, the channel stops dealing with new requests until it changes to writable +pulsarChannelPauseReceivingRequestsIfUnwritable=false ``` ### CLI From 76b818cf620e34d1166d6ae97ab68fa851c5c0a9 Mon Sep 17 00:00:00 2001 From: fengyubiao <9947090@qq.com> Date: Thu, 24 Jul 2025 22:56:34 +0800 Subject: [PATCH 05/11] Update pip-434.md --- pip/pip-434.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/pip/pip-434.md b/pip/pip-434.md index 18f6ec66bf3d6..8add24b2b3965 100644 --- a/pip/pip-434.md +++ b/pip/pip-434.md @@ -75,4 +75,4 @@ Nothing. Updated afterwards --> * Mailing List discussion thread: https://lists.apache.org/thread/hnbm9q3yvyf2wcbdggxmjzhr9boorqkn -* Mailing List voting thread: +* Mailing List voting thread: https://lists.apache.org/thread/vpvtf4jnbbrhsy9y5fg00mpz9qhb0cp5 From 7e86225304007b535ee1428a21d7d8fc5eff8b39 Mon Sep 17 00:00:00 2001 From: fengyubiao <9947090@qq.com> Date: Wed, 30 Jul 2025 16:59:24 +0800 Subject: [PATCH 06/11] Update pip-434.md --- pip/pip-434.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/pip/pip-434.md b/pip/pip-434.md index 8add24b2b3965..6014aa57d6f85 100644 --- a/pip/pip-434.md +++ b/pip/pip-434.md @@ -44,7 +44,7 @@ pulsarChannelPauseReceivingRequestsIfUnwritable=false ### Metrics | Name | Description | Attributes | Units| |------------------------------------------------------|---------------------------------------------------------------------------------------------|--------------| --- | -| `pulsar_server_channel_write_buf_memory_used_bytes` | Counter. The number of replicators. | cluster | - | +| `pulsar_server_channel_write_buf_memory_used_bytes` | Counter. The memory amount that is occupied by netty write buffers | cluster | - | # Monitoring From f5558ddbffbcbacb288b5d77fc635f1a14a6cb3a Mon Sep 17 00:00:00 2001 From: fengyubiao <9947090@qq.com> Date: Thu, 7 Aug 2025 23:44:14 +0800 Subject: [PATCH 07/11] Update pip-434.md --- pip/pip-434.md | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/pip/pip-434.md b/pip/pip-434.md index 6014aa57d6f85..a680037d44238 100644 --- a/pip/pip-434.md +++ b/pip/pip-434.md @@ -39,6 +39,16 @@ pulsarChannelWriteBufferLowWaterMark=32k pulsarChannelPauseReceivingRequestsIfUnwritable=false ``` +### How it works +With the settings `pulsarChannelPauseReceivingRequestsIfUnwritable=false`, the behaviour is exactly the same as the previous. + +After setting `pulsarChannelPauseReceivingRequestsIfUnwritable` to `true`, the channel state will be changed as follows. +- Netty sets `channel.writable` to `false` when there is too much data that is waiting to be sent out(the size of the data cached in `ChannelOutboundBuffer` is larger than `{pulsarChannelWriteBufferHighWaterMark}`) + - Netty will trigger an event `channelWritabilityChanged` +- Stops receiving requests that come into the channel, which replies on the attribute `autoRead` of the `Netty channel`. +- Netty sets `channel.writable` to `true` once the size of the data that is waiting to be sent out is less than `{pulsarChannelWriteBufferLowWaterMark}` +- Starts to receive requests(sets `channel.autoRead` to `true`). + ### CLI ### Metrics From 4626238390c7b5c4d988a1c5c297395a82bd6c90 Mon Sep 17 00:00:00 2001 From: fengyubiao <9947090@qq.com> Date: Fri, 8 Aug 2025 16:33:37 +0800 Subject: [PATCH 08/11] Update pip-434.md --- pip/pip-434.md | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/pip/pip-434.md b/pip/pip-434.md index a680037d44238..c3c888f3a3b6b 100644 --- a/pip/pip-434.md +++ b/pip/pip-434.md @@ -37,6 +37,10 @@ pulsarChannelWriteBufferHighWaterMark=64k pulsarChannelWriteBufferLowWaterMark=32k # Once the writer buffer is full, the channel stops dealing with new requests until it changes to writable pulsarChannelPauseReceivingRequestsIfUnwritable=false +After the connection is recovered from an unreadable state, the channel will be rate-limited for a period of time to avoid overwhelming due to the backlog of requests. This parameter defines how many" requests should be allowed in the rate limiting period. +pulsarChannelRateLimitingRateAfterResumeFromUnreadable=1000 +After the connection is recovered from an unreadable state, the channel will be rate-limited for a period of time to avoid overwhelming due to the backlog of requests. This parameter defines how long the rate limiting should last, in seconds. Once the bytes that are waiting to be sent out reach the pulsarChannelWriteBufferHighWaterMark\", the timer will be reset. +pulsarChannelRateLimitingSecondsAfterResumeFromUnreadable=5 ``` ### How it works @@ -45,9 +49,11 @@ With the settings `pulsarChannelPauseReceivingRequestsIfUnwritable=false`, the b After setting `pulsarChannelPauseReceivingRequestsIfUnwritable` to `true`, the channel state will be changed as follows. - Netty sets `channel.writable` to `false` when there is too much data that is waiting to be sent out(the size of the data cached in `ChannelOutboundBuffer` is larger than `{pulsarChannelWriteBufferHighWaterMark}`) - Netty will trigger an event `channelWritabilityChanged` -- Stops receiving requests that come into the channel, which replies on the attribute `autoRead` of the `Netty channel`. +- Stops receiving requests that come into the channel, which relies on the attribute `autoRead` of the `Netty channel`. - Netty sets `channel.writable` to `true` once the size of the data that is waiting to be sent out is less than `{pulsarChannelWriteBufferLowWaterMark}` - Starts to receive requests(sets `channel.autoRead` to `true`). + - To avoid avalanches, Pulsar will start a timed rate-limiter, which limits the rate of handling the request backlog("pulsarChannelRateLimitingRateAfterResumeFromUnreadable" requests per second). + - After "{pulsarChannelRateLimitingSecondsAfterResumeFromUnreadable}" seconds, the rate-limiter will be removed automatically. Once the bytes that are waiting to be sent out reach the pulsarChannelWriteBufferHighWaterMark\", the timer will be reset. ### CLI From bca77d83ba5c2376c14f69c1c1e989932525fd82 Mon Sep 17 00:00:00 2001 From: fengyubiao <9947090@qq.com> Date: Wed, 3 Sep 2025 13:03:45 +0800 Subject: [PATCH 09/11] Update pip-434.md --- pip/pip-434.md | 1 + 1 file changed, 1 insertion(+) diff --git a/pip/pip-434.md b/pip/pip-434.md index c3c888f3a3b6b..ea851a5bbaa13 100644 --- a/pip/pip-434.md +++ b/pip/pip-434.md @@ -52,6 +52,7 @@ After setting `pulsarChannelPauseReceivingRequestsIfUnwritable` to `true`, the c - Stops receiving requests that come into the channel, which relies on the attribute `autoRead` of the `Netty channel`. - Netty sets `channel.writable` to `true` once the size of the data that is waiting to be sent out is less than `{pulsarChannelWriteBufferLowWaterMark}` - Starts to receive requests(sets `channel.autoRead` to `true`). + - Note: It will use `ServerCnxThrottleTracker`, which will track the "throttle count". When a throttling condition is present, the throttle count is increased and when it's no more present, the count is decreased. The autoread should be switched to false when the counter value goes from 0 to 1 and only when it goes back from 1 to 0 should it set to true again. The autoread flag is no more controlled directly from the rate limiters. Rate limiters are only responsible for their part and it's ServerCnxThrottleTracker that decides when autoread flag is toggled. See more details [pip-322](https://github.com/apache/pulsar/blob/master/pip/pip-322.md) - To avoid avalanches, Pulsar will start a timed rate-limiter, which limits the rate of handling the request backlog("pulsarChannelRateLimitingRateAfterResumeFromUnreadable" requests per second). - After "{pulsarChannelRateLimitingSecondsAfterResumeFromUnreadable}" seconds, the rate-limiter will be removed automatically. Once the bytes that are waiting to be sent out reach the pulsarChannelWriteBufferHighWaterMark\", the timer will be reset. From fb84527133332bc780a41f420ffc4ab702e288dd Mon Sep 17 00:00:00 2001 From: fengyubiao <9947090@qq.com> Date: Wed, 3 Sep 2025 13:04:17 +0800 Subject: [PATCH 10/11] Update pip-434.md --- pip/pip-434.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/pip/pip-434.md b/pip/pip-434.md index ea851a5bbaa13..dba636898fe49 100644 --- a/pip/pip-434.md +++ b/pip/pip-434.md @@ -52,7 +52,7 @@ After setting `pulsarChannelPauseReceivingRequestsIfUnwritable` to `true`, the c - Stops receiving requests that come into the channel, which relies on the attribute `autoRead` of the `Netty channel`. - Netty sets `channel.writable` to `true` once the size of the data that is waiting to be sent out is less than `{pulsarChannelWriteBufferLowWaterMark}` - Starts to receive requests(sets `channel.autoRead` to `true`). - - Note: It will use `ServerCnxThrottleTracker`, which will track the "throttle count". When a throttling condition is present, the throttle count is increased and when it's no more present, the count is decreased. The autoread should be switched to false when the counter value goes from 0 to 1 and only when it goes back from 1 to 0 should it set to true again. The autoread flag is no more controlled directly from the rate limiters. Rate limiters are only responsible for their part and it's ServerCnxThrottleTracker that decides when autoread flag is toggled. See more details [pip-322](https://github.com/apache/pulsar/blob/master/pip/pip-322.md) + - Note: relies on `ServerCnxThrottleTracker`, which will track the "throttle count". When a throttling condition is present, the throttle count is increased and when it's no more present, the count is decreased. The autoread should be switched to false when the counter value goes from 0 to 1 and only when it goes back from 1 to 0 should it set to true again. The autoread flag is no more controlled directly from the rate limiters. Rate limiters are only responsible for their part and it's ServerCnxThrottleTracker that decides when autoread flag is toggled. See more details [pip-322](https://github.com/apache/pulsar/blob/master/pip/pip-322.md) - To avoid avalanches, Pulsar will start a timed rate-limiter, which limits the rate of handling the request backlog("pulsarChannelRateLimitingRateAfterResumeFromUnreadable" requests per second). - After "{pulsarChannelRateLimitingSecondsAfterResumeFromUnreadable}" seconds, the rate-limiter will be removed automatically. Once the bytes that are waiting to be sent out reach the pulsarChannelWriteBufferHighWaterMark\", the timer will be reset. From 2f9c9d3701c0e5adfc39c11b7b0a2b32d4d2a7fa Mon Sep 17 00:00:00 2001 From: fengyubiao <9947090@qq.com> Date: Tue, 9 Sep 2025 10:27:59 +0800 Subject: [PATCH 11/11] Update pip-434.md --- pip/pip-434.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/pip/pip-434.md b/pip/pip-434.md index dba636898fe49..6f6ee872984c4 100644 --- a/pip/pip-434.md +++ b/pip/pip-434.md @@ -52,8 +52,8 @@ After setting `pulsarChannelPauseReceivingRequestsIfUnwritable` to `true`, the c - Stops receiving requests that come into the channel, which relies on the attribute `autoRead` of the `Netty channel`. - Netty sets `channel.writable` to `true` once the size of the data that is waiting to be sent out is less than `{pulsarChannelWriteBufferLowWaterMark}` - Starts to receive requests(sets `channel.autoRead` to `true`). - - Note: relies on `ServerCnxThrottleTracker`, which will track the "throttle count". When a throttling condition is present, the throttle count is increased and when it's no more present, the count is decreased. The autoread should be switched to false when the counter value goes from 0 to 1 and only when it goes back from 1 to 0 should it set to true again. The autoread flag is no more controlled directly from the rate limiters. Rate limiters are only responsible for their part and it's ServerCnxThrottleTracker that decides when autoread flag is toggled. See more details [pip-322](https://github.com/apache/pulsar/blob/master/pip/pip-322.md) - - To avoid avalanches, Pulsar will start a timed rate-limiter, which limits the rate of handling the request backlog("pulsarChannelRateLimitingRateAfterResumeFromUnreadable" requests per second). + - Note: relies on `ServerCnxThrottleTracker`, which will track the "throttle count". When a throttling condition is present, the throttle count is increased and when it's no more present, the count is decreased. The autoread should be switched to false when the counter value goes from 0 to 1, and only when it goes back from 1 to 0 should it be set to true again. The autoread flag is no longer controlled directly from the rate limiters. Rate limiters are only responsible for their part, and it's ServerCnxThrottleTracker that decides when the autoread flag is toggled. See more details [pip-322](https://github.com/apache/pulsar/blob/master/pip/pip-322.md) + - To avoid handling a huge request in the backlog instantly, Pulsar will start a timed rate-limiter, which limits the rate of handling the request backlog("pulsarChannelRateLimitingRateAfterResumeFromUnreadable" requests per second). - After "{pulsarChannelRateLimitingSecondsAfterResumeFromUnreadable}" seconds, the rate-limiter will be removed automatically. Once the bytes that are waiting to be sent out reach the pulsarChannelWriteBufferHighWaterMark\", the timer will be reset. ### CLI