Replies: 1 comment
@spyroska I don't see how the fact that these messages have a TTL is relevant. If a Hutch process terminates completely abruptly, then all outstanding deliveries of consumers that use manual acknowledgements will be automatically re-queued, whether they use TTL or not. The "in-flight deliveries" problem has nothing to do with TTL, and messages with a short TTL can be deleted without being consumed in plenty of other scenarios. If that is unacceptable to you, don't use message TTL, or use a much higher message TTL and automatic acknowledgements.
Hello,
I would like to report an issue and submit a PR for your review. The issue description, a minimal working example, and a proposed solution are presented below.
Issue Description
What is the issue?
Messages that have an expiration property (https://www.rabbitmq.com/docs/ttl#per-message-ttl-in-publishers) can expire during Hutch's graceful shutdown.
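For context, a per-message TTL is set by the publisher via the expiration property. Below is a minimal Bunny sketch; the queue name, payload, and 5-second TTL are illustrative, not taken from the example repository:

```ruby
require "bunny"

conn = Bunny.new
conn.start

ch = conn.create_channel
q  = ch.queue("hutch_ttl_example", durable: true)

# expiration is a per-message TTL in milliseconds, passed as a string; if the
# message is not consumed within 5000 ms, the broker discards or dead-letters it.
ch.default_exchange.publish("payload", routing_key: q.name, expiration: "5000")

conn.close
```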
What versions have been verified to be affected?
Verified in hutch v1.3.1 running on ruby v3.2 and v3.3.
Not tested with jruby.
Not tested with older hutch versions.
What messages are affected?
Not all messages; only those that have an expiration property and that also arrive after SIGTERM is received.
Studying the graceful shutdown flow, we see that, after handling the signal, control returns to Hutch::Worker and the Hutch::Broker is stopped (https://github.com/ruby-amqp/hutch/blob/v1.3.1/lib/hutch/worker.rb#L28). In turn, ConsumerWorkPool#shutdown is called by the broker (https://github.com/ruby-amqp/hutch/blob/v1.3.1/lib/hutch/broker.rb#L229).
Why can't those messages be handled?
Because ConsumerWorkPool#shutdown removes all threads from the pool in a way that does not allow those messages to be handled by this process's consumers (https://github.com/ruby-amqp/bunny/blob/2.23.0/lib/bunny/consumer_work_pool.rb#L60). Since the internal queue is a FIFO data structure, any messages that get enqueued/submitted after those terminal messages are not going to be processed, because there will be no threads left.
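To illustrate the FIFO behaviour in isolation, here is a plain-Ruby sketch (not Bunny's actual ConsumerWorkPool) of a pool whose shutdown enqueues one terminator per thread; anything submitted after the terminators never runs:

```ruby
require "thread"

queue   = Queue.new
workers = 2.times.map do
  Thread.new do
    while (job = queue.pop)
      break if job == :terminate # conceptually, what shutdown enqueues per thread
      job.call
    end
  end
end

queue << -> { puts "delivery 1" }           # already enqueued before shutdown
workers.size.times { queue << :terminate }  # "shutdown": one terminator per worker
queue << -> { puts "delivery 2" }           # submitted after shutdown: never runs

workers.each(&:join)
# Prints "delivery 1" only; "delivery 2" stays in the queue with no threads left.
```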
Why do messages keep arriving in the ConsumerWorkPool during Hutch's graceful shutdown?
Because the Network I/O Activity thread reads AMQP framesets from the socket, handles them (https://github.com/ruby-amqp/bunny/blob/2.23.0/lib/bunny/reader_loop.rb#L90) and submits them to the ConsumerWorkPool (https://github.com/ruby-amqp/bunny/blob/2.23.0/lib/bunny/channel.rb#L1841).
Why do messages arrive on the socket during Hutch's graceful shutdown?
Because Bunny::Consumers are actively consuming messages. The RabbitMQ server keeps pushing messages to the socket because it has not been notified to stop.
How To Reproduce
See the minimal working example here: https://github.com/spyroska/hutch-rpc-timeout-example
See its README.md for detailed steps on how to both reproduce the issue and verify the proposed fix.
For future reference, an outline of steps that reproduce the issue follows:
Solution
The solution to this can be as simple as cancelling the Hutch::Broker channel consumers before shutting down the ConsumerWorkPool (i.e. before terminating the threads). By stopping the influx of messages at their source, we ensure the remaining messages are drained properly when the work pool is shut down and that no more messages are submitted during shutdown. As a matter of fact, cancelling consumers is the first step of the graceful shutdown sequence in kicks and sneakers (https://github.com/ruby-amqp/kicks/blob/3.2.0/lib/sneakers/queue.rb#L69).
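A minimal sketch of the proposed ordering, assuming a Bunny channel and its ConsumerWorkPool are at hand (the method and variable names are illustrative, not the PR itself):

```ruby
# Illustrative only: cancel consumers first, then shut the work pool down.
def graceful_stop(channel, work_pool)
  # 1. Cancel every consumer registered on the channel so RabbitMQ stops
  #    pushing new deliveries to this process.
  channel.consumers.each_value(&:cancel)

  # 2. Only then shut down the work pool: deliveries already enqueued are
  #    drained, and nothing new lands behind the pool's terminators.
  work_pool.shutdown
end
```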
Please consider the proposed PR, which applies a solution similar to the one in kicks; I hope it is applicable here as well.