Feature - Support RMQ with Khepri metadata store #127

@gomoripeti

Since version 3.13, RabbitMQ can run using Khepri instead of Mnesia as its metadata store. From version 4.3 (or 4.4), Khepri will be the only option for the metadata store. When Khepri is used, RabbitMQ does not even initialize Mnesia.

As this plugin uses Mnesia to store the deduplication caches, some work is needed to support RabbitMQ installations that use the Khepri metadata store.

Current usage

  • for queue caches the plugin uses an in-memory non-replicated Mnesia table per queue
  • for exchange caches the plugin uses a replicated Mnesia table (present on 2/3 of cluster nodes) with optional on-disk persistence (in-memory by default)

@noxdafox in #120 :

... the Mnesia DB is not initialized correctly anymore. Just fixing the above logic will lead to a subsequent error due to the fact that Mnesia schema is not managed properly and therefore disk caches cannot be started.

I am implementing Mnesia management logic within the plugin to support future RMQ releases and fix this bug at the same time.

There are multiple options going forward:

  1. Set up Mnesia if RabbitMQ does not do it (similar to how the delayed-exchange plugin does it).
  • pros: minimal code change, the current implementation can stay in place
  • cons: this plugin still requires a distributed Mnesia (unlike the delayed-exchange plugin, which runs isolated one-node Mnesia clusters) and may be exposed (can it?) to all the netsplit issues that caused the Core Team to move away from Mnesia
  • cons: during the mnesia2khepri migration RabbitMQ wipes all Mnesia files, and there is no clean callback invoked after this step that the plugin could implement, e.g. to move the migrated data from a tmp dir to its final location (see "Allow Khepri post-migration step to not delete certain files or dirs", rabbitmq/rabbitmq-server#11304). So basically the Mnesia tables disappear from under a running dedup plugin. A workaround is to disable the plugin before the migration and re-enable it afterwards (this is the workaround currently suggested for the delayed exchange plugin). For the delayed exchange plugin this results in message loss, but for the dedup plugin it might be acceptable, as this plugin only stores caches.
  2. Use Khepri to store the distributed persistent exchange cache (e.g. like the recent-history-exchange plugin: https://github.com/rabbitmq/rabbitmq-server/blob/main/deps/rabbitmq_recent_history_exchange/src/rabbit_db_rh_exchange.erl). Remove the option for the exchange cache to be memory-only. Queue caches could use simple ETS tables owned by the CacheManager process.

  3. Use a 3rd-party Elixir cache library that supports distributed in-memory caching (e.g. Cachex). I did not explore this option yet. As it would pull in yet another dependency, I would only go this route if Khepri turns out to be too slow.
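
To make the ETS part of option 2 more concrete, here is a minimal sketch of a process that owns one node-local ETS table per queue cache. It is only an illustration under assumptions: the module name `QueueCacheSketch` and its functions are hypothetical and are not the plugin's actual CacheManager API, and TTL and size limits are left out.

```elixir
defmodule QueueCacheSketch do
  @moduledoc """
  Hypothetical sketch: a process that owns one ETS table per queue cache.
  All names here are illustrative, not the plugin's real API.
  """
  use GenServer

  def start_link(opts \\ []) do
    GenServer.start_link(__MODULE__, :ok, opts)
  end

  # Ask the owner process to create a node-local, in-memory cache table.
  def create(server, cache) when is_atom(cache) do
    GenServer.call(server, {:create, cache})
  end

  # Returns :inserted the first time a key is seen, :exists afterwards.
  def insert(cache, key) do
    if :ets.insert_new(cache, {key}), do: :inserted, else: :exists
  end

  def member?(cache, key), do: :ets.member(cache, key)

  @impl true
  def init(:ok), do: {:ok, MapSet.new()}

  @impl true
  def handle_call({:create, cache}, _from, tables) do
    if MapSet.member?(tables, cache) do
      {:reply, {:error, :already_exists}, tables}
    else
      # :public lets other processes read and write the table directly.
      # The table is destroyed automatically when this owner process exits.
      ^cache = :ets.new(cache, [:set, :public, :named_table])
      {:reply, :ok, MapSet.put(tables, cache)}
    end
  end
end

# Usage
{:ok, pid} = QueueCacheSketch.start_link()
:ok = QueueCacheSketch.create(pid, :my_queue_cache)
:inserted = QueueCacheSketch.insert(:my_queue_cache, "msg-1")
:exists = QueueCacheSketch.insert(:my_queue_cache, "msg-1")
```

Named ETS tables require atom names, so a real implementation would have to avoid creating an unbounded number of atoms from queue names, for example by keeping table references in the owner's state instead.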

@noxdafox what do you think?
