Setting up a RabbitMQ cluster via config

November 25, 2017

RabbitMQ is the most popular AMQP broker or, in simpler words, a queue server. People use it to queue and track heavy processing, distribute tasks among workers, buffer incoming messages to handle spikes, and for many other use cases.

This sounds like a very important part of your infrastructure, so you are better off making it highly available, and RabbitMQ has clustering support for exactly this case.

Now there are two ways to make a RabbitMQ cluster. One is by hand, with rabbitmqctl join_cluster, as described in the documentation. The other is via the config file.

I haven’t seen the latter case described anywhere so I’ll do it myself in this post.

Most of the things I’ll describe here are automated in my rabbitmq-cluster Ansible role.

Suppose you have somehow installed RabbitMQ server on 3 nodes. Each one has started, and now you have 3 independent RabbitMQ instances.

To turn them into a cluster, first stop all 3 instances. You have to do this because, once a node has been set up, its configuration (including cluster membership) is persisted in the Mnesia files, and a node that already has that state will use it on startup rather than build a cluster from the config file.

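With the nodes stopped, clear the Mnesia base dir on each of them like this: rm -rf $MNESIA_BASE/*. Again, you need this to clear any previous configuration (usually broken leftovers from previous failed attempts). Keep in mind that this also wipes everything else the node stored there, so only do it on nodes whose data you can afford to lose.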

Now for the meat of it. On each node, open /etc/rabbitmq/rabbitmq.config and add the list of cluster nodes to the rabbit section:

{cluster_nodes, {['rabbit@rabbit1', 'rabbit@rabbit2', 'rabbit@rabbit3'], disc}},
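If you template this file rather than edit it by hand, the entry can be generated from a list of node names. Here is a minimal Python sketch of that idea. The function name render_cluster_config is made up for illustration, and it assumes the file contains nothing but the cluster_nodes entry inside the rabbit section; a real config will usually carry other settings next to it.

```python
def render_cluster_config(nodes, node_type="disc"):
    """Return rabbitmq.config text (Erlang terms) holding only a cluster_nodes entry.

    Hypothetical helper for illustration; merge the entry into your real config.
    """
    if node_type not in ("disc", "ram"):
        raise ValueError("node_type must be 'disc' or 'ram'")
    if not nodes:
        raise ValueError("at least one node name is required")
    # Node names contain '@', so as Erlang atoms they must be single-quoted.
    quoted = ", ".join("'{}'".format(name) for name in nodes)
    return (
        "[\n"
        "  {rabbit, [\n"
        "    {cluster_nodes, {[" + quoted + "], " + node_type + "}}\n"
        "  ]}\n"
        "].\n"
    )
```

Every node gets the same file, with the full list of nodes in it, including itself.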

Next, again on each node, create the file /var/lib/rabbitmq/.erlang.cookie and put some string in it. It can really be anything, as long as it’s identical on all nodes in the cluster. This file must have 0600 permissions and be owned by the user and group of the RabbitMQ server process.
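The cookie step is easy to get subtly wrong (a trailing difference between nodes, or permissions that are too open), so here is a small Python sketch of it. The helper name write_cookie is hypothetical, and the sketch only handles contents and mode; setting the owner needs root and is left as a comment, with the user and group name rabbitmq being an assumption about your installation.

```python
import os


def write_cookie(path, cookie):
    """Write the Erlang cookie to `path` so that only its owner can read or write it."""
    # Remove a stale cookie first: an existing one may be read-only.
    if os.path.exists(path):
        os.remove(path)
    # Create the file with restrictive permissions from the very start.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(cookie)  # no trailing newline, so all nodes match byte for byte
    # The umask can only remove bits, so set the final mode explicitly.
    os.chmod(path, 0o600)
    # As root you would also hand the file to the server's user, for example:
    #   shutil.chown(path, user="rabbitmq", group="rabbitmq")  # assumed names
```

Call it with the same cookie string on every node, for example write_cookie("/var/lib/rabbitmq/.erlang.cookie", some_shared_secret).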

Now we are ready to start the cluster. But hold on. To make it work you MUST start the nodes one by one, not simultaneously, because otherwise the cluster won’t be created. This is a workaround for some strange behavior that I found described on the mailing list here.

I hit this one twice: once when I configured my RabbitMQ nodes via tmux with synchronized panes, and again when I was writing the Ansible role.
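If you script the startup, the "one by one" rule boils down to: start a node, wait until it is really up, and only then touch the next one. Below is a rough Python sketch of that loop. Everything in it is an assumption for illustration: start and is_running are callables you supply (they might wrap your service manager and something like rabbitmqctl status over SSH), and the timeout values are arbitrary.

```python
import time


def start_one_by_one(nodes, start, is_running, timeout=60.0, poll=1.0):
    """Start each node and wait until it reports running before starting the next.

    `start(node)` and `is_running(node)` are caller-supplied callables;
    this sketch does not know how your nodes are actually managed.
    """
    for node in nodes:
        start(node)
        deadline = time.monotonic() + timeout
        # Block here so that no two nodes are ever booting at the same time.
        while not is_running(node):
            if time.monotonic() >= deadline:
                raise TimeoutError(
                    "{} did not come up within {}s".format(node, timeout)
                )
            time.sleep(poll)
```

Failing hard on a timeout is deliberate: carrying on to the next node while the previous one is still booting is exactly the simultaneous start this is meant to avoid.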

But in the end I got a very nice cluster with sane production config values, which you can check out in the defaults of my role.

That’s it. Until next time!