RabbitMQ for PCF v1.8.20

Deploying the RabbitMQ® Service

Default Deployment

Deploying RabbitMQ for Pivotal Cloud Foundry (PCF) through Ops Manager deploys a RabbitMQ cluster of three nodes by default.

The deployment includes a single HAProxy load balancer, which distributes connections on all of the default ports, for all of the shipped plugins, across all of the machines within the cluster.

The deployment will occur in a single availability zone (AZ).

The default configuration is intended for testing purposes only. It is recommended that customers run a minimum of three RabbitMQ nodes and two HAProxy nodes.

Deployment default

Considerations for this deployment

  • Provides HA for the RabbitMQ cluster
  • Queues must be judiciously configured to be HA, as they are placed on a single node by default (see the policy sketch after this list)
  • Customers should decide which partition behaviour best suits their use case. For two nodes, 'autoheal' is preferred
  • HAProxy is a single point of failure (SPOF)
  • The entire deployment is in a single AZ, which offers no protection against external failures such as hardware or networking failures
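
Queue mirroring is configured through policies. The following is a minimal sketch of enabling mirroring from a RabbitMQ server node with rabbitmqctl; the vhost name and queue-name pattern are illustrative placeholders, not values shipped with the tile:

```
# Mirror all queues whose names start with "ha." across every node in the cluster.
# "my_vhost" and the "^ha\." pattern are placeholder values for illustration.
rabbitmqctl set_policy -p my_vhost ha-all "^ha\." '{"ha-mode":"all"}'

# Alternatively, mirror each matching queue to exactly two nodes to limit the
# replication overhead.
rabbitmqctl set_policy -p my_vhost ha-two "^ha\." '{"ha-mode":"exactly","ha-params":2}'
```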

Multi-AZ Deployment

RabbitMQ can be deployed in a multiple availability zone environment if the latency between the zones can be guaranteed to be less than 10ms. This is critical for cluster performance and for recovering from network partitions, because high latency can appear to the RabbitMQ cluster as a network partition.
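
One simple way to sanity-check inter-AZ latency before deploying is to ping between VMs in different zones; a rough sketch, where the target IP is a placeholder for a VM in another AZ:

```
# Measure round-trip time to a VM in a different AZ; 10.0.32.4 is a placeholder.
# The average RTT reported should be comfortably below 10ms.
ping -c 20 10.0.32.4
```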

RabbitMQ server nodes should be scaled to an odd number of nodes, and there should be no fewer than three.

Replication of queues should only be used where required, as it can have a significant impact on system performance.

The HAProxy job instance count should also be increased to match the number of AZs, to ensure there is an HAProxy instance located in each AZ. This removes the HAProxy SPOF and provides further redundancy.

Deployment recommended

In the above diagram, you can see that the deployment can now suffer the failure of a single HAProxy and a single RabbitMQ node and still keep your cluster online and your applications connected.

It is also recommended that customers choose an odd number of RabbitMQ server nodes, three or more.

Upgrading to this deployment from a single AZ deployment

It is not possible to upgrade to this setup from the default deployment across a single AZ.

This is because the AZ setup cannot be changed once the tile has been deployed for the first time; this protects against data loss when moving jobs between AZs.

Upgrading to this deployment from a multi AZ deployment

If you have deployed the tile across two AZs, but with a single HAProxy instance, you can migrate to this setup as follows:

  1. Deploy an additional HAProxy instance through Ops Manager
  2. Applications that are newly bound or re-bound to the RabbitMQ service will see the IPs of both HAProxy instances immediately
  3. Existing bound applications will continue to work, but only using the previously deployed HAProxy IP address. They can be re-bound as required, at your discretion (see the sketch after this list)
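
A re-bind can be performed with the cf CLI; a minimal sketch, where my-app and my-rabbitmq are placeholder names for your application and service instance:

```
# Re-bind the application so its VCAP_SERVICES contains the IPs of both HAProxy instances.
cf unbind-service my-app my-rabbitmq
cf bind-service my-app my-rabbitmq

# Restage (or restart) the application so the refreshed environment is picked up.
cf restage my-app
```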

Considerations for this deployment

  • Requires IaaS configuration for availability zones ahead of deploying the RabbitMQ tile
  • It is required that cross-AZ latency be less than 10ms
  • Application developers will be handed the IPs of each deployed HAProxy in their environment variables
  • Queues must be judiciously configured to be HA, as they are placed on a single node by default (see the policy sketch earlier in this topic)
  • Customers should decide which partition behaviour best suits their use case. For three or more nodes, 'pause_minority' is preferred (one way to verify the running setting is shown below)
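
The partition handling behaviour is selected on the RabbitMQ tile in Ops Manager. If you want to confirm which strategy a running node is actually using, one option is rabbitmqctl eval, run on any RabbitMQ server VM (a sketch; it only reads the setting):

```
# Print the cluster_partition_handling strategy in effect on this node,
# e.g. {ok,pause_minority} or {ok,autoheal}.
rabbitmqctl eval 'application:get_env(rabbit, cluster_partition_handling).'
```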

Advanced Deployment

This deployment builds upon the recommended deployment above, and so follows the same upgrade paths.

This allows you to replace the use of HAProxy with your own external load balancer.

You may choose to do this to remove any knowledge of the topology of the RabbitMQ setup from application developers.

Deployment advanced

Advantages

  • Application developers do not need to handle multiple IPs for the HAProxy jobs in their applications

Disadvantages

  • The load balancer needs to be configured with the IPs of the RabbitMQ nodes. These are only known once the deployment has finished. The IPs should remain the same across subsequent deployments, but there is a risk that they can change (see the sketch below for one way to look them up)
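
One way to look up (and re-check after each deploy) the RabbitMQ node IPs to configure in your load balancer is the BOSH CLI; a sketch, where the deployment name is an assumption and should be confirmed in your environment:

```
# List the VMs and IPs for the RabbitMQ deployment.
# "p-rabbitmq" is an assumed deployment name; check `bosh deployments` for the real one.
bosh vms p-rabbitmq

# Narrow the output to the server nodes that the external load balancer should target.
bosh vms p-rabbitmq | grep rabbitmq-server
```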

Upgrading to this deployment from the multi AZ deployment

It is possible to first deploy with multiple HAProxy jobs, as per the recommended deployment, and decide later to use your own external load balancer.

This can be achieved without downtime to your applications.

This can be achieved as follows:

  1. Configure your external load balancer to point to the RabbitMQ node IPs
  2. Configure the DNS name or IP address of the external load balancer (ELB) on the RabbitMQ tile in Ops Manager
  3. Deploy the changes
  4. Any new instances of the RabbitMQ service, and any re-bound connections, will use the DNS name or IP address of the ELB in their VCAP_SERVICES
  5. Any existing instances will continue to use the HAProxy IP addresses in their VCAP_SERVICES
  6. Phase the re-binding of existing applications so that their environment variables are updated
  7. Once all applications have been updated, reduce the instance count of the HAProxy job in Ops Manager to 1
  8. Deploy the changes
This approach works because existing bound applications have their VCAP_SERVICES information cached in the Cloud Controller, and it is only updated by a re-bind request.
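
You can confirm which endpoint a given application will use by inspecting its environment with the cf CLI; a sketch, using a placeholder application name:

```
# Under VCAP_SERVICES, a re-bound application should now show the DNS name or IP
# of the external load balancer rather than the HAProxy IP addresses.
cf env my-app
```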

If you are currently using an external load balancer, you can move back to using HAProxy instances instead.

You can achieve this by following the above steps in reverse order and re-instating the HAProxy jobs.

Resource requirements

The following table shows the default resource and IP requirements for installing the tile:

| Product  | Resource                            | Instances | CPUs | RAM (MB) | Ephemeral Disk (MB) | Persistent Disk (MB) | Static IPs | Dynamic IPs |
|----------|-------------------------------------|-----------|------|----------|---------------------|----------------------|------------|-------------|
| RabbitMQ | RabbitMQ node                       | 3         | 2    | 8192     | 16384               | 30720                | 1          | 0           |
| RabbitMQ | HAProxy for RabbitMQ                | 1         | 1    | 2048     | 4096                | 0                    | 1          | 0           |
| RabbitMQ | RabbitMQ service broker             | 1         | 1    | 2048     | 4096                | 0                    | 1          | 0           |
| RabbitMQ | Broker Registrar                    | 1         | 1    | 1024     | 2048                | 0                    | 0          | 1           |
| RabbitMQ | Broker Deregistrar                  | 1         | 1    | 1024     | 2048                | 0                    | 0          | 1           |
| RabbitMQ | Smoke Tests                         | 1         | 1    | 1024     | 2048                | 0                    | 0          | 1           |
| RabbitMQ | RabbitMQ on-demand broker           | 1         | 1    | 1024     | 8192                | 1024                 | 0          | 1           |
| RabbitMQ | Register On-Demand Service Broker   | 1         | 1    | 1024     | 2048                | 0                    | 0          | 1           |
| RabbitMQ | Deregister On-Demand Service Broker | 1         | 1    | 1024     | 2048                | 0                    | 0          | 1           |
| RabbitMQ | Delete All Service Instances        | 1         | 1    | 1024     | 2048                | 0                    | 0          | 1           |
| RabbitMQ | Upgrade All Service Instances       | 1         | 1    | 1024     | 2048                | 0                    | 0          | 1           |

Notes:

  • The number of RabbitMQ nodes can be increased if required.
  • Changing the number of RabbitMQ nodes when the Erlang cookie is not defined will restart the cluster. Check here for more information.