LATEST VERSION: 1.8.10 - CHANGELOG
RabbitMQ for PCF v1.7.21

Deploying the RabbitMQ® Service

Default Deployment

Deploying RabbitMQ for Pivotal Cloud Foundry (PCF) through Ops Manager will deploy a RabbitMQ cluster of 2 nodes by default.

The deployment includes a single load balancer haproxy which spreads connections on all of the default ports, for all of the shipped plugins across all of the machines within the cluster.

The deployment will occur in a single availability zone (AZ).

The default configuration is for testing purposes only and it is recommended that customers have a minium of 3 RabbitMQ nodes and 2 HAproxy nodes

Deployment default

Considerations for this deployment

  • Provides HA for the RabbitMQ cluster
  • Queues must be judiciously configured to be HA as they are placed on one node by default
  • Customers should decide on which partition behaviour is best suited to their use case. For two nodes ‘autoheal’ is preferred
  • HAProxy is a single point of failure (SPOF)
  • The entire deployment is in a single AZ, which does not protect against external failures from failures in hardware, networking, etc.

Multi-AZ Deployment

RabbitMQ can be deployed in a multiple availability zone environment if the latency between the zones can be guaranteed to be less than 10ms. This is criticial for cluster performance and recovering from network partitions. High latency can look like network partitions from the RabbitMQ cluster perspective.

RabbitMQ server nodes should be scaled to an odd number and should be greater than 3.

Replication of queues should only be used where requred as it can have a big impact on system performance.

The HAProxy job instance count should also be increased to match the number of AZs to ensure there is a HAProxy located in each AZ. This removes the HAProxy SPOF and provides further redundancy.

Deployment recommended

In the above diagram, you can see that you can now suffer the failure of a single HAProxy and single RabbitMQ node and still keep your cluster online and applications connected.

It is also recommend that customers chooses an odd number of RabbitMQ server nodes of three or more.

Upgrading to this deployment from a single AZ deployment

It is not possible to upgrade to this setup from the default deployment across a single AZ.

This is because the AZ setup cannot be changed once the tile has being deployed for the first time, this is to protect against data loss when moving jobs between AZs.

Upgrading to this deployment from a multi AZ deployment

If you have deployed the tile across two AZs, but with a single HAProxy instance you can migrate to this setup as follows:

  1. Deploy an additional HAProxy instance through OpsManager
  2. New or re-bound applications to the RabbitMQ service will see the IPs of both HAProxys immediately
  3. Existing bound applications will continue to work, but only using the previously deployed HAProxy IP Address. They can be re-bound as required at your discretion.

Considerations for this deployment

  • Requires IaaS configuration for availability zones ahead of deploying the RabbitMQ tile
  • It is required that cross-AZ latency be less than 10ms
  • Application developers will be handed the IPs of each deployed HAProxy in their environment variables
  • Queues must be judiciously configured to be HA as they are placed on one node by default
  • Customers should decide on which partition behaviour is best suited to their use case. For 3 or more nodes 'pause_minority’ is preferred

Advanced Deployment

This deployment builds upon the above recommended deployment, so follows the same upgrade paths.

This allows you to replace the use of HAProxy with your own external load balancer.

You may choose to do this to remove any knowledge of the topology of the RabbitMQ setup from application developers.

Deployment advanced

Advantages

  • Application developers do not need to handle multiple IPs for the HAProxy jobs in their applications

Disadvantages

  • The load balancer needs to be configured with the IPs of the RabbitMQ Nodes. These will only be known once the deployment has finished. The IPs should remain the same during subsequent deployments but there is a risk they can change.

Upgrading to this deployment from the multi AZ deployment

It is possible to first deploy with multiple HAProxy jobs, as per the recommended deployment and decided to later use your own external load balancer.

This can be achieved without downtime to your applications.

This can be achieved as follows:

  1. Configure your external load balancer to point to the RabbitMQ Node IPs
  2. Configure the DNS name or IP address for the external load balancer (ELB) on the RabbitMQ tile in OpsManager
  3. Deploy the changes
  4. Any new instances of the RabbitMQ service or any re-bound connections will use the DNS name or IP address of the ELB in their VCAP_SERVICES
  5. Any existing instances will continue to use the HAProxy IP addresses in their VCAP_SERVICES
  6. Phase the re-binding of existing applications to have their environment variables updated
  7. Once all applications are updated
  8. Reduce the instance count of the HAProxy job in OpsManager to 1
  9. Deploy the changes

This approach works as any existing bound applications have their VCAP_SERVICES information cached in the cloud controller and are only updated by a re-bind request.

If you are currently using an external load balancer, then you can move back to using HAProxys instead.

You can achieve this by following the above steps in reverse order and re-instating the HAProxy jobs.

Create a pull request or raise an issue on the source for this page in GitHub