Frequently Asked Questions for Pre-Provisioned VMware Tanzu RabbitMQ for VMs

Note: Pivotal Platform is now part of VMware Tanzu. In v1.20 and later, VMware Tanzu RabbitMQ [VMs] is named VMware Tanzu RabbitMQ for VMs.

This topic lists frequently asked questions that apply to the VMware Tanzu RabbitMQ for VMs pre-provisioned service.

Frequently Asked Questions

What should I check before deploying a new version of the tile?

Confirm that all nodes in the cluster are healthy by using the RabbitMQ Management UI or the health metrics exposed by the Firehose. Do not rely solely on the output of bosh instances, because it reflects the state of the Erlang VM that RabbitMQ runs in, not the state of the RabbitMQ application itself.
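
As a scripted complement to the Management UI check, the following sketch asks the Management API whether any alarms are in effect and maps the HTTP status to a result. It assumes that your RabbitMQ version exposes the /api/health/checks/alarms endpoint (which answers 200 when no alarms are in effect and 503 otherwise), that curl is installed, and that RABBIT_ADDRESS, USERNAME, and PASSWORD are set as in the backup examples later in this topic. The function names are illustrative and are not part of the tile.

```shell
#!/usr/bin/env bash
# Sketch only: function names are illustrative, not part of the tile.

# Translate the HTTP status of a health-check request into a result.
# 200 means no alarms are in effect; 503 means at least one alarm is.
report_health() {
  local status="$1"
  case "$status" in
    200) echo "healthy"; return 0 ;;
    503) echo "unhealthy"; return 1 ;;
    *)   echo "unknown (HTTP $status)"; return 2 ;;
  esac
}

# Query the cluster-wide alarm health check (assumed to be available in
# your RabbitMQ version) and report the result. If curl cannot connect,
# the status is reported as unknown.
check_cluster_alarms() {
  local status
  status="$(curl -s -o /dev/null -w '%{http_code}' \
    -u "$USERNAME:$PASSWORD" \
    "http://$RABBIT_ADDRESS:15672/api/health/checks/alarms" || true)"
  report_health "$status"
}

# Example:
#   check_cluster_alarms || echo "Investigate before deploying the tile."
```

An alarm-free result does not replace looking at each node in the Management UI; it only gives an early signal that something needs attention.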

What is the correct way to stop and start VMware Tanzu RabbitMQ for VMs?

Operators should use only BOSH commands to stop and start the RabbitMQ application, for example:

bosh stop rabbitmq-server and bosh start rabbitmq-server.

Some BOSH job lifecycle hooks, such as the drain scripts, run only when rabbitmq-server is stopped through BOSH. You can also stop individual instances by running the stop command and specifying JOB [index].

Note: Do not use monit stop rabbitmq-server, because it does not call the drain scripts.
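
The following sketch shows how the stop and start commands might look with the BOSH CLI v2, where the deployment is passed with -d and a single instance is addressed as group/index. The helper name and the deployment name my-rabbit are hypothetical, and the group/index form is an assumption about your CLI version. The helper only prints the command so that you can review it before running it.

```shell
#!/usr/bin/env bash
# Sketch only: the helper and the deployment name are hypothetical.

# Print the BOSH CLI v2 command for a lifecycle action ("stop" or "start")
# on the rabbitmq-server job. An optional third argument targets a single
# instance using the group/index form.
rabbitmq_bosh_cmd() {
  local deployment="$1" action="$2" index="${3:-}"
  local target="rabbitmq-server"
  if [ -n "$index" ]; then
    target="rabbitmq-server/$index"
  fi
  echo "bosh -d $deployment $action $target"
}

# Example:
#   rabbitmq_bosh_cmd my-rabbit stop      # prints: bosh -d my-rabbit stop rabbitmq-server
#   rabbitmq_bosh_cmd my-rabbit start 1   # prints: bosh -d my-rabbit start rabbitmq-server/1
```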

What happens when I run bosh stop rabbitmq-server?

BOSH starts the shutdown sequence from the bootstrap instance.

The stop sequence first tells the RabbitMQ application to shut down, and then shuts down the Erlang VM in which it runs. If this succeeds, the following checks confirm that both the RabbitMQ application and the Erlang VM have stopped:

  1. If /var/vcap/sys/run/rabbitmq-server/pid exists, check that the PID in this file does not point to a running Erlang VM process. Note that this file tracks the Erlang VM PID, not the RabbitMQ PID.
  2. Check that rabbitmqctl does not return an Erlang VM PID.
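
The first check can be approximated by hand with a sketch like the following. It is not the drain script itself: the function name is illustrative, it only tests whether a process with the recorded PID exists (not whether that process is an Erlang VM), and it assumes that it runs as root, because kill -0 cannot probe processes owned by other users otherwise.

```shell
#!/usr/bin/env bash
# Sketch only: approximates the PID file check; not the real drain script.

# Succeed (exit 0) when the PID file is absent, is empty, or names a PID
# that no longer belongs to a live process. Fail (exit 1) otherwise.
pid_file_process_stopped() {
  local pid_file="$1" pid
  [ -f "$pid_file" ] || return 0
  pid="$(cat "$pid_file")"
  [ -n "$pid" ] || return 0
  # kill -0 sends no signal; it only tests whether the process exists.
  if kill -0 "$pid" 2>/dev/null; then
    return 1
  fi
  return 0
}

# Example:
#   pid_file_process_stopped /var/vcap/sys/run/rabbitmq-server/pid \
#     && echo "Erlang VM process is gone"
```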

Once this completes on the bootstrap instance, BOSH will continue the same sequence on the next instance. All remaining rabbitmq-server instances will be stopped one by one.

What happens when bosh stop rabbitmq-server fails?

If the BOSH stop fails, you will likely get an error saying that the drain script failed with:

result: 1 of 1 drain scripts failed. Failed Jobs: rabbitmq-server.

What do I do when bosh stop rabbitmq-server fails?

The drain script logs to /var/vcap/sys/log/rabbitmq-server/drain.log. If you have remote syslog configured, these log entries appear under the program name rmq_server_drain.

First, use bosh ssh to log in to the failing rabbitmq-server instance, and start the rabbitmq-server job by running monit start rabbitmq-server. You cannot start the job with bosh start, because it always runs the drain script first, and the drain script is failing.

Once the rabbitmq-server job is running (confirm this with monit status), run DEBUG=1 /var/vcap/jobs/rabbitmq-server/bin/drain. The output shows why the drain script is failing.
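
The steps on the instance can be sketched as a small script. It assumes that you are already logged in to the failing instance as root, and that monit summary prints one line per process in the form Process 'NAME' followed by its state; that format is an assumption about the monit version on the VM. The function names and the retry limit are illustrative.

```shell
#!/usr/bin/env bash
# Sketch only: run as root on the failing rabbitmq-server instance.

# Read `monit summary` output on stdin and succeed only if it contains a
# line reporting the rabbitmq-server process as running. The line format
# is an assumption about the monit version on the VM.
rabbitmq_job_running() {
  grep -Eq "^Process 'rabbitmq-server'[[:space:]]+running[[:space:]]*$"
}

# Start the job with monit, wait (up to about two minutes) until monit
# reports it as running, then run the drain script with debug output.
debug_drain() {
  local attempts=0
  monit start rabbitmq-server
  until monit summary | rabbitmq_job_running; do
    attempts=$((attempts + 1))
    if [ "$attempts" -ge 24 ]; then
      echo "rabbitmq-server did not reach the running state" >&2
      return 1
    fi
    sleep 5
  done
  DEBUG=1 /var/vcap/jobs/rabbitmq-server/bin/drain
}
```

Compare the debug output with the most recent entries in /var/vcap/sys/log/rabbitmq-server/drain.log to see which step of the drain script fails.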

How can I manually back up the state of the RabbitMQ cluster?

It is possible to back up the state of a RabbitMQ cluster for both the on-demand and pre-provisioned services using the RabbitMQ Management API. Backups include virtual hosts, exchanges, queues, and users.

Back up Manually

  1. Log in to the RabbitMQ Management UI as the admin user that you created. For instructions about how to do so with or without OAuth enabled, see Using the RabbitMQ Management UI.

  2. Select export definitions from the main page.

Back up and Restore with a Script

Use the API to run scripts with code similar to the following:

  1. For the backup:

    curl -u "$USERNAME:$PASSWORD" "http://$RABBIT_ADDRESS:15672/api/definitions" \
      -o "$BACKUP_FOLDER/rabbit-backup.json"

  2. For the restore:

    curl -u "$USERNAME:$PASSWORD" "http://$RABBIT_ADDRESS:15672/api/definitions" \
      -X POST -H "Content-Type: application/json" -d
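
The two calls can be wrapped in a script along the following lines. This is a sketch under stated assumptions: the function names are illustrative, the sanity check only confirms that the file is non-empty and starts with an opening brace, and sending the previously exported file as the body of the restore request is an assumption about the intended payload.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are illustrative.

# Succeed if the file exists, is non-empty, and begins with "{", which is a
# cheap sanity check that the export looks like a JSON object.
looks_like_definitions_file() {
  local file="$1"
  [ -s "$file" ] || return 1
  [ "$(head -c 1 "$file")" = "{" ]
}

# Export definitions to a file and verify that something plausible was written.
backup_definitions() {
  local file="$1"
  curl -sf -u "$USERNAME:$PASSWORD" \
    "http://$RABBIT_ADDRESS:15672/api/definitions" -o "$file" \
    && looks_like_definitions_file "$file"
}

# Import definitions from a file. Sending the exported file as the request
# body is an assumption about the intended restore payload.
restore_definitions() {
  local file="$1"
  looks_like_definitions_file "$file" || return 1
  curl -sf -u "$USERNAME:$PASSWORD" \
    "http://$RABBIT_ADDRESS:15672/api/definitions" \
    -X POST -H "Content-Type: application/json" --data-binary "@$file"
}

# Example:
#   backup_definitions "$BACKUP_FOLDER/rabbit-backup.json"
#   restore_definitions "$BACKUP_FOLDER/rabbit-backup.json"
```

Checking the file before a restore guards against posting an empty file or an error page that was saved by a failed backup.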

What pre-upgrade checks should I do?

Before doing any upgrade of RabbitMQ, VMware recommends checking the following:

  1. In Ops Manager, check that the status of all of the instances is healthy.
  2. Log in to the RabbitMQ Management UI. For instructions about how to do so with or without OAuth enabled, see Using the RabbitMQ Management UI.
  3. Check that no alarms have been triggered and that all nodes are healthy, that is, they should display as green.
  4. Check that the system is not close to hitting either the memory or disk alarm. Do this by looking at what each node has consumed in the RabbitMQ Management UI.
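
To make "close to hitting the alarm" concrete, the following sketch computes how far a node is from each alarm. It assumes that you read the mem_used, mem_limit, disk_free, and disk_free_limit values (in bytes) for each node from the Management API node listing at /api/nodes; the helper names and the numbers in the example are made up.

```shell
#!/usr/bin/env bash
# Sketch only: helper names and example numbers are illustrative.

# Print memory use as a whole-number percentage of the memory alarm limit.
# Arguments correspond to the mem_used and mem_limit values of a node.
mem_percent_of_limit() {
  local used="$1" limit="$2"
  if [ "$limit" -le 0 ]; then
    echo "limit must be positive" >&2
    return 1
  fi
  echo $(( used * 100 / limit ))
}

# Print how many bytes of free disk space remain before the disk alarm
# fires. Arguments correspond to the disk_free and disk_free_limit values.
disk_headroom_bytes() {
  local free="$1" limit="$2"
  echo $(( free - limit ))
}

# Example with made-up numbers:
#   mem_percent_of_limit 1500000000 2000000000   # prints 75
#   disk_headroom_bytes 9000000000 1000000000    # prints 8000000000
```

The memory alarm fires when use reaches the limit, and the disk alarm fires when free space falls below its limit, so a high percentage or a small headroom is a reason to postpone the upgrade.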

Knowledge Base (Community)

Find the answer to your question and browse product discussions and solutions by searching the VMware Tanzu Knowledge Base.

File a Support Ticket

You can file a ticket with Support. Be sure to provide the error message from cf service YOUR-SERVICE-INSTANCE.

To expedite troubleshooting, provide your service broker logs and your service instance logs. If your cf service YOUR-SERVICE-INSTANCE output includes a task-id, provide the BOSH task output.
