Frequently Asked Questions for Pre-Provisioned VMware Tanzu RabbitMQ [VMs]

Note: Pivotal Platform is now part of VMware Tanzu. In v1.19 and later, RabbitMQ for Pivotal Platform is named VMware Tanzu RabbitMQ [VMs].

This topic lists frequently asked questions that apply to the VMware Tanzu RabbitMQ [VMs] pre-provisioned service.

Frequently Asked Questions

What should I check before deploying a new version of the tile?

Ensure that all nodes in the cluster are healthy by checking the RabbitMQ Management UI or the health metrics exposed through the firehose. Do not rely solely on the output of bosh instances, because it reflects the state of the Erlang VM used by RabbitMQ and not the state of the RabbitMQ application itself.

What is the correct way to stop and start VMware Tanzu RabbitMQ [VMs]?

Use only BOSH commands to stop and start the RabbitMQ app.

Some BOSH job lifecycle hooks fire only when rabbitmq-server is stopped through BOSH. You can also stop an individual instance by running the stop command with a specific job index, for example bosh stop rabbitmq-server/1.

Take extra care after stopping the full cluster, because nodes must be started in the reverse of the order in which they were shut down. Running bosh stop rabbitmq-server stops rabbitmq-server/0, then rabbitmq-server/1, and then rabbitmq-server/2, so afterwards you cannot simply run bosh start rabbitmq-server.

This is because bosh start rabbitmq-server tries to start rabbitmq-server/0 first (it always starts with job index 0), but this node cannot start first because it was not the last node to stop.

To start the nodes after running bosh stop rabbitmq-server, start each instance individually by specifying each job index. For example:

bosh start rabbitmq-server/2
bosh start rabbitmq-server/1
bosh start rabbitmq-server/0
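
For clusters with a different number of nodes, the reverse-order start commands can be generated rather than typed by hand. The following is a minimal sketch, not part of the product: the function name print_reverse_start_commands is hypothetical, it assumes the instances were stopped in ascending index order as described above, and it only prints commands in the abbreviated form used in this topic. A real BOSH CLI invocation typically also needs your environment and deployment flags.

```shell
# Hypothetical helper: print the bosh start commands for a cluster of
# NODE_COUNT nodes, highest job index first (the reverse of the stop order).
# It prints the commands only; it does not run them.
print_reverse_start_commands() {
  node_count="$1"
  index=$((node_count - 1))
  while [ "$index" -ge 0 ]; do
    echo "bosh start rabbitmq-server/$index"
    index=$((index - 1))
  done
}
```

Calling print_reverse_start_commands 3 prints the three commands shown above, starting with rabbitmq-server/2. Review the output before running any of it.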

Alternatively, you can stop a full cluster by stopping each node individually, from highest index to lowest, and then running bosh start rabbitmq-server. See the example below:

bosh stop rabbitmq-server/2
bosh stop rabbitmq-server/1
bosh stop rabbitmq-server/0
bosh start rabbitmq-server

Note: Do not use monit stop rabbitmq-server as this does not call the drain scripts.

What happens when I run bosh stop rabbitmq-server?

BOSH starts the shutdown sequence from the bootstrap instance.

On each instance, the stop sequence tells the RabbitMQ app to shut down and then shuts down the Erlang VM in which it runs. If this succeeds, it runs the following checks to ensure that both the RabbitMQ app and the Erlang VM have stopped:

  1. If /var/vcap/sys/run/rabbitmq-server/pid exists, check that the PID inside this file does not point to a running Erlang VM process. This file tracks the PID of the Erlang VM, not of the RabbitMQ app.
  2. Verify that rabbitmqctl does not return an Erlang VM PID.

After this completes on the bootstrap instance, BOSH continues the same sequence on the next instance. All remaining rabbitmq-server instances stop one by one.
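
The PID file check can be approximated by hand on an instance. This is a sketch under stated assumptions, not the actual drain script: the function name erlang_vm_stopped is hypothetical, it only tests whether some process with the recorded PID is alive (it does not confirm that the process is an Erlang VM), and kill -0 can only probe processes that the current user is allowed to signal.

```shell
# Hypothetical helper: succeed (exit 0) if the PID file is absent, empty, or
# names a PID with no live process; fail (exit 1) if that process is running.
# Assumption: run as a user permitted to signal the rabbitmq-server process.
erlang_vm_stopped() {
  pid_file="${1:-/var/vcap/sys/run/rabbitmq-server/pid}"
  [ -f "$pid_file" ] || return 0
  pid="$(cat "$pid_file")"
  [ -n "$pid" ] || return 0
  # kill -0 sends no signal; it only checks that the process can be signalled.
  if kill -0 "$pid" 2>/dev/null; then
    return 1
  fi
  return 0
}
```

For example, erlang_vm_stopped && echo "stopped" || echo "still running" reports the state for the default PID file path.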

What happens when bosh stop rabbitmq-server fails?

If bosh stop fails, you are likely to see the following error:

result: 1 of 1 drain scripts failed. Failed Jobs: rabbitmq-server.

What do I do when bosh stop rabbitmq-server fails?

The drain script logs to /var/vcap/sys/log/rabbitmq-server/drain.log. If you have a remote syslog configured, this appears as the rmq_server_drain program.

First, use bosh ssh to log in to the failing rabbitmq-server instance and start the rabbitmq-server job by running monit start rabbitmq-server. You cannot start the job through bosh start, because bosh start always runs the drain script first, and the drain script is what is failing.

After the rabbitmq-server job is running (confirm this through monit status), run DEBUG=1 /var/vcap/jobs/rabbitmq-server/bin/drain. The debug output shows exactly why the drain script is failing.
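
While you are on the instance, it can also help to look at the most recent entries in the drain log. The following is a small convenience sketch: the function name show_drain_log and the default of 50 lines are assumptions made for illustration, and the default path is the drain log location given above.

```shell
# Hypothetical helper: print the last LINES lines of the drain log.
# $1: log file (defaults to the drain log path from this topic)
# $2: number of lines to show (default 50, an arbitrary choice)
show_drain_log() {
  log_file="${1:-/var/vcap/sys/log/rabbitmq-server/drain.log}"
  lines="${2:-50}"
  if [ -r "$log_file" ]; then
    tail -n "$lines" "$log_file"
  else
    echo "drain log not readable: $log_file" >&2
    return 1
  fi
}
```

Running show_drain_log with no arguments prints the tail of the default drain log, or an error if the file cannot be read by the current user.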

How can I manually back up the state of the RabbitMQ cluster?

You can back up the state of a RabbitMQ cluster for both the on-demand and pre-provisioned services using the RabbitMQ Management API. Backups include virtual hosts, exchanges, queues, and users.

Back up Manually

  1. Log in to the RabbitMQ Management UI as the admin user you created.

  2. Select export definitions from the main page.

Back up and Restore with a Script

Call the Management API from a script with commands similar to the following:

  1. For the backup:

    curl -u "$USERNAME:$PASSWORD" "http://$RABBIT_ADDRESS:15672/api/definitions" \
      -o "$BACKUP_FOLDER/rabbit-backup.json"
    
  2. For the restore:

    curl -u "$USERNAME:$PASSWORD" "http://$RABBIT_ADDRESS:15672/api/definitions" \
      -X POST -H "Content-Type: application/json" \
      -d "@$BACKUP_FOLDER/rabbit-backup.json"
    
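For scheduled backups, it may be useful to keep one file per run and to make the script fail loudly on HTTP errors. The following is a sketch only: the function names definitions_url and backup_definitions are hypothetical, the timestamped file name is an illustrative choice, and the script reuses the USERNAME, PASSWORD, RABBIT_ADDRESS, and BACKUP_FOLDER variables and port 15672 from the commands above.

```shell
# Hypothetical helper: build the definitions endpoint URL.
# $1: Management API host, $2: port (defaults to 15672, as used above)
definitions_url() {
  echo "http://$1:${2:-15672}/api/definitions"
}

# Hypothetical helper: export definitions to a timestamped file.
# Assumes USERNAME, PASSWORD, RABBIT_ADDRESS, and BACKUP_FOLDER are set.
backup_definitions() {
  mkdir -p "$BACKUP_FOLDER" || return 1
  target="$BACKUP_FOLDER/rabbit-backup-$(date +%Y%m%d-%H%M%S).json"
  # --fail makes curl exit non-zero on HTTP errors such as 401.
  curl --fail --silent --show-error \
    -u "$USERNAME:$PASSWORD" \
    -o "$target" \
    "$(definitions_url "$RABBIT_ADDRESS")" || return 1
  echo "$target"
}
```

On success, backup_definitions prints the path of the file it wrote, which a calling script can record or pass to a later restore.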

What pre-upgrade checks should I do?

Before doing any upgrade of RabbitMQ:

  1. In Ops Manager, check that the status of all of the instances is healthy.
  2. Log in to the RabbitMQ Management UI. For instructions about how to do so with or without OAuth enabled, see Using the RabbitMQ Management UI.
  3. Verify that no alarms were triggered and that all nodes are green.
  4. Check the memory and disk usage of each node to verify that the system is not close to triggering either the memory alarm or the disk alarm.
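
The alarm check in step 3 can also be scripted against the Management API. The following is a rough sketch, assuming that the /api/nodes endpoint reports the boolean fields mem_alarm and disk_free_alarm for each node; confirm the field names against the API reference for your RabbitMQ version. The function name any_alarm_set is hypothetical, and the pattern match is a stand-in for a proper JSON parser.

```shell
# Hypothetical helper: read /api/nodes JSON on stdin and succeed (exit 0)
# if any node reports a memory or disk alarm, otherwise fail (exit 1).
# Assumption: alarms appear as "mem_alarm": true or "disk_free_alarm": true.
any_alarm_set() {
  grep -Eq '"(mem_alarm|disk_free_alarm)"[[:space:]]*:[[:space:]]*true'
}

# Example use (requires network access and credentials, so it is commented out):
# curl --fail --silent -u "$USERNAME:$PASSWORD" \
#   "http://$RABBIT_ADDRESS:15672/api/nodes" | any_alarm_set \
#   && echo "alarm set: do not upgrade yet"
```

This does not replace checking the Management UI. It only flags alarms that are already set and does not show how close a node is to its limits.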

Knowledge Base (Community)

Find the answer to your question and browse product discussions and solutions by searching the VMware Tanzu Knowledge Base.

File a Support Ticket

You can file a ticket with Support. Be sure to provide the error message from cf service YOUR-SERVICE-INSTANCE.

To expedite troubleshooting, provide your service broker logs and your service instance logs. If your cf service YOUR-SERVICE-INSTANCE output includes a task-id, provide the BOSH task output.