RabbitMQ for PCF v1.7.23

Logging, Heartbeats, and Metrics

Logging

In RabbitMQ® for Pivotal Cloud Foundry (PCF) 1.6.0 and above, you can designate an external syslog endpoint for RabbitMQ logs through Ops Manager at deployment time by performing the following steps:

  1. From the Ops Manager Installation Dashboard, click the RabbitMQ tile.
  2. In the RabbitMQ tile, click the Settings tab.
  3. Click Syslog.
  4. Enter your syslog address and port.
  5. Click Save.
  6. Return to the Ops Manager Installation Dashboard and click Apply Changes to redeploy with the changes.

A correctly configured system emits logs for all RabbitMQ and HAProxy nodes deployed in the service. You can identify logs from individual nodes by their index, which corresponds to the index of the RabbitMQ nodes displayed in Ops Manager:

  • The logs for RabbitMQ server nodes follow the format [job=rabbitmq-server-partition-GUID index=X]
  • The logs for HAProxy nodes follow the format [job=rabbitmq-haproxy-partition-GUID index=X]
  • The logs for the RabbitMQ service broker follow the format [job=rabbitmq-broker-partition-GUID index=X]

RabbitMQ and HAProxy servers are configured to log at the info level, and capture errors, warnings and informational messages.
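
When filtering a shared syslog stream, it can help to pull the job type and node index out of the tags described above. The following is a minimal sketch, assuming the tag appears verbatim in each log line; the regular expression, the function name parse_log_tag, and the sample line are illustrative and not part of the product.

```python
import re

# Pattern derived from the tag formats listed above (an assumption, not an official grammar).
TAG_RE = re.compile(
    r"\[job=rabbitmq-(?P<role>server|haproxy|broker)-partition-(?P<guid>\S+)"
    r"\s+index=(?P<index>\d+)\]"
)


def parse_log_tag(line):
    """Return (role, guid, index) for a tagged log line, or None if no tag is found."""
    match = TAG_RE.search(line)
    if match is None:
        return None
    return match.group("role"), match.group("guid"), int(match.group("index"))


# Hypothetical log line; "abc123" stands in for a real partition GUID.
sample = "[job=rabbitmq-haproxy-partition-abc123 index=1] example message"
print(parse_log_tag(sample))
```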

Heartbeats

In RabbitMQ for PCF 1.6 and above, the key system components (RabbitMQ server nodes, HAProxy nodes, and the Service Broker) periodically emit heartbeats. The heartbeats are Boolean metrics: 1 means the component is available, while 0 or the absence of a heartbeat metric means the component is not responding and should be investigated.

The heartbeats are visible on the Firehose and look as follows:

  • HAProxy heartbeat: "/p-rabbitmq/haproxy/heartbeat" value:1 unit:"boolean"
  • RabbitMQ heartbeat: "/p-rabbitmq/rabbitmq/heartbeat" value:1 unit:"boolean"
  • Service Broker heartbeat: "/p-rabbitmq/service_broker/heartbeat" value:1 unit:"boolean"
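
Because a missing heartbeat is as significant as a value of 0, a monitoring check needs to treat both cases the same way. The sketch below illustrates that rule; it assumes you have already collected the most recent value of each heartbeat metric into a dictionary (how you read the Firehose is out of scope), and the function name unhealthy_components is hypothetical.

```python
# Heartbeat metric names as listed above, keyed by component.
HEARTBEATS = {
    "haproxy": "/p-rabbitmq/haproxy/heartbeat",
    "rabbitmq": "/p-rabbitmq/rabbitmq/heartbeat",
    "service_broker": "/p-rabbitmq/service_broker/heartbeat",
}


def unhealthy_components(latest_values):
    """Return the components whose heartbeat is 0 or missing.

    latest_values maps a metric name to the most recent value observed
    within whatever monitoring window you choose (an assumption here).
    """
    return [
        component
        for component, metric in HEARTBEATS.items()
        if latest_values.get(metric) != 1
    ]


# Example: HAProxy is up, RabbitMQ reports 0, the Service Broker sent nothing.
observed = {
    "/p-rabbitmq/haproxy/heartbeat": 1,
    "/p-rabbitmq/rabbitmq/heartbeat": 0,
}
print(unhealthy_components(observed))
```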

Metrics

The PCF Firehose exposes the RabbitMQ and HAProxy metrics, and can direct these metrics to an external endpoint of your choice.

Configuring Secure Communication

RabbitMQ for PCF v1.7.13 lets the operator turn TLS communication for metrics on or off using the Use non-secure communication for metrics checkbox on the metrics configuration page in Ops Manager. Set this checkbox according to your version of PCF:

  • PCF v1.8: Select this checkbox to send metrics to the Firehose.

  • PCF v1.9: Clear this checkbox to send metrics to the Firehose securely, or select it to send metrics insecurely.

  • PCF v1.10 and later: Clear this checkbox to send metrics to the Firehose and avoid errors.
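
The version rules above can be summarized as a small lookup, which may be useful in a pre-deployment checklist script. This is only a sketch of the guidance in the list: the function name and the want_tls parameter are hypothetical, and versions earlier than v1.8 are not covered by this page, so the function returns None for them.

```python
def non_secure_checkbox(pcf_version, want_tls=True):
    """Return True to select the checkbox, False to clear it, None if not covered.

    pcf_version is a string such as "v1.9" or "1.10.3"; only the first two
    components are considered.
    """
    major, minor = (int(part) for part in pcf_version.lstrip("v").split(".")[:2])
    if (major, minor) == (1, 8):
        return True  # PCF v1.8: select the checkbox
    if (major, minor) == (1, 9):
        return not want_tls  # PCF v1.9: clear for secure, select for insecure
    if (major, minor) >= (1, 10):
        return False  # PCF v1.10 and later: clear the checkbox
    return None  # earlier versions are not described on this page


print(non_secure_checkbox("v1.9"))
```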

Setting this checkbox incorrectly can cause a Cannot generate manifest... unknown property "cf_etcd_client_cert" error.

Polling Interval

The metrics polling interval defaults to 30 seconds. To change it, go to the bottom of the RabbitMQ cluster configuration page and enter a new value in the Metrics polling interval (min: 10) field.

The emitted metrics follow the format of the example below:

origin:"p-rabbitmq" eventType:ValueMetric timestamp:1441188462382091652 deployment:"cf-rabbitmq" job:"cf-rabbitmq-node" index:"0" ip:"10.244.3.46" valueMetric: < name:"/p-rabbitmq/rabbitmq/system/memory" value:1024 unit:"MB">
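
If you capture metrics in this text form, for example from a Firehose consumer that prints events, the key:value pairs can be extracted with a regular expression. The following is a minimal sketch that assumes the layout shown in the example above; the function name parse_metric is hypothetical, and a structured client that decodes events directly would be more robust than text parsing.

```python
import re

# key:"quoted value" or key:bare_value; the bare "valueMetric:" wrapper is skipped
# because nothing but whitespace follows its colon.
PAIR_RE = re.compile(r'(\w+):("[^"]*"|[^\s<>]+)')


def parse_metric(line):
    """Parse one printed metric event into a flat dict of fields."""
    fields = {}
    for key, raw in PAIR_RE.findall(line):
        if raw.startswith('"'):
            fields[key] = raw.strip('"')  # quoted values stay strings
            continue
        try:
            fields[key] = int(raw)
        except ValueError:
            try:
                fields[key] = float(raw)
            except ValueError:
                fields[key] = raw  # e.g. eventType:ValueMetric
    return fields


example = (
    'origin:"p-rabbitmq" eventType:ValueMetric timestamp:1441188462382091652 '
    'deployment:"cf-rabbitmq" job:"cf-rabbitmq-node" index:"0" ip:"10.244.3.46" '
    'valueMetric: < name:"/p-rabbitmq/rabbitmq/system/memory" value:1024 unit:"MB">'
)
metric = parse_metric(example)
print(metric["name"], metric["value"], metric["unit"])
```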

RabbitMQ Metrics

The table below shows the current list of RabbitMQ metrics emitted and their description.

Note: The name space for the metrics follows the format /CF-SERVICE-NAME/NODE-TYPE/METRIC-NAME.

| Name Space | Unit | Description |
|---|---|---|
| /p-rabbitmq/rabbitmq/erlang/erlang_processes | count | The number of Erlang processes |
| /p-rabbitmq/rabbitmq/system/memory | MB | The memory in MB used by the node |
| /p-rabbitmq/rabbitmq/connections/count | count | The total number of connections to the node |
| /p-rabbitmq/rabbitmq/consumers/count | count | The total number of consumers registered in the node |
| /p-rabbitmq/rabbitmq/messages/delivered | count | The total number of messages with the status deliver_get on the node |
| /p-rabbitmq/rabbitmq/messages/delivered_no_ack | count | The number of messages with the status deliver_no_ack on the node |
| /p-rabbitmq/rabbitmq/messages/delivered_rate | rate | The rate at which messages are being delivered to consumers or clients on the node |
| /p-rabbitmq/rabbitmq/messages/published | count | The total number of messages with the status publish on the node |
| /p-rabbitmq/rabbitmq/messages/published_rate | rate | The rate at which messages are being published by the node |
| /p-rabbitmq/rabbitmq/messages/redelivered | count | The total number of messages with the status redeliver on the node |
| /p-rabbitmq/rabbitmq/messages/redelivered_rate | rate | The rate at which messages are getting the status redeliver on the node |
| /p-rabbitmq/rabbitmq/messages/get_no_ack | count | The number of messages with the status get_no_ack on the node |
| /p-rabbitmq/rabbitmq/messages/get_no_ack_rate | rate | The rate at which messages get the status get_no_ack on the node |
| /p-rabbitmq/rabbitmq/messages/pending | count | The number of messages with the status messages_unacknowledged on the node |
| /p-rabbitmq/rabbitmq/system/file_descriptors | count | The number of open file descriptors on the node |
| /p-rabbitmq/rabbitmq/exchanges/count | count | The total number of exchanges on the node |
| /p-rabbitmq/rabbitmq/messages/available | count | The total number of messages with the status messages_ready on the node |
| /p-rabbitmq/rabbitmq/queues/count | count | The number of queues on the node |
| /p-rabbitmq/rabbitmq/channels/count | count | The number of channels on the node |
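
Several metric names in this list contain a slash themselves (for example messages/delivered_rate), so splitting a name space into its three parts needs a bounded split. The sketch below illustrates this under the /CF-SERVICE-NAME/NODE-TYPE/METRIC-NAME format noted above; the function name split_namespace is hypothetical.

```python
def split_namespace(namespace):
    """Split '/CF-SERVICE-NAME/NODE-TYPE/METRIC-NAME' into its three parts.

    The split is limited to two separators so that any further slashes
    remain part of the metric name.
    """
    parts = namespace.strip("/").split("/", 2)
    if len(parts) != 3:
        raise ValueError(f"unexpected name space: {namespace!r}")
    service, node_type, metric_name = parts
    return service, node_type, metric_name


print(split_namespace("/p-rabbitmq/rabbitmq/messages/delivered_rate"))
```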

HAProxy Metrics

The table below shows the current list of HAProxy metrics emitted and their description.

Note: The name space for the metrics follows the format /CF-SERVICE-NAME/NODE-TYPE/METRIC-NAME.

| Name Space | Unit | Description |
|---|---|---|
| /p-rabbitmq/haproxy/health/connections | count | The total number of concurrent front-end connections to the server |
| /p-rabbitmq/haproxy/backend/qsize/amqp | size | The total size of the AMQP queue on the server |
| /p-rabbitmq/haproxy/backend/retries/amqp | count | The number of AMQP retries to the server |
| /p-rabbitmq/haproxy/backend/ctime/amqp | time | The total time to establish the TCP AMQP connection to the server |