Monitoring Pivotal Cloud Foundry

This guide describes how Pivotal Cloud Foundry (PCF) operators can monitor their deployments.

In This Guide

This guide includes the following topics:

  • Key Performance Indicators: A list of Key Performance Indicators (KPIs) that operators may want to monitor in their PCF deployment to help ensure that it is in a good operational state.
  • Key Capacity Scaling Indicators: A list of capacity scaling indicators that operators may want to monitor to determine when they need to scale their PCF deployments.
  • Selecting and Configuring a Monitoring System: Guidance for setting up PCF with monitoring platforms to continuously monitor component metrics and trigger health alerts.

For information about logging and metrics in PCF and about monitoring of services for PCF, see Additional Resources below.

KPI Changes from PCF v1.12 to v2.0

The following list highlights new and changed KPIs in PCF v2.0:

  • Adapter Loss Rate (modified KPI): The origin name of these metrics has changed from scalablesyslog to cf-syslog-drain.
  • Syslog Drain Bindings Count (modified KPI): The origin name of this metric has changed from scalablesyslog to cf-syslog-drain. As of May 2018, the recommended scaling threshold for this metric was modified downward.
  • Cloud Controller and Diego in Sync (modified KPI): As of May 2018, the recommended alerting threshold for this metric was modified.
  • bbs.LockHeld (new KPI): New capability to monitor Diego active locks at the BBS component level.
  • auctioneer.LockHeld (new KPI): New capability to monitor Diego active locks at the Auctioneer component level.
  • uaa.requests.global.completed.count (new KPI): New recommendation for monitoring UAA throughput.
  • gorouter.latency.uaa (new KPI): New recommendation for monitoring UAA request latency.
  • uaa.server.inflight.count (new KPI): New recommendation for monitoring in-flight UAA requests.
  • system.cpu.user of the UAA VM(s) (new KSI): CPU utilization of the UAA VM(s) can indicate the need to scale UAA instances.
  • Number of Route Registration Messages Sent and Received (new KPI): New capability to monitor for issues in the control plane responsible for updating the routers with changes to the routing table. This recommendation was also added to the PCF 1.12, 1.11, and 1.10 versions of the Key Performance Indicators documentation.
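
Because the origin name of the syslog drain metrics changed from scalablesyslog to cf-syslog-drain, alert rules and dashboards that were written against PCF v1.12 may need their metric names updated. The following is a minimal sketch of one way to do that, not an official tool. It assumes metric names are stored as "<origin>.<metric>" strings; the helper name rename_origin and the metric name example_metric are hypothetical and used only for illustration.

```python
from typing import Mapping

# Origin renames between PCF v1.12 and v2.0, as described in this guide.
ORIGIN_RENAMES = {"scalablesyslog": "cf-syslog-drain"}


def rename_origin(metric_name: str,
                  renames: Mapping[str, str] = ORIGIN_RENAMES) -> str:
    """Return metric_name with its origin prefix updated if it was renamed.

    Assumption: names look like "<origin>.<metric>". This is an
    illustrative convention, not something PCF mandates.
    """
    origin, sep, rest = metric_name.partition(".")
    # Only rewrite when there is a metric part and the origin was renamed.
    if sep and origin in renames:
        return f"{renames[origin]}.{rest}"
    return metric_name


if __name__ == "__main__":
    # "example_metric" is a placeholder, not a real PCF metric name.
    for name in ["scalablesyslog.example_metric", "bbs.LockHeld"]:
        print(name, "->", rename_origin(name))
```

Names whose origin did not change, such as bbs.LockHeld, pass through unmodified, so the helper can be applied to a whole rule set without filtering it first.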

Additional Resources

For information about logging and metrics in PCF, see the following topics:

  • Configuring System Logging in PAS: This topic explains how to configure the PCF Loggregator system to scale its maximum throughput and to forward logs to an external aggregator service.
  • Logging and Metrics: A guide to Loggregator, the system which aggregates and streams logs and metrics from user apps and system components in PAS.

For information about KPIs and metrics for PCF services, see the following topics:
