JMX Bridge v1.8

Using JMX Bridge


JMX Bridge is a Java Management Extensions (JMX) tool for Elastic Runtime. To help you monitor your installation and assist in troubleshooting, JMX Bridge collects and exposes system data from Cloud Foundry components via a JMX endpoint.

Note: If you use JMX Bridge v1.8 with PCF v1.10, see the recommended Key Performance Indicators.

Cloud Controller Metrics

JMX Bridge reports the number of Cloud Controller API requests completed and the requests sent but not completed.

The number of requests sent but not completed represents the pending activity in your system and is typically higher under load. This number varies over time, and its range depends on the specifics of your environment, such as hardware, OS, processor speed, and load. In any given environment, though, you can establish a typical range of values and a typical maximum for this number.

Use the Cloud Controller metrics to ensure that the Cloud Controller is processing API requests in a timely manner. If the pending activity in your system increases significantly past the typical maximum and stays at an elevated level, Cloud Controller requests may be failing and additional troubleshooting may be necessary.
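
The check described above can be sketched as follows. This is a minimal illustration, not part of JMX Bridge: it assumes you already collect periodic samples of cc.requests.outstanding from the JMX endpoint, and the function name, the `typical_max` value, and the `min_consecutive` window are hypothetical choices that you would tune for your own environment.

```python
from typing import Sequence


def is_sustained_elevation(
    samples: Sequence[int],
    typical_max: int,
    min_consecutive: int = 5,
) -> bool:
    """Return True if the most recent `min_consecutive` samples all exceed `typical_max`.

    `samples` holds pending-request counts, ordered oldest to newest.
    `typical_max` is the operator-chosen typical maximum (an assumption here).
    """
    if min_consecutive <= 0:
        raise ValueError("min_consecutive must be positive")
    if len(samples) < min_consecutive:
        # Not enough history to call the elevation sustained.
        return False
    recent = samples[-min_consecutive:]
    return all(value > typical_max for value in recent)
```

Requiring several consecutive elevated samples, rather than a single one, helps separate a short burst under load from pending activity that stays elevated.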

The following table shows the name of the Cloud Controller metric, what the metric represents, and the metric type (data type).

| Metric Name | Definition | Metric Type (Data Type) |
|---|---|---|
| cc.requests.completed | Number of Cloud Controller API requests completed since this instance of Cloud Controller started | Counter (Integer) |
| cc.requests.outstanding | Number of Cloud Controller API requests made but not completed since this instance of Cloud Controller started | Counter (Integer) |

See the Cloud Controller topic for more information about the Cloud Controller.

Router Metrics

JMX Bridge reports the number of sent requests and the number of completed requests for each Cloud Foundry component.

The difference between these two metrics is the number of requests made to a component but not completed, and represents the pending activity for that component. The number for each component can vary over time, and is typically higher under load. In any given environment, though, you can establish a typical range of values and maximum for this number for each component.

Use these metrics to ensure that the Router is passing requests to other components in a timely manner. If the pending activity for a particular component increases significantly past the typical maximum and stays at an elevated level, additional troubleshooting of that component may be necessary. If the pending activity for most or all components increases significantly and stays at elevated values, troubleshooting of the Router may be necessary.
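
The per-component difference can be sketched as follows. This is an illustration under stated assumptions: the counter values have already been read from the JMX endpoint into plain dictionaries, the function name is hypothetical, and the completed count for a component is taken to be the sum of its response counters across all status families.

```python
from typing import Dict, Mapping

# Status families that the responses counter is split by.
STATUS_FAMILIES = ("2xx", "3xx", "4xx", "5xx", "other")


def pending_by_component(
    requests: Mapping[str, int],
    responses: Mapping[str, Mapping[str, int]],
) -> Dict[str, int]:
    """Compute requests received minus requests completed for each component.

    `requests` maps component name to its requests counter.
    `responses` maps component name to {status family: responses counter}.
    """
    pending: Dict[str, int] = {}
    for component, received in requests.items():
        by_status = responses.get(component, {})
        # Assumption: completed requests are the sum over all status families.
        completed = sum(by_status.get(family, 0) for family in STATUS_FAMILIES)
        pending[component] = received - completed
    return pending
```

Comparing the result for each component against that component's typical maximum shows whether one component or most components are falling behind.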

The following table shows the name of the Router metric, what the metric represents, and the metric type (data type).

| Metric Name | Definition | Metric Type (Data Type) |
|---|---|---|
| gorouter.requests [component=c] | Number of requests the router has received for component c since this instance of the router has started. c can be CloudController or route-emitter. | Counter (Integer) |
| gorouter.responses [status=s,component=c] | Number of requests completed by component c since this instance of the router has started. c can be CloudController or route-emitter. s is the HTTP status family: 2xx, 3xx, 4xx, 5xx, and other. | Counter (Integer) |

See the Router topic for more information about the Router.

Diego Metrics

Pivotal JMX Bridge reports metrics from the Diego cells and from the Diego Bulletin Board System (BBS). The following tables show the name of each Diego metric, what the metric represents, and the metric type (data type).

For general information about Diego, see the Diego Architecture topic.

Diego Cell Metrics

Pivotal JMX Bridge reports the following metrics for each Diego cell. If you have multiple cells, JMX Bridge reports metrics for each cell individually. The metrics are not summed across cells.

Use these metrics to determine the size of your deployment or when to scale up a deployment, and to track the status of Long Running Processes (LRP) in the Diego life cycle.

| Metric Name | Definition | Metric Type (Data Type) |
|---|---|---|
| rep.CapacityTotalMemory | Total amount of memory available for this cell to allocate to containers | Gauge (Float) |
| rep.CapacityRemainingMemory | Remaining amount of memory available for this cell to allocate to containers | Gauge (Float) |
| rep.CapacityTotalDisk | Total amount of disk available for this cell to allocate to containers | Gauge (Float) |
| rep.CapacityRemainingDisk | Remaining amount of disk available for this cell to allocate to containers | Gauge (Float) |
| rep.ContainerCount | Number of containers hosted on the cell | Gauge (Integer) |
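
One way to turn the capacity gauges above into a scale-up signal is sketched below. It assumes the per-cell gauge values have already been read from the JMX endpoint into dictionaries keyed by metric name. The function names and the 0.8 threshold are hypothetical and are not recommendations from this documentation.

```python
from typing import List, Mapping


def used_fraction(total: float, remaining: float) -> float:
    """Return the allocated share of a resource as a value between 0 and 1."""
    if total <= 0:
        raise ValueError("total capacity must be positive")
    return (total - remaining) / total


def cells_near_capacity(
    cells: Mapping[str, Mapping[str, float]],
    threshold: float = 0.8,
) -> List[str]:
    """List the cells whose memory or disk allocation is at or above `threshold`.

    Each cell is evaluated on its own, because the metrics are reported
    per cell and are not summed across cells.
    """
    flagged = []
    for name, gauges in cells.items():
        memory = used_fraction(
            gauges["rep.CapacityTotalMemory"], gauges["rep.CapacityRemainingMemory"]
        )
        disk = used_fraction(
            gauges["rep.CapacityTotalDisk"], gauges["rep.CapacityRemainingDisk"]
        )
        if memory >= threshold or disk >= threshold:
            flagged.append(name)
    return sorted(flagged)
```

Working with fractions rather than raw values keeps the check independent of the units in which the gauges are reported.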

Diego BBS Metrics

Pivotal JMX Bridge reports the following metrics from the Diego BBS. These metrics are deployment-wide. Use them to inspect the state of the apps running on the deployment as a whole.

| Metric Name | Definition | Metric Type (Data Type) |
|---|---|---|
| bbs.CrashedActualLRPs | Total number of LRP instances that have crashed | Gauge (Integer) |
| bbs.LRPsRunning | Total number of LRP instances that are running on cells | Gauge (Integer) |
| bbs.LRPsUnclaimed | Total number of LRP instances that have not yet been claimed by a cell | Gauge (Integer) |
| bbs.LRPsClaimed | Total number of LRP instances that have been claimed by some cell | Gauge (Integer) |
| bbs.LRPsDesired | Total number of LRP instances desired across all LRPs | Gauge (Integer) |
| bbs.LRPsExtra | Total number of LRP instances that are no longer desired but still have a BBS record | Gauge (Integer) |
| bbs.LRPsMissing | Total number of LRP instances that are desired but have no record in the BBS | Gauge (Integer) |
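
A quick deployment-wide inspection could look like the sketch below. It assumes the BBS gauges above have already been read into a dictionary keyed by metric name. The function name and the choice of which gauges to watch are hypothetical; a nonzero value is only a prompt to look closer, not proof of a fault.

```python
from typing import Dict, Mapping

# Gauges whose nonzero value may merit a closer look (an illustrative choice).
WATCHED_GAUGES = (
    "bbs.CrashedActualLRPs",
    "bbs.LRPsUnclaimed",
    "bbs.LRPsMissing",
    "bbs.LRPsExtra",
)


def nonzero_lrp_gauges(metrics: Mapping[str, int]) -> Dict[str, int]:
    """Return the watched BBS gauges that currently report a value above zero."""
    return {
        name: metrics[name]
        for name in WATCHED_GAUGES
        if metrics.get(name, 0) > 0
    }
```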

Virtual Machine Metrics

JMX Bridge reports data for each virtual machine (VM) in a deployment. Use these metrics to monitor the health of your VMs.

The following table shows the name of the Virtual Machine metric, what the metric represents, and the metric type (data type).

| Metric Name | Definition | Metric Type (Data Type) |
|---|---|---|
| system.cpu.sys | Amount of CPU spent in system processes | Gauge (Float) |
| system.cpu.user | Amount of CPU spent in user processes | Gauge (Float) |
| system.cpu.wait | Amount of CPU spent in waiting processes | Gauge (Float) |
| system.disk.ephemeral.percent | Percentage of ephemeral disk used on the VM | Gauge (Float, 0-100) |
| system.disk.ephemeral.inode.percent | Percentage of inodes consumed by the ephemeral disk | Gauge (Float, 0-100) |
| system.disk.persistent.percent | Percentage of persistent disk used on the VM | Gauge (Float, 0-100) |
| system.disk.persistent.inode.percent | Percentage of inodes consumed by the persistent disk | Gauge (Float, 0-100) |
| system.disk.system.percent | Percentage of system disk used on the VM | Gauge (Float, 0-100) |
| system.healthy | Indicates whether a VM system is healthy. `1` means the system is healthy, and `0` means the system is not healthy | Gauge (Float, 0-1) |
| system.load.1m | Amount of load the system is under, averaged over one minute | Gauge (Float) |
| system.mem.percent | Percentage of memory used on the VM | Gauge (Float) |
| system.swap.kb | Amount of swap used on the VM in KB | Gauge (Float) |
| system.swap.percent | Percentage of swap used on the VM | Gauge (Float, 0-100) |
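
A basic VM health check over these gauges might look like the sketch below. It assumes the gauges for one VM have already been read into a dictionary keyed by metric name. The function name and the 90.0 threshold are hypothetical, and only the gauges that the table marks as 0-100 are compared against the threshold.

```python
from typing import List, Mapping

# Gauges that the table above marks as percentages on a 0-100 scale.
PERCENT_METRICS = (
    "system.disk.ephemeral.percent",
    "system.disk.ephemeral.inode.percent",
    "system.disk.persistent.percent",
    "system.disk.persistent.inode.percent",
    "system.disk.system.percent",
    "system.swap.percent",
)


def vm_alerts(metrics: Mapping[str, float], percent_threshold: float = 90.0) -> List[str]:
    """Return human-readable alerts for one VM's gauges.

    Flags system.healthy when it reports 0, and any percentage gauge
    at or above `percent_threshold`. Gauges that are absent are skipped.
    """
    alerts: List[str] = []
    if metrics.get("system.healthy") == 0:
        alerts.append("system.healthy is 0")
    for name in PERCENT_METRICS:
        value = metrics.get(name)
        if value is not None and value >= percent_threshold:
            alerts.append(f"{name} is {value:.1f}")
    return alerts
```
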