This topic describes implementation of high availability for Altoros Heartbeat for Pivotal Cloud Foundry (PCF) using an external load balancer.
Setup and configuration of an external load balancer is required to implement high availability of the installation. Any highly available TCP and UDP load balancer is acceptable.
Heartbeat uses the following two ports to accept metrics:
- TCP 2003
- UDP 8125
TCP 2003 is the load balancer port for the Graphite endpoint. It is used to receive component metrics from PCF and external services sent by the collectd agent. To implement high availability in your Altoros Heartbeat installation, configure your load balancer to accept these TCP connections and balance them across Heartbeat back ends. For information about getting back end IP addresses for your configuration, see Find Back End IPs below. If high availability is not required, configure your collectd agent to send metrics to any back end as described in Installing collectd Add-On for PCF.
UDP 8125 is the load balancer port for the Statsd endpoint. It is used to receive metrics from PCF deployed applications, both built-in (e.g. JMX) and custom (sent to Statsd from an app).
Because the UDP port is used to receive metrics,
your load balancer may not support health checks.
In this case, use TCP port 8126 to test server availability.
Besides simple configuration, advanced TCP health check is available.
You can pass the
health command to the TCP port and analyze the output, which can be
Similar to Graphite metrics, metrics from StatsD are sent to the VMs with
heartbeat-nozzle in their names.
For information about getting back end IP addresses for these VMs,
see Find Nozzle IPs below.
These IP addresses can be used for configuring the load balancer.
To find the IP address for your Heartbeat back end:
- Open the Status tab.
- Copy the value in the IPS column for the Heartbeat PCF Monitoring frontend job.
To find the IP address for your Heartbeat nozzle:
- Open the Status tab.
- Copy the value in the IPS column for the Heartbeat PCF Monitoring nozzle job.
The default installation can be scaled up to improve performance. Scaling up is achieved by increasing the number of back end and front end jobs. In fact, scaling up can be done without a load balancer and high availability. However, in this case, back end operation and the metrics flow can be compromised.
Note: The default installation of the Altoros Heartbeat for PCF tile is highly performant and has large capacity for receiving and processing metrics. Lack of performance might not be related to insufficient number of instances. For assistance with detecting and eliminating performance problems with your installation, contact Contact Altoros.
To increase the number of back end and front end jobs:
From the Settings tab, click Resource Config.
In the Instances column, select the required number of front-end and back-end instances.
Navigate to the Ops Manager Installation Dashboard and click Apply Changes.
The installation can be scaled down to adjust to decreased consumption of resources. To decrease the number of back end and front end jobs, follow the steps described in Scale Up above.
Although scaling down front end jobs is unrestricted, with back end jobs, avoid selecting fewer than two instances to prevent metrics loss. Scaling down can lead to a loss in high availability for some previously received metrics, because only one copy thereof is retained. New values of these metrics have multiple copies according to a new configuration. Scale down carefully. If you need assistance, contact Altoros support.