Installing and Configuring Altoros Heartbeat for PCF

This topic describes how to install and configure Altoros Heartbeat for Pivotal Cloud Foundry (PCF).

Note: If you update the Altoros Heartbeat for PCF tile, the collectd BOSH add-on must be updated as well.

Install and Configure Altoros Heartbeat for PCF

  1. Download the product file and the collectd add-on from Pivotal Network.

  2. Navigate to the Ops Manager Installation Dashboard and click Import a Product to upload the product file.

  3. Under the Import a Product button, click + next to the version number of Altoros Heartbeat for PCF. This adds the tile to your staging area.

  4. Click the newly added Altoros Heartbeat for PCF tile.

  5. Click Save.

Configure AZ and Network Assignments

  1. In Pivotal Ops Manager, click the Altoros Heartbeat for PCF tile to open its configuration options.

  2. From the Settings tab, click Assign AZs and Networks to configure the network and availability zones.

  3. Click Save.

Configure Alerts

  1. From the Settings tab, click Heartbeat Alert Configuration.

  2. Specify the external server for uploading Grafana images so that they can be attached to Slack and email alert notifications:

    1. If you are using AWS S3, select the S3 checkbox and provide your Amazon S3 Bucket URL, Amazon API Access Key ID, and Amazon API Secret Key.
    2. If you are using WebDAV, select the webdav checkbox and provide your WebDAV URL, the username and password for WebDAV (basic auth), and, optionally, the URL to send to users in notifications. The URL is appended to the resulting name of the uploaded file.
  3. To define email server settings, select the SMTP checkbox and complete the following fields:

    • Host:port: The hostname and port of the outgoing SMTP server
    • Username: Your SMTP server username
    • Password: Your SMTP server password
    • Disable SSL certificate verification: If you are using self-signed certificates or certificates that are not issued by a trusted CA, disable SSL verification.
    • Address used when sending out emails: The address from which emails will be sent
  4. Click Save.

Configure Graphite Nozzle

  1. From the Settings tab, click Heartbeat Graphite Nozzle Configuration.

  2. If you are using several apps domains in PCF, provide a comma-separated list of the additional domains. You do not need to include the apps domain already specified in Pivotal Elastic Runtime. HttpStartStop metrics from Firehose with the entered domain names will not be dropped.

  3. In the Interval of updating Nozzle’s cache for apps metadata field, set how frequently the information about all the deployed applications is updated. The frequency of updates affects the performance of the CC API (see note below).

  4. In the Interval (in seconds) to flush metrics from Firehose field, set the interval (in seconds) at which metrics from Firehose are flushed to the metrics database. The interval value should correlate with the values entered in all the “Storage schema for …” fields (see section Heartbeat Metrics Storage Configuration) for Firehose-related metrics. Adjust this value after you configure the schemas.

  5. Click Save.
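The relationship between the flush interval and the storage schemas can be sanity-checked with a short script. The sketch below is illustrative only. It assumes that the schema fields hold Graphite-style retention strings such as 10s:6h,1m:7d (an assumption; verify the format your tile version expects) and that "correlate" means the flush interval equals the precision of the first retention. All function names are hypothetical.

```python
import re

# Seconds per unit suffix in Graphite-style retention strings (assumed format).
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800, "y": 31536000}


def to_seconds(token: str) -> int:
    """Convert a duration token such as '10s' or '1m' to seconds."""
    match = re.fullmatch(r"(\d+)([smhdwy])", token.strip())
    if match is None:
        raise ValueError(f"unrecognized duration: {token!r}")
    return int(match.group(1)) * UNIT_SECONDS[match.group(2)]


def finest_precision(schema: str) -> int:
    """Return the precision, in seconds, of the first retention in the schema."""
    first_retention = schema.split(",")[0]
    precision, _retention_period = first_retention.split(":")
    return to_seconds(precision)


def flush_interval_matches(flush_interval_seconds: int, schema: str) -> bool:
    """Check that the flush interval equals the finest schema precision."""
    return flush_interval_seconds == finest_precision(schema)


# Hypothetical values: a 10-second flush interval against a 10s:6h,1m:7d schema.
print(flush_interval_matches(10, "10s:6h,1m:7d"))
```

If the check returns False, either the flush interval or the schema would need to change so that each stored data point receives one flushed value.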

To choose the right value for the Interval of updating Nozzle’s cache for apps metadata field, you need to know how long it takes to extract application data from the CC API. The interval must be longer than that time so that data updates do not overlap. After setup, you can find the start and end times of a cache update in the Heartbeat logs:

$ bosh2 -d altoros-heartbeat-... vms
$ bosh2 -d altoros-heartbeat-... ssh heartbeat-nozzle/...
$ less /var/vcap/sys/log/graphite-nozzle/graphite-nozzle_ctl.stdout.log
INFO: 2018/04/24 12:33:50 Start filling app/space/org cache.
INFO: 2018/04/24 12:34:56 Done filling cache! Found [1051] Apps
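
The duration of a cache update can be read off the two log lines above. The following sketch, with hypothetical function names, parses the timestamps and computes the elapsed time. It assumes the log lines keep the date and time layout shown in the example.

```python
import re
from datetime import datetime

# Matches "2018/04/24 12:33:50"; tolerates asterisks used as emphasis around the time.
TIMESTAMP = re.compile(r"(\d{4}/\d{2}/\d{2}) \**(\d{2}:\d{2}:\d{2})")


def parse_timestamp(line: str) -> datetime:
    """Extract the timestamp from a graphite-nozzle log line."""
    match = TIMESTAMP.search(line)
    if match is None:
        raise ValueError(f"no timestamp found in: {line!r}")
    return datetime.strptime(f"{match.group(1)} {match.group(2)}", "%Y/%m/%d %H:%M:%S")


def cache_fill_seconds(start_line: str, done_line: str) -> float:
    """Return the seconds elapsed between the start and done log lines."""
    return (parse_timestamp(done_line) - parse_timestamp(start_line)).total_seconds()


start = "INFO: 2018/04/24 12:33:50 Start filling app/space/org cache."
done = "INFO: 2018/04/24 12:34:56 Done filling cache! Found [1051] Apps"
print(cache_fill_seconds(start, done))  # 66.0
```

In this example the update took 66 seconds, so the cache update interval should be set comfortably above that value.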

Configure Grafana

  1. From the Settings tab, click Heartbeat Grafana Configuration.

  2. Provide the following information:

    1. The organization and space within PCF granting users admin access rights to Altoros Heartbeat.

      In Altoros Heartbeat, the following users have access to the Admin View:
      • Administrator accounts within PCF
      • User accounts within PCF that are assigned to the org and space specified above
      Cloud Foundry user rights map to Grafana rights in the following way:
      PCF space          Grafana organisation
      Space Manager      Admin
      Space Developer    Editor
      Space Auditor      Viewer


    2. The email domain to append to a user’s login if the login is not a real email address. The resulting email address is used for alerts, and the user can change it later. This option keeps user email addresses consistent.
    3. Select the Disable SSL certificate verification for UAA and CC API endpoints checkbox if the certificates you are using are self-signed or come from an untrusted CA. Grafana uses this option when making requests to UAA and CC API during user authentication.
  3. Click Save.
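
The role mapping and the email domain rule described above can be summarized in a few lines. This is a minimal sketch with hypothetical names, not Heartbeat's actual code. In particular, it assumes that a login counts as an email address when it contains an @ sign, which may differ from the check Heartbeat performs.

```python
# Mapping of PCF space roles to Grafana roles, as listed in the table above.
ROLE_MAP = {
    "Space Manager": "Admin",
    "Space Developer": "Editor",
    "Space Auditor": "Viewer",
}


def grafana_role(pcf_space_role: str) -> str:
    """Return the Grafana role that corresponds to a PCF space role."""
    return ROLE_MAP[pcf_space_role]


def alert_email(login: str, email_domain: str) -> str:
    """Return the address used for alerts.

    Assumption: a login that contains '@' is already an email address.
    """
    if "@" in login:
        return login
    return f"{login}@{email_domain}"


print(grafana_role("Space Developer"))      # Editor
print(alert_email("jdoe", "example.com"))   # jdoe@example.com
```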

Configure Metrics Storage

  1. In the Settings tab, click Heartbeat Metrics Storage Configuration.

    1. Definitions of a storage schema for different metric types:

      Note: At the moment, the storage schema fields can only be configured during the initial installation. For this reason, read the Altoros Heartbeat Scaling and Performance Tuning section first.

      1. The values entered in the Storage schema for metrics coming from statsd, e.g. Firehose/apps metrics and Storage schema for BOSH metrics coming from Firehose fields should correlate with the value entered in the Interval (in seconds) to flush metrics from Firehose field.
      2. The value entered in the Storage schema for metrics coming from collectd, e.g. VM/services metrics field should correlate with the interval value entered for the collectd BOSH add-on.
      3. The values entered in the Storage schema for MySQL Tile v1 metrics, Storage schema for MySQL Tile v2 metrics, Storage schema for RabbitMQ metrics, and Storage schema for Redis metrics fields should correlate with the value entered in the Interval (in seconds) to flush metrics from Firehose field. The values of these fields should also correlate with the value entered in the Metrics Polling Interval field in the corresponding tiles.
    2. In the Replication factor automatic calculation field, a tip explains that you can click Disable to set up replication manually. If you click Enable, the replication factor is calculated as Math.sqrt(num_of_backends).ceil, that is, the square root of the number of back ends, rounded up.

      Note: The replication factor regulates the redundancy level by determining which servers a particular metric is written to. For large environments where the number of Heartbeat back ends exceeds 9, it is recommended to set the value manually. This prevents overloading and saves disk space.

      1. If you click Disable, you can set the Replication factor for metrics field and see how many servers you can lose without losing data (metrics).
    3. The Enable removal of inactive metrics option helps prevent the metric tree from growing unnecessarily, which results in overloaded dashboards, cluttered drop-down menus, and extra disk usage. To configure it:
      1. Select the Enable checkbox to activate removal of inactive metrics from the metrics storage.
      2. Specify the period (in days) after which metrics that have not received any new values are considered inactive.
    4. vm.dirty_ratio: The percentage of total available memory (free pages plus reclaimable pages) at which a process that is generating disk writes starts writing out dirty data itself. The total available memory is not equal to total system memory.
    5. vm.dirty_background_ratio: The percentage of total available memory at which the background kernel flusher threads start writing out dirty data.
    6. vm.dirty_expire_centisecs: The age at which dirty data becomes eligible for writeout by the kernel flusher threads, expressed in hundredths of a second. Data that has been dirty in memory for longer than this interval is written out the next time a flusher thread wakes up.
  2. Click Save.
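
The automatic replication factor described above is the square root of the number of back ends, rounded up. The sketch below reproduces that calculation for a few cluster sizes. The tolerable_losses helper is an assumption: it presumes that each metric is stored on as many servers as the replication factor, so one fewer than that number can be lost without losing data. The function names are hypothetical.

```python
import math


def replication_factor(num_of_backends: int) -> int:
    """Mirror Math.sqrt(num_of_backends).ceil from the tile description."""
    if num_of_backends < 1:
        raise ValueError("at least one back end is required")
    return math.ceil(math.sqrt(num_of_backends))


def tolerable_losses(factor: int) -> int:
    """Assumption: with `factor` copies of each metric, factor - 1 servers may fail."""
    return factor - 1


for backends in (1, 4, 5, 9, 10, 16):
    factor = replication_factor(backends)
    print(backends, factor, tolerable_losses(factor))
```

Ten back ends, for example, yield a factor of 4, which shows how storage use grows once the number of back ends exceeds 9.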

Configure BOSH HM Forwarder

  1. In the Settings tab, click BOSH HM Forwarder configuration.

  2. Specify the default port BOSH HM Forwarder will use to forward the metrics collected from BOSH Agents.

  3. Click Save.

Configure Integration with Altoros Log Search for PCF

To enable integration with the Log Search for PCF tile from Altoros:

  1. In the Settings tab, click Integration with Altoros Log Search for PCF.

  2. Provide the Elasticsearch Master IP:

    1. Navigate to the Ops Manager Installation Dashboard and click the Altoros Log Search for PCF tile.
    2. In the Status tab, copy the first Elasticsearch Master IP from the list.
    3. Paste the copied IP into the Elasticsearch Master IP field.
  3. Click Save.

For more information about this feature, see Altoros Log Search for PCF Integration.

Configure External Services Monitoring

Note: Before configuring External Services Monitoring, make sure you have installed the latest version of the collectd BOSH add-on.

  1. In the Settings tab, click External Services Monitoring Configuration.

  2. Click Add next to the services you want to monitor and complete the following fields:

    1. For MySQL Servers:
      • Host: the IP address or hostname of the MySQL server
      • Port: the port number on which the server listens
      • User and Password: the username and password used to access the MySQL server
    2. For PostgreSQL Databases:
      • Host: the IP address or hostname of the database
      • Port: the port number on which the PostgreSQL database listens
      • User and Password: the username and password used to access the database
      • The Enable disk IO by table checkbox, when selected, measures the disk I/O usage of each table.

        Note: Selecting Enable disk IO by table significantly escalates the number of generated metrics, so, by default, the checkbox is disabled.

    3. For MongoDB Servers:
      • Host: the IP address or hostname of the MongoDB server
      • Port: the port number on which the server listens
      • (Optional) User and Password: the username and password used to access the server
      • A space-separated list of MongoDB databases: names of the monitored databases. The admin database is added automatically.
    4. For Memcached Servers:
      • Host: the IP address or hostname of the Memcached server
      • Port: the port number on which the server listens
    5. For Redis Servers:
      • Host: the IP address or hostname of the Redis server
      • Port: the port number on which the server listens
      • (Optional) Password: the password used to access the server
    6. For RabbitMQ Servers:
      • Host: the IP address or hostname of the RabbitMQ server
      • Port: the port number on which the server listens
      • User and Password: the username and password used to access the server
  3. Click Save.

Note: Currently, only one instance of the MongoDB, MySQL, or RabbitMQ services can be monitored.

For more information about third-party services monitoring, see Third-Party Services Monitoring.

Configure MySQL Proxy

  1. In the Settings tab, click MySQL Proxy Configuration.

  2. Complete the following fields:

    1. Load Balancer Unhealthy Threshold (Shutdown Delay): If you are using a load balancer above the proxies, enter your load balancer’s total unhealthy threshold time here, in seconds. For example, if your load balancer polls every 30 seconds and fails over immediately upon failure, set this property to 30 seconds.
    2. Load Balancer Healthy Threshold (Start Delay): If you are using a load balancer above the proxies, enter your load balancer’s total healthy threshold time here, in seconds. For example, if your load balancer polls every 30 seconds and requires three successful attempts, set this property to 90 seconds.
    3. Consul Service Name: If consul_enabled is true, switchboard registers with Consul using this name.
  3. Click Save.
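
Both threshold examples above follow the same arithmetic: the polling interval multiplied by the number of consecutive checks the load balancer requires before changing state. A minimal sketch with a hypothetical function name:

```python
def threshold_seconds(poll_interval_seconds: int, required_checks: int) -> int:
    """Total time the load balancer needs before it changes a proxy's state."""
    if poll_interval_seconds <= 0 or required_checks <= 0:
        raise ValueError("interval and check count must be positive")
    return poll_interval_seconds * required_checks


# Unhealthy threshold: polls every 30 seconds, fails over on the first failure.
print(threshold_seconds(30, 1))  # 30

# Healthy threshold: polls every 30 seconds, requires three successful attempts.
print(threshold_seconds(30, 3))  # 90
```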

Configure MySQL Server

  1. In the Settings tab, click MySQL Server Configuration.

  2. If you are controlling access to the database from outside of PCF and the access is limited by a hostname, clear the Disable reverse DNS lookups checkbox.

  3. Complete the following fields:

    1. Read-Only User Password: The password to automatically provision a roadmin user with the right to read all databases in the system, but permission to write to none. Leave blank to disable the read-only user.
    2. MySQL Start Timeout: The minimum amount of time necessary for the MySQL process to start, in seconds. For advanced configuration, read the MySQL documentation before modifying.
  4. Click Save.

Configure Load Balancer for High Availability

  1. From the Settings tab, click Load Balancer Configuration for HA.

  2. Complete the following fields:

    1. Load Balancer host for graphite endpoint: IP address or hostname of your external load balancer configured for Heartbeat. For information about getting back end IP addresses for your configuration, see Find Back-End IPs. Leave blank if high availability is not required.
    2. Load Balancer port for graphite endpoint: port number for the load balancer configured to accept metrics from the collectd add-on.
    3. Load Balancer host for statsd endpoint: IP address or hostname of the load balancer configured to accept application metrics. For information about getting back end IP addresses for your configuration, see Find Back-End IPs. The properties are separate, so two separate load balancers for collectd and application metrics can be used. Leave blank if high availability is not required.
    4. Load Balancer port for statsd endpoint: port number for the external load balancer configured to accept statsd connections. Note that you need to provide the port accepting metrics, not the port accepting health checks.
  3. Click Save.
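
One way to confirm that the configured ports accept metrics rather than health checks is to send a test metric through each load balancer. The sketch below assumes that the graphite endpoint speaks the standard Graphite plaintext protocol over TCP and that the statsd endpoint accepts standard statsd datagrams over UDP. These are assumptions about your setup, and the metric name, hosts, and ports are hypothetical.

```python
import socket
import time


def graphite_line(path: str, value, timestamp=None) -> bytes:
    """Format one metric in the Graphite plaintext protocol: 'path value timestamp'."""
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {timestamp}\n".encode("ascii")


def statsd_datagram(name: str, value, metric_type: str = "g") -> bytes:
    """Format one statsd metric: 'name:value|type' ('g' is a gauge)."""
    return f"{name}:{value}|{metric_type}".encode("ascii")


def send_graphite(host: str, port: int, payload: bytes, timeout: float = 5.0) -> None:
    """Send a plaintext metric over TCP; raises OSError if the port is unreachable."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(payload)


def send_statsd(host: str, port: int, payload: bytes) -> None:
    """Send a statsd datagram over UDP (no delivery confirmation)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))


# Example payloads; replace the metric name with one suitable for your environment.
print(graphite_line("heartbeat.lb_check", 1, 1524573230))
print(statsd_datagram("heartbeat.lb_check", 1))
```

To run the check, call send_graphite or send_statsd with the host and port entered in the fields above, then look for the test metric in Grafana. Because UDP gives no delivery confirmation, only the appearance of the metric confirms that the statsd path works.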

For more information about this feature, see High Availability.

Configure Errands

  1. In the Settings tab, click Errands.

  2. Select options for the following post-deploy errands:

    1. Create Heartbeat UAA Client: Creates a UAA client for Heartbeat. The recommended option is When Changed.
    2. Heartbeat Grafana Bootstrap: Configures/updates Grafana for admin view; updates all service instances for Heartbeat (developer views). The recommended option is On.
    3. Broker Registrar: Registers Heartbeat’s service broker. The recommended option is When Changed.
  3. Select options for the following pre-delete errands:

    1. Delete Heartbeat UAA Client: Deletes the UAA Client for Heartbeat. The recommended option is On.
    2. Broker Deregistrar: Deregisters Heartbeat’s service broker. The recommended option is On.
  4. Click Save.

Configure Resource Config

Check out Altoros Heartbeat Scaling and Performance Tuning to find Resource Config examples for different environments.

If you are using the gp2 volume type with Amazon EC2 instances and expect an intensive influx of metrics, Altoros recommends provisioning at least 300 GB of persistent disk volume to the Heartbeat PCF Monitoring back end VMs to ensure sustained IOPS performance.
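
The effect of volume size on sustained performance can be estimated from the gp2 baseline formula published by AWS: 3 IOPS per GiB, with a floor of 100 IOPS and a ceiling of 16,000 IOPS. The sketch below applies that formula. The figures come from AWS documentation rather than from Heartbeat, so verify them against the current AWS documentation. The function name is hypothetical, and the sizes are treated as GiB.

```python
def gp2_baseline_iops(size_gib: int) -> int:
    """Estimate gp2 baseline IOPS: 3 per GiB, clamped to the range 100..16000."""
    if size_gib <= 0:
        raise ValueError("volume size must be positive")
    return max(100, min(16000, 3 * size_gib))


print(gp2_baseline_iops(100))  # 300
print(gp2_baseline_iops(300))  # 900
```

Under these assumptions, moving from the default 100 GB to 300 GB triples the baseline that the volume can sustain once its burst credits are spent.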

To increase the default disk volume value, do the following:

  1. In the Settings tab, click Resource Config.

  2. In the Persistent disk type column next to the Heartbeat PCF Monitoring back end job, change the default value from 100 GB to 300 GB.

  3. Click Save.

  4. Return to the Ops Manager Installation Dashboard and click Apply Changes to apply the configuration settings.

Note: If you are not monitoring external services, Altoros recommends setting the number of instances for the Collectd VM to monitor external services job to 0. If you later decide to monitor external services, reset the number of instances for this VM to 1.

Note: After you complete the procedures on this page, follow the instructions in Installing collectd Add-On for PCF.

Configure BOSH Director

  1. In the Status tab, copy the IP address of BOSH HM Forwarder.

  2. In the Ops Manager Installation Dashboard, click the BOSH Director tile.

  3. In the Settings tab, click Director Config.

  4. Do one of the following:

    • For PCF v1.10, paste the copied BOSH HM Forwarder IP address into Metrics IP Address.
    • For PCF v1.11 or later, paste the copied BOSH HM Forwarder IP address into Bosh HM Forwarder IP Address.
  5. Click Save.

  6. Return to the Ops Manager Installation Dashboard and click Apply Changes to apply the configuration settings.

Configure Redis, RabbitMQ, and MySQL

If you have the Redis, RabbitMQ, or MySQL for PCF tiles installed and want to collect metrics from these services using Altoros Heartbeat, set the Metrics polling interval field for each service to correspond to the storage schemas configured earlier.

To set the metrics polling interval in Redis:

  1. In the Ops Manager Installation Dashboard, click the Redis tile.

  2. In the Settings tab, click Metrics.

  3. In the Metrics polling interval field, enter your value.

  4. Click Save.

  5. Return to the Ops Manager Installation Dashboard and click Apply Changes to apply the configuration settings.

To set the metrics polling interval in RabbitMQ:

  1. In the Ops Manager Installation Dashboard, click the RabbitMQ tile.

  2. In the Settings tab, click RabbitMQ.

  3. In the Metrics polling interval field, enter your value.

  4. Click Save.

To set the metrics polling interval in MySQL v1 tile:

  1. In the Ops Manager Installation Dashboard, click the MySQL v1 tile.

  2. In the Settings tab, click Advanced Options.

  3. In the Metrics polling interval field, enter your value.

  4. Click Save.

To set the metrics polling interval in MySQL v2 tile:

  1. In the Ops Manager Installation Dashboard, click the MySQL v2 tile.

  2. In the Settings tab, click Monitoring.

  3. In the Metrics polling interval field, enter your value.

  4. Click Save.

  5. Return to the Ops Manager Installation Dashboard and click Apply Changes to apply the configuration settings.

Note: After you complete the procedures on this page, follow the instructions in Installing collectd Add-On for PCF.
