Configuring Federation for Healthwatch

This topic describes how to configure federation for your Healthwatch installation.

Overview of Federation

Healthwatch supports federation. When you configure your Healthwatch installation to federate metrics, the Prometheus instance in the Healthwatch tile on your monitoring foundation scrapes a subset of metrics from the Prometheus instances in the Healthwatch tiles installed on the foundations you monitor. The monitoring foundation can be, for example, an Ops Manager foundation that monitors other Ops Manager foundations, or the Tanzu Kubernetes Grid Integrated Edition (TKGI) Control Plane. Federation is useful if you want to monitor a subset of metrics from multiple foundations without storing every metric from those foundations in a single Prometheus instance.

In a typical Healthwatch installation, you install the Healthwatch tile on the monitoring foundation that you use to monitor other foundations, and you install either the Healthwatch Exporter for VMware Tanzu Application Service for VMs (TAS for VMs) tile or the Healthwatch Exporter for TKGI tile on each foundation you want to monitor. The Prometheus instance in the Healthwatch tile on your monitoring foundation scrapes metrics from the Healthwatch Exporter tiles on each foundation you monitor.

To configure federation for your Healthwatch installation, you install the Healthwatch tile on your monitoring foundation and on each foundation you want to monitor, in addition to the Healthwatch Exporter tile you install on each foundation you monitor. You then configure the Healthwatch tile on each foundation you monitor to scrape metrics from the Healthwatch Exporter tile installed on the same foundation, and the Healthwatch tile on your monitoring foundation to scrape metrics from the Healthwatch tiles installed on the foundations you monitor.

For more information about federation, see Federation in the Prometheus documentation.

Warning: Storing all Loggregator Firehose metrics from more than one large TAS for VMs foundation in a single Prometheus instance negatively affects the performance of that Prometheus instance, sometimes even causing it to crash. To avoid this, VMware recommends federating only service level indicator (SLI) metrics from each foundation you monitor to the Prometheus instance in your monitoring foundation. For more information about SLI metrics for TAS for VMs, see TAS for VMs SLI Exporter VM in Healthwatch Metrics.
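
As a rough illustration of the recommendation above, the scrape job fragment below passes a narrow selector in match[] instead of a catch-all expression, so that only a small set of series is federated. The prefix example_sli_ is a hypothetical placeholder, not a real Healthwatch metric name; replace it with the SLI metric names documented for your exporter.

```yaml
# Sketch only: federate a narrow set of series rather than every metric.
# "example_sli_" is a hypothetical prefix; substitute real SLI metric names.
params:
  'match[]':
    - '{__name__=~"^example_sli_.*"}'
```

The match[] parameter accepts more than one selector, so several narrow expressions can be listed as separate items instead of one broad regular expression.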

Configure Federation

To configure federation for your Healthwatch installation:

  1. Install the Healthwatch tile on the foundation you use to monitor other foundations, such as an Ops Manager foundation or the TKGI Control Plane. For more information, see Installing a Tile Manually or Installing, Configuring, and Deploying a Tile Through an Automated Pipeline.

  2. Install the Healthwatch tile and either the Healthwatch Exporter for TAS for VMs tile or the Healthwatch Exporter for TKGI tile on each foundation you want to monitor. For more information, see Installing a Tile Manually or Installing, Configuring, and Deploying a Tile Through an Automated Pipeline.

  3. For each foundation you monitor, expose the Prometheus instance in the Healthwatch tile on port 4450.

  4. For each foundation you want to monitor:

    1. Navigate to the Ops Manager Installation Dashboard for the foundation you want to monitor.
    2. Click the Healthwatch tile.
    3. Select the Credentials tab.
    4. In the Promxy Client Mtls row of the TSDB section, click Link to Credential.
    5. Record the values of private_key_pem and cert_pem. These values are the private key and certificate for Promxy Client mTLS.
    6. Retrieve the Ops Manager root certificate authority (CA) for the foundation you want to monitor. For more information, see Retrieve the Ops Manager Root CA in Managing Certificates with the Ops Manager API in the Ops Manager documentation.
    7. Navigate to the Ops Manager Installation Dashboard for your monitoring foundation.
    8. Click the Healthwatch tile.
    9. Select Prometheus Configuration.
    10. Under Additional Scrape Config Jobs, click Add.
    11. For TSDB Scrape job, provide the configuration YAML for a scrape job for the Prometheus instance in the Healthwatch tile on the foundation you want to monitor. In the example below, the scrape job federates all metrics with names that match the regular expression ^metric_name_regex.* from the Prometheus targets listed under the targets property:

      job_name: example-job-name
      scheme: https
      metrics_path: '/federate'
      params:
        'match[]':
          - '{__name__=~"^metric_name_regex.*"}'
      static_configs:
        - targets:
          - 'source-tsdb-1:4450'
          - 'source-tsdb-2:4450'
      

      Note: If you have configured a load balancer or DNS entry for your Prometheus instance, use the IP address of your load balancer or your DNS entry in each target listed under the targets property instead of the IP address of your Prometheus instance.

    12. For TLS Config Certificate Authority, enter the Ops Manager root CA that you retrieved in a previous step from the foundation you want to monitor.

    13. For TLS Config Certificate and Private Key, enter the certificate and private key that you recorded in a previous step. These are the values from the Promxy Client Mtls row of the Credentials tab in the Healthwatch tile installed on the foundation you want to monitor.

    14. For TLS Config Server Name, enter promxy. This server name appears as the server_name property in the tls_config section of the Prometheus configuration file.

    15. Click Save.

      If you are using the om CLI to configure the Healthwatch tile, the example below shows how you would enter the example configuration YAML above in an automation script:

      product-properties:
        .properties.scrape_configs:
          value:
          # Ops Manager root CA from the foundation you want to monitor
          - ca: |
              -----BEGIN CERTIFICATE-----
              SECRET
              -----END CERTIFICATE-----
            scrape_job: |
              job_name: example-job-name
              scheme: https
              metrics_path: '/federate'
              params:
                'match[]':
                  - '{__name__=~"^my_metric_name_regex.*"}'
              static_configs:
                - targets:
                  - 'source-prometheus-1:4450'
            # Same value as TLS Config Server Name in the tile UI
            server_name: promxy
            # Promxy Client mTLS certificate and private key
            tls_certificates:
              cert_pem: |
                -----BEGIN CERTIFICATE-----
                SECRET
                -----END CERTIFICATE-----
              private_key_pem: |
                -----BEGIN RSA PRIVATE KEY-----
                SECRET
                -----END RSA PRIVATE KEY-----
      

    For more information, see Configure and Deploy Your Tile Using the om CLI in Installing, Configuring, and Deploying a Tile Through an Automated Pipeline.

For more information about configuring scrape jobs, see Prometheus Configuration in Configuring Healthwatch and <scrape_config> in Configuration in the Prometheus documentation.

Federation for a Highly Available Healthwatch Deployment

In a highly available (HA) Healthwatch deployment, each VM in the Prometheus instance in the Healthwatch tile scrapes the same data from the metric exporter VMs that the Healthwatch Exporter tiles deploy.

When federating metrics, you can configure the Prometheus instance in the Healthwatch tile on your monitoring foundation to scrape both copies of that data from the Prometheus instance in the Healthwatch tile on each foundation you monitor. To do this, include both VMs in each Prometheus instance from the foundations you want to monitor in the scrape job configuration YAML.
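
A minimal sketch of such a scrape job is shown below, assuming a single monitored foundation whose Prometheus instance has two VMs. The job name and both IP addresses are hypothetical and stand in for the static IP addresses of your own Prometheus VMs.

```yaml
# Sketch only: one federation job that scrapes both Prometheus VMs
# on a single monitored foundation. The addresses are hypothetical.
job_name: example-ha-federation
scheme: https
metrics_path: '/federate'
params:
  'match[]':
    - '{__name__=~"^metric_name_regex.*"}'
static_configs:
  - targets:
      - '10.0.0.11:4450'   # first Prometheus VM (hypothetical address)
      - '10.0.0.12:4450'   # second Prometheus VM (hypothetical address)
```

Listing both VMs as targets means the monitoring foundation still receives data if one of the two VMs is unavailable, at the cost of storing two copies of each federated series.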

Alternatively, you can create load balancers or DNS entries in your IaaS console for the Prometheus instances on each foundation you monitor, then include the IP addresses for each load balancer or DNS entry in the targets listed under the targets property in your scrape job configuration YAML. For more information, see Configure Federation above.
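
With this approach, the targets list might look like the fragment below, with one entry per monitored foundation. The host names are hypothetical DNS entries; a load balancer IP address followed by :4450 would take the same position.

```yaml
# Sketch only: one target per monitored foundation, each pointing at a
# load balancer or DNS entry in front of that foundation's Prometheus VMs.
static_configs:
  - targets:
      - 'prometheus.foundation-a.example.com:4450'   # hypothetical DNS entry
      - 'prometheus.foundation-b.example.com:4450'   # hypothetical DNS entry
```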

In both cases, VMware recommends configuring static IP addresses for both VMs in each of your Prometheus instances. For more information about configuring static IP addresses for Prometheus instances, see Configure Prometheus in Configuring Healthwatch.