Healthwatch v2.1 Release Notes

This topic contains release notes for Healthwatch v2.0.6 and v2.1.

Note: Healthwatch v2.0.6 is a beta version that is no longer available for download. VMware does not recommend using Healthwatch v2.0.6 in production environments.

The architecture of Healthwatch v2.1 is entirely different from the architecture of Pivotal Healthwatch v1. Healthwatch v2.1 uses the open-source components Prometheus, Grafana, and Alertmanager to scrape, store, and view metrics, as well as configure alerts. For more information about the differences between Pivotal Healthwatch v1 and Healthwatch v2.1 and how to upgrade to Healthwatch v2.1, see Upgrading Healthwatch.

For more information about Healthwatch v2.1, read Healthwatch for VMware Tanzu 2.1 Offers Breakthrough Platform Monitoring on the VMware Tanzu blog and New Features below.

For information about the risks and limitations of Healthwatch v2.1, see Assumed Risks of Using Healthwatch v2.1 and Healthwatch v2.1 Limitations in Healthwatch.

Releases

v2.1.1

Release Date: 05/14/2021

  • [Feature] Healthwatch supports VMware Tanzu Application Service for VMs (TAS for VMs) v2.11.

  • [Known Issue Fix] When uninstalling the Healthwatch Exporter for TAS for VMs tile, the bosh-health deployment is always deleted. For more information about this known issue, see BOSH Health Metric Exporter VM Causes 401 Error below.

  • [Bug Fix] The uid|title is used more than once error does not appear in Grafana VM logs.

  • [Bug Fix] The Failed to read plugin provisioning files from directory error does not appear in Grafana VM logs.

  • [Bug Fix] The Uptime SLO Target filter dropdown appears in the Healthwatch SLO dashboard in the Grafana UI.

  • [Bug Fix] The Exporter Availability graphs in the Healthwatch SLO dashboard in the Grafana UI show six digits of precision.

  • [Bug Fix] You can scale your Prometheus instance down to one VM.

  • [Bug Fix] The BOSH Backup and Restore (BBR) script does not return a Snapshot failed: Client sent an HTTP request to an HTTPS server error.

  • [Bug Fix] Healthwatch Exporter for Tanzu Kubernetes Grid Integrated Edition (TKGI) requires the TKGI tile to be installed.

  • [Bug Fix] Healthwatch Exporter for TAS for VMs requires the TAS for VMs tile to be installed.

  • [Bug Fix] Healthwatch components deploy across availability zones (AZs) when you configure them to do so.

  • [Bug Fix] The Ops Manager UAA instance logs canary URL test redirects to the Ops Manager Installation Dashboard correctly.

Healthwatch v2.1.1 uses the following open-source component versions:

Component       Packaged Version
Prometheus      2.25.0
Grafana         7.5.4
Alertmanager    0.21.0

v2.1.0

Release Date: 03/18/2021

Healthwatch v2.1.0 uses the following open-source component versions:

Component       Packaged Version
Prometheus      2.25.0
Grafana         7.4.2
Alertmanager    0.21.0

v2.0.6

Release Date: 02/11/2021

Healthwatch v2.0.6 uses the following open-source component versions:

Component       Packaged Version
Prometheus      2.25.0
Grafana         7.4.2
Alertmanager    0.21.0

How to Upgrade

To upgrade from Pivotal Healthwatch v1 to Healthwatch v2.1, see Upgrading Healthwatch.

New Features

Healthwatch v2.0.6 and v2.1 include the following major features:

Healthwatch Supports TKGI v1.10

You can use Healthwatch to monitor TKGI v1.10.

Your TKGI dashboard in the Grafana UI updates automatically to display TKGI v1.10 metrics unless you manually set the dashboard version when you configure the Grafana VM. For more information about setting the TKGI version for your dashboards, see Configure Grafana in Configuring Healthwatch.

Assign Static IP Addresses to Prometheus VMs

You can assign static IP addresses to your Prometheus VMs.

If you configure email alerts through Alertmanager, you may need to add the IP addresses of your Prometheus VMs to your Ops Manager allowlist so your SMTP server does not block them. You can view the IP addresses of your Prometheus VMs using the BOSH CLI.

For more information about assigning IP addresses to your Prometheus VMs, see (Optional) Configure Prometheus in Configuring Healthwatch.
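
As an illustration of viewing VM IP addresses with the BOSH CLI, the following is a minimal Python sketch that wraps bosh vms and extracts the IP addresses from its JSON output. It assumes that the BOSH CLI is installed and already targeted at your BOSH Director, and that the --json output contains a Tables list whose Rows entries have instance and ips fields. The function names, the deployment name, and the instance_prefix parameter are hypothetical and not part of Healthwatch.

```python
import json
import subprocess


def parse_vm_ips(bosh_vms_json, instance_prefix=""):
    """Map each instance name to its list of IPs from `bosh vms --json` output.

    Assumption: the JSON has the shape {"Tables": [{"Rows": [{"instance": ...,
    "ips": ...}]}]}. Adjust the keys if your BOSH CLI version differs.
    """
    data = json.loads(bosh_vms_json)
    ips = {}
    for table in data.get("Tables") or []:
        for row in table.get("Rows") or []:
            instance = row.get("instance", "")
            if instance.startswith(instance_prefix):
                # A VM can report several addresses separated by whitespace.
                ips[instance] = row.get("ips", "").split()
    return ips


def list_vm_ips(deployment, instance_prefix=""):
    """Run the BOSH CLI for one deployment and return the parsed IP addresses."""
    result = subprocess.run(
        ["bosh", "-d", deployment, "vms", "--json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_vm_ips(result.stdout, instance_prefix)
```

To use the sketch, call list_vm_ips with the name of your Healthwatch deployment as shown by bosh deployments. Pass an instance_prefix that matches the Prometheus instance names in your own bosh vms output to limit the result to those VMs.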

Healthwatch Replaces Event Alerts With Alertmanager

Healthwatch v2.1 uses Alertmanager, an open-source Prometheus component, to manage and send alerts according to the alerting rules you configure. Pivotal Healthwatch v1 used Pivotal Event Alerts for managing alerts, which Healthwatch v2.1 does not support.

For more information about configuring alerts, see Configuring Alerting.

Healthwatch Removes Monitoring Indicator Protocol

Healthwatch no longer supports Monitoring Indicator Protocol, and the Indicator Protocol dashboard is removed from the Grafana UI.

This change does not affect RabbitMQ metrics and dashboards.

Breaking Changes

Healthwatch v2.0.6 and v2.1 include the following breaking changes:

Self-Signed Certificates Break Dashboards

While Pivotal Healthwatch v1 does not require SSL validation of Ops Manager certificates, Healthwatch v2.1 validates them by default. If your Ops Manager deployment uses self-signed certificates, you must configure the Healthwatch tile to skip SSL validation.

If the Ops Manager Health dashboard in the Grafana UI displays a “Not Running” error, select the Skip SSL validation checkbox in the Canary URL Configuration pane of the Healthwatch tile. For more information about configuring this checkbox, see (Optional) Configure Canary URLs in Configuring Healthwatch.

If your Certificate Expiration dashboard displays “N/A” or you see errors in your certificate expiration metric logs, select the Skip SSL Validation for Cert Expiration checkbox in the TAS Exporter Configuration pane of the Healthwatch Exporter for TAS for VMs tile or the TKGI Exporter Configuration pane of the Healthwatch Exporter for TKGI tile. For more information about configuring this checkbox, see (Optional) Configure TAS for VMs Metric Exporter VMs in Configuring Healthwatch Exporter for TAS for VMs or (Optional) Configure TKGI and Certificate Expiration Metric Exporter VMs in Configuring Healthwatch Exporter for TKGI.
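
If you are not sure whether your Ops Manager deployment presents a certificate that fails default validation, a quick check from a workstation can help. The following Python sketch is one possible way to do this and is not part of Healthwatch. It assumes that the workstation's trust store resembles the one Healthwatch uses, which may not be true in your environment. The function name and the host name in the usage note are hypothetical.

```python
import socket
import ssl


def certificate_verifies(host, port=443, timeout=5.0):
    """Return True if the server certificate passes default TLS verification.

    Returns False when verification fails, which is the typical result for a
    self-signed certificate that is not in the local trust store. Network
    errors such as a refused connection or a timeout are not caught.
    """
    context = ssl.create_default_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            # The TLS handshake, including verification, happens here.
            with context.wrap_socket(sock, server_hostname=host):
                return True
    except ssl.SSLCertVerificationError:
        return False
```

For example, if certificate_verifies("opsman.example.com") returns False, the certificate is likely to fail validation in Healthwatch as well, and the Skip SSL validation checkboxes described above are probably needed.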

Known Issues

Healthwatch v2.0.6 and v2.1 include the following known issues:

Scaled VMs Not Distributed Across AZs by Default

This known issue is fixed only for new downloads of Healthwatch v2.1.1. If you upgrade to Healthwatch v2.1.1 from an earlier version of Healthwatch v2, this is still an issue.

If you scale up Healthwatch component VMs, they are not automatically distributed across availability zones. You must first scale down the VMs, then scale them back up to distribute them evenly across availability zones.

For more information about scaling Healthwatch component VMs, see Healthwatch Components and Resource Requirements.

“Unable to Render Templates” Error When Installing or Upgrading

This issue is fixed in Ops Manager v2.8 and later.

When installing or upgrading to Healthwatch v2.1, you might see the following error:

- Unable to render templates for job 'opsman-cert-expiration-exporter'. Errors are:
  - Error filling in template 'bpm.yml.erb' (line 9: Can't find property '["opsman_access_credentials.uaa_client_secret"]')

This error occurs if you upgraded from Ops Manager v2.3 or earlier to Ops Manager v2.4 through v2.7.

For more information about how to fix this issue without upgrading to Ops Manager v2.8, see “Unable to Render Templates” Error When Installing or Upgrading in Troubleshooting Healthwatch.

BOSH Health Metric Exporter VM Causes 401 Error

If you re-install the Healthwatch Exporter for TAS for VMs tile, the BOSH health metric exporter VM does not always delete the BOSH deployment it creates, bosh-health. This causes the following error:

Director responded with non-successful status code '401' response
'{"code":600000,"description":"Require one of the scopes: bosh.admin,
bosh.750587e9-eae5-494f-99c4-5ca429b13959.admin,
bosh.teams.p-healthwatch2-pas-exporter-b3a337d7ec4cca94f166.admin"}'

To fix this issue, upgrade to Healthwatch v2.1.1 or manually delete the bosh-health deployment using the BOSH CLI. For more information about upgrading to Healthwatch v2.1.1, see Upgrading Healthwatch. For more information about deleting a BOSH deployment using the BOSH CLI, see the BOSH documentation.
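
The manual deletion can be sketched as a small Python wrapper around the BOSH CLI delete-deployment command. This is a minimal sketch, assuming the BOSH CLI is installed and already targeted at your BOSH Director with sufficient permissions. The function name and the dry_run parameter are hypothetical. By default, the sketch only prints the command so that you can review it first.

```python
import subprocess


def delete_deployment(deployment="bosh-health", dry_run=True):
    """Print the BOSH CLI delete command, and run it only if dry_run is False.

    The command is interactive: the BOSH CLI asks for confirmation before it
    deletes the deployment.
    """
    command = ["bosh", "-d", deployment, "delete-deployment"]
    print(" ".join(command))
    if not dry_run:
        subprocess.run(command, check=True)
    return command
```

Calling delete_deployment() prints the command for the bosh-health deployment without running it. Calling delete_deployment(dry_run=False) runs the command.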

No Data on TKGI Kubernetes Nodes Dashboard

If you are using TKGI v1.10.0 or v1.10.1, the Kubernetes Nodes dashboard in the Grafana UI might not show data for individual pods. This is due to a known issue in Kubernetes v1.19.6 and earlier and Kubernetes v1.20.1 and earlier.

To fix this issue, upgrade to TKGI v1.10.2. For more information about upgrading to TKGI v1.10.2, see Upgrading Tanzu Kubernetes Grid Integrated Edition in the TKGI documentation.

MySQL Proxy Disk Space Fills Up Quickly

In Healthwatch v2.0.6, the MySQL PXC instance stores too many binlogs, which quickly fills up the persistent disk space for MySQL Proxy VMs.

To fix this issue, upgrade to Healthwatch v2.1. For more information about upgrading to Healthwatch v2.1, see Upgrading Healthwatch.