Pivotal Healthwatch v1.8 Release Notes

v1.8.1

Release Date: March 11, 2020

Breaking Change: The delete-space errand was renamed to cleanup. If you have scripts that explicitly reference this errand, such as automated pipelines, update the errand name from delete-space to cleanup.
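
One way to find and update such references is sketched below in Python. This is a minimal illustration, not part of Healthwatch: the function name rename_errand and the sample command line (including the deployment name healthwatch-example) are hypothetical, and only the errand names delete-space and cleanup come from this release note.

```python
import re
from typing import Tuple

# Errand names from the release note: "delete-space" was renamed to "cleanup".
OLD_ERRAND = "delete-space"
NEW_ERRAND = "cleanup"

# Match the old name only as a whole token, so that longer names such as
# "delete-space-old" or "my-delete-space" are left untouched.
_ERRAND_PATTERN = re.compile(r"(?<![\w-])" + re.escape(OLD_ERRAND) + r"(?![\w-])")


def rename_errand(script_text: str) -> Tuple[str, int]:
    """Return the updated text and the number of references replaced."""
    return _ERRAND_PATTERN.subn(NEW_ERRAND, script_text)


if __name__ == "__main__":
    # Hypothetical pipeline line; the deployment name is an assumption.
    line = "bosh -d healthwatch-example run-errand delete-space"
    updated, count = rename_errand(line)
    print(count, updated)
```

Review the reported replacement count before committing changes, because a pipeline may reference the errand in more than one place.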

Features

New features and changes in this release:

  • [Bug Fix] Fixes an issue where super metrics were lost when router latency metrics overloaded the ingestor instances.

Known Issues

This release has the following known issues.

BOSH Health Check Fails After Reinstallation

If Healthwatch is uninstalled and reinstalled while the BOSH Health Check is running, the BOSH Health Check fails to deploy and reports an error in the Healthwatch UI.

To address this issue, manually delete the bosh-health-check deployment and restart the bosh-health-check app.

Disk Slowly Fills When Using vSAN with Healthwatch

The vSAN object count increases on vSphere versions earlier than v6.5 update 2.

Healthwatch deploys the app bosh-health-check, which deploys and deletes a VM every 10 minutes. vSphere versions earlier than v6.5 update 2 leave a namespace or folder and subfolders when the VM is deleted. The orphaned folders cause the vSAN object count to increase. This is a known issue for vSAN. For more information about the vSAN known issue, see Deleted VMs leave components behind in GitHub.

To address the issue, update vSphere to v6.5 update 2 or later. Alternatively, stop the bosh-health-check app to slow the increase in the vSAN object count.
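
To gauge how quickly orphaned folders can accumulate on affected vSphere versions, the 10-minute cycle described above can be turned into a rough count. The Python sketch below assumes that each deploy-and-delete cycle leaves at least one orphaned folder. The number of vSAN objects per folder is not stated in this note, so the result is a count of cycles, not of vSAN objects. The function name cycles_in is hypothetical.

```python
MINUTES_PER_DAY = 24 * 60
# From the release note: bosh-health-check deploys and deletes a VM every 10 minutes.
CYCLE_INTERVAL_MINUTES = 10


def cycles_in(days: float) -> int:
    """Number of complete deploy-and-delete cycles in the given number of days."""
    return int(days * MINUTES_PER_DAY // CYCLE_INTERVAL_MINUTES)


if __name__ == "__main__":
    print(cycles_in(1))   # 144 cycles per day
    print(cycles_in(30))  # 4320 cycles in 30 days
```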

Indicator Protocol Beta Dashboard Displays Error Due to Log Cache

Occasionally, the Indicator Protocol Beta Dashboard charts fail to load with the error "Error fetching graph data."

These charts are populated from Log Cache, which is part of Loggregator. The charts fail to load periodically because Log Cache times out while attempting to process the data.

No corrective action is required. The error usually resolves on its own.

Healthwatch Reports False Capacity Metrics for Isolation Segments Without Placement Tags

In PAS v2.8, a new feature in the Isolation Segment Tile allows you to deploy compute isolation segments without placement tags. This allows you to deploy a separate group of Diego Cells without isolating the Cell capacity from other apps. For more information about this feature, see Compute and Networking Isolation in Pivotal Isolation Segment v2.8 Release Notes.

If you deploy compute isolation segments without placement tags, Healthwatch cannot accurately measure and report on capacity. Capacity charts, calculated capacity metrics such as Free Chunks, and capacity alerts may incorrectly report a lower capacity than is available for apps.

validate-expected-metrics Fails When External Databases Are Configured

The validate-expected-metrics errand expects certain MySQL metrics to be present. These metrics are not emitted if you have configured the External databases section under Databases in the Tanzu Application Service tile, so the errand fails.

As a workaround, you can disable this errand under Errands > Metrics Troubleshooting in the Healthwatch tile when installing it.

v1.8.0

Release Date: December 18, 2019

Features

New features and changes in this release:

  • Ingests metrics from the Reverse Log Proxy (Firehose v2) rather than from Traffic Controllers (Firehose v1). The Reverse Log Proxy has a more configurable API and a longer support window. For more information, see Loggregator RLP Documentation.
  • Adds Syslog and Forwarder Agents to Pivotal Healthwatch VMs, which emit syslog directly from the VM rather than through the Firehose.
  • Removes Adapter Loss Rate and CF Syslog Drain Binding Count from Healthwatch.
  • Renames log_cache.cache_period to log_cache.log_cache_cache_period to align with Open Metrics naming conventions.
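
If you maintain alerts or dashboards that refer to the old metric name, the rename in the last item above can be applied with a simple lookup. The Python sketch below is illustrative only: the function name migrate_metric_names and the sample list are hypothetical, and only the two metric names come from this release note.

```python
# Old-to-new metric names, taken from the v1.8.0 feature list.
RENAMED_METRICS = {
    "log_cache.cache_period": "log_cache.log_cache_cache_period",
}


def migrate_metric_names(names):
    """Return the names with any renamed metric replaced; other names are unchanged."""
    return [RENAMED_METRICS.get(name, name) for name in names]


if __name__ == "__main__":
    # Hypothetical list of metric names used by an alert configuration.
    configured = ["log_cache.cache_period", "example.other_metric"]
    print(migrate_metric_names(configured))
```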

Known Issues

This release has the following known issues.

BOSH Health Check Fails After Reinstallation

If Healthwatch is uninstalled and reinstalled while the BOSH Health Check is running, the BOSH Health Check fails to deploy and reports an error in the Healthwatch UI.

To address this issue, manually delete the bosh-health-check deployment and restart the bosh-health-check app.

Disk Slowly Fills When Using vSAN with Healthwatch

The vSAN object count increases on vSphere versions earlier than v6.5 update 2.

Healthwatch deploys the app bosh-health-check, which deploys and deletes a VM every 10 minutes. vSphere versions earlier than v6.5 update 2 leave a namespace or folder and subfolders when the VM is deleted. The orphaned folders cause the vSAN object count to increase. This is a known issue for vSAN. For more information about the vSAN known issue, see Deleted VMs leave components behind in GitHub.

To address the issue, update vSphere to v6.5 update 2 or later. Alternatively, stop the bosh-health-check app to slow the increase in the vSAN object count.

Indicator Protocol Beta Dashboard Displays Error Due to Log Cache

Occasionally, the Indicator Protocol Beta Dashboard charts fail to load with the error "Error fetching graph data."

These charts are populated from Log Cache, which is part of Loggregator. The charts fail to load periodically because Log Cache times out while attempting to process the data.

No corrective action is required. The error usually resolves on its own.

Healthwatch Reports False Capacity Metrics for Isolation Segments Without Placement Tags

In PAS v2.8, a new feature in the Isolation Segment Tile allows you to deploy compute isolation segments without placement tags. This allows you to deploy a separate group of Diego Cells without isolating the Cell capacity from other apps. For more information about this feature, see Compute and Networking Isolation in Pivotal Isolation Segment v2.8 Release Notes.

If you deploy compute isolation segments without placement tags, Healthwatch cannot accurately measure and report on capacity. Capacity charts, calculated capacity metrics such as Free Chunks, and capacity alerts may incorrectly report a lower capacity than is available for apps.