Pivotal Healthwatch v1.8 Release Notes

v1.8.3

Release Date: January 14, 2021

Features

New features and changes in this release:

  • Upgrades PXC to v0.31.0.

  • Removes the port check from the healthwatch-forwarder monit file. Because the JVM can take a long time to start, this check can cause monit to restart the process before startup completes, so healthwatch-forwarder fails to start until the JVM happens to start quickly enough.

  • [Bug Fix] healthwatch-forwarder now quotes foundation names so that they are not truncated.
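The quoting fix in the last item can be illustrated with a minimal Python sketch. It assumes the truncation came from whitespace splitting of an unquoted value, which this release note does not state. The foundation name and the --foundation flag are hypothetical and are not taken from healthwatch-forwarder.

```python
import shlex

# Hypothetical foundation name that contains a space.
foundation = "prod east"

# Unquoted: the name splits into two tokens, so a consumer that reads only
# the token after the flag sees a truncated name ("prod").
unquoted = shlex.split("--foundation " + foundation)

# Quoted: the whole name survives as a single token.
quoted = shlex.split("--foundation " + shlex.quote(foundation))

print(unquoted)
print(quoted)
```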

Known Issues

This release has the following known issues:

Push Monitoring Components Errand Fails on Azure

The Push Monitoring Components errand pushes the Healthwatch components in parallel, which can cause the errand to fail on Azure.

To work around this issue:

  1. In the Healthwatch tile, select Healthwatch Component Config.

  2. Disable the Push Healthwatch Applications in Parallel checkbox. Disabling this checkbox changes the Push Monitoring Components errand’s deployment strategy from parallel to sequential.

  3. Click Save.

  4. Re-deploy Healthwatch.

Smoke Tests Errand Fails When Installed with TAS for VMs Small Footprint v2.8 and Later

In VMware Tanzu Application Service for VMs (TAS for VMs) Small Footprint v2.8 and later, a bug causes Loggregator to fail to emit the loggregator.syslog_agent.ingress.all_drains and loggregator.syslog_agent.dropped.egress metrics. This causes the Smoke Tests errand in Healthwatch to fail.
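One way to confirm that a foundation is affected is to compare the metric names you observe against the two named above. The following Python sketch only does the comparison. How you collect the observed names is left open, and the function name is hypothetical.

```python
# Metric names taken from the known issue described above.
REQUIRED_METRICS = (
    "loggregator.syslog_agent.ingress.all_drains",
    "loggregator.syslog_agent.dropped.egress",
)


def missing_smoke_test_metrics(observed_names):
    """Return the required metric names that are absent from observed_names."""
    observed = set(observed_names)
    return [name for name in REQUIRED_METRICS if name not in observed]


# Example with a made-up list of observed metric names.
print(missing_smoke_test_metrics(["loggregator.syslog_agent.dropped.egress"]))
```

An empty result means both metrics are present. A non-empty result is consistent with the bug described here, but it does not by itself prove that this bug is the cause.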

To work around this issue:

  1. In the Healthwatch tile, select Errands.

  2. Set the Smoke Tests errand to Off.

  3. Click Save.

  4. Re-deploy Healthwatch.

Disk Slowly Fills When Using vSAN with Healthwatch Leads

The vSAN object count increases on vSphere v6.5.1 and earlier.

Healthwatch deploys the bosh-health-check app, which deploys and deletes a VM every 10 minutes. vSphere v6.5.1 and earlier leave a namespace or folder and subfolders when the VM is deleted. The orphaned folders cause the vSAN object count to increase. This is a known issue for vSAN. For more information, see Deleted VMs leave components behind in GitHub.
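To get a rough sense of the growth rate, the 10-minute cycle above can be turned into a daily count. This is a back-of-the-envelope sketch: it counts deploy-and-delete cycles only, and how many vSAN objects each orphaned folder tree accounts for is not stated here.

```python
MINUTES_PER_DAY = 24 * 60
CYCLE_MINUTES = 10  # the bosh-health-check interval described above

# Each cycle deletes one VM, and each deletion can leave folders behind.
cycles_per_day = MINUTES_PER_DAY // CYCLE_MINUTES

print(cycles_per_day)  # 144
```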

To address the issue, update vSphere to v6.5.2 or later. Otherwise, stop the bosh-health-check app to slow down the increase in vSAN object count.

BOSH Health Check Fails After Re-Installing Healthwatch

If Healthwatch is uninstalled and re-installed while the bosh-health-check app is running, the bosh-health-check app fails to deploy and reports an error in the Healthwatch UI.

To address this issue:

  1. Manually delete the bosh-health-check deployment.

  2. Restart the bosh-health-check app.

Indicator Protocol Beta Dashboard Displays Error Due to Log Cache

Occasionally, the Indicator Protocol Beta Dashboard charts fail to load with the error "Error fetching graph data."

These charts are populated using Log Cache, a component of Loggregator. Log Cache fails periodically when it times out while attempting to process the data.

No corrective action is required.

Healthwatch Reports False Capacity Metrics for Isolation Segments Without Placement Tags

In Pivotal Isolation Segment v2.8, you can deploy compute isolation segments without placement tags. This allows you to deploy a separate group of Diego Cells without isolating the Diego Cell capacity from other apps. For more information about this feature, see Compute and Networking Isolation in Pivotal Isolation Segment v2.8 Release Notes.

If you deploy compute isolation segments without placement tags, Healthwatch cannot accurately measure Diego Cell capacity. Capacity charts, calculated capacity metrics such as free chunks, and capacity alerts may incorrectly report a lower capacity than is available for apps.

Metrics Troubleshooting Errand Fails When External Databases Are Configured

The Metrics Troubleshooting errand expects certain MySQL metrics to be present. If you configured TAS for VMs to use external databases, the errand fails.

To work around this issue:

  1. In the Healthwatch tile, select Errands.

  2. Set the Metrics Troubleshooting errand to Off.

  3. Click Save.

  4. Re-deploy Healthwatch.

v1.8.2

Release Date: September 29, 2020

Features

New features and changes in this release:

  • Adds support for TAS for VMs v2.10.

  • [Bug Fix] Adds a workaround for the bug that causes the Push Monitoring Components errand to fail on Azure. For more information, see Known Issues.

  • [Bug Fix] Cleans up cf-health-check temporary files periodically to limit disk usage.

Known Issues

This release has the following known issues:

Push Monitoring Components Errand Fails on Azure

The Push Monitoring Components errand pushes the Healthwatch components in parallel, which can cause the errand to fail on Azure.

To work around this issue:

  1. In the Healthwatch tile, select Healthwatch Component Config.

  2. Disable the Push Healthwatch Applications in Parallel checkbox. Disabling this checkbox changes the Push Monitoring Components errand’s deployment strategy from parallel to sequential.

  3. Click Save.

  4. Re-deploy Healthwatch.

Smoke Tests Errand Fails When Installed with TAS for VMs Small Footprint v2.8 and Later

In VMware Tanzu Application Service for VMs (TAS for VMs) Small Footprint v2.8 and later, a bug causes Loggregator to fail to emit the loggregator.syslog_agent.ingress.all_drains and loggregator.syslog_agent.dropped.egress metrics. This causes the Smoke Tests errand in Healthwatch to fail.

To work around this issue:

  1. In the Healthwatch tile, select Errands.

  2. Set the Smoke Tests errand to Off.

  3. Click Save.

  4. Re-deploy Healthwatch.

BOSH Health Check Fails After Re-Installing Healthwatch

If Healthwatch is uninstalled and re-installed while the bosh-health-check app is running, the bosh-health-check app fails to deploy and reports an error in the Healthwatch UI.

To address this issue:

  1. Manually delete the bosh-health-check deployment.

  2. Restart the bosh-health-check app.

Disk Slowly Fills When Using vSAN with Healthwatch Leads

The vSAN object count increases on vSphere v6.5.1 and earlier.

Healthwatch deploys the bosh-health-check app, which deploys and deletes a VM every 10 minutes. vSphere v6.5.1 and earlier leave a namespace or folder and subfolders when the VM is deleted. The orphaned folders cause the vSAN object count to increase. This is a known issue for vSAN. For more information, see Deleted VMs leave components behind in GitHub.

To address the issue, update vSphere to v6.5.2 or later. Otherwise, stop the bosh-health-check app to slow down the increase in vSAN object count.

Indicator Protocol Beta Dashboard Displays Error Due to Log Cache

Occasionally, the Indicator Protocol Beta Dashboard charts fail to load with the error "Error fetching graph data."

These charts are populated using Log Cache, a component of Loggregator. Log Cache fails periodically when it times out while attempting to process the data.

No corrective action is required.

Healthwatch Reports False Capacity Metrics for Isolation Segments Without Placement Tags

In Pivotal Isolation Segment v2.8, you can deploy compute isolation segments without placement tags. This allows you to deploy a separate group of Diego Cells without isolating the Diego Cell capacity from other apps. For more information about this feature, see Compute and Networking Isolation in Pivotal Isolation Segment v2.8 Release Notes.

If you deploy compute isolation segments without placement tags, Healthwatch cannot accurately measure Diego Cell capacity. Capacity charts, calculated capacity metrics such as free chunks, and capacity alerts may incorrectly report a lower capacity than is available for apps.

Metrics Troubleshooting Errand Fails When External Databases Are Configured

The Metrics Troubleshooting errand expects certain MySQL metrics to be present. If you configured TAS for VMs to use external databases, the errand fails.

To work around this issue:

  1. In the Healthwatch tile, select Errands.

  2. Set the Metrics Troubleshooting errand to Off.

  3. Click Save.

  4. Re-deploy Healthwatch.

v1.8.1

Release Date: March 11, 2020

Breaking Change: The delete-space errand is renamed to cleanup. If you use automation scripts that reference this errand, such as automated pipelines, update the errand name from delete-space to cleanup.
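For automation that invokes the errand through the BOSH CLI, the rename can be applied mechanically. The following Python sketch assumes that your scripts call the errand as "run-errand delete-space". The function name is hypothetical, and the pattern deliberately leaves other occurrences of delete-space untouched so that you can review them by hand.

```python
import re

# Matches "run-errand delete-space" but not longer errand names such as
# "delete-space-old".
_OLD_ERRAND = re.compile(r"(run-errand\s+)delete-space(?![\w-])")


def rename_errand(script_text):
    """Replace the old errand name with the new one in run-errand calls."""
    return _OLD_ERRAND.sub(r"\1cleanup", script_text)


# DEPLOYMENT-NAME is a placeholder, not a real deployment name.
print(rename_errand("bosh -d DEPLOYMENT-NAME run-errand delete-space"))
```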

Features

New features and changes in this release:

  • [Bug Fix] Fixes an issue where super metrics were lost when ingestor instances were overloaded by router latency metrics.

Known Issues

This release has the following known issues:

Push Monitoring Components Errand Fails on Azure

The Push Monitoring Components errand pushes the Healthwatch components in parallel, which can cause the errand to fail on Azure.

To work around this issue:

  1. In the Healthwatch tile, select Healthwatch Component Config.

  2. Disable the Push Healthwatch Applications in Parallel checkbox. Disabling this checkbox changes the Push Monitoring Components errand’s deployment strategy from parallel to sequential.

  3. Click Save.

  4. Re-deploy Healthwatch.

Smoke Tests Errand Fails When Installed with TAS for VMs Small Footprint v2.8 and Later

In VMware Tanzu Application Service for VMs (TAS for VMs) Small Footprint v2.8 and later, a bug causes Loggregator to fail to emit the loggregator.syslog_agent.ingress.all_drains and loggregator.syslog_agent.dropped.egress metrics. This causes the Smoke Tests errand in Healthwatch to fail.

To work around this issue:

  1. In the Healthwatch tile, select Errands.

  2. Set the Smoke Tests errand to Off.

  3. Click Save.

  4. Re-deploy Healthwatch.

BOSH Health Check Fails After Re-Installing Healthwatch

If Healthwatch is uninstalled and re-installed while the bosh-health-check app is running, the bosh-health-check app fails to deploy and reports an error in the Healthwatch UI.

To address this issue:

  1. Manually delete the bosh-health-check deployment.

  2. Restart the bosh-health-check app.

Disk Slowly Fills When Using vSAN with Healthwatch Leads

The vSAN object count increases on vSphere v6.5.1 and earlier.

Healthwatch deploys the bosh-health-check app, which deploys and deletes a VM every 10 minutes. vSphere v6.5.1 and earlier leave a namespace or folder and subfolders when the VM is deleted. The orphaned folders cause the vSAN object count to increase. This is a known issue for vSAN. For more information, see Deleted VMs leave components behind in GitHub.

To address the issue, update vSphere to v6.5.2 or later. Otherwise, stop the bosh-health-check app to slow down the increase in vSAN object count.

Indicator Protocol Beta Dashboard Displays Error Due to Log Cache

Occasionally, the Indicator Protocol Beta Dashboard charts fail to load with the error "Error fetching graph data."

These charts are populated using Log Cache, a component of Loggregator. Log Cache fails periodically when it times out while attempting to process the data.

No corrective action is required.

Healthwatch Reports False Capacity Metrics for Isolation Segments Without Placement Tags

In Pivotal Isolation Segment v2.8, you can deploy compute isolation segments without placement tags. This allows you to deploy a separate group of Diego Cells without isolating the Diego Cell capacity from other apps. For more information about this feature, see Compute and Networking Isolation in Pivotal Isolation Segment v2.8 Release Notes.

If you deploy compute isolation segments without placement tags, Healthwatch cannot accurately measure Diego Cell capacity. Capacity charts, calculated capacity metrics such as free chunks, and capacity alerts may incorrectly report a lower capacity than is available for apps.

Metrics Troubleshooting Errand Fails When External Databases Are Configured

The Metrics Troubleshooting errand expects certain MySQL metrics to be present. If you configured TAS for VMs to use external databases, the errand fails.

To work around this issue:

  1. In the Healthwatch tile, select Errands.

  2. Set the Metrics Troubleshooting errand to Off.

  3. Click Save.

  4. Re-deploy Healthwatch.

v1.8.0

Release Date: December 18, 2019

Features

New features and changes in this release:

  • Ingresses metrics from Reverse Log Proxy (Firehose v2), rather than Traffic Controllers (Firehose v1). The Reverse Log Proxy has a more configurable API and a longer support window. For more information, see Loggregator RLP Documentation.

  • Adds Syslog and Forwarder Agents to Pivotal Healthwatch VMs, which emit syslog directly from the VM rather than through the Firehose.

  • Removes Adapter Loss Rate and CF Syslog Drain Binding Count from Healthwatch.

  • Renames log_cache.cache_period to log_cache.log_cache_cache_period to align with Open Metrics naming conventions.
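If you keep alert or dashboard definitions that reference the renamed metric in the last item, a small mapping can update them. This Python sketch treats each definition as plain text. The function name and the example query syntax are hypothetical.

```python
# Old name to new name, as given in the release note above.
RENAMED_METRICS = {
    "log_cache.cache_period": "log_cache.log_cache_cache_period",
}


def migrate_metric_names(query):
    """Return query with every renamed metric replaced by its new name."""
    for old, new in RENAMED_METRICS.items():
        query = query.replace(old, new)
    return query


print(migrate_metric_names("avg(log_cache.cache_period)"))
```

The new name does not contain the old name as a substring, so running the function on text that has already been migrated leaves it unchanged.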

Known Issues

This release has the following known issues:

BOSH Health Check Fails After Re-Installing Healthwatch

If Healthwatch is uninstalled and re-installed while the bosh-health-check app is running, the bosh-health-check app fails to deploy and reports an error in the Healthwatch UI.

To address this issue:

  1. Manually delete the bosh-health-check deployment.

  2. Restart the bosh-health-check app.

Disk Slowly Fills When Using vSAN with Healthwatch Leads

The vSAN object count increases on vSphere v6.5.1 and earlier.

Healthwatch deploys the bosh-health-check app, which deploys and deletes a VM every 10 minutes. vSphere v6.5.1 and earlier leave a namespace or folder and subfolders when the VM is deleted. The orphaned folders cause the vSAN object count to increase. This is a known issue for vSAN. For more information, see Deleted VMs leave components behind in GitHub.

To address the issue, update vSphere to v6.5.2 or later. Otherwise, stop the bosh-health-check app to slow down the increase in vSAN object count.

Indicator Protocol Beta Dashboard Displays Error Due to Log Cache

Occasionally, the Indicator Protocol Beta Dashboard charts fail to load with the error "Error fetching graph data."

These charts are populated using Log Cache, a component of Loggregator. Log Cache fails periodically when it times out while attempting to process the data.

No corrective action is required.

Healthwatch Reports False Capacity Metrics for Isolation Segments Without Placement Tags

In Pivotal Isolation Segment v2.8, you can deploy compute isolation segments without placement tags. This allows you to deploy a separate group of Diego Cells without isolating the Diego Cell capacity from other apps. For more information about this feature, see Compute and Networking Isolation in Pivotal Isolation Segment v2.8 Release Notes.

If you deploy compute isolation segments without placement tags, Healthwatch cannot accurately measure Diego Cell capacity. Capacity charts, calculated capacity metrics such as free chunks, and capacity alerts may incorrectly report a lower capacity than is available for apps.