Monitoring TKGI and TKGI-Provisioned Clusters on Linux

This topic lists VMware Tanzu Kubernetes Grid Integrated Edition (TKGI) components and integrations you can use to capture logs and metrics about TKGI and TKGI-provisioned cluster VMs on Linux.

Overview

To monitor TKGI and TKGI-provisioned cluster VMs, you can enable one or more of the following components and integrations:

Name                          Link
Syslog                        See Syslog below.
Telegraf (metrics)            See Telegraf below.
Healthwatch                   See Healthwatch below.
VMware vRealize Log Insight   See vRealize Log Insight (vSphere Only) below.

Syslog, Telegraf, and VMware vRealize Log Insight integrations are enabled in the Tanzu Kubernetes Grid Integrated Edition tile > Host Monitoring section. Healthwatch is deployed to Ops Manager as the Healthwatch Exporter for TKGI tile.

These components and integrations are visible only to TKGI admins. They are not visible to cluster users, such as developers.

For information about monitoring Kubernetes workloads on Linux, see Monitoring Workers and Workloads.

For information about logging and monitoring Kubernetes clusters, workers and workloads on Windows, see:

Logs: Syslog and vRLI

You can configure Syslog or vRealize Log Insight (vSphere only) to publish logs from the TKGI control plane and TKGI-provisioned cluster VMs.

You might need to inspect Syslog or vRealize Log Insight (vRLI) logs when troubleshooting or auditing your TKGI environment. For information about key TKGI events and the log entries they generate, see Auditing Tanzu Kubernetes Grid Integrated Edition Logs.

Syslog

Syslog sends log messages from all BOSH-deployed VMs in a TKGI environment to a syslog endpoint. To configure Syslog, see Syslog in the Installing topic for your IaaS.

If you do not use Syslog, you can retrieve logs from BOSH-deployed VMs by downloading them as described in Downloading Logs from VMs. However, retrieving these logs through Syslog is recommended.
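
As a rough illustration of what a syslog endpoint has to handle, the following Python sketch splits one message into its header fields. It assumes the messages arrive in RFC 5424 format, which this topic does not state. The sample message, the function name parse_syslog_line, and the field names are hypothetical and are not taken from TKGI.

```python
import re

# Severity names indexed by the numeric severity code (RFC 5424, section 6.2.1).
SEVERITIES = ["emerg", "alert", "crit", "err", "warning", "notice", "info", "debug"]

# RFC 5424 header: <PRI>VERSION TIMESTAMP HOSTNAME APP-NAME PROCID MSGID,
# followed by the structured data and the free-form message ("rest").
HEADER_RE = re.compile(
    r"^<(?P<pri>\d{1,3})>(?P<version>\d{1,2}) "
    r"(?P<timestamp>\S+) (?P<hostname>\S+) (?P<app>\S+) "
    r"(?P<procid>\S+) (?P<msgid>\S+) (?P<rest>.*)$"
)


def parse_syslog_line(line):
    """Return the header fields as a dict, or None if the line does not match."""
    match = HEADER_RE.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    # PRI encodes both values: PRI = facility * 8 + severity.
    pri = int(fields.pop("pri"))
    fields["facility"] = pri // 8
    fields["severity"] = SEVERITIES[pri % 8]
    return fields


# Hypothetical message, invented for illustration only.
SAMPLE = "<14>1 2024-01-01T12:00:00Z worker-0 kubelet 1234 - - Started container"

if __name__ == "__main__":
    print(parse_syslog_line(SAMPLE))
```

Decoding the priority value first makes it straightforward to filter by severity before forwarding or storing a message.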

vRealize Log Insight (vSphere Only)

The vRealize Log Insight (vRLI) integration for TKGI pulls logs from all BOSH jobs and containers running in the cluster, including node logs from core Kubernetes and BOSH processes, Kubernetes event logs, and pod stdout and stderr.

To configure the vRLI integration, see VMware vRealize Log Insight Integration in the Installing topic for vSphere with Flannel or vSphere with NSX-T.

For information about vRLI, see vRealize Log Insight.

Metrics: Telegraf

Telegraf sends metrics from TKGI API, master node, and worker node VMs to a monitoring service, such as Wavefront or Datadog.

In the Tanzu Kubernetes Grid Integrated Edition tile > Host Monitoring, you can configure Telegraf to collect metrics from one or more of the following sources:

Source: TKGI API
Includes metrics from:

  • Node Exporter (Prometheus)

Source: Master nodes (not visible to cluster users)
Includes metrics from one or more of the following:

  • Node Exporter (Prometheus)
  • Kubernetes API server
  • Kubernetes controller manager
  • etcd

Source: Worker nodes
Includes metrics from one or more of the following:

  • Node Exporter (Prometheus)
  • kubelet

To configure Telegraf, see Configuring Telegraf in TKGI.

For more information about Node Exporter, see About Node Exporter below.

About Node Exporter

Node Exporter exports hardware and operating system metrics in Prometheus format.

In the Host Monitoring pane of the Tanzu Kubernetes Grid Integrated Edition tile, you can enable the Node Exporter BOSH job separately on master nodes, worker nodes, and the TKGI API VM.

Node Exporter exposes metrics on localhost only. For a list of Node Exporter metrics, see the Node Exporter GitHub repository.
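
To show what metrics in Prometheus format look like to a consumer, the following Python sketch parses a few lines of the Prometheus text exposition format. It is a minimal illustration, not how Telegraf or TKGI reads the data. The function name parse_metrics and the sample values are hypothetical; the metric names follow the upstream Node Exporter naming.

```python
import re

# One sample line of the text exposition format looks like:
#   metric_name{label="value",...} 1.23
# The label block is optional; a trailing timestamp, if present, is ignored.
SAMPLE_RE = re.compile(
    r"^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)"
    r"(?:\{(?P<labels>.*)\})?"
    r"\s+(?P<value>\S+)"
)
# Label values are kept as written; escape sequences are not decoded.
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Return a list of (name, labels_dict, float_value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


# Invented sample values, for illustration only.
SAMPLE_TEXT = """\
# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 0.42
# HELP node_cpu_seconds_total Seconds the CPUs spent in each mode.
# TYPE node_cpu_seconds_total counter
node_cpu_seconds_total{cpu="0",mode="idle"} 1234.5
"""

if __name__ == "__main__":
    for name, labels, value in parse_metrics(SAMPLE_TEXT):
        print(name, labels, value)
```

Because Node Exporter listens on localhost only, text like this would have to be read on the VM itself or collected by a local agent such as Telegraf. The listening port depends on the deployment and is not stated in this topic.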

Healthwatch

You can use the Healthwatch Exporter for TKGI tile to monitor the health of the TKGI control plane and your Linux and Windows cluster master nodes.

Healthwatch enables you to monitor the functionality of your TKGI environment and can be configured to expose metrics to a service or database external to your Ops Manager foundation. For more information, see Overview of the Healthwatch Exporter for TKGI Tile.

To configure cluster discovery in Healthwatch, see Configuring TKGI Cluster Discovery in the Healthwatch documentation.
