Monitoring TKGI and TKGI-Provisioned Clusters

Note: As of v1.8, Enterprise PKS has been renamed to VMware Tanzu Kubernetes Grid Integrated Edition. Some screenshots in this documentation do not yet reflect the change.


This topic lists VMware Tanzu Kubernetes Grid Integrated Edition (TKGI) components and integrations you can use to capture logs and metrics about TKGI and TKGI-provisioned cluster VMs.

For information about monitoring Kubernetes workloads, see Monitoring Workers and Workloads.

Overview

To monitor TKGI and TKGI-provisioned cluster VMs, you can enable one or more of the following components and integrations in the Tanzu Kubernetes Grid Integrated Edition tile > Host Monitoring:

Name Link
Syslog See Syslog below.
VMware vRealize Log Insight See vRealize Log Insight (vSphere Only) below.
Telegraf (metrics) See Telegraf below.

These components and integrations are visible only to TKGI admins. They are not visible to cluster users, such as developers.

Logs: Syslog and vRLI

You can configure Syslog or vRealize Log Insight (vSphere only) to publish logs from the TKGI control plane and TKGI-provisioned cluster VMs.

You might need to inspect Syslog or vRealize Log Insight (vRLI) logs when troubleshooting or auditing your TKGI environment. For information about key TKGI events and the log entries they generate, see Auditing Tanzu Kubernetes Grid Integrated Edition Logs.

Syslog

Syslog sends log messages from all BOSH-deployed VMs in a TKGI environment to a syslog endpoint. To configure Syslog, see Syslog in the Installing topic for your IaaS.

If you do not use Syslog, you can retrieve logs from BOSH-deployed VMs by downloading them as described in Downloading Logs from VMs. However, retrieving these logs through Syslog is recommended.
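
When checking what arrives at the syslog endpoint, it can help to decode the priority field at the start of each message. The following is a minimal sketch in Python. It assumes the endpoint receives lines that begin with a standard `<PRI>` field, where PRI is facility × 8 + severity as defined in RFC 5424. The function name `decode_pri` and the sample line are hypothetical and are not taken from a TKGI deployment.

```python
import re

# Severity names from RFC 5424, indexed by numeric severity (0-7).
SEVERITIES = ["emerg", "alert", "crit", "err", "warning", "notice", "info", "debug"]

# A syslog line starts with "<PRI>", where PRI is one to three digits.
_PRI_RE = re.compile(r"^<(\d{1,3})>(.*)$", re.DOTALL)


def decode_pri(line):
    """Split a syslog line into (facility, severity_name, remainder)."""
    match = _PRI_RE.match(line)
    if match is None:
        raise ValueError("line does not start with a <PRI> field")
    pri = int(match.group(1))
    if pri > 191:  # 23 * 8 + 7 is the largest valid priority value
        raise ValueError("PRI out of range")
    facility, severity = divmod(pri, 8)
    return facility, SEVERITIES[severity], match.group(2)


# Hypothetical sample line, not captured from a real TKGI environment.
sample = "<14>1 2020-01-01T00:00:00Z worker-0 kubelet - - - example message"
print(decode_pri(sample))
```

A filter built on this could, for example, keep only messages with severity `err` or worse while troubleshooting.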

vRealize Log Insight (vSphere Only)

The vRealize Log Insight (vRLI) integration for TKGI pulls logs from all BOSH jobs and containers running in the cluster, including node logs from core Kubernetes and BOSH processes, Kubernetes event logs, and pod stdout and stderr.

To configure the vRLI integration, see VMware vRealize Log Insight Integration in the Installing topic for vSphere with Flannel or vSphere with NSX-T.

For information about vRLI, see vRealize Log Insight.

Metrics: Telegraf

Telegraf sends metrics from TKGI API, master node, and worker node VMs to a monitoring service, such as Wavefront or Datadog.

In the Tanzu Kubernetes Grid Integrated Edition tile > Host Monitoring, you can configure Telegraf to collect metrics from one or more of the following sources:

Source: Includes metrics from…

TKGI API:
  • Node Exporter (Prometheus)

Master nodes (not visible to cluster users), one or more of the following:
  • Node Exporter (Prometheus)
  • Kubernetes API server
  • Kubernetes controller manager
  • etcd

Worker nodes, one or more of the following:
  • Node Exporter (Prometheus)
  • kubelet

To configure Telegraf, see Configuring Telegraf in TKGI.

For more information about Node Exporter, see About Node Exporter below.

About Node Exporter

Node Exporter exports hardware and operating system metrics in Prometheus format.

In the Host Monitoring pane of the Tanzu Kubernetes Grid Integrated Edition tile, you can enable the Node Exporter BOSH job separately on master nodes, worker nodes, and the TKGI API VM.

Node Exporter exposes metrics on localhost only. For a list of Node Exporter metrics, see the Node Exporter GitHub repository.
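
To illustrate what Prometheus format looks like, the following Python sketch parses a small sample of exposition text into a dictionary. It is a simplified reading of the format, assuming each sample line ends with its value and carries no trailing timestamp. The helper name `parse_metrics` and the sample values are hypothetical. `node_load1` and `node_cpu_seconds_total` are metric names from the upstream Node Exporter project.

```python
def parse_metrics(text):
    """Parse Prometheus text exposition lines into {series: float}.

    Simplified sketch: skips comment lines and assumes no timestamps.
    """
    samples = {}
    for raw in text.splitlines():
        line = raw.strip()
        # "# HELP" and "# TYPE" lines are metadata, not samples.
        if not line or line.startswith("#"):
            continue
        # Without a timestamp, the value is the last space-separated field.
        series, _, value = line.rpartition(" ")
        samples[series] = float(value)
    return samples


# Hypothetical sample output; the values are made up for illustration.
SAMPLE = """\
# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 0.25
# TYPE node_cpu_seconds_total counter
node_cpu_seconds_total{cpu="0",mode="idle"} 1234.5
"""

metrics = parse_metrics(SAMPLE)
print(metrics["node_load1"])
```

Because Node Exporter listens on localhost only, text like this would have to be read on the VM itself or collected by an agent running there, such as Telegraf.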

