Monitoring TKGI and TKGI-Provisioned Clusters on Linux
This topic lists VMware Tanzu Kubernetes Grid Integrated Edition (TKGI) components and integrations you can use to capture logs and metrics about TKGI and TKGI-provisioned cluster VMs on Linux.
To monitor TKGI and TKGI-provisioned cluster VMs, you can enable one or more of the following components and integrations:
|Syslog||See Syslog below.|
|Telegraf (metrics)||See Telegraf below.|
|Healthwatch||See Healthwatch below.|
|VMware vRealize Log Insight||See vRealize Log Insight (vSphere Only) below.|
Syslog, Telegraf, and VMware vRealize Log Insight integrations are enabled in the Tanzu Kubernetes Grid Integrated Edition tile > Host Monitoring section. Healthwatch is deployed to Ops Manager as the Healthwatch Exporter for TKGI tile.
These components and integrations are visible only to TKGI admins. They are not visible to cluster users, such as developers.
For information about monitoring Kubernetes workloads on Linux, see Monitoring Workers and Workloads.
For information about logging and monitoring Kubernetes clusters, workers and workloads on Windows, see:
You can configure Syslog or vRealize Log Insight (vSphere only) to publish logs from the TKGI control plane and TKGI-provisioned cluster VMs.
You might need to inspect Syslog or vRealize Log Insight (vRLI) logs when troubleshooting or auditing your TKGI environment. For information about key TKGI events and the log entries they generate, see Auditing Tanzu Kubernetes Grid Integrated Edition Logs.
Syslog sends log messages from all BOSH-deployed VMs in a TKGI environment, including Kubernetes cluster audit logs, to a syslog endpoint. To configure Syslog, see Syslog in the Installing topic for your IaaS.
If you do not use Syslog, you can retrieve logs from BOSH-deployed VMs by downloading them as described in Downloading Logs from VMs. However, retrieving these logs through Syslog is recommended.
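When auditing forwarded logs at the receiving syslog endpoint, it can help to split each message into its header fields. The following is a minimal sketch, assuming the endpoint receives messages in RFC 5424 format. Confirm the format your deployment actually emits. The function name, sample message, host name, and app name are hypothetical and do not represent real TKGI output.

```python
import re

# RFC 5424 header: <PRI>VERSION TIMESTAMP HOSTNAME APP-NAME PROCID MSGID,
# followed by structured data and the free-form message (kept together as "rest").
_RFC5424 = re.compile(
    r"^<(?P<pri>\d{1,3})>(?P<version>\d{1,2}) "
    r"(?P<timestamp>\S+) (?P<hostname>\S+) (?P<app>\S+) "
    r"(?P<procid>\S+) (?P<msgid>\S+) (?P<rest>.*)$",
    re.DOTALL,
)

# Severity names indexed by the numeric severity code (PRI modulo 8).
SEVERITIES = ["emerg", "alert", "crit", "err", "warning", "notice", "info", "debug"]


def parse_syslog_line(line):
    """Return the header fields of an RFC 5424 message as a dict.

    Returns None when the line does not match the assumed format.
    """
    match = _RFC5424.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    pri = int(fields.pop("pri"))
    fields["facility"] = pri // 8          # PRI = facility * 8 + severity
    fields["severity"] = SEVERITIES[pri % 8]
    return fields


# Hypothetical sample message, for illustration only.
sample = (
    "<14>1 2024-01-01T00:00:00Z example-host kube-apiserver 1234 - "
    "- hypothetical audit message"
)
parsed = parse_syslog_line(sample)
print(parsed["hostname"], parsed["app"], parsed["severity"])
```

A parser like this is only a starting point for filtering by host or severity. A production endpoint would normally rely on its log platform's own syslog ingestion.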
The vRealize Log Insight (vRLI) integration for TKGI pulls logs from all BOSH jobs and containers running in the cluster, including node logs from core Kubernetes and BOSH processes, Kubernetes event logs, and pod logs.
To configure the vRLI integration, see VMware vRealize Log Insight Integration in the Installing topic for vSphere with Flannel or vSphere with NSX-T.
For information about vRLI, see vRealize Log Insight.
Telegraf sends metrics from TKGI API, control plane node, and worker node VMs to a monitoring service, such as Wavefront or Datadog.
In the Tanzu Kubernetes Grid Integrated Edition tile > Host Monitoring, you can configure Telegraf to collect metrics from one or more of the following sources:
|Source||Includes metrics from…|
Control Plane nodes
(not visible to cluster users)
One or more of the following:
One or more of the following:
To configure Telegraf, see Configuring Telegraf in TKGI.
For more information about Node Exporter, see About Node Exporter below.
Node Exporter exports hardware and operating system metrics in Prometheus format.
In the Host Monitoring pane of the Tanzu Kubernetes Grid Integrated Edition tile, you can enable the Node Exporter BOSH job separately on control plane nodes, worker nodes, and the TKGI API VM.
Node Exporter exposes metrics on localhost only. For a list of Node Exporter metrics, see the Node Exporter GitHub repository.
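Because Node Exporter listens on localhost only, its metrics have to be read from the VM itself, for example from an SSH session on that VM. The following is a minimal sketch of fetching and parsing the Prometheus text format in Python. The URL `http://localhost:9100/metrics` is the upstream Node Exporter default and is an assumption here, so confirm the port for your deployment. The function names and the sample text are hypothetical.

```python
import urllib.request

# Assumption: upstream Node Exporter default address; verify for your deployment.
NODE_EXPORTER_URL = "http://localhost:9100/metrics"


def parse_prometheus_text(text):
    """Parse Prometheus text exposition lines into a {series: float} dict.

    Simplified: skips comment lines (# HELP / # TYPE) and expects each sample
    line to end with its value, with no trailing timestamp.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the last space so label values containing spaces stay intact.
        series, _, value = line.rpartition(" ")
        if not series:
            continue
        try:
            samples[series] = float(value)
        except ValueError:
            continue  # ignore lines whose last field is not a number
    return samples


def fetch_metrics(url=NODE_EXPORTER_URL, timeout=5):
    """Fetch and parse metrics; must be run on the VM where Node Exporter runs."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return parse_prometheus_text(response.read().decode("utf-8"))


# Hypothetical sample in Prometheus text format, for illustration only.
example = """# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 0.25
node_cpu_seconds_total{cpu="0",mode="idle"} 1234.5
"""
print(parse_prometheus_text(example))
```

Calling `fetch_metrics()` only succeeds on a VM where the Node Exporter job is enabled. In normal operation, Telegraf collects these metrics for you, so a script like this is mainly useful for spot checks.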
You can use the Healthwatch Exporter for TKGI tile to monitor the health of the TKGI Control Plane and your Linux and Windows cluster control plane nodes.
Healthwatch enables you to monitor the functionality of your TKGI environment and can be configured to expose metrics to a service or database external to your Ops Manager foundation. For more information, see Overview of the Healthwatch Exporter for TKGI Tile.
To configure cluster discovery in Healthwatch, see Configuring TKGI Cluster Discovery in the Healthwatch documentation.