Telemetry

Note: As of v1.8, Enterprise PKS has been renamed to VMware Tanzu Kubernetes Grid Integrated Edition. Some screenshots in this documentation do not yet reflect the change.

This topic describes the VMware Customer Experience Improvement Program (CEIP) and the Telemetry Program used in the Tanzu Kubernetes Grid Integrated Edition tile.

Overview

The CEIP and Telemetry programs allow VMware to collect data from customer installations to improve your Tanzu Kubernetes Grid Integrated Edition experience. Collecting data at scale enables VMware to identify patterns and alert you to warning signals in your Tanzu Kubernetes Grid Integrated Edition installation.

Participation Levels

You can configure Tanzu Kubernetes Grid Integrated Edition to use one of the following CEIP and Telemetry participation levels:

  • None: This level disables data collection.

  • Standard: (Default) This level collects data anonymously. Your data is used to inform the ongoing development of Tanzu Kubernetes Grid Integrated Edition.

  • Enhanced: This level enables VMware to warn you about security vulnerabilities and potential issues with your software configurations. For more information, see Benefits of the Enhanced Participation Level below.

Note: Tanzu Kubernetes Grid Integrated Edition does not collect any personally identifiable information (PII) at either the Standard or the Enhanced participation level. For a list of the data Tanzu Kubernetes Grid Integrated Edition collects, see Data Dictionary.

Benefits of the Enhanced Participation Level

Benefits you receive with the Enhanced participation level include but are not limited to the following:

  • Usage data: This gives you access to data about Kubernetes pod and cluster usage in your Tanzu Kubernetes Grid Integrated Edition installation. See Sample Reports below for more details.
  • Access to your telemetry data: This gives you access to configuration and usage data about your Tanzu Kubernetes Grid Integrated Edition installation. See Sample Reports below for more details.
  • Proactive support: This enables VMware to proactively warn you about unhealthy patterns in your installation.
  • Benchmarks: This shows your usage relative to the rest of the Tanzu Kubernetes Grid Integrated Edition user base.

The table below compares the Standard and Enhanced participation levels.

Benefit                       | Standard Level | Enhanced Level
Usage data                    | Raw data       | Reports and trend analysis
Access to your telemetry data | No             | Yes
Proactive support             | No             | Yes
Benchmarks                    | No             | Yes

Note: VMware reserves the right to change the benefits associated with the Enhanced participation level at any time.

Configure CEIP and Telemetry

For information about configuring CEIP and Telemetry participation, see the CEIP Opt-In Walkthrough video below:

To configure CEIP and Telemetry, see the CEIP and Telemetry section of the installation topic for your IaaS:

Proxy Communication

If you use a proxy server, the Tanzu Kubernetes Grid Integrated Edition proxy settings apply to outgoing telemetry data.

To configure Tanzu Kubernetes Grid Integrated Edition proxy settings for CEIP and Telemetry and other communications, see the following:

System Components

The CEIP and Telemetry programs use the following system components to collect data:

  • Telemetry Server: This component runs on the TKGI control plane. The server receives telemetry events from the TKGI API and metrics from Telemetry agent pods. The server sends events and metrics to a data lake for archiving and analysis.

  • Telemetry Agent Pod: This component runs in each Kubernetes cluster as a deployment with one replica. Agent pods periodically poll the Kubernetes API for cluster metrics and send the metrics to the Telemetry server.

The following diagram shows how telemetry data flows through the system components:


  [Figure: Telemetry System Data Flow]
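
The data flow described above can be illustrated with a minimal Python sketch. It is not TKGI code: the class names, method names, and payload fields are all hypothetical, and an in-memory list stands in for the data lake. It only models the two inputs to the Telemetry server (events from the TKGI API and metrics from agent pods) and the forwarding step.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class TelemetryServer:
    """Hypothetical stand-in for the Telemetry server on the control plane."""

    # In-memory stand-in for the data lake (an assumption for illustration).
    data_lake: List[dict] = field(default_factory=list)

    def receive(self, source: str, kind: str, payload: dict) -> None:
        # Forward every event or metric record for archiving and analysis.
        self.data_lake.append({"source": source, "kind": kind, "payload": payload})


@dataclass
class TelemetryAgent:
    """Hypothetical stand-in for the one-replica agent deployment in a cluster."""

    cluster: str
    poll_kubernetes_api: Callable[[], Dict[str, int]]  # hypothetical metrics source
    server: TelemetryServer

    def poll_once(self) -> None:
        # Poll the cluster for metrics, then send them to the Telemetry server.
        metrics = self.poll_kubernetes_api()
        self.server.receive(source=self.cluster, kind="metric", payload=metrics)


server = TelemetryServer()

# The TKGI API sends an event directly to the server.
server.receive(source="tkgi-api", kind="event", payload={"type": "cluster_created"})

# An agent in one cluster polls for metrics and forwards them.
agent = TelemetryAgent("cluster-a", lambda: {"pods": 12, "nodes": 3}, server)
agent.poll_once()
```

In a real installation the agent polls periodically rather than once; the single call here keeps the sketch deterministic.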

Data Dictionary

For information about TKGI Telemetry collection and reporting, see the TKGI Telemetry Data spreadsheet, hosted on Google Drive.

Sample Reports

You can view the interactive version of the Sample Workbook with Tableau Reader (free to use). Click on the links below to see static screenshots of the reports.

  1. Consumption: As an Operator of TKGI, I need a way to monitor pod consumption across my TKGI environments over time, so I can:
    • See which environments and clusters get the heaviest use
    • See temporal patterns in pod consumption
    • Scale capacity accordingly
    • Show and charge back users of TKGI within my organization
  2. API heartbeats + Cluster heartbeats: As an Operator of TKGI, I need a way to see which version of TKGI each of my environments was running over time, so I can:
    • Keep track of all my TKGI environments and clusters
    • Identify environments and clusters in need of upgrading
  3. Cluster creation events: As an Operator of TKGI, I want to see how often cluster creation succeeds across my TKGI environments, so I can:
    • Identify environments that encounter repeated failures and debug or intervene as appropriate to avoid frustration for cluster admins and users
  4. Cluster creation duration: As an Operator of TKGI, I want to see how long it takes to create clusters, so I can:
    • Intervene when cluster creation takes significantly more time than expected, and adjust my plan and network configuration as appropriate
  5. Cluster creation errors: As an Operator of TKGI, I want to see which errors are encountered most frequently during cluster creation, so I can:
    • Quickly identify widespread problems and remediate (e.g. NSX errors)
  6. Container images: As an Operator of TKGI, I want to see which container images are in use across my TKGI installations, so I can:
    • Conduct an audit of container images and identify prohibited or problematic images
    • Infer which workloads are running on TKGI, to inform my planning, resourcing, and outreach
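
The reports themselves are delivered as a Tableau workbook. As a rough illustration of the kind of aggregation behind report 3 (cluster creation events), the following Python sketch computes a per-environment success rate. The record fields and sample values are made up for this example and are not taken from the TKGI Telemetry Data spreadsheet.

```python
from collections import defaultdict

# Made-up sample records; the field names are hypothetical.
events = [
    {"environment": "prod", "succeeded": True},
    {"environment": "prod", "succeeded": False},
    {"environment": "dev", "succeeded": True},
    {"environment": "prod", "succeeded": True},
]


def success_rate_by_environment(records):
    """Return the fraction of successful cluster creations per environment."""
    totals = defaultdict(lambda: [0, 0])  # environment -> [successes, attempts]
    for record in records:
        counts = totals[record["environment"]]
        if record["succeeded"]:
            counts[0] += 1
        counts[1] += 1
    return {env: ok / attempts for env, (ok, attempts) in totals.items()}


rates = success_rate_by_environment(events)
```

An environment whose rate stays low across many attempts is a candidate for the debugging or intervention that the report is meant to prompt.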

Please send any feedback you have to pks-feedback@pivotal.io.