Installing Tanzu Kubernetes Grid Integrated Edition on vSphere with NSX-T

This topic describes how to install and configure VMware Tanzu Kubernetes Grid Integrated Edition on vSphere with NSX-T integration.

Prerequisites

Before you begin this procedure, ensure that you have successfully completed all preceding steps for installing Tanzu Kubernetes Grid Integrated Edition on vSphere with NSX-T, including:

Step 1: Install Tanzu Kubernetes Grid Integrated Edition

To install Tanzu Kubernetes Grid Integrated Edition, do the following:

  1. Download the product file from VMware Tanzu Network.
  2. Navigate to https://YOUR-OPS-MANAGER-FQDN/ in a browser to log in to the Ops Manager Installation Dashboard.
  3. Click Import a Product to upload the product file.
  4. Under Tanzu Kubernetes Grid Integrated Edition in the left column, click the plus sign to add this product to your staging area.

Step 2: Configure Tanzu Kubernetes Grid Integrated Edition

Click the orange Tanzu Kubernetes Grid Integrated Edition tile to start the configuration process.

Note: Configuration of NSX-T or Flannel cannot be changed after initial installation and configuration of Tanzu Kubernetes Grid Integrated Edition.

TKGI tile on the Ops Manager installation dashboard

WARNING: When you configure the Tanzu Kubernetes Grid Integrated Edition tile, do not use spaces in any field entries. This includes spaces between characters as well as leading and trailing spaces. If you use a space in any field entry, the deployment of Tanzu Kubernetes Grid Integrated Edition fails.

Assign AZs and Networks

To configure the availability zones (AZs) and networks used by the Tanzu Kubernetes Grid Integrated Edition control plane:

  1. Click Assign AZs and Networks.

  2. Under Place singleton jobs in, select the availability zone (AZ) where you want to deploy the TKGI API and TKGI Database VMs.

    Assign AZs and Networks pane in Ops Manager

  3. Under Balance other jobs in, select the AZ for balancing other Tanzu Kubernetes Grid Integrated Edition control plane jobs.

    Note: You must specify the Balance other jobs in AZ, but the selection has no effect in the current version of Tanzu Kubernetes Grid Integrated Edition.

  4. Under Network, select the TKGI Management Network linked to the ls-tkgi-mgmt NSX-T logical switch you created in the Create Networks Page step of Configuring BOSH Director with NSX-T for Tanzu Kubernetes Grid Integrated Edition. This provides network placement for Tanzu Kubernetes Grid Integrated Edition component VMs, such as the TKGI API and TKGI Database VMs.

  5. Under Service Network, your selection depends on whether you are installing a new Tanzu Kubernetes Grid Integrated Edition deployment or upgrading from a previous version of Tanzu Kubernetes Grid Integrated Edition.

    • If you are deploying Tanzu Kubernetes Grid Integrated Edition with NSX-T for the first time, select the TKGI Management Network that you specified in the Network field. You do not need to create or define a service network because Tanzu Kubernetes Grid Integrated Edition creates the service network for you during the installation process.
    • If you are upgrading from a previous version of Tanzu Kubernetes Grid Integrated Edition, then select the Service Network linked to the ls-tkgi-service NSX-T logical switch that Tanzu Kubernetes Grid Integrated Edition created for you during installation. The service network provides network placement for existing on-demand Kubernetes cluster service instances that were created by the Tanzu Kubernetes Grid Integrated Edition broker.
  6. Click Save.

TKGI API

Perform the following steps:

  1. Click TKGI API.

  2. Under Certificate to secure the TKGI API, provide a certificate and private key pair.
    TKGI API pane configuration
    The certificate that you supply should cover the specific subdomain that routes to the TKGI API VM with TLS termination on the ingress.

    Warning: TLS certificates generated for wildcard DNS records only work for a single domain level. For example, a certificate generated for *.tkgi.EXAMPLE.com does not permit communication to *.api.tkgi.EXAMPLE.com. If the certificate does not contain the correct FQDN for the TKGI API, calls to the API will fail. A name-matching sketch after this procedure illustrates the single-level wildcard rule.

    You can enter your own certificate and private key pair, or have Ops Manager generate one for you.
    To generate a certificate using Ops Manager:

    1. Click Generate RSA Certificate for a new install or Change to update a previously-generated certificate.
    2. Enter the domain for your API hostname. This must match the domain you configured under TKGI API > API Hostname (FQDN) in the Tanzu Kubernetes Grid Integrated Edition tile. It can be a standard FQDN or a wildcard domain.
    3. Click Generate.
      TKGI API certificate generation
  3. Under API Hostname (FQDN), enter the FQDN that you registered to point to the TKGI API load balancer, such as api.tkgi.example.com. To retrieve the public IP address or FQDN of the TKGI API load balancer, log in to your IaaS console.

  4. Under Worker VM Max in Flight, enter the maximum number of non-canary worker instances to create or resize in parallel within an availability zone.

    This field sets the max_in_flight variable value. When you create or resize a cluster, the max_in_flight value limits the number of component instances that can be created or started simultaneously. By default, the max_in_flight value is set to 4, which means that up to four component instances are simultaneously created or started at a time.

  5. Click Save.
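
The wildcard warning above comes down to single-label matching: a * in a certificate name covers exactly one DNS label. The following sketch illustrates that rule with made-up hostnames; it is not TKGI code and does not replace real certificate validation, but it shows why a certificate for *.tkgi.EXAMPLE.com covers api.tkgi.EXAMPLE.com and not names one level below it.

    # Minimal sketch of single-label wildcard matching (not TKGI code).
    # Hostnames below are examples only.
    def wildcard_covers(cert_name, fqdn):
        """True if a certificate name such as '*.tkgi.example.com' covers fqdn."""
        cert_labels = cert_name.lower().split(".")
        host_labels = fqdn.lower().split(".")
        if len(cert_labels) != len(host_labels):
            return False                      # a wildcard never spans extra domain levels
        if cert_labels[0] != "*":
            return cert_labels == host_labels
        return cert_labels[1:] == host_labels[1:]  # '*' matches exactly one label

    print(wildcard_covers("*.tkgi.example.com", "api.tkgi.example.com"))      # True
    print(wildcard_covers("*.tkgi.example.com", "foo.api.tkgi.example.com"))  # False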

Plans

A plan defines a set of resource types used for deploying a cluster.

Activate a Plan

You must first activate and configure Plan 1, and afterwards you can activate up to twelve additional, optional plans.

To activate and configure a plan, perform the following steps:

  1. Click the plan that you want to activate.

    Note: Plans 11, 12, and 13 support Windows worker-based Kubernetes clusters on vSphere with NSX-T, and are a beta feature on vSphere with Flannel. To configure a Windows worker-based plan, see Plans in Configuring Windows Worker-Based Kubernetes Clusters.

  2. Select Active to activate the plan and make it available to developers deploying clusters.
    Plan pane configuration
  3. Under Name, provide a unique name for the plan.
  4. Under Description, edit the description as needed. The plan description appears in the Services Marketplace, which developers can access by using the TKGI CLI.
  5. Under Master/ETCD Node Instances, select the default number of Kubernetes master/etcd nodes to provision for each cluster. You can enter 1, 3, or 5.

    Note: If you deploy a cluster with multiple master/etcd node VMs, confirm that you have sufficient hardware to handle the increased load on disk write and network traffic. For more information, see Hardware recommendations in the etcd documentation.

    In addition to meeting the hardware requirements for a multi-master cluster, we recommend configuring monitoring for etcd to monitor disk latency, network latency, and other indicators for the health of the cluster. For more information, see Configuring Telegraf in TKGI.

    WARNING: To change the number of master/etcd nodes for a plan, you must ensure that no existing clusters use the plan. Tanzu Kubernetes Grid Integrated Edition does not support changing the number of master/etcd nodes for plans with existing clusters.

  6. Under Master/ETCD VM Type, select the type of VM to use for Kubernetes master/etcd nodes. For more information, including master node VM customization options, see the Master Node VM Size section of VM Sizing for Tanzu Kubernetes Grid Integrated Edition Clusters.

  7. Under Master Persistent Disk Type, select the size of the persistent disk for the Kubernetes master node VM.

  8. Under Master/ETCD Availability Zones, select one or more AZs for the Kubernetes clusters deployed by Tanzu Kubernetes Grid Integrated Edition. If you select more than one AZ, Tanzu Kubernetes Grid Integrated Edition deploys the master VM in the first AZ and the worker VMs across the remaining AZs. If you are using multiple masters, Tanzu Kubernetes Grid Integrated Edition deploys the master and worker VMs across the AZs in round-robin fashion.

  9. Under Maximum number of workers on a cluster, set the maximum number of Kubernetes worker node VMs that Tanzu Kubernetes Grid Integrated Edition can deploy for each cluster. Enter any whole number in this field.
    Plan pane configuration, part two

  10. Under Worker Node Instances, specify the default number of Kubernetes worker nodes the TKGI CLI provisions for each cluster. The Worker Node Instances setting must be less than, or equal to, the Maximum number of workers on a cluster setting.

    For high availability, create clusters with a minimum of three worker nodes, or two per AZ if you intend to use PersistentVolumes (PVs). For example, if you deploy across three AZs, you should have six worker nodes. For more information about PVs, see PersistentVolumes in Maintaining Workload Uptime. Provisioning a minimum of three worker nodes, or two nodes per AZ, is also recommended for stateless workloads.

    For more information about creating clusters, see Creating Clusters.

    Note: Changing a plan’s Worker Node Instances setting does not alter the number of worker nodes on existing clusters. For information about scaling an existing cluster, see Scale Horizontally by Changing the Number of Worker Nodes Using the TKGI CLI in Scaling Existing Clusters.

  11. Under Worker VM Type, select the type of VM to use for Kubernetes worker node VMs. For more information, including worker node VM customization options, see Worker Node VM Number and Size in VM Sizing for Tanzu Kubernetes Grid Integrated Edition Clusters.

    Note: Tanzu Kubernetes Grid Integrated Edition requires a Worker VM Type with an ephemeral disk size of 32 GB or more.

  12. Under Worker Persistent Disk Type, select the size of the persistent disk for the Kubernetes worker node VMs.

  13. Under Worker Availability Zones, select one or more AZs for the Kubernetes worker nodes. Tanzu Kubernetes Grid Integrated Edition deploys worker nodes equally across the AZs you select.

  14. Under Kubelet customization - system-reserved, enter resource values that Kubelet can use to reserve resources for system daemons. For example, memory=250Mi, cpu=150m. For more information about system-reserved values, see the Kubernetes documentation.

  15. Under Kubelet customization - eviction-hard, enter threshold limits that Kubelet can use to evict pods when they exceed the limit. Enter limits in the format EVICTION-SIGNAL=QUANTITY. For example, memory.available=100Mi, nodefs.available=10%, nodefs.inodesFree=5%. For more information about eviction thresholds, see the Kubernetes documentation. A sketch showing how this field and the system-reserved field map to kubelet settings appears after this procedure.

    WARNING: Use the Kubelet customization fields with caution. If you enter values that are invalid or that exceed the limits the system supports, Kubelet might fail to start. If Kubelet fails to start, you cannot create clusters.

  16. Under Errand VM Type, select the size of the VM that contains the errand. The smallest instance possible is sufficient, as the only errand running on this VM is the one that applies the Default Cluster App YAML configuration.

  17. (Optional) Under (Optional) Add-ons - Use with caution, enter additional YAML configuration to add custom workloads to each cluster in this plan. You can specify multiple files using --- as a separator. For more information, see Adding Custom Linux Workloads.

  18. (Optional) To create clusters with resizable persistent volumes or to allow users to create pods with privileged containers, select the Allow Privileged option. For information about creating clusters with resizable persistent volumes, see Cloud Native Storage (CNS) on vSphere. For more information about privileged mode, see Pods in the Kubernetes documentation.

    Note: Enabling the Allow Privileged option means that all containers in the cluster can run in privileged mode. Pod Security Policy (PSP) provides a privileged parameter that you can use to allow or prevent pods from running in privileged mode. As a best practice, if you enable Allow Privileged, define PSPs that limit which pods run in privileged mode. If you implement PSPs for privileged pods, you must also enable Allow Privileged.

  19. (Optional) Enable or disable one or more admission controller plugins: PodSecurityPolicy, DenyEscalatingExec, and SecurityContextDeny. For more information, see Using Admission Control Plugins for Tanzu Kubernetes Grid Integrated Edition Clusters.

  20. (Optional) Under Node Drain Timeout(mins), enter the timeout in minutes for the node to drain pods. If you set this value to 0, the node drain never times out.

  21. (Optional) Under Pod Shutdown Grace Period (seconds), enter a timeout in seconds for the node to wait before it forces the pod to terminate. If you set this value to -1, the default timeout is set to the one specified by the pod.

  22. (Optional) To configure when the node drains, enable the following:

    • Force node to drain even if it has running pods not managed by a ReplicationController, ReplicaSet, Job, DaemonSet or StatefulSet.
    • Force node to drain even if it has running DaemonSet-managed pods.
    • Force node to drain even if it has running pods using emptyDir.
    • Force node to drain even if pods are still running after timeout.

    Warning: If you select Force node to drain even if pods are still running after timeout, the node kills all running workloads on pods. Before enabling this configuration, set Node Drain Timeout to a value greater than 0.

    For more information about configuring default node drain behavior, see Worker Node Hangs Indefinitely in Troubleshooting.

  23. Click Save.
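
The two Kubelet customization fields in steps 14 and 15 take comma-separated entries that correspond to the kubelet's system-reserved and eviction-hard settings. As a rough sketch, and assuming the tile values are translated into the kubelet's usual forms (--system-reserved uses key=value pairs, while --eviction-hard on a kubelet command line expresses the same thresholds as SIGNAL<QUANTITY), the example values above would come out roughly as follows:

    # Rough sketch only: shows the kubelet settings the tile fields correspond to.
    # The exact mechanism TKGI uses to apply them (command-line flags or a kubelet
    # config file) is an assumption; inspect a worker node if in doubt.
    def to_kubelet_args(system_reserved, eviction_hard):
        reserved = system_reserved.replace(" ", "")
        # the tile uses SIGNAL=QUANTITY; the kubelet's --eviction-hard flag
        # expresses the same thresholds as SIGNAL<QUANTITY
        evictions = eviction_hard.replace(" ", "").replace("=", "<")
        return ["--system-reserved=" + reserved, "--eviction-hard=" + evictions]

    for arg in to_kubelet_args("memory=250Mi, cpu=150m",
                               "memory.available=100Mi, nodefs.available=10%, nodefs.inodesFree=5%"):
        print(arg)
    # --system-reserved=memory=250Mi,cpu=150m
    # --eviction-hard=memory.available<100Mi,nodefs.available<10%,nodefs.inodesFree<5%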

Deactivate a Plan

To deactivate a plan, perform the following steps:

  1. Click the plan that you want to deactivate.
  2. Select Inactive.
  3. Click Save.

Kubernetes Cloud Provider

In the procedure below, you use credentials for vCenter master VMs. You must have provisioned the service account with the correct permissions. For more information, see Create the Master Node Service Account in Preparing vSphere Before Deploying Tanzu Kubernetes Grid Integrated Edition.

To configure your Kubernetes cloud provider settings, follow the procedure below:

  1. Click Kubernetes Cloud Provider.
  2. Under Choose your IaaS, select vSphere.
  3. Ensure the values in the following procedure match those in the vCenter Config section of the Ops Manager tile.

    vSphere pane configuration
    1. Enter your vCenter Master Credentials. Enter the username using the format user@example.com. For more information about the master node service account, see Preparing vSphere Before Deploying Tanzu Kubernetes Grid Integrated Edition.
    2. Enter your vCenter Host. For example, vcenter-example.com.
    3. Enter your Datacenter Name. For example, example-dc.
    4. Enter your Datastore Name. For example, example-ds. Populate Datastore Name with the Persistent Datastore name configured in your BOSH Director tile under vCenter Config > Persistent Datastore Names. The Datastore Name field should contain a single Persistent datastore.

      Note: The vSphere datastore type must be Datastore. Tanzu Kubernetes Grid Integrated Edition does not support the use of vSphere Datastore Clusters with or without Storage DRS. For more information, see Datastores and Datastore Clusters in the vSphere documentation.

      Note: The Datastore Name is the default datastore used if the Kubernetes cluster StorageClass does not define a StoragePolicy. Do not enter a datastore that is a list of BOSH Job/VMDK datastores. For more information, see PersistentVolume Storage Options on vSphere.

      Note: For multi-AZ and multi-cluster environments, your Datastore Name should be a shared Persistent datastore available to each vSphere cluster. Do not enter a datastore that is local to a single cluster. For more information, see PersistentVolume Storage Options on vSphere.

    5. Enter the Stored VM Folder so that the persistent stores know where to find the VMs. To retrieve the name of the folder, navigate to your BOSH Director tile, click vCenter Config, and locate the value for VM Folder. The default folder name is pks_vms.
    6. Click Save.

Networking

To configure networking, do the following:

  1. Click Networking.
  2. Under Container Networking Interface, select NSX-T.
    NSX-T Networking configuration pane in TKGI tile
    1. For NSX Manager hostname, enter the hostname or IP address of your NSX Manager.
    2. For NSX Manager Super User Principal Identity Certificate, copy and paste the contents and private key of the Principal Identity certificate you created in Generating and Registering the NSX Manager Superuser Principal Identity Certificate and Key.
    3. For NSX Manager CA Cert, copy and paste the contents of the NSX Manager CA certificate you created in Generate and Register the NSX-T Management SSL Certificate and Private Key. Use this certificate and key to connect to the NSX Manager.
    4. The Disable SSL certificate verification checkbox is not selected by default. To disable TLS verification, select the checkbox. You may want to disable TLS verification if you did not enter a CA certificate, or if your CA certificate is self-signed.

      Note: The NSX Manager CA Cert field and the Disable SSL certificate verification option are intended to be mutually exclusive. If you disable SSL certificate verification, leave the CA certificate field blank. If you enter a certificate in the NSX Manager CA Cert field, do not disable SSL certificate verification. If you populate the certificate field and disable certificate validation, insecure mode takes precedence.

    5. If you are using a NAT deployment topology, leave the NAT mode checkbox selected. If you are using a No-NAT topology, clear this checkbox. For more information, see Deployment Topologies.
    6. Enter the following IP Block settings:
      NSX-T Networking configuration pane in Ops Manager
      • Pods IP Block ID: Enter the UUID of the IP block to be used for Kubernetes pods. Tanzu Kubernetes Grid Integrated Edition allocates IP addresses for the pods when they are created in Kubernetes. Each time a namespace is created in Kubernetes, a subnet from this IP block is allocated. The current subnet size that is created is /24, which means a maximum of 256 pods can be created per namespace.
      • Nodes IP Block ID: Enter the UUID of the IP block to be used for Kubernetes nodes. Tanzu Kubernetes Grid Integrated Edition allocates IP addresses for the nodes when they are created in Kubernetes. The node networks are created on a separate IP address space from the pod networks. The current subnet size that is created is /24, which means a maximum of 256 nodes can be created per cluster.
        For more information, including sizes and the IP blocks to avoid using, see Plan IP Blocks in Preparing NSX-T Before Deploying Tanzu Kubernetes Grid Integrated Edition.
    7. For T0 Router ID, enter the t0-tkgi T0 router UUID. Locate this value in the NSX-T UI router overview.
    8. For Floating IP Pool ID, enter the ip-pool-vips ID that you created for load balancer VIPs. For more information, see Plan Network CIDRs. Tanzu Kubernetes Grid Integrated Edition uses the floating IP pool to allocate IP addresses to the load balancers created for each of the clusters. The load balancer routes the API requests to the master nodes and the data plane.
    9. For Nodes DNS, enter one or more Domain Name Servers used by the Kubernetes nodes.
    10. For vSphere Cluster Names, enter a comma-separated list of the vSphere clusters where you will deploy Kubernetes clusters.
      The NSX-T pre-check errand uses this field to verify that the hosts from the specified clusters are available in NSX-T. You can specify clusters in this format: cluster1,cluster2,cluster3.
    11. For Kubernetes Service Network CIDR Range, specify an IP address and subnet size depending on the number of Kubernetes services that you plan to deploy within a single Kubernetes cluster, for example: 10.100.200.0/24. The IP address used here is internal to the cluster and can be anything, such as 10.100.200.0. A /24 subnet provides 256 IPs. If you have a cluster that requires more than 256 IPs, define a larger subnet, such as /20. A subnet-sizing sketch appears after this procedure.
  3. (Optional) Configure a global proxy for all outgoing HTTP and HTTPS traffic from your Kubernetes clusters and the TKGI API server. See Using Proxies with Tanzu Kubernetes Grid Integrated Edition on NSX-T for instructions on how to enable a proxy.
  4. Under Allow outbound internet access from Kubernetes cluster vms (IaaS-dependent), ignore the Enable outbound internet access checkbox.
  5. Click Save.
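
A quick way to sanity-check the subnet sizing mentioned above (a /24 carved out of the Pods IP Block per namespace, and the Kubernetes Service Network CIDR Range example) is the Python standard library. The pods CIDR below is a made-up example; the service CIDRs are the examples from this page:

    # Subnet-sizing check using only the standard library. The 172.16.0.0/24
    # pods example is made up; the service CIDRs are the examples on this page.
    import ipaddress

    for cidr in ("172.16.0.0/24",      # one pod subnet carved out per namespace
                 "10.100.200.0/24",    # Kubernetes Service Network CIDR example
                 "10.100.192.0/20"):   # larger service range if /24 is too small
        net = ipaddress.ip_network(cidr)
        print(cidr, "->", net.num_addresses, "addresses")
    # /24 -> 256 addresses, /20 -> 4096 addresses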

UAA

To configure the UAA server:

  1. Click UAA.
  2. Under TKGI API Access Token Lifetime, enter a time in seconds for the TKGI API access token lifetime. This field defaults to 600.

    UAA pane configuration

  3. Under TKGI API Refresh Token Lifetime, enter a time in seconds for the TKGI API refresh token lifetime. This field defaults to 21600.

  4. Under TKGI Cluster Access Token Lifetime, enter a time in seconds for the cluster access token lifetime. This field defaults to 600.

  5. Under TKGI Cluster Refresh Token Lifetime, enter a time in seconds for the cluster refresh token lifetime. This field defaults to 21600.

    Note: VMware recommends using the default UAA token timeout values. By default, access tokens expire after ten minutes and refresh tokens expire after six hours.

  6. Under Configure created clusters to use UAA as the OIDC provider, select Enabled or Disabled. This is a global default setting for TKGI-provisioned clusters. For more information, see OIDC Provider for Kubernetes Clusters.

    To configure Tanzu Kubernetes Grid Integrated Edition to use UAA as the OIDC provider:

    1. Under Configure created clusters to use UAA as the OIDC provider, select Enabled.
    2. For UAA OIDC Groups Claim, enter the name of your groups claim. This is used to set a user’s group in the JSON Web Token (JWT) claim. The default value is roles.
    3. For UAA OIDC Groups Prefix, enter a prefix for your groups claim. This prevents conflicts with existing names. For example, if you enter the prefix oidc:, UAA creates a group name like oidc:developers. The default value is oidc:.
    4. For UAA OIDC Username Claim, enter the name of your username claim. This is used to set a user’s username in the JWT claim. The default value is user_name. Depending on your provider, you can enter claims besides user_name, like email or name.
    5. For UAA OIDC Username Prefix, enter a prefix for your username claim. This prevents conflicts with existing names. For example, if you enter the prefix oidc:, UAA creates a username like oidc:admin. The default value is oidc:.

      Warning: VMware recommends adding OIDC prefixes to prevent users and groups from gaining unintended cluster privileges. If you change the above values for a pre-existing Tanzu Kubernetes Grid Integrated Edition installation, you must change any existing role bindings that bind to a username or group. If you do not change your role bindings, developers cannot access Kubernetes clusters. For instructions, see Managing Cluster Access and Permissions.

  7. Select one of the following options:

(Optional) Host Monitoring

In Host Monitoring, you can configure monitoring of nodes and VMs using Syslog, VMware vRealize Log Insight (vRLI) Integration, or Telegraf.

Host Monitoring pane

You can configure one or more of the following:

  • Syslog: To configure Syslog, see Syslog below. Syslog forwards log messages from all BOSH-deployed VMs to a syslog endpoint.
  • VMware vRealize Log Insight (vRLI) Integration: To configure VMware vRealize Log Insight (vRLI) Integration, see VMware vRealize Log Insight Integration below. The vRLI integration pulls logs from all BOSH jobs and containers running in the cluster, including node logs from core Kubernetes and BOSH processes, Kubernetes event logs, and pod stdout and stderr.
  • Telegraf: To configure Telegraf, see Configuring Telegraf in TKGI. The Telegraf agent sends metrics from TKGI API, master node, and worker node VMs to a monitoring service, such as Wavefront or Datadog.

For more information about these components, see Monitoring TKGI and TKGI-Provisioned Clusters.

Syslog

To configure Syslog for all BOSH-deployed VMs in Tanzu Kubernetes Grid Integrated Edition:

  1. Click Host Monitoring.
  2. Under Enable Syslog for TKGI, select Yes.
  3. Under Address, enter the destination syslog endpoint.
  4. Under Port, enter the destination syslog port.
  5. Under Transport Protocol, select a transport protocol for log forwarding.
  6. (Optional) To enable TLS encryption during log forwarding, complete the following steps:
    1. Ensure Enable TLS is selected.

      Note: Logs may contain sensitive information, such as cloud provider credentials. VMware recommends that you enable TLS encryption for log forwarding.

    2. Under Permitted Peer, provide the accepted fingerprint (SHA1) or name of the remote peer. For example, *.YOUR-LOGGING-SYSTEM.com. A sketch for computing a certificate’s SHA1 fingerprint appears after this procedure.
    3. Under TLS Certificate, provide a TLS certificate for the destination syslog endpoint.

      Note: You do not need to provide a new certificate if the TLS certificate for the destination syslog endpoint is signed by a Certificate Authority (CA) in your BOSH certificate store.

  7. (Optional) Under Max Message Size, enter a maximum message size for logs that are forwarded to a syslog endpoint. By default, the Max Message Size field is 10,000 characters.
  8. Click Save.
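
If you use a SHA1 fingerprint for Permitted Peer, you need the fingerprint of the certificate your syslog endpoint presents. A minimal sketch that retrieves it using only the Python standard library follows; the hostname is a placeholder and the port assumes the conventional syslog-over-TLS port:

    # Compute the SHA1 fingerprint of the certificate presented by a TLS syslog
    # endpoint. The host and port are placeholders; run this from a machine
    # that can reach your logging system.
    import hashlib
    import ssl

    host, port = "logs.your-logging-system.com", 6514   # placeholder endpoint

    pem = ssl.get_server_certificate((host, port))
    der = ssl.PEM_cert_to_DER_cert(pem)
    digest = hashlib.sha1(der).hexdigest()
    print(":".join(digest[i:i + 2] for i in range(0, len(digest), 2)).upper())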

VMware vRealize Log Insight Integration

Note: Before you configure the vRLI integration, you must have a vRLI license and vRLI must be installed, running, and available in your environment. You need to provide the live instance address during configuration. For instructions and additional information, see the vRealize Log Insight documentation.

  1. By default, vRLI logging is disabled. To configure vRLI logging, under Enable VMware vRealize Log Insight Integration?, select Yes and then perform the following steps:
  2. Under Host, enter the IP address or FQDN of the vRLI host.
  3. (Optional) Select the Enable SSL? checkbox to encrypt the logs being sent to vRLI using SSL.
  4. Choose one of the following SSL certificate validation options:
    • To skip certificate validation for the vRLI host, select the Disable SSL certificate validation checkbox. Select this option if you are using a self-signed certificate in order to simplify setup for a development or test environment.

      Note: Disabling certificate validation is not recommended for production environments.

    • To enable certificate validation for the vRLI host, clear the Disable SSL certificate validation checkbox.
  5. (Optional) If your vRLI certificate is not signed by a trusted CA root or other well known certificate, enter the certificate in the CA certificate field. Locate the PEM of the CA used to sign the vRLI certificate, copy the contents of the certificate file, and paste them into the field. Certificates must be in PEM-encoded format.
  6. Under Rate limiting, enter a time in milliseconds to change the rate at which logs are sent to the vRLI host. The rate limit specifies the minimum time between messages before the fluentd agent begins to drop messages. The default value 0 means that the rate is not limited, which suffices for many deployments. A short conversion sketch appears after this procedure.

    Note: If your deployment is generating a high volume of logs, you can increase this value to limit network traffic. Consider starting with a lower value, such as 10, then tuning to optimize for your deployment. A large number might result in dropping too many log entries.

  7. Click Save. These settings apply to any clusters created after you have saved these configuration settings and clicked Apply Changes. If the Upgrade all clusters errand has been enabled, these settings are also applied to existing clusters.

    Note: The Tanzu Kubernetes Grid Integrated Edition tile does not validate your vRLI configuration settings. To verify your setup, look for log entries in vRLI.
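
Because the Rate limiting value is the minimum gap between forwarded messages, it maps directly to a per-second ceiling. The following calculation is only a tuning aid, not the fluentd implementation, and shows the effect of a few candidate values:

    # Tuning aid only: convert a Rate limiting value (minimum milliseconds
    # between messages) into the approximate per-second ceiling it implies.
    for rate_ms in (0, 10, 100):
        if rate_ms == 0:
            print("0 ms -> rate not limited")
        else:
            print(f"{rate_ms} ms -> at most ~{1000 // rate_ms} messages per second")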

(Optional) In-Cluster Monitoring

In In-Cluster Monitoring, you can configure one or more observability components and integrations that run in Kubernetes clusters and capture logs and metrics about your workloads. For more information, see Monitoring Workers and Workloads.

Cluster Monitoring pane

To configure in-cluster monitoring, follow one or more of the procedures below.

Wavefront

You can monitor Kubernetes cluster and pod metrics externally using the Wavefront by VMware integration.

Note: Before you configure Wavefront integration, you must have an active Wavefront account and access to a Wavefront instance. You provide your Wavefront access token during configuration, and you enable errands to create alerts. For additional information, see the Wavefront documentation.

To enable and configure Wavefront monitoring:

  1. In the Tanzu Kubernetes Grid Integrated Edition tile, select In-Cluster Monitoring.
  2. Under Wavefront Integration, select Yes.
    Wavefront configuration
  3. Under Wavefront URL, enter the URL of your Wavefront subscription. For example:
    https://try.wavefront.com/api
    
  4. Under Wavefront Access Token, enter the API token for your Wavefront subscription.
  5. To configure Wavefront to send alerts by email, enter email addresses or Wavefront Target IDs separated by commas under Wavefront Alert Recipient, using the following syntax:

    USER-EMAIL,WAVEFRONT-TARGETID_001,WAVEFRONT-TARGETID_002
    

    Where:

    • USER-EMAIL is the alert recipient’s email address.
    • WAVEFRONT-TARGETID_001 and WAVEFRONT-TARGETID_002 are your comma-delimited Wavefront Target IDs.

    For example:

    randomuser@example.com,51n6psdj933ozdjf
    

  6. (Optional) For installations that require a proxy server for outbound Internet access, enable access by entering values for HTTP Proxy URL for outbound Internet access, HTTP Proxy Port, Proxy username, and Proxy password.

  7. Click Save.

To create alerts, you must enable errands in Tanzu Kubernetes Grid Integrated Edition.

  1. In the Tanzu Kubernetes Grid Integrated Edition tile, select Errands.
  2. On the Errands pane, enable Create pre-defined Wavefront alerts errand.
  3. Enable Delete pre-defined Wavefront alerts errand.
  4. Click Save. Your settings apply to any clusters created after you have saved these configuration settings and clicked Apply Changes.

The Tanzu Kubernetes Grid Integrated Edition tile does not validate your Wavefront configuration settings. To verify your setup, look for cluster and pod metrics in Wavefront.

VMware vRealize Operations Management Pack for Container Monitoring

You can monitor Tanzu Kubernetes Grid Integrated Edition Kubernetes clusters with VMware vRealize Operations Management Pack for Container Monitoring.

To integrate Tanzu Kubernetes Grid Integrated Edition with VMware vRealize Operations Management Pack for Container Monitoring, you must deploy a container running cAdvisor in your TKGI deployment.

cAdvisor is an open source tool that provides monitoring and statistics for Kubernetes clusters.

To deploy a cAdvisor container:

  1. Select In-Cluster Monitoring.
  2. Under Deploy cAdvisor, select Yes.
  3. Click Save.

For more information about integrating this type of monitoring with TKGI, see the VMware vRealize Operations Management Pack for Container Monitoring User Guide and Release Notes in the VMware documentation.

Metric Sink Resources

You can configure TKGI-provisioned clusters to send Kubernetes node metrics and pod metrics to metric sinks. For more information about metric sink resources and what to do after you enable them in the tile, see Sink Resources in Monitoring Workers and Workloads.

To enable clusters to send Kubernetes node metrics and pod metrics to metric sinks:

  1. In In-Cluster Monitoring, select Enable Metric Sink Resources. If you enable this checkbox, Tanzu Kubernetes Grid Integrated Edition deploys Telegraf as a DaemonSet, a pod that runs on each worker node in all your Kubernetes clusters.
  2. (Optional) To enable Node Exporter to send worker node metrics to metric sinks of kind ClusterMetricSink, select Enable node exporter on workers. If you enable this checkbox, Tanzu Kubernetes Grid Integrated Edition deploys Node Exporter as a DaemonSet, a pod that runs on each worker node in all your Kubernetes clusters.

    For instructions on how to create a metric sink of kind ClusterMetricSink for Node Exporter metrics, see Create a ClusterMetricSink Resource for Node Exporter Metrics in Creating and Managing Sink Resources.

  3. Click Save.

Log Sink Resources

You can configure TKGI-provisioned clusters to send Kubernetes API events and pod logs to log sinks. For more information about log sink resources and what to do after you enable them in the tile, see Sink Resources in Monitoring Workers and Workloads.

To enable clusters to send Kubernetes API events and pod logs to log sinks:

  1. Select Enable Log Sink Resources. If you enable this checkbox, Tanzu Kubernetes Grid Integrated Edition deploys Fluent Bit as a DaemonSet, a pod that runs on each worker node in all your Kubernetes clusters.
  2. Click Save.

Tanzu Mission Control (Experimental)

Participants in the VMware Tanzu Mission Control beta program can use the Tanzu Mission Control (Experimental) pane of the Tanzu Kubernetes Grid Integrated Edition tile to integrate their Tanzu Kubernetes Grid Integrated Edition deployment with Tanzu Mission Control.

Tanzu Mission Control integration lets you monitor and manage Tanzu Kubernetes Grid Integrated Edition clusters from the Tanzu Mission Control console, which makes the Tanzu Mission Control console a single point of control for all Kubernetes clusters.

Warning: VMware Tanzu Mission Control is currently experimental beta software and is intended for evaluation and test purposes only. For more information about Tanzu Mission Control, see the VMware Tanzu Mission Control home page.

To integrate Tanzu Kubernetes Grid Integrated Edition with Tanzu Mission Control:

  1. Confirm that the TKGI API VM has internet access and can connect to cna.tmc.cloud.vmware.com and the other outbound URLs listed in the What Happens When You Attach a Cluster section of the Tanzu Mission Control documentation. A reachability sketch appears after this procedure.

  2. Navigate to the Tanzu Kubernetes Grid Integrated Edition tile > the Tanzu Mission Control (Experimental) pane and select Yes under Tanzu Mission Control Integration.

    Tanzu Mission Control Integration

  3. Configure the fields below:

    • Tanzu Mission Control URL: Enter the Org URL of your Tanzu Mission Control subscription, without a trailing slash (/). For example, YOUR-ORG.tmc.cloud.vmware.com.
    • VMware Cloud Services API token: Enter your API token to authenticate with VMware Cloud Services APIs. You can retrieve this token by logging in to VMware Cloud Services and viewing your account information.
    • Tanzu Mission Control Cluster Group: Enter the name of a Tanzu Mission Control cluster group.

      The name can be default or another value, depending on your role and access policy:

      • Org Member users in VMware cloud services have a service.admin role in Tanzu Mission Control. These users:
        • By default, can create and attach clusters only in the default cluster group.
        • Can create and attach clusters to other cluster groups after an organization.admin user grants them the clustergroup.admin or clustergroup.edit role for those groups.
      • Org Owner users in VMware cloud services have organization.admin permissions in Tanzu Mission Control. These users:
        • Can create cluster groups.
        • Can grant clustergroup roles to service.admin users through the Tanzu Mission Control Access Policy view.

      For more information about role and access policy, see Access Control in the VMware Tanzu Mission Control Product Documentation.

    • Tanzu Mission Control Cluster Name Prefix: Enter a name prefix for identifying the Tanzu Kubernetes Grid Integrated Edition clusters in Tanzu Mission Control.

  4. Click Save.

Warning: After the Tanzu Kubernetes Grid Integrated Edition tile is deployed with a configured cluster group, the cluster group cannot be updated.

Note: When you upgrade your Kubernetes clusters and have Tanzu Mission Control integration enabled, existing clusters will be attached to Tanzu Mission Control.
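
Step 1 of this procedure asks you to confirm that the TKGI API VM can reach cna.tmc.cloud.vmware.com and the other outbound URLs. Below is a minimal reachability sketch you could run from that VM; it only confirms a TCP connection on port 443 and should be extended with the full URL list from the Tanzu Mission Control documentation:

    # Minimal outbound-reachability check for the Tanzu Mission Control
    # integration. Run from the TKGI API VM; extend the tuple with the other
    # outbound URLs listed in the Tanzu Mission Control documentation.
    import socket

    for host in ("cna.tmc.cloud.vmware.com",):
        try:
            with socket.create_connection((host, 443), timeout=5):
                print(host + ":443 reachable")
        except OSError as err:
            print(host + ":443 NOT reachable:", err)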

CEIP and Telemetry

Tanzu Kubernetes Grid Integrated Edition-provisioned clusters send usage data to the TKGI control plane for storage. The VMware Customer Experience Improvement Program (CEIP) and Telemetry Program provide the option to also send the cluster usage data to VMware to improve customer experience.

To configure Tanzu Kubernetes Grid Integrated Edition CEIP and Telemetry Program settings:

  1. Click CEIP and Telemetry.
  2. Review the information about the CEIP and Telemetry.
  3. To specify your level of participation in the CEIP and Telemetry program, select one of the Participation Level options:
    • None: If you select this option, the data collected by Tanzu Kubernetes Grid Integrated Edition is not sent to VMware.
    • (Default) Standard: If you select this option, the data collected by Tanzu Kubernetes Grid Integrated Edition is sent to VMware anonymously. This anonymous participation level does not permit the CEIP and Telemetry to identify your organization.
    • Enhanced: If you select this option, the data collected by Tanzu Kubernetes Grid Integrated Edition is sent to VMware. This participation level permits the CEIP and Telemetry to identify your organization.

      For more information about the CEIP and Telemetry participation levels, see Participation Levels in Telemetry.
  4. If you selected Standard or Enhanced, open your firewall to allow outgoing access to https://vcsa.vmware.com/ph on port 443.
  5. If you selected Enhanced, complete the following:
    • Enter your account number or customer number in the VMware Account Number or Pivotal Customer Number field:
      • If you are a VMware customer, you can find your VMware Account Number in your Account Summary on my.vmware.com.
      • If you started as a Pivotal customer, you can find your Customer Number in your Order Confirmation email.
    • (Optional) Enter a descriptive name for your TKGI installation in the TKGI Installation Label field. The label you assign to this installation will be used in telemetry reports to identify the environment.
  6. To provide information about the purpose for this installation, select an option in the TKGI Installation Type list.
    CEIP and Telemetry installation type
  7. Click Save.

Errands

Errands are scripts that run at designated points during an installation.

To configure which post-deploy and pre-delete errands run for Tanzu Kubernetes Grid Integrated Edition:

  1. Make a selection in the dropdown next to each errand.
    Errand configuration pane

    Note: We recommend that you use the default settings for all errands except for the NSX-T validation and Run smoke tests errands.

  2. (Optional) Set the NSX-T validation errand to On.

    This errand verifies the NSX-T objects.
  3. (Optional) Set the Run smoke tests errand to On.

    This errand uses the TKGI CLI to create a Kubernetes cluster and then delete it. If the creation or deletion fails, the errand fails and the installation of the Tanzu Kubernetes Grid Integrated Edition tile is aborted.

  4. (Optional) To ensure that all of your cluster VMs are patched, set the Upgrade all clusters errand to On.

    Updating the Tanzu Kubernetes Grid Integrated Edition tile with a new Linux stemcell and the Upgrade all clusters errand enabled triggers the rolling of every Linux VM in each Kubernetes cluster. Similarly, updating the Tanzu Kubernetes Grid Integrated Edition tile with a new Windows stemcell triggers the rolling of every Windows VM in your Kubernetes clusters.

    Warning: To avoid workload downtime, use the resource configuration recommended in About Tanzu Kubernetes Grid Integrated Edition Upgrades and Maintaining Workload Uptime.

Resource Config

To modify the resource configuration of Tanzu Kubernetes Grid Integrated Edition, follow the steps below:

  1. Select Resource Config.

  2. For each job, review the Automatic values in the following fields:

    • VM TYPE: By default, the TKGI Database and TKGI API jobs are set to the same Automatic VM type. If you want to adjust this value, we recommend that you select the same VM type for both jobs.

      Note: The Automatic VM TYPE values match the recommended resource configuration for the TKGI API and TKGI Database jobs.

    • PERSISTENT DISK TYPE: By default, the TKGI Database and TKGI API jobs are set to the same persistent disk type. If you want to adjust this value, you can change the persistent disk type for each of the jobs independently. Using the same persistent disk type for both jobs is not required.

  3. Under each job, leave NSX-T CONFIGURATION and NSX-V CONFIGURATION blank.

Step 3: Apply Changes

After configuring the Tanzu Kubernetes Grid Integrated Edition tile, follow the steps below to deploy the tile:

  1. Return to the Ops Manager Installation Dashboard.
  2. Click Review Pending Changes. Select the product that you intend to deploy and review the changes. For more information, see Reviewing Pending Product Changes.
  3. Click Apply Changes.

Step 4: Install the TKGI and Kubernetes CLIs

The TKGI CLI and the Kubernetes CLI help you interact with your Tanzu Kubernetes Grid Integrated Edition-provisioned Kubernetes clusters and Kubernetes workloads. To install the CLIs, follow the instructions below:

Step 5: Verify NAT Rules

If you are using NAT mode, verify that you have created the required NAT rules for the Tanzu Kubernetes Grid Integrated Edition Management Plane. See Create Management Plane in Installing and Configuring NSX-T Data Center v3.0 for TKGI for details.

In addition, for NAT and no-NAT modes, verify that you created the required NAT rule for Kubernetes master nodes to access NSX Manager. For details, see Create IP Blocks and Pool for Compute Plane in Installing and Configuring NSX-T Data Center v3.0 for TKGI.

If you want your developers to be able to access the TKGI CLI from their external workstations, create a DNAT rule that maps a routable IP address to the TKGI API VM. This must be done after Tanzu Kubernetes Grid Integrated Edition is successfully deployed and it has an IP address. See Create Management Plane in Installing and Configuring NSX-T Data Center v3.0 for TKGI for details.
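
After the DNAT rule is in place and DNS points the TKGI API FQDN at the routable address, a quick check from an external workstation confirms that developers can reach the API. This is a sketch only: the FQDN is the example used on this page, and the assumption that the TKGI API listens on port 9021 should be verified against your environment.

    # Quick external check of the DNAT rule and DNS record for the TKGI API.
    # api.tkgi.example.com is this page's example FQDN; port 9021 is assumed
    # to be the TKGI API port -- verify both for your environment.
    import socket

    fqdn, port = "api.tkgi.example.com", 9021
    addr = socket.gethostbyname(fqdn)          # should be the routable (DNAT) IP
    print(fqdn, "resolves to", addr)
    try:
        with socket.create_connection((addr, port), timeout=5):
            print("TKGI API reachable at {}:{}".format(addr, port))
    except OSError as err:
        print("Cannot reach {}:{}: {}".format(addr, port, err))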

Step 6: Configure Authentication for Tanzu Kubernetes Grid Integrated Edition

Follow the procedures in Setting Up Tanzu Kubernetes Grid Integrated Edition Admin Users on vSphere in Installing Tanzu Kubernetes Grid Integrated Edition > vSphere.

Next Steps

After installing Tanzu Kubernetes Grid Integrated Edition on vSphere with NSX-T integration, complete the following tasks:


Please send any feedback you have to pks-feedback@pivotal.io.