Release Notes

Page last updated:

This topic contains release notes for Tanzu Kubernetes Grid Integrated Edition (TKGI) v1.9.

TKGI v1.9.6

Release Date: July 13, 2021

Product Snapshot

Release Details
Version v1.9.6
Release date July 13, 2021
Component Version
cAdvisor v0.39.1
CSI Driver for vSphere v2.0
v1.0.2
CoreDNS v1.6.7_vmware.8
Docker Linux: v20.10.7
Windows: v19.03.14
etcd v3.4.3
Harbor v2.2.2
Kubernetes v1.18.18
Metrics Server v0.3.6
NCP v3.0.2.5
Percona XtraDB Cluster (PXC) v0.35.0
UAA v74.5.23
Velero v1.4.2
VMware Cloud Foundation (VCF) v4.1
v4.0
Compatibilities Versions
Ops Manager See VMware Tanzu Network**.
NSX-T* See VMware Product Interoperability Matrices.
vSphere
Windows stemcells v2019.34 or later
Xenial stemcells See VMware Tanzu Network.

* VMware recommends NSX-T v3.0.1.1 (v3.0.1 EP1) or later, or NSX-T v2.5.2 or later for NSX-T integration.

** Do not use TKGI with Ops Manager v2.10.15. For more information, see v2.10.15 in Ops Manager v2.10 Release Notes.

Upgrade Path

The supported upgrade paths to TKGI v1.9.6 are from Tanzu Kubernetes Grid Integrated Edition v1.8.0 and later or Tanzu Kubernetes Grid Integrated Edition v1.9.0 and later.

Features

This section describes new features and changes in VMware Tanzu Kubernetes Grid Integrated Edition v1.9.6.

Known Issues

Except where noted, the known issues in Tanzu Kubernetes Grid Integrated Edition v1.9.5 are also in Tanzu Kubernetes Grid Integrated Edition v1.9.6. See Known Issues in Tanzu Kubernetes Grid Integrated Edition v1.9.5 below.

Warning: Do not use TKGI with Ops Manager v2.10.15. For more information, see v2.10.15 in Ops Manager v2.10 Release Notes.


TKGI v1.9.5

Release Date: April 7, 2021

Product Snapshot

Release Details
Version v1.9.5
Release date April 7, 2021
Component Version
Kubernetes v1.18.16
CoreDNS v1.6.7+vmware.8
Docker Linux: v19.03.14
Windows: v19.03.14
etcd v3.4.3
Metrics Server v0.3.6
NCP v3.0.2.4
Percona XtraDB Cluster (PXC) v0.30.0
UAA v74.5.20
Compatibilities Versions
Ops Manager See VMware Tanzu Network**.
Xenial stemcells See VMware Tanzu Network.
Windows stemcells v2019.31 or later
NSX-T* See VMware Product Interoperability Matrices.
vSphere
VMware Cloud Foundation (VCF) v4.1
v4.0
CSI Driver for vSphere v2.0
v1.0.2
Harbor v2.1.3
Velero v1.4.2

* VMware recommends NSX-T v3.0.1.1 (v3.0.1 EP1) or later, or NSX-T v2.5.2 or later for NSX-T integration.

** Do not use TKGI with Ops Manager v2.10.15. For more information, see v2.10.15 in Ops Manager v2.10 Release Notes.

Upgrade Path

The supported upgrade paths to TKGI v1.9.5 are from Tanzu Kubernetes Grid Integrated Edition v1.8.0 and later.

Features

This section describes new features and changes in VMware Tanzu Kubernetes Grid Integrated Edition v1.9.5.

Known Issues

Except where noted, the known issues in Tanzu Kubernetes Grid Integrated Edition v1.9.4 are also in Tanzu Kubernetes Grid Integrated Edition v1.9.5. See Known Issues in Tanzu Kubernetes Grid Integrated Edition v1.9.4 below.

Warning: Do not use TKGI with Ops Manager v2.10.15. For more information, see v2.10.15 in Ops Manager v2.10 Release Notes.


Workloads Using Dynamic PVs Must be Removed Before Deleting a Cluster

Symptom

Your tkgi delete-cluster operation hangs while draining a worker VM containing a Pod bound to one or more dynamic persistent volumes.

Workaround

Before deleting the cluster, remove all workloads.

If you have already attempted to delete a cluster and your workloads with dynamic PVs have stopped, remove the dynamic PVs that remain attached to the worker VMs in your cluster. For information about removing dynamic PVs from a worker VM, see Workloads using dynamic PersistentVolumes (PVs) must be removed before deleting a cluster in the Knowledge Base.


TKGI v1.9.4

Release Date: February 3, 2021

Product Snapshot

Release Details
Version v1.9.4
Release date February 3, 2021
Component Version
Kubernetes v1.18.14
CoreDNS v1.6.7+vmware.7
Docker Linux: v19.03.13
Windows: v19.03.13
etcd v3.4.3
Metrics Server v0.3.6
NCP v3.0.2.3
Percona XtraDB Cluster (PXC) v0.30.0
UAA v74.5.20
Compatibilities Versions
Ops Manager See VMware Tanzu Network**.
Windows worker support on vSphere with NSX-T requires v2.10.4 or later
Xenial stemcells See VMware Tanzu Network.
Windows stemcells v2019.29 or later
NSX-T* See VMware Product Interoperability Matrices.
vSphere
VMware Cloud Foundation (VCF) v4.1
v4.0
CSI Driver for vSphere v2.0
v1.0.2
Harbor v2.1.1
Velero v1.4.2

* VMware recommends NSX-T v3.0.1.1 (v3.0.1 EP1) or later, or NSX-T v2.5.2 or later for NSX-T integration.

** Do not use TKGI with Ops Manager v2.10.15. For more information, see v2.10.15 in Ops Manager v2.10 Release Notes.

Upgrade Path

The supported upgrade paths to TKGI v1.9.4 are from Tanzu Kubernetes Grid Integrated Edition v1.8.0 and later.

Features

This section describes new features and changes in VMware Tanzu Kubernetes Grid Integrated Edition v1.9.4.

Known Issues

Except where noted, the known issues in Tanzu Kubernetes Grid Integrated Edition v1.9.3 are also in Tanzu Kubernetes Grid Integrated Edition v1.9.4. See Known Issues in Tanzu Kubernetes Grid Integrated Edition v1.9.3 below.

Warning: Do not use TKGI with Ops Manager v2.10.15. For more information, see v2.10.15 in Ops Manager v2.10 Release Notes.


TKGI v1.9.3

Release Date: January 8, 2021

Product Snapshot

Release Details
Version v1.9.3
Release date January 8, 2021
Component Version
Kubernetes v1.18.14
CoreDNS v1.6.7+vmware.7
Docker Linux: v19.03.13
Windows: v19.03.13
etcd v3.4.5
Metrics Server v0.3.6
NCP v3.0.2.3
Percona XtraDB Cluster (PXC) v0.30.0
UAA v74.5.20
Compatibilities Versions
Ops Manager See VMware Tanzu Network**.
Windows worker support on vSphere with NSX-T requires v2.10.4 or later
Xenial stemcells See VMware Tanzu Network.
Windows stemcells v2019.28 or later
NSX-T* See VMware Product Interoperability Matrices.
vSphere
VMware Cloud Foundation (VCF) Do not use TKGI v1.9.3 with VCF. See TKGI Upgrade or Install Fails with Error “x509: certificate relies on legacy Common Name field”.
CSI Driver for vSphere v2.0
v1.0.2
Harbor v2.1.1
Velero v1.4.2

* VMware recommends NSX-T v3.0.1.1 (v3.0.1 EP1) or later, or NSX-T v2.5.2 or later for NSX-T integration.

** Do not use TKGI with Ops Manager v2.10.15. For more information, see v2.10.15 in Ops Manager v2.10 Release Notes.

Upgrade Path

The supported upgrade paths to TKGI v1.9.3 are from Tanzu Kubernetes Grid Integrated Edition v1.8.0 and later.

Features

This section describes new features and changes in VMware Tanzu Kubernetes Grid Integrated Edition v1.9.3.

Known Issues

Except where noted, the known issues in Tanzu Kubernetes Grid Integrated Edition v1.9.2 are also in Tanzu Kubernetes Grid Integrated Edition v1.9.3. See Known Issues in Tanzu Kubernetes Grid Integrated Edition v1.9.2 below.

Warning: Do not use TKGI with Ops Manager v2.10.15. For more information, see v2.10.15 in Ops Manager v2.10 Release Notes.


TKGI v1.9.2

Release Date: December 15, 2020

Product Snapshot

Release Details
Version v1.9.2
Release date December 15, 2020
Component Version
Kubernetes v1.18.12
CoreDNS v1.6.7+vmware.6
Docker Linux: v19.03.5
Windows: v19.03.12
etcd v3.4.5
Metrics Server v0.3.6
NCP v3.0.2.3
Percona XtraDB Cluster (PXC) v0.30.0
UAA v74.5.20
Compatibilities Versions
Ops Manager See VMware Tanzu Network**.
Windows worker support on vSphere with NSX-T requires v2.10.2 or later
Xenial stemcells See VMware Tanzu Network.
Windows stemcells v2019.28 or later
NSX-T* See VMware Product Interoperability Matrices.
vSphere
VMware Cloud Foundation (VCF) Do not use TKGI v1.9.2 with VCF. See TKGI Upgrade or Install Fails with Error “x509: certificate relies on legacy Common Name field”.
CSI Driver for vSphere v2.0
v1.0.2
Harbor v2.1.1
Velero v1.4.2

* VMware recommends NSX-T v3.0.1.1 (v3.0.1 EP1) or later, or NSX-T v2.5.2 or later for NSX-T integration.

** Do not use TKGI with Ops Manager v2.10.15. For more information, see v2.10.15 in Ops Manager v2.10 Release Notes.

Upgrade Path

The supported upgrade paths to TKGI v1.9.2 are from Tanzu Kubernetes Grid Integrated Edition v1.8.0 and later.

Features

This section describes new features and changes in VMware Tanzu Kubernetes Grid Integrated Edition v1.9.2.

Known Issues

Except where noted, the known issues in Tanzu Kubernetes Grid Integrated Edition v1.9.0 are also in Tanzu Kubernetes Grid Integrated Edition v1.9.2. See Known Issues in Tanzu Kubernetes Grid Integrated Edition v1.9.0 below.

Warning: Do not use TKGI with Ops Manager v2.10.15. For more information, see v2.10.15 in Ops Manager v2.10 Release Notes.

TKGI v1.9.2 has the following known issues:


Your TKGI Cluster Fails to Start After Changing Your Worker Node’s Compute Profile AZ

This issue is fixed in TKGI v1.9.3.

Symptom

The worker nodes in your TKGI cluster fail to start after you modify the AZ in the node’s compute profile. For example, you may have modified the compute profile AZ for the existing TKGI cluster, or attached a compute profile configured with an AZ to the existing TKGI cluster.

Explanation

In TKGI v1.9.2, the AZ for existing worker nodes cannot be changed by modifying their cluster’s compute profile.

See also You Cannot Change a Master Node’s AZ.

Workaround

Change the AZ for worker nodes in an existing cluster by modifying the cluster’s plan.


TKGI CLI get-credentials Returns an "od-broker" Error

This issue is fixed in TKGI v1.9.5.
The probability of encountering this issue has been reduced in TKGI v1.9.4.

Symptom

The TKGI CLI get-credentials command occasionally returns the error "od-broker is processing a request for the same instance… please try again later" during periods of intermittent latency.


Pods are Stuck in the Init State and Return the DeadlineExceeded RPC Error

This issue is fixed in TKGI v1.9.5.

Symptom

You might experience one or more of the following when upgrading to TKGI v1.9.2, v1.9.3 or v1.9.4:

  • You are unable to delete Pods.
  • Your Pods are stuck in the Init state.
  • kubectl describe pod returns the following:

    Failed to create pod sandbox: rpc error: code = DeadlineExceeded desc = context deadline exceeded
    
  • Your kubelet.stderr.log log includes either of the following errors:

    Failed to stop sandbox
    

    or

    CNI failed to retrieve network namespace path: cannot find network namespace for the terminated container
    

Explanation

A node agent has hung because it was unable to obtain a lock while querying data from cache.

Workaround

To resolve this issue, restart the nsx-node-agent process:

  1. Follow Retrieve Your Cluster Deployment Names to retrieve the deployment name of your workload cluster.

  2. Use bosh ssh to access the cluster:

    bosh ssh -d DEPLOYMENT-NAME worker
    

    Where DEPLOYMENT-NAME is the workload cluster deployment.

  3. Restart nsx-node-agent on the cluster’s worker node VMs:

    sudo -i
    monit restart nsx-node-agent
    
  4. Confirm the nsx-node-agent status:

    watch monit summary
    
  5. Wait for nsx-node-agent to restart.


TKGI v1.9.1

Release Date: November 4, 2020

Product Snapshot

Release Details
Version v1.9.1
Release date November 4, 2020
Component Version
Kubernetes v1.18.8
CoreDNS v1.6.7+vmware.3
Docker Linux: v19.03.5
Windows: v19.03.11
etcd v3.4.3
Metrics Server v0.3.6
NCP v3.0.2.2
Percona XtraDB Cluster (PXC) v0.30.0
UAA v74.5.20
Compatibilities Versions
Ops Manager See VMware Tanzu Network**.
Windows worker support on vSphere with NSX-T requires v2.10.2 or later
Xenial stemcells See VMware Tanzu Network.
Windows stemcells v2019.24 or later
NSX-T* See VMware Product Interoperability Matrices.
vSphere
VMware Cloud Foundation (VCF) Do not use TKGI v1.9.1 with VCF. See TKGI Upgrade or Install Fails with Error “x509: certificate relies on legacy Common Name field”.
CSI Driver for vSphere v2.0
v1.0.2
Harbor v2.1.0
v2.0.3
v1.10.3
Velero v1.4.2

* VMware recommends NSX-T v3.0.1.1 (v3.0.1 EP1) or later, or NSX-T v2.5.2 or later for NSX-T integration.

** Do not use TKGI with Ops Manager v2.10.15. For more information, see v2.10.15 in Ops Manager v2.10 Release Notes.

Upgrade Path

The supported upgrade paths to TKGI v1.9.1 are from Tanzu Kubernetes Grid Integrated Edition v1.8.0 and later.

Features

This section describes new features and changes in VMware Tanzu Kubernetes Grid Integrated Edition v1.9.1.

Known Issues

Except where noted, the known issues in Tanzu Kubernetes Grid Integrated Edition v1.9.0 are also in Tanzu Kubernetes Grid Integrated Edition v1.9.1. See Known Issues in Tanzu Kubernetes Grid Integrated Edition v1.9.0 below.

Warning: Do not use TKGI with Ops Manager v2.10.15. For more information, see v2.10.15 in Ops Manager v2.10 Release Notes.


TKGI v1.9.0

Release Date: September 29, 2020

Product Snapshot

Release Details
Version v1.9.0
Release date September 29, 2020
Component Version
Kubernetes v1.18.8
CoreDNS v1.6.7+vmware.3
Docker Linux: v19.03.5
Windows: v19.03.11
etcd v3.4.3
Metrics Server v0.3.6
NCP v3.0.2.1
Percona XtraDB Cluster (PXC) v0.28.0
UAA v74.5.18
Compatibilities Versions
Ops Manager See VMware Tanzu Network***.
Windows worker support on vSphere with NSX-T requires v2.10.1 or later
Xenial stemcells* See VMware Tanzu Network.
Windows stemcells v2019.24 or later
NSX-T** See VMware Product Interoperability Matrices.
vSphere
Backup and Restore SDK v1.18.0
VMware Cloud Foundation (VCF) Do not use TKGI v1.9.0 with VCF. See TKGI Upgrade or Install Fails with Error “x509: certificate relies on legacy Common Name field”.
CSI Driver for vSphere v2.0
v1.0.2
Harbor v2.1
v2.0.1
v1.10.3
Velero v1.4.2

* See Kubernetes Clusters With Xenial Stemcell v621.85 and Later Fail After Upgrading TKGI on vSphere With NSX-T to TKGI v1.9.

** VMware recommends NSX-T v3.0.1.1 (v3.0.1 EP1) or later, or NSX-T v2.5.2 or later for NSX-T integration.

*** Do not use TKGI with Ops Manager v2.10.15. For more information, see v2.10.15 in Ops Manager v2.10 Release Notes.

Note: Tanzu Kubernetes Grid Integrated Edition does not support the NSX-T Limited Edition (LE). For more information, refer to the VMware Product Interoperability Matrix for TKGI.

Upgrade Path

The supported upgrade paths to TKGI v1.9.0 are from Tanzu Kubernetes Grid Integrated Edition v1.8.0 and later.

Features

This section describes new features and changes in VMware Tanzu Kubernetes Grid Integrated Edition v1.9.0.

  • [Security Fix] Forces all TLS communications with the TKGI Database instances to use TLSv1.2+.
  • [Bug Fix] NSX-T pre-check errand no longer fails with error “Unable to check if ESXi hosts are onboarded”.

Windows Workers on NSX-T

TKGI v1.9.0 supports clusters with Windows-based worker nodes on vSphere with NSX-T networking only. TKGI v1.9.0 continues to support Windows workloads on vSphere with Flannel networking as a beta feature.

Cluster Certificate Rotation Support

For secure communication, TKGI clusters use TLS certificates that are unique to each cluster. TKGI v1.9.0 integrates with the CredHub Maestro CLI to enable expiry checks and rotation for these cluster-specific certificates, including additional certificates that clusters use with NSX-T networking.

See Rotating Cluster Certificates for how to check and rotate cluster-specific certificates, and TKGI Certificates for managing all certificates used by TKGI.

PKS CLI Supports Certificates Trusted by the Local System

The PKS CLI now trusts the certificates in a system CA store, such as the macOS keychain. If you have a local system CA, you no longer need to specify the --skip-ssl-validation or --ca-cert command line arguments when logging in to the PKS CLI.
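
For example, a minimal login sketch, assuming the TKGI API certificate chain is already trusted by the local system CA store; the hostname and credentials below are placeholders:

# Hypothetical example: the TKGI API certificate is already trusted by the
# local system CA store (for example, the macOS keychain), so neither
# --ca-cert nor --skip-ssl-validation is needed. Hostname and credentials
# are placeholders.
tkgi login -a api.tkgi.example.com -u cluster-admin -p 'PASSWORD'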

Compute Profile CLI Support and Improvements (vSphere)

Compute profiles let developers customize cluster topology, node sizing, and other compute resource options, overriding configurations that a cluster inherits from its plan.

TKGI v1.9.0 redesigns and improves compute profile functionality, and adds TKGI CLI options for creating, managing, and using compute profiles.

With NSX-T networking you can use compute profiles with Linux- and Windows-worker clusters. With Flannel networking you can only apply compute profiles to Linux clusters.

For more information, see Creating and Managing Compute Profiles with the CLI (vSphere).
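
As a rough illustration only, the sketch below assumes a compute profile defined in a JSON file named medium-workers.json whose name field is medium-workers; the exact commands, flags, and JSON schema are documented in Creating and Managing Compute Profiles with the CLI (vSphere):

# Hypothetical sketch: create a compute profile from a JSON definition and
# apply it when creating a cluster. File name, profile name, hostname, and
# plan are placeholders; verify the syntax against the compute profile
# documentation for your TKGI version.
tkgi create-compute-profile medium-workers.json
tkgi create-cluster demo-cluster \
  --external-hostname demo-cluster.example.com \
  --plan small \
  --compute-profile medium-workers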

Velero Support and Bundling for Backup and Restore

TKGI v1.9.0 includes support for Velero, an open source community standard tool for backing up and restoring Kubernetes workloads, including stateless workloads and stateful workloads that use persistent volumes. Velero is the preferred backup solution for workloads running on TKGI clusters, and is downloadable from your TKGI downloads page on https://my.vmware.com.

For more information, see Backing Up and Restoring Tanzu Kubernetes Grid Integrated Edition.

Resizable Persistent Volume Support on vSphere 7.0

TKGI supports creating resizable persistent volumes on clusters created on vSphere 7.0 with CNS v2.0.

For more information, see Cloud Native Storage (CNS) on vSphere.
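
The sketch below is only an illustration of volume expansion with the vSphere CSI driver, assuming a hypothetical StorageClass named expandable-vsphere and a PVC named demo-data; see Cloud Native Storage (CNS) on vSphere for the supported configuration:

# Hypothetical illustration: a StorageClass that allows volume expansion
# with the vSphere CSI driver, followed by a patch that requests a larger
# size for an existing PVC. Names and sizes are placeholders.
kubectl apply -f - <<'EOF'
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: expandable-vsphere
provisioner: csi.vsphere.vmware.com
allowVolumeExpansion: true
EOF
kubectl patch pvc demo-data -p '{"spec":{"resources":{"requests":{"storage":"20Gi"}}}}'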

Tagging Support on AWS and GCP

TKGI supports tagging from the CLI on Amazon Web Services (AWS) and Google Cloud Platform (GCP). This enables the --tags option to tkgi create-cluster and other commands on all infrastructures.

For more information, see Tagging Clusters.
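
For illustration only, a hedged example of tagging at cluster creation time; the cluster name, hostname, plan, and key:value tag syntax are placeholders, and the supported tag format is described in Tagging Clusters:

# Hypothetical example: apply tags when creating a cluster. The comma-
# separated key:value tag syntax is an assumption; confirm the format in
# Tagging Clusters for your IaaS.
tkgi create-cluster tagged-cluster \
  --external-hostname tagged-cluster.example.com \
  --plan small \
  --tags "environment:dev,team:platform"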

New Telegraf Configuration Fields

TKGI supports modifying the Telegraf Agent configuration. For more information, see Configure Telegraf in the Tile in Configuring Telegraf in TKGI.

Telemetry Changes

  • Telemetry Enhanced Participation Level Change: The TKGI Customer Experience Improvement Program (CEIP) and Telemetry Program has been streamlined to bring the Enhanced participation level closer to the Standard participation level. For descriptions of the participation levels, see Telemetry.

  • Telemetry Database Removal: The legacy Telemetry DB has been removed from the TKGI Database.

Component Updates

The following components have been updated:

  • Bumps Kubernetes to v1.18.8+vmware.1.
  • Bumps CoreDNS to v1.6.7+vmware.3.
  • Bumps NCP to v3.0.2.1.

Known Issues

Warning: Do not use TKGI with Ops Manager v2.10.15. For more information, see v2.10.15 in Ops Manager v2.10 Release Notes.

TKGI v1.9.0 has the following known issues:


Kubernetes Clusters With Xenial Stemcell v621.85 and Later Fail After Upgrading TKGI on vSphere With NSX-T to TKGI v1.9

This issue is fixed in TKGI v1.9.1.

Symptom

After you upgrade TKGI on vSphere with NSX-T to v1.9.0 or later, TKGI-provisioned Kubernetes clusters with Xenial Stemcell v621.85 and later fail to start. The cluster start failure log includes the following:

Error: Action Failed get_task: Task ... result: Compiling package openvswitch:
Running packaging script: Running packaging script: Command exited with 2;

Xenial Stemcell v621.85 is installed by default when installing Ops Manager v2.8.15.

Workaround

If you experience this issue, manually revert the stemcell to an earlier compatible version. For information about reverting stemcells, see How to revert a stemcell in TKGI to prevent OVS compilation issues in the Knowledge Base.


Error: Could Not Execute “Apply-Changes” in Azure Environment

Symptom

After clicking Apply Changes on the TKGI tile in an Azure environment, you experience an error ’…could not execute “apply-changes”…’ with either of the following descriptions:

  • {"errors":{"base":["undefined method `location' for nil:NilClass"]}}
  • FailedError.new("Resource Groups in region '#{location}' do not support Availability Zones"))

For example:

INFO | 2020-09-21 03:46:49 +0000 | Vessel::Workflows::Installer#run | Install product (apply changes)
2020/09/21 03:47:02 could not execute "apply-changes": installation failed to trigger: request failed: unexpected response from /api/v0/installations:
HTTP/1.1 500 Internal Server Error
Transfer-Encoding: chunked
Cache-Control: no-cache, no-store
Connection: keep-alive
Content-Type: application/json; charset=utf-8
Date: Mon, 21 Sep 2020 17:51:50 GMT
Expires: Fri, 01 Jan 1990 00:00:00 GMT
Pragma: no-cache
Referrer-Policy: strict-origin-when-cross-origin
Server: Ops Manager
Strict-Transport-Security: max-age=31536000; includeSubDomains
X-Content-Type-Options: nosniff
X-Download-Options: noopen
X-Frame-Options: SAMEORIGIN
X-Permitted-Cross-Domain-Policies: none
X-Request-Id: f5fc99c1-21a7-45c3-7f39
X-Runtime: 9.905591
X-Xss-Protection: 1; mode=block

44
{"errors":{"base":["undefined method `location' for nil:NilClass"]}}
0

Explanation

The Azure CPI endpoint used by Ops Manager has been changed and your installed version of Ops Manager is not compatible with the new endpoint.

Workaround

Run the following Ops Manager CLI command:

om --skip-ssl-validation --username USERNAME --password PASSWORD --target https://OPSMAN-API \
  curl --silent --path /api/v0/staged/director/verifiers/install_time/IaasConfigurationVerifier \
  -x PUT -d '{ "enabled": false }'

Where:

  • USERNAME is the account to use to run Ops Manager API commands.
  • PASSWORD is the password for the account.
  • OPSMAN-API is the IP address for the Ops Manager API.

For more information, see Error 'undefined method location’ is received when running Apply Change on Azure in the VMware Tanzu Knowledge Base.


VMware vRealize Operations Does Not Support Windows Worker-Based Kubernetes Clusters

VMware vRealize Operations (vROPs) does not support Windows worker-based Kubernetes clusters and cannot be used to manage TKGI-provisioned Windows workers.


TKGI Wavefront Requires Manual Installation for Windows Workers

To monitor Windows-based worker node clusters with a Wavefront collector and proxy, you must first install Wavefront on the clusters manually, using Helm. For instructions, see the Wavefront section of the Monitoring Windows Worker Clusters and Nodes topic.


Pinging Windows Workers Does Not Work

TKGI-provisioned Windows workers inherit a Kubernetes limitation that prevents outbound ICMP communication from workers. As a result, pinging Windows workers does not work.

For information about this limitation, see Limitations > Networking in the Windows in Kubernetes documentation.


Velero Does Not Support Backing Up Stateful Windows Workloads

You can use Velero to back up stateless TKGI-provisioned Windows workloads only. Velero cannot be used to back up stateful Windows applications. For more information, see Velero on Windows in Basic Install in the Velero documentation.
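
As a hedged illustration, a backup scoped to a namespace that contains only stateless Windows workloads might look like the following; the backup and namespace names are placeholders:

# Hypothetical example: back up a namespace that contains only stateless
# Windows workloads. Backup name and namespace are placeholders.
velero backup create windows-stateless-backup --include-namespaces windows-apps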


Tanzu Mission Control Integration Not Supported on GCP

TKGI on Google Cloud Platform (GCP) does not support Tanzu Mission Control (TMC) integration, which is configured in the Tanzu Kubernetes Grid Integrated Edition tile > Tanzu Mission Control (Experimental) pane.

If you intend to run TKGI v1.9 on GCP, skip this pane when configuring the Tanzu Kubernetes Grid Integrated Edition tile.


TMC Data Protection Feature Requires Privileged TKGI Containers

TMC Data Protection feature supports privileged TKGI containers only. For more information, see Plans in the Installing TKGI topic for your IaaS.


Existing TKGI Clusters No Longer Automatically Attach to TMC

This issue is fixed in TKGI v1.9.2.

Symptom

Previously, your TKGI clusters attached to TMC automatically. Now, existing TKGI clusters no longer attach to TMC automatically.

Explanation

The TMC API has been updated and is incompatible with the API calls from existing TKGI clusters.

Workaround

Manually attach your existing TKGI clusters to TMC. For more information, see Attach an Existing Cluster in the VMware Tanzu Mission Control Product Documentation.


You Cannot Upgrade Clusters Configured with a TKGI v1.9 Compute Profile

This issue is fixed in TKGI v1.9.2.

Symptom

tkgi upgrade-cluster does not upgrade clusters configured with a TKGI v1.9 compute profile.


Windows-Worker Clusters on Flannel Do Not Support Compute Profiles

On vSphere with NSX-T networking you can use compute profiles with both Linux and Windows‑worker clusters. On vSphere with Flannel networking, you can apply compute profiles only to Linux clusters.


TKGI v1.9 Does Not Support Managing Pre-TKGI v1.9 Compute Profiles

Compute profiles created in TKGI v1.8 and earlier have a different format from v1.9 compute profiles, as described in Compute Profile CLI Support and Improvements (vSphere).

TKGI v1.9 does not support resizing, updating, upgrading, or managing compute profiles using tkgi CLI compute profile commands, on a cluster that has a compute profile created in TKGI v1.8 and earlier.


TKGI CLI Does Not Prevent Reducing the Control Plane Node Count

TKGI CLI does not prevent accidentally reducing a cluster’s control plane node count using a compute profile.

Warning: Reducing a cluster’s control plane node count can destroy the cluster. Do not scale out or scale in existing master nodes by reconfiguring the TKGI tile or by using a compute profile. Reducing a cluster’s number of control plane nodes may remove a master node and cause the cluster to become inactive.


Windows Nodes With Workloads and an emptyDir Volume are Unable to Drain

This issue is fixed in TKGI v1.9.1.

Node draining will fail when scaling down a Windows cluster with a deployed Windows workload and an emptyDir volume.

Solution

Run kubectl drain NODENAME --delete-local-data, and then restart the cluster scale-down.


In-Cluster DNS Lookup Fails in Windows Clusters

This issue is fixed in TKGI v1.9.1.

Symptom

DNS lookup fails for Windows Pods that do not use a fully qualified domain name (FQDN) to look up services and Pods within its namespace or cluster.

Explanation

DNS lookup fails because a Primary DNS suffix has not been configured on the Windows Pod. A Windows Pod configured without a Primary DNS suffix must use a fully qualified domain name (FQDN) to look up addresses within its namespace and cluster.

Solution

To configure a Primary DNS suffix for your Windows Pods, use a hook to dynamically inject a Primary DNS setting into your Windows Pods. For more information, see In-Cluster DNS lookup requires a Fully Qualified Domain Name (FQDN) in Windows Clusters in the VMware Tanzu Community Knowledge Base.


Windows Cluster Creation Fails for Certain Compute Profile Configurations

This issue is fixed in TKGI v1.9.2.

Windows cluster creation does not support using a compute profile with two or more worker instance groups.


Windows Cluster Nodes Not Deleted After VM Deleted

Symptom

After you delete a VM using the management console of your infrastructure provider, you notice a Windows worker node that had been on that VM is now in a notReady state.

Solution

  1. To identify the leftover node:

    kubectl get no -o wide
    
  2. Locate nodes on the returned list that are in a notReady state and have the same IP address as another node in the list.

  3. To manually delete a notReady node:

    kubectl delete node NODE-NAME
    

    Where NODE-NAME is the name of the node in the notReady state.


502 Bad Gateway After OIDC Login

Symptom

You experience a “502 Bad Gateway” error from the NSX load balancer after you log in to OIDC.

Explanation

A large response header has exceeded your NSX-T load balancer maximum response header size. The default maximum response header size is 10,240 characters and should be resized to 50,000.

Workaround

If you experience this issue, manually reconfigure your NSX-T request_header_size and response_header_size to 50,000 characters. For information about configuring NSX-T default header sizes, see OIDC Response Header Overflow in the Knowledge Base.
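
The following is only a rough sketch of one way to raise the header sizes through the NSX-T Manager API, assuming both fields live on the cluster's HTTP application profile; the API path, profile ID, and jq-based edit are assumptions, so follow the OIDC Response Header Overflow KB article for the procedure that matches your NSX-T version:

# Hypothetical sketch, not the documented procedure. NSX_MANAGER, the
# profile ID, and the admin password are placeholders; the GET response is
# assumed to carry the _revision field that NSX-T requires on update.
NSX_MANAGER=nsx-manager.example.com
PROFILE_ID=LB-HTTP-PROFILE-ID
curl -ks -u admin:'PASSWORD' \
  "https://${NSX_MANAGER}/api/v1/loadbalancer/application-profiles/${PROFILE_ID}" \
  | jq '.request_header_size = 50000 | .response_header_size = 50000' > profile.json
curl -ks -u admin:'PASSWORD' -X PUT \
  -H "Content-Type: application/json" -d @profile.json \
  "https://${NSX_MANAGER}/api/v1/loadbalancer/application-profiles/${PROFILE_ID}"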


Difficulty Changing Proxy for Windows Workers

You must configure a global proxy in the Tanzu Kubernetes Grid Integrated Edition tile > Networking pane before you create any Windows workers that use the proxy.

You cannot change the proxy configuration for Windows workers in an existing cluster.


Character Limitations in HTTP Proxy Password

For vSphere with NSX-T, the HTTP Proxy password field does not support the following special characters: & or ;.


Error After Modifying Your Harbor Storage Configuration

Symptom

You receive the following error after modifying your existing Harbor installation’s storage configuration:

Error response from daemon: manifest for ... not found: manifest unknown: manifest unknown

Explanation

Harbor does not support modifying an existing Harbor installation’s storage configuration.

Workaround

To modify your Harbor storage configuration, re-install Harbor. Before starting Harbor, configure the new Harbor installation with the desired configuration.


Cluster Creation Fails With Error 'Too Many Open Files’

This issue is fixed in TKGI v1.9.2.

Symptom

Under certain situations, tkgi create-cluster fails with the error too many open files:

$  tkgi create-cluster ... --external-hostname ... --plan small --num-nodes 1

Error: Status: 500; ErrorMessage: <nil>;
Description: Currently unable to create service instance, please try again later - error-message: Fetching info:
Performing request GET 'https://192.168.1.1:25555/info':
Performing GET request: Requesting token via client credentials grant:
Performing request POST 'https://192.168.1.1:8443/oauth/token':
Performing POST request: Retry: Post https://192.168.1.1:8443/oauth/token: dial tcp 192.168.1.1:8443:
socket: too many open files; ResponseError: < nil >

The tkgi create-cluster command fails with the too many open files error during periods of high latency between the BOSH or TKGI API servers and the remainder of the network.


Cluster Creation Fails During the 'Creating Load Balancer’ Step

This issue is fixed in TKGI v1.9.2.

Symptom

Under certain situations, cluster creation occasionally fails with error codes 401 and 403 during the Creating Load Balancer instance provisioning step:

"Failed to get node properties" pks-networking=networkManager Error: [GET /node][403] 
readNodePropertiesForbidden  &{RelatedAPIError:{Details: ErrorCode:401 ErrorData:&lt;nil&gt; 
ErrorMessage:Not authorized. ModuleName:common-services} RelatedErrors:[]}


You Cannot Change a Master Node’s AZ

Symptom

Your existing master nodes return the error: “instance group 'master’ must specify availability zone that matches availability zones of network” after you modify the node’s existing compute profile AZ or after attaching a compute profile configured with an AZ.

Explanation

TKGI does not support changing an existing master node’s AZ, either by modifying its cluster plan, modifying its cluster’s attached compute profile, or attaching a new compute profile configured with an AZ.


Worker Node Kubelet Unable to Communicate with Master Node Kubernetes API after the cni0 Interface Changes MAC Address

This issue is fixed in TKGI v1.9.3.

Symptom

On Flannel networks, worker node cni0 interfaces may intermittently change MAC address, causing the worker nodes’ kubelet to become unable to communicate with the master node Kubernetes API.

Explanation

A CNI bridge adopts the lowest MAC address of the attached veths. Adding a new container with a veth that has a lower MAC address than any of the currently existing MAC addresses changes the MAC address of cbr0 and deletes all of the existing connection tracking entries for the old MAC address. On Flannel networks this results in a transient interruption of network traffic with the Pod.

Workaround

Complete the steps in Tanzu Kubernetes Grid Integrated (TKGI) / PKS Disable Docker Bridge in GitHub.


One Plan ID Longer than Other Plan IDs

Symptom

One of your plan IDs is one character longer than your other plan IDs.

Explanation

In TKGI, each plan has a unique plan ID. A plan ID is normally a UUID consisting of 32 alphanumeric characters and 4 hyphens. However, the Plan 4 ID consists of 33 alphanumeric characters and 4 hyphens.

Solution

You can safely configure and use Plan 4. The length of the Plan 4 ID does not affect the functionality of Plan 4 clusters.

If you require all plan IDs to have identical length, do not activate or use Plan 4.


TKGI Upgrade or Install Fails with Error “x509: certificate relies on legacy Common Name field”

This issue is fixed in TKGI v1.9.4.

Symptom

Installing or upgrading TKGI fails with the error:

x509: certificate relies on legacy Common Name field, use SANs or temporarily enable Common Name matching with GODEBUG=x509ignoreCN=0

Explanation

Your NSX-T Root CA Certificate was generated without a Subject Alternate Name (SAN) and does not work with TKGI. For example, if you use VCF, you see this error because VCF generates only NSX-T Root CA Certificates without a SAN.


Resizing a Compute Profile Node Pool Size May Resize Other Node Pool Sizes

This issue is fixed in TKGI v1.9.4.

Symptom

If you resize one compute profile node pool in a compute profile containing multiple node pools, the other node pools in the profile may resize.

Workaround

Resize all of the node pools in a compute profile when resizing a single node pool in a compute profile containing multiple node pools.


Ingress Controller Statefulset Fails to Start After Resizing Worker Nodes

Symptom

Permissions are removed from your cluster’s files and processes after resizing the persistent disk during a cluster upgrade. The ingress controller statefulset fails to start.

Explanation

When resizing a persistent disk, BOSH migrates the data from the old disk to the new disk but does not copy the files’ extended attributes.

Workaround

To resolve the problem, complete the steps in Ingress controller statefulset fails to start after resize of worker nodes with permission denied in the VMware Tanzu Knowledge Base.


Network Profile Required with Compute Profile

This issue is fixed in TKGI v1.9.4.

Symptom

When you create a cluster with a compute profile configured with an AZ setting, you must also configure the cluster with a network profile.


Kubernetes Worker Nodes Run Out of Space

Symptom

Your Kubernetes worker nodes slowly run out of disk space, and your logs indicate Kubernetes has been attempting to access a directory that does not exist instead of an actual existing location:

Failed to create existing container: ...: failed to identify the read-write layer ID for container "...". 
- open .../mount-id: no such file or directory

Explanation

cAdvisor has fallen back to its default path settings due to a race condition between the kubelet and Docker.


Azure Default Security Group Is Not Automatically Assigned to Cluster VMs

Symptom

You experience issues when configuring a load balancer for a multi-master Kubernetes cluster or creating a service of type LoadBalancer. Additionally, in the Azure portal, the VM > Networking page does not display any inbound and outbound traffic rules for your cluster VMs.

Explanation

As part of configuring the Tanzu Kubernetes Grid Integrated Edition tile for Azure, you enter Default Security Group in the Kubernetes Cloud Provider pane. When you create a Kubernetes cluster, Tanzu Kubernetes Grid Integrated Edition automatically assigns this security group to each VM in the cluster. However, on Azure the automatic assignment may not occur.

As a result, your inbound and outbound traffic rules defined in the security group are not applied to the cluster VMs.

Workaround

If you experience this issue, manually assign the default security group to each VM NIC in your cluster.
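
For example, a minimal sketch with the Azure CLI, assuming placeholder resource group, NIC, and security group names:

# Hypothetical example: attach the default security group to one cluster VM
# NIC. Repeat for each NIC in the cluster; all names are placeholders.
az network nic update \
  --resource-group MY-RESOURCE-GROUP \
  --name MY-CLUSTER-VM-NIC \
  --network-security-group MY-DEFAULT-SECURITY-GROUP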


FQDNs in TKGI API Commands Cannot Contain Uppercase Letters

This issue is fixed in TKGI v1.9.5.

Symptom

TKGI CLI commands fail with the error:

Error: An error occurred in the PKS API when processing

Explanation

BOSH DNS, which TKGI uses for its internal DNS, does not support uppercase letters in FQDNs.

Workaround

Use only lowercase letters in the FQDNs that you assign to your TKGI API VM.


The TKGI API FQDN Must Not Include Trailing Whitespace

Symptom

Your TKGI logs include the following error:

'uaa'. Errors are:- Error filling in template 'uaa.yml.erb' (line 59: Client redirect-uri is invalid: uaa.clients.pks_cli.redirect-uri Client redirect-uri is invalid: uaa.clients.pks_cluster_client.redirect-uri)

Explanation

The TKGI API fully-qualified domain name (FQDN) for your cluster contains leading or trailing whitespace.

Workaround

Do not include whitespace in the TKGI tile API Hostname (FQDN) field.


Certain Linux Nodes Are Unable to Complete the Drain Process during a TKGI Upgrade

This issue is fixed in TKGI v1.9.5.

Symptom

You may experience one or more of the following:

  • Your cluster upgrades are interrupted when upgrading the second node of a three-node Pod.
  • Pod eviction fails when either bosh stop or tkgi upgrade-cluster initiates kubectl drain.

Explanation

The OVS drain finishes and removes Pod networking before the kubelet has completely drained the Pod. Draining a worker node triggers the removal of a container interface on the node, blocking all network traffic from the Pod.


The TKGI CLI Resize and Update Cluster Commands Remove the Network Profile CNI Configuration from a Cluster

This issue is fixed in TKGI v1.9.5.

Symptom

In TKGI v1.9.4 or earlier, the network profile CNI configuration is dropped when updating the cluster using the tkgi resize or tkgi update-cluster CLI commands.

Solution

To restore a CNI network profile configuration:

  1. Upgrade TKGI to v1.9.5 or later.
  2. Upgrade your TKGI-provisioned cluster to TKGI v1.9.5 or later.
  3. To restore the network profile CNI configuration to your cluster, modify the cluster using either of the following:

    • The TKGI CLI resize command:

      tkgi resize CLUSTER-NAME
      

      Where CLUSTER-NAME is the name of your cluster.

    • The TKGI CLI update-cluster command:

      tkgi update-cluster CLUSTER-NAME --network-profile PROFILE-NAME
      

      Where:

      • CLUSTER-NAME is the name of your cluster.
      • PROFILE-NAME is the name of the network profile previously applied to the cluster.


The TKGI API Does Not Import the Current TKGI CA Certificates

Symptom

After rotating your TKGI CA certificates, you notice the following:

  • When using the TKGI API, SSL authentication returns the following error and fails:

    None of the TrustManagers trust this certificate chain executing POST https://100.104.252.16:8443/oauth/token
    
  • The keystore file at /var/vcap/jobs/pks-api/config/cacerts.jks contains only a single, stale certificate.

Explanation

The keystore file used by the TKGI API has not been updated with current CA certificates.


Windows Workloads With Attached Dynamic PVs Must Be Removed before Deleting a Cluster

Symptom

Your tkgi delete-cluster operation hangs while draining a Windows worker VM containing a Pod bound to one or more dynamic persistent volumes (PVs).

Workaround

Before deleting your Windows cluster, remove all workloads.


Compute Profile Dropped From Clusters During BOSH Upgrade

This issue is fixed in TKGI v1.9.2.

If a cluster created using a compute profile is upgraded using tkgi upgrade-cluster, the cluster’s compute profile is dropped.


Fluent Bit ClusterLogSink Returns 'TCP Connection Failed’ While CEIP Telemetry Services Are Disabled

Symptom

Fluent Bit ClusterLogSink logs the errors “TCP connection failed” and “no upstream connections available” after you disable CEIP Telemetry services:

[2020/10/14 10:38:50] [error] [io] TCP connection failed: telemetry.pks.internal:24224 (Connection timed out)
[2020/10/14 10:38:50] [error] [out_fw] no upstream connections available
[2020/10/14 10:41:08] [error] [io] TCP connection failed: telemetry.pks.internal:24224 (Connection timed out)
[2020/10/14 10:41:08] [error] [out_fw] no upstream connections available
[2020/10/14 10:43:51] [error] [io] TCP connection failed: telemetry.pks.internal:24224 (Connection timed out)
[2020/10/14 10:43:51] [error] [out_fw] no upstream connections available
[2020/10/14 10:43:51] [ warn] [engine] Task cannot be retried: task_id=0 thread_id=2 output=forward.0

Explanation

Fluent Bit intermittently attempts to connect to the Telemetry Server during ClusterLogSink. The connection attempts fail while the CEIP Telemetry services are disabled, and Fluent Bit logs the failed connection attempts.


Network Profiles Do Not Support the failover_mode Parameter

Symptom

When you attempt to resize or update a cluster, the operation fails with logged errors similar to the following:

Error processing update parameters: Unexpected field 'failover_mode'

Explanation

The failover_mode network profile parameter has been incorrectly flagged as an unsupported parameter.

Workaround

To remove the failover_mode parameter from a network profile:

  1. Create a copy of the network profile configuration file used by your cluster.
  2. Remove the failover_mode parameter from the copy of the configuration file. For example, remove:

    "failover_mode":"PREEMPTIVE",
    
  3. Update the network profile for the cluster using the revised configuration file.
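
A minimal sketch of this last step, assuming the revised copy was saved as revised-profile.json and defines a profile named revised-profile; the update-cluster usage mirrors the command shown elsewhere in these notes:

# Hypothetical sketch: create a new network profile from the revised file,
# then apply it to the cluster. File, profile, and cluster names are
# placeholders.
tkgi create-network-profile revised-profile.json
tkgi update-cluster MY-CLUSTER --network-profile revised-profile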

You can now resize your cluster.


You Cannot Use vRealize Log Insight (vRLI) to Monitor NCP

This issue is fixed in TKGI v1.9.6.

Symptom

vRLI monitoring does not include NCP stdout or stderr.

Explanation

A bug in the vRLI configuration prevents inclusion of NCP stdout and stderr in vRLI.

Workaround

To write NCP logs for a cluster to vRLI:

  1. Confirm you have admin access to the master node.
  2. Open /var/vcap/jobs/fluentd/config/config.d/nsx-t.conf.
  3. Replace:

    expression /^(?<num>[\w]+) (?<time>[^ ]+) (?<nsx_uuid>[^ ]+) (?<nsx_component>[^ ]+) (?<pid>[^ ]+) - \[.* level="(?<severity>[^\"]+)".*\] (?<message>.+)/
    

    with:

    expression /^(?<time>[^ ]+) (?<nsx_uuid>[^ ]+) (?<nsx_component>[^ ]+) (?<pid>[^ ]+) - \[.* level="(?<severity>[^\"]+)".*\] (?<message>.+)/
    

Note: This workaround affects only the cluster you modified and does not persist.
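
The workaround above does not state how the fluentd process picks up the edited file; a hedged guess, assuming the job on the master node VM is managed by monit under the name fluentd, is to restart it:

# Assumption, not part of the documented workaround: restart the fluentd
# monit job on the master node so it re-reads nsx-t.conf.
sudo -i
monit restart fluentd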


NSX-T Pre-Check Errand Fails Due to Edge Node CPU Memory Configuration

This issue is fixed in TKGI v1.9.6.

Symptom

You have configured your NSX-T Edge Node VM as medium size, and the NSX-T Pre-Check Errand fails with the following error: “ERROR: NSX-T Precheck failed due to Edge Node … memory is less than 8GB”.

Explanation

The NSX-T Pre-Check Errand has failed because the NSX-T Edge Node has less than 8GB of available memory.


NSX-T Pre-Check Errand Fails Due to Edge Node CPU Count Configuration

This issue is fixed in TKGI v1.8.0 and later and was incorrectly included in the TKGI v1.9 Release Notes.

Symptom

You have configured your NSX-T Edge Node VM as medium size, and the NSX-T Pre-Check Errand fails with the following error: “ERROR: NSX-T Precheck failed due to Edge Node … no of cpu cores is less than 8”.

Explanation

The NSX-T Pre-Check Errand is erroneously returning the “cpu cores is less than 8” error.

Solution

You can safely configure your NSX-T Edge Node VMs as medium size and ignore the error.


Cluster Fails to Restart

This issue is fixed in TKGI v1.9.6.

Symptom

A stopped cluster fails to restart and you see the following error in the NSX Container Plug-in (NCP) logs:

Response body {'httpStatus': 'BAD_REQUEST', 'module_name': 'nsx-search', 'error_code': 60513, 'error_message': 'The result set is too large. Please refine the search criteria.'}

Explanation

This problem occurs on clusters with a network policy with more than twelve port-protocol fields.

Workaround

Modify the cluster network policy to have twelve or fewer port-protocol fields.
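
As an illustration of what counts toward that limit, each protocol/port entry in the policy's ports lists is one port-protocol field; the hypothetical policy below (placeholder names and ports) stays well under twelve:

# Hypothetical example: a network policy with two port-protocol fields.
# Keep the total number of such entries across the policy at twelve or
# fewer. Namespace, labels, and ports are placeholders.
kubectl apply -f - <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-app-traffic
  namespace: demo
spec:
  podSelector:
    matchLabels:
      app: demo
  policyTypes:
    - Ingress
  ingress:
    - ports:
        - protocol: TCP
          port: 80
        - protocol: TCP
          port: 443
EOF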


The TKGI API VM fluentd.stdout.log Grows While vRealize Log Insight is Disabled

This issue is fixed in TKGI v1.9.6.

Symptom

While vRealize Log Insight (vRLI) is disabled, the log for fluentd.stdout on the TKGI API VM grows quite large and rolls over frequently.

Explanation

The fluentd job on the TKGI API VM starts automatically when vRLI is disabled. This fluentd job runs continuously, frequently writing inaccessible-hostname errors for the nonexistent vRLI endpoint to stdout. The fluentd stdout on the TKGI API VM is logged to /var/vcap/sys/log/pks-vrli-control-plane-fluentd/fluentd.stdout.log.


TKGI MC Unable to Manage TKGI after Restoring the TKGI Control Plane from Backup

Symptom

After you restore Ops Manager and the TKGI API VM from backup, TKGI functions normally, but your TKGI MC tabs include the following error: “…product ‘pivotal-container service’ is not deployed…”.

Explanation

TKGI MC is associated with an Ops Manager with a specific name. If you give Ops Manager a new name while restoring it, TKGI MC does not recognize the restored Ops Manager and cannot manage it.


SAML Authentication Requests Are Always Signed

Symptom

After you disable the SAML Identity Provider Sign Authentication Requests setting on the Tanzu Kubernetes Grid Integrated Edition tile, SAML IdP authentication requests continue to be signed.


TKGI Management Console v1.9.6

Release Date: July 13, 2021

Warning: Use only builds of Tanzu Kubernetes Grid Integrated Edition Management Console v1.9.6 downloaded after July 22, 2021.

If you have installed an earlier build, then contact VMware Support for instructions on re-installing Tanzu Kubernetes Grid Integrated Edition Management Console v1.9.6. For more information, see v2.10.15 in Ops Manager v2.10 Release Notes.

Features

Tanzu Kubernetes Grid Integrated Edition Management Console v1.9.6 update includes:

Product Snapshot

Note: Tanzu Kubernetes Grid Integrated Edition Management Console provides an opinionated installation of TKGI. The supported versions may differ from or be more limited than what is generally supported by TKGI.

Element Details
Version v1.9.6
Release date July 13, 2021
Installed Tanzu Kubernetes Grid Integrated Edition version v1.9.6
Installed Ops Manager version v2.10.14
Installed Kubernetes version v1.18.18
Installed Harbor Registry version v2.2.2
Windows stemcells v2019.34 or later

Upgrade Path

The supported upgrade path to Tanzu Kubernetes Grid Integrated Edition Management Console v1.9.6 is from Tanzu Kubernetes Grid Integrated Edition v1.8.0 and later.

Known Issues

Except where noted, the Tanzu Kubernetes Grid Integrated Edition Management Console v1.9.6 has the same known issues as v1.9.5.

Warning: Do not use TKGI with Ops Manager v2.10.15. For more information, see v2.10.15 in Ops Manager v2.10 Release Notes.

Warning: Use only builds of Tanzu Kubernetes Grid Integrated Edition Management Console v1.9.6 downloaded after July 22, 2021.

If you have installed an earlier build, then contact VMware Support for instructions on re-installing Tanzu Kubernetes Grid Integrated Edition Management Console v1.9.6. For more information, see v2.10.15 in Ops Manager v2.10 Release Notes.


TKGI Management Console v1.9.5

Release Date: April 7, 2021

Features

Tanzu Kubernetes Grid Integrated Edition Management Console v1.9.5 update includes:

Product Snapshot

Note: Tanzu Kubernetes Grid Integrated Edition Management Console provides an opinionated installation of TKGI. The supported versions may differ from or be more limited than what is generally supported by TKGI.

Element Details
Version v1.9.5
Release date April 7, 2021
Installed Tanzu Kubernetes Grid Integrated Edition version v1.9.5
Installed Ops Manager version v2.10.8
Installed Kubernetes version v1.18.16
Installed Harbor Registry version v2.1.3
Windows stemcells v2019.31 or later

Upgrade Path

The supported upgrade path to Tanzu Kubernetes Grid Integrated Edition Management Console v1.9.5 is from Tanzu Kubernetes Grid Integrated Edition v1.8.0 and later.

Known Issues

Except where noted, the Tanzu Kubernetes Grid Integrated Edition Management Console v1.9.5 has the same known issues as v1.9.4.


TKGI Management Console v1.9.4

Release Date: February 3, 2021

Features

Tanzu Kubernetes Grid Integrated Edition Management Console v1.9.4 update includes:

Product Snapshot

Note: Tanzu Kubernetes Grid Integrated Edition Management Console provides an opinionated installation of TKGI. The supported versions may differ from or be more limited than what is generally supported by TKGI.

Element Details
Version v1.9.4
Release date February 3, 2021
Installed Tanzu Kubernetes Grid Integrated Edition version v1.9.4
Installed Ops Manager version v2.10.6
Installed Kubernetes version v1.18.14
Installed Harbor Registry version v2.1.3
Windows stemcells v2019.29 or later

Upgrade Path

The supported upgrade path to Tanzu Kubernetes Grid Integrated Edition Management Console v1.9.4 is from Tanzu Kubernetes Grid Integrated Edition v1.8.0 and later.

Known Issues

Except where noted, the Tanzu Kubernetes Grid Integrated Edition Management Console v1.9.4 has the same known issues as v1.9.3.


TKGI Management Console v1.9.3

Release Date: January 8, 2021

Features

Tanzu Kubernetes Grid Integrated Edition Management Console v1.9.3 is a compatibility release with no new features.

Product Snapshot

Note: Tanzu Kubernetes Grid Integrated Edition Management Console provides an opinionated installation of TKGI. The supported versions may differ from or be more limited than what is generally supported by TKGI.

Element Details
Version v1.9.3
Release date January 8, 2021
Installed Tanzu Kubernetes Grid Integrated Edition version v1.9.3
Installed Ops Manager version v2.10.4
Installed Kubernetes version v1.18.14
Installed Harbor Registry version v2.1.2
Windows stemcells v2019.28 or later

Upgrade Path

The supported upgrade path to Tanzu Kubernetes Grid Integrated Edition Management Console v1.9.3 is from Tanzu Kubernetes Grid Integrated Edition v1.8.0 and later.

Known Issues

The Tanzu Kubernetes Grid Integrated Edition Management Console v1.9.3 has the same known issues as v1.9.2.


TKGI Management Console v1.9.2

Release Date: December 15, 2020

Features

Tanzu Kubernetes Grid Integrated Edition Management Console v1.9.2 is a compatibility release with no new features.

Product Snapshot

Note: Tanzu Kubernetes Grid Integrated Edition Management Console provides an opinionated installation of TKGI. The supported versions may differ from or be more limited than what is generally supported by TKGI.

Element Details
Version v1.9.2
Release date December 15, 2020
Installed Tanzu Kubernetes Grid Integrated Edition version v1.9.2
Installed Ops Manager version v2.10.2
Installed Kubernetes version v1.18.12
Installed Harbor Registry version v2.1.1
Windows stemcells v2019.28 or later

Upgrade Path

The supported upgrade path to Tanzu Kubernetes Grid Integrated Edition Management Console v1.9.2 is from Tanzu Kubernetes Grid Integrated Edition v1.8.0 and later.

Known Issues

Except where noted, the Tanzu Kubernetes Grid Integrated Edition Management Console v1.9.2 has the same known issues as v1.9.1.


Access Cluster Not Displaying Content After Initial Page Load

This issue is fixed in TKGI Management Console v1.9.4.

Symptom

If you open the TKGI Management Console and immediately open Cluster Summary and Access Cluster, the page does not update with content.

Workaround

After encountering this problem, close the window and reselect Access Cluster.


TKGI Management Console v1.9.1

Release Date: November 4, 2020

Features

Tanzu Kubernetes Grid Integrated Edition Management Console v1.9.1 is a compatibility release with no new features.

Product Snapshot

Note: Tanzu Kubernetes Grid Integrated Edition Management Console provides an opinionated installation of TKGI. The supported versions may differ from or be more limited than what is generally supported by TKGI.

Element Details
Version v1.9.1
Release date November 4, 2020
Installed Tanzu Kubernetes Grid Integrated Edition version v1.9.1
Installed Ops Manager version v2.10.2
Installed Kubernetes version v1.18.8
Installed Harbor Registry version v2.1.0
Windows stemcells v2019.24 or later

Upgrade Path

The supported upgrade path to Tanzu Kubernetes Grid Integrated Edition Management Console v1.9.1 is from Tanzu Kubernetes Grid Integrated Edition v1.8.0 and later.

Known Issues

The Tanzu Kubernetes Grid Integrated Edition Management Console v1.9.1 has the following known issues:


TKGI Management Console regenerates certificates if the NSX Manager admin password changes

This issue is fixed in TKGI Management Console v1.9.4.

Symptom

If the admin password for NSX Manager changes and you update it in the TKGI Management Console, the certificates for NSX-T Data Center and Harbor are regenerated. This causes the NCP process on the master node VM to fail for all existing clusters.

Workaround

Upgrade the affected clusters.


Windows Plans are Inactive and Windows Clusters are Missing After Upgrading Tanzu Kubernetes Grid Integrated Edition Management Console to 1.9.1

Symptom

After you upgrade Tanzu Kubernetes Grid Integrated Edition Management Console to v1.9.1, existing Windows plans are inactive and Windows-based Kubernetes clusters are no longer listed in the Tanzu Kubernetes Grid Integrated Edition Management Console.

Workaround

To resolve this issue:

  • Open Ops Manager.
  • Upload your Windows Stemcell.
  • Open the Tanzu Kubernetes Grid Integrated Edition Management Console.
  • Apply Changes.

The Windows plans are now activated and the Windows clusters are now listed in Tanzu Kubernetes Grid Integrated Edition Management Console.

Except where noted, the known issues in Tanzu Kubernetes Grid Integrated Edition Management Console v1.9.0 are also in Tanzu Kubernetes Grid Integrated Edition Management Console v1.9.1. See Known Issues in Tanzu Kubernetes Grid Integrated Edition Management Console v1.9.0 below.

TKGI Management Console v1.9.0

Release Date: September 29, 2020

Features

Tanzu Kubernetes Grid Integrated Edition Management Console v1.9.0 updates include:


Product Snapshot

Note: Tanzu Kubernetes Grid Integrated Edition Management Console provides an opinionated installation of TKGI. The supported versions may differ from or be more limited than what is generally supported by TKGI.

Element Details
Version v1.9.0
Release date September 29, 2020
Installed Tanzu Kubernetes Grid Integrated Edition version v1.9.0
Installed Ops Manager version v2.10.1
Installed Kubernetes version v1.18.8+vmware.1
Installed Harbor Registry version v2.0.2
Windows stemcells v2019.24 or later

Upgrade Path

The supported upgrade path to Tanzu Kubernetes Grid Integrated Edition Management Console v1.9.0 is from Tanzu Kubernetes Grid Integrated Edition v1.8.0 and later.

Known Issues

The Tanzu Kubernetes Grid Integrated Edition Management Console v1.9.0 has the following known issues:


vRealize Log Insight Integration Does Not Support HTTPS Connections

Symptom

The Tanzu Kubernetes Grid Integrated Edition Management Console integration to vRealize Log Insight does not support connections to the HTTPS port on the vRealize Log Insight server.

Workaround

  1. Use SSH to log in to the Tanzu Kubernetes Grid Integrated Edition Management Console appliance VM.
  2. Open the file /lib/systemd/system/pks-loginsight.service in a text editor.
  3. Add -e LOG_SERVER_ENABLE_SSL_VERIFY=false.
  4. Set -e LOG_SERVER_USE_SSL=true.

    The resulting file should look like the following example:

    ExecStart=/bin/docker run --privileged --restart=always --network=pks
    -v /var/log/journal:/var/log/journal
    --name=pks-loginsight
    -e TYPE=gear2-vm
    -e LOG_SERVER_HOST=${LOGINSIGHT_HOST}
    -e LOG_SERVER_PORT=${LOGINSIGHT_PORT}
    -e LOG_SERVER_ENABLE_SSL_VERIFY=false
    -e LOG_SERVER_USE_SSL=true
    -e LOG_SERVER_AGENT_ID=${LOGINSIGHT_ID}
    pksoctopus/vrli-journald:v07092019
    
  5. Save the file and run systemctl daemon-reload.

  6. To restart the vRealize Log Insight service, run systemctl restart pks-loginsight.service.

Tanzu Kubernetes Grid Integrated Edition Management Console can now send logs to the HTTPS port on the vRealize Log Insight server.


vSphere HA causes Management Console ovfenv Data Corruption

Symptom

If vSphere HA is enabled on a cluster, the TKGI Management Console appliance VM is running on a host in that cluster, and that host reboots, vSphere HA recreates the TKGI Management Console appliance VM on another host in the cluster. Due to an issue with vSphere HA, the ovfenv data for the newly created appliance VM is corrupted and the new appliance VM does not boot up with the correct network configuration.

Workaround

  1. In the vSphere Client, right-click the appliance VM and select Power > Shut Down Guest OS.
  2. Right-click the appliance again and select Edit Settings.
  3. Select VM Options and click OK.
  4. Verify under Recent Tasks that a Reconfigure virtual machine task has run on the appliance VM.
  5. Power on the appliance VM.


Base64 encoded file arguments are not decoded in Kubernetes profiles

Symptom

Some file arguments in Kubernetes profiles are base64 encoded. When the management console displays the Kubernetes profile, some file arguments are not decoded.

Workaround

Run echo "$content" | base64 --decode


Network profiles not immediately selectable

Symptom

If you create network profiles and then try to apply them in the Create Cluster page, the new profiles are not available for selection.

Workaround

Log out of the management console and log back in again.


Real-Time IP information not displayed for network profiles

Symptom

In the cluster summary page, only the default IP pool, pod IP block, and node IP block values are displayed, rather than the real-time values from the associated network profile.

Workaround

None


Error After Modifying Your Harbor Storage Configuration

Symptom

You receive the following error after modifying your existing Harbor installation’s storage configuration:

Error response from daemon: manifest for ... not found: manifest unknown: manifest unknown

Explanation

Harbor does not support modifying an existing Harbor installation’s storage configuration.

Workaround

To modify your Harbor storage configuration, re-install Harbor. Before starting Harbor, configure the new Harbor installation with the desired configuration.


Management Console Page Unresponsive

This issue is fixed in TKGI Management Console v1.9.4.

The Management Console upgrade page is unresponsive and displays numerous errors when upgrading from v1.8 to v1.9.


Windows Stemcells Must be Re-Imported After Upgrading Ops Manager

Symptom

After upgrading Ops Manager, your Management Console does not recognize a Windows stemcell imported when using the prior version of Ops Manager.

Workaround

If your Management Console does not recognize a Windows stemcell after upgrading Ops Manager:

  1. Re-import your previously imported Windows stemcell.
  2. Apply Changes to TKGI MC.


Management Console Deletes Custom Workload Configurations

This issue is fixed in v1.9.6.

Symptom

Your Management Console deletes the custom workload configurations that you have added to a Plan Add-ons - Use with caution field.



Please send any feedback you have to pks-feedback@pivotal.io.