About Tanzu Kubernetes Grid Integrated Edition Upgrades

This topic provides conceptual information about Tanzu Kubernetes Grid Integrated Edition upgrades, including upgrading the TKGI control plane and TKGI-provisioned Kubernetes clusters.

For step-by-step instructions on upgrading Tanzu Kubernetes Grid Integrated Edition and TKGI-provisioned Kubernetes clusters, see:

Overview

A Tanzu Kubernetes Grid Integrated Edition upgrade changes the installed version of Tanzu Kubernetes Grid Integrated Edition, for example, from v1.10.x to v1.11.0 or from v1.11.0 to v1.11.1.

By default, Tanzu Kubernetes Grid Integrated Edition is set to perform a full upgrade, which upgrades both the TKGI control plane and all TKGI-provisioned Kubernetes clusters.

However, you can choose to upgrade Tanzu Kubernetes Grid Integrated Edition in two phases by upgrading the TKGI control plane first and then upgrading your TKGI-provisioned Kubernetes clusters later.

You can use either the Tanzu Kubernetes Grid Integrated Edition tile or the TKGI CLI to perform TKGI upgrades:

  • To perform a full upgrade of the TKGI control plane and TKGI-provisioned Kubernetes clusters, use the Tanzu Kubernetes Grid Integrated Edition tile.
  • To upgrade the TKGI control plane only, use the Tanzu Kubernetes Grid Integrated Edition tile.
  • To upgrade TKGI-provisioned Kubernetes clusters, use either the TKGI CLI or the Tanzu Kubernetes Grid Integrated Edition tile.
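
Supported upgrade types by upgrade method:

| Upgrade Method | Full TKGI upgrade | TKGI control plane only | Kubernetes clusters only |
|----------------|-------------------|-------------------------|--------------------------|
| TKGI Tile      | Yes               | Yes                     | Yes                      |
| TKGI CLI       | No                | No                      | Yes                      |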

Typically, if you choose to upgrade TKGI-provisioned Kubernetes clusters only, you will upgrade them through the TKGI CLI.

Deciding Between Full and Two-Phase Upgrade

When deciding whether to perform the default full upgrade or to upgrade the TKGI control plane and TKGI-provisioned Kubernetes clusters separately, consider your organization's needs.

For example, if your organization runs TKGI-provisioned Kubernetes clusters in both development and production environments and you want to upgrade only one environment first, you can do so by upgrading the TKGI control plane and the TKGI-provisioned Kubernetes clusters separately instead of performing a full upgrade.

Examples of other advantages of upgrading Tanzu Kubernetes Grid Integrated Edition in two phases include:

  • Faster Tanzu Kubernetes Grid Integrated Edition tile upgrades. If you have a large number of clusters in your Tanzu Kubernetes Grid Integrated Edition deployment, performing a full upgrade can significantly increase the amount of time required to upgrade the Tanzu Kubernetes Grid Integrated Edition tile.

  • More granular control over cluster upgrades. In addition to enabling you to upgrade subsets of clusters, the TKGI CLI supports upgrading each cluster individually.

  • Not a monolithic upgrade. Upgrading in two phases helps isolate the root cause of an error when troubleshooting upgrades. For example, when a cluster-related upgrade error occurs during a full upgrade, the entire Tanzu Kubernetes Grid Integrated Edition tile upgrade may fail.

Warning: If you disable the default full upgrade and upgrade only the TKGI control plane, you must upgrade all your TKGI-provisioned Kubernetes clusters before the next Tanzu Kubernetes Grid Integrated Edition tile upgrade. Disabling the default full upgrade and upgrading only the TKGI control plane cause the TKGI version tagged in your Kubernetes clusters to fall behind the Tanzu Kubernetes Grid Integrated Edition tile version. If your TKGI-provisioned Kubernetes clusters fall more than one version behind the tile, Tanzu Kubernetes Grid Integrated Edition cannot upgrade the clusters.
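
The version constraint in the warning above can be checked before a tile upgrade. The following is a minimal sketch, not a TKGI tool. It assumes that "more than one version behind" refers to the minor version (for example, v1.10 versus v1.11), and the function names are hypothetical.

```python
import re
from typing import Tuple


def parse_major_minor(version: str) -> Tuple[int, int]:
    """Return (major, minor) from a string such as 'v1.11.2' or '1.10.x'."""
    match = re.match(r"v?(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def minor_versions_behind(cluster_version: str, tile_version: str) -> int:
    """Number of minor versions the cluster trails the tile by."""
    cluster_major, cluster_minor = parse_major_minor(cluster_version)
    tile_major, tile_minor = parse_major_minor(tile_version)
    if cluster_major != tile_major:
        raise ValueError("this sketch only compares versions within one major release")
    return tile_minor - cluster_minor


def within_upgradable_skew(cluster_version: str, tile_version: str) -> bool:
    """True if the cluster is at most one minor version behind the tile.

    Assumption: 'one version' in the warning means one minor version.
    """
    return 0 <= minor_versions_behind(cluster_version, tile_version) <= 1
```

For example, `within_upgradable_skew("v1.10.1", "v1.11.0")` returns `True`, while `within_upgradable_skew("v1.9.3", "v1.11.0")` returns `False`, which flags a cluster that must be upgraded before the next tile upgrade.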

What Happens During Full TKGI and TKGI Control Plane Upgrades

During a Tanzu Kubernetes Grid Integrated Edition control plane upgrade to v1.11, the Tanzu Kubernetes Grid Integrated Edition tile does the following:

  1. Creates MySQL VM (First-time v1.11 Upgrade Only): When you first upgrade from Tanzu Kubernetes Grid Integrated Edition v1.10 to v1.11, the upgrade process creates the TKGI Database VM, a new dedicated MySQL VM.

    • This MySQL VM resides alongside the TKGI API VM on the TKGI Control Plane.
    • The upgrade process then migrates the TKGI v1.10 MySQL data from the TKGI API VM onto the new dedicated MySQL VM.
    • Subsequent Tanzu Kubernetes Grid Integrated Edition upgrades, from earlier to later patch versions of TKGI v1.11, do not include this step.
  2. Recreates the control plane VMs that host the TKGI API and UAA servers.

    • If the Tanzu Kubernetes Grid Integrated Edition installation is not scaled for high availability (beta), this control plane recreation causes temporary outages as described in Control Plane Outages, below.
  3. (Optional) Upgrades Clusters

    • Upgrading Tanzu Kubernetes Grid Integrated Edition only upgrades its Kubernetes clusters if the Upgrade all clusters errand checkbox is enabled in the Errands pane.
    • The cluster upgrade process recreates all clusters, which may cause cluster outages.
      For more information, see the What Happens During Cluster Upgrades section of the Upgrading Clusters topic.

You can perform full TKGI upgrades and TKGI control plane upgrades only through the Tanzu Kubernetes Grid Integrated Edition tile.

After you add a new Tanzu Kubernetes Grid Integrated Edition tile version to your staging area on the Ops Manager Installation Dashboard, Ops Manager automatically migrates your configuration settings into the new tile version.

For more information, see:

Full TKGI Upgrades

During a full TKGI upgrade, the Tanzu Kubernetes Grid Integrated Edition tile does the following:

  1. Upgrades the TKGI control plane, which includes the TKGI API and UAA servers and the TKGI database. This control plane upgrade causes temporary outages as described in Control Plane Outages below.

  2. Upgrades TKGI-provisioned Kubernetes clusters.

    • Upgrading TKGI-provisioned Kubernetes clusters is controlled by the Upgrade all clusters errand in the Tanzu Kubernetes Grid Integrated Edition tile.
    • The cluster upgrade process recreates all clusters, which may cause cluster outages. For more information, see What Happens During Cluster Upgrades below.

TKGI Control Plane Upgrades

When upgrading the TKGI control plane only, the Tanzu Kubernetes Grid Integrated Edition tile performs step 1 of the process described in Full TKGI Upgrades above. It does not perform step 2, the upgrade of TKGI-provisioned Kubernetes clusters.

Control Plane Outages

When the Tanzu Kubernetes Grid Integrated Edition control plane is not scaled for high availability (beta), upgrading it temporarily interrupts the following:

  • Logging in to the TKGI CLI and using all tkgi commands
  • Using the TKGI API to retrieve information about clusters
  • Using the TKGI API to create and delete clusters
  • Using the TKGI API to resize clusters

These outages do not affect the Kubernetes clusters themselves. During a TKGI control plane upgrade, you can still interact with clusters and their workloads using the Kubernetes Command Line Interface, kubectl.

For more information about the TKGI control plane and high availability (beta), see TKGI Control Plane Overview in Tanzu Kubernetes Grid Integrated Edition Architecture.

Canary Instances

The Tanzu Kubernetes Grid Integrated Edition tile is a BOSH deployment.

BOSH-deployed products can set a number of canary instances to upgrade first, before the rest of the deployment VMs. BOSH continues the upgrade only if the canary instance upgrade succeeds. If the canary instance encounters an error, the upgrade stops running and other VMs are not affected.

The Tanzu Kubernetes Grid Integrated Edition tile uses one canary instance when deploying or upgrading Tanzu Kubernetes Grid Integrated Edition.
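
The canary behavior described above can be illustrated with a small simulation. This is a conceptual sketch only and does not call BOSH. The function `rolling_upgrade`, the `upgrade` callback, and the VM names are hypothetical.

```python
from typing import Callable, List


def rolling_upgrade(
    instances: List[str],
    upgrade: Callable[[str], bool],
    canaries: int = 1,
) -> List[str]:
    """Upgrade canary instances first, then the rest.

    `upgrade` returns True on success. The function stops at the first
    failure, so a failed canary leaves every other instance untouched.
    Returns the names of the instances that were upgraded successfully.
    """
    upgraded: List[str] = []
    canary_group = instances[:canaries]
    remainder = instances[canaries:]

    for name in canary_group:
        if not upgrade(name):
            return upgraded  # canary failed: do not touch the remaining VMs
        upgraded.append(name)

    for name in remainder:
        if not upgrade(name):
            break  # stop the rollout on the first non-canary failure
        upgraded.append(name)
    return upgraded
```

With one canary, as the tile uses, a failure on the first VM means the upgrade callback is never invoked for the other VMs.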

What Happens During Cluster Upgrades

Upgrading TKGI-provisioned Kubernetes clusters updates their Kubernetes version to the version included with the Tanzu Kubernetes Grid Integrated Edition tile. It also updates the TKGI version tagged in your clusters to the Tanzu Kubernetes Grid Integrated Edition tile version.

You can upgrade TKGI-provisioned Kubernetes clusters either through the Tanzu Kubernetes Grid Integrated Edition tile or the TKGI CLI. See the table below.

| This method | Upgrades |
|-------------|----------|
| The Upgrade all clusters errand in the Tanzu Kubernetes Grid Integrated Edition tile > Errands | All clusters. Clusters are upgraded serially. |
| tkgi upgrade-cluster | One cluster. |
| tkgi upgrade-clusters | Multiple clusters. Clusters are upgraded serially or in parallel. |
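
As an illustration of the two CLI methods in the table above, the following sketch builds the command lines without running them. The cluster names are hypothetical, and the `--clusters` and `--max-in-flight` flags are assumptions about the TKGI CLI that you should confirm with `tkgi upgrade-clusters --help` for your version.

```python
from typing import List, Optional, Sequence


def upgrade_cluster_command(cluster_name: str) -> List[str]:
    """Command line for upgrading a single cluster."""
    if not cluster_name:
        raise ValueError("cluster_name must not be empty")
    return ["tkgi", "upgrade-cluster", cluster_name]


def upgrade_clusters_command(
    cluster_names: Sequence[str],
    max_in_flight: Optional[int] = None,
) -> List[str]:
    """Command line for upgrading several clusters.

    Assumption: --clusters takes a comma-separated list, and
    --max-in-flight sets how many clusters upgrade in parallel.
    """
    if not cluster_names:
        raise ValueError("at least one cluster name is required")
    command = ["tkgi", "upgrade-clusters", "--clusters", ",".join(cluster_names)]
    if max_in_flight is not None:
        if max_in_flight < 1:
            raise ValueError("max_in_flight must be at least 1")
        command += ["--max-in-flight", str(max_in_flight)]
    return command
```

The returned list can be passed to `subprocess.run(command, check=True)` on a workstation where the TKGI CLI is installed and logged in. Building the list first makes it easy to review which clusters, for example only the development clusters, are included before anything runs.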

During an upgrade of TKGI-provisioned clusters, Tanzu Kubernetes Grid Integrated Edition recreates your clusters. This includes the following stages for each cluster you upgrade:

  1. Master nodes are recreated.
  2. Worker nodes are recreated.

Depending on your cluster configuration, these recreations may cause Master Nodes Outage or Worker Nodes Outage as described below.

Master Nodes Outage

When Tanzu Kubernetes Grid Integrated Edition upgrades a single-master cluster, you cannot interact with your cluster, use kubectl, or push new workloads.

To avoid this loss of functionality, VMware recommends using multi-master clusters.

Worker Nodes Outage

When Tanzu Kubernetes Grid Integrated Edition upgrades a worker node, the node stops running containers. If your workloads run on a single node, they will experience downtime.

To avoid downtime for stateless workloads, VMware recommends using at least one worker node per availability zone (AZ). For stateful workloads, VMware recommends using a minimum of two worker nodes per AZ.
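
The worker node recommendation above can be checked against a cluster's layout. This is a minimal sketch under stated assumptions: the AZ names are hypothetical, and the list of AZs per worker node is supplied by you rather than read from TKGI.

```python
from collections import Counter
from typing import Iterable, List


def azs_below_recommendation(
    worker_azs: Iterable[str],
    all_azs: Iterable[str],
    stateful: bool,
) -> List[str]:
    """Return the AZs with fewer worker nodes than recommended.

    worker_azs lists the AZ of each worker node, one entry per node.
    The minimum is one worker per AZ for stateless workloads and two
    per AZ for stateful workloads, as recommended above.
    """
    minimum = 2 if stateful else 1
    counts = Counter(worker_azs)
    return sorted(az for az in all_azs if counts[az] < minimum)
```

For a cluster with one worker in `az1` and two in `az2`, across AZs `az1`, `az2`, and `az3`, the stateless check reports `["az3"]` and the stateful check reports `["az1", "az3"]`.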

Note: When the Upgrade all clusters errand is enabled in the Tanzu Kubernetes Grid Integrated Edition tile, updating the tile with a new Linux or Windows stemcell rolls every Linux or Windows VM in each Kubernetes cluster. This automatic rolling ensures that all your VMs are patched. To avoid workload downtime, use the resource configuration recommended in Master Nodes Outage and Worker Nodes Outage above and in Maintaining Workload Uptime.

About Upgrading from the Flannel CNI to the Antrea CNI

In Tanzu Kubernetes Grid Integrated Edition v1.9 and earlier, the Tanzu Kubernetes Grid Integrated Edition tile provided Flannel and vSphere with NSX-T as the only supported Container Network Interface (CNI) options for TKGI-provisioned clusters.

Current versions of Tanzu Kubernetes Grid Integrated Edition support Antrea, Flannel and NSX-T CNIs. The Antrea CNI provides Kubernetes Network Policy support for non-NSX-T environments. For more information about Antrea, see Antrea in the Antrea documentation.

Antrea CNI-configured clusters are supported on AWS, Azure and vSphere without NSX-T environments.

Upgrade from the Flannel CNI to Antrea

You can configure TKGI to create new clusters networked with the Antrea CNI:

  • You can only configure Antrea CNI during a TKGI v1.11 or later upgrade.
  • You cannot change your CNI from Antrea back to Flannel after upgrading to TKGI v1.11 or later.


If you switch to the Antrea CNI during your TKGI upgrade:

  • Existing Flannel-configured clusters remain networked using Flannel.
  • New clusters created after the upgrade are created using Antrea as their CNI.

Note: Your existing Flannel clusters will not be migrated to Antrea if you switch to Antrea.

The Flannel CNI Deprecation Timeline

VMware will deprecate and remove Flannel CNI networking from TKGI. The planned Flannel CNI support timeline is as follows:

| TKGI Version | Flannel Support | Existing Flannel Clusters | New Clusters |
|--------------|-----------------|---------------------------|--------------|
| v1.10, v1.11 and v1.12 | Full support | Flannel clusters continue to run using Flannel as their CNI. | Use Flannel or Antrea as your default CNI. |
| v1.13 | Deprecated | Flannel clusters continue to run using Flannel as their CNI. | Use Antrea as the default CNI. Flannel CNI support for new clusters is removed. |
| v1.x | Support Removed | Flannel CNI support is removed entirely. You must migrate all existing Flannel clusters to the Antrea CNI before upgrading. | Use Antrea as the default CNI. Flannel CNI support is removed entirely. |

Note: The Flannel CNI will be deprecated in Tanzu Kubernetes Grid Integrated Edition v1.13. TKGI v1.13 will continue to support existing Flannel CNI-configured clusters but will not provide Flannel as a CNI option for new clusters.
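
The planned timeline can be expressed as a small lookup for use in upgrade planning scripts. This is a sketch, not a TKGI API. It covers only the versions the timeline names explicitly (v1.10 through v1.13) and raises an error for any other version, because the release that removes Flannel entirely is not specified.

```python
# Planned Flannel support status by TKGI minor version (v1.N -> status).
# Only versions named explicitly in the timeline are listed.
FLANNEL_SUPPORT = {
    10: "full",
    11: "full",
    12: "full",
    13: "deprecated",
}


def flannel_status(minor: int) -> str:
    """Return the planned Flannel support status for TKGI v1.<minor>."""
    try:
        return FLANNEL_SUPPORT[minor]
    except KeyError:
        raise LookupError(f"no planned Flannel status recorded for v1.{minor}") from None


def flannel_available_for_new_clusters(minor: int) -> bool:
    """True if new clusters can still use Flannel in TKGI v1.<minor>.

    In v1.13 existing Flannel clusters keep running, but Flannel is not
    offered as a CNI option for new clusters.
    """
    return flannel_status(minor) == "full"
```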

