Enterprise PKS Release Notes
Warning: Pivotal Container Service (PKS)
v1.5 is no longer supported because it has reached the End
of General Support (EOGS) phase as defined by the
Support Lifecycle Policy.
To stay up to date with the latest software and security updates, upgrade to a supported version.
This topic contains release notes for Enterprise PKS v1.5.x and Enterprise PKS Management Console versions that install it.
v1.5.3
Release Date: May 19, 2020
Features
New features and changes in this release:
- Bumps compatible NSX-T v2.4 to v2.4.3.
- Bumps UAA to v73.4.21.
- Bumps On-Demand Broker to v0.38.0.
- Bumps Apache Tomcat (in PKS API) to v9.0.31.
- Bumps pxc to v0.22.0.
- [Security Fix] Fixes a bug that stored the admin password, Wavefront token, and NSX superuser credentials in plaintext in an internal BOSH manifest.
- [Security Fix] UAA bump fixes a blind SCIM injection vulnerability, CVE-2019-11282.
- [Security Fix] UAA bump fixes a CSRF attack vulnerability.
- [Security Fix] PXC bump fixes a cURL/libcurl buffer overflow vulnerability, CVE-2019-3822.
- [Bug Fix] New UAA version includes Apache Tomcat bump that fixes SAML login issues.
- [Bug Fix] Fixes an issue where fluentD sent chunks of log data that were too large for vRLI to accept.
- [Bug Fix] Fixes an issue with the PKS clusters upgrade errand not pushing the latest NSX-T certificate to Kubernetes Master nodes.
- [Bug Fix] Fixes an issue with the PKS OSB Proxy taking a long time to start due to scanning all NSX-T firewall rules.
- [Bug Fix] Fixes an issue with PKS releasing floating IP addresses incompletely while deleting clusters under active/active mode.
- [Bug Fix] Fixes an issue with the DNS lookup feature where the ingress IP address was not kept out of the PKS floating IP pool.
- [Bug Fix] Fixes an issue where updating a cluster with a network profile incorrectly sets `subnet_prefix` on the cluster to `0`.
- [Bug Fix] Fixes an issue where a network profile sets a cluster's `ingress_ip` to a floating IP pool address that is already allocated for another purpose.
Product Snapshot
| Element | Details |
|---|---|
| Version | v1.5.3 |
| Release date | May 19, 2020 |
| Compatible Ops Manager versions * | See Pivotal Network |
| Ubuntu Xenial stemcell version | See Pivotal Network |
| Windows stemcell version | v2019.15 |
| Kubernetes version | v1.14.10 |
| On-Demand Broker version | v0.38.0 |
| NSX-T versions | v2.5.0, v2.4.3 |
| NCP version | v2.5.0 |
| Docker version | v18.09.8 |
| etcd version | v3.3.12 |
| Backup and Restore SDK version | v1.17.0 |
| UAA version | v73.4.21 |
* If you want to use Windows workers in Enterprise PKS v1.5, you must install Ops Manager v2.6.16 or later. Enterprise PKS does not support this feature on Ops Manager v2.5. For more information about Ops Manager v2.6.16 and later, see PCF Ops Manager v2.6 Release Notes.
vSphere Version Requirements
For Enterprise PKS installations on vSphere or on vSphere with NSX-T Data Center, refer to the VMware Product Interoperability Matrices.
Upgrade Path
The supported upgrade paths to Enterprise PKS v1.5.3 are from Enterprise PKS v1.4.0 and later, and v1.5.0 and later.
Breaking Changes
For information about breaking changes in Enterprise PKS v1.5.3, see the following:
Known Issues
For information about known issues in Enterprise PKS v1.5.3, see the following:
v1.5.2
Release Date: February 22, 2020
Features
New features and changes in this release:
- [Security Fix] Bumps Kubernetes to v1.14.10.
- [Security Fix] Bumps UAA to v73.4.16.
- [Security Fix] Bumps pxc to v0.22.0.
- [Security Fix] Bumps syslog to v11.5.0.
- [Security Fix] Bumps the Windows stemcell to v2019.15. If you have existing Kubernetes clusters with Windows workers and you want to upgrade to Enterprise PKS v1.5.2, see Windows Stemcell Update from v2019.7 to v2019.15.
- [Security Fix] Secures traffic into Kubernetes clusters with up-to-date TLS (v1.2+) and approved cipher suites.
- [Bug Fix] Improves the behavior of the `pks get-kubeconfig` and `pks get-credentials` commands during cluster updates and upgrades. You can now run the `pks get-kubeconfig` command during single- and multi-master cluster updates. Additionally, you can run the `pks get-credentials` command during multi-master cluster upgrades.
- [Bug Fix] Resolves a known issue where cluster creation fails when using Availability Sets on Azure.
- [Bug Fix] Resolves a known issue where a vSphere AZ configured without a resource pool in Ops Manager causes failure during persistent volume creation.
- [Bug Fix] Improves upgrades of Windows worker-based clusters by gracefully removing `dockerd.exe`.
- [Bug Fix] Resolves a buffer overflow issue for Telemetry.
Product Snapshot
| Element | Details |
|---|---|
| Version | v1.5.2 |
| Release date | February 22, 2020 |
| Compatible Ops Manager versions * | See Pivotal Network |
| Ubuntu Xenial stemcell version | See Pivotal Network |
| Windows stemcell version | v2019.15 |
| Kubernetes version | v1.14.10 |
| On-Demand Broker version | v0.29.0 |
| NSX-T versions | v2.5.0, v2.4.2, v2.4.1, or v2.4.0.1 |
| NCP version | v2.5.0 |
| Docker version | v18.09.8 |
| Backup and Restore SDK version | v1.17.0 |
| UAA version | v73.4.16 |
* If you want to use Windows workers in Enterprise PKS v1.5, you must install Ops Manager v2.6.16 or later. Enterprise PKS does not support this feature on Ops Manager v2.5. For more information about Ops Manager v2.6.16 and later, see PCF Ops Manager v2.6 Release Notes.
vSphere Version Requirements
For Enterprise PKS installations on vSphere or on vSphere with NSX-T Data Center, refer to the VMware Product Interoperability Matrices.
Upgrade Path
The supported upgrade paths to Enterprise PKS v1.5.2 are from Enterprise PKS v1.4.0 and later, and v1.5.0 and later.
Breaking Changes
Enterprise PKS v1.5.2 has the following breaking change.
Windows Stemcell Update from v2019.7 to v2019.15
To address CVE-2020-0601, Enterprise PKS v1.5.2 updates the Windows stemcell from v2019.7 to v2019.15. Enterprise PKS v1.5.2 will use this stemcell version for provisioning new Kubernetes clusters with Windows workers. If you have existing Kubernetes clusters with Windows workers and you want to upgrade to Enterprise PKS v1.5.2, follow the instructions below.
Step 1. Upgrade to Enterprise PKS v1.5.2:
- Replace your Windows stemcell v2019.7 with v2019.15 in Ops Manager.
- Upgrade Enterprise PKS to v1.5.2.
Upgrading to Enterprise PKS v1.5.2 does not update the Windows stemcell version for your existing Kubernetes clusters with Windows workers, even if the Upgrade all clusters errand is enabled in the tile. To update the Windows stemcell version for existing Kubernetes clusters with Windows workers, follow the instructions below.
Step 2. Update Windows stemcell version for existing clusters:
After upgrading to Enterprise PKS v1.5.2, update the Windows stemcell version for all of your existing Kubernetes clusters with Windows workers (a scripted sketch of these steps follows the list):
- Log in to your Enterprise PKS environment through BOSH.
- Run the `bosh deployments` command to identify the names of your Kubernetes cluster deployments with Windows workers.
- For each Kubernetes cluster deployment with Windows workers, do the following:
  - Download the cluster deployment manifest:
    `bosh -e PKS-ENVIRONMENT -d service-instance-ID manifest > /tmp/manifest.yml`
  - In the deployment manifest, update the Windows stemcell version from `2019.7` to `2019.15`.
  - Update the cluster deployment with the modified manifest:
    `bosh -e PKS-ENVIRONMENT -d service-instance-ID deploy /tmp/manifest.yml`
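If you have many clusters with Windows workers, the per-cluster steps above can be scripted. The following is a minimal sketch only: the BOSH environment alias (`PKS-ENVIRONMENT`), the deployment names, and the `sed`-based edit of the stemcell version are assumptions; review `/tmp/manifest.yml` before running the final deploy.

```bash
# Minimal sketch of the per-cluster update loop above. The environment alias,
# the deployment names, and the sed-based edit are assumptions; inspect
# /tmp/manifest.yml before running the final deploy.
set -euo pipefail

BOSH_ENV="PKS-ENVIRONMENT"

# List cluster deployments first to find those with Windows workers:
#   bosh -e "${BOSH_ENV}" deployments

for deployment in service-instance_FIRST-WINDOWS-CLUSTER-ID service-instance_SECOND-WINDOWS-CLUSTER-ID; do
  # Download the cluster deployment manifest.
  bosh -e "${BOSH_ENV}" -d "${deployment}" manifest > /tmp/manifest.yml

  # Update the Windows stemcell version from 2019.7 to 2019.15.
  sed -i 's/2019\.7/2019.15/g' /tmp/manifest.yml

  # Redeploy the cluster with the modified manifest.
  bosh -e "${BOSH_ENV}" -d "${deployment}" deploy /tmp/manifest.yml
done
```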
Breaking Changes in Enterprise PKS v1.5.x
For information about breaking changes in Enterprise PKS v1.5.x, see the following:
Known Issues
For information about known issues in Enterprise PKS v1.5.x, see the following:
v1.5.1
Release Date: October 10, 2019
Features
New features and changes in this release:
- [Feature Improvement] Cluster administrators can use a shared Tier-1 topology in a Multi-Tier-0 environment. For more information, see Defining Network Profiles for Shared Tier-1 Router. Note: This feature requires NSX-T Data Center v2.5.
- [Security Fix] Bumps Kubernetes to v1.14.6.
- [Security Fix] Bumps UAA to v73.4.8 to address privilege escalation vulnerabilities and other bug fixes.
- [Security Fix] Upgrades the system to Golang v1.12.9 to address several security issues in the Golang net/http package that can result in a DoS against any process with an HTTP or HTTPS listener, specifically:
  - CVE-2015-5739: 'Content-Length' treated as a valid header.
  - CVE-2015-5740: double Content-Length headers do not return a 400 error.
  - CVE-2015-5741: additional hardening, such as not sending Content-Length with Transfer-Encoding and closing connections.
- [Bug Fix] Fixed missing CA certificates that affected metric sinks.
- [Bug Fix] Resolves an issue where logs at the destination were truncated when the syslog endpoint was enabled through the PKS tile > Logging > Enable Syslog for PKS.
- [Bug Fix] Fixed PKS API calls failing with “no enum constant” after a successful upgrade.
- [Bug Fix] Added capability to maintain PKS API consistency and compatibility for both /v1 and /v1beta endpoints.
- [Bug Fix] Improved the error message shown if the `pks resize` or `pks upgrade-cluster` operations fail.
- [Bug Fix] When a user submits a request to /v1beta1/clusters/CLUSTER-NAME with an array of insecure_registries, validation fails with “At least one of update parameters have to be specified.” This release resolves the issue so that the request is accepted successfully with a “202” HTTP response.
- [Bug Fix] Fixed the error `"pre-start scripts failed. Failed Jobs: pks-nsx-t-prepare-master-vm"` caused by pagination not being handled by the logical switch search.
Product Snapshot
| Element | Details |
|---|---|
| Version | v1.5.1 |
| Release date | October 10, 2019 |
| Compatible Ops Manager versions * | See Pivotal Network |
| Xenial stemcell version | See Pivotal Network |
| Windows stemcell version | v2019.7 |
| Kubernetes version | v1.14.6 |
| On-Demand Broker version | v0.29.0 |
| NSX-T versions | v2.5.0, v2.4.2, v2.4.1, or v2.4.0.1 |
| NCP version | v2.5.0 |
| Docker version | v18.09.8 |
| Backup and Restore SDK version | v1.17.0 |
| UAA version | v73.4.8 |
* If you want to use Windows workers in Enterprise PKS v1.5, you must install Ops Manager v2.6.16 or later. Enterprise PKS does not support this feature on Ops Manager v2.5. For more information about Ops Manager v2.6.16 and later, see PCF Ops Manager v2.6 Release Notes.
vSphere Version Requirements
For Enterprise PKS installations on vSphere or on vSphere with NSX-T Data Center, refer to the VMware Product Interoperability Matrices.
Upgrade Path
The supported upgrade paths to Enterprise PKS v1.5.1 are from Enterprise PKS v1.4.0 and later, and v1.5.0.
Exception: If you are running Enterprise PKS v1.4.0 with NSX-T v2.3.x, follow the steps below:
- Upgrade to PKS v1.4.1.
- Upgrade to NSX-T v2.4.1.
- Upgrade to PKS v1.5.1.
For more information, see Upgrading Enterprise PKS and Upgrading Enterprise PKS with NSX-T.
Breaking Changes
For information about breaking changes in Enterprise PKS v1.5.x, see the following:
Known Issues
Enterprise PKS v1.5.1 has the following known issues, which also apply to v1.5.0:
- Your Kubernetes API Server CA Certificate Is Only a One-Year Certificate
- Cluster Upgrade Does Not Upgrade Kubernetes Version on Windows Workers
- 502 Bad Gateway After OIDC Login
v1.5.0
Release Date: August 20, 2019
Features
New features and changes in this release:
- Cluster administrators can use the `pks cluster CLUSTER-NAME --details` command to view details about an individual cluster, including Kubernetes nodes and NSX-T network details. For more information, see Viewing Cluster Details.
- Enterprise PKS v1.5.0 adds the following network profiles:
- Cluster administrators can define a network profile to use a single, shared Tier-1 router per Kubernetes cluster. For more information, see Defining Network Profiles for Shared Tier-1 Router. Note: This feature requires NSX-T Data Center v2.5.
- Cluster administrators can define a network profile to use a third-party load balancer for Kubernetes services of type `LoadBalancer`. For more information, see Defining Network Profiles for the Layer 4 Load Balancer.
- Cluster administrators can define a network profile to use a third-party ingress controller for pod ingress traffic. For more information, see Defining Network Profile for the HTTP/S Ingress Controller.
- Cluster administrators can define a network profile to configure section markers for explicit distributed firewall rule placement. For more information, see Defining Network Profile for Distributed Firewall Section Marking.
- Cluster administrators can define a network profile to configure NCP logging. For more information, see Defining Network Profiles for NCP Logging.
- Cluster administrators can define a network profile to configure DNS lookup of the IP addresses for the Kubernetes API load balancer and ingress controller. For more information, see Defining Network Profile for DNS Lookup of Pre-Provisioned IP Addresses.
- Cluster administrators can provision Windows worker-based Kubernetes clusters on vSphere with Flannel. Windows worker-based clusters in Enterprise PKS v1.5.0 do not support NSX-T integration. For more information, see Configuring Windows Worker-Based Clusters (Beta) and Deploying and Exposing Windows Workloads (Beta).
- Operators can set the lifetime for the refresh and access tokens for Kubernetes clusters. You can configure the token lifetimes to meet your organization’s security and compliance needs. For information about configuring the access and refresh token for your Kubernetes clusters, see the UAA section in the Installing topic for your IaaS.
- Operators can configure prefixes for OpenID Connect (OIDC) users and groups to avoid name conflicts with existing Kubernetes system users. Pivotal recommends adding prefixes to ensure OIDC users and groups do not gain unintended privileges on clusters. For information about configuring OIDC prefixes, see the Configure OpenID Connect section in the Installing topic for your IaaS.
- Operators can configure an external SAML identity provider for user authentication and authorization. For information about configuring an external SAML identity provider, see the Configure SAML as an Identity Provider section in the Installing topic for your IaaS.
- Operators can upgrade Kubernetes clusters separately from the Enterprise PKS tile. For information about upgrading Kubernetes clusters, see Upgrading Clusters.
- Operators can configure the Telegraf agent to send master/etcd node metrics to a third-party monitoring service. For more information, see Monitoring Master/etcd Node VMs.
- Operators can configure the default node drain behavior. You can use this feature to resolve hanging or failed cluster upgrades. For more information about configuring node drain behavior, see Worker Node Hangs Indefinitely in Troubleshooting and Configure Node Drain Behavior in Upgrade Preparation Checklist for Enterprise PKS v1.5.
- App developers can create metric sinks for namespaces within a Kubernetes cluster. For more information, see Creating and Managing Sink Resources.
- VMware’s Customer Experience Improvement Program (CEIP) and the Pivotal Telemetry Program (Telemetry) are now enabled in Enterprise PKS by default. This includes both new installations and upgrades. For information about configuring CEIP and Telemetry in the Enterprise PKS tile, see the CEIP and Telemetry section in the Installing topic for your IaaS.
Product Snapshot
| Element | Details |
|---|---|
| Version | v1.5.0 |
| Release date | August 20, 2019 |
| Compatible Ops Manager versions * | See Pivotal Network |
| Xenial stemcell version | See Pivotal Network |
| Windows stemcell version | v2019.7 |
| Kubernetes version | v1.14.5 |
| On-Demand Broker version | v0.29.0 |
| NSX-T versions | v2.5.0, v2.4.2, v2.4.1, or v2.4.0.1 |
| NCP version | v2.5.0 |
| Docker version | v18.09.8 |
| Backup and Restore SDK version | v1.17.0 |
| UAA version | v71.2 |
* If you want to use Windows workers in Enterprise PKS v1.5, you must install Ops Manager v2.6.16 or later. Enterprise PKS does not support this feature on Ops Manager v2.5. For more information about Ops Manager v2.6.16 and later, see PCF Ops Manager v2.6 Release Notes.
vSphere Version Requirements
For Enterprise PKS installations on vSphere or on vSphere with NSX-T Data Center, refer to the VMware Product Interoperability Matrices.
Upgrade Path
The supported upgrade paths to Enterprise PKS v1.5.0 are from Enterprise PKS v1.4.0 and later.
Exception: If you are running Enterprise PKS v1.4.0 with NSX-T v2.3.x, follow the steps below:
- Upgrade to PKS v1.4.1.
- Upgrade to NSX-T v2.4.1.
- Upgrade to PKS v1.5.0.
For more information, see Upgrading Enterprise PKS and Upgrading Enterprise PKS with NSX-T.
Breaking Changes
Enterprise PKS v1.5.0 has the following breaking changes:
Persistent Volume Data Loss with Worker Reboot
With old versions of Ops Manager, PKS worker nodes with persistent disk volumes may get stuck in a startup state and lose data when they are rebooted manually from the dashboard or automatically by vSphere HA.
This issue is fixed in the following Ops Manager versions:
- v2.7.6+
- v2.6.16+
- v2.5.24+
For all PKS installations that host workers using persistent volumes, Pivotal recommends upgrading to one of the Ops Manager versions above.
Announcing Support for NSX-T v2.5.0 with a Known Issue and KB Article
Enterprise PKS v1.5 supports NSX-T v2.5. Before upgrading to NSX-T v2.5, note the following:
- In some cases, a few hosts may fail to upgrade from NSX-T v2.4.1 to v2.5.0. This is a known issue. For more information, see the Failed to install software on host error during upgrade NSX-T from 2.4.x to 2.5.0 (74674) KB article.
- Review known issues related to NSX-T v2.5 in NSX-T Release Notes.
Announcing Support for NSX-T v2.4.2 with a Known Issue and Workaround
Enterprise PKS v1.5 supports NSX-T v2.4.2. However, there is a known issue with NSX-T v2.4.2 that can affect new and upgraded installations of Enterprise PKS v1.5 that use a NAT topology.
For NSX-T v2.4.2, the PKS Management Plane must be deployed on a Tier-1 distributed router (DR). If the PKS Management Plane is deployed on a Tier-1 service router (SR), the router needs to be converted. To convert an SR to a DR, refer to the East-West traffic between workloads behind different T1 is impacted, when NAT is configured on T0 (71363) KB article.
This issue is addressed in NSX-T v2.5 so that it does not matter if the Tier-1 Router is a DR or an SR.
New OIDC Prefixes Break Existing Cluster Role Bindings
In Enterprise PKS v1.5, operators can configure prefixes for OIDC usernames and groups. If you add OIDC prefixes you must manually change any existing role bindings that bind to a username or group. If you do not change your role bindings, developers cannot access Kubernetes clusters. For information about creating a role binding, see Managing Cluster Access and Permissions.
New API Group Name for Sink Resources
The apps.pivotal.io API group name for sink resources is no longer supported.
The new API group name is pksapi.io.
When creating a sink resource,
your sink resource YAML definition must start with apiVersion: pksapi.io/v1beta1.
All existing sinks are migrated automatically.
For more information about defining and managing sink resources, see Creating and Managing Sink Resources.
Log Sink Changes
Enterprise PKS v1.5.0 adds the following log sink changes:
- The `ClusterSink` log sink resource has been renamed to `ClusterLogSink`, and the `Sink` log sink resource has been renamed to `LogSink`.
  - When you create a log sink resource with YAML, you must use one of the new names in your sink resource YAML definition. For example, specify `kind: ClusterLogSink` to define a cluster log sink. All existing sinks are migrated automatically.
  - When managing your log sink resources through kubectl, you must use the new log sink resource names. For example, if you want to delete a cluster log sink, run `kubectl delete clusterlogsink` instead of `kubectl delete clustersink`.
- Log transport now requires a secure connection. When creating a `ClusterLogSink` or `LogSink` resource, you must include `enable_tls: true` in your sink resource YAML definition. All existing sinks are migrated automatically.
For more information about defining and managing sink resources, see Creating and Managing Sink Resources.
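Taken together, the changes above mean a cluster log sink written for v1.5.0 looks roughly like the following. This is a minimal sketch only: the sink name, destination host, and port are placeholders, and the spec fields other than `enable_tls` are illustrative; see Creating and Managing Sink Resources for the full schema.

```bash
# Minimal sketch of a cluster log sink under the new pksapi.io API group.
# The sink name, host, and port are placeholders; spec fields other than
# enable_tls are illustrative -- check the sink resources documentation.
kubectl apply -f - <<'EOF'
apiVersion: pksapi.io/v1beta1
kind: ClusterLogSink
metadata:
  name: example-cluster-log-sink
spec:
  host: logs.example.com
  port: 514
  enable_tls: true
EOF

# Manage sinks with the new resource names.
kubectl get clusterlogsinks
kubectl delete clusterlogsink example-cluster-log-sink
```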
Deprecation of Sink Commands in the PKS CLI
The following Enterprise PKS Command Line Interface (PKS CLI) commands are deprecated and will be removed in a future release:
- `pks create-sink`
- `pks sinks`
- `pks delete-sink`
You can use the following Kubernetes CLI commands instead:
- `kubectl apply -f YOUR-SINK.yml`
- `kubectl get clusterlogsinks`
- `kubectl delete clusterlogsink YOUR-SINK`
For more information about defining and managing sink resources, see Creating and Managing Sink Resources.
Known Issues
Enterprise PKS v1.5.0 has the following known issues:
Your Kubernetes API Server CA Certificate Expires Unless You Regenerate It
Symptom
Your Kubernetes API server’s tls-kubernetes-2018 certificate is a one-year certificate
instead of a four-year certificate.
Explanation
When you upgraded from PKS v1.2.7 to PKS v1.3.1, the upgrade process extended the lifespan of all PKS CA certificates to four years, except for the Kubernetes API server’s tls-kubernetes-2018 certificate. The tls-kubernetes-2018 certificate remained a one-year certificate.
Unless you regenerate the tls-kubernetes-2018 certificate it retains its one-year lifespan, even through subsequent Enterprise PKS upgrades.
Workaround
If you have not already done so, you should replace the Kubernetes API server’s one-year tls-kubernetes-2018 certificate before it expires.
For information about generating and applying a new four-year tls-kubernetes-2018 certificate, see
How to regenerate tls-kubernetes-2018 certificate when it is not regenerated in the upgrade to PKS v1.3.x in the Pivotal Knowledge Base.
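As a quick check before following the Knowledge Base procedure, you can inspect the expiry date of the certificate currently served by the Kubernetes API endpoint. This is a sketch only; the API FQDN is a placeholder, and port 8443 assumes the default Enterprise PKS Kubernetes API port.

```bash
# Print the expiry date of the certificate served by the Kubernetes API endpoint.
# KUBERNETES-API-FQDN is a placeholder; port 8443 assumes the default PKS setup.
echo | openssl s_client -connect KUBERNETES-API-FQDN:8443 -servername KUBERNETES-API-FQDN 2>/dev/null \
  | openssl x509 -noout -enddate
```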
Cluster Upgrade Does Not Upgrade Kubernetes Version on Windows Workers
When PKS clusters are upgraded, Windows worker nodes in the cluster do not upgrade their Kubernetes version. The master and Linux worker nodes in the cluster do upgrade their Kubernetes version as expected.
When the Kubernetes version of a Windows worker does not exactly match the version of the master node, the cluster still functions. kube-apiserver has no restriction on lagging patch bumps.
PKS clusters upgrade manually with the pks upgrade-cluster command, or automatically with PKS upgrades when the Upgrade all clusters errand is set to Default (On) in the PKS tile Errands pane.
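For reference, a single cluster is upgraded with the PKS CLI as follows; the cluster name is a placeholder. As described above, any Windows workers in the cluster keep their current Kubernetes version.

```bash
# Upgrade one cluster to the Kubernetes version shipped with the installed tile.
# CLUSTER-NAME is a placeholder; Windows workers are not upgraded (see above).
pks upgrade-cluster CLUSTER-NAME
```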
NCP Flapping is Observed When Resizing the Number of Worker Nodes in Large Clusters
When resizing the number of worker nodes in a large cluster, for example from 250 nodes to 500 nodes, NCP may fail to establish a connection, resulting in flapping behavior. The ncp-stdout.log shows repeated error messages similar to the following: “NewConnectionError: Failed to establish a new connection: No address found.”
To fix this issue, upgrade to Ops Manager v2.6.7 or later with bosh-dns v1.12.0.
NSX-T v2.4.x Logical Switch Is Not Synced to Nested ESXi Host and vCenter
When provisioning a Kubernetes cluster using Enterprise PKS on vSphere with NSX-T, NCP creates a logical switch for Kubernetes nodes (the Node Network). Although a logical switch may be successfully created and visible in the NSX Manager web interface, the logical switch may not exist in vSphere (vCenter) when nested ESXi hosts are used, resulting in an error.
If the logical switch does not exist in vSphere, when PKS tries to create the Kubernetes master and worker node VMs on that logical switch, BOSH reports an error similar to the following:
Error: Unknown CPI error 'Unknown' with message 'Unable to find network 'pks-500bebbd-a41d-425b-bff7-f7415f8a2fz9'. Verify that the portgroup exists.' in 'create_vm' CPI method (CPI request ID: 'cpi-662509')
The recommended fix for this issue is to upgrade from NSX-T v2.4 to NSX-T v2.5.
If it is not possible to upgrade, invoke the NSX-T Management API and perform a PUT operation on each logical switch object that is in this intermediate state. For more information, refer to the Update Logical Switch section of the NSX-T Data Center REST API documentation.
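For example, the PUT can be performed with curl against the NSX-T Management API. This is a sketch only: the NSX Manager address, credentials, and logical switch UUID are placeholders, and the object is PUT back unmodified (including its current `_revision`) to trigger a re-sync.

```bash
# Illustrative re-save of a logical switch via the NSX-T Management API.
# NSX-MANAGER, the admin credentials, and the logical switch UUID are placeholders.
NSX_MANAGER="NSX-MANAGER"
LS_ID="LOGICAL-SWITCH-UUID"

# Fetch the current logical switch object, including its current _revision.
curl -k -u 'admin:PASSWORD' \
  "https://${NSX_MANAGER}/api/v1/logical-switches/${LS_ID}" -o ls.json

# PUT the object back unchanged to force the switch to be realized in vCenter.
curl -k -u 'admin:PASSWORD' -X PUT -H 'Content-Type: application/json' \
  -d @ls.json \
  "https://${NSX_MANAGER}/api/v1/logical-switches/${LS_ID}"
```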
NOTE: The use of nested ESXi hosts is not supported for production environments of Enterprise PKS.
Azure Availability Sets Not Supported
Note: This issue is resolved in Enterprise PKS v1.5.2.
For new Enterprise PKS v1.5.x installations on Azure using Ops Manager v2.5.x or v2.6.x, enabling the Availability Sets mode in the BOSH Director > Azure Config pane results in the kubelet failing to start when a Kubernetes cluster is provisioned.
When installing Enterprise PKS on Azure, choose the Availability Zones option. For configuration details, see Azure Config Page in Configuring BOSH Director on Azure Using Terraform.
Duplicate IP address conflict can occur
When a network profile is used to provision a Kubernetes cluster and perform a DNS lookup of the ingress controller IP address, NCP allocates the IP address from the floating IP pool, but NSX-T does not mark the IP address as allocated. As a result, NCP can re-allocate the same IP address for another purpose, causing a duplicate IP address conflict.
This known issue does not affect allocation of the IP address for the Kubernetes API server load balancer.
Passwords Not Supported for Ops Manager VM on vSphere
Starting in Ops Manager v2.6, you can SSH into the Ops Manager VM in a vSphere deployment only with a private SSH key. You cannot SSH into the Ops Manager VM with a password.
To avoid upgrade failure and errors when authenticating, add a public key to the Customize Template screen of the OVF template for the Ops Manager VM. Then, use the private key to SSH into the Ops Manager VM.
Warning: You cannot upgrade to Ops Manager v2.6 successfully without adding a public key. If you do not add a key, Ops Manager shuts down automatically because it cannot find a key and may enter a reboot loop.
For more information about adding a public key to the OVF template, see Deploy Ops Manager in Deploying Ops Manager on vSphere.
Azure Default Security Group Is Not Automatically Assigned to Cluster VMs
Symptom
You experience issues when configuring a load balancer for a multi-master Kubernetes cluster or creating a service of type LoadBalancer.
Additionally, in the Azure portal, the VM > Networking page does not display
any inbound and outbound traffic rules for your cluster VMs.
Explanation
As part of configuring the Enterprise PKS tile for Azure, you enter Default Security Group in the Kubernetes Cloud Provider pane. When you create a Kubernetes cluster, Enterprise PKS automatically assigns this security group to each VM in the cluster. However, on Azure the automatic assignment may not occur.
As a result, your inbound and outbound traffic rules defined in the security group are not applied to the cluster VMs.
Workaround
If you experience this issue, manually assign the default security group to each VM NIC in your cluster.
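If you prefer the Azure CLI to the portal, the assignment can be done per NIC roughly as follows; the resource group, NIC, and network security group names are placeholders.

```bash
# Re-apply the default security group to one cluster VM NIC.
# All names below are placeholders for your Azure resources.
az network nic update \
  --resource-group MY-RESOURCE-GROUP \
  --name MY-CLUSTER-VM-NIC \
  --network-security-group MY-DEFAULT-SECURITY-GROUP
```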
Cluster Creation Fails When First AZ Runs Out of Resources
Symptom
If the first availability zone (AZ) used by a plan with multiple AZs runs out of resources, cluster creation fails with an error like the following:
Error: CPI error 'Bosh::Clouds::CloudError' with message 'No valid placement found for requested memory: 4096
Explanation
BOSH creates VMs for your Enterprise PKS deployment using a round-robin algorithm, creating the first VM in the first AZ that your plan uses. If the AZ runs out of resources, cluster creation fails because BOSH cannot create the cluster VM.
For example, if you have three AZs and you create two clusters with four worker VMs each, BOSH deploys VMs in the following AZs:
| | AZ1 | AZ2 | AZ3 |
|---|---|---|---|
| Cluster 1 | Worker VM 1, Worker VM 4 | Worker VM 2 | Worker VM 3 |
| Cluster 2 | Worker VM 1, Worker VM 4 | Worker VM 2 | Worker VM 3 |
In this scenario, AZ1 has twice as many VMs as AZ2 or AZ3.
vSphere AZ Without Resource Pool Causes Volume Failure and Pod Downtime on Upgrade
Note: This issue is resolved in Enterprise PKS v1.5.2.
On vSphere, if an availability zone (AZ) that PKS uses is configured without a resource pool, then its cluster workers lose access to their persistent volumes during upgrade to PKS v1.5. This volume mount failure renders the volumes unavailable, along with any pods that use them.
Availability zones are configured with clusters and optional resource pools in the BOSH Director for vSphere tile > Create Availability Zones pane.
For more information and how to resolve the issue, see the Knowledge Base article Persistent volume creation fails with error “Failed to get the folder reference” when AZs are configured with no resource pool.
Azure Worker Node Communication Fails after Upgrade
Symptom
Outbound communication from a worker node VM fails after upgrading Enterprise PKS.
Explanation
Enterprise PKS uses Azure Availability Sets to improve the uptime of workloads and worker nodes in the event of Azure platform failures. Worker node VMs are distributed evenly across Availability Sets.
Azure Standard SKU Load Balancers are recommended for the Kubernetes control plane and Kubernetes ingress and egress. This load balancer type provides an IP address for outbound communication using SNAT.
During an upgrade, when BOSH rebuilds a given worker instance in an Availability Set, Azure can time out while re-attaching the worker node network interface to the back-end pool of the Standard SKU Load Balancer.
For more information, see Outbound connections in Azure in the Azure documentation.
Workaround
You can manually re-attach the worker instance to the back-end pool of the Azure Standard SKU Load Balancer in your Azure console.
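If you prefer the Azure CLI to the console, the re-attachment can be done roughly as follows; the resource group, NIC, IP configuration, load balancer, and back-end pool names are placeholders.

```bash
# Re-attach a worker NIC's IP configuration to the Standard SKU load balancer
# back-end pool. All names below are placeholders for your Azure resources.
az network nic ip-config address-pool add \
  --resource-group MY-RESOURCE-GROUP \
  --nic-name MY-WORKER-VM-NIC \
  --ip-config-name MY-IP-CONFIG \
  --lb-name MY-STANDARD-LB \
  --address-pool MY-BACKEND-POOL
```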
Error During Individual Cluster Upgrades
Symptom
While submitting a large number of cluster upgrade requests using the pks upgrade-cluster command, some of your Kubernetes clusters are marked as failed.
Explanation
BOSH upgrades Kubernetes clusters in parallel with a limit of up to four concurrent cluster upgrades by default. If you schedule more than four cluster upgrades, Enterprise PKS queues the upgrades and waits for BOSH to finish the last upgrade. When BOSH finishes the last upgrade, it starts working on the next upgrade request.
If you submit too many cluster upgrades to BOSH, an error may occur in which some of your clusters are marked as FAILED because BOSH could not start their upgrades within the specified timeout. The timeout is set to 168 hours by default. However, BOSH does not remove a task from the queue or stop working on an upgrade that has already been picked up.
Solution
If you expect that upgrading all of your Kubernetes clusters takes more than 168 hours, do not use a script that submits upgrade requests for all of your clusters at once. For information about upgrading Kubernetes clusters provisioned by Enterprise PKS, see Upgrading Clusters.
Kubectl CLI Commands Do Not Work after Changing an Existing Plan to a Different AZ
Symptom
After you update the AZ of an existing plan, kubectl CLI commands do not work for your clusters associated with the plan.
Explanation
This issue occurs in IaaS environments that do not support attaching a disk across multiple AZs.
When the plan of an existing cluster changes to a different AZ, BOSH migrates the cluster by creating VMs for the cluster in the new AZ and removing your cluster VMs from the original AZ.
On an IaaS that does not support attaching VM disks across AZs, the disks BOSH attaches to the new VMs do not have the original content.
Workaround
If you cannot run kubectl CLI commands after reconfiguring the AZ of an existing cluster, contact Support for assistance.
Internal Server Error When Saving Telemetry Settings
Symptom
When saving the Telemetry configuration pane in the Enterprise PKS tile, you receive an HTTP 500 Internal Server Error.
Explanation
When using Ops Manager v2.5, you may receive an HTTP 500 Internal Server Error if you attempt to
save Telemetry preferences without configuring all of the required settings in the pane.
Solution
In your browser, return to the Telemetry pane. Configure all of the required settings and click Save.
One Plan ID Longer than Other Plan IDs
Symptom
One of your plan IDs is one character longer than your other plan IDs.
Explanation
In Enterprise PKS, each plan has a unique plan ID. A plan ID is normally a UUID consisting of 32 alphanumeric characters and 4 hyphens. However, the Plan 4 ID consists of 33 alphanumeric characters and 4 hyphens.
Solution
You can safely configure and use Plan 4. The length of the Plan 4 ID does not affect the functionality of Plan 4 clusters.
If you require all plan IDs to have identical length, do not activate or use Plan 4.
Enterprise PKS Metric Sinks Fail to Use Secure Connections
Symptom
When you attempt to use MetricSink or ClusterMetricSink over a secure connection, the TLS handshake is rejected by Telegraf.
Explanation
This is due to missing CA certificates in the Telegraf container images included in the tile version. A patch is being worked on. Until this patch is published, users are not able to send metrics through secure connections.
Enterprise PKS Windows Worker-based Clusters Cannot be Deployed in Internet-less Environments
Explanation
Kubernetes Kubelet deploys a pause container during a Kubernetes pod deployment.
To support Kubelet deployment of Linux pods, the Enterprise PKS tile includes a packaged Linux pause image. Microsoft does not allow Windows container images to be distributed in this way, and only Microsoft can distribute a Windows container base image.
To support deployment of a Windows worker-based pod, Kubelet fetches a Windows container image from a Docker registry.
Docker registries are not accessible from within an internet-less environment and therefore Windows worker-based clusters are not supported in internet-less environments.
PKS Resize Returns the Error: 'The service broker has been updated'
Symptom
When you attempt to resize a Kubernetes cluster using the pks resize command, the command returns the error: “The service broker has been updated”.
Explanation
After upgrading the Enterprise PKS tile, all existing Enterprise PKS-provisioned Kubernetes clusters must be upgraded before you can resize them.
Solution
For instructions on upgrading Kubernetes clusters, see Running 'pks resize cluster-name' returns an error following PKS tile upgrade, stating that 'The service broker has been updated' in the Pivotal Support Knowledge Base.
Saving UAA Tab Settings Fails With Error: 'InvalidURIError bad URI'
Symptom
When you save your UAA tab with LDAP Server selected and multiple LDAP servers specified,
you receive the error: URI::InvalidURIError bad URI(is not URI?):LDAP URLs.
Explanation
When you configure the UAA tab with multiple LDAP servers your settings will fail to validate when using the following Ops Manager releases:
| Ops Manager Version | Affected Releases |
|---|---|
| Ops Manager v2.6 | Ops Manager v2.6.18 and earlier patch releases. |
| Ops Manager v2.7 | All patch releases. |
Workaround
To resolve this issue see the following:
| Ops Manager Version | Workaround |
|---|---|
| Ops Manager v2.6 | Perform one of the following: upgrade to an Ops Manager v2.6 patch release later than v2.6.18, or complete the procedures in the Knowledge Base article referenced in the Ops Manager v2.7 row below. |
| Ops Manager v2.7 | Complete the procedures in UAA authentication tab in PKS 1.6 fails to save with error “URI::InvalidURIError bad URI(is not URI?):LDAP URLs” (76495) in the Pivotal Support Knowledge Base. |
502 Bad Gateway After OIDC Login
Symptom
You experience a “502 Bad Gateway” error from the NSX load balancer after you log in to OIDC.
Explanation
A large response header has exceeded your NSX-T load balancer maximum response header size. The default maximum response header size is 10,240 characters and should be resized to 50,000.
Workaround
If you experience this issue, manually reconfigure your NSX-T request_header_size
and response_header_size to 50,000 characters.
For information about configuring NSX-T default header sizes,
see OIDC Response Header Overflow in the Pivotal Knowledge Base.
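For orientation, the header sizes are properties of the load balancer HTTP application profile and can be updated through the NSX-T Management API. The sketch below is illustrative only: the manager address, credentials, and profile UUID are placeholders, and the Knowledge Base article above is the authoritative procedure.

```bash
# Illustrative update of the NSX-T load balancer HTTP application profile.
# NSX-MANAGER, the credentials, and the profile UUID are placeholders.
NSX_MANAGER="NSX-MANAGER"
PROFILE_ID="HTTP-APPLICATION-PROFILE-UUID"

# Fetch the current profile, then edit request_header_size and
# response_header_size to 50000 in profile.json before the PUT.
curl -k -u 'admin:PASSWORD' \
  "https://${NSX_MANAGER}/api/v1/loadbalancer/application-profiles/${PROFILE_ID}" -o profile.json

curl -k -u 'admin:PASSWORD' -X PUT -H 'Content-Type: application/json' \
  -d @profile.json \
  "https://${NSX_MANAGER}/api/v1/loadbalancer/application-profiles/${PROFILE_ID}"
```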
Character Limitations in HTTP Proxy Password
For vSphere with NSX-T, the HTTP Proxy password field does not support the following special characters: & or ;.
Enterprise PKS Management Console v1.5.3
Release Date: May 19, 2020
Features
This release includes bug fixes and compatibility updates.
Product Snapshot
Note: Enterprise PKS Management Console provides an opinionated installation of Enterprise PKS. The supported versions may differ from or be more limited than what is generally supported by Enterprise PKS.
| Element | Details |
|---|---|
| Enterprise PKS Management Console version | v1.5.3 |
| Release date | May 19, 2020 |
| Installed Enterprise PKS version | v1.5.3 |
| Installed Ops Manager version | v2.7.16 |
| Installed Kubernetes version | v1.14.10 |
| Supported NSX-T versions | v2.5.0, v2.4.2, or v2.4.1 |
| Installed Harbor Registry version | v1.8.6 |
Known Issues
See the Known Issues for v1.5.2, Known Issues for v1.0.0, and Known Issues for v0.9.0.
Enterprise PKS Management Console v1.5.2
Release Date: March 09, 2020
Features
This release includes bug fixes and compatibility updates.
Product Snapshot
Note: Enterprise PKS Management Console provides an opinionated installation of Enterprise PKS. The supported versions may differ from or be more limited than what is generally supported by Enterprise PKS.
| Element | Details |
|---|---|
| Enterprise PKS Management Console version | v1.5.2* |
| Release date | March 09, 2020 |
| Installed Enterprise PKS version | v1.5.2 |
| Installed Ops Manager version | v2.7.13 |
| Installed Kubernetes version | v1.14.10 |
| Supported NSX-T versions | v2.5.0, v2.4.2, or v2.4.1 |
| Installed Harbor Registry version | v1.8.4 |
*In v1.5.2 and later, Enterprise PKS Management Console version numbers match the version of Enterprise PKS that they install.
Known Issues
See the Known Issues for v1.0.0 and Known Issues for v0.9.0.
Enterprise PKS Management Console v1.0.0
Release Date: October 16, 2019
Product Snapshot
Note: Enterprise PKS Management Console provides an opinionated installation of Enterprise PKS. The supported versions may differ from or be more limited than what is generally supported by Enterprise PKS.
| Element | Details |
|---|---|
| Enterprise PKS Management Console version | v1.0.0 |
| Release date | October 16, 2019 |
| Installed Enterprise PKS version | v1.5.1 |
| Installed Ops Manager version | v2.6.11 |
| Installed Kubernetes version | v1.14.6 |
| Supported NSX-T versions | v2.5.0, v2.4.2, or v2.4.1 |
| Installed Harbor Registry version | v1.8.3 |
Known Issues
The following known issues are specific to the Enterprise PKS Management Console v1.0.0 appliance and user interface. See also the Known Issues for v0.9.0.
Cannot modify floating IP range for Automated NAT Deployment
Symptom
In the PKS Configuration > Wizard view of Enterprise PKS Management Console, you cannot modify the Usable range of floating IPs values if you initially deployed Enterprise PKS with the NSX-T Data Center (Automated NAT Deployment) option.
Explanation
You can only set the Usable range of floating IPs values during an initial deployment of Enterprise PKS from Enterprise PKS Management Console. If you attempt to modify these values in either the wizard or the YAML editor, the management console throws an error when you generate the configuration.
Enterprise PKS Management Console v0.9.0
Release Date: August 22, 2019
Features
Enterprise PKS v1.5.0 includes a beta release of VMware Enterprise PKS Management Console that provides a graphical interface for deploying and managing Enterprise PKS on vSphere. For more information, see Using the Enterprise PKS Management Console.
Product Snapshot
Note: The Management Console BETA provides an opinionated installation of Enterprise PKS. The supported versions may differ from or be more limited than what is generally supported by Enterprise PKS.
| Element | Details |
|---|---|
| Version | v0.9.0. This feature is a beta component and is intended for evaluation and test purposes only. |
| Release date | August 22, 2019 |
| Installed Enterprise PKS version | v1.5.0 |
| Installed Ops Manager version | v2.6.5 |
| Installed Kubernetes version | v1.14.5 |
| Supported NSX-T versions | v2.4.2 or v2.4.1 |
| Installed Harbor Registry version | v1.8.1 |
Known Issues
The following known issues are specific to the Enterprise PKS Management Console v0.9.0 appliance and user interface.
Enterprise PKS Management Console Notifications Persist
Symptom
In the Enterprise PKS view of Enterprise PKS Management Console, error notifications sometimes persist in memory on the Clusters and Nodes pages after you clear those notifications.
Explanation
After you click the X button to clear a notification, it is removed. However, when you navigate back to those pages, the notification might appear again.
Workaround
Use shift+refresh to reload the page.
Cannot Delete Enterprise PKS Deployment from Management Console
Symptom
In the Enterprise PKS view of Enterprise PKS Management Console, you cannot use the Delete Enterprise PKS Deployment option even after you have removed all clusters.
Explanation
The option to delete the deployment only becomes available in the management console a short time after the clusters are deleted.
Workaround
After removing clusters, wait for a few minutes before attempting to use the Delete Enterprise PKS Deployment option again.
Configuring Enterprise PKS Management Console Integration with VMware vRealize Log Insight
Symptom
The Enterprise PKS Management Console appliance sends logs to VMware vRealize Log Insight over HTTP, not HTTPS.
Explanation
When you deploy the Enterprise PKS Management Console appliance from the OVA, if you require log forwarding to vRealize Log Insight, you must provide the port on the vRealize Log Insight server on which it listens for HTTP traffic. Do not provide the HTTPS port.
Workaround
Set the vRealize Log Insight port to the HTTP port. This is typically port 9000.
Deploying Enterprise PKS to an Unprepared NSX-T Data Center Environment Results in Flannel Error
Symptom
When using the management console to deploy Enterprise PKS in NSX-T Data Center (Not prepared for PKS) mode, if an error occurs during the network configuration, the message Unable to set flannel environment is displayed in the deployment progress page.
Explanation
The network configuration has failed, but the error message is incorrect.
Workaround
To see the correct reason for the failure, see the server logs. For instructions about how to obtain the server logs, see Troubleshooting Enterprise PKS Management Console.
Using BOSH CLI from Operations Manager VM
Symptom
The BOSH CLI client bash command that you obtain from the Deployment Metadata view does not work when logged in to the Operations Manager VM.
Explanation
The BOSH CLI client bash command from the Deployment Metadata view is intended to be used from within the Enterprise PKS Management Console appliance.
Workaround
To use the BOSH CLI from within the Operations Manager VM, see Connect to Operations Manager.
From the Ops Manager VM, use the BOSH CLI client bash command from the Deployment Metadata page, with the following modifications:
- Remove the clause `BOSH_ALL_PROXY=xxx`.
- Replace the `BOSH_CA_CERT` section with `BOSH_CA_CERT=/var/tempest/workspaces/default/root_ca_certificate` (the resulting command shape is sketched after this list).
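With those two modifications, the command you run from the Ops Manager VM ends up with roughly the following shape. This is a sketch only; the client name, secret, and director address come from your Deployment Metadata page and are placeholders here.

```bash
# Approximate shape of the modified BOSH CLI command on the Ops Manager VM.
# BOSH_ALL_PROXY is omitted, BOSH_CA_CERT points at the Ops Manager root CA,
# and the client name, secret, and director address are placeholders.
BOSH_CLIENT=CLIENT-NAME \
BOSH_CLIENT_SECRET=CLIENT-SECRET \
BOSH_CA_CERT=/var/tempest/workspaces/default/root_ca_certificate \
BOSH_ENVIRONMENT=BOSH-DIRECTOR-ADDRESS \
bosh deployments
```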
Run pks Commands against the PKS API Server
Explanation
The PKS CLI is available in the Enterprise PKS Management Console appliance.
Workaround
To run pks commands against the PKS API server, you must first log in to PKS using the following command syntax: `pks login -a fqdn_of_pks ...`.
To do this, you must ensure either of the following:
- The FQDN configured for the PKS Server is resolvable by the DNS server configured for the Enterprise PKS Management Console appliance, or
- An entry that maps the Floating IP assigned to the PKS Server to the FQDN exists in /etc/hosts on the appliance, for example: `192.168.160.102 api.pks.local` (see the sketch after this list).
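For example, if you use the /etc/hosts approach, the steps inside the appliance look roughly like the following; the IP address and FQDN come from your deployment, and the credentials and certificate path are placeholders.

```bash
# Map the PKS API floating IP to its FQDN inside the appliance, then log in.
# The IP/FQDN pair matches the example above; the credentials and the CA
# certificate path are placeholders.
echo "192.168.160.102 api.pks.local" | sudo tee -a /etc/hosts

pks login -a api.pks.local -u PKS-ADMIN-USERNAME -p PKS-ADMIN-PASSWORD \
  --ca-cert /path/to/pks-api-ca.pem
```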
Please send any feedback you have to pks-feedback@pivotal.io.
