Verifying Deployment Health

Warning: VMware Enterprise PKS v1.6 is no longer supported because it has reached the End of General Support (EOGS) phase as defined by the Support Lifecycle Policy. To stay up to date with the latest software and security updates, upgrade to a supported version.

This topic describes how to verify the health of your VMware Enterprise PKS deployment.

Verify Kubernetes Node and Pod Health

Verify the health of your Kubernetes nodes and pods by following the steps below:

  1. From the Ops Manager VM, run the following command:

    bosh -e ENVIRONMENT login
    

    Where ENVIRONMENT is the alias you set for your BOSH Director. For more information, see Using BOSH Diagnostic Commands in Enterprise PKS.

    For example:

    $ bosh -e pks login

  2. To verify that all nodes are in a ready state, run the following command for all Kubernetes contexts:

    kubectl get nodes
    
  3. To verify that all pods are running, run the following command for all Kubernetes contexts (a sketch for looping over every context follows this procedure):

    kubectl get pods --all-namespaces
    
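    If you have more than one cluster, you can check every context in one pass. The following is a minimal sketch, assuming kubectl is installed on the machine where you run it and your kubeconfig already contains one context per Enterprise PKS cluster:

    # Sketch: loop over every context in the current kubeconfig and report
    # node and pod status for each. Assumes kubectl is installed and each
    # cluster already has a context in the kubeconfig.
    for context in $(kubectl config get-contexts -o name); do
      echo "=== Context: ${context} ==="
      kubectl --context "${context}" get nodes
      kubectl --context "${context}" get pods --all-namespaces
    done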

Verify Kubernetes Cluster Health

Verify the health of your Kubernetes clusters by following the steps below:

  1. From the Ops Manager VM, run the following command:

    bosh -e ENVIRONMENT login
    

    Where ENVIRONMENT is the alias you set for your BOSH Director. For more information, see Using BOSH Diagnostic Commands in Enterprise PKS.

    For example:

    $ bosh -e pks login

  2. To get the deployment name of a target Kubernetes cluster, run the following command:

    bosh deployments
    

    For example:

    $ bosh deployments
    Using environment '30.0.0.10' as client 'ops_manager'
    
    Name                                                   Release(s)                               Stemcell(s)                                      Team(s)
    harbor-container-registry-b4023f6857207b237399         bosh-dns/1.10.0                          bosh-vsphere-esxi-ubuntu-xenial-go_agent/170.15  -
                                                           harbor-container-registry/1.7.3-build.2
    pivotal-container-service-7e64d53fc570503b5690         backup-and-restore-sdk/1.8.0             bosh-vsphere-esxi-ubuntu-xenial-go_agent/170.15  -
                                                           bosh-dns/1.10.0
                                                           bpm/0.13.0
                                                           cf-mysql/36.14.0
                                                           cfcr-etcd/1.8.0
                                                           docker/35.0.0
                                                           harbor-container-registry/1.7.3-build.2
                                                           kubo/0.25.9
                                                           kubo-service-adapter/1.3.3-build.1
                                                           nsx-cf-cni/2.3.1.10693410
                                                           on-demand-service-broker/0.24.0
                                                           pks-api/1.3.3-build.1
                                                           pks-helpers/50.0.0
                                                           pks-nsx-t/1.19.0
                                                           pks-telemetry/2.0.0-build.113
                                                           pks-vrli/0.7.0
                                                           sink-resources-release/0.1.15
                                                           syslog/11.4.0
                                                           uaa/64.0
                                                           wavefront-proxy/0.9.0
    service-instance_8de000ff-a87a-4930-81ba-106d42c2471e  bosh-dns/1.10.0                          bosh-vsphere-esxi-ubuntu-xenial-go_agent/170.15  pivotal-container-service-7e64d53fc570503b5690
                                                           bpm/0.13.0
                                                           cfcr-etcd/1.8.0
                                                           docker/35.0.0
                                                           harbor-container-registry/1.7.3-build.2
                                                           kubo/0.25.9
                                                           nsx-cf-cni/2.3.1.10693410
                                                           pks-helpers/50.0.0
                                                           pks-nsx-t/1.19.0
                                                           pks-telemetry/2.0.0-build.113
                                                           pks-vrli/0.7.0
                                                           sink-resources-release/0.1.15
                                                           syslog/11.4.0
                                                           wavefront-proxy/0.9.0
    
    3 deployments

    In the example above, service-instance_8de000ff-a87a-4930-81ba-106d42c2471e is the Kubernetes cluster deployment name.

    Note: If you have deployed multiple Kubernetes clusters, determine the UUID using pks clusters and then match that UUID with the Kubernetes cluster deployment you are targeting. A sketch of this mapping follows this procedure.

  3. For each cluster in a deployment, or for a specific cluster, check the status of the cluster’s VMs by running the following command:

    bosh -d K8S-DEPLOYMENT vms
    

    Where K8S-DEPLOYMENT is the name of your Kubernetes cluster deployment. Kubernetes cluster deployment names begin with service-instance and include a unique BOSH-generated identifier.
    This command returns the name of each VM comprising the Kubernetes cluster, including each master and worker node.
    For example:

    $ bosh -d service-instance_8de000ff-a87a-4930-81ba-106d42c2471e vms
    Using environment '30.0.0.10' as client 'ops_manager'

    Task 677. Done

    Deployment 'service-instance_8de000ff-a87a-4930-81ba-106d42c2471e'

    Instance                                     Process State  AZ      IPs       VM CID                                   VM Type      Active
    master/b6d3c263-1682-4c79-a9ab-35939127dedb  running        AZ-K8S  40.0.2.2  vm-60dbcf68-5538-4c4e-8c00-61edc003bb54  medium.disk  true
    worker/d450548a-2b0c-4494-8144-cf9b7ef9c825  running        AZ-K8S  40.0.2.4  vm-1bfdde6d-ce1d-4cdf-90d9-32bba260358f  medium.disk  true
    worker/d7f882f0-33dd-43d3-ab5d-058bcc505088  running        AZ-K8S  40.0.2.3  vm-822cb573-411f-4c44-a32b-34e79520a7a6  medium.disk  true
    worker/e5e25ffe-f448-4d19-990b-89546118c502  running        AZ-K8S  40.0.2.5  vm-c6748604-8440-4b27-9cf4-10a70a02da24  medium.disk  true

    4 vms

    Succeeded

  4. For each cluster in a deployment, or for a specific cluster, check the status of the cluster’s processes by running the following command (a sketch for surfacing only failing VMs and processes follows this procedure):

    bosh -d K8S-DEPLOYMENT instances --ps
    

    Where K8S-DEPLOYMENT is the name of your Kubernetes cluster deployment. Kubernetes cluster deployment names begin with service-instance and include a unique BOSH-generated identifier. This command returns status information for the processes on each Kubernetes cluster VM, including each master and worker node.

    For example:

    $ bosh -d service-instance_8de000ff-a87a-4930-81ba-106d42c2471e instances --ps
    Using environment '30.0.0.10' as client 'ops_manager'

    Task 678. Done

    Deployment 'service-instance_8de000ff-a87a-4930-81ba-106d42c2471e'

    Instance                                            Process                          Process State  AZ      IPs
    apply-addons/ef0d09ae-d3ed-4832-8d3f-431d99730c26   -                                -              AZ-K8S  -
    master/b6d3c263-1682-4c79-a9ab-35939127dedb         -                                running        AZ-K8S  40.0.2.2
    ~                                                   blackbox                         running        -       -
    ~                                                   bosh-dns                         running        -       -
    ~                                                   bosh-dns-healthcheck             running        -       -
    ~                                                   bosh-dns-resolvconf              running        -       -
    ~                                                   etcd                             running        -       -
    ~                                                   kube-apiserver                   running        -       -
    ~                                                   kube-controller-manager          running        -       -
    ~                                                   kube-scheduler                   running        -       -
    ~                                                   ncp                              running        -       -
    ~                                                   pks-helpers-bosh-dns-resolvconf  running        -       -
    worker/d450548a-2b0c-4494-8144-cf9b7ef9c825         -                                running        AZ-K8S  40.0.2.4
    ~                                                   blackbox                         running        -       -
    ~                                                   bosh-dns                         running        -       -
    ~                                                   bosh-dns-healthcheck             running        -       -
    ~                                                   bosh-dns-resolvconf              running        -       -
    ~                                                   docker                           running        -       -
    ~                                                   kube-proxy                       running        -       -
    ~                                                   kubelet                          running        -       -
    ~                                                   nsx-kube-proxy                   running        -       -
    ~                                                   nsx-node-agent                   running        -       -
    ~                                                   ovs-vswitchd                     running        -       -
    ~                                                   ovsdb-server                     running        -       -
    ~                                                   pks-helpers-bosh-dns-resolvconf  running        -       -
    worker/d7f882f0-33dd-43d3-ab5d-058bcc505088         -                                running        AZ-K8S  40.0.2.3
    ~                                                   blackbox                         running        -       -
    ~                                                   bosh-dns                         running        -       -
    ~                                                   bosh-dns-healthcheck             running        -       -
    ~                                                   bosh-dns-resolvconf              running        -       -
    ~                                                   docker                           running        -       -
    ~                                                   kube-proxy                       running        -       -
    ~                                                   kubelet                          running        -       -
    ~                                                   nsx-kube-proxy                   running        -       -
    ~                                                   nsx-node-agent                   running        -       -
    ~                                                   ovs-vswitchd                     running        -       -
    ~                                                   ovsdb-server                     running        -       -
    ~                                                   pks-helpers-bosh-dns-resolvconf  running        -       -
    worker/e5e25ffe-f448-4d19-990b-89546118c502         -                                running        AZ-K8S  40.0.2.5
    ~                                                   blackbox                         running        -       -
    ~                                                   bosh-dns                         running        -       -
    ~                                                   bosh-dns-healthcheck             running        -       -
    ~                                                   bosh-dns-resolvconf              running        -       -
    ~                                                   docker                           running        -       -
    ~                                                   kube-proxy                       running        -       -
    ~                                                   kubelet                          running        -       -
    ~                                                   nsx-kube-proxy                   running        -       -
    ~                                                   nsx-node-agent                   running        -       -
    ~                                                   ovs-vswitchd                     running        -       -
    ~                                                   ovsdb-server                     running        -       -
    ~                                                   pks-helpers-bosh-dns-resolvconf  running        -       -

    51 instances
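If you manage several clusters, the following is a minimal sketch of the UUID-to-deployment mapping described in the note above. It assumes you are logged in to the PKS API with the pks CLI; the environment alias and deployment name are taken from the examples in this procedure:

    # Sketch: list clusters and note the UUID of the one you are targeting.
    pks clusters
    # The BOSH deployment name is "service-instance_" followed by that UUID:
    bosh -e pks -d service-instance_8de000ff-a87a-4930-81ba-106d42c2471e vms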
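To surface only problems rather than the full listing, you can narrow the output. This is a sketch, not an exhaustive health check; it assumes the environment alias and deployment name from the examples above, and that your BOSH CLI version supports the --failing flag for bosh instances:

    # Sketch: show only VMs and processes that are not healthy for one cluster deployment.
    bosh -e pks -d service-instance_8de000ff-a87a-4930-81ba-106d42c2471e vms | grep -v running
    bosh -e pks -d service-instance_8de000ff-a87a-4930-81ba-106d42c2471e instances --ps --failing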

Verify NCP Health (NSX-T Only)

NSX Container Plugin (NCP) runs as a BOSH host process. Each Kubernetes master node VM has one running NCP process. If your cluster has multiple master nodes, one NCP process is active while the others are on standby.

To verify the ncp process is running, do the following:

  1. Run bosh instances in your Enterprise PKS environment:

    bosh -e ENVIRONMENT -d K8S-DEPLOYMENT instances --ps
    

    Where:

    • ENVIRONMENT is the alias you set for your BOSH Director.
    • K8S-DEPLOYMENT is the name of your Kubernetes cluster deployment. Kubernetes cluster deployment names begin with service-instance and include a unique BOSH-generated identifier.

    For example:

    $ bosh -e pks -d service-instance_8de000ff-a87a-4930-81ba-106d42c2471e instances --ps
    Using environment '30.0.0.10' as client 'ops_manager'

    Task 678. Done

    Deployment 'service-instance_8de000ff-a87a-4930-81ba-106d42c2471e'

    Instance                                            Process                          Process State  AZ      IPs
    apply-addons/ef0d09ae-d3ed-4832-8d3f-431d99730c26   -                                -              AZ-K8S  -
    master/b6d3c263-1682-4c79-a9ab-35939127dedb         -                                running        AZ-K8S  40.0.2.2
    ~                                                   blackbox                         running        -       -
    ~                                                   bosh-dns                         running        -       -
    ~                                                   bosh-dns-healthcheck             running        -       -
    ~                                                   bosh-dns-resolvconf              running        -       -
    ~                                                   etcd                             running        -       -
    ~                                                   kube-apiserver                   running        -       -
    ~                                                   kube-controller-manager          running        -       -
    ~                                                   kube-scheduler                   running        -       -
    ~                                                   ncp                              running        -       -
    ~                                                   pks-helpers-bosh-dns-resolvconf  running        -       -
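    To focus on just the NCP process state, you can filter the same output. This is a sketch that assumes the environment alias and deployment name from the example above:

    # Sketch: show only master instances and their ncp process rows.
    bosh -e pks -d service-instance_8de000ff-a87a-4930-81ba-106d42c2471e instances --ps | grep -E 'master|ncp'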

Alternatively:

  1. SSH into your target Kubernetes master node VM:

    bosh -e ENVIRONMENT -d K8S-DEPLOYMENT ssh master/VM-ID
    

    Where:

    • ENVIRONMENT is the alias you set for your BOSH Director.
    • K8S-DEPLOYMENT is the name of your Kubernetes cluster deployment. Kubernetes cluster deployment names begin with service-instance and include a unique BOSH-generated identifier.
    • VM-ID is your Kubernetes master node VM ID. This is a unique BOSH-generated identifier.

    For example:

    $ bosh -e pks -d service-instance_8de000ff-a87a-4930-81ba-106d42c2471e ssh master/b6d3c263-1682-4c79-a9ab-35939127dedb

  2. From the master node VM, run monit summary.

  3. (Optional) To check whether the ncp process on your target master node is active or on standby, run /var/vcap/jobs/ncp/bin/nsxcli -c get ncp-master status. This applies only to multi-master clusters. A combined sketch of steps 2 and 3 follows this procedure.
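For example, after you SSH into the master node VM, you can run both checks from steps 2 and 3. This is a sketch; monit typically requires root privileges on BOSH-deployed VMs, and the paths shown are the standard BOSH locations:

    # Sketch: run on the master node VM after bosh ssh.
    sudo /var/vcap/bosh/bin/monit summary
    # On a multi-master cluster, check whether this node's ncp process is active or standby:
    /var/vcap/jobs/ncp/bin/nsxcli -c get ncp-master status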

For information about troubleshooting NCP, see NSX-T NCP troubleshooting and debug logging in the VMware Knowledge Base.


Please send any feedback you have to pks-feedback@pivotal.io.