Troubleshooting Ingress Router

Troubleshooting

Istio Pilot and/or Istio Ingress Gateway not running

Symptom

After installing or upgrading Pivotal Ingress Router and running the following command, istio-pilot and istio-ingressgateway show a Pending status or report that 0/1 instances are ready:

kubectl get all -n ingress-router-system
NAME                                             READY   STATUS      RESTARTS   AGE
pod/cluster-registrar-58c96fd4dc-4mwrj           1/1     Running     0          83s
pod/istio-ingressgateway-75d7645c7c-2jcr9        0/1     Running     0          83s
pod/istio-init-crd-10-t4mxs                      0/1     Completed   0          83s
pod/istio-init-crd-11-844sv                      0/1     Completed   0          82s
pod/istio-pilot-7f87f87c54-bv967                 0/1     Pending     0          83s
pod/master-routing-controller-86c64d94f5-h8ds5   1/1     Running     0          83s

During upgrades, you may see an older running istio-pilot pod alongside a newer pending istio-pilot pod. Describing the pending istio-pilot pod shows that it cannot be scheduled because the node is out of memory.
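If the unhealthy pods are hard to spot in a long listing, a small filter over the kubectl output can surface them. This is a minimal sketch, assuming the standard NAME/READY/STATUS column layout shown above; the not_ready helper name is illustrative, not part of the product.

```shell
# Illustrative helper: print pods whose READY count is short of the desired
# count and whose STATUS is not Completed (columns as in the listing above).
not_ready() {
  awk '{ split($2, r, "/"); if (r[1] != r[2] && $3 != "Completed") print $1, $3 }'
}

# Apply it to the live pod list, if kubectl is available.
if command -v kubectl >/dev/null; then
  kubectl get pods -n ingress-router-system --no-headers | not_ready
fi
```

In the example output above, this would print only pod/istio-pilot-7f87f87c54-bv967 and pod/istio-ingressgateway-75d7645c7c-2jcr9.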

Problem

If any of your pods are not in a Running or Completed state, you might need to increase the size of the worker VMs or horizontally scale up the number of workers. This has been tested successfully on worker VMs with 2 CPUs and 2 GB of memory, which is the Medium VM type on GCP.

Solution

It is not currently possible to switch an existing cluster to a different plan. To increase the worker VM size, either recreate the cluster using a plan with larger VMs or more of them, or modify the plan the existing cluster uses to increase its VM size or VM count.

You can increase the number of worker nodes with the following command:

pks resize <cluster-name> -n <worker-count>

Cluster registrar stuck in a crash loop, failing to reach the PKS API

Symptom

After installing Pivotal Ingress Router and running the following command, cluster-registrar is stuck in a CrashLoopBackOff status:

kubectl get all -n ingress-router-system

NAME                                           READY   STATUS               RESTARTS   AGE
pod/cluster-registrar-58c964dc-4mwrj           0/1     CrashLoopBackOff     7          83s
pod/istio-ingressgateway-77645c7c-2jcr9        1/1     Running              0          83s
pod/istio-init-crd-10-t4s                      0/1     Completed            0          83s
pod/istio-init-crd-11-8sv                      0/1     Completed            0          82s
pod/istio-pilot-7f87f854-bv967                 1/1     Running              0          83s
pod/master-routing-controller-86c694f5-h8ds5   1/1     Running              0          83s

The logs for the cluster-registrar show that it cannot reach the API:

kubectl -n ingress-router-system logs pod/cluster-registrar-58c964dc-4mwrj

I0724 21:21:17.801221 1 logger.go:17] Initializing cluster-registrar...
E0724 21:22:17.801709 1 logger.go:29] Post https://api.pks.example.com:8443/oauth/token: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)

Problem

The cluster registrar cannot reach the PKS API.
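To confirm the connectivity problem independently of the registrar, you can attempt a raw TCP connection to the PKS API from a VM on the cluster's network (for example, over bosh ssh). This sketch assumes bash, the GNU timeout utility, and the api.pks.example.com:8443 endpoint from the log above; substitute your own PKS API hostname. The check_api name is illustrative.

```shell
# Illustrative check: can we open a TCP connection to the PKS API?
check_api() {
  # bash-specific /dev/tcp redirection; give up after 5 seconds
  timeout 5 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

if check_api api.pks.example.com 8443; then
  echo "PKS API reachable"
else
  echo "PKS API unreachable - check NAT or outbound internet access"
fi
```

If the connection fails from the cluster network but succeeds from your workstation, the problem is the cluster's outbound routing rather than the PKS API itself.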

Solution

Set up NAT so that your cluster VMs can reach the PKS API, or enable the "Allow outbound internet access from Kubernetes cluster vms" setting in your PKS deployment.