Troubleshooting PFS

This topic describes how to troubleshoot Pivotal Function Service (PFS).

For general Kubernetes troubleshooting, the kubectl Cheat Sheet is handy.
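
For example, listing recent cluster events sorted by creation time is often a good first step when diagnosing a misbehaving cluster:

kubectl get events --sort-by=.metadata.creationTimestamp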

Avoiding duffle error: claim already exists

If an install fails or if you prefer to reset your Kubernetes cluster rather than uninstalling, you can remove existing duffle claims by deleting the files in ~/.duffle/claims. This will allow you to re-install without encountering the “claim already exists” error.
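
For example, on macOS or Linux (note that this removes all existing claims, not only the failed one):

rm ~/.duffle/claims/*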

Watch Pods

It is useful to monitor your cluster using a utility such as watch, which is provided with many Linux distributions.

For more information, see the watch Linux manual page.

To install watch on a Mac:

brew install watch

See Using PFS with Windows for watch on Windows.
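
For example, to refresh a listing of all pods in the cluster every second:

watch -n 1 kubectl get pods --all-namespaces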

View Function Container Logs

To see a function’s log entries, use the kubectl logs command targeting the user-container container of the function pod. The command below prints the logs from the user-container container of a pod named hello-00001-deployment-5c64894dbd-pk5k2.

kubectl logs hello-00001-deployment-5c64894dbd-pk5k2 -c user-container
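
To stream new log entries as they are written, add the -f (follow) flag:

kubectl logs -f hello-00001-deployment-5c64894dbd-pk5k2 -c user-container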

To find the identity of pods running a PFS function, use the kubectl get pods command in the appropriate namespace with a label query of the form riff.projectriff.io/function={name of function}. The example below finds the pods running a PFS function named hello in the default namespace.

kubectl get pods -l riff.projectriff.io/function=hello
NAME                                      READY     STATUS    RESTARTS   AGE
hello-00001-deployment-5c64894dbd-pk5k2   3/3       Running   0          8m
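
If the function was created in a namespace other than default, add the -n flag (shown here with a hypothetical namespace called functions):

kubectl get pods -n functions -l riff.projectriff.io/function=hello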

Alternatively, just run kubectl get pods in the appropriate namespace and look for pods with a name of the form {name of function}-{revision number}-deployment-{guid}, such as hello-00001-deployment-5c64894dbd-pk5k2.

Examine Function Pod Details

To see the status and other details of a function’s pod, use kubectl describe po, specifying the pod name and namespace. The example below shows the details of a build that failed in the analysis step.

kubectl describe po square-00001-7g74x
Name:               square-00001-7g74x
Namespace:          default
...
Init Containers:
  build-step-credential-initializer:
    ...
    Ready:          True
    ...
  build-step-git-source:
    ...
    Ready:          True
    ...
  build-step-prepare:
    ...
    Ready:          True
    ...
  build-step-write-riff-toml:
    ...
    Ready:          True
    ...
  build-step-detect:
    ...
    Ready:          True
    ...
  build-step-analyze:
    ...
    State:          Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Thu, 06 Dec 2018 08:59:21 +0000
      Finished:     Thu, 06 Dec 2018 08:59:22 +0000
    Ready:          False
  build-step-build:
    ...
    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
  build-step-export:
    ...
    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
Containers:
    ...
Conditions:
  Type              Status
  Initialized       False
  Ready             False
  ContainersReady   False
  PodScheduled      True
...
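
To find the failed step without scanning the full describe output, you can list each init container’s name and ready flag directly (a sketch using kubectl’s jsonpath output, with the same pod name as above):

kubectl get po square-00001-7g74x -o jsonpath='{range .status.initContainerStatuses[*]}{.name}{"\t"}{.ready}{"\n"}{end}'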

To dig deeper, you can then grab the logs for the failed container.

kubectl logs square-00001-7g74x -c build-step-analyze
2018/12/06 08:59:22 Error: failed to access image metadata from image : failed to access manifest gcr.io/testproj/square: DENIED: "Permission denied for \"latest\" from request \"/v2/testproj/square/manifests/latest\". "

An alternative way of viewing a function pod’s details is to use kubectl get po with the -oyaml flag to produce a YAML dump. The example below again shows the failed build, but in a somewhat different form.

kubectl get po square-00001-7g74x -oyaml

Examine Knative Service Details

To see the status and other details of a Knative service, use kubectl describe services.serving.knative.dev, specifying the service name and namespace. The example below shows the details of a Knative service named greet.

kubectl describe services.serving.knative.dev greet

Alternatively, use kubectl get as in the example below.

kubectl get services.serving.knative.dev greet -oyaml
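
Running kubectl get services.serving.knative.dev without a service name lists every Knative service in the namespace, which is a quick way to scan for unhealthy services:

kubectl get services.serving.knative.dev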

Restart a Controller

Sometimes Knative issues are solved by restarting a controller pod. This is as simple as issuing kubectl delete, specifying the name of the pod and the relevant Knative namespace. Once deleted, the pod should restart automatically.

The example below restarts the Knative serving controller.

kubectl delete po controller-f4c59f474-6rwwh -n knative-serving
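
Because the controller pod name includes generated suffixes, you can list the pods in the knative-serving namespace first to find the current name:

kubectl get pods -n knative-serving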

Verify Outbound Network Access

To verify that a function which requires access to services hosted outside the PFS cluster has the correct network configuration applied, examine the function pod details and check that the traffic.sidecar.istio.io/includeOutboundIPRanges annotation has the expected value for the cluster platform.

The example below shows that traffic.sidecar.istio.io/includeOutboundIPRanges has been set to a value appropriate for PFS running on PKS for GCP.

kubectl describe po callexternal-00001-deployment-6d694d8954-t5lnb
Name:           callexternal-00001-deployment-6d694d8954-t5lnb
Namespace:      default
...
Annotations:    riff.projectriff.io/nonce=1
                serving.knative.dev/configurationGeneration=1
                sidecar.istio.io/inject=true
                sidecar.istio.io/status={"version":"07c4150f9d362775cadfdc066636574134dce0c17e859c05ce7eb6424afb59ab","initContainers":["istio-init"],"containers":["istio-proxy"],"volumes":["istio-envoy","istio-certs...
                traffic.sidecar.istio.io/includeOutboundIPRanges=10.200.0.0/16,10.100.200.0/24
...
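
To check just this annotation, you can filter the describe output (using the same pod name as in the example above):

kubectl describe po callexternal-00001-deployment-6d694d8954-t5lnb | grep includeOutboundIPRanges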

For details on configuring outbound network access, please consult the Knative guide.