Troubleshooting PCF on Azure

Page last updated:

This topic describes how to troubleshoot known issues when deploying or running Pivotal Cloud Foundry (PCF) on Azure.

Installation Issues

Cannot Copy the Ops Manager Image

Symptom

Cannot copy the Ops Manager image into your storage account when completing Step 4: Boot Ops Manager of the Launching an BOSH Director Instance on Azure topic.

Explanation

You have an outdated version of the Azure CLI. You need the Azure CLI version 2.0.0 or greater. Run az --version from the command line to display your current Azure CLI version.

Solution

Install the Azure CLI 2.0 by following the instructions for your operating system in the Azure documentation.


Deployment Fails at “create-env”

Symptom

After clicking Review Pending Changes, then Apply Changes to install Ops Manager and PAS, the deployment fails at create-env with an error message similar to the following:

Command 'deploy' failed:
  Deploying:
    Creating instance 'bosh/0':
      Waiting until instance is ready:
        Starting SSH tunnel:
          Parsing private key file '/tmp/bosh_ec2_private_key.pem':
            asn1: structure error: tags don't match (16 vs {class:3 tag:28 length:127
            isCompound:false}) {optional:false explicit:false application:false 
            defaultValue:<nil> tag:<nil> stringType:0 timeType:0 set:false omitEmpty:false} pkcs1PrivateKey @2
            ===== 2016-09-29 16:28:22 UTC Finished "bosh create-env" 
            /var/tempest/workspaces/default/deployments/bosh.yml"; 
            Duration: 328s; Exit Status: 1
Exited with 1. 

Explanation

You provided a passphrase when creating your key pair in Step 4: Boot Ops Manager section of the Launching a BOSH Director Instance on Azure topic.

Solution

Create a new key pair with no passphrase and redo the installation, beginning with the step for creating a VM against the Ops Manager image in Step 4: Boot Ops Manager section of the Launching a BOSH Director Instance on Azure topic.


Insufficient External Database Permissions

Upgrade issues can be caused when the external database user used for the network policy DB is given insufficient permissions. To avoid this upgrade issue, ensure that the networkpolicyserver database user has the ALL PRIVILEGES permission.

Operation Issues

Slow Performance or Timeouts

Symptom

Developers suffer from slow performance or timeouts when pushing or managing apps, and end users suffer from slow performance or timeouts when accessing apps

Explanation

The Azure Load Balancer (ALB) disconnects active TCP connections lying idle for over four minutes.

Solution

To mitigate slow performance or timeouts, the default value of the Router Timeout to Backends (in seconds) field is set to 900 seconds. This default value is set high to mitigate performance issues but operators should tune this parameter to fit their infrastructure.

To edit the Router Timeout to Backends (in seconds) field:

  1. Select the Pivotal Application Service (PAS) tile that is located within your Installation Dashboard.
  2. Select the Networking tab.
  3. Enter your desired time, in seconds, within the Router Timeout to Backends (in seconds) field. Router timeout ert
  4. Click Save.


Service Instance Creation Times Out

Symptom

You are unable to provision a service instance of a Java or Go service. cf create-service fails with error Failure provisioning service instance... Timed out after 8 minutes... or similar.

Explanation

HTTP libraries for Java and Go, running with default settings, prune idle (for 240 seconds) connections from their connection pool without sending a TCP reset message back to the client service broker. This removes the ability of the broker to provision new service instances.

Solution

In the PAS tile Configure Networking pane, set Frontend Idle Timeout to 240 seconds or less, so that Cloud Foundry regenerates the front-end connection before it times out.

See the Knowledge Base article Azure Networking Connection Idle for more than Four minutes for details.

Create a pull request or raise an issue on the source for this page in GitHub