Troubleshooting PCF on Azure
This topic describes how to troubleshoot known issues when deploying or running Pivotal Cloud Foundry (PCF) on Azure.
Installation Issues
Cannot Copy the Ops Manager Image
Symptom
Cannot copy the Ops Manager image into your storage account when completing the Boot Ops Manager section of Deploying Ops Manager on Azure Manually.
Explanation
You have an outdated version of the Azure CLI. You need Azure CLI version 2.0.0 or later. Run az --version from the command line to display your current Azure CLI version.
Solution
Install the Azure CLI 2.0 by following the instructions for your operating system in the Azure documentation.
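The version check described above can be scripted. The following is a minimal sketch, not an official procedure: the helper names version_at_least and azure_cli_version are hypothetical, the comparison assumes a sort that supports -V (version sort), and the parsing of the az --version output is a best-effort assumption because the output layout differs between CLI releases.

```shell
#!/usr/bin/env bash

# Returns success (0) if version $1 is greater than or equal to version $2.
# Assumption: `sort -V` (version sort) is available on this host.
version_at_least() {
  local have="$1" need="$2"
  [ "$(printf '%s\n%s\n' "$need" "$have" | sort -V | head -n 1)" = "$need" ]
}

# Extracts the first dotted version number from `az --version` output.
# Assumption: the first x.y.z string in the output is the CLI version.
azure_cli_version() {
  az --version 2>/dev/null | grep -Eo '[0-9]+\.[0-9]+\.[0-9]+' | head -n 1
}

if command -v az >/dev/null 2>&1; then
  current="$(azure_cli_version)"
  if version_at_least "$current" "2.0.0"; then
    echo "Azure CLI $current meets the 2.0.0 minimum"
  else
    echo "Azure CLI $current is older than 2.0.0; upgrade before copying the image"
  fi
else
  echo "az not found on PATH"
fi
```

If the script reports an older version, or cannot find az at all, follow the installation instructions referenced in the solution before retrying the image copy.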
Deployment Fails at “create-env”
Symptom
After you click Review Pending Changes and then Apply Changes to install Ops Manager and PAS, the deployment fails at create-env with an error message similar to the following:
Command 'deploy' failed:
Deploying:
Creating instance 'bosh/0':
Waiting until instance is ready:
Starting SSH tunnel:
Parsing private key file '/tmp/bosh_ec2_private_key.pem':
asn1: structure error: tags don't match (16 vs {class:3 tag:28 length:127
isCompound:false}) {optional:false explicit:false application:false
defaultValue:<nil> tag:<nil> stringType:0 timeType:0 set:false omitEmpty:false} pkcs1PrivateKey @2
===== 2016-09-29 16:28:22 UTC Finished "bosh create-env"
/var/tempest/workspaces/default/deployments/bosh.yml";
Duration: 328s; Exit Status: 1
Exited with 1.
Explanation
You provided a passphrase when creating your key pair in the Boot Ops Manager section of Deploying Ops Manager on Azure Manually.
Solution
Create a new key pair with no passphrase and redo the installation, beginning with the step for creating a VM against the Ops Manager image in the Boot Ops Manager section of Deploying Ops Manager on Azure Manually.
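As a rough sketch of the solution above, the following generates a key pair with an empty passphrase and confirms that the private key can be read without one. The helper names, the file name opsman, and the comment ubuntu are illustrative assumptions, not values required by the installation procedure, and the example writes its keys to a temporary directory.

```shell
#!/usr/bin/env bash

# Returns success (0) if the private key at $1 can be read with an empty passphrase.
# `ssh-keygen -y` prints the public key; `-P ""` supplies an empty passphrase
# so the command does not prompt.
key_has_no_passphrase() {
  ssh-keygen -y -P "" -f "$1" >/dev/null 2>&1
}

# Generates an RSA key pair at the path given in $1 with an empty passphrase (-N "").
# The key comment "ubuntu" is an illustrative assumption.
generate_key_without_passphrase() {
  ssh-keygen -q -t rsa -b 2048 -N "" -C "ubuntu" -f "$1"
}

# Example run in a scratch directory; "opsman" is a hypothetical file name.
workdir="$(mktemp -d)"
generate_key_without_passphrase "$workdir/opsman"

if key_has_no_passphrase "$workdir/opsman"; then
  echo "Key at $workdir/opsman has no passphrase"
else
  echo "Key at $workdir/opsman is passphrase-protected; create a new key pair"
fi
```

Running key_has_no_passphrase against an existing key before redoing the installation is a quick way to tell whether the passphrase is the likely cause of the create-env failure.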
Insufficient External Database Permissions
Upgrades can fail when the external database user for the network policy database has insufficient permissions. To avoid this issue, ensure that the networkpolicyserver database user has the ALL PRIVILEGES permission.
Operation Issues
Slow Performance or Timeouts
Symptom
Developers experience slow performance or timeouts when pushing or managing apps, and end users experience slow performance or timeouts when accessing apps.
Explanation
The Azure Load Balancer (ALB) disconnects TCP connections that have been idle for more than four minutes.
Solution
To mitigate slow performance and timeouts, the Router Timeout to Backends (in seconds) field defaults to 900 seconds. This default is intentionally high, and operators should tune it to fit their infrastructure.
To edit the Router Timeout to Backends (in seconds) field:
- Select the Pivotal Application Service (PAS) tile that is located within your Installation Dashboard.
- Select the Networking tab.
- Enter your desired time, in seconds, within the Router Timeout to Backends (in seconds) field.
- Click Save.
Service Instance Creation Times Out
Symptom
You are unable to provision a service instance of a Java or Go service. cf create-service fails with the error Failure provisioning service instance... Timed out after 8 minutes... or similar.
Explanation
HTTP libraries for Java and Go, running with default settings, prune connections that have been idle for 240 seconds from their connection pool without sending a TCP reset message back to the client service broker. As a result, the broker can no longer provision new service instances.
Solution
In the PAS tile Configure Networking pane, set Frontend Idle Timeout to 240 seconds or less, so that Cloud Foundry regenerates the front-end connection before it times out.
See the Knowledge Base article Azure Networking Connection Idle for more than Four minutes for details.
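The 240-second ceiling in the solution above is the four-minute ALB idle limit expressed in seconds. The sketch below checks a proposed Frontend Idle Timeout value against that limit before you enter it in the tile. The function and variable names are hypothetical, and the script does not read or change any PAS setting.

```shell
#!/usr/bin/env bash

# Idle limit of the Azure Load Balancer described in this topic: four minutes, in seconds.
ALB_IDLE_LIMIT_SECONDS=$((4 * 60))

# Returns success (0) if the proposed Frontend Idle Timeout ($1, in seconds)
# is a positive integer no greater than the load balancer idle limit.
frontend_idle_timeout_ok() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;   # reject empty or non-numeric input
  esac
  [ "$1" -gt 0 ] && [ "$1" -le "$ALB_IDLE_LIMIT_SECONDS" ]
}

# Compare a few candidate values against the limit.
for candidate in 180 240 900; do
  if frontend_idle_timeout_ok "$candidate"; then
    echo "$candidate s: within the ${ALB_IDLE_LIMIT_SECONDS} s idle limit"
  else
    echo "$candidate s: exceeds the ${ALB_IDLE_LIMIT_SECONDS} s idle limit"
  fi
done
```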