Upgrade Preparation Checklist for PCF v2.6

This topic serves as a checklist for preparing to upgrade Pivotal Cloud Foundry (PCF) from v2.5 to v2.6.

This topic contains important preparation steps that you must follow before beginning your upgrade. Failure to follow these instructions may jeopardize your existing deployment data and cause the upgrade to fail.

After completing the steps in this topic, you can continue to Upgrading Pivotal Cloud Foundry.

Warning: Pivotal does not recommend that you skip minor versions when upgrading PCF. Skipping minor versions when upgrading PCF may result in breaking changes. To avoid additional breaking changes, upgrade PCF to the minor version that directly follows your current version of PCF.

Back Up Your PCF Deployment

Pivotal recommends backing up your PCF deployment before upgrading so that you can restore it if the upgrade fails. To do this, see Backing Up Pivotal Cloud Foundry with BBR.

Find Your Decryption Passphrase for Ops Manager

To complete the Ops Manager upgrade, you must have your Ops Manager decryption passphrase. You defined this decryption passphrase during the initial installation of Ops Manager.

Review Changes in PCF v2.6

Review each of the following links to understand the changes in the new release, such as new features, known issues, and breaking changes.

Migrate NFS Legacy Volume Services

If you are using the nfs-legacy volume service and your service instances are configured with uid=0 or gid=0, you may need to rebind your apps with a non-zero or non-root uid or gid value.

To find the apps that need migration, see Get the Apps that Need Migration.

To migrate your apps to a non-root uid or gid value, see Rebind Your Apps.

Get the Apps that Need Migration

To get the apps you must migrate:

  1. Copy the following bash script to your Ops Manager jumpbox.

    #!/bin/bash
    source /var/vcap/jobs/cfdot/bin/setup
    for container_id in $(ps aux | grep fuse-nfs |  grep -oE '(nfs:.*uid=0.*|nfs:.*gid=0.*)' |grep -o '/var/vcap/data/volumes/nfs.*' | grep -n '\-\-auto_cache' | awk '{print $1}' | awk -F_ '{print $2}'); do /var/vcap/packages/cfdot/bin/cfdot cell-state `cat /var/vcap/bosh/spec.json | /var/vcap/packages/cfdot/bin/jq -r .id` | jq -r ".LRPs[] | select(.instance_guid == \"${container_id}\") | .process_guid" | cut -c1-36; done
    
  2. Save the script as /tmp/get-affected-apps.sh and make it executable by running chmod +x /tmp/get-affected-apps.sh.

  3. Copy /tmp/get-affected-apps.sh to the Diego Cells by running:

    bosh scp /tmp/get-affected-apps.sh diego_cell:/tmp/get-affected-apps.sh
    

    Note: If you are running Small Footprint PAS, replace `diego_cell` with `compute` in the command.

  4. Determine whether there are apps that need to be manually updated to be compatible with PCF v2.6 by running:

    bosh ssh diego_cell -c 'mv /tmp/get-affected-apps.sh ~/get-affected-apps.sh && sudo ~/get-affected-apps.sh' | grep 'stdout' | awk '{print $4}'  > /tmp/file_with_guids
    

    Note: If you are running Small Footprint PAS, replace `diego_cell` with `compute` in the command.

  5. Use scp to copy /tmp/file_with_guids, the file generated in the previous step, from the Ops Manager jumpbox to a machine with the Cloud Foundry Command-Line Interface (cf CLI) installed.

  6. From the machine with cf CLI installed, target your foundation.

  7. Copy the following bash script into a file named /tmp/print-apps-to-migrate.sh:

    #!/bin/bash
    
    cat /tmp/file_with_guids | sort -n | uniq | while read guid; do
      app_name=$(cf curl /v2/apps/$guid | jq -r '.entity.name')
      space_url=$(cf curl /v2/apps/$guid | jq -r '.entity.space_url')
      space_name=$(cf curl ${space_url} | jq -r '.entity.name')
      org_name=$(cf curl $(cf curl ${space_url} | jq -r '.entity.organization_url') | jq -r '.entity.name')
    
      cf target -o $org_name -s $space_name 2>&1 1>/dev/null
      service_name=$(cf services | grep nfs-legacy | grep $app_name | awk '{print $1}')
      echo service name=$service_name, org name=$org_name, space name=$space_name, app name=$app_name
    done
    
  8. Make the script executable by running chmod +x /tmp/print-apps-to-migrate.sh.

  9. Run the /tmp/print-apps-to-migrate.sh file. The script prints the service name, org, space, and app name for each app that you need to migrate. If you receive a list of apps, continue to the procedure in the next section.
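
As an optional check before running print-apps-to-migrate.sh, you may want to confirm that /tmp/file_with_guids contains only well-formed app GUIDs, because the file is produced by filtering bosh ssh output with grep and awk. The following is a minimal sketch of such a check. The function name filter_app_guids is hypothetical and is not part of any PCF tooling.

```shell
#!/bin/bash
# Sketch only: filter_app_guids is a hypothetical helper, not a PCF tool.
# Reads candidate GUIDs on stdin, keeps lines that look like a
# 36-character GUID (8-4-4-4-12 hex digits), and removes duplicates.
filter_app_guids() {
  grep -E '^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}$' | sort -u
}
```

For example, filter_app_guids < /tmp/file_with_guids > /tmp/file_with_guids.clean writes a de-duplicated copy that you can compare against the original file.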

Rebind Your Apps

If you get a list of apps from the print-apps-to-migrate.sh file in the previous procedure, you must rebind those apps with a non-zero or non-root uid or gid value.

To rebind your apps:

  1. Choose a non-root user that can access the files on the remote share.

  2. Work with the admin of the remote NAS server to ensure that the POSIX permissions on the share are compatible with the non-root uid or gid value to be used by the app.

  3. To unbind the existing volume services, run:

    cf stop APP-NAME
    cf unbind-service APP-NAME SERVICE-NAME
    
  4. To create a new nfs volume service to the remote share, run:

    cf create-service nfs Existing NEW-SERVICE-NAME -c '{"share": "SHARE-URL"}'
    
  5. To bind the nfs service to the app by using the uid or gid that you chose in the first step, run:

    cf bind-service APP-NAME NEW-SERVICE-NAME -c '{"uid": <uid. Cannot be 0>, "gid": <gid. Cannot be 0>[, <other original options>]}'
    
  6. To start the app, run:

    cf start APP-NAME
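
If you have many apps to rebind, the stop, unbind, create, bind, and start steps above could be wrapped in a small helper. The following is a sketch under stated assumptions, not a supported script: the names run, rebind_app, and DRY_RUN are hypothetical, the uid and gid are written as unquoted numbers to mirror the bind command above, and any other original bind options are omitted.

```shell
#!/bin/bash
# Sketch only: run, rebind_app, and DRY_RUN are hypothetical names.

# Print the command when DRY_RUN is set; otherwise execute it.
run() {
  if [[ -n "${DRY_RUN:-}" ]]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

rebind_app() {
  local app="$1" old_service="$2" new_service="$3" share="$4" uid="$5" gid="$6"

  # Reject anything that is not a positive integer, including 0 (root).
  if ! [[ "$uid" =~ ^[1-9][0-9]*$ && "$gid" =~ ^[1-9][0-9]*$ ]]; then
    echo "uid and gid must be non-zero integers" >&2
    return 1
  fi

  # Same order as the manual steps; stop at the first failure.
  run cf stop "$app" &&
    run cf unbind-service "$app" "$old_service" &&
    run cf create-service nfs Existing "$new_service" -c "{\"share\": \"$share\"}" &&
    run cf bind-service "$app" "$new_service" -c "{\"uid\": $uid, \"gid\": $gid}" &&
    run cf start "$app"
}
```

For example, DRY_RUN=1 rebind_app APP-NAME SERVICE-NAME NEW-SERVICE-NAME SHARE-URL 1000 1000 prints the five cf commands without running them, which lets you review them before making changes. The value 1000 is only an illustration.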
    

Restage Windows Apps

The windows2016 stack is deprecated in PAS for Windows v2.6.

Before upgrading to PAS for Windows v2.6 from PAS for Windows v2.5, all Windows apps that were pushed with the windows2016 stack must be restaged to use the windows stack.

For more information, see the Deprecation of the windows2016 Stack section in the PAS for Windows v2.6 Release Notes.

If you are running PAS for Windows v2.4.4 or later on PCF v2.5, you can also restage your apps before upgrading to PAS for Windows v2.5. For more information about upgrading to PAS for Windows v2.5, see Upgrade to PAS for Windows v2.5.

(vSphere Only) Add SSH Key to OVF Template for Ops Manager VM

To avoid upgrade failure and errors when authenticating:

  1. Add a public key to the OVF template for the Ops Manager VM using the Public SSH Key field of the Customize template screen. For more information, see the Deploy Ops Manager section in the Deploying Ops Manager on vSphere topic.

For more information about this breaking change, see the Passwords Not Supported for Ops Manager VM on vSphere section in the PCF v2.6 Breaking Changes topic.

Update Tiles and Add-Ons

The following sections describe changes you must make to your product tiles and add-ons before upgrading PCF.

Upgrade to PAS for Windows v2.5

If you are running PAS for Windows v2.4.4 or later on PCF v2.5 and want to upgrade to PCF v2.6, you must upgrade to PAS for Windows v2.5 before upgrading to PCF v2.6. To download PAS for Windows, go to Pivotal Network.

Upgrade Stemcell to v2019.3 or Later for PAS for Windows v2.5

If you are running PAS for Windows v2.5 using Windows stemcell v2019.2, you must upgrade your stemcell to v2019.3 or later before you upgrade to Ops Manager v2.6.

Warning: Windows stemcell v2019.2 is not compatible with Ops Manager v2.6.

Review Service Tile Compatibility

Before you upgrade, check whether the service tiles that you currently have are compatible with the new version of PCF.

To check PCF versions supported by a service tile, either from Pivotal or a Pivotal partner:

  1. Navigate to the tile’s download page on Pivotal Network.

  2. Select the tile version in the Releases dropdown.

  3. See the Depends On section under Release Details. For more information, refer to the tile’s release notes.

If the currently-deployed version of a tile is not compatible with PCF v2.6, you must upgrade the tile to a compatible version before you upgrade PCF. You do not need to upgrade tiles that are compatible with both PCF v2.5 and v2.6.

Some partner service tiles may be incompatible with PCF v2.6. Pivotal works with partners to ensure their tiles are updated to work with the latest versions of PCF. For more information about partner service release compatibility, see the Depends On section of the partner's tile download page, see the Services documentation, or contact the partner organization that produces the service tile.

For an overview of which PCF versions support which versions of the most popular service tiles from Pivotal, see the Product Compatibility Matrix.

Environment Details

Pivotal provides the empty table below as a model to print out or adapt for recording and tracking the tile versions that you have deployed in all of your environments.

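| Category                               | Product                           | Sandbox | Non-Prod | Prod | Other… |
|----------------------------------------|-----------------------------------|---------|----------|------|--------|
| Pivotal Cloud Foundry                  | Ops Manager                       |         |          |      |        |
|                                        | Pivotal Application Service (PAS) |         |          |      |        |
| Pivotal Cloud Foundry Services         | MySQL v2                          |         |          |      |        |
|                                        | Redis                             |         |          |      |        |
|                                        | RabbitMQ                          |         |          |      |        |
|                                        | Single Sign On (SSO)              |         |          |      |        |
|                                        | Spring Cloud Services             |         |          |      |        |
|                                        | Concourse                         |         |          |      |        |
| Pivotal Cloud Foundry Partner Services | New Relic                         |         |          |      |        |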

Upgrade Services Tiles

Upgrade all service tiles to versions that are compatible with PCF v2.6. Service tiles are add-on products you install alongside your runtime. For example, MySQL for PCF, PCF Healthwatch, and RabbitMQ are service tiles.

Do not upgrade runtime tiles, such as PAS, PAS for Windows, or Pivotal Container Service (PKS), at this time.

To check version compatibility, see the Compatibility Matrix and Developer Guide.

Configure BOSH Director

With each new PCF release, BOSH Director may require specific updates before you upgrade. See the following section for the actions to take before upgrading to PCF v2.6.

Check Required Machine Specifications

Check the required machine specifications for Ops Manager v2.6. These specifications are specific to your IaaS. If these specifications do not match your existing Ops Manager, modify the values of your Ops Manager VM instance. For example, if the boot disk of your existing Ops Manager is 50 GB and the new Ops Manager requires 100 GB, then increase the size of your Ops Manager boot disk to 100 GB.

Configure PAS

With each new PCF release, PAS may require specific updates before you upgrade. See the following sections for the actions to take before upgrading to PCF v2.6.

Remove Healthcheck from SSH Load Balancer

If your SSH load balancer is configured with an HTTP healthcheck, remove the healthcheck before upgrading. This is the load balancer specified in the Diego Brain row of the Resource Config pane.

For more information, see HTTP Healthcheck Server Disabled for SSH Proxy.

(Optional) Disable Unused Errands

To save upgrade time, you can disable unused PAS post-deploy errands. For more information, see the Post-Deploy Errands section of the Errands topic. Only disable these errands if your environment does not need them.

In some cases, if you have previously disabled lifecycle errands for any installed product to reduce deployment time, you may want to re-enable these errands before upgrading. For more information, see the Adding and Importing Products section of the Adding and Deleting Products topic.

Configure Diego Cell Garbage Collection

In the Application Containers pane of the PAS tile, if Docker Images Disk-Cleanup Scheduling on Cell VMs is set to Clean up disk-space once threshold is reached and the value of Threshold of Disk-Used (MB) is different from the default of 10240, then you must specify a new threshold.

If the scheduling is set to Never clean up… or Routinely clean up…, or the threshold value is set to the default of 10240, then no action is necessary. All thresholds will migrate to a sensible value when you upgrade PAS. For more information, see Configuring Cell Disk Cleanup Scheduling.
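
The rule above can be expressed as a small check. The following is a minimal sketch: the function name needs_new_threshold and the mode labels never, routine, and threshold are hypothetical shorthand for the three scheduling options, not values used by the PAS tile.

```shell
#!/bin/bash
# Sketch only: needs_new_threshold is a hypothetical helper that encodes the
# rule described above. Arguments: a scheduling mode ("never", "routine", or
# "threshold"; labels chosen for this sketch) and the configured
# Threshold of Disk-Used (MB) value.
DEFAULT_THRESHOLD_MB=10240

needs_new_threshold() {
  local mode="$1" threshold_mb="$2"
  # Action is required only for threshold-based cleanup with a non-default value.
  [[ "$mode" == "threshold" && "$threshold_mb" -ne "$DEFAULT_THRESHOLD_MB" ]]
}
```

For example, needs_new_threshold threshold 20480 succeeds, which means you must specify a new threshold, while needs_new_threshold routine 20480 fails, which means no action is necessary.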

Check OS Compatibility of BOSH-Managed Add-Ons and Tiles

Before upgrading to PCF v2.6, operators who have deployed any PCF add-ons such as IPsec for PCF, ClamAV for PCF, or File Integrity Monitoring for PCF and who have deployed or are planning to deploy PAS for Windows must modify the add-on manifest to specify a compatible OS stemcell. For more information, see Pivotal Application Service for Windows.

For example, File Integrity Monitoring for PCF (FIM) is not supported on Windows. Therefore, the manifest must use an include directive to specify a target OS stemcell of ubuntu-trusty or ubuntu-xenial.

Note: To upgrade to a Xenial stemcell, see the documentation for each add-on and follow the instructions.

To update an add-on manifest:

  1. Locate your existing add-on manifest file. For example, for FIM, locate the fim.yml you uploaded to the Ops Manager VM.

  2. Add the following include directive to the manifest:

      include:
        stemcell:
          - os: ubuntu-xenial
    
  3. Upload the modified manifest file to your PCF deployment. For example instructions, see Create the FIM Manifest.
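
Before uploading, you may want a quick check that the manifest actually contains the include directive. The following is a rough sketch. The function name has_stemcell_include is hypothetical, and the check is a line-oriented text match rather than a real YAML parse, so treat a match as a hint and not as validation.

```shell
#!/bin/bash
# Sketch only: has_stemcell_include is a hypothetical helper. It does a
# line-oriented text check, not a YAML parse.
has_stemcell_include() {
  local manifest="$1" os="$2"
  # True when an "include:" line and a "- os: <os>" line both appear.
  grep -Eq '^[[:space:]]*include:[[:space:]]*$' "$manifest" &&
    grep -Eq "^[[:space:]]*-[[:space:]]*os:[[:space:]]*${os}[[:space:]]*\$" "$manifest"
}
```

For example, has_stemcell_include fim.yml ubuntu-xenial succeeds when both lines are present in fim.yml.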

If you use any other BOSH-managed add-ons in your deployment, you should verify OS compatibility for those components as well. For more information about configuring BOSH add-on manifests, see Addons Block in the BOSH documentation.

Check Backup and Restore External Blobstore Add-On

If you have enabled external blobstore backups for an Azure Blobstore using the Blobstore Add-On, you must update your runtime configuration to remove the sdk-preview add-on before upgrading. For more information, see the Azure Blobstores section of the Enabling External Blobstore Backups topic.

If you do not remove this job, upgrading PAS fails with the error:

Preparing deployment: Preparing deployment (00:00:01)
  L Error: Colocated job 'azure-blobstore-backup-restorer' is already added to the instance group 'backup-restore'.

After removing this job from your runtime configuration, ensure that the Enable backup and restore checkbox is selected in the File Storage pane of the PAS tile.

Check Certificate Authority Expiration Dates

Depending on the requirements of your deployment, you may need to rotate your Certificate Authority (CA) certificates. The non-configurable certificates in your deployment expire every two years. You must regenerate and rotate them so that critical components do not face a complete outage.

Note: PCF uses SHA-2 certificates and hashes by default. You can convert existing SHA-1 hashes into SHA-2 hashes by rotating your Ops Manager certificates using the procedure described in the Rotate Identity Provider SAML Certificates section of the Managing TLS Certificates topic.

To retrieve information about all the RSA and CA certificates for the BOSH Director and other products in your deployment, you can use the GET /api/v0/deployed/certificates?expires_within=TIME request of the Ops Manager API.

In this request, the expires_within parameter is optional. Its value is a number followed by a unit: d for days, w for weeks, m for months, or y for years. For example, to search for certificates expiring within one month, replace TIME with 1m:

$ curl "https://OPS-MAN-FQDN/api/v0/deployed/certificates?expires_within=1m" \
 -X GET \
 -H "Authorization: Bearer UAA_ACCESS_TOKEN"
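
If you script this request, it can help to validate the expires_within value before calling the API. The following is a minimal sketch. The function name cert_expiry_url is hypothetical, and the accepted pattern (a positive integer followed by d, w, m, or y) is an assumption based on the 1m example above.

```shell
#!/bin/bash
# Sketch only: cert_expiry_url is a hypothetical helper that validates the
# expires_within value and prints the request URL described above.
cert_expiry_url() {
  local opsman_fqdn="$1" within="$2"
  # Assumed format: a positive integer followed by d, w, m, or y.
  if ! [[ "$within" =~ ^[1-9][0-9]*[dwmy]$ ]]; then
    echo "expires_within must look like 30d, 2w, 1m, or 1y" >&2
    return 1
  fi
  printf 'https://%s/api/v0/deployed/certificates?expires_within=%s\n' \
    "$opsman_fqdn" "$within"
}
```

You could then pass the result to curl in place of the literal URL, for example curl "$(cert_expiry_url OPS-MAN-FQDN 1m)" with the same -X and -H options shown above.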

For information about regenerating and rotating CA certificates, see Rotating Certificates.

Check the Capacity of Your Deployment

The following sections describe steps for ensuring your deployment has adequate capacity to perform the upgrade.

Confirm Adequate Disk Space

Confirm that you have adequate disk space for your upgrades. You need at least 20 GB of free disk space to upgrade PCF Ops Manager and Pivotal Application Service. If you plan to upgrade other products, the amount of disk space required depends on how many tiles you plan to deploy to your upgraded PCF deployment.

To check current persistent disk usage, select the BOSH Director tile from the Installation Dashboard. Select Status and review the value of the PERS. DISK column. If persistent disk usage is higher than 50%, select Settings > Resource Config, and increase your persistent disk space to handle the size of the resources. If you do not know how much disk space to allocate, set the value to at least 100 GB.
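
The 50% guideline above can be captured in a small comparison for use in your own pre-upgrade scripts. The following is a sketch only. The function name disk_needs_increase is hypothetical, and it assumes that you supply the usage figure yourself, for example a value such as 62% read from the PERS. DISK column.

```shell
#!/bin/bash
# Sketch only: disk_needs_increase is a hypothetical helper. It takes a
# usage figure such as "62%" or "62" and succeeds when usage is above the
# limit (50 percent by default, matching the guideline in this section).
disk_needs_increase() {
  local usage="${1%\%}"      # strip a trailing percent sign, if any
  local limit="${2:-50}"     # threshold in percent
  if ! [[ "$usage" =~ ^[0-9]+$ ]]; then
    echo "not a percentage: $1" >&2
    return 2
  fi
  # 10# forces base 10 so that values such as "08" are not read as octal.
  (( 10#$usage > limit ))
}
```

For example, disk_needs_increase 62% succeeds, which indicates that you should increase persistent disk space, while disk_needs_increase 50% does not.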

Check Diego Cell RAM and Disk

Check that Diego Cells have sufficient available RAM and disk capacity to support app containers.

The Key Performance Indicators (KPIs) that monitor these resources are:

  • rep.CapacityRemainingMemory
  • rep.CapacityRemainingDisk

Adjust Diego Cell Limits

If needed, adjust the maximum number of Diego Cells that the platform can upgrade simultaneously, to avoid overloading the other Diego Cells. For more information, see the Limit PCF Component Instances During Restart section of the Configuring PAS for Upgrades topic.

For PCF v1.10 and later, max_in_flight, the maximum number of Diego Cells that can update at once, is 4%. This setting is configured in the BOSH manifest’s Diego Cell definition. For more information, see the Prevent Overload section of the Configuring PAS for Upgrades topic.

For more information about the rep.CapacityRemainingMemory and rep.CapacityRemainingDisk KPIs, see the Diego Cell Metrics section of the Key Performance Indicators topic.

Review File Storage IOPS and Other Upgrade Limiting Factors

During the PCF upgrade process, a large quantity of data is moved around on disk.

To ensure a successful upgrade of PCF, verify that your underlying PAS file storage is performant enough to handle the upgrade. For more information about the configurations to evaluate, see the Configure File Storage section of the Configuring PAS for Upgrades topic.

In addition to file storage IOPS, consider additional existing deployment factors that can impact overall upgrade duration and performance:

| Factor | Impact |
|--------|--------|
| Network latency | Network latency can contribute to how long it takes to move app instance data to new containers. |
| Number of ASGs | A large number of App Security Groups (ASGs) in your deployment can contribute to an increase in app instance container startup time. For more information, see App Security Groups. |
| Number of app instances and app growth | A large increase in the number of app instances and average droplet size since the initial deployment can increase the upgrade impact on your system. |

To see example upgrade-related performance measurements of an existing production PCF deployment, see Upgrade Load Example: Pivotal Web Services.

Run BOSH Clean-Up

Run bosh -e ALIAS clean-up --all to clean up old stemcells, releases, orphaned disks, and other resources before upgrade. This cleanup helps prevent the product and stemcell upload process from exceeding the BOSH Director’s available persistent disk space.

Check the Health of Your Deployment

The following sections describe steps for ensuring your deployment is healthy before you perform the upgrade.

Collect Foundation Health Status

For collecting foundation health status, Pivotal recommends PCF Healthwatch, which monitors and alerts on the current health, performance, and capacity of PCF. For more information, see PCF Healthwatch.

If you are not using PCF Healthwatch, you can do some or all of the following to collect foundation health status:

  • If your PCF deployment has external metrics monitoring set up, verify that VM CPU, RAM, and disk usage are within reasonable levels.

  • Run BOSH CLI commands to check system status:

    • bosh -e ALIAS -d DEPLOYMENT_NAME instances --ps. Running bosh instances with the --ps, --vitals, or --failing flag highlights individual job failures.
    • bosh -e ALIAS vms --vitals. This reveals VMs with high CPU, memory, or disk utilization, and VMs whose state is not running.
    • bosh -e ALIAS -d DEPLOYMENT_NAME cck --report. This reports problems without attempting to resolve them.

  • In the Ops Manager GUI, check the Status page of PAS and of each other tile for CPU, RAM, and disk utilization.

  • Validate that Ops Manager persistent disk usage is below 50%. If not, follow the procedure in Confirm Adequate Disk Space.
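
The three BOSH CLI checks in the list above can be run together from one helper. The following is a minimal sketch, not a supported script. The names run, check_foundation, and DRY_RUN are hypothetical. The sketch runs every check even if an earlier one fails, then returns a non-zero status if any check failed.

```shell
#!/bin/bash
# Sketch only: run, check_foundation, and DRY_RUN are hypothetical names.

# Print the command when DRY_RUN is set; otherwise execute it.
run() {
  if [[ -n "${DRY_RUN:-}" ]]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

check_foundation() {
  local env_alias="$1" deployment="$2" rc=0
  # Run all three checks, remembering whether any of them failed.
  run bosh -e "$env_alias" -d "$deployment" instances --ps || rc=1
  run bosh -e "$env_alias" vms --vitals || rc=1
  run bosh -e "$env_alias" -d "$deployment" cck --report || rc=1
  return "$rc"
}
```

For example, DRY_RUN=1 check_foundation ALIAS DEPLOYMENT_NAME prints the three bosh commands without contacting the BOSH Director.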

(Optional) Check the logs for errors before proceeding with the upgrade. For more information, see the Viewing Logs in the Command Line Interface section of the App Logging in Cloud Foundry topic.

Push and Scale a Test App

Check that a test app can be pushed and scaled horizontally, manually or through automated testing. This check ensures that the platform supports apps as expected before the upgrade.

Validate MySQL Cluster Health

If you are running PAS MySQL as a cluster, run the mysql-diag tool to validate health of the cluster.

For more information, see Running mysql-diag.

Review Pending and Recent Changes

To review pending and recent changes:

  1. Confirm there are no outstanding changes in Ops Manager or any other tile. All tiles should be green. Click Review Pending Changes, then Apply Changes if necessary.

  2. After applying changes, click Recent Install Logs to confirm that the changes completed cleanly:

    Cleanup complete
    {"type": "step_finished", "id": "clean_up_bosh.cleaning_up"}
    Exited with 0.
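
If you save the install log to a file, the clean-exit check can be automated. The following is a rough sketch. The function name install_log_clean is hypothetical, and it assumes that the log has been copied to a local text file whose last non-empty line is the exit status line shown above.

```shell
#!/bin/bash
# Sketch only: install_log_clean is a hypothetical helper. It assumes the
# install log has been saved to a local text file.
install_log_clean() {
  local log_file="$1"
  # The last non-empty line should report a zero exit status.
  grep -v '^[[:space:]]*$' "$log_file" | tail -n 1 |
    grep -Eq '^[[:space:]]*Exited with 0\.[[:space:]]*$'
}
```

For example, install_log_clean install.log succeeds only when the final non-empty line of install.log reads Exited with 0.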
    

Export Your Installation

To export your installation:

  1. In your Ops Manager Installation Dashboard, click the account dropdown and select Settings.

  2. On the Settings screen, select Export Installation Settings from the left menu, then click Export Installation Settings.

This exports the current PCF installation with all of its assets.

When you export an installation, the export contains the base VM images, necessary packages, and configuration settings, but does not include releases between upgrades if Ops Manager has already uploaded them to BOSH. When backing up PCF, you must take this into account by backing up the BOSH blobstore that contains the uploaded releases. BOSH Backup and Restore (BBR) backs up the BOSH blobstore. For more information, see Backing Up Pivotal Cloud Foundry with BBR.

  • The export time depends on the size of the exported file.
  • Some browsers do not provide feedback on the status of the export process and might appear to hang.

Note: Some operating systems automatically unzip the exported installation. If this occurs, create a ZIP file of the unzipped export. Do not start compressing at the “installation” folder level. Instead, start compressing at the level containing the config.yml file.

Warning: If you fail to perform the remedial steps for this issue, this upgrade process may corrupt your existing usage data.

Next Steps

Now that you have completed the Upgrade Preparation Checklist, continue to Upgrading Pivotal Cloud Foundry.

Complete Survey

Please take some time to help us improve this document by completing the Upgrade Checklist Survey.