Upgrade Preparation Checklist for PCF v2.5

This topic serves as a checklist for preparing to upgrade Pivotal Cloud Foundry (PCF) from v2.4 to v2.5.

This topic contains important preparation steps that you must follow before beginning your upgrade. Failure to follow these instructions may jeopardize your existing deployment data and cause the upgrade to fail.

After completing the steps in this topic, you can continue to Upgrading Pivotal Cloud Foundry.

Warning: Pivotal does not recommend that you skip minor versions when upgrading PCF. Skipping minor versions when upgrading PCF may result in breaking changes. To avoid additional breaking changes, upgrade PCF to the minor version that directly follows your current version of PCF.

Back Up Your PCF Deployment

Pivotal recommends backing up your PCF deployment before upgrading so that you can restore it if the upgrade fails. To do this, follow the instructions in the Backing Up Pivotal Cloud Foundry with BBR topic.

Find Your Decryption Passphrase for Ops Manager

To complete the Ops Manager upgrade, you must have your Ops Manager decryption passphrase. You defined this decryption passphrase during the initial installation of Ops Manager.

Review Changes in PCF v2.5

Review each of the following links to understand the changes in the new release, such as new features, known issues, and breaking changes.

Migrate Apps to cflinuxfs3

PAS v2.5 patch versions v2.5.10 and later support pre-existing apps that were staged with cflinuxfs2 buildpacks, but do not support restaging on cflinuxfs2. PAS versions v2.5.0 to v2.5.9 do not support apps running on cflinuxfs2.

If your PAS v2.4 deployment hosts cflinuxfs2 apps, Pivotal recommends that you upgrade to the latest v2.5 and encourage developers to migrate all their apps and upgrade any custom buildpacks to cflinuxfs3. If you upgrade PAS to an older v2.5 patch version, apps staged on cflinuxfs2 stop running.

To work with developers to migrate all PAS-hosted apps to cflinuxfs3, do the following:

  1. Install the Stack Auditor plugin for the cf CLI. See Install Stack Auditor in Using the Stack Auditor Plugin.

  2. Audit stack usage to determine which apps need to be migrated from cflinuxfs2 to cflinuxfs3. You can list apps and their stacks for each org you have access to by running the following command. To see all the apps in your deployment, ensure that you are logged in to the cf CLI as a user who can access all orgs.

    cf audit-stack

    See the following example output:

    $ cf audit-stack
    first-org/development/first-app cflinuxfs2 
    first-org/staging/first-app cflinuxfs2
    first-org/production/first-app cflinuxfs2
    second-org/development/second-app cflinuxfs3
    second-org/staging/second-app cflinuxfs3
    second-org/production/second-app cflinuxfs3
  3. Communicate to developers that they must migrate their existing apps to cflinuxfs3 and begin pushing all new apps on cflinuxfs3. When changing stacks, developers may see errors related to new or changed libraries. If they do, they must update their app accordingly. Developers can migrate their apps using two methods:

    • cf push: See Restaging Apps on a New Stack in Changing Stacks.
    • cf change-stack: See Change Stacks in Using the Stack Auditor Plugin. This method does not require the source code of the app.

      Note: When changing the stack, the app experiences downtime. Pivotal recommends changing the stack using a zero-downtime strategy or at a time when you feel downtime is acceptable. To avoid the brief downtime, use a blue-green strategy. See Using Blue-Green Deployment to Reduce Downtime and Risk.

  4. Confirm that there are no apps running on cflinuxfs2 using the cf audit-stack command.

  5. Delete all buildpacks in your deployment that are associated with cflinuxfs2.

  6. Delete the cflinuxfs2 stack. This action cannot be undone, with the following exception: If you delete cflinuxfs2 and update to a later patch release of PAS v2.4, the stack returns and you must delete it again before upgrading to PAS v2.5.

    1. Run the following command:

      cf delete-stack cflinuxfs2
    2. When prompted, type cflinuxfs2 and press Enter.

    See the following example:

    $ cf delete-stack cflinuxfs2
    Are you sure you want to remove the cflinuxfs2 stack? If so, type the name of the stack [cflinuxfs2]
    Deleting stack cflinuxfs2...
    Stack cflinuxfs2 has been deleted. 

    If you have any apps still running on cflinuxfs2, the command returns the following error:

    Failed to delete stack cflinuxfs2 with error: Please delete the app associations for your stack.   
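
The audits in steps 2 and 4 above can be narrowed to a single stack by filtering the command output. The following is a minimal sketch, assuming the output format shown in the example (one "ORG/SPACE/APP STACK" pair per line). The function name apps_on_stack is hypothetical and is not part of the Stack Auditor plugin.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads `cf audit-stack` output on stdin and prints the
# org/space/app path of every app staged on the given stack.
# Assumes each line has the form "ORG/SPACE/APP STACK", as in the example output.
apps_on_stack() {
  local stack="$1"
  # Field 1 is the app path, field 2 is the stack name.
  awk -v stack="$stack" '$2 == stack { print $1 }'
}

# Usage (requires the Stack Auditor plugin and a logged-in cf CLI):
#   cf audit-stack | apps_on_stack cflinuxfs2
```

If the filter prints nothing for cflinuxfs2, no apps visible to the logged-in user remain on that stack.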

Update Tiles and Add-Ons

The following section describes changes you must make to your product tiles and add-ons before upgrading PCF.

Review Tile Compatibility

Before you upgrade to PCF v2.5, check whether the service tiles that you currently have deployed on PCF v2.4 are compatible with PCF v2.5.

To check PCF versions supported by a service tile, either from Pivotal or a Pivotal partner:

  • Navigate to the tile’s download page on Pivotal Network.
  • Select the tile version in the Releases dropdown.
  • See the Depends On section under Release Details. For more information, refer to the tile’s release notes.

If the currently-deployed version of a tile is not compatible with PCF v2.5, you must upgrade the tile to a compatible version before you upgrade PCF. You do not need to upgrade tiles that are compatible with both PCF v2.4 and v2.5.

Some partner service tiles may be incompatible with PCF v2.5. Pivotal works with partners to ensure their tiles are updated to work with the latest versions of PCF. For more information about partner service release compatibility, review the Depends On section of the partner's tile download page or the partner's service release documentation in Pivotal Documentation, or contact the partner organization that produces the service tile.

The Product Compatibility Matrix provides an overview of which PCF versions support which versions of the most popular service tiles from Pivotal.

Environment Details

Pivotal provides the empty table below as a model to print out or adapt for recording and tracking the tile versions that you have deployed in all of your environments.

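Category | Product | Sandbox | Non-Prod | Prod | Other…
Pivotal Cloud Foundry | Ops Manager | | | |
Pivotal Cloud Foundry | Pivotal Application Service (PAS) | | | |
Pivotal Cloud Foundry Services | MySQL v2 | | | |
Pivotal Cloud Foundry Services | Single Sign On (SSO) | | | |
Pivotal Cloud Foundry Services | Spring Cloud Services | | | |
Pivotal Cloud Foundry Partner Services | New Relic | | | |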

Upgrade Services Tiles

Upgrade all service tiles to versions that are compatible with PCF v2.5. Service tiles are add-on products you install alongside your runtime. For example, MySQL for PCF, PCF Healthwatch, and RabbitMQ are service tiles.

Do not upgrade runtime tiles, such as PAS, PASW, or Pivotal Container Service (PKS), at this time.

Review the Compatibility Matrix and tile documentation to check version compatibility.

Install BOSH CLI v5.4.0 or Later

Install BOSH CLI v5.4.0 or later to avoid IP conflict errors that cause BOSH health check tasks to fail to acquire locks. For more information, see BOSH health check tasks fails to acquire lock in the Pivotal Knowledge Database.

Configure BOSH Director

With each release of a new PCF version, BOSH Director may require specific updates before upgrading to the new version. See the following for what action to take before upgrading to PCF v2.5:

Check Required Machine Specifications

Check the required machine specifications for Ops Manager v2.5. These specifications are specific to your IaaS. If these specifications do not match your existing Ops Manager, modify the values of your Ops Manager VM instance. For example, if the boot disk of your existing Ops Manager is 50 GB and the new Ops Manager requires 100 GB, then increase the size of your Ops Manager boot disk to 100 GB.

Ensure CA is Trusted by BOSH Director

Starting in PAS v2.4.3, Apps Manager verifies SSL certificates for endpoints to which it proxies. For environments using self-signed certificates or certificates that are signed by a certificate authority that is not trusted by the BOSH Director, this may cause Apps Manager to show no content.

If you are upgrading to PAS v2.5 from PAS v2.4.2 or earlier, you can prevent this issue by adding the certificate authority using the Trusted Certificates field in the Security pane of the BOSH Director tile.

Configure PAS

With each release of a new PCF version, PAS may require specific updates before upgrading to the new version. See the following sections for what action to take before upgrading to PCF v2.5:

Deselect Option to Override Deployment Name to CF

Ensure that the Use “cf” as deployment name in emitted metrics instead of unique name option in the Advanced Features pane of the PAS tile is deselected. This is a requirement to upgrade successfully.

Additionally, if you have PCF Healthwatch installed and you changed the value of Use “cf” as deployment name in emitted metrics instead of unique name in PAS, you must run the Push Monitoring Components errand for Healthwatch to detect the change.

For more information, see Removed Option to Override Deployment Name to CF.

Configure Diego Cell Garbage Collection

In the Application Containers pane of the PAS tile, if Docker Images Disk-Cleanup Scheduling on Cell VMs is set to Clean up disk-space once threshold is reached and the value of Threshold of Disk-Used (MB) has been changed from the default of 10240, then operators must pick a new threshold. If the scheduling is set to Never clean up… or Routinely clean up…, or the threshold value is set to the default, then no action is necessary and the threshold migrates to a sensible value. For more information, see Options for Disk Cleanup.

Configure Gorouter with TLS

Before upgrading to PAS v2.5, you must secure the Gorouter with TLS or mutual TLS for PAS and Isolation Segment tiles.

The following sections describe how to enable routing with TLS or mutual TLS and scale the Diego cell VM CPU and RAM. If you do not have enough RAM on your Diego cell VM after enabling TLS routing, you are unable to stage tasks and app instances. App instances may also stop running.

Note: Gorouter with TLS or mutual TLS is not supported in the PASW tile.

Step 1: Enable TLS or Mutual TLS Routing

To enable TLS or mutual TLS routing, do the following:

  1. From the Ops Manager Installation Dashboard, go to the PAS tile.

  2. Go to the Application Containers pane.

  3. Under Router application identity verification, select either of the following options:

    • Router uses TLS to verify application identity
    • Router and applications use mutual TLS to verify each other’s identity
  4. Click Save.

Step 2: Determine Number of App Instances

Before you scale your Diego cell VM to handle TLS routing, you must determine the number of app instances running on your deployment.

See the following methods for how you can count your app instances. Choose the method that corresponds to your use case.

  • Deployments without Isolation Segments

    1. Access your platform metrics with your configured monitoring tool or with the cf CLI Firehose nozzle plugin. For more information about the CLI Firehose nozzle plugin, see Installing the Loggregator Firehose Plugin for cf CLI. For more information about configuring a monitoring system, see Selecting and Configuring a Monitoring System.
    2. Find the LRPsDesired metric of the bbs job on the Diego Database VM. See the following example:
      origin:"bbs" eventType:ValueMetric
      timestamp:1541543212057232344 deployment:"cf" job:"diego_database"
      index:"b1f0c6d8-274b-4cfc-bfa1-a6feeb351802" ip:""
      tags:< key:"instance_id" value:"b1f0c6d8-274b-4cfc-bfa1-a6feeb351802" >
      tags:< key:"source_id" value:"bbs" > valueMetric:< name:"LRPsDesired"
      value:200 unit:"Metric" >
    3. Record the value of LRPsDesired for each instance of the bbs job. You need this value for the procedure in the next section.
  • Deployments with Isolation Segments

    1. Access your platform metrics with your configured monitoring tool or with the cf CLI Firehose nozzle plugin. For more information about the CLI Firehose nozzle plugin, see Installing the Loggregator Firehose Plugin for cf CLI. For more information about configuring a monitoring system, see Selecting and Configuring a Monitoring System.
    2. Find all ContainerCount metrics of the rep job of each Diego cell VM. See the following example:
      origin:"rep" eventType:ValueMetric
      timestamp:1541543910092859448 deployment:"cf" job:"diego_cell"
      index:"8007afda-3bff-4856-857f-a47a43cbf994" ip:""
      tags:< key:"instance_id" value:"8007afda-3bff-4856-857f-a47a43cbf994" >
      tags:< key:"source_id" value:"rep" > valueMetric:< name:
      "ContainerCount" value:200 unit:"Metric" >
    3. For each ContainerCount metric, record the value of the ContainerCount and the ip of the job.

Step 3: Scale Diego Cell VM

To support TLS or mutual TLS routing, you must have enough CPU and RAM for your Diego cell VM. TLS routing requires an additional 32 MB of RAM on your Diego cell per app instance.

To calculate and configure the amount of RAM you need for your Diego cell, choose one of the following methods for your use case:

  • Deployments without Isolation Segments

    For your PAS tile, do the following:

    1. Go to the Resource Config pane.
    2. In the Diego Cell row, see your current VM Type with the amount of RAM you currently have.
    3. Multiply the value you recorded in the previous section by 32 to get the additional RAM required, in MB. Add the result to the amount of RAM you currently have.
    4. Select your new VM Type based on the amount of RAM you need.
    5. Click Save.
  • Deployments with Isolation Segments

    For each PAS and Isolation Segment tile in your foundation, do the following:

    1. Go to the Status tab and see the IP of your Diego cell. To determine the value that corresponds to this tile, match the IP to the ip metric you recorded in the previous section.
    2. Go to the Resource Config pane of the tile.
    3. In the Diego Cell row, see your current VM Type with the amount of RAM you currently have.
    4. Multiply the value for this tile by 32 to get the additional RAM required, in MB. Add the result to the amount of RAM you currently have.
    5. Select your new VM Type based on the amount of RAM you need.
    6. Click Save.
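
The RAM arithmetic in the steps above can be sketched as a small shell helper. This is an illustration only: the function names are hypothetical, the 32 MB per app instance figure comes from this topic, and the sketch assumes the current RAM of the Diego cell VM type is expressed in MB.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for the Step 3 calculation.

# Prints the extra RAM (in MB) that TLS routing needs.
#   $1: app instance count recorded in Step 2 (LRPsDesired or ContainerCount)
additional_ram_mb() {
  local instances="$1"
  echo $(( instances * 32 ))
}

# Prints the total RAM (in MB) to look for when selecting a new VM Type.
#   $1: app instance count recorded in Step 2
#   $2: RAM of the current Diego cell VM type, in MB
required_cell_ram_mb() {
  local instances="$1" current_mb="$2"
  echo $(( current_mb + instances * 32 ))
}
```

For example, with a recorded value of 200 and a current VM type that has 16384 MB of RAM, required_cell_ram_mb 200 16384 prints 22784, so you would select a VM Type with at least that much RAM.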

(Optional) Migrate from Metrics Forwarder to Metric Registrar

The Metric Registrar allows app developers to export custom app metrics to Loggregator. Metrics Forwarder for PCF is a service that also allows apps to emit custom metrics.

Metrics Forwarder for PCF is not supported in PAS v2.5. If you use Metrics Forwarder for PCF to emit custom app metrics to Loggregator, Pivotal recommends enabling and configuring the Metric Registrar before upgrading to PAS v2.5. The PAS tile does not deploy the Metric Registrar by default.

To enable the Metric Registrar, do the following:

  1. In the PAS tile, click Metric Registrar.

  2. Select the Enable Metric Registrar checkbox.

For more information about enabling the Metric Registrar in PAS, see (Optional) Configure Metric Registrar in Configuring PAS.

For more information about the Metric Registrar, see Registering Custom App Metrics.

(Optional) Disable Unused Errands

To save upgrade time, you can disable unused PAS post-deploy errands. See the Post-Deploy Errands section of the Errands topic for details. Only disable these errands if your environment does not need them.

In some cases, if you have previously disabled lifecycle errands for any installed product to reduce deployment time, you may want to re-enable these errands before upgrading. For more information, see the Adding and Deleting Products topic.

Check OS Compatibility of BOSH-Managed Add-Ons and Tiles

Before upgrading to PCF v2.5, operators who have deployed any PCF add-ons such as IPsec for PCF, ClamAV for PCF, or File Integrity Monitoring for PCF and who have deployed or are planning to deploy PASW must modify the add-on manifest to specify a compatible OS stemcell. For more information, see Pivotal Application Service for Windows.

For example, File Integrity Monitoring for PCF (FIM) is not supported on Windows. Therefore, the manifest must use an include directive to specify the target OS stemcells ubuntu-trusty and ubuntu-xenial.

Note: To upgrade to a Xenial stemcell, see the documentation for each add-on and follow the instructions.

To update an add-on manifest, do the following:

  1. Locate your existing add-on manifest file. For example, for FIM, locate the fim.yml you uploaded to the Ops Manager VM.

  2. Modify the manifest to add the following include directive:

          - os: ubuntu-xenial
  3. Upload the modified manifest file to your PCF deployment. For example instructions, see Create the FIM Manifest.

If you use any other BOSH-managed add-ons in your deployment, you should verify OS compatibility for those components as well. For more information about configuring BOSH add-on manifests, see the BOSH documentation.

Check Backup and Restore External Blobstore Add-On

If you have enabled external blobstore backups for an Azure Blobstore using the Blobstore Add-On, you must update your runtime configuration to remove the sdk-preview add-on before upgrading to PCF v2.5. If you do not remove this job, upgrading PAS fails with the error:

Preparing deployment: Preparing deployment (00:00:01)
  L Error: Colocated job 'azure-blobstore-backup-restorer' is already added to the instance group 'backup-restore'.

After removing this job from your runtime configuration, ensure that the Enable backup and restore checkbox is selected in the File Storage pane of the PAS tile.

Check Certificate Authority Expiration Dates

Depending on the requirements of your deployment, you may need to rotate your Certificate Authority (CA) certificates. The non-configurable certificates in your deployment expire every two years. You must regenerate and rotate them so that critical components do not face a complete outage.

Note: PCF uses SHA-2 certificates and hashes by default. You can convert existing SHA-1 hashes into SHA-2 hashes by rotating your Ops Manager certificates using the procedure described in the Regenerating and Rotating Non-Configurable TLS/SSL Certificates section of Managing TLS Certificates.

To retrieve information about all the RSA and CA certificates for the BOSH Director and other products in your deployment, you can use the GET /api/v0/deployed/certificates?expires_within=TIME request of the Ops Manager API.

In this request, the expires_within parameter is optional. Valid values for the parameter are d for days, w for weeks, m for months, and y for years. For example, to search for certificates expiring within one month, replace TIME with 1m:

$ curl "https://OPS-MAN-FQDN/api/v0/deployed/certificates?expires_within=1m" \
 -X GET \
 -H "Authorization: Bearer UAA_ACCESS_TOKEN"
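
The request above can be wrapped in a function that checks the expires_within value before calling the API. This is a minimal sketch: the function names are hypothetical, and it assumes the value is a whole number followed by one of the units d, w, m, or y, as in the 1m example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if the value looks like "30d", "2w", "1m", or "1y".
# Assumption: a whole number followed by exactly one unit letter.
valid_expires_within() {
  printf '%s\n' "$1" | grep -Eq '^[0-9]+[dwmy]$'
}

# Hypothetical wrapper around the request shown above.
#   $1: Ops Manager FQDN (OPS-MAN-FQDN)
#   $2: UAA access token (UAA_ACCESS_TOKEN)
#   $3: expires_within value, for example 1m
expiring_certs() {
  local fqdn="$1" token="$2" within="$3"
  if ! valid_expires_within "$within"; then
    echo "invalid expires_within value: $within" >&2
    return 1
  fi
  curl "https://${fqdn}/api/v0/deployed/certificates?expires_within=${within}" \
    -X GET \
    -H "Authorization: Bearer ${token}"
}
```

Validating the value locally avoids sending a malformed query to Ops Manager; the API itself remains the authority on which values it accepts.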

For information about regenerating and rotating CA certificates, see Managing TLS Certificates.

Check the Capacity of Your Deployment

The following sections describe steps for ensuring your deployment has adequate capacity to perform the upgrade.

Confirm Adequate Disk Space

Confirm that you have adequate disk space for your upgrades. You need at least 20 GB of free disk space to upgrade PCF Ops Manager and Pivotal Application Service. If you plan to upgrade other products, the amount of disk space required depends on how many tiles you plan to deploy to your upgraded PCF deployment.

To check current persistent disk usage, select the BOSH Director tile from the Installation Dashboard. Select Status and review the value of the PERS. DISK column. If persistent disk usage is higher than 50%, select Settings > Resource Config, and increase your persistent disk space to handle the size of the resources. If you do not know how much disk space to allocate, set the value to at least 100 GB.

Check Diego Cell RAM and Disk

Check that Diego cells have sufficient available RAM and disk capacity to support app containers.

The KPIs that monitor these resources are:

  • rep.CapacityRemainingMemory
  • rep.CapacityRemainingDisk

Adjust Diego Cell Limits

If needed, adjust the maximum number of Diego cells that the platform can upgrade simultaneously, to avoid overloading the other cells. See Limit PCF Component Instances During Restart.

For PCF v1.10 and later, the maximum number of cells that can update at once, max_in_flight, is 4%. This setting is configured in the BOSH manifest’s Diego cell definition. For more information, see the Prevent Overload section.

Review the Diego Cell Metrics section of the KPI topic for more information about these KPIs.

Review File Storage IOPS and Other Upgrade Limiting Factors

During the PCF upgrade process, a large quantity of data is moved around on disk.

To ensure a successful upgrade of PCF, verify that your underlying PAS file storage is performant enough to handle the upgrade. For more information about the configurations to evaluate, see Upgrade Considerations for Selecting Pivotal Cloud Foundry Storage.

In addition to file storage IOPS, consider additional existing deployment factors that can impact overall upgrade duration and performance:

Factor | Impact
Network latency | Network latency can contribute to how long it takes to move app instance data to new containers.
Number of ASGs | A large number of Application Security Groups in your deployment can contribute to an increase in app instance container startup time.
Number of app instances and application growth | A large increase in the number of app instances and average droplet size since the initial deployment can increase the upgrade impact on your system.

To review example upgrade-related performance measurements of an existing production Cloud Foundry deployment, see the Pivotal Web Services Performance During Upgrade topic.

Run BOSH Clean-Up

Run bosh -e ALIAS clean-up --all to clean up old stemcells, releases, orphaned disks, and other resources before upgrade. This cleanup helps prevent the product and stemcell upload process from exceeding the BOSH Director’s available persistent disk space.

Check the Health of Your Deployment

The following sections describe steps for ensuring your deployment is healthy before you perform the upgrade.

Collect Foundation Health Status

For collecting foundation health status, Pivotal recommends PCF Healthwatch, which monitors and alerts on the current health, performance, and capacity of PCF. For more information, see the PCF Healthwatch documentation.

If you are not using PCF Healthwatch, you can do some or all of the following to collect foundation health status:

  • If your PCF deployment has external metrics monitoring set up, verify that VM CPU, RAM, and disk usage are within reasonable levels.
  • Run BOSH CLI commands to check system status:
    • bosh -e ALIAS -d DEPLOYMENT_NAME instances --ps.

      bosh instances with the flags --ps, --vitals, or --failing highlights individual job failure.

    • bosh -e ALIAS vms --vitals. This reveals VMs with high CPU, high memory, high disk utilization, and with state != running.
    • bosh -e ALIAS -d DEPLOYMENT_NAME cck --report
  • In Ops Manager, check the Status page of PAS and of each other tile for CPU, RAM, and disk utilization.
  • Validate Ops Manager persistent disk usage is below 50%. If not, follow the procedure in Confirm Adequate Disk Space.
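
The BOSH CLI checks in the list above can be grouped into a single function so that they run in the same order each time. This is a sketch under stated assumptions: the function name is hypothetical, ALIAS and DEPLOYMENT_NAME are passed as arguments, and the BOSH CLI is already installed and logged in.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: runs the BOSH status checks listed above, in order.
#   $1: BOSH environment alias (ALIAS)
#   $2: deployment name (DEPLOYMENT_NAME)
foundation_health_checks() {
  local env_alias="$1" deployment="$2"
  # Per-instance process state for the deployment.
  bosh -e "$env_alias" -d "$deployment" instances --ps
  # CPU, memory, and disk vitals for all VMs in the environment.
  bosh -e "$env_alias" vms --vitals
  # Cloud check in report-only mode.
  bosh -e "$env_alias" -d "$deployment" cck --report
}

# Usage:
#   foundation_health_checks ALIAS DEPLOYMENT_NAME
```

The wrapper only collects output; reviewing it for failing jobs or VMs that are not in the running state is still a manual step.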

(Optional) Check the logs for errors before proceeding with the upgrade. For more information, see Viewing Logs in the Command Line Interface.

Push and Scale a Test App

Check that a test app can be pushed and scaled horizontally, manually or through automated testing. This check ensures that the platform supports apps as expected before the upgrade.

Validate MySQL Cluster Health

If you are running PAS MySQL as a cluster, run the mysql-diag tool to validate the health of the cluster.

See the BOSH CLI v2 instructions in the Running mysql-diag topic.

Review Pending and Recent Changes

  1. Confirm there are no outstanding changes in Ops Manager or any other tile. All tiles should be green. Click Review Pending Changes, then Apply Changes if necessary.

  2. After applying changes, click Recent Install Logs to confirm that the changes completed cleanly:

    Cleanup complete
    {"type": "step_finished", "id": "clean_up_bosh.cleaning_up"}
    Exited with 0.

Export Your Installation

To export your installation, do the following:

  1. In your Ops Manager v2.4 Installation Dashboard, click the account dropdown and select Settings.

  2. On the Settings screen, select Export Installation Settings from the left menu, then click Export Installation Settings.

This exports the current PCF installation with all of its assets.

When you export an installation, the export contains the base VM images, necessary packages, and configuration settings, but does not include releases between upgrades if Ops Manager has already uploaded them to BOSH. When backing up PCF, you must take this into account by backing up the BOSH blobstore that contains the uploaded releases. BOSH Backup and Restore (BBR) backs up the BOSH blobstore. For more information, see Backing Up Pivotal Cloud Foundry with BBR.

  • The export time depends on the size of the exported file.
  • Some browsers do not provide feedback on the status of the export process and might appear to hang.

Note: Some operating systems automatically unzip the exported installation. If this occurs, create a ZIP file of the unzipped export. Do not start compressing at the “installation” folder level. Instead, start compressing at the level containing the config.yml file.

WARNING: If you fail to perform the remedial steps for this issue, this upgrade process may corrupt your existing usage data.

Next Steps

Now that you have completed the Upgrade Preparation Checklist for PCF v2.5, continue to Upgrading Pivotal Cloud Foundry.

Complete Survey

Please take some time to help us improve this document by completing the Upgrade Checklist Survey.