What Happens During PAS Upgrades
This topic explains what happens to Pivotal Application Service (PAS) components and apps during a PAS upgrade.
During a PAS upgrade, BOSH drains all Diego Cell VMs that host app instances. BOSH manages this process by upgrading one batch of cells at a time.
When BOSH triggers an upgrade, each upgrading Diego Cell enters evacuation mode. In evacuation mode, BOSH stops the Diego Cell and then schedules replacements for its app instances.
For more information, see the Specific Guidance for Diego Cells section of the Configuring PAS for Upgrades topic.
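The batch-by-batch behavior described above can be pictured with a small simulation. This is a minimal sketch in Python, not BOSH or Diego code: the cell names, the `batch_size` parameter, and the rule that evacuated instances move to the least-loaded cell outside the current batch are all assumptions made for illustration.

```python
from typing import Dict, List


def rolling_upgrade(cells: Dict[str, List[str]], batch_size: int) -> List[List[str]]:
    """Simulate upgrading Diego Cells one batch at a time.

    `cells` maps a hypothetical cell name to the app instances it hosts.
    The mapping is updated in place; the return value is the list of
    batches in the order they were upgraded.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    names = list(cells)
    batches = []
    for start in range(0, len(names), batch_size):
        batch = names[start:start + batch_size]
        # Cells outside the current batch can take over evacuated instances.
        survivors = [n for n in names if n not in batch]
        for cell in batch:
            # Evacuation: the cell gives up its instances before it stops.
            evacuated, cells[cell] = cells[cell], []
            for instance in evacuated:
                if not survivors:
                    # No other cell exists, so the instance is down until
                    # this cell returns (assumed to come back on the same cell).
                    print(f"{instance}: downtime while {cell} upgrades")
                    cells[cell].append(instance)
                    continue
                # Assumed placement rule: pick the least-loaded surviving cell.
                target = min(survivors, key=lambda n: len(cells[n]))
                cells[target].append(instance)
        batches.append(batch)
    return batches


if __name__ == "__main__":
    layout = {"cell-0": ["app-a", "app-b"], "cell-1": ["app-c"], "cell-2": []}
    print(rolling_upgrade(layout, batch_size=1))
    print(layout)
```

In this model, running the function against a single cell prints a downtime line for every instance, while three cells upgrade without any instance losing its host.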
cf push is mostly available for the duration of a PAS upgrade. However, cf push can become unavailable when a single VM is in use or during BOSH Backup and Restore (BBR).
For more information, see cf push Availability During Pivotal Application Service Upgrades.
This section describes the order in which Ops Manager upgrades components and runs tasks during a full platform upgrade. It also explains how the scale of different PAS components affects uptime during upgrades, and which components are scalable.
When performing an upgrade, Ops Manager first upgrades individual components, and then runs one-time tasks.
The Components section describes how Ops Manager upgrades PAS components and explains how individual component upgrades affect broader PAS capabilities.
The One-Time Tasks section lists the tasks that Ops Manager runs after it upgrades the PAS components.
Ops Manager upgrades PAS components in a fixed order that honors component dependencies and minimizes downtime and other system limitations during the upgrade process.
The type and duration of downtime and other limitations that you can expect during a PAS upgrade reflect the following:
- Component instance scaling. See How Single-Component Scaling Affects Upgrades.
- Component upgrade order. See Component Upgrade Order and Behavior.
In Pivotal Cloud Foundry (PCF) Ops Manager, the Resource Config pane in the PAS tile shows the components that the BOSH Director installs:
- Scalable component fields let you select the instance count from a range of settings or enter a custom value.
- Unscalable component fields allow a maximum of one instance.
When a component is scaled at a single instance, it can experience downtime and other limitations while the single VM restarts. This behavior might be acceptable for a test environment. To avoid downtime in a production environment, you should scale any scalable components, such as HAProxy, Router, and Diego Cells, to more than one instance.
For more information about how the scale of each component affects upgrade behavior, see Component Upgrade Order and Behavior.
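As a rough pre-upgrade check, the scaling guidance above can be expressed as a small function that flags scalable components left at one instance. This is only a sketch: the instance counts are hypothetical values an operator might copy from the Resource Config pane, and the `scalable` set contains just the three components named above, not the full list.

```python
from typing import Dict, List, Set


def single_instance_risks(instance_counts: Dict[str, int], scalable: Set[str]) -> List[str]:
    """Return the scalable components that currently run as a single instance."""
    return sorted(
        name
        for name, count in instance_counts.items()
        if name in scalable and count == 1
    )


if __name__ == "__main__":
    # Hypothetical counts; replace with the values from your own deployment.
    counts = {"HAProxy": 1, "Router": 3, "Diego Cell": 1, "Backup Restore Node": 1}
    scalable = {"HAProxy", "Router", "Diego Cell"}
    print(single_instance_risks(counts, scalable))
```

Unscalable components are deliberately ignored, because their instance count cannot be raised above one.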
Note: A full Ops Manager upgrade may take close to two hours, and you will have limited ability to deploy an app during this time.
The table below lists components in the order that Ops Manager upgrades each. It also lists which components are scalable and explains how component downtime affects PAS app and control availability. The table includes the following columns:
Scalable: Indicates whether the component is scalable above a single instance.
Note: For components marked with a checkmark in this column, we recommend that you change the preconfigured instance value of 1 to a value that best supports your production environment. For more information about scaling a deployment, see High Availability in PAS.
Extended Downtime: Indicates that if there is only one instance of the component, that component is unavailable for up to five minutes during an Ops Manager upgrade.
Downtime Affects…: Indicates the plane of the PAS platform that component downtime affects if the component is scaled at a single instance:
- Apps: Downtime can affect app availability.
- Platform: Apps remain available during component downtime, but you cannot push, stage, or restart apps, or run other Cloud Foundry command-line interface (cf CLI) commands.
Other Limitations and Information: Provides the following information:
- Component availability, behavior, and usage during an upgrade
- Guidance on disabling the component before an upgrade
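The column definitions above can be read as a simple rule: a component scaled at a single instance may cause downtime on the Apps plane, the Platform plane, or both. The sketch below encodes that rule in Python. The `Component` type and its field names are hypothetical, and treating any component with more than one instance as causing no downtime is a simplification of the guidance in this topic.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Component:
    name: str
    instances: int
    affects_apps: bool       # hypothetical flag for the Apps plane
    affects_platform: bool   # hypothetical flag for the Platform plane


def expected_impact(component: Component) -> List[str]:
    """List the planes that may see downtime while this component upgrades."""
    if component.instances > 1:
        # Simplifying assumption: other instances keep serving while one VM restarts.
        return []
    impact = []
    if component.affects_apps:
        impact.append("Apps: app availability can be affected")
    if component.affects_platform:
        impact.append("Platform: cannot push, stage, or restart apps")
    return impact


if __name__ == "__main__":
    # Flags chosen for illustration; check the table for each component's values.
    cell = Component("Diego Cell", instances=1, affects_apps=True, affects_platform=True)
    for line in expected_impact(cell):
        print(line)
```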
|Component||Scalable||Extended Downtime||Downtime Affects…||Other Limitations and Information|
|MySQL Proxy||✓||✓||✓||The MySQL Proxy is responsible for managing failover of the MySQL Servers. If the Proxy becomes unavailable, then access to the MySQL Server could be broken.|
|MySQL Server||✓||✓||✓||The MySQL Server is responsible for persisting internal databases for the platform. If the MySQL Server becomes unavailable, then platform services that rely upon a database (Cloud Controller, UAA) will also become unavailable.|
|Backup Restore Node|
|UAA||✓||✓||If a user has an active authorization token prior to performing an upgrade, they can still log in using either a UI or the CLI.|
|HAProxy||✓||✓||✓||HAProxy is used to load-balance incoming requests to the Router. If HAProxy is unavailable, you may lose the ability to make requests to apps unless there is another routing path from your load balancer to the Router.|
|Router||✓||✓||✓||The Router is responsible for routing requests to their app containers. If the Router is not available, then apps cannot receive requests.|
|Cloud Controller Worker||✓||✓|
|Diego Cell||✓||✓||✓||✓||If you only have one Diego Cell, upgrading causes downtime for all apps that run on it. These include apps pushed with
|Loggregator Trafficcontroller||✓||Ops Manager operators experience 2-5 minute gaps in logging.|
|Doppler Server||✓||Ops Manager operators experience 2-5 minute gaps in logging.|
|TCP Router (if enabled)||✓|
|CredHub||✓||✓||✓||✓||App downtime for apps that use secure credentials. Platform downtime for cf CLI commands such as
|Istio Router||✓||✓||✓||Downtime for this component only affects routes on mesh domains.|
|Route Syncer||✓||✓||✓||Downtime for this component only affects routes on mesh domains.|
After Ops Manager upgrades components, it performs system checks and launches UI apps and other PAS components as PCF apps. These tasks run in the following order:
|1||Apps Manager Errand - Push Apps Manager|
|2||Smoke Test Errand - Run smoke tests|
|3||Usage Service Errand - Push Usage Service app|
|4||Notifications Errand - Push Notifications app|
|5||Notifications UI Errand - Push Notifications UI|
|6||App Autoscaler Errand - Push App Autoscaler|
|7||App Autoscaler Smoke Test Errand - Run smoke tests against App Autoscaler|
|8||Register Autoscaling Service Broker|
|9||Destroy Autoscaling Service Broker|
|10||Bootstrap Errand - Recover MySQL cluster|
|11||MySQL Rejoin Unsafe Errand|
For sample performance measurements of an upgrading PCF installation, see Upgrade Load Example: Pivotal Web Services.