Pivotal Cloud Foundry v1.9

Upgrade Considerations for Selecting File Storage in Pivotal Cloud Foundry

This topic describes critical factors to consider when evaluating the type of file storage to use in your Pivotal Cloud Foundry (PCF) deployment. The Elastic Runtime blobstore relies on the file storage system to read and write resources, app packages and droplets.

During an upgrade of PCF, file storage with insufficient IOPS numbers can negatively impact the performance and stability of your PCF deployment. However, the minimum required IOPS depends upon a number of deployment-specific factors and configuration choices.

Use this topic as a guide when deciding on the file storage configuration for your deployment.

Selecting Internal or External File Storage

When you deploy PCF, you can select either internal file storage or external file storage (network-accessible or IaaS-provided) as an option in the Elastic Runtime tile.

Selecting internal storage causes PCF to deploy a dedicated VM that uses either NFS or WebDAV for file storage. Selecting external storage allows you to configure file storage provided in a network-accessible location or by an IaaS, such as Amazon S3, Google Cloud Storage, or Azure Storage.

Whenever possible, Pivotal recommends using external file storage.

Calculating Potential Disk Load Requirements

A best-effort way to determine how performant your file storage needs to be, in terms of IOPS, is to make a rough estimate of how much total data must move during a system upgrade.

Number of Diego Cells

As a first calculation, determine the number of Diego cells that your deployment is currently using.

To view the number of Diego cell instances currently running in your deployment, see the Resource Config section of your Elastic Runtime tile.

If you expect to scale up the number of instances, use the anticipated scaled number.

Note: If your deployment uses more than 20 Diego cells, you should avoid the use of internal file storage and always select external or IaaS-provided file storage.

Maximum In-Flight Load and Container Starts for Diego Cells

Operators can limit the number of containers and Diego cell instances that Diego starts concurrently. If no limits are imposed, then your file storage may undergo exceptionally heavy load during an upgrade.

To prevent overload, Cloud Foundry provides two major throttle configurations:

  • The maximum number of containers that Diego can start concurrently in Cloud Foundry. This is a deployment-wide limit. The default value and the ability to override this configuration depend on the version of Cloud Foundry deployed. For information on how to configure this setting, see the Setting a Maximum Number of Started Containers topic.

  • The max_in_flight setting for the Diego cell job configured in the BOSH manifest. This configuration, expressed as a percentage or an integer, sets the maximum number of job instances that can be upgraded simultaneously. For example, if your deployment is running 10 Diego cell job instances and the configured max_in_flight value is 20%, then only 2 Diego cell job instances can start up at a single time.
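The max_in_flight arithmetic in the example above can be sketched in a few lines of Python. This is an illustrative helper, not part of BOSH or PCF: the function name `max_concurrent_cells` is hypothetical, and the rounding behavior (whole-number percentages, rounded down, with a floor of one instance) is an assumption that may differ from what BOSH actually does.

```python
def max_concurrent_cells(total_cells: int, max_in_flight) -> int:
    """Estimate how many Diego cell instances can be updated at once.

    max_in_flight is either an integer (absolute count) or a string
    such as "20%" (percentage of total instances).

    Assumption: percentages are whole numbers and round down, and at
    least one instance is always allowed to update.
    """
    if isinstance(max_in_flight, str) and max_in_flight.endswith("%"):
        percent = int(max_in_flight[:-1])
        count = total_cells * percent // 100
    else:
        count = int(max_in_flight)
    # Never report fewer than 1 or more than the number of cells.
    return max(1, min(count, total_cells))


# The example from the text: 10 cells with max_in_flight of 20%.
print(max_concurrent_cells(10, "20%"))  # prints 2
```

Accepting both forms in one helper mirrors the way the setting is described: either a percentage or an integer.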

The values of the above configurations depend on the version of PCF that you have deployed and whether you have overridden the default values.

Refer to the following table for the existing defaults and, if necessary, determine the override values in your deployment.

PCF Version | Starting Container Count Maximum | Starting Container Count Overridable? | Maximum In Flight Diego Cell Instances | Maximum In Flight Diego Cell Instances Overridable?
PCF 1.7.43 and earlier | No limit set | No | 1 instance | No
PCF 1.7.44 to 1.7.49 | 200 | No | 1 instance | No
PCF 1.7.50 + | 200 | No | 1 instance | No
PCF 1.8.0 to 1.8.29 | No limit set | No | 10% of total instances | No
PCF 1.8.30 + | 200 | Yes | 10% of total instances | No
PCF 1.9.0 to 1.9.7 | No limit set | No | 10% of total instances | Yes
PCF 1.9.8 + | 200 | Yes | 10% of total instances | Yes
PCF 1.10.0 and later | 200 | Yes | 10% of total instances | Yes

Calculating Upgrade Load Based on Number of App Instances and Droplet Size

Using the above numbers, you can determine a rough estimate of the expected upgrade load by multiplying the total number of app instances expected to start across all cells by the average size of an app instance droplet.

For example, if your deployment starts up 10 cells that each host 20 app instances and each app instance droplet is an average of 100 MB in size, then you potentially have 20 GB of data hitting the disk at the same time.
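The multiplication in this example can be written as a short Python sketch. The function name `upgrade_load_mb` is hypothetical, and the conversion assumes 1 GB = 1000 MB, which matches the rounding used in the example.

```python
def upgrade_load_mb(cells_starting: int,
                    instances_per_cell: int,
                    avg_droplet_mb: float) -> float:
    """Rough estimate of droplet data (in MB) requested from file
    storage when the given cells start their app instances together."""
    return cells_starting * instances_per_cell * avg_droplet_mb


# The example from the text: 10 cells, 20 instances each, 100 MB droplets.
load_mb = upgrade_load_mb(10, 20, 100)
print(f"{load_mb / 1000:.0f} GB")  # prints "20 GB"
```

To reflect a max_in_flight throttle, pass the number of cells that can start concurrently rather than the total number of cells in the deployment.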

Depending on the IOPS capacity of your disk, this 20 GB of data takes a certain amount of time to reassemble on a new disk. If this disk processing time is longer than the evacuation timeout for Diego cells, then Diego cells and app instances may take too long to start up, resulting in a cascading failure.
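As a rough illustration of this comparison, the following Python sketch estimates how long file storage needs to serve a given load and checks the result against a timeout. It simplifies IOPS to a sustained throughput in MB per second. The function names, the throughput figures, and the 600-second timeout are all hypothetical values chosen for the example, not PCF defaults or measured numbers; substitute the values from your own deployment.

```python
def seconds_to_serve(load_mb: float, throughput_mb_per_s: float) -> float:
    """Time needed to serve load_mb at a sustained throughput.

    Assumption: the storage delivers a constant throughput, which is a
    simplification of real IOPS-bound behavior.
    """
    return load_mb / throughput_mb_per_s


def exceeds_timeout(load_mb: float,
                    throughput_mb_per_s: float,
                    evacuation_timeout_s: float) -> bool:
    """True if serving the load takes longer than the evacuation timeout."""
    return seconds_to_serve(load_mb, throughput_mb_per_s) > evacuation_timeout_s


# 20 GB (20000 MB) of droplets, as in the example in the text.
# The throughput and timeout values below are illustrative only.
print(exceeds_timeout(20000, 50, 600))  # 400 s needed: prints False
print(exceeds_timeout(20000, 25, 600))  # 800 s needed: prints True
```

If the check returns True for your estimates, reducing the number of cells or containers that start concurrently, or moving to faster external file storage, lowers the risk described above.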

For more information on how Diego cells are upgraded, see Managing Diego Cell Limits During an Upgrade.
