About App Autoscaler

This topic describes App Autoscaler. It includes information about default and custom scaling rules as well as App Autoscaler architecture.


App Autoscaler is a marketplace service that automatically scales apps in your environment based on app performance metrics or a schedule. This helps control the cost of running apps while maintaining app performance.

You can use App Autoscaler to do the following:

  • Configure scaling rules that adjust app instance counts based on metrics thresholds
  • Modify the maximum and minimum number of instances for an app, either manually or following a schedule

For example, you can configure App Autoscaler to automatically scale down the number of instances for an app over the weekend. You can also configure App Autoscaler to automatically scale up the number of instances for an app when the value of the CPU Usage metric increases above a custom threshold.
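The weekend example above can be pictured as a lookup from the day of the week to a pair of instance limits. The following is a minimal sketch only: the function name and the limit values are hypothetical, and it does not reflect how App Autoscaler stores or evaluates schedules.

```python
from datetime import date
from typing import Tuple

# Hypothetical (minimum, maximum) instance limits, chosen for illustration only.
WEEKDAY_LIMITS = (2, 10)
WEEKEND_LIMITS = (1, 3)


def instance_limits(day: date) -> Tuple[int, int]:
    """Return the (minimum, maximum) instance limits that apply on `day`."""
    # date.weekday() returns 5 for Saturday and 6 for Sunday.
    return WEEKEND_LIMITS if day.weekday() >= 5 else WEEKDAY_LIMITS


print(instance_limits(date(2024, 1, 6)))  # a Saturday
print(instance_limits(date(2024, 1, 8)))  # a Monday
```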

Breaking Change: App Autoscaler relies on API endpoints from Loggregator’s Log Cache. If you disable Log Cache, App Autoscaler fails. For more information about Log Cache, see Loggregator Introduces Log Cache in Pivotal Application Service v2.2 Release Notes.

About App Autoscaler Scaling Rules

This section describes how App Autoscaler determines when to scale an app up or down.

It also provides information about the custom metrics, comparison metrics, and default metrics that you can use when you create scaling rules for an app in App Autoscaler.

How App Autoscaler Determines When to Scale

Every 35 seconds, App Autoscaler makes a decision about whether to scale up, scale down, or keep the same number of instances.

To make a scaling decision, App Autoscaler averages the values of a given metric for the most recent 120 seconds.

Note: Operators can modify the 35-second scaling interval and the 120-second metric collection interval for all apps within the org. For more information, see (Optional) Configure App Autoscaler in Configuring PAS.
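As an illustration of the averaging step, the sketch below computes the mean of the samples that fall within the most recent 120 seconds. It assumes samples arrive as (timestamp in seconds, value) pairs; the function name and the sample data are hypothetical and are not taken from App Autoscaler itself.

```python
from typing import Iterable, Optional, Tuple

WINDOW_SECONDS = 120  # default metric collection interval described above


def window_average(
    samples: Iterable[Tuple[float, float]],
    now: float,
    window: float = WINDOW_SECONDS,
) -> Optional[float]:
    """Average the values of (timestamp, value) samples from the last `window` seconds.

    Returns None when no sample falls inside the window.
    """
    recent = [value for ts, value in samples if now - window <= ts <= now]
    if not recent:
        return None
    return sum(recent) / len(recent)


# Hypothetical HTTP latency samples in milliseconds.
samples = [(0, 500.0), (100, 210.0), (160, 230.0), (210, 220.0)]
# The sample at t=0 is older than 120 seconds and is ignored.
print(window_average(samples, now=220))
```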

App Autoscaler scales apps as follows:

  • Increment by one instance when any metric exceeds its maximum threshold.
  • Decrement by one instance only when all metrics fall below their minimum thresholds.
  • Keep the same number of instances when no metric exceeds its maximum threshold and at least one metric is not below its minimum threshold.
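The three rules above can be sketched as a single function that returns the change in instance count. This is an illustrative sketch, not App Autoscaler's implementation; the `Thresholds` type, the function name, and the metric names are assumptions made for the example.

```python
from typing import Dict, NamedTuple


class Thresholds(NamedTuple):
    minimum: float
    maximum: float


def scaling_delta(averages: Dict[str, float], rules: Dict[str, Thresholds]) -> int:
    """Return +1 to scale up, -1 to scale down, or 0 to keep the instance count."""
    if not rules:
        return 0
    # Scale up when any metric exceeds its maximum threshold.
    if any(averages[name] > rule.maximum for name, rule in rules.items()):
        return 1
    # Scale down only when all metrics fall below their minimum thresholds.
    if all(averages[name] < rule.minimum for name, rule in rules.items()):
        return -1
    return 0


# Hypothetical rules: HTTP latency in milliseconds and CPU utilization in percent.
rules = {
    "http_latency": Thresholds(minimum=80, maximum=200),
    "cpu": Thresholds(minimum=20, maximum=75),
}
print(scaling_delta({"http_latency": 220, "cpu": 30}, rules))  # one metric is above its maximum
print(scaling_delta({"http_latency": 70, "cpu": 30}, rules))   # only one metric is below its minimum
print(scaling_delta({"http_latency": 70, "cpu": 10}, rules))   # all metrics are below their minimums
```

Checking the scale-up condition first means that a single metric above its maximum takes priority, which matches the "any" versus "all" wording of the rules.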

The following diagram provides an example of how App Autoscaler makes scaling decisions:

[Diagram: example of App Autoscaler scaling decisions for an HTTP latency metric. A description follows.]

As shown in the diagram above, an app has a maximum threshold of 200 milliseconds and a minimum threshold of 80 milliseconds for an HTTP latency metric.

If HTTP latency averages 220 milliseconds for 120 seconds, App Autoscaler scales the app up one instance.

If HTTP latency then averages 70 milliseconds over the next 120-second window and the app’s other scaling metrics also fall below their minimum thresholds, App Autoscaler scales the app down one instance.

If the average value for HTTP latency over a 120-second window is below the maximum threshold of 200 milliseconds and above the minimum threshold of 80 milliseconds, App Autoscaler maintains the same number of instances for the app.

You can also set a maximum and minimum number of instances. For example, if an app exceeds the maximum threshold of a given metric, but the number of instances is already at the maximum number of allowed instances, App Autoscaler does not scale up the app.
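To make the HTTP latency example and the instance limits concrete, the sketch below walks through a series of hypothetical window averages. It assumes HTTP latency is the app's only scaling rule and uses made-up limits of 1 to 4 instances; the function names are illustrative, not part of App Autoscaler.

```python
MIN_LATENCY_MS = 80   # minimum threshold from the example above
MAX_LATENCY_MS = 200  # maximum threshold from the example above


def latency_delta(average_ms: float) -> int:
    """Return the scaling step for a single HTTP latency rule."""
    if average_ms > MAX_LATENCY_MS:
        return 1
    if average_ms < MIN_LATENCY_MS:
        return -1
    return 0


def next_instance_count(current: int, delta: int, min_instances: int, max_instances: int) -> int:
    """Apply a scaling step and keep the result within the instance limits."""
    return max(min_instances, min(max_instances, current + delta))


instances = 3
history = []
# Hypothetical averages for consecutive 120-second windows, in milliseconds.
for average_ms in (220, 70, 150, 220, 220):
    instances = next_instance_count(instances, latency_delta(average_ms), 1, 4)
    history.append(instances)

# The last window exceeds the maximum threshold, but the app is already at
# the maximum of 4 instances, so the count does not change.
print(history)
```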

Default Metrics for Scaling Rules

App Autoscaler includes several default metrics for which you can create scaling rules.

Note: Pivotal recommends that you define custom metrics for scaling rules instead of using the default metrics. Custom metrics allow you to more accurately monitor the performance of your apps based on your environment.

The table below lists the default metrics for App Autoscaler:

| Metric | Description | Notes |
| --- | --- | --- |
| CPU Utilization | Average CPU percentage for all instances of the app. | App CPU utilization data may vary greatly based on the number of CPU cores on Diego Cells and app density. For more information, see App Autoscaler advisory for scaling Apps based on the CPU utilization in the Knowledge Base. |
| Container Memory Utilization | Average memory percentage for all instances of the app. | |
| HTTP Throughput | Total HTTP requests per second (divided by the total number of app instances). | Pivotal does not recommend using http_throughput as a scaling rule when logging volume is high in the system. For more information, see Autoscaling using HTTP Throughput & Latency metrics and HTTP throughput based Autoscaling rules do not fire in the Knowledge Base. |
| HTTP Latency | Average latency of the app's responses to HTTP requests. This does not include Gorouter processing time or other network latency. | Average is calculated on the middle 99% or middle 95% of all HTTP requests. |
| RabbitMQ Depth | The queue length of the specified queue. | |
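The HTTP Latency note about averaging the middle 99% or 95% of requests can be read as a trimmed mean. The sketch below shows one such reading. It assumes the excluded requests are dropped equally from the fastest and slowest ends, which the table does not specify, and the function name and sample values are hypothetical.

```python
from typing import Sequence


def middle_mean(values: Sequence[float], keep_fraction: float = 0.95) -> float:
    """Average the middle `keep_fraction` of values, trimming both tails equally."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    # Number of values to drop from each end; the small epsilon guards
    # against floating-point results such as 0.9999999999999998.
    trim = int(len(ordered) * (1 - keep_fraction) / 2 + 1e-9)
    kept = ordered[trim:len(ordered) - trim]
    return sum(kept) / len(kept)


# Hypothetical latencies in milliseconds: 38 typical requests plus two outliers.
latencies = [5.0] + [100.0] * 38 + [5000.0]
print(middle_mean(latencies, keep_fraction=0.95))  # both outliers are trimmed
print(sum(latencies) / len(latencies))             # plain mean, skewed by the slow outlier
```

Trimming the tails keeps a handful of unusually slow or fast requests from triggering a scaling decision on their own.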

Custom Metrics for Scaling Rules

Pivotal recommends that you define custom metrics for App Autoscaler scaling rules. Custom metrics allow you to define the metrics that are the best indicators of app performance for your environment.

You can configure apps to emit custom metrics through the Loggregator Firehose using Metric Registrar. For instructions on configuring your apps to emit custom metrics with Metric Registrar, see Registering Custom App Metrics.

Comparison Metrics for Scaling Rules

You can use the Comparison Metric field in App Autoscaler to define a scaling rule that divides one custom metric by another.

When you add a scaling rule, the Metric field is the dividend and the Comparison Metric field is the divisor.
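As a small illustration of the division, the sketch below computes the value that a comparison rule would evaluate. The custom metric names are hypothetical, and returning `None` for a zero divisor is an assumption of this sketch, not documented App Autoscaler behavior.

```python
from typing import Optional


def comparison_value(metric: float, comparison_metric: float) -> Optional[float]:
    """Divide the Metric value (dividend) by the Comparison Metric value (divisor)."""
    if comparison_metric == 0:
        # Assumption for this sketch: a zero divisor yields no usable value.
        return None
    return metric / comparison_metric


# Hypothetical custom metrics: queued jobs per active worker.
jobs_queued = 120.0
workers_active = 4.0
print(comparison_value(jobs_queued, workers_active))
```

The resulting ratio is then compared against the rule's minimum and maximum thresholds in the same way as any other metric value.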

App Autoscaler Architecture

The following diagram shows the components and architecture of App Autoscaler. It also shows how App Autoscaler components interact with Pivotal Application Service (PAS) components to make app scaling decisions.

Two boxes represent the components of App Autoscaler, and several other boxes represent the Cloud Foundry components with which App Autoscaler interacts. The two App Autoscaler boxes, titled "Autoscale GO app" and "Autoscale api", appear on the right side of the diagram. They are within a box called Autoscaling Space, which is within another box called System Org. This indicates that the App Autoscaler components run in a space within an org on your Cloud Foundry deployment.

The diagram also includes several arrows. One arrow points from "Autoscale GO app" to the Cloud Foundry Load Balancer and Gorouter. Additional arrows go from the Load Balancer and Gorouter boxes to boxes titled Log Cache and Cloud Controller. These arrows indicate that the Autoscale app makes requests to Log Cache and the Cloud Controller for app metrics and that these requests are routed through the Load Balancer and Gorouter. Another arrow points from "Autoscale GO app" to a box titled "MySQL proxy", which indicates that the Autoscale app reads scaling rules that are stored in a MySQL database.

Arrows also point from "Autoscale api" to "MySQL proxy" and to a box titled "UAA". These arrows indicate that the Autoscale API authenticates using UAA and stores scaling rules in the MySQL database. Finally, an arrow points to "Autoscale api" from a box that represents both the Cloud Foundry Command Line Interface and Apps Manager. This arrow indicates that you can access the Autoscale API from either the Cloud Foundry Command Line Interface or Apps Manager.

As demonstrated in the architecture diagram above, App Autoscaler makes scaling decisions based on autoscaling rules that users configure through either the Cloud Foundry Command Line Interface (cf CLI) or through Apps Manager. The Autoscale API stores these autoscaling rules in a MySQL database.

At a predefined interval, known as the scaling interval, the App Autoscaler app reads the scaling rules and retrieves app metric data from the Loggregator Log Cache. Then, App Autoscaler makes a scaling decision and communicates with the Cloud Controller to scale the app, if necessary.
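The flow described above (read rules, retrieve metrics, decide, scale) can be sketched as one function with its external dependencies passed in as callables. This is a simplified illustration: the function and parameter names are hypothetical, the callables stand in for the MySQL rule store, Log Cache, and the Cloud Controller, and no real API of those components is used. Instance limits are omitted for brevity.

```python
from typing import Callable, Dict, Tuple

Rules = Dict[str, Tuple[float, float]]  # metric name -> (minimum, maximum) thresholds


def run_scaling_cycle(
    app: str,
    read_rules: Callable[[str], Rules],
    fetch_averages: Callable[[str], Dict[str, float]],
    get_instances: Callable[[str], int],
    scale_app: Callable[[str, int], None],
) -> int:
    """Run one scaling interval for `app` and return the resulting instance count."""
    rules = read_rules(app)          # stands in for the MySQL-backed rule store
    averages = fetch_averages(app)   # stands in for a Log Cache query
    current = get_instances(app)     # stands in for a Cloud Controller query
    if rules and any(averages[m] > hi for m, (lo, hi) in rules.items()):
        target = current + 1
    elif rules and all(averages[m] < lo for m, (lo, hi) in rules.items()):
        target = current - 1
    else:
        target = current
    if target != current:
        scale_app(app, target)       # stands in for a Cloud Controller scale request
    return target


# In-memory stand-ins so the sketch runs without any platform components.
state = {"instances": 2}
result = run_scaling_cycle(
    "my-app",
    read_rules=lambda app: {"http_latency": (80.0, 200.0)},
    fetch_averages=lambda app: {"http_latency": 220.0},
    get_instances=lambda app: state["instances"],
    scale_app=lambda app, count: state.update(instances=count),
)
print(result, state)
```

Passing the dependencies as callables keeps the decision logic testable without a running deployment. A real scheduler would invoke such a function once per scaling interval.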

For more information about Loggregator Log Cache, see Loggregator Architecture. For more information about the Cloud Controller, see Cloud Controller.