About App Autoscaler
This topic describes App Autoscaler. It includes information about default and custom scaling rules as well as App Autoscaler architecture.
App Autoscaler is a Marketplace service that scales apps in your environment based on app performance metrics or a schedule. This helps control the cost of running apps while maintaining app performance.
You can use App Autoscaler to do the following:
- Configure scaling rules that adjust app instance counts based on metrics thresholds
- Modify the maximum and minimum number of instances for an app, either manually or following a schedule
For example, you can configure App Autoscaler to scale down the number of instances for an app over the weekend. You can also configure App Autoscaler to scale up the number of instances for an app when the value of the CPU Usage metric increases above a custom threshold.
Breaking Change: App Autoscaler relies on API endpoints from Loggregator’s Log Cache. If you disable Log Cache, App Autoscaler fails. For more information about Log Cache, see Loggregator Introduces Log Cache in Pivotal Application Service v2.2 Release Notes.
This section describes how App Autoscaler decides when to scale an app up or down.
It also provides information about the custom metrics, comparison metrics, and default metrics that you can use when you create scaling rules for an app in App Autoscaler.
Every 35 seconds, App Autoscaler makes a decision about whether to scale up, scale down, or keep the same number of instances.
To make a scaling decision, App Autoscaler averages the values of a given metric for the most recent 120 seconds.
Note: Operators can edit the 35-second scaling interval and the 120-second metric collection interval for all apps within the org. For more information, see (Optional) Configure App Autoscaler in Configuring PAS.
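The averaging step can be illustrated with a minimal Python sketch. It is not App Autoscaler's implementation: the `Sample` type and the `windowed_average` function are hypothetical names, and returning `None` when the window holds no samples is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Sample:
    """One metric reading (hypothetical type for this sketch)."""
    timestamp: float  # seconds since some fixed reference point
    value: float


def windowed_average(
    samples: Iterable[Sample],
    now: float,
    window_seconds: float = 120.0,  # default metric collection interval
) -> Optional[float]:
    """Average the values whose timestamps fall in the last `window_seconds`."""
    recent = [s.value for s in samples if now - window_seconds <= s.timestamp <= now]
    if not recent:
        return None  # assumption: no data in the window means no average
    return sum(recent) / len(recent)


samples = [
    Sample(timestamp=0.0, value=1000.0),   # too old, ignored
    Sample(timestamp=900.0, value=200.0),
    Sample(timestamp=960.0, value=240.0),
]
print(windowed_average(samples, now=1000.0))
```

Only the two readings inside the 120-second window contribute to the result; the older reading is ignored.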
App Autoscaler scales apps as follows:
- Increment by one instance when any metric exceeds its maximum threshold.
- Decrement by one instance only when all metrics fall below their minimum thresholds.
- Keep the same number of instances otherwise, that is, when no metric exceeds its maximum threshold and at least one metric is at or above its minimum threshold.
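The three rules above can be sketched as a small Python function. This is an illustration only: the `Rule` type, the `scaling_delta` function, and the metric names are hypothetical, and the inputs are assumed to be the already averaged metric values.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Rule:
    """Thresholds for one scaling metric (hypothetical type for this sketch)."""
    min_threshold: float
    max_threshold: float


def scaling_delta(rules: Mapping[str, Rule], averages: Mapping[str, float]) -> int:
    """Return +1 to scale up, -1 to scale down, or 0 to keep the instance count."""
    if not rules:
        return 0  # assumption: with no rules there is nothing to decide
    # Scale up when ANY metric exceeds its maximum threshold.
    if any(averages[name] > rule.max_threshold for name, rule in rules.items()):
        return 1
    # Scale down only when ALL metrics fall below their minimum thresholds.
    if all(averages[name] < rule.min_threshold for name, rule in rules.items()):
        return -1
    return 0


rules = {
    "http_latency_ms": Rule(min_threshold=80.0, max_threshold=200.0),
    "cpu_percent": Rule(min_threshold=20.0, max_threshold=80.0),
}
print(scaling_delta(rules, {"http_latency_ms": 220.0, "cpu_percent": 10.0}))
print(scaling_delta(rules, {"http_latency_ms": 70.0, "cpu_percent": 10.0}))
print(scaling_delta(rules, {"http_latency_ms": 70.0, "cpu_percent": 50.0}))
```

The asymmetry is deliberate: a single metric over its maximum is enough to add an instance, while removing an instance requires every metric to be under its minimum.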
The following diagram provides an example of how App Autoscaler makes scaling decisions:
As shown in the diagram, an app has a maximum threshold of 200 milliseconds and a minimum threshold of 80 milliseconds for an HTTP latency metric.
If HTTP latency averages 220 milliseconds for 120 seconds, App Autoscaler scales the app up by one instance.
If HTTP latency then averages 70 milliseconds over the next 120-second window and the app’s other scaling metrics also fall below their minimum thresholds, App Autoscaler scales the app down by one instance.
If the average value for HTTP latency over a 120-second window is below the maximum threshold of 200 milliseconds and above the minimum threshold of 80 milliseconds, App Autoscaler maintains the same number of instances for the app.
You can also set a maximum and minimum number of instances. For example, if an app exceeds the maximum threshold of a given metric, but the number of instances is already at the maximum number of allowed instances, App Autoscaler does not scale up the app.
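The effect of the instance limits can be sketched in Python as follows. The function name `next_instance_count` is hypothetical, and pulling an out-of-range count back inside the limits is an assumption of this sketch rather than documented behavior.

```python
def next_instance_count(
    current: int, delta: int, min_instances: int, max_instances: int
) -> int:
    """Apply a scaling step of +1, -1, or 0 without leaving the allowed range."""
    if min_instances > max_instances:
        raise ValueError("min_instances must not exceed max_instances")
    # Assumption: a count outside the range is pulled back to the nearest limit.
    return max(min_instances, min(max_instances, current + delta))


# Already at the maximum: a scale-up decision leaves the count unchanged.
print(next_instance_count(current=5, delta=1, min_instances=2, max_instances=5))
# Room to grow: the same decision adds one instance.
print(next_instance_count(current=3, delta=1, min_instances=2, max_instances=5))
```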
App Autoscaler includes several default metrics for which you can create scaling rules.
Note: Pivotal recommends that you define custom metrics for scaling rules instead of using the default metrics. Custom metrics allow you to more accurately monitor the performance of your apps based on your environment.
The following table lists the default metrics for App Autoscaler:
| Metric | Description | Notes |
|---|---|---|
| CPU Utilization | Average CPU percentage for all instances of the app. | App CPU utilization data can vary greatly based on the number of CPU cores on Diego Cells and app density. For more information, see App Autoscaler advisory for scaling Apps based on the CPU utilization in the Knowledge Base. |
| Container Memory Utilization | Average memory percentage for all instances of the app. | |
| HTTP Throughput | Total HTTP requests per second (divided by the total number of app instances). | For TAS for VMs v2.7.14 and earlier, Pivotal discourages |
| HTTP Latency | Average latency of the app's responses to HTTP requests. This does not include Gorouter processing time or other network latency. Average is calculated on the middle 99% or middle 95% of all HTTP requests. | |
| RabbitMQ Depth | The queue length of the specified queue. | |
Pivotal recommends that you define custom metrics for App Autoscaler scaling rules. Custom metrics allow you to define the metrics that are the best indicators of app performance for your environment.
You can configure apps to emit custom metrics through the Loggregator Firehose using Metric Registrar. For steps on how to configure your apps to emit custom metrics with Metric Registrar, see Registering Custom App Metrics.
You can use the Comparison Metric field in App Autoscaler to define a scaling rule that divides one custom metric by another.
When you add a scaling rule, the Metric field is the dividend and the Comparison Metric field is the divisor.
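As a rough illustration of the dividend and divisor relationship, the following Python sketch computes the value that such a rule would compare against its thresholds. The function name, the example metrics, and the handling of a zero divisor are all assumptions; the source does not describe how App Autoscaler treats a Comparison Metric of zero.

```python
from typing import Optional


def comparison_value(metric: float, comparison_metric: float) -> Optional[float]:
    """Divide the Metric (dividend) by the Comparison Metric (divisor)."""
    if comparison_metric == 0:
        return None  # assumption: treat a zero divisor as "no usable value"
    return metric / comparison_metric


# Hypothetical custom metrics: queued jobs divided by active workers.
jobs_queued = 300.0
workers_active = 4.0
print(comparison_value(jobs_queued, workers_active))
```

A ratio like this can be a better scaling signal than either metric alone, because it expresses load per unit of capacity.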
The following diagram shows the components and architecture of App Autoscaler. It also shows how App Autoscaler components interact with Pivotal Application Service (PAS) components to make app scaling decisions.
As demonstrated in the architecture diagram, App Autoscaler makes scaling decisions based on autoscaling rules that users configure by using either the Cloud Foundry Command Line Interface (cf CLI) or Apps Manager. The Autoscale API stores these autoscaling rules in a MySQL database.
At a predefined interval, known as the scaling interval, the App Autoscaler app reads the scaling rules and retrieves app metric data from the Loggregator Log Cache. Then, App Autoscaler makes a scaling decision and communicates with the Cloud Controller to scale the app, if necessary.
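One pass of this flow can be sketched in Python under several assumptions. `fetch_average` stands in for reading averaged metric data from Log Cache and `set_instances` stands in for the call to the Cloud Controller; both are hypothetical callables, not real client APIs. Holding the instance count when a metric has no data is also an assumption of the sketch.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Rule:
    metric: str
    min_threshold: float
    max_threshold: float


@dataclass
class AppState:
    name: str
    instances: int
    min_instances: int
    max_instances: int
    rules: List[Rule] = field(default_factory=list)


def run_scaling_pass(
    app: AppState,
    fetch_average: Callable[[str, str], Optional[float]],  # stand-in for Log Cache
    set_instances: Callable[[str, int], None],  # stand-in for Cloud Controller
) -> int:
    """Read the rules, fetch averaged metrics, decide, and scale if needed."""
    readings = [(rule, fetch_average(app.name, rule.metric)) for rule in app.rules]
    if not readings or any(avg is None for _, avg in readings):
        return app.instances  # assumption: hold steady without complete data
    if any(avg > rule.max_threshold for rule, avg in readings):
        delta = 1
    elif all(avg < rule.min_threshold for rule, avg in readings):
        delta = -1
    else:
        delta = 0
    target = max(app.min_instances, min(app.max_instances, app.instances + delta))
    if target != app.instances:
        set_instances(app.name, target)  # only contact the platform on a change
        app.instances = target
    return target


calls = []
app = AppState("web", instances=2, min_instances=1, max_instances=4,
               rules=[Rule("http_latency", 80.0, 200.0)])
run_scaling_pass(app, lambda name, metric: 220.0, lambda name, n: calls.append((name, n)))
print(app.instances, calls)
```

Passing the two dependencies in as callables keeps the decision logic testable without a running platform. A real deployment would repeat a pass like this once per scaling interval.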