About App Autoscaler
This topic describes App Autoscaler. It includes information about default and custom scaling rules as well as App Autoscaler architecture.
Overview
App Autoscaler is a marketplace service that automatically scales apps in your environment based on app performance metrics or a schedule. This controls the cost of running apps while maintaining app performance.
You can use App Autoscaler to do the following:
- Configure scaling rules that adjust app instance counts based on metrics thresholds
- Modify the maximum and minimum number of instances for an app, either manually or following a schedule
For example, you can configure App Autoscaler to automatically scale down the number of instances for an app over the weekend. You can also configure App Autoscaler to automatically scale up the number of instances for an app when the value of the CPU Usage metric increases above a custom threshold.
Breaking Change: App Autoscaler relies on API endpoints from Loggregator’s Log Cache. If you disable Log Cache, App Autoscaler fails. For more information about Log Cache, see Loggregator Introduces Log Cache in Pivotal Application Service v2.2 Release Notes.
About App Autoscaler Scaling Rules
This section describes how App Autoscaler determines when to scale an app up or down.
It also provides information about the custom metrics, comparison metrics, and default metrics that you can use when you create scaling rules for an app in App Autoscaler.
How App Autoscaler Determines When to Scale
Every 35 seconds, App Autoscaler makes a decision about whether to scale up, scale down, or keep the same number of instances.
To make a scaling decision, App Autoscaler averages the values of a given metric for the most recent 120 seconds.
Note: Operators can modify the 35-second scaling interval and the 120-second metric collection interval for all apps within the org. For more information, see (Optional) Configure App Autoscaler in Configuring PAS.
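The averaging step described above can be sketched as follows. This is a minimal illustration, not App Autoscaler's actual code: the `Sample` type, the `windowed_average` function, and the choice to return `None` when no samples fall inside the window are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Sample:
    """One metric reading (hypothetical type, for illustration only)."""
    timestamp: float  # seconds on any monotonically increasing clock
    value: float


def windowed_average(
    samples: Sequence[Sample], now: float, window_seconds: float = 120.0
) -> Optional[float]:
    """Average the values recorded during the most recent window.

    Returns None when the window contains no samples (an assumption;
    the source does not say how an empty window is handled).
    """
    recent = [s.value for s in samples if now - window_seconds < s.timestamp <= now]
    if not recent:
        return None
    return sum(recent) / len(recent)


# The reading at t=0 is older than 120 seconds at t=200, so it is ignored.
samples = [Sample(0.0, 500.0), Sample(100.0, 210.0), Sample(160.0, 230.0)]
avg = windowed_average(samples, now=200.0)
```

Here `avg` is the mean of the two readings inside the window, 210 and 230.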
App Autoscaler scales apps as follows:
- Increment by one instance when any metric exceeds its maximum threshold.
- Decrement by one instance only when all metrics fall below their minimum thresholds.
- Keep the same number of instances when no metric exceeds its maximum threshold and at least one metric has not fallen below its minimum threshold.
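The three rules above can be expressed as a small decision function. This is a sketch under stated assumptions: the `Rule` type, the metric names, and the `decide` function are hypothetical and are not part of the App Autoscaler API. The second metric and its thresholds are invented for the example.

```python
from dataclasses import dataclass
from typing import Mapping, Sequence


@dataclass(frozen=True)
class Rule:
    """A scaling rule for one metric (hypothetical type, for illustration)."""
    metric: str
    min_threshold: float
    max_threshold: float


def decide(rules: Sequence[Rule], averages: Mapping[str, float]) -> int:
    """Return +1 to scale up, -1 to scale down, or 0 to hold.

    Scale up if ANY metric exceeds its maximum threshold.
    Scale down only if ALL metrics fall below their minimum thresholds.
    """
    if not rules:
        return 0  # assumption: with no rules, nothing changes
    if any(averages[r.metric] > r.max_threshold for r in rules):
        return 1
    if all(averages[r.metric] < r.min_threshold for r in rules):
        return -1
    return 0


rules = [
    Rule("http_latency_ms", min_threshold=80.0, max_threshold=200.0),
    Rule("cpu_percent", min_threshold=20.0, max_threshold=80.0),
]

up = decide(rules, {"http_latency_ms": 220.0, "cpu_percent": 10.0})
down = decide(rules, {"http_latency_ms": 70.0, "cpu_percent": 10.0})
hold = decide(rules, {"http_latency_ms": 70.0, "cpu_percent": 50.0})
```

Note the asymmetry: one metric above its maximum is enough to scale up, but a single metric at or above its minimum blocks a scale-down.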
The following diagram provides an example of how App Autoscaler makes scaling decisions:
As shown in the diagram above, an app has a maximum threshold of 200 milliseconds and a minimum threshold of 80 milliseconds for an HTTP latency metric.
If HTTP latency averages 220 milliseconds for 120 seconds, App Autoscaler scales the app up one instance.
If HTTP latency then averages 70 milliseconds over the next 120-second window and the app’s other scaling metrics also fall below their minimum thresholds, App Autoscaler scales the app down one instance.
If the average value for HTTP latency over a 120-second window is below the maximum threshold of 200 milliseconds and above the minimum threshold of 80 milliseconds, App Autoscaler maintains the same number of instances for the app.
You can also set a maximum and minimum number of instances. For example, if an app exceeds the maximum threshold of a given metric, but the number of instances is already at the maximum number of allowed instances, App Autoscaler does not scale up the app.
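One way to picture the instance limits is as a clamp applied to the scaling decision. The sketch below is illustrative only: `next_instance_count` is a hypothetical helper, and treating the minimum limit symmetrically with the maximum is an assumption, since the example above only describes the maximum case.

```python
def next_instance_count(
    current: int, delta: int, min_instances: int, max_instances: int
) -> int:
    """Apply a +1/-1/0 scaling decision, staying within the configured limits."""
    return max(min_instances, min(max_instances, current + delta))


# Already at the maximum of 5: a scale-up decision leaves the count unchanged.
at_max = next_instance_count(current=5, delta=1, min_instances=2, max_instances=5)

# Below the maximum: the same decision adds one instance.
below_max = next_instance_count(current=3, delta=1, min_instances=2, max_instances=5)

# Assumed symmetric behavior at the minimum of 2.
at_min = next_instance_count(current=2, delta=-1, min_instances=2, max_instances=5)
```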
Default Metrics for Scaling Rules
App Autoscaler includes several default metrics for which you can create scaling rules.
Note: Pivotal recommends that you define custom metrics for scaling rules instead of using the default metrics. Custom metrics allow you to more accurately monitor the performance of your apps based on your environment.
The table below lists the default metrics for App Autoscaler:
Metric | Description | Notes |
---|---|---|
CPU Utilization | Average CPU percentage for all instances of the app. | App CPU utilization data may vary greatly based on the number of CPU cores on Diego Cells and app density. For more information, see App Autoscaler advisory for scaling Apps based on the CPU utilization in the Knowledge Base. |
Container Memory Utilization | Average memory percentage for all instances of the app. | |
HTTP Throughput | Total HTTP requests per second (divided by the total number of app instances). | For TAS for VMs v2.8.8 and earlier, Pivotal does not recommend using http_throughput as a scaling rule when logging volume is high in the system. For more information, see Autoscaling using HTTP Throughput & Latency metrics and HTTP throughput based Autoscaling rules do not fire in the Knowledge Base. |
HTTP Latency | Average latency of the app's responses to HTTP requests. This does not include Gorouter processing time or other network latency. The average is calculated on the middle 99% or middle 95% of all HTTP requests. | |
RabbitMQ Depth | The queue length of the specified queue. | |
Custom Metrics for Scaling Rules
Pivotal recommends that you define custom metrics for App Autoscaler scaling rules. Custom metrics allow you to define the metrics that are the best indicators of app performance for your environment.
You can configure apps to emit custom metrics to the Loggregator Firehose using Metric Registrar. For steps on how to configure your apps to emit custom metrics with Metric Registrar, see Registering Custom App Metrics.
Comparison Metrics for Scaling Rules
You can use the Comparison Metric field in App Autoscaler to define a scaling rule that divides one custom metric by another.
When you add a scaling rule, the Metric field is the dividend and the Comparison Metric field is the divisor.
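The division can be illustrated with a short sketch. The function name, the example metrics (a queue depth divided by a worker count), and the handling of a zero divisor are assumptions made for this example. The source does not describe what App Autoscaler does when the comparison metric is zero.

```python
from typing import Optional


def comparison_value(metric: float, comparison_metric: float) -> Optional[float]:
    """Divide the Metric field (dividend) by the Comparison Metric field (divisor).

    Returns None for a zero divisor (an assumption made for this sketch).
    """
    if comparison_metric == 0:
        return None
    return metric / comparison_metric


# Hypothetical custom metrics: 300 queued jobs shared by 4 workers.
jobs_per_worker = comparison_value(metric=300.0, comparison_metric=4.0)
```

A rule built this way compares the resulting ratio, rather than either raw metric, against the rule's minimum and maximum thresholds.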
App Autoscaler Architecture
The following diagram shows the components and architecture of App Autoscaler. It also shows how App Autoscaler components interact with Pivotal Application Service (PAS) components to make app scaling decisions.
As demonstrated in the architecture diagram above, App Autoscaler makes scaling decisions based on autoscaling rules that users configure through either the Cloud Foundry Command Line Interface (cf CLI) or through Apps Manager. The Autoscale API stores these autoscaling rules in a MySQL database.
At a predefined interval, known as the scaling interval, the App Autoscaler app reads the scaling rules and retrieves app metric data from the Loggregator Log Cache. Then, App Autoscaler makes a scaling decision and communicates with the Cloud Controller to scale the app, if necessary.
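The flow in the preceding paragraph (read rules, retrieve metrics, decide, then scale through the Cloud Controller) can be sketched as one cycle of a loop. Everything here is hypothetical: the four callables stand in for the rules database, Log Cache, and the Cloud Controller, and none of them correspond to real API calls. Instance limits are omitted to keep the sketch short.

```python
from typing import Callable, List, Tuple

# (metric name, minimum threshold, maximum threshold); illustrative shape only
Rule = Tuple[str, float, float]


def run_scaling_cycle(
    read_rules: Callable[[], List[Rule]],    # stand-in for the rules database
    fetch_average: Callable[[str], float],   # stand-in for a Log Cache query
    get_instances: Callable[[], int],        # stand-in for Cloud Controller read
    set_instances: Callable[[int], None],    # stand-in for Cloud Controller scale
) -> int:
    """Run one scaling interval and return the resulting instance count."""
    rules = read_rules()
    averages = {name: fetch_average(name) for name, _, _ in rules}
    current = get_instances()

    if any(averages[name] > high for name, _, high in rules):
        target = current + 1
    elif rules and all(averages[name] < low for name, low, _ in rules):
        target = current - 1
    else:
        target = current

    # Only call the scaling stand-in when a change is needed.
    if target != current:
        set_instances(target)
    return target


# In-memory stand-ins for the external components.
state = {"instances": 3}
scale_calls: List[int] = []


def fake_set_instances(count: int) -> None:
    state["instances"] = count
    scale_calls.append(count)


first = run_scaling_cycle(
    lambda: [("http_latency_ms", 80.0, 200.0)],
    lambda name: 220.0,  # above the maximum threshold
    lambda: state["instances"],
    fake_set_instances,
)
second = run_scaling_cycle(
    lambda: [("http_latency_ms", 80.0, 200.0)],
    lambda name: 100.0,  # between the thresholds
    lambda: state["instances"],
    fake_set_instances,
)
```

Passing the external components in as callables keeps the decision logic testable without a running platform. A real deployment would repeat this cycle at every scaling interval.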
For more information about Loggregator Log Cache, see Loggregator Architecture. For more information about the Cloud Controller, see Cloud Controller.