Scaling an App Using App Autoscaler

Page last updated:

Warning: Pivotal Cloud Foundry (PCF) v2.4 is no longer supported because it has reached the End of General Support (EOGS) phase as defined by the Support Lifecycle Policy. To stay up to date with the latest software and security updates, upgrade to a supported version.

This topic describes how Pivotal Cloud Foundry (PCF) users with the Space Developer role can set up and configure the App Autoscaler service in the Apps Manager UI to automatically scale apps based on rules that they set.

You can use the App Autoscaler command-line interface (CLI) plugin to configure App Autoscaler rules from the command line. For more information, see Using the App Autoscaler CLI.

Note: Space Managers, Space Auditors, and all Org roles do not have permission to use App Autoscaler. For help managing user roles, see Managing User Accounts and Permissions Using the Apps Manager.

App Autoscaler Overview

App Autoscaler is a marketplace service that automatically scales apps in your environment based on app performance metrics or a schedule. This controls the cost of running apps while maintaining app performance.

You can use App Autoscaler to do the following:

  • Configure scaling rules that adjust app instance counts based on metrics thresholds
  • Modify the maximum and minimum number of instances for an app, either manually or following a schedule

For example, you can configure App Autoscaler to automatically scale down the number of instances for an app over the weekend. You can also configure App Autoscaler to automatically scale up the number of instances for an app when the value of the CPU Usage metric increases above a custom threshold.

For information about configuring autoscaling rules and thresholds, see Configure Autoscaling for an App.

Breaking Change: App Autoscaler relies on API endpoints from Loggregator’s Log Cache. If you disable Log Cache, App Autoscaler will fail. For more information about Log Cache, see Loggregator Introduces Log Cache.

App Autoscaler Architecture

The following diagram shows the components and architecture of App Autoscaler. It also shows how App Autoscaler components interact with Cloud Foundry components to make app scaling decisions.

Two boxes represent the components of App Autoscaler. Several other boxes represent the Cloud Foundry components with which App Autoscaler interacts. The two boxes that represent App Autoscaler components are titled "Autoscale GO app" and "Autoscale api". "Autoscale GO app" and "Autoscale api" appear on the right side of the diagram. They are within a box called Autoscaling Space, which is within another box called System Org. This indicates that the App Autoscaler components run in a space that is within an org on your Cloud Foundry deployment. The diagram also includes several arrows. First, there is an arrow that points from "Autoscale GO app" to the Cloud Foundry Load Balancer and Gorouter. Additional arrows go from the load balancer and the Gorouter boxes to boxes titled Cloud Cache and Cloud Controller. These arrows indicate that the Autoscale app makes requests to the Log Cache and Cloud Controller for app metrics and that these requested are routed through the Load Balancer and Gorouter. There is also an arrow from "Autoscale GO app" that points to a box titled "MySQL proxy". The arrow pointing from the "Autoscale GO app" box to the MySQL proxy box indicates that the Autoscale app reads scaling rules that are stored in a MySQL databse. The diagram also includes arrows that point from "Autoscale api" to "MySQL proxy" and and box titled "UAA". These arrows indicate that the Autoscale API authenticates using UAA and that the API stores scaling rules in the MySQL database. There is also an arrow that points to "Autoscale api" from a box that represents both the Cloud Foundry Command Line Interface and Apps Manager. This arrow indicates that you can access the Autoscale API from either the Cloud Foundry command line interface or Apps Manager.

View a larger version of this image.

As demonstrated in the architecture diagram above, App Autoscaler makes scaling decisions based on autoscaling rules that users configure through either the Cloud Foundry Command Line Interface (cf CLI) or through Apps Manager. The Autoscale API stores these autoscaling rules in a MySQL database.

At a predefined interval, known as the scaling interval, the App Autoscaler app reads the scaling rules and retrieves app metric data from the Loggregator Log Cache. Then, App Autoscaler makes a scaling decision and communicates with the Cloud Controller to scale the app, if necessary.

For more information about Loggregator Log Cache, see Loggregator Architecture. For more information about the Cloud Controller, see Cloud Controller.

Set Up App Autoscaler

To use App Autoscaler, you must create an instance of the App Autoscaler service and bind it to any app you want to autoscale. You can do this using either the Apps Manager or from the Cloud Foundry Command Line Interface (cf CLI):

Note: Manual scaling overrides scaling rules that you configure with App Autoscaler. If you manually scale an app bound to an App Autoscaler service instance, the App Autoscaler instance automatically unbinds from that app, and the app scales to the manual setting. For more information, see Managing Apps and Service Instances Using Apps Manager.

To enable App Autoscaler for the app, do the following:

  1. In Apps Manager, select an app from the space in which you created the App Autoscaler service.

  2. Under Processes and Instances, enable Autoscaling. At the top of the image is the header 'Processes and Instances'. Below it is the subheader 'web', below which is a row of metrics: 'Instances: 2', 'Memory Allocated: 1 GB', and 'Disk Allocated: 512 MB'. To the right of these metrics is a grayed-out blue, rectangular button labeled 'Scale'. Below these metrics are a sliding button labeled 'Autoscaling' to the far left and the words 'Manage Autoscaling' in blue letters to the far right. Below these are a table with five columns labeled, from left to right, '#', 'CPU', 'Memory', 'Disk', and 'Uptime'. Below these are two rows. Under '#' are the number '0' in the first row and '1' in the second. Under 'CPU' is the text '0%' in both the first and second rows. Under 'Memory' is the text '6.63 MB' in the first row and '27.98 MB' in the second. Under 'Disk' is the text '51.07' in both the first and second rows. Under 'Uptime' is the text '4 d 21 hr 58 min' in the first row and '4 d 21 hr 56 min' in the second row.

  3. Click Manage Autoscaling to configure instance limits, scaling rules, and scheduled limit changes for the app. For more information, see Configure Autoscaling for an App.

    Manage autoscaling

About App Autoscaler Scaling Rules

This section describes how App Autoscaler determines when to scale an app up or down.

It also provides information about the custom metrics, comparison metrics, and default metrics that you can use when you create scaling rules for an app in App Autoscaler.

How App Autoscaler Determines When to Scale

Every 35 seconds, App Autoscaler makes a decision about whether to scale up, scale down, or keep the same number of instances.

To make a scaling decision, App Autoscaler averages the values of a given metric for the most recent 120 seconds.

App Autoscaler scales apps as follows:

  • Increment by one instance when any metric exceeds its maximum threshold.
  • Decrement by one instance only when all metrics fall below their minimum thresholds.
  • Keep the same number of instances when app metrics do not exceed thresholds.

The following diagram provides an example of how App Autoscaler makes scaling decisions:

Read the description after this diagram for a description of the example shown in the diagram.

As shown in the diagram above, an app has a maximum threshold of 200 milliseconds and a minimum threshold of 80 milliseconds for an HTTP latency metric.

If HTTP latency averages 220 milliseconds for 120 seconds, App Autoscaler scales the app up one instance.

If HTTP latency then averages 70 milliseconds over the next 120 second window and the app’s other scaling metrics also fall below their minimum thresholds, App Autoscaler scales the app down one instance.

If the average value for HTTP latency over a 120 second window is below the maximum threshold of 200 milliseconds and above the minimum threshold of 80 milliseconds, then App Autoscaler maintains the same number of instances for the app.

You can also set a maximum and minimum number of instances. For example, if an app exceeds the maximum threshold of a given metric, but the number of instances is already at the maximum number of allowed instances, App Autoscaler does not scale up the app.

For more information about setting a maximum and minimum number of instances, see Create or Modify Instance Limits.

Custom Metrics for Scaling Rules

Pivotal recommends that you define custom metrics for App Autoscaler scaling rules. Custom metrics allow you to define the metrics that are the best indicators of app performance for your environment.

You can configure apps to emit custom metrics out of the Loggregator Firehose using Metric Registrar. For steps on how to configure your apps to emit custom metrics with Metric Registrar, see Registering Custom App Metricsr.

Comparison Metrics for Scaling Rules

You can use the Comparison Metric field in App Autoscaler to define a scaling rule that divides one custom metric by another.

When you add a scaling rule, the Metric field is the dividend and the Comparison Metric field is the divisor.

Default Metrics for Scaling Rules

App Autoscaler includes several default metrics for which you can create scaling rules.

Note: Pivotal recommends that you define custom metrics for scaling rules instead of using the default metrics. Custom metrics allow you to more accurately monitor the performance of your apps based on your environment.

The table below lists the default metrics for App Autoscaler:

Metric Description Notes
CPU Utilization Average CPU percentage for all instances of the app. App CPU utilization data may vary greatly based on the number of CPU cores on Diego cells and app density. For more information, see PCF Autoscaler advisory for scaling Apps based on the CPU utilization.
Container Memory Utilization Average memory percentage for all instances of the app.
HTTP Throughput Total HTTP requests per second (divided by the total number of app instances). Pivotal does not recommend using http_throughput as a scaling rule when logging volume is high in the system. For more information, see Autoscaling using HTTP Throughput & Latency metrics and HTTP throughput based Autoscaling rules do not fire in the Pivotal Support Knowledge base.
HTTP Latency Average latency of apps response to HTTP requests. This does not include Gorouter processing time or other network latency.
Average is calculated on the middle 99% or middle 95% of all HTTP requests.
RabbitMQ Depth The queue length of the specified queue.

Configure Autoscaling for an App

App Autoscaler keeps instance counts within an allowable range defined by minimum and maximum values, or instance limits.

For more information about how App Autoscaler makes scaling decisions for an app, see How App Autoscaler Determines When to Scale.

Follow the procedures in the sections below to set any of the following:

Create or Modify Instance Limits

Follow these steps to manually modify instance limits:

Note: You can also schedule changes to your instance limits for a specific date and time. For more information, see Scheduled Limit Changes.

  1. In Apps Manager, navigate to the Overview page for your app and click Manage Autoscaling under Processes and Instances.

  2. In the Instance Limits section, set values for Minimum and Maximum.

    Instance limits

  3. Click Apply Changes.

Add or Delete Scaling Rules

The following procedures describe how to add and delete scaling rules for your apps with App Autoscaler.

For more information about scaling rules and metrics in App Autoscaler, see About App Autoscaler Scaling Rules.

Add a Scaling Rule

To add a scaling rule for an app:

  1. In the Manage Autoscaling pane, click Edit next to Scaling Rules. The Edit Scaling Rules pane appears.

  2. Click Add Rule.

  3. In the Select type dropdown, select the metric for the new scaling rule.

    Scaling rules

  4. Set the minimum and maximum thresholds for the metric using the table in the Scaling Rule Metrics section above as a guide.

  5. Select or fill in any other fields that appear under the threshold fields:

    • If you are adding an HTTP Latency rule, configure Percent of traffic to apply.
    • If you are adding a RabbitMQ depth rule, provide the name of the queue to measure.
    • If you are adding a Custom rule, enter your custom Metric.
    • If you are adding a Compare rule, enter values in the Metric and Comparison Metric fields.
  6. Click Save.

Delete a Scaling Rule

To delete a scaling rule for an app:

  1. Click the × icon next to the rule you want to delete.

  2. Click Save.

Create or Modify Scheduled Limit Changes

Because app demand often follows a weekly, daily, or hourly schedule, you can schedule App Autoscaler to change the allowable instance range to track expected surges or quiet periods.

Create or Modify a Scheduled Limit Change

  1. Click Edit next to Scheduled Limits.

  2. Click Add New to add a new scheduled limit or select an existing entry to modify by clicking EDIT next to the entry.

  3. Edit the following values:

    • Date and Time (local): Set the date and time of the change.
    • Repeat (Optional): Set the day of the week for which you want to repeat the change.
    • Min and Max: Set the allowable range within which App Autoscaler can change the instance count for an app.

      Scheduled changes

  4. Click Save.

To delete an existing entry, click the x icon next to the entry you want to delete.

Example: Scale Down for the Weekend

To schedule an app to scale down for a weekend, you can enter two rules as follows:

  1. Scale down to a single instance on Friday evening:
    • Date and Time (local): Dec, 2, 2018 and 7:00 PM
    • Repeat (Optional): Fr
    • Min and Max: 1 and 1
  2. Increase instances to between 3 and 5 on Monday morning:
    • Date and Time (local): Dec, 5, 2018 and 7:00 AM
    • Repeat (Optional): M
    • Min and Max: 3 and 5

App Autoscaler Events and Notifications

App Autoscaler logs all autoscaling events.

View Event History

To view all autoscaling events in the past 24 hours, click View More in the Event History section of the Manage Autoscaling pane.

Event history

Manage App Autoscaler Notifications

Autoscaler emails or texts its event notifications to all space users by default.

To subscribe or unsubscribe from autoscaling event notifications, do the following:

  1. Navigate to the Manage Notifications page of PCF.

    Note: If installed, Notifications Management should be available at https://notifications-ui.YOUR-SYSTEM-DOMAIN/preferences.

  2. Choose which notifications you want to receive from App Autoscaler: