Sizing App Metrics for Your System

This topic describes how operators configure App Metrics depending on their deployment size. Operators can use these procedures to optimize App Metrics for high capacity or to reduce resource usage for smaller deployment sizes.

After your deployment has been running for a while, use the information in this topic to scale your running deployment.

If you are not familiar with the App Metrics components, review App Metrics Product Architecture before reading this topic.

App Metrics depends on three datastores: the Metric Store, the Log Store, and the Postgres database.

The Metric Store is a separate tile, and its scaling is covered mostly in the Metric Store documentation. The Log Store and the Postgres database are part of the App Metrics tile, so this topic discusses their scaling in more detail.

Scale the Metrics Datastore

App Metrics retrieves metrics from Metric Store, which has its own configuration for resizing. For more information, see the Metric Store documentation.

Currently, VMware recommends that you scale vertically rather than horizontally. See Limitations in the Metric Store documentation.

Suggested Sizing by Deployment Size

Use the following tables as a guide for configuring resources for your deployment.

Estimate the size of your deployment according to the number of app instances you expect to run.

Size | Purpose | Approximate number of app instances
Small | Test use | 100
Medium | Production use | 5,000
Large | Production use | 15,000

  • Metrics App: App Metrics deploys the appmetrics app. This app serves the UI, proxies requests from the browser to Metric Store and Log Store, and creates notifications for user-created monitors.

    Scale the appmetrics app to 1 instance.

Deployment Resources for a Small Deployment

Example resource configuration to store approximately 6 weeks of data for a small deployment, about 100 application instances:

Job | Instances | Persistent Disk Type | VM Type
PostgreSQL | 1 (not configurable) | 10 GB | CPU Optimized with at least 4 CPUs and at least 8 GB of memory
Log Store | 3 | 200 GB | Memory Optimized with at least 2 CPUs and at least 16 GB of memory

Deployment Resources for a Medium Deployment

Example resource configuration to store approximately 6 weeks of data for a medium deployment, about 5,000 application instances:

Job | Instances | Persistent Disk Type | VM Type
PostgreSQL | 1 (not configurable) | 10 GB | CPU Optimized with at least 8 CPUs and at least 16 GB of memory
Log Store | 3 | 500 GB | Memory Optimized with at least 4 CPUs and at least 32 GB of memory

Deployment Resources for a Large Deployment

Example resource configuration to store approximately 6 weeks of data for a large deployment, about 15,000 application instances:

Job | Instances | Persistent Disk Type | VM Type
PostgreSQL | 1 (not configurable) | 10 GB | CPU Optimized with at least 8 CPUs and at least 16 GB of memory
Log Store | 6 | 500 GB | Memory Optimized with at least 8 CPUs and at least 64 GB of memory
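
The Log Store rows of the three tables above can be captured as a small lookup, which may be handy when scripting capacity reviews. The following Python sketch is illustrative only: the names LogStoreSizing, LOG_STORE_SIZING, and suggest_tier are hypothetical, and treating each approximate instance count as an upper bound for its tier is an assumption, not a documented threshold.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogStoreSizing:
    instances: int      # number of Log Store VMs
    disk_gb: int        # persistent disk per VM, in GB
    min_cpus: int       # minimum CPUs per VM
    min_memory_gb: int  # minimum memory per VM, in GB


# Values copied from the Log Store rows of the sizing tables.
LOG_STORE_SIZING = {
    "small": LogStoreSizing(instances=3, disk_gb=200, min_cpus=2, min_memory_gb=16),
    "medium": LogStoreSizing(instances=3, disk_gb=500, min_cpus=4, min_memory_gb=32),
    "large": LogStoreSizing(instances=6, disk_gb=500, min_cpus=8, min_memory_gb=64),
}

# Approximate app instance counts per tier, smallest first.
# Assumption: each count is treated as the upper bound of its tier.
TIER_APP_INSTANCES = [("small", 100), ("medium", 5_000), ("large", 15_000)]


def suggest_tier(app_instances: int) -> str:
    """Return the smallest tier whose approximate instance count covers the deployment."""
    if app_instances < 0:
        raise ValueError("app_instances must not be negative")
    for name, limit in TIER_APP_INSTANCES:
        if app_instances <= limit:
            return name
    # Deployments beyond the largest documented size fall back to "large".
    return "large"
```

For example, LOG_STORE_SIZING[suggest_tier(3000)] returns the medium configuration. Deployments larger than about 15,000 app instances are outside the sizes described here, so the fallback to "large" is only a starting point.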

Scale the Log Datastore

App Metrics retrieves logs from Log Store.

By default, App Metrics ships with three xlarge Log Store VMs, each with 500 GB of persistent disk. You can change these settings in the Resource Config pane of the App Metrics tile.

The scaling of an individual log-store deployment is subject to many variables, including:

  • Required retention time
  • Replication factor
  • Log ingress volume
  • The average size of log messages
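
To see how these variables interact, the following Python sketch multiplies them into a rough, uncompressed data volume. It is a back-of-the-envelope illustration, not a description of how Log Store stores data: the function name raw_log_volume_gb is hypothetical, and the calculation ignores compression and index overhead, so real disk usage can differ substantially.

```python
SECONDS_PER_DAY = 86_400


def raw_log_volume_gb(
    logs_per_second: float,
    avg_message_bytes: float,
    retention_days: float,
    replication_factor: int,
) -> float:
    """Estimate the uncompressed volume of retained logs, in GB (1 GB = 1e9 bytes).

    Assumption: every replica stores a full copy of each message, and
    no compression or indexing overhead is applied.
    """
    retained_seconds = retention_days * SECONDS_PER_DAY
    total_bytes = (
        logs_per_second * avg_message_bytes * retained_seconds * replication_factor
    )
    return total_bytes / 1e9


# Hypothetical inputs: 1,000 logs/s, 200-byte messages, 7 days, 2 replicas.
example_gb = raw_log_volume_gb(1_000, 200, 7, 2)
```

Doubling any one input doubles the estimate, which is why retention time, replication factor, ingress volume, and message size all need to be considered together.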

Note: Horizontally scaling Log Store currently results in data loss. VMware recommends vertical scaling at this time.

Considerations for Scaling

Scaling Baseline

As a baseline, six VMs, each with 8 cores, 64 GB of RAM, and 500 GB of persistent disk, provide approximately 42 days of log retention for an environment emitting 40,000 logs per second.
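
The baseline can serve as a reference point for rough retention estimates. The following Python sketch scales the baseline figures linearly with total persistent disk and inversely with log rate. Both the linear scaling and the function name estimate_retention_days are assumptions made for illustration; the estimate also assumes the same average message size and replication factor as the baseline environment, so treat the result as a first guess to validate against a running deployment.

```python
# Baseline figures from this topic.
BASELINE_VMS = 6
BASELINE_DISK_GB_PER_VM = 500
BASELINE_LOGS_PER_SECOND = 40_000
BASELINE_RETENTION_DAYS = 42


def estimate_retention_days(
    vms: int, disk_gb_per_vm: float, logs_per_second: float
) -> float:
    """Scale the baseline retention by total disk and by log rate.

    Assumption: retention grows linearly with total persistent disk and
    shrinks linearly as the log rate increases.
    """
    if logs_per_second <= 0:
        raise ValueError("logs_per_second must be positive")
    total_disk_gb = vms * disk_gb_per_vm
    baseline_disk_gb = BASELINE_VMS * BASELINE_DISK_GB_PER_VM
    disk_ratio = total_disk_gb / baseline_disk_gb
    rate_ratio = BASELINE_LOGS_PER_SECOND / logs_per_second
    return BASELINE_RETENTION_DAYS * disk_ratio * rate_ratio
```

Under these assumptions, three VMs with 500 GB each at 10,000 logs per second would have half the disk and a quarter of the log rate of the baseline, giving an estimate of about 84 days.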

Scaling Recommendations

Maximum ingress throughput depends primarily on the number of VMs, and secondarily on CPU and memory resources. Retention and replication factor depend primarily on the size of the persistent disks attached to the VMs. Abnormally high cardinality of indexed information, particularly source_id and instance_id, can put pressure on VM memory, even in the absence of high log volume.