Log and Metric Egress Architecture


This topic describes the Loggregator architecture and components. It also describes the Ops Manager components that send BOSH-reported VM metrics to Loggregator.

Overview

Log and metric egress is the process by which logs and metrics are transported from your deployment to destinations such as the cf CLI, monitoring tools, or other internal system components. These logs and metrics can either flow directly from the VMs that produce them to their consumers, or pass through the Loggregator Firehose.

Arrows are shown depicting the flow of information from VMs. One arrow points directly from VMs to Consumers. The other arrow points to the Firehose, which then sends to Consumers.

How Logs and Metrics Egress from VMs

Logs and metrics are sent from VMs via the following process:

A Forwarder Agent appears inside a square labeled VMs. Apps are shown sending to the Forwarder Agent, and Components are shown sending metrics to the Forwarder Agent. Components also send logs to rsyslog, which then sends platform logs outside the system via syslog RFC5424. The Forwarder Agent sends to three downstream consumers. The Syslog Agent sends application logs via syslog RFC5424. The Metrics Agent exposes metrics via Prometheus endpoints. Finally, the Loggregator Agent sends metrics and application logs via syslog RFC5424.

  1. Applications and components on the system emit logs and metrics.
  2. Logs and metrics pass through two forwarders:
    • rsyslog, which sends platform logs in Syslog RFC5424 format.
    • The Forwarder Agent, which sends metrics and application logs to three agents (see below).
  3. Metrics and application logs are sent through each of the following:
    • The Syslog Agent sends application logs in Syslog RFC5424 format to aggregate and application log destinations.
    • The Metrics Agent exposes metrics for Prometheus-style scraping.
    • The Loggregator Agent sends metrics and application logs to the Loggregator Firehose.
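
To make the Syslog RFC 5424 format mentioned in the steps above more concrete, the following Python sketch builds a single syslog line. It is an illustration only, not code from rsyslog or the Syslog Agent. The function name `format_rfc5424`, the default facility and severity, and the example host and app names are assumptions, and structured data is omitted (written as `-`).

```python
from datetime import datetime, timezone


def format_rfc5424(message, app_name, hostname, *, facility=1, severity=6,
                   proc_id="-", msg_id="-", timestamp=None):
    """Build one RFC 5424 syslog line without structured data.

    facility=1 (user-level) and severity=6 (informational) are
    illustrative defaults, not values taken from the platform.
    """
    # RFC 5424 defines the priority value as facility * 8 + severity.
    pri = facility * 8 + severity
    ts = (timestamp or datetime.now(timezone.utc)).isoformat()
    # Field order: <PRI>VERSION TIMESTAMP HOSTNAME APP-NAME PROCID MSGID SD MSG
    return f"<{pri}>1 {ts} {hostname} {app_name} {proc_id} {msg_id} - {message}"
```

For example, calling `format_rfc5424("hello", "my-app", "cell-0")` with a fixed timestamp of 2024-01-02 03:04:05 UTC yields `<14>1 2024-01-02T03:04:05+00:00 cell-0 my-app - - - hello`.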

How the Firehose Forwards Logs and Metrics

The Loggregator Firehose sends logs and metrics through the V1 and V2 Firehose APIs, via the following process:

A Loggregator Agent appears inside a square labeled VMs. Loggregator Agents send to Dopplers, located on Doppler VMs. Dopplers, in turn, send to both Traffic Controllers and Reverse Log Proxies. V1 Firehose Consumers receive logs and metrics from Traffic Controllers, whereas V2 Firehose Consumers receive logs and metrics from Reverse Log Proxies. V2 Firehose Consumers can also receive logs and metrics from Reverse Log Proxy (RLP) Gateways.

  1. The Loggregator Agent sends each log or metric to one downstream Doppler, distributing the load across a random subset of five Dopplers.
  2. Dopplers make a copy of each log or metric for each consumer and send the copies to the Traffic Controllers and Reverse Log Proxies for distribution.
  3. The following components forward logs and metrics differently:
    • Traffic Controllers receive WebSocket connections from V1 Firehose Consumers, and send logs and metrics as V1 Envelopes.

      If a consumer starts to fall behind, the Traffic Controller logs a slow consumer alert and disconnects that consumer.

    • Reverse Log Proxies receive gRPC connections from V2 Firehose Consumers, and send logs/metrics as V2 Envelopes.

      If a consumer falls behind, the envelopes are dropped.

    • Reverse Log Proxy Gateways (RLP Gateways) receive HTTP connections from V2 Firehose consumers, and send logs/metrics as JSON-encoded V2 Envelopes.

      Reverse Log Proxy Gateways connect to Reverse Log Proxies, rather than directly to Dopplers.
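
The steps above can be modeled with a small Python sketch. It is a toy model under stated assumptions, not Loggregator code: the class and function names are hypothetical, the buffer capacities are arbitrary, and the choice to drop the newest envelope (rather than the oldest) is an assumption, because the text only says that envelopes are dropped.

```python
import random
from collections import deque


def pick_doppler_pool(dopplers, size=5, rng=random):
    """Choose the random subset of Dopplers an agent distributes across."""
    return rng.sample(dopplers, min(size, len(dopplers)))


def pick_doppler(pool, rng=random):
    """Each envelope goes to exactly one Doppler from the pool."""
    return rng.choice(pool)


class V1Consumer:
    """Traffic Controller-style policy: disconnect a consumer that falls behind."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pending = deque()
        self.connected = True

    def deliver(self, envelope):
        if not self.connected:
            return False
        if len(self.pending) >= self.capacity:
            # Slow consumer: alert and disconnect (the alert is omitted here).
            self.connected = False
            self.pending.clear()
            return False
        self.pending.append(envelope)
        return True


class V2Consumer:
    """Reverse Log Proxy-style policy: keep the connection, drop envelopes."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pending = deque()
        self.dropped = 0

    def deliver(self, envelope):
        if len(self.pending) >= self.capacity:
            # Assumption: the incoming envelope is the one that is dropped.
            self.dropped += 1
            return False
        self.pending.append(envelope)
        return True
```

In this model, a V1 consumer that never drains its buffer ends up disconnected, while a V2 consumer in the same situation stays connected and accumulates a drop count.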

How Consumers Receive Logs and Metrics

Consumers of logs and metrics may receive them through a few different paths:


Log Cache provides a short-term snapshot of logs and metrics to the cf CLI and any CF web UIs (such as Stratos or Tanzu Application Service Apps Manager). It receives logs and metrics either from the Firehose or directly from VMs through syslog.

Observability integrations can consume logs and metrics via either:

  • connecting to the Loggregator Firehose, or
  • receiving logs in Syslog RFC 5424 format and scraping metrics from Prometheus endpoints.
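
As an illustration of the second option, the following Python sketch parses the Prometheus text exposition format that a scraper would receive from a metrics endpoint. It is a simplified sketch, not the parser used by any platform component: `parse_exposition` is a hypothetical name, optional timestamps are ignored, and the metric names in the usage are made up.

```python
import re

# metric_name, optional {labels}, then the sample value.
_SAMPLE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
# label_name="label value", allowing backslash-escaped characters.
_LABEL = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_exposition(text):
    """Return a list of (name, labels, value) tuples from exposition text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        match = _SAMPLE.match(line)
        if not match:
            continue
        name, raw_labels, value = match.groups()
        labels = dict(_LABEL.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples
```

Given the two sample lines `cpu_user{deployment="cf",index="0"} 12.5` and `uptime_seconds 300`, the function returns one tuple per line, with an empty label dictionary for the second.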

Full Architecture References

Loggregator architecture includes components for collecting, storing, and forwarding logs and metrics.

Loggregator Architecture and Components

This section includes the following Loggregator architecture diagrams:

  • Firehose Architecture
  • Shared-Nothing Architecture
  • System Metrics Agents Architecture

In VMware Tanzu Application Service for VMs (TAS for VMs) v2.10, Loggregator architecture uses Syslog Agents and removes Syslog Adapters.

Platform operators can also enable System Metrics Agents on all VMs deployed with TAS for VMs to collect system-level metrics.

Firehose Architecture

The following diagram shows how logs and metrics are transported from components and apps in your deployment to Loggregator Firehose consumers, such as nozzles, monitoring tools, or third-party software.

You can use this diagram to:

  • Understand the components that transport logs and metrics to the Firehose.
  • Diagnose performance issues related to logging and metrics.
  • Make decisions about how to scale the components or Firehose consumers described in this architecture.

The following diagram shows the architecture of a Loggregator deployment.

A Loggregator Agent appears in squares that depict 'Host VM' and 'Cloud Controller VMs'. Represented by arrows, the Loggregator Agent receives logs from a Forwarder Agent, which forwards logs received from a Syslog Agent. Loggregator Agents send the logs to Dopplers over gRPC communication. Dopplers then forward the logs to Traffic Controllers. For more descriptions of all of the components in Loggregator, see the 'Loggregator Components' section below.


The following are Loggregator components, as shown in the diagram above:

  • Loggregator Agent: Loggregator Agents run on both Ops Manager component VMs and Diego Cell VMs. They receive logs and metrics from the apps and Ops Manager components located on those VMs. Loggregator Agents then forward the logs and metrics to Dopplers.

  • Doppler: Dopplers receive logs and metrics from Loggregator Agents, store them in temporary buffers, and forward them to Traffic Controllers.

  • Traffic Controller: Traffic Controllers poll Doppler servers for logs and metrics, then translate these messages from the Doppler servers as necessary for external and legacy APIs. The Loggregator Firehose is located on the Traffic Controller.

  • Reverse Log Proxy: Reverse Log Proxies (RLPs) collect logs and metrics from Dopplers and forward them to Log Cache and Traffic Controllers. Operators can scale up the number of RLPs based on overall log volume.

  • Syslog Agents:

    Syslog Agents run on Ops Manager component VMs and host VMs to collect and forward logs to configured syslog drains. This includes syslog drains for individual apps as well as aggregate drains for all apps in your foundation. You can specify the destination for logs as part of the individual syslog drain or in the TAS for VMs tile.

  • Syslog Binding Cache: Syslog Agents can overwhelm CAPI when querying for existing bindings. This component acts as a proxy for the CAPI binding query.

  • Firehose: The Firehose is a WebSocket endpoint that streams all the event data from an Ops Manager deployment. The data stream includes HTTP events, app logs, container metrics from apps, and metrics from Ops Manager platform components. The Firehose cf CLI plugin allows you to view the output of the Firehose. For more information about the Firehose plugin, see Installing the Loggregator Firehose Plugin for cf CLI.

  • Log Cache:

    Log Cache allows you to view logs and metrics over a specified period of time. The Log Cache includes API endpoints and a CLI plugin to query and filter logs and metrics. To download the Log Cache CLI plugin, see Cloud Foundry Plugins. The Log Cache API endpoints are available by default. For more information about using the Log Cache API, see Log Cache on GitHub.

  • rsyslog: While not part of the Firehose itself, rsyslog is responsible for delivering logs from platform components to outside consumers.
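
The Log Cache entry above mentions API endpoints for querying logs and metrics over a period of time. The following Python sketch builds such a query URL. The `/api/v1/read/<source-id>` path and the `start_time`, `end_time`, `limit`, and `envelope_types` parameters reflect the read endpoint as described in the Log Cache README on GitHub and should be treated as assumptions to verify against your version. The function name and the host in the usage are hypothetical.

```python
from urllib.parse import quote, urlencode


def log_cache_read_url(base_url, source_id, start_ns, end_ns,
                       envelope_types=("LOG",), limit=100):
    """Build a Log Cache read URL for one source over a time window.

    start_ns and end_ns are assumed to be UNIX timestamps in nanoseconds.
    """
    params = [("start_time", start_ns), ("end_time", end_ns), ("limit", limit)]
    # Assumption: the filter is repeated once per envelope type.
    params += [("envelope_types", kind) for kind in envelope_types]
    path = f"/api/v1/read/{quote(source_id, safe='')}"
    return f"{base_url.rstrip('/')}{path}?{urlencode(params)}"
```

A request to the resulting URL would also need an authorization token, which this sketch leaves out.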

Shared-Nothing Architecture

This section describes the shared-nothing architecture for transporting logs and metrics from your deployment.

Similar to the Loggregator Firehose Architecture, this shared-nothing architecture allows you to forward logs and metrics from your deployment to external and internal consumers.

In contrast to the Loggregator Firehose architecture, the shared-nothing architecture allows logs and metrics to pass through fewer components during egress. For example, the shared-nothing architecture does not require the Forwarder Agent, Syslog Agent, or Metrics Agent.

You can use this diagram to:

  • Understand the components that transport logs and metrics to the Firehose.
  • Diagnose performance issues related to logging and metrics.
  • Make decisions about how to best scale the Loggregator components or Firehose consumers described in this architecture.

A Forwarder Agent appears in squares that depict 'Diego Cell VMs' and 'Component VMs'. Within the Diego Cell VMs, applications, Diego Rep and Prom Scraper send logs and metrics to the Forwarder Agent. Within the Component VMs, Diego, Prom Scraper and Statsd Injector send metrics to the Forwarder Agent. The Forwarder Agent sends to both Syslog Agents and Metrics Agents on the same VM. Syslog Agents send logs to Syslog Drain Consumers and logs/metrics to the Log Cache Syslog Server via Syslog RFC 5424. Metrics Agents expose metrics via Prometheus exposition which can be scraped by Metric Scrapers. Metrics Discovery Registrars on the same VM register the Metrics Agents' endpoints with NATS for discovery by Metric Scrapers. Cloud Controller consumes app metrics from Log Cache, and developer tools consume logs and metrics via the Log Cache CF Auth Proxy. For more descriptions of all of the components in Loggregator, see the list of components below.


The following components are included in the Shared-Nothing Architecture, as shown in the diagram above:

  • Prom Scraper: Prom Scrapers run on both Ops Manager component VMs and Diego Cell VMs. They aggregate metrics from Ops Manager components located on those VMs via Prometheus exposition. Prom Scrapers then forward those metrics to Forwarder Agents.

  • Statsd Injector: Statsd Injectors run on Ops Manager component VMs. They receive metrics from Ops Manager components over the Statsd protocol. Statsd Injectors then forward those metrics to Forwarder Agents.

  • Forwarder Agent: Forwarder Agents run on both Ops Manager component VMs and Diego Cell VMs. They receive logs and metrics from the applications and Ops Manager components located on those VMs. Forwarder Agents then forward the logs and metrics to Loggregator Agents and other agents.

  • Syslog Agent:

    Syslog Agents run on Ops Manager component VMs and host VMs to collect and forward logs to configured syslog drains. This includes syslog drains for individual apps as well as aggregate drains for all apps in your foundation. You can specify the destination for logs as part of the individual syslog drain or in the TAS for VMs tile.

  • Syslog Binding Cache (not pictured): Syslog Agents can overwhelm CAPI when querying for existing bindings. This component acts as a caching proxy between the Syslog Agents and CAPI.

  • Metrics Agents: Metrics Agents receive metrics from Forwarder Agents, and make them available to Metric Scrapers via Prometheus exposition.

  • Metrics Discovery Registrars: Metrics Discovery Registrars register each scrapeable endpoint with NATS, for discovery by Metric Scrapers.

  • Log Cache:

    Log Cache allows you to view logs and metrics over a specified period of time. The Log Cache includes API endpoints and a CLI plugin to query and filter logs and metrics. To download the Log Cache CLI plugin, see Cloud Foundry Plugins. The Log Cache API endpoints are available by default. For more information about using the Log Cache API, see Log Cache on GitHub.

  • Log Cache Syslog Server: The Log Cache Syslog Server takes the place of the Log Cache Nozzle (only one will be active at a time). It receives logs and metrics from Syslog Agents and sends them to Log Cache.
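
The Statsd Injector entry above refers to the Statsd protocol. As a rough illustration of what a metric looks like on that wire, the following Python sketch parses one line in the common `name:value|type|@sample_rate` StatsD format. It is not the Statsd Injector's implementation: `parse_statsd` is a hypothetical name, the metric names in the usage are made up, and relative gauge updates such as `+5` are not treated specially.

```python
def parse_statsd(line):
    """Parse one StatsD line into a dictionary.

    Example input: "router.requests:1|c|@0.5"
    """
    name, _, rest = line.strip().partition(":")
    parts = rest.split("|")
    if not name or len(parts) < 2:
        raise ValueError(f"not a StatsD metric line: {line!r}")
    sample_rate = 1.0
    for extra in parts[2:]:
        # An optional "@<rate>" section carries the sample rate.
        if extra.startswith("@"):
            sample_rate = float(extra[1:])
    return {
        "name": name,
        "value": float(parts[0]),
        "type": parts[1],  # for example "c" (counter), "g" (gauge), "ms" (timer)
        "sample_rate": sample_rate,
    }
```

Calling `parse_statsd("memory.used:512|g")` returns a gauge named `memory.used` with a value of 512.0 and the default sample rate of 1.0.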

System Metrics Agents Architecture

The following diagram shows the architecture of a Loggregator deployment that uses System Metrics Agents to collect VM and system-level metrics.

System Metrics Agents appear in squares that depict 'Host VMs'. Represented by arrows, the System Metrics Agents send VM system-level metrics to a System Metrics Scraper, which forwards these metrics to Loggregator Agents over mTLS. For more descriptions of all of the components in Loggregator, see the 'Loggregator Components' section below.


The following describes the components of a Loggregator deployment that uses System Metrics Agents, as shown in the diagram above:

  • Loggregator Agent: Loggregator Agents run on both Ops Manager component VMs and Diego Cell VMs. They receive logs and metrics from the apps and Ops Manager components located on those VMs. Loggregator Agents then forward the logs and metrics to Dopplers.

  • System Metrics Agent: The System Metrics Agent is a standalone agent that provides VM system metrics through a Prometheus-scrapeable endpoint.

  • System Metrics Scraper: The System Metrics Scraper forwards metrics from System Metrics Agents to Loggregator Agents over mTLS.
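
To show what mTLS means from a client's point of view, the following Python sketch builds a TLS context that verifies the server and, when a client certificate is supplied, presents it so that the server can verify the client in turn. This is a generic illustration using the Python standard library, not the System Metrics Scraper's code. The function names, file paths, and URL are hypothetical.

```python
import ssl
import urllib.request


def build_mtls_context(ca_file=None, cert_file=None, key_file=None):
    """Create a client-side TLS context suitable for mutual TLS.

    ca_file:   CA bundle used to verify the server (system defaults if None).
    cert_file: client certificate presented to the server.
    key_file:  private key for the client certificate.
    """
    context = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=ca_file)
    if cert_file:
        # Presenting a client certificate is what makes the TLS mutual.
        context.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return context


def scrape(url, context, timeout=5.0):
    """Fetch the body of a metrics endpoint over the given TLS context."""
    with urllib.request.urlopen(url, context=context, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

A caller would pass its own CA, certificate, and key paths to `build_mtls_context` and then call `scrape` with the endpoint URL.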

This section describes the Ops Manager components that forward BOSH-reported VM metrics to Loggregator. BOSH-reported VM metrics measure the health of BOSH-deployed VMs on which apps and Ops Manager components are deployed. Loggregator streams BOSH-reported VM metrics through the Firehose.

The following are the Ops Manager components that send BOSH-reported VM metrics to Loggregator:

  • BOSH Agent: BOSH Agents are located on component VMs and Diego Cell VMs. They collect metrics, such as Diego Cell capacity remaining, from the VM and forward them to the BOSH Health Monitor.

  • BOSH Health Monitor: The BOSH Health Monitor receives metrics from the BOSH Agents. It then forwards the metrics to a third-party service or to the BOSH System Metrics Forwarder.

  • BOSH System Metrics Plugin: This plugin reads health events, such as VM heartbeats and alerts, from the BOSH Health Monitor and streams them to the BOSH System Metrics Server.

  • BOSH System Metrics Server: The BOSH System Metrics Server streams metrics and heartbeat events to the BOSH System Metrics Forwarder over gRPC.

  • BOSH System Metrics Forwarder: The BOSH System Metrics Forwarder is located on the Loggregator Traffic Controller. It forwards heartbeat events from the BOSH System Metrics Server as envelopes to Loggregator through a Loggregator Agent.