Logs, Metrics, and Nozzles
Warning: Pivotal Operations Manager v2.4 is no longer supported because it has reached the End of General Support (EOGS) phase as defined by the Support Lifecycle Policy. To stay up to date with the latest software and security updates, upgrade to a supported version.
This topic explains how to integrate PCF services with Cloud Foundry’s logging system, Loggregator, by writing to and reading from its Firehose endpoint.
Overview
Cloud Foundry’s Loggregator logging system collects logs and metrics from PCF apps and platform components and streams them to a single endpoint, Firehose. Your tile can integrate its service with Loggregator in two ways:
By sending your service component logs and metrics to the Firehose, to be streamed along with PCF core platform component logs and metrics
By installing a nozzle on Firehose that directs Firehose data to be consumed by external services or apps – a built-in nozzle can enable a service to:
- Drain metrics to an external dashboard product for system operators
- Send HTTP request details to search or analysis tools
- Drain app logs to an external system
- Auto-scale itself based on Firehose metrics, as detailed in this YouTube video
For a real-world production example of a nozzle, see Firehose-to-syslog in GitHub.
Firehose Communication
PCF components publish logs and metrics to the Firehose through Loggregator agent processes that run locally on the component VMs. Each component hands its data to the agent co-located on its VM, and the agent forwards that data into the Loggregator system. To see how logs and metrics travel from PCF system components to the Firehose, see the Cloud Foundry documentation.
Component VMs running PCF services can publish logs and metrics the same way, by including the same component, Loggregator Agent. Historically, components used Metron for this communication.
HTTPS Protocol
To enable a service component to supply logs and metrics to the Firehose through encrypted communications, do the following:
Include a Loggregator agent in the service component’s template definitions.
For example:
name: service
label: Service
templates:
- name: service
  release: service
  manifest: |
- name: bpm
  release: bpm
  properties: {}
- name: loggregator_agent
  release: loggregator-agent
  consumes:
    doppler:
      deployment: cf-e8e79eaed2a50130f206
  properties:
    deployment: generator
    loggregator:
      tls:
        ca_cert: (( $ops_manager.ca_certificate ))
        agent:
          cert: ((CERTIFICATE))
          key: ((KEY))
Where CERTIFICATE and KEY are the values used for mutual TLS communication. For example, .properties.agent_certificate.cert_pem and .properties.agent_certificate.private_key_pem.
Have the Ops Manager CA generate and sign the certificate needed for mutual TLS communication. Do so with the following properties:
- name: agent_certificate
  type: rsa_cert_credentials
  label: Agent Security Certificate
  configurable: false
  default:
    domains:
    - agent.(( ..cf.cloud_controller.system_domain.value ))
  description: mTLS Certificate for Agent
Nozzles
A nozzle is a component dedicated to reading and processing data that streams from Firehose.
A service tile can install a nozzle as either a managed service, with package type bosh-release, or as an app pushed to Pivotal Application Service (PAS), with the package type app.
Develop a Nozzle
Pivotal recommends developing a nozzle in Go to leverage the NOAA library. NOAA does the heavy lifting of establishing an authenticated WebSocket connection to the logging system and deserializing the protocol buffers.
Draining the logs consists of:
- Authenticating
- Establishing a connection to the logging system
- Forwarding events on to their ultimate destination
Authenticate against the API, using go-cfclient (https://github.com/cloudfoundry-community/go-cfclient), with a user in the doppler.firehose group:
import "github.com/cloudfoundry-community/go-cfclient"
...
// The credentials must belong to a user in the doppler.firehose group.
config := &cfclient.Config{
	ApiAddress:        apiUrl,
	Username:          username,
	Password:          password,
	SkipSslValidation: sslSkipVerify,
}
client, err := cfclient.NewClient(config)
// Check err before using client.
Using the client’s token, create a consumer and connect to Firehose with a subscription ID. The ID is important because Firehose looks for connections with the same ID and sends each event to only one of those connections. A nozzle developer can therefore run two or more instances to prevent message loss during upgrades and other deployments.
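token, err := client.GetToken()
// Check err before using token.
// config below is assumed to be the nozzle's own settings, not the cfclient.Config above.
// The variable is named noaaConsumer so that it does not shadow the consumer package.
noaaConsumer := consumer.New(config.TrafficControllerURL, &tls.Config{
	InsecureSkipVerify: config.SkipSSL,
}, nil)
eventsChan, errsChan := noaaConsumer.Firehose(firehoseSubscriptionID, token)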
Firehose gives back two channels, one for events and one for errors.
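As a minimal sketch of how a nozzle might consume those two channels, the loop below selects over both until the event channel closes. It uses only the standard library: the Envelope type, the drain function, and errSimulated are hypothetical stand-ins for illustration and are not part of NOAA.

```go
package main

import (
	"errors"
	"fmt"
)

// Envelope is a hypothetical stand-in for a Firehose event envelope.
type Envelope struct {
	EventType string
	Origin    string
}

// errSimulated is a made-up error used only to exercise the loop.
var errSimulated = errors.New("simulated connection drop")

// drain reads both channels until the event channel closes. It calls handle
// for every event, logs every error, and returns the number of errors seen.
func drain(events <-chan Envelope, errs <-chan error, handle func(Envelope)) int {
	errCount := 0
	for {
		select {
		case ev, ok := <-events:
			if !ok {
				return errCount
			}
			handle(ev)
		case err, ok := <-errs:
			if !ok {
				// Receiving from a nil channel blocks forever, so this
				// case is never selected again.
				errs = nil
				continue
			}
			errCount++
			fmt.Println("firehose error:", err)
		}
	}
}

func main() {
	events := make(chan Envelope)
	errs := make(chan error)

	// Simulated producer. The channels are unbuffered, so each send
	// completes only once drain has received the value.
	go func() {
		events <- Envelope{EventType: "LogMessage", Origin: "app"}
		errs <- errSimulated
		events <- Envelope{EventType: "ValueMetric", Origin: "rep"}
		close(events)
	}()

	n := drain(events, errs, func(ev Envelope) {
		fmt.Println("event:", ev.EventType, "from", ev.Origin)
	})
	fmt.Println("errors seen:", n)
}
```

In a real nozzle, the handle callback is where events would be forwarded to their destination, and an error would more likely trigger a reconnect than just a log line.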
The events channel receives the following six types of events.
- ValueMetric represents some platform metric at a point in time, emitted by platform components. For example, a Diego cell’s remaining memory capacity.
- CounterEvent represents an incrementing counter, emitted by platform components. For example, how many 2xx responses the router has sent out.
- Error represents an error in the originating process.
- HttpStartStop represents HTTP request details, including both app and platform requests.
- LogMessage represents a log message for an individual app.
- ContainerMetric represents application container information. For example, memory used.
For the full details on events, see dropsonde protocol in GitHub.
The above events show how this data targets two different personae: platform operators and app developers. Keep this in mind when designing an integration.
A nozzle with the doppler.firehose scope receives data for every app as well as the platform.
Any filtering based on the event payload is the nozzle implementor’s responsibility.
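Because filtering is left to the nozzle, one possible shape for it is a small predicate applied to each event before forwarding. The sketch below is an assumption-laden illustration: the Envelope and Filter types, their field names, and the sample GUIDs are hypothetical, and real Firehose envelopes carry more fields.

```go
package main

import "fmt"

// Envelope is a hypothetical, simplified event.
type Envelope struct {
	EventType string // for example "LogMessage" or "ValueMetric"
	AppGUID   string // empty for platform-level events
}

// Filter decides which events a nozzle forwards.
type Filter struct {
	EventTypes map[string]bool // event types to keep
	AppGUIDs   map[string]bool // apps to keep; nil means any app
}

// Keep reports whether ev passes the filter.
func (f Filter) Keep(ev Envelope) bool {
	if !f.EventTypes[ev.EventType] {
		return false
	}
	if f.AppGUIDs == nil {
		return true
	}
	return f.AppGUIDs[ev.AppGUID]
}

func main() {
	f := Filter{
		EventTypes: map[string]bool{"LogMessage": true},
		AppGUIDs:   map[string]bool{"app-1": true},
	}
	sample := []Envelope{
		{EventType: "LogMessage", AppGUID: "app-1"},
		{EventType: "LogMessage", AppGUID: "app-2"},
		{EventType: "ValueMetric"},
	}
	for _, ev := range sample {
		fmt.Printf("%+v keep=%v\n", ev, f.Keep(ev))
	}
}
```

Keeping the filter as plain data makes it straightforward to rebuild at runtime when the set of apps to forward changes.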
An advanced integration could combine a service broker with a nozzle to:
- Let app developers opt in to logging (implementing filtering in the nozzle)
- Establish SSO exchange for authentication so that developers can access only the logs for their space’s apps
For a full working example (suitable as an integration starting point), see firehose-nozzle.
Deploy a Nozzle
Once you have built a nozzle, you can deploy it as a managed service or as an app.
Visit managed service for more details on what it means to be a managed service. See also this example nozzle BOSH release.
You can also deploy the nozzle as an app on PAS. Visit the Tile Generator’s section on pushed apps for more details.
Example Nozzles
There are several open source examples you could use as a reference for building your nozzle.
firehose-nozzle simply writes to standard out. It is a useful starting point as scaffolding, tests, and more are already in place.
example-nozzle is a single-file implementation with no tests.
gcp-tools-release drains component syslogs and health data in addition to nozzle data. It shows how to work with a BOSH add-on for additional data outside a nozzle. The nozzle is managed through BOSH. Raw logs and metrics data take different paths in the source.
firehose-to-syslog includes implementation code that adds additional metadata, which might be needed for the access control list (ACL): app name, space UUID and name, and org UUID and name.
logsearch-for-cloudfoundry packages this nozzle as a BOSH release.
splunk-firehose-nozzle has source code based on firehose-to-syslog and is packaged to run as an app on PCF.
datadog-firehose-nozzle is another real-world implementation.
Log Format for PCF Components
Pivotal’s standard log format adheres to the RFC-5424 syslog protocol, with log messages formatted as follows:
<${PRI}>${VERSION} ${TIMESTAMP} ${HOST_IP} ${APP_NAME} ${PROD_ID} ${MSG_ID} ${SD-ELEMENT-instance} ${MESSAGE}
The Syslog Message Elements table immediately below describes each element of the log, and the Structured Instance Data Format table describes the contents of the structured data element that carries Cloud Foundry VM instance information.
Syslog Message Elements
This table describes each element of a standard PCF syslog message.
Syslog Message Element | Meaning or Value |
---|---|
${PRI} | Priority value (PRI), calculated as 8 × Facility Code + Severity Code. Pivotal uses a Facility Code value of 1. If in doubt, default to 13. |
${VERSION} | 1 |
${TIMESTAMP} | The timestamp of when the log message is forwarded; typically slightly after it was generated. |
${HOST_IP} | Internal IP address of origin server |
${APP_NAME} | Process name of the program that generated the message. Prefixed with … You can derive this process name from either the program name configured for the local Metron agent or the … |
${PROD_ID} | The Process ID of the syslog process doing the forwarding. If this is not easily available, default to - (hyphen) to indicate unknown. |
${MSG_ID} | The type of log message. If this is not easily available, default to - (hyphen) to indicate unknown. |
${SD-ELEMENT-instance} | Structured data (SD) relevant to PCF about the source instance (VM) that originates the log message. See the Structured Instance Data Format table below for content and format. |
${MESSAGE} | The log message itself, ideally in JSON |
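To make the element order concrete, here is a minimal sketch that assembles one line in the format above. The Message type, its field names, the orDash helper, and the sample values are all hypothetical, and the structured data element is treated as a pre-rendered string. The timestamp layout assumes RFC 5424's limit of six fractional-second digits.

```go
package main

import (
	"fmt"
	"time"
)

// rfc5424Time is a Go layout for an RFC 5424 timestamp with microseconds.
const rfc5424Time = "2006-01-02T15:04:05.000000Z07:00"

// Message holds the elements of the log format (illustrative names only).
type Message struct {
	PRI        int
	Timestamp  time.Time
	HostIP     string
	AppName    string
	ProcessID  string // ${PROD_ID}
	MsgID      string // ${MSG_ID}
	SDInstance string // ${SD-ELEMENT-instance}, already rendered
	Body       string // ${MESSAGE}
}

// orDash substitutes "-" for an empty element to indicate unknown.
func orDash(s string) string {
	if s == "" {
		return "-"
	}
	return s
}

// Format renders m in the documented element order, with VERSION fixed at 1.
func (m Message) Format() string {
	return fmt.Sprintf("<%d>1 %s %s %s %s %s %s %s",
		m.PRI,
		m.Timestamp.UTC().Format(rfc5424Time),
		orDash(m.HostIP),
		orDash(m.AppName),
		orDash(m.ProcessID),
		orDash(m.MsgID),
		orDash(m.SDInstance),
		m.Body,
	)
}

// sample is made-up data for demonstration.
var sample = Message{
	PRI:       13,
	Timestamp: time.Date(2017, time.July, 24, 5, 14, 15, 3000, time.UTC),
	HostIP:    "10.0.0.5",
	AppName:   "example-service",
	Body:      `{"msg":"started"}`,
}

func main() {
	fmt.Println(sample.Format())
}
```

Elements left empty in the struct are rendered as a hyphen, matching the table's guidance for values that are not easily available.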
RFC-5424 Severity Codes
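PCF components generate log messages with the following severity levels. Each code is the RFC-5424 severity (0 through 7) plus 8, the offset that a Facility Code of 1 contributes to the priority value. The most common severity level is 13.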
Severity Code | Meaning |
---|---|
8 | Emergency: system is unusable |
9 | Alert: action must be taken immediately |
10 | Critical: critical conditions |
11 | Error: error conditions |
12 | Warning: warning conditions |
13 | Notice: normal but significant condition |
14 | Informational: informational messages |
15 | Debug: debug-level messages |
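The codes 8 through 15 in this table are consistent with the RFC 5424 rule that the priority value equals the facility multiplied by 8, plus the severity, using facility 1 (user-level messages). The sketch below reproduces the table from that rule. The function and constant names are illustrative only.

```go
package main

import "fmt"

// facilityUser is RFC 5424 facility 1 (user-level messages).
const facilityUser = 1

// priority computes the RFC 5424 PRI value: facility*8 + severity.
// Facility must be 0-23 and severity 0 (Emergency) to 7 (Debug).
func priority(facility, severity int) (int, error) {
	if facility < 0 || facility > 23 {
		return 0, fmt.Errorf("facility %d out of range 0-23", facility)
	}
	if severity < 0 || severity > 7 {
		return 0, fmt.Errorf("severity %d out of range 0-7", severity)
	}
	return facility*8 + severity, nil
}

func main() {
	names := []string{
		"Emergency", "Alert", "Critical", "Error",
		"Warning", "Notice", "Informational", "Debug",
	}
	for severity, name := range names {
		pri, err := priority(facilityUser, severity)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%d | %s\n", pri, name)
	}
}
```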
Structured Instance Data Format
The RFC-5424 syslog protocol includes a structured data element that people can use as they see fit. Pivotal uses this element to carry VM instance information as follows:
SD-ELEMENT-instance element | Meaning |
---|---|
${ENTERPRISE_ID} | Your Enterprise Number, as listed by the Internet Assigned Numbers Authority (IANA) |
${DIRECTOR} | The BOSH director managing the deployment. |
${DEPLOYMENT} | BOSH spec.deployment value |
${INSTANCE_GROUP} | BOSH instance_group, currently spec.job.name |
${AVAILABILITY_ZONE} | BOSH spec.az value |
${ID} | BOSH spec.id value. This is a UUID, not an index. It is necessary because BOSH Availability Zone index values are not always unique or sequential. |
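As a rough illustration of how these values could be combined, the sketch below renders them using the generic RFC 5424 structured data syntax, [SD-ID param="value" ...]. The SD-ID instance and the parameter names director, deployment, group, az, and id are assumptions made for this example and are not confirmed by this page. The sample uses 32473, the enterprise number that RFC 5424 uses in its own examples.

```go
package main

import (
	"fmt"
	"strings"
)

// Instance carries the values from the table above (illustrative type).
type Instance struct {
	EnterpriseID     string
	Director         string
	Deployment       string
	InstanceGroup    string
	AvailabilityZone string
	ID               string
}

// sdEscaper escapes the three characters that RFC 5424 requires to be
// escaped inside a parameter value: backslash, double quote, and ].
var sdEscaper = strings.NewReplacer(`\`, `\\`, `"`, `\"`, `]`, `\]`)

// SDElement renders i as an RFC 5424 structured data element.
// ASSUMPTION: the SD-ID "instance" and the parameter names are hypothetical.
func (i Instance) SDElement() string {
	params := []struct{ name, value string }{
		{"director", i.Director},
		{"deployment", i.Deployment},
		{"group", i.InstanceGroup},
		{"az", i.AvailabilityZone},
		{"id", i.ID},
	}
	var b strings.Builder
	fmt.Fprintf(&b, "[instance@%s", i.EnterpriseID)
	for _, p := range params {
		fmt.Fprintf(&b, ` %s="%s"`, p.name, sdEscaper.Replace(p.value))
	}
	b.WriteString("]")
	return b.String()
}

func main() {
	inst := Instance{
		EnterpriseID:     "32473",
		Director:         "example-director",
		Deployment:       "example-deployment",
		InstanceGroup:    "example-group",
		AvailabilityZone: "z1",
		ID:               "00000000-0000-0000-0000-000000000000",
	}
	fmt.Println(inst.SDElement())
}
```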
Making Sense of Metrics
Monitoring Pivotal Cloud Foundry has a great rundown of the various metrics and how to make them useful.