Pivotal Cloud Cache for PCF v1.0

Pivotal Cloud Cache Operator Guide

This document describes how a Pivotal Cloud Foundry (PCF) operator can install, configure, and maintain Pivotal Cloud Cache (PCC) for PCF.

Requirements for Pivotal Cloud Cache

Minimum Version Requirements

PCC requires PCF with PCF Elastic Runtime v1.9.11 or later.

Installing Pivotal Cloud Cache

Follow the steps below to install PCC on PCF:

  1. Download the tile from the Pivotal Network.
  2. Upload the product to Ops Manager.
  3. Click Add next to the uploaded product description.
  4. Click on the Cloud Cache Tile and select configuration options.
  5. Click Apply Changes.
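
If you prefer to script these steps, the om CLI offers equivalent commands. The sketch below is an assumption-based example: it assumes the tile's product name is p-cloudcache and uses placeholder credentials, version, and file path; adjust all of these for your environment.

om --target https://OPSMAN-FQDN --username admin --password PASSWORD upload-product --product /tmp/p-cloudcache-1.0.0.pivotal
om --target https://OPSMAN-FQDN --username admin --password PASSWORD stage-product --product-name p-cloudcache --product-version 1.0.0
# Configure the tile in the Ops Manager UI as described below, then trigger the deploy:
om --target https://OPSMAN-FQDN --username admin --password PASSWORD apply-changes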

Configuring Tile Properties

Assign Availability Zones and Networks

  1. Select an Availability Zone (AZ) where your singleton virtual machines (VMs) will reside.
  2. Select the AZs used to distribute the GemFire VMs. We recommend selecting all of them.
  3. For Network, select your Elastic Runtime network.
  4. For Service Network, select the network to be used for GemFire VMs.

Settings: Smoke Tests

The Plan to use for smoke test field allows you to configure the smoke-tests errand that runs after tile installation to verify successful installation.

Select a plan from the list for the smoke-tests errand to use. Ensure that the selected plan is enabled on its plan-configuration page; if the selected plan is not enabled, the smoke-tests errand fails.

Pivotal recommends that you use the smallest four-server plan for smoke tests. Because the smoke tests create and later destroy a service instance of this plan, using a very small plan reduces installation time.

Configuring Smoke Tests Plan

By default, the smoke-tests errand runs in the system org and the p-cloudcache-smoke-test space.
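
If you need to inspect the smoke-test environment, for example after a failed run, you can target that org and space with the cf CLI. The names below assume the defaults described above.

cf target -o system -s p-cloudcache-smoke-test
cf services   # lists any service instance left behind by a failed smoke-tests run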

Settings: Allow outbound internet access from service instances

The Allow outbound internet access from service instances checkbox is unchecked by default.

Depending on your IaaS, you may need to check Allow outbound internet access from service instances to allow outbound internet traffic; see the Ops Manager documentation for details. Checking this box may also be required to configure an external syslog endpoint.

If BOSH is configured to use an external blob store, configure the PCC tile to allow external internet access by checking Allow outbound internet access from service instances.

Outbound Internet Access

Settings: External Syslog

Pivotal Cloud Cache supports forwarding syslog to an external log management service (for example, Papertrail, Splunk, or your custom enterprise log sink). To enable remote syslog for the service broker, set the “Syslog Host” and “Syslog Port” properties on the “Settings” tab.
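
Before applying the configuration, you can optionally confirm that the syslog endpoint is reachable from your network. The host and port below are placeholders for the values you enter in the Settings tab; this is a sanity check, not part of the tile configuration.

nc -vz logs.example.com 514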

Configuring Log Management Service

The broker logs are useful for debugging problems creating, updating, and binding service instances.

By default, remote syslog sends unencrypted logs. You can enable TLS by selecting the “Enable TLS” radio button. You may need to provide the CA certificate of the log management service endpoint if the server certificate is not signed by a known authority (for example, in the case of an internal syslog server). If several peer servers may respond to remote syslog connections, provide a regex in the “permitted peer” field (for example, *.example.com).
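
If you need to supply the CA certificate, one way to inspect the certificate chain presented by the syslog endpoint is with openssl. The host and port below are placeholders; TLS syslog commonly listens on port 6514, but use whichever port your endpoint exposes.

openssl s_client -connect logs.example.com:6514 -showcerts </dev/null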

Configuring Log Management Service TLS

By default, only the broker logs are forwarded to your configured log management service. If you would like to forward server and locator logs from all service instances, check the “Send service instance logs to external syslog” checkbox.

Configuring Log Management Service for Service Instances

You may want to enable remote syslog for service instances if you would like to monitor the health of the clusters. However, this generates a large volume of logs, which is why it is disabled by default. The broker logs only include information about service instance creation, not ongoing cluster health. Note that service instance logs are sent to the same host and port configured for the broker logs.

Configuring Service Plans

You can configure five individual plans for your developers. Select the Plan 1 through Plan 5 tabs to configure each of them.

Configuring a Plan

The Enable Plan toggle is checked by default. If you do not want to add this plan to the CF service catalog, select Disable Plan. You must enable at least one plan.

The CF Service Access dropdown allows you to configure the plan’s visibility in the CF Marketplace:

  • Enable Service Access displays the service plan to all developers in the CF Marketplace.
  • Disable Service Access does not display the service plan to developers in the CF Marketplace, and access cannot be enabled at a later time.
  • Leave Service Access Unchanged does not display the service plan in the CF Marketplace by default, but access can be enabled at a later time.

The Plan Name text field allows you to customize the name of the plan. This plan name is displayed to developers when they view the service in the Marketplace.

The Plan Description text field allows you to supply a plan description. The description is displayed to developers when they view the service in the Marketplace.

The Service Instance Quota sets the maximum number of PCC clusters that can exist simultaneously.

When developers create or update a service instance, they can specify the number of servers in the cluster. The Maximum servers per cluster field allows operators to set an upper bound on the number of servers developers can request. If developers do not explicitly specify the number of servers in a service instance, a new cluster has the number of servers specified in the Default Number of Servers field.
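
For example, a developer can pass the optional num_servers parameter when creating an instance; the value must not exceed the Maximum servers per cluster configured for the plan. The service, plan, and instance names below are illustrative.

cf create-service p-cloudcache dev-plan my-cloudcache -c '{"num_servers": 6}'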

The Availability zones for service instances setting determines which AZs are used for a particular cluster. The members of a cluster are distributed evenly across AZs.

WARNING: Do not add AZs after you have selected the AZs for your service network; adding AZs later causes existing service instances to lose data on update.

The remaining fields control the VM type and persistent disk type for servers and locators. The total size of the cache is directly related to the number of servers and the amount of memory of the selected server VM type. We recommend the following configuration:

  • For the VM type for the Locator VMs field, select a VM that has at least 1 GB of RAM and 4 GB of disk space.
  • For the Persistent disk type for the Locator VMs field, select 10 GB or higher.
  • For the VM type for the Server VMs field, select a VM that has at least 4 GB of RAM and 8 GB of disk space.
  • For the Persistent disk type for the server VMs field, select 10 GB or higher.

When you finish configuring the plan, click Save to save your configuration options.

Stemcell

Ensure you import the correct type of stemcell indicated on this tab.

Upgrading Pivotal Cloud Cache

Follow the steps below to upgrade PCC on PCF:

  1. Download the new version of the tile from the Pivotal Network.
  2. Upload the product to Ops Manager.
  3. Click Add next to the uploaded product.
  4. Click on the Cloud Cache Tile and review your configuration options.
  5. Click Apply Changes.

Updating Pivotal Cloud Cache Plans

Follow the steps below to update plans in Ops Manager.

  1. Click on the Cloud Cache tile.
  2. Click on the plan you want to update under the Information section.
  3. Edit the fields with the changes you want to make to the plan.
  4. Click Save button on the bottom of the page.
  5. Click on the PCF Ops Manager to navigate to the Installation Dashboard.
  6. Click Apply Changes.

Plan changes are not applied to existing services instances until you run the upgrade-all-service-instances BOSH errand. You must use the BOSH CLI to run this errand. Until you run this errand, developers cannot update service instances.
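
A minimal sketch of running the errand with the BOSH v2 CLI follows; the deployment name is a placeholder, and you can list deployments to find the one for the PCC tile.

bosh deployments                                        # find the PCC tile deployment name
bosh -d PCC-DEPLOYMENT-NAME run-errand upgrade-all-service-instances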

Changes to fields that can be overridden by optional parameters, for example num_servers or new_size_percentage, will change the default value of these instance properties, but will not affect existing service instances.

If you change the allowed limits of an optional parameter, for example the maximum number of servers per cluster, existing service instances in violation of the new limits will not be modified.

When existing instances are upgraded, all plan changes will be applied to them.

Uninstalling Pivotal Cloud Cache

To uninstall Pivotal Cloud Cache, follow the steps below from the Installation Dashboard:

  1. Click the trash can icon in the bottom-right-hand corner of the tile.
  2. Click Apply Changes.

Troubleshooting

View Statistics Files

You can visualize the performance of your cluster by downloading the statistics files from your servers. These files are located in the persistent store on each VM. To copy these files to your workstation, run the following command:

bosh scp server/0:/var/vcap/store/gemfire-server/statistics.gfs /tmp
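
With the BOSH v2 CLI, you must also specify the deployment that contains the server VM; the deployment name below is a placeholder.

bosh -d SERVICE-INSTANCE-DEPLOYMENT scp server/0:/var/vcap/store/gemfire-server/statistics.gfs /tmp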

See the Pivotal GemFire Installing and Running VSD topic for information about loading the statistics files into Pivotal GemFire VSD.

Smoke Test Failures

Error: “Creating p-cloudcache SERVICE-NAME failed”

The smoke tests could not create an instance of GemFire. To troubleshoot why the deployment failed, use the cf CLI to create a new service instance using the same plan and download the logs of the service deployment from BOSH.
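
A minimal sequence for this, assuming the marketplace service name is p-cloudcache and using placeholder plan, instance, and deployment names, might look like the following.

cf create-service p-cloudcache SMOKE-TEST-PLAN troubleshooting-instance
cf service troubleshooting-instance        # watch the create operation succeed or fail
bosh deployments                           # identify the deployment created for the instance
bosh -d SERVICE-INSTANCE-DEPLOYMENT logs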

Error: “Deleting SERVICE-NAME failed”

The smoke test attempted to clean up a service instance it created and failed to delete the service using the cf delete-service command. To troubleshoot this issue, run bosh logs to view the logs on the broker or the service instance to see why the deletion may have failed.

Error: Cannot connect to the cluster SERVICE-NAME

The smoke test was unable to connect to the cluster.

To troubleshoot the issue, review the logs of your load balancer, and review the logs of your CF Router to ensure the route to your PCC cluster is properly registered.

You can also create a service instance and try to connect to it using the gfsh CLI. This requires creating a service key.
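
For example, with the cf CLI (the instance and key names below are illustrative), create and read a service key to obtain the connection details that gfsh needs.

cf create-service-key my-cloudcache my-key
cf service-key my-cloudcache my-key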

Error: “Could not perform create/put on Cloud Cache cluster”

The smoke test was unable to write data to the cluster. The user may not have permissions to create a region or write data.

Error: “Could not retrieve value from Cloud Cache cluster”

The smoke test was unable to read back the data it wrote. Data loss can happen if a cluster member improperly stops and starts again or if the member machine crashes and is resurrected by BOSH. Run bosh logs to view the logs on the broker to see if there were any interruptions to the cluster by a service update.

General Connectivity

Client-to-server Communication

PCC clients communicate with PCC servers on port 40404 and with locators on port 55221. Both of these ports must be reachable from the Elastic Runtime network for clients to reach the service network.
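
A simple way to verify that these ports are reachable from the Elastic Runtime network is a TCP check from a VM on that network; the IP address below is a placeholder for a PCC server or locator VM.

nc -vz 10.0.8.4 40404   # server port
nc -vz 10.0.8.4 55221   # locator port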

Membership Port Range

PCC servers and locators communicate with each other using UDP and TCP. The current port range for this communication is 49152-65535.

If you have a firewall between VMs, ensure this port range is open.

BOSH Director and VMs on Different Networks

A deployment fails if the VMs in the service network cannot reach the BOSH Director VM, which may be on a different network. Ensure that VMs in the service network can communicate with the BOSH Director.

If the VMs cannot communicate with the BOSH Director, the following error message appears:

Director task 1257
Started preparing deployment > Preparing deployment. Done (00:00:00)

Started preparing package compilation > Finding packages to compile. Done (00:00:00)

Started creating missing vms
Started creating missing vms > locator/d328ac30-fd48-424f-a28f-5087d94c07fd (0)
Started creating missing vms > locator/c41e4988-257c-47a4-bb97-1828dae15df4 (2)
Started creating missing vms > server/54f3690f-2366-405e-aca0-e0d630753e91 (0)
Started creating missing vms > server/cbbb4739-aae0-472f-9f7f-713d6ac15d07 (3)
Started creating missing vms > locator/36758e47-6f89-4003-968d-55620ca28e8a (1)
Started creating missing vms > server/abc4f156-1403-4187-a1f7-35f1ff4d961c (1)
Started creating missing vms > server/5a9c74e8-881e-4f49-99e3-0a708b2b583c (2). Failed: Timed out pinging to 37ebdd5e-4fe8-49cd-8a0f-cc72837a361c after 600 seconds (00:10:35)
Failed creating missing vms > server/cbbb4739-aae0-472f-9f7f-713d6ac15d07 (3): Timed out pinging to db1d846b-a2cd-4c49-942e-3cf1b48edac1 after 600 seconds (00:10:37)
Failed creating missing vms > locator/d328ac30-fd48-424f-a28f-5087d94c07fd (0): Timed out pinging to 578a74a3-5511-4263-adbd-5028c6b7d1ab after 600 seconds (00:10:38)
Failed creating missing vms > server/54f3690f-2366-405e-aca0-e0d630753e91 (0): Timed out pinging to 73aec92f-b563-4f50-b22a-283072455b6e after 600 seconds (00:10:39)
Failed creating missing vms > locator/36758e47-6f89-4003-968d-55620ca28e8a (1): Timed out pinging to 634738c2-2338-4d32-973a-63f91dcfd922 after 600 seconds (00:10:39)
Failed creating missing vms > locator/c41e4988-257c-47a4-bb97-1828dae15df4 (2): Timed out pinging to 35b63dff-a7ff-44e2-984d-481dd9d1337b after 600 seconds (00:10:41)
Failed creating missing vms > server/abc4f156-1403-4187-a1f7-35f1ff4d961c (1): Timed out pinging to e6a59079-974b-4c05-8ce3-d250b582a571 after 600 seconds (00:10:54)
Failed creating missing vms (00:10:54)

Error 450002: Timed out pinging to 37ebdd5e-4fe8-49cd-8a0f-cc72837a361c after 600 seconds

Task 1257 error