Backing Up and Restoring Service Instances
Both disaster recovery and system validation depend upon backups. A PCC service instance backup consists of the configuration for a PCC service instance together with all region data that has been persisted to disk. A PCC service instance backup may be used to restore the service on a new, but unconfigured PCC service instance.
Before You Begin
WARNING: Consider each of these important items when preparing to backup and restore a PCC service instance. Since recovery depends on a correct implemention, address each item during planning.
- The backup consists of data from regions that have been persisted to disk. All non-persistent region data will be lost. A persistent region is one that writes its region entries to disk.
- The backup and restoration process does not restore WAN replication configuration. A PCC service instance that replicates data via WAN may be backed up. A restore of that PCC service instance will contain all the persistent data. That PCC service instance will be able to communicate with other WAN-connected service instances after further configuration.
- The disk size on the jumpbox VM must be large enough to hold the
backup artifacts. The quantity of disk space needs to exceed
the sum total of all disk space used within the PCC service instance to be
backed up.
An approximation of that quantity of disk space may be calculated
by looking at the fully populated PCC service instance.
Sum the disk space (in
/var/vcap/store
) used by each locator and server. - The PCC service instance needs to be quiescent when the backup is taken. An active cluster could have disk writes in progress, leading to a backup for which region data may or may not have been written to disk.
- The fresh PCC service instance that is to be used for a restore must have same quantity of locators and servers as the PCC service instance had when its backup was taken. The fresh PCC service instance must have sufficient resources to hold the data that is in the backup.
- Do not configure the fresh PCC service instance that is to be used for a restore after creation. It must be empty. Do not create any regions or start any gateway senders. The restore will create all regions, both persistent and not persistent. The non-persistent regions will be empty.
- Service keys are not part of the backup. Service keys will need to be created anew on a restored service instance.
Backing Up a PCC Service Instance
Optional: Back up the Pivotal Cloud Foundry environment. For instructions, see Backing Up Pivotal Cloud Foundry with BBR.
Optional: Compact the disk stores of the cluster to be backed up. Use the gfsh
compact disk-store
command after connecting to the cluster with a role that is authorized with theCLUSTER:MANAGE
operation permission. See the documentation for gfshcompact disk-store
.Optional: Acquire a region statistic that may be used for a minimal validation upon using this backup for subsequent restoration. Use the gfsh
describe region
command on each persistent region, after connecting to the cluster with a role that is authorized with theCLUSTER:READ
operation permission.describe region --name=REGION-NAME
For each persistent region (the
data-policy
will contain the stringPERSISTENT
), record thesize
of region. This size is the quantity of entries in the region. Assuming that activity on the region is quiescent, when this backup is used in a restoration, the quantity of region entries in the restored region forms a minimal validation of the region.Ssh to the jumpbox. Assuming that the Ops Manager VM is being used as the jumpbox, follow directions at Log in to the Ops Manager VM with SSH.
Determine all needed credentials and parameters for the command that makes the backup:
BOSH-DIRECTOR-IP
: Obtain the BOSH Director IP address by following the instructions at Step 4:Retrieve BOSH Director Address.-
SERVICE-INSTANCE-DEPLOYMENT-NAME
: Obtain the deployment name by following the instructions at Acquire the Deployment Name. -
BOSH-CLIENT
,BOSH-CLIENT-PASSWORD
, andPATH-TO-BOSH-SERVER-CERTIFICATE
: To learn all three of these values, open the Credentials tab of the BOSH Director tile in Ops Manager. Find and openBosh Commandline Credentials
. The path is the path to the root Certificate Authority (CA) certificate, and its value will be/var/tempest/workspaces/default/root_ca_certificate
if Ops Manager and the jumpbox are on the same VM.
Make the backup. From the jumpbox, issue this command, substituting values acquired in the previous step:
BOSH_CLIENT_SECRET=BOSH-CLIENT-PASSWORD \ bbr deployment \ --target BOSH-DIRECTOR-IP \ --username BOSH-CLIENT \ --ca-cert PATH-TO-BOSH-SERVER-CERTIFICATE \ --deployment SERVICE-INSTANCE-DEPLOYMENT-NAME \ backup
BOSH_CLIENT_SECRET
is an environment variable that is set only for the duration of the command.The backup will be within a directory on the jumpbox named by the depoyment name and a timestamp.
Copy the backup to a permanent home for archiving. Pivotal recommends compressing and encrypting the files. Disaster recovery plans often recommend archiving multiple copies of each backup.
Restoring a PCC Service Instance
Use SSH to connect to the jumpbox. Assuming that the Ops Manager VM is being used as the jumpbox, follow directions at Log in to the Ops Manager VM with SSH.
Transfer the archived backup to the jumpbox. Expand if compressed, and decrypt if encrypted.
Make a new, but empty service instance. See directions in Creating a Pivotal Cloud Cache Service Instance. If the service instance to be restored was connected to a WAN, be sure to create the new instance with the same
distributed_system_id
as the old one.Determine all needed credentials and parameters for the command that does the restore by following step 5 instructions from making the backup.
Do the restore. From the jumpbox, issue this command, substituting values acquired in the previous step:
$ BOSH_CLIENT_SECRET=BOSH-CLIENT-PASSWORD \ bbr deployment \ --target BOSH-DIRECTOR-IP \ --username BOSH-CLIENT \ --ca-cert PATH-TO-BOSH-SERVER-CERTIFICATE \ --deployment SERVICE-INSTANCE-DEPLOYMENT-NAME \ restore --artifact-path PATH-TO-SERVICE-INSTANCE-ARTIFACT
PATH-TO-SERVICE-INSTANCE-ARTIFACT
is the path to the artifact for the instance that you are currently restoring.Create a new service key, as the service key was not an artifact that the backup created. For a stand-alone service instance (one that is not part of a WAN), follow the directions at Create a Service Key. For a WAN-connected service instance, see Restoring a WAN Connection.
If the region size was captured prior to making the backup, in order to do a minimal validation of the restored cluster, use the gfsh
describe region
command on each persistent region, after connecting to the cluster with a role that is authorized with theCLUSTER:READ
operation permission.describe region --name=REGION-NAME
For each persistent region, the
size
of the restored region should be the same as the value captured when the backup was made.Rebind or restage all apps.