Restoring Pivotal Cloud Foundry Manually from Backup


Important: If you used BOSH Backup and Restore (BBR) to back up Pivotal Cloud Foundry (PCF), you must use BBR to restore. See the restore instructions here.

This topic describes the procedure for manually restoring your Pivotal Cloud Foundry (PCF) deployment from a backup. To create a backup, see the Backing Up Pivotal Cloud Foundry topic.

To restore a deployment, you must import installation settings, temporarily stop the Cloud Controller, restore the state of each critical backend component from its backup file, and restart the Cloud Controller.

To perform these steps, you need the BOSH manifest to locate your critical backend components. BOSH manifests are automatically downloaded to the Ops Manager virtual machine.

However, if you are using a separate jumpbox, you must manually download the BOSH deployment manifest.

Note: The procedure described in this topic restores a running PCF deployment to the state captured by backup files. This procedure does not deploy PCF. See the Installing PCF Guide for information.

Step 1: Import Installation Settings

Note: Pivotal recommends that you export your installation settings before importing from a backup. See the Export Installation Settings section of the Backing Up Pivotal Cloud Foundry topic for more information.

Importing installation settings imports the settings and assets of an existing PCF installation. Importing an installation overwrites any existing installation, so you must provision a new Ops Manager before you import settings.

Follow the steps below to import installation settings.

  1. Select and follow the instructions below for your IaaS to deploy the new Ops Manager VM.

  2. When redirected to the Welcome to Ops Manager page, select Import Existing Installation.


  3. In the import panel, perform the following tasks:

    • Enter your Decryption Passphrase.
    • Click Choose File and browse to the installation zip file that you exported in the Export Installation Settings section of the Backing Up Pivotal Cloud Foundry topic.


  4. Click Import.

    Note: Some browsers do not provide feedback on the status of the import process, and may appear to hang.

  5. A “Successfully imported installation” message appears upon completion.


  6. Click Apply Changes to redeploy.

Step 2: Restore BOSH Using Ops Manager

Follow the steps below to restore BOSH using Ops Manager.

  1. From the Product Installation Dashboard, click the Ops Manager Director tile.

  2. Make a change to your configuration to trigger a new deployment. For example, you can adjust the number of NTP servers in your deployment. Choose a change in configuration that suits your specific deployment.

  3. SSH into the Ops Manager VM by following the instructions for your IaaS in the SSH into Ops Manager topic.

  4. Delete the bosh-deployments.yml file. Deleting the bosh-deployments.yml file causes Ops Manager to treat the deploy as a new deployment, recreating missing Virtual Machines (VMs), including BOSH. The new deployment ignores existing VMs such as your Pivotal Cloud Foundry deployment.

    $ sudo rm /var/tempest/workspaces/default/deployments/bosh-deployments.yml
    

  5. Rename, move, or delete the bosh-state.json file. Removing this file causes Ops Manager to treat the deploy as a new deployment, recreating missing VMs, including BOSH. The new deployment ignores existing VMs such as your Pivotal Cloud Foundry deployment.

    $ cd /var/tempest/workspaces/default/deployments/
    $ sudo mv bosh-state.json bosh-state.json.old
    

  6. Return to the Product Installation Dashboard and click Apply Changes.
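
When Apply Changes completes, you can optionally confirm that the restored BOSH Director is reachable. A minimal check using the BOSH CLI v2, assuming the Director IP address and certificate path described in the next step:

    $ bosh -e DIRECTOR_IP --ca-cert PATH-TO-BOSH-SERVER-CERT env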

Step 3: Download the BOSH Manifest

To download the BOSH manifest for your deployment, use the BOSH CLI (either v1 or v2) or the Ops Manager API.

Using BOSH CLI v1

First, identify and target the BOSH Director by performing the following steps:

  1. Install Ruby and the BOSH CLI Ruby gem on a machine outside of your PCF deployment.

  2. From the Installation Dashboard in Ops Manager, select Ops Manager Director > Status and record the IP address listed for the Director. You access the BOSH Director using this IP address.

  3. Click Credentials and record the Director credentials.

  4. From the command line, run bosh target to log into the BOSH Director using the IP address and credentials that you recorded:

    $ bosh target DIRECTOR_IP
    Target set to 'microbosh-1234abcd1234abcd1234'
    Email(): director
    Enter password: ********************
    Logged in as 'director'
    

    Note: If bosh target does not prompt you for your username and password, run bosh login.

Next, download the BOSH manifest for the product by performing the following steps:

  1. Run bosh deployments to identify the name of your current BOSH deployment:

    $ bosh deployments
    +-------------+--------------+-------------------------------------------------+
    | Name        | Release(s)   | Stemcell(s)                                     |
    +-------------+--------------+-------------------------------------------------+
    | cf-example  | cf-mysql/10  | bosh-vsphere-esxi-ubuntu-trusty-go_agent/2690.3 |
    |             | cf/183.2     |                                                 |
    +-------------+--------------+-------------------------------------------------+
    

  2. Run bosh download manifest DEPLOYMENT-NAME LOCAL-SAVE-NAME to download and save the BOSH deployment manifest, replacing DEPLOYMENT-NAME with the name of the current BOSH deployment. You need this manifest to locate information about your databases; repeat this step for each deployment manifest that you need. For this procedure, use cf.yml as the LOCAL-SAVE-NAME.

    $ bosh download manifest cf-example cf.yml
    Deployment manifest saved to `cf.yml'
    
  3. Place the .yml file in a secure location.
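
Because the manifest can contain credentials, one simple precaution, for example, is to make the file readable only by your own user:

    $ chmod 600 cf.yml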

Using BOSH CLI v2

First, target the BOSH Director by performing the following steps:

  1. Install the BOSH v2 CLI on a machine outside of your PCF deployment. You can use the jumpbox for this task.
  2. From the Installation Dashboard in Ops Manager, select Ops Manager Director > Status and record the IP address listed for the Director. You access the BOSH Director using this IP address.

  3. Click Credentials and record the Director credentials.
  4. From the command line, log into the BOSH Director using the IP address and credentials that you recorded:
    $ bosh -e DIRECTOR_IP \
    --ca-cert PATH-TO-BOSH-SERVER-CERT log-in
    Email (): director
    Password (): *******************
    Successfully authenticated with UAA
    Succeeded
    

Next, identify the deployment and download the BOSH manifest for the product by performing the following steps:

  1. After logging in to your BOSH Director, run the following command to identify the name of the BOSH deployment that contains PCF:

    $ bosh -e DIRECTOR_IP \
    --ca-cert /var/tempest/workspaces/default/root_ca_certificate deployments

     Name        Release(s)
     cf-example  push-apps-manager-release/661.1.24
                 cf-backup-and-restore/0.0.1
                 binary-buildpack/1.0.11
                 capi/1.28.0
                 cf-autoscaling/91
                 cf-mysql/35
                 ...

    In the above example, the name of the BOSH deployment that contains PCF is cf-example. An optional jq-based variant of this step appears after this list.

  2. Run the following command to download the BOSH manifest, replacing DEPLOYMENT-NAME with the deployment name you retrieved in the previous step:

    $ bosh -e DIRECTOR_IP -d DEPLOYMENT-NAME manifest > /tmp/cf.yml
    
  3. Place the .yml file in a secure location.
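
As an optional, scripted variant of step 1 above, the BOSH CLI v2 accepts a --json flag. The sketch below assumes jq is installed and that the CLI's JSON output uses the Tables/Rows structure of current versions, which may differ across releases:

    $ bosh -e DIRECTOR_IP \
    --ca-cert /var/tempest/workspaces/default/root_ca_certificate deployments --json \
    | jq -r '.Tables[0].Rows[].name'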

Using the Ops Manager API

First, identify and target the BOSH Director by performing the following steps:

  1. Install the BOSH v2 CLI on a machine outside of your PCF deployment.
  2. Perform the procedures in the Using the Ops Manager API topic to authenticate and access the Ops Manager API.
  3. Use the GET /api/v0/deployed/products endpoint to retrieve a list of deployed products, replacing UAA-ACCESS-TOKEN with the access token recorded in the Using the Ops Manager API topic:
    $ curl "https://OPS-MAN-FQDN/api/v0/deployed/products" \ 
    -X GET \ 
    -H "Authorization: Bearer UAA-ACCESS-TOKEN"
  4. In the response to the above request, locate the product with an installation_name starting with cf- and copy its guid. A jq-based sketch that combines steps 3 through 5 appears after this list.
  5. Run the following curl command, replacing PRODUCT-GUID with the value of guid from the previous step:

    $ curl "https://OPS-MAN-FQDN/api/v0/deployed/products/PRODUCT-GUID/manifest" \ 
    -X GET \
    -H "Authorization: Bearer UAA-ACCESS-TOKEN" > /tmp/cf.yml

  6. Place the .yml file in a secure location.
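
If you have jq installed, the following hedged sketch combines steps 3 through 5 above. The endpoint paths come from the requests shown above, and the jq filter assumes the response fields described in step 4; verify both against your Ops Manager version:

    $ GUID=$(curl -s "https://OPS-MAN-FQDN/api/v0/deployed/products" \
    -H "Authorization: Bearer UAA-ACCESS-TOKEN" \
    | jq -r '.[] | select(.installation_name | startswith("cf-")) | .guid')
    $ curl -s "https://OPS-MAN-FQDN/api/v0/deployed/products/$GUID/manifest" \
    -H "Authorization: Bearer UAA-ACCESS-TOKEN" > /tmp/cf.yml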

Step 4: Stop Cloud Controller

To stop the Cloud Controller, use the BOSH CLI (either v1 or v2).

Using BOSH CLI v1

  1. Run bosh vms DEPLOYMENT-NAME to view a list of VMs in your PCF deployment.

    $ bosh vms cf-example
    +-------------------------------------------+---------+----------------------------------+--------------+
    | Job/index                                 | State   | Resource Pool                    | IPs          |
    +-------------------------------------------+---------+----------------------------------+--------------+
    | ccdb-partition-bd784/0                    | running | ccdb-partition-bd784             | 10.85.xx.xx  | 
    | cloud_controller-partition-bd784/0        | running | cloud_controller-partition-bd784 | 10.85.xx.xx  |
    | cloud_controller_worker-partition-bd784/0 | running | cloud_controller-partition-bd784 | 10.85.xx.xx  |
    | clock_global-partition-bd784/0            | running | clock_global-partition-bd784     | 10.85.xx.xx  |
    | nats-partition-bd784/0                    | running | nats-partition-bd784             | 10.85.xx.xx  |
    | router-partition-bd784/0                  | running | router-partition-bd784           | 10.85.xx.xx  |
    | uaa-partition-bd784/0                     | running | uaa-partition-bd784              | 10.85.xx.xx  |
    +-------------------------------------------+---------+----------------------------------+--------------+
    

  2. Perform the following steps for each Cloud Controller VM, excluding the Cloud Controller Database VM:

    1. SSH onto the VM:
      $ bosh ssh JOB-NAME
    2. From the VM, list the running processes:
      $ monit summary
    3. Stop all processes that start with cloud_controller_:
      $ monit stop PROCESS-NAME

Using BOSH CLI v2

  1. Run bosh instances to view a list of VM instances in your selected deployment.

    $ bosh -e DIRECTOR_IP -d DEPLOYMENT-NAME instances
    

    The command returns results similar to the following:

    Instance                                                            Process State  AZ       IPs
    autoscaling-register-broker/4305bc6d-b391-4d12-af1e-97c42dc746bb    -              default  10.85.101.41
    autoscaling/4a96fc03-ad48-4452-a3a1-21666b56c166                    -              default  10.85.101.40
    bootstrap/952de267-6498-4437-a4eb-d352d9412d85                      -              default  -
    clock_global/a41be911-0b64-477b-be95-04823fe4588e                   running        default  10.85.101.15
    cloud_controller/d8190587-9bd5-436c-9b98-2b307025ef37               running        default  10.85.101.14
    cloud_controller_worker/5059b2a7-5691-47e3-ac45-4874024beb56        running        default  10.85.101.24
    consul_server/06383f02-3837-4ba0-b30a-c49a4aaae832                  running        default  10.85.101.16
    diego_brain/4690bb25-0ef3-43fc-b5f9-902e536340f5                    running        default  10.85.101.31
    diego_cell/c0d1845c-a84e-48b6-9051-f0454e201226                     running        default  10.85.101.25
    ...
    
    The names of the Cloud Controller VMs begin with cloud_controller.

  2. Perform the following steps for each Cloud Controller VM, excluding the Cloud Controller Database VM:

    1. SSH onto the VM:
      $ bosh -e DIRECTOR_IP -d DEPLOYMENT-NAME ssh JOB-NAME
      For example:
      $ bosh -e DIRECTOR_IP -d DEPLOYMENT-NAME ssh cloud_controller
    2. From the VM, list the running processes:
      $ monit summary
    3. Stop all processes that start with cloud_controller_:
      $ monit stop PROCESS-NAME
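
Whichever CLI version you use, you can script this step once you are on a Cloud Controller VM. The sketch below assumes monit summary lists processes on lines of the form Process 'NAME'; confirm the output format on your VMs before relying on it. The same loop, with monit start in place of monit stop, applies when you restart the processes in Step 6.

    $ monit summary | grep "Process 'cloud_controller_" | tr -d "'" | awk '{print $2}' | \
    while read -r process; do monit stop "$process"; done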

Step 5: Restore Critical Backend Components

Your Elastic Runtime deployment contains several critical data stores that must be present for a complete restore. This section describes the procedure for restoring the databases and servers associated with your PCF installation.

You must restore each of the following:

  • Cloud Controller Database
  • UAA Database
  • WebDAV Server
  • Pivotal MySQL Server

Restore WebDAV

Use the File Storage password and IP address to restore the WebDAV server by following the steps detailed below. Find the IP address in your BOSH deployment manifest. To find your password, in the Ops Manager Installation Dashboard, select Elastic Runtime and click Credentials > Link to Credential.
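
If you are unsure where the address appears in the manifest, one hedged way to locate it is to search the cf.yml you downloaded in Step 3. The job and property names below (nfs_server, webdav) are assumptions that vary by Elastic Runtime version, so treat this as a starting point:

    $ grep -E -A 10 'nfs_server|webdav' /tmp/cf.yml | grep -E '([0-9]{1,3}\.){3}[0-9]{1,3}'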

  1. Run ssh vcap@YOUR-WEBDAV-VM-IP-ADDRESS to enter the WebDAV VM.

    $ ssh vcap@192.0.2.10
    

  2. Log in as the root user. When prompted for a password, enter the vcap password you used to SSH into the VM:

    $ sudo su

  3. Temporarily change the permissions on /var/vcap/store to add write permissions for all.

    $ chmod a+w /var/vcap/store
    

  4. Use scp to send the WebDAV backup tarball to the WebDAV VM from your local machine.

    $ scp webdav.tar.gz vcap@YOUR-WEBDAV-VM-IP-ADDRESS:/var/vcap/store
    

  5. Navigate into the store folder on the WebDAV VM.

    $ cd /var/vcap/store
    

  6. Decompress and extract the contents of the backup archive.

    $ tar zxf webdav.tar.gz

  7. Change the permissions on /var/vcap/store back to their prior setting.

    $ chmod a-w /var/vcap/store
    

  8. Exit the WebDAV VM.

    $ exit
    

Restore MySQL Database

Restoring from a backup is the same whether one or multiple databases were backed up. Executing the SQL dump will drop, recreate, and refill the specified databases and tables.

Warning: Restoring a database deletes all data that existed in the database prior to the restore. Restoring a database using a full backup artifact, produced by mysqldump --all-databases for example, replaces all data and user permissions.

There are two ways to restore the MySQL Server:

  • Automatic backup: If you configured automatic backups in your ERT configuration, follow the instructions below for restoring from an automatic backup.
  • Manual restore: If you performed a manual backup of your MySQL server, follow the instructions below for restoring from a manual backup.

Restore MySQL from an Automatic Backup

If you configured automatic backups, perform the following steps to restore your MySQL server:

  1. If you are running a highly available ERT MySQL cluster, perform the following steps to reduce the size of the cluster to a single node:
    1. From the Ops Manager Installation Dashboard, click the Elastic Runtime tile.
    2. Click Resource Config.
    3. Set the number of instances for MySQL Server to 1.
    4. Click Save.
    5. Return to the Ops Manager Installation Dashboard and click Apply Changes.
  2. After the deployment finishes, perform the following steps to prepare the first node for restoration:
    1. Retrieve the IP address for the MySQL server by navigating to the Elastic Runtime tile and clicking the Status tab.
    2. Retrieve the MySQL VM credentials for the MySQL server by navigating to the Elastic Runtime tile and clicking the Credentials tab.
    3. SSH into the Ops Manager Director. For more information, see the previous section Restore BOSH Using Ops Manager.
    4. From the Ops Manager Director VM, use the BOSH CLI to SSH into the first MySQL job (see the example command after this list). For more information, see the BOSH SSH section in the Advanced Troubleshooting with the BOSH CLI topic.
    5. On the MySQL server VM, become super user:
      $ sudo su
    6. Pause the local database server:
      $ monit stop all
    7. Confirm that all jobs are listed as not monitored:
      $ watch monit summary
    8. Delete the existing MySQL data that is stored on disk:
      $ rm -rf /var/vcap/store/mysql/*
  3. Perform the following steps to restore the backup:
    1. Move the compressed backup file to the node using scp (see the example command after this list).
    2. Decrypt and expand the file using gpg, sending the output to tar:
      $ gpg --decrypt mysql-backup.tar.gpg | tar -C /var/vcap/store/mysql -xvf -
    3. Change the owner of the data directory, because MySQL expects the data directory to be owned by a particular user:
      $ chown -R vcap:vcap /var/vcap/store/mysql
    4. Start all services with monit:
      $ monit start all
    5. Watch the summary until all jobs are listed as running:
      $ watch monit summary
    6. Exit out of the MySQL node.
  4. If you are restoring a highly available ERT MySQL cluster, perform the following steps to increase the size of the cluster back to three:
    1. From the Ops Manager Installation Dashboard, click the Elastic Runtime tile.
    2. Click Resource Config.
    3. Set the number of instances for MySQL Server to 3.
    4. Click Save.
    5. Return to the Ops Manager Installation Dashboard and click Apply Changes.
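
For reference, hedged examples of the two commands referenced in steps 2.4 and 3.1 above. The MySQL job name (mysql/0) and the backup file name are assumptions: confirm the job name with bosh instances, and adjust the path in the gpg command of step 3.2 if you copy the artifact somewhere other than your working directory.

    $ bosh -e DIRECTOR_IP -d DEPLOYMENT-NAME ssh mysql/0
    $ scp mysql-backup.tar.gpg vcap@MYSQL-SERVER-IP:~/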

Restore MySQL from a Manual Backup

If you performed a manual backup, perform the following steps to restore your MySQL server:

  1. Retrieve the IP address of the MySQL server by navigating to the Elastic Runtime tile in the Ops Manager Installation Dashboard and clicking the Status tab.
  2. Set the IP address of the MySQL server as an environment variable:
    $ MYSQL_NODE_IP='YOUR-MYSQL-IP'
  3. Retrieve the MySQL VM credentials and the MySQL Admin credentials by navigating to the Elastic Runtime tile and clicking the Credentials tab.
  4. Locate the user_databases.sql backup file that you created when performing a manual backup.
  5. Use scp to send the backup file to the MySQL Database VM:
    $ scp user_databases.sql vcap@$MYSQL_NODE_IP:~/.
    
  6. SSH into the MySQL Database VM, providing the MySQL VM password when prompted:
    $ ssh vcap@$MYSQL_NODE_IP
    
  7. Enable the creation of tables using any storage engine, providing the MySQL Admin password when prompted. Because the MYSQL_NODE_IP variable from step 2 is not set inside the SSH session, re-export it on the VM or substitute the IP address directly:
    $ mysql -h $MYSQL_NODE_IP -u root -p -e "SET GLOBAL enforce_storage_engine=NULL;"
    
  8. Use the MySQL Admin password and IP address to restore the MySQL database by running the following command.
    $ mysql -h $MYSQL_NODE_IP -u root -p < ~/user_databases.sql
    
  9. Use the MySQL Admin password and IP address to restore the original storage engine restriction.
    $ mysql -h $MYSQL_NODE_IP -u root -p -e "SET GLOBAL enforce_storage_engine='InnoDB';"
    
  10. Log in to the MySQL client and flush privileges.
    $ mysql -u root -p -h $MYSQL_NODE_IP
    mysql> flush privileges;
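
As a quick optional check that the restore succeeded, you can list the databases from the same MySQL client session:

    mysql> SHOW DATABASES;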
    

Step 6: Start Cloud Controller

To start the Cloud Controller, use the BOSH CLI (either v1 or v2).

Using BOSH CLI v1

  1. Run bosh vms to view a list of VMs in your selected deployment. The names of the Cloud Controller VMs begin with cloud_controller.

    $ bosh vms
    +-------------------------------------------+---------+----------------------------------+--------------+
    | Job/index                                 | State   | Resource Pool                    | IPs          |
    +-------------------------------------------+---------+----------------------------------+--------------+
    | cloud_controller-partition-bd784/0        | failing | cloud_controller-partition-bd784 | 10.85.xx.xx  |
    | cloud_controller_worker-partition-bd784/0 | running | cloud_controller-partition-bd784 | 10.85.xx.xx  |
    | clock_global-partition-bd784/0            | running | clock_global-partition-bd784     | 10.85.xx.xx  |
    | nats-partition-bd784/0                    | running | nats-partition-bd784             | 10.85.xx.xx  |
    | router-partition-bd784/0                  | running | router-partition-bd784           | 10.85.xx.xx  |
    | uaa-partition-bd784/0                     | running | uaa-partition-bd784              | 10.85.xx.xx  |
    +-------------------------------------------+---------+----------------------------------+--------------+
    

  2. Perform the following steps for each Cloud Controller VM, excluding the Cloud Controller Database VM:

    1. SSH onto the VM:
      $ bosh ssh JOB-NAME
    2. From the VM, list the running processes:
      $ monit summary
    3. Start all processes that start with cloud_controller_:
      $ monit start PROCESS-NAME

Using BOSH CLI v2

  1. Run bosh instances to view a list of VM instances in your selected deployment.

    $ bosh -e DIRECTOR_IP -d DEPLOYMENT-NAME instances
    

    The command returns results similar to the following:

    Instance                                                            Process State  AZ       IPs
    autoscaling-register-broker/4305bc6d-b391-4d12-af1e-97c42dc746bb    -              default  10.85.101.41
    autoscaling/4a96fc03-ad48-4452-a3a1-21666b56c166                    -              default  10.85.101.40
    bootstrap/952de267-6498-4437-a4eb-d352d9412d85                      -              default  -
    clock_global/a41be911-0b64-477b-be95-04823fe4588e                   running        default  10.85.101.15
    cloud_controller/d8190587-9bd5-436c-9b98-2b307025ef37               running        default  10.85.101.14
    cloud_controller_worker/5059b2a7-5691-47e3-ac45-4874024beb56        running        default  10.85.101.24
    consul_server/06383f02-3837-4ba0-b30a-c49a4aaae832                  running        default  10.85.101.16
    diego_brain/4690bb25-0ef3-43fc-b5f9-902e536340f5                    running        default  10.85.101.31
    diego_cell/c0d1845c-a84e-48b6-9051-f0454e201226                     running        default  10.85.101.25
    ...
    
    The names of the Cloud Controller VMs begin with cloud_controller.

  2. Perform the following steps for each Cloud Controller VM, excluding the Cloud Controller Database VM:

    1. SSH onto the VM:
      $ bosh -e DIRECTOR_IP -d DEPLOYMENT-NAME ssh JOB-NAME
    2. From the VM, list the running processes:
      $ monit summary
    3. Start all processes that start with cloud_controller_:
      $ monit start PROCESS-NAME
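
After restarting the processes, you can optionally confirm that the deployment reports them as running again; for example, with the BOSH CLI v2, the --ps flag shows per-process state:

    $ bosh -e DIRECTOR_IP -d DEPLOYMENT-NAME instances --ps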