Restoring Pivotal Cloud Foundry Manually from Backup

Important: If you used BOSH Backup and Restore (BBR) to back up Pivotal Cloud Foundry (PCF), you must use BBR to restore. See the BBR restore documentation for instructions.

This topic describes the procedure for manually restoring your Pivotal Cloud Foundry (PCF) deployment from a backup. To create a backup, see the Backing Up Pivotal Cloud Foundry topic.

Note: You can also use the CF Ops automation utility to restore your PCF backups. See the CF Ops User Guide for more information.

To restore a deployment, you must import installation settings, temporarily stop the Cloud Controller, restore the state of each critical backend component from its backup file, and restart the Cloud Controller.

To perform these steps, you need the BOSH manifest to locate your critical backend components. BOSH manifests are automatically downloaded to the Ops Manager virtual machine. However, if you are using a separate jumpbox, you must manually download the BOSH deployment manifest.

Note: The procedure described in this topic restores a running PCF deployment to the state captured by backup files. This procedure does not deploy PCF. See the Installing PCF Guide for information.

Import Installation Settings

Note: Pivotal recommends that you export your installation settings before importing from a backup. See the Export Installation Settings section of the Backing Up Pivotal Cloud Foundry topic for more information.

Importing installation settings restores the settings and assets of an existing PCF installation. Because importing overwrites any existing installation, you must provision a new Ops Manager before you import settings.

Follow the steps below to import installation settings.

  1. Follow the instructions for your IaaS to deploy the new Ops Manager VM.

  2. When redirected to the Welcome to Ops Manager page, select Import Existing Installation.

  3. In the import panel, perform the following tasks:

    • Enter your Decryption Passphrase.
    • Click Choose File and browse to the installation zip file that you exported in the Export Installation Settings section of the Backing Up Pivotal Cloud Foundry topic.

  4. Click Import.

    Note: Some browsers do not provide feedback on the status of the import process, and may appear to hang.

  5. A “Successfully imported installation” message appears upon completion.

  6. Click Apply Changes to redeploy.

Restore BOSH Using Ops Manager

Follow the steps below to restore BOSH using Ops Manager.

  1. From the Product Installation Dashboard, click the Ops Manager Director tile.

  2. Make a change to your configuration to trigger a new deployment. For example, you can adjust the number of NTP servers in your deployment. Choose a change in configuration that suits your specific deployment.

  3. SSH into the Ops Manager VM by following the instructions in the SSH into Ops Manager topic. The following example assumes an Amazon Web Services deployment:

    $ ssh -i ops_mgr.pem ubuntu@OPS-MGR-IP
    

  4. Delete the bosh-deployments.yml file. Deleting the bosh-deployments.yml file causes Ops Manager to treat the deploy as a new deployment, recreating missing Virtual Machines (VMs), including BOSH. The new deployment ignores existing VMs such as your Pivotal Cloud Foundry deployment.

    $ sudo rm /var/tempest/workspaces/default/deployments/bosh-deployments.yml
    

  5. Rename, move, or delete the bosh-state.json file. Removing this file causes Ops Manager to treat the deploy as a new deployment, recreating missing VMs, including BOSH. The new deployment ignores existing VMs such as your Pivotal Cloud Foundry deployment.

    $ cd /var/tempest/workspaces/default/deployments/
    $ sudo mv bosh-state.json bosh-state.json.old
    

  6. Return to the Product Installation Dashboard and click Apply Changes.

Target the BOSH Director

  1. Install Ruby and the BOSH CLI Ruby gem on a machine outside of your PCF deployment. See the example installation commands after these steps.

  2. From the Installation Dashboard in Ops Manager, select Ops Manager Director > Status and record the IP address listed for the Director. You access the BOSH Director using this IP address.

  3. Click Credentials and record the Director credentials.

  4. From the command line, run bosh target to log into the BOSH Director using the IP address and credentials that you recorded:

    $ bosh target 192.0.2.3
    Target set to 'microbosh-1234abcd1234abcd1234'
    Your username: director
    Enter password: ********************
    Logged in as 'director'
    

    Note: If bosh target does not prompt you for your username and password, run bosh login.
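
For step 1, the exact installation method depends on your operating system. The following is a minimal sketch for a Debian-based jumpbox (an assumption about your environment), installing Ruby with the system package manager and then the Ruby-based BOSH CLI gem used in this procedure:

    $ sudo apt-get install ruby ruby-dev build-essential   # Ruby and build tools (Debian/Ubuntu assumption)
    $ gem install bosh_cli                                  # Ruby-based BOSH CLI gem
    $ bosh --version                                        # confirm the CLI is available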

Download BOSH Manifest

  1. Run bosh deployments to identify the name of your current BOSH deployment:

    $ bosh deployments
    +-------------+--------------+-------------------------------------------------+
    | Name        | Release(s)   | Stemcell(s)                                     |
    +-------------+--------------+-------------------------------------------------+
    | cf-example  | cf-mysql/10  | bosh-vsphere-esxi-ubuntu-trusty-go_agent/2690.3 |
    |             | cf/183.2     |                                                 |
    +-------------+--------------+-------------------------------------------------+
    

  2. Run bosh download manifest DEPLOYMENT-NAME LOCAL-SAVE-NAME to download and save the BOSH deployment manifest, replacing DEPLOYMENT-NAME with the name of the current BOSH deployment. You need this manifest to locate information about your databases. Repeat this step for each deployment manifest. For this procedure, use cf.yml as the LOCAL-SAVE-NAME.

    $ bosh download manifest cf-example cf.yml
    Deployment manifest saved to `cf.yml'
    

Restore Critical Backend Components

Your Elastic Runtime deployment contains several critical data stores that must be present for a complete restore. This section describes the procedure for restoring the databases and servers associated with your PCF installation.

You must restore each of the following:

  • Cloud Controller Database
  • UAA Database
  • WebDAV Server
  • Pivotal MySQL Server

Note: If you are using the default internal PostgreSQL databases, follow the instructions below. If you are running your databases or filestores externally, disregard the instructions for restoring the Cloud Controller and UAA databases.

Stop Cloud Controller

  1. From a command line, run bosh deployment DEPLOYMENT-MANIFEST to select your PCF deployment. The manifest is located in /var/tempest/workspaces/default/deployments/ on the Ops Manager VM. For example:

    $ bosh deployment /var/tempest/workspaces/default/deployments/cf-bd784.yml
    Deployment set to '/var/tempest/workspaces/default/deployments/cf-bd784.yml'
    

  2. Run bosh vms CF-DEPLOYMENT-NAME to view a list of VMs in your PCF deployment. CF-DEPLOYMENT-NAME corresponds to the name of your PCF release deployment, which is also the filename of your manifest file without the .yml ending. For example:

    $ bosh vms cf-bd784
    +-------------------------------------------+---------+----------------------------------+--------------+
    | Job/index                                 | State   | Resource Pool                    | IPs          |
    +-------------------------------------------+---------+----------------------------------+--------------+
    | ccdb-partition-bd784/0                    | running | ccdb-partition-bd784             | 10.85.xx.xx  | 
    | cloud_controller-partition-bd784/0        | running | cloud_controller-partition-bd784 | 10.85.xx.xx  |
    | cloud_controller_worker-partition-bd784/0 | running | cloud_controller-partition-bd784 | 10.85.xx.xx  |
    | clock_global-partition-bd784/0            | running | clock_global-partition-bd784     | 10.85.xx.xx  |
    | nats-partition-bd784/0                    | running | nats-partition-bd784             | 10.85.xx.xx  |
    | router-partition-bd784/0                  | running | router-partition-bd784           | 10.85.xx.xx  |
    | uaa-partition-bd784/0                     | running | uaa-partition-bd784              | 10.85.xx.xx  |
    +-------------------------------------------+---------+----------------------------------+--------------+
    

  3. Perform the following steps for each Cloud Controller VM, excluding the Cloud Controller Database VM:

    1. SSH onto the VM:
      $ bosh ssh JOB-NAME
    2. From the VM, list the running processes:
      $ monit summary
    3. Stop all processes that start with cloud_controller_:
      $ monit stop PROCESS-NAME
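
For example, on a typical Elastic Runtime deployment the Cloud Controller VM runs processes such as cloud_controller_ng, cloud_controller_worker_local_1, and cloud_controller_worker_local_2. These names vary between versions and are shown only as an illustration; use the job and process names reported by bosh vms and monit summary on your own deployment:

    $ bosh ssh cloud_controller-partition-bd784/0
    $ monit summary
    $ monit stop cloud_controller_ng
    $ monit stop cloud_controller_worker_local_1
    $ monit stop cloud_controller_worker_local_2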

Restore the Cloud Controller Database

Note: Follow these instructions only if you are using a PostgreSQL database.

Use the Cloud Controller Database (CCDB) password and IP address to restore the Cloud Controller Database by following the steps detailed below. Find the IP address in your BOSH deployment manifest. To find your password in the Ops Manager Installation Dashboard, select Elastic Runtime and click Credentials>Link to Credential.
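
For example, assuming the partition-style job names shown in the bosh vms output above and the cf.yml manifest downloaded earlier, you can print the CCDB job entry, including its static_ips, with grep. Manifest layouts vary, so treat this as a sketch and adjust the search pattern and context length for your deployment:

    $ grep -A 20 'name: ccdb' cf.yml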

  1. Use scp to send the Cloud Controller Database backup file to the Cloud Controller Database VM.

    $ scp ccdb.sql vcap@YOUR-CCDB-VM-IP-ADDRESS:~/.
    

  2. SSH into the Cloud Controller Database VM.

    $ ssh vcap@YOUR-CCDB-VM-IP
    

  3. Log in to the psql client:

    $ /var/vcap/data/packages/postgres/5.1/bin/psql -U vcap -p 2544 ccdb
    

  4. Drop the database schema and create a new one to replace it.

    ccdb=# drop schema public cascade;
    ccdb=# create schema public;
    

  5. Restore the database from the backup file.

    $ /var/vcap/data/packages/postgres/5.1/bin/psql -U vcap -p 2544 ccdb < ~/ccdb.sql
    

Restore UAA Database

Note: Follow these instructions only if you are using a PostgreSQL database.

Drop the UAA Database tables

  1. Find your UAA Database VM ID. To view all VM IDs, run bosh vms from a command line:

    $ bosh vms

  2. SSH into the UAA Database VM using the vcap user and password. If you do not have this information recorded, find it in the Ops Manager Installation Dashboard. Click the Elastic Runtime tile and select Credentials>Link to Credential.

    $ ssh vcap@YOUR-UAADB-VM-IP-ADDRESS

  3. Run find /var/vcap | grep 'bin/psql' to find the locally installed psql client on the UAA Database VM.

    $ find /var/vcap | grep 'bin/psql'
    /var/vcap/data/packages/postgres/5.1/bin/psql
    

  4. Log in to the psql client:

    $ /var/vcap/data/packages/postgres/5.1/bin/psql -U vcap -p 2544 uaa
    

  5. Run the following commands to drop the tables:

    drop schema public cascade;
    create schema public;
    \q
    

  6. Exit the UAA Database VM.

    $ exit

Restore the UAA Database from its backup state

Note: Follow these instructions only if you are using a PostgreSQL database.

Use the UAA Database password and IP address to restore the UAA Database by following the steps detailed below. Find the IP address in your BOSH deployment manifest. To find your password in the Ops Manager Installation Dashboard, select Elastic Runtime and click Credentials>Link to Credential.

  1. Use scp to copy the database backup file to the UAA Database VM.

    $ scp uaa.sql vcap@YOUR-UAADB-VM-IP-ADDRESS:~/.
    

  2. SSH into the UAA Database VM.

    $ ssh vcap@YOUR-UAADB-VM-IP-ADDRESS
    

  3. Restore the database from the backup file.

    $ /var/vcap/data/packages/postgres/5.1/bin/psql -U vcap -p 2544 uaa < ~/uaa.sql
    

Restore WebDAV

Use the File Storage password and IP address to restore the WebDAV server by following the steps detailed below. Find the IP address in your BOSH deployment manifest. To find your password in the Ops Manager Installation Dashboard, select Elastic Runtime and click Credentials>Link to Credential.

  1. Run ssh vcap@YOUR-WEBDAV-VM-IP-ADDRESS to enter the WebDAV VM.

    $ ssh vcap@192.0.2.10
    

  2. Log in as root user. When prompted for a password, enter the vcap password you used to ssh into the VM:

    $ sudo su

  3. Temporarily change the permissions on /var/vcap/store to add write permissions for all.

    $ chmod a+w /var/vcap/store
    

  4. Use scp to send the WebDAV backup tarball to the WebDAV VM from your local machine.

    $ scp webdav.tar.gz vcap@YOUR-WEBDAV-VM-IP-ADDRESS:/var/vcap/store
    

  5. Navigate into the store folder on the WebDAV VM.

    $ cd /var/vcap/store
    

  6. Decompress and extract the contents of the backup archive.

    $ tar zxf webdav.tar.gz

  7. Change the permissions on /var/vcap/store back to their prior setting.

    $ chmod a-w /var/vcap/store
    

  8. Exit the WebDAV VM.

    $ exit
    

Restore MySQL Database

The restore procedure is the same whether you backed up a single database or multiple databases. Executing the SQL dump drops, recreates, and refills the specified databases and tables.

Warning: Restoring a database deletes all data that existed in the database prior to the restore. Restoring a database using a full backup artifact, produced by mysqldump --all-databases for example, replaces all data and user permissions.

There are two ways to restore the MySQL Server:

  • Automatic backup: If you configured automatic backups in your ERT configuration, follow the instructions below for restoring from an automatic backup.
  • Manual restore: If you performed a manual backup of your MySQL server, follow the instructions below for restoring from a manual backup.

Restore from an Automatic Backup

If you configured automatic backups, perform the following steps to restore your MySQL server:

  1. If you are running a highly available ERT MySQL cluster, perform the following steps to reduce the size of the cluster to a single node:
    1. From the Ops Manager Installation Dashboard, click the Elastic Runtime tile.
    2. Click Resource Config.
    3. Set the number of instances for MySQL Server to 1.
    4. Click Save.
    5. Return to the Ops Manager Installation Dashboard and click Apply Changes.
  2. After the deployment finishes, perform the following steps to prepare the first node for restoration:
    1. Retrieve the IP address for the MySQL server by navigating to the Elastic Runtime tile and clicking the Status tab.
    2. Retrieve the MySQL VM credentials for the MySQL server by navigating to the Elastic Runtime tile and clicking the Credentials tab.
    3. SSH into the Ops Manager Director. For more information, see the previous section Restore BOSH Using Ops Manager.
    4. From the Ops Manager Director VM, use the BOSH CLI to SSH into the first MySQL job. For more information, see the BOSH SSH section in the Advanced Troubleshooting with the BOSH CLI topic.
    5. On the MySQL server VM, become super user:
      $ sudo su
    6. Pause the local database server:
      $ monit stop all
    7. Confirm that all jobs are listed as not monitored:
      $ watch monit summary
    8. Delete the existing MySQL data that is stored on disk:
      $ rm -rf /var/vcap/store/mysql/*
  3. Perform the following steps to restore the backup:
    1. Move the compressed backup file to the node using scp. See the example scp command after this list.
    2. Decrypt and expand the file using gpg, sending the output to tar:
      $ gpg --decrypt mysql-backup.tar.gpg | tar -C /var/vcap/store/mysql -xvf -
    3. Change the owner of the data directory, because MySQL expects the data directory to be owned by a particular user:
      $ chown -R vcap:vcap /var/vcap/store/mysql
    4. Start all services with monit:
      $ monit start all
    5. Watch the summary until all jobs are listed as running:
      $ watch monit summary
    6. Exit out of the MySQL node.
  4. If you are restoring a highly available ERT MySQL cluster, perform the following steps to increase the size of the cluster back to three:
    1. From the Ops Manager Installation Dashboard, click the Elastic Runtime tile.
    2. Click Resource Config.
    3. Set the number of instances for MySQL Server to 3.
    4. Click Save.
    5. Return to the Ops Manager Installation Dashboard and click Apply Changes.
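
For the scp step referenced above (moving the compressed backup file to the node), the following is a minimal sketch. It assumes the artifact is named mysql-backup.tar.gpg, matching the decryption command, and that you copy it into the vcap user's home directory; substitute your MySQL node's IP address and authenticate with the MySQL VM credentials retrieved earlier:

    $ scp mysql-backup.tar.gpg vcap@YOUR-MYSQL-VM-IP:~/

After copying, run the gpg command from the directory that contains the file.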

Restore from a Manual Backup

If you performed a manual backup, perform the following steps to restore your MySQL server:

  1. Retrieve the IP address of the MySQL server by navigating to the Elastic Runtime tile in the Ops Manager Installation Dashboard and clicking the Status tab.
  2. Set the IP address of the MySQL server as an environment variable:
    $ MYSQL_NODE_IP='YOUR-MYSQL-IP'
  3. Retrieve the MySQL VM credentials and the MySQL Admin credentials by navigating to the Elastic Runtime tile and clicking the Credentials tab.
  4. Locate the user_databases.sql backup file that you created when performing a manual backup.
  5. Use scp to send the backup file to the MySQL Database VM:
    $ scp user_databases.sql vcap@$MYSQL_NODE_IP:~/.
    
  6. SSH into the MySQL Database VM, providing the MySQL VM password when prompted:
    $ ssh vcap@$MYSQL_NODE_IP
    
  7. Enable the creation of tables using any storage engine, providing the MySQL Admin password when prompted:
    $ mysql -h $MYSQL_NODE_IP -u root -p -e "SET GLOBAL enforce_storage_engine=NULL;"
    
  8. Use the MySQL password and IP address to restore the MySQL database by running the following command.
    $ mysql -h $MYSQL_NODE_IP -u root -p < ~/user_databases.sql
    
  9. Use the MySQL password and IP address to restore the original storage engine restriction.
    $ mysql -h $MYSQL_NODE_IP -u root -p -e "SET GLOBAL enforce_storage_engine='InnoDB';"
    
  10. Log in to the MySQL client and flush privileges.
    $ mysql -u root -p -h $MYSQL_NODE_IP
    mysql> flush privileges;
    

Start Cloud Controller

  1. Run bosh vms to view a list of VMs in your selected deployment. The names of the Cloud Controller VMs begin with cloud_controller.

    $ bosh vms
    +-------------------------------------------+---------+----------------------------------+--------------+
    | Job/index                                 | State   | Resource Pool                    | IPs          |
    +-------------------------------------------+---------+----------------------------------+--------------+
    | cloud_controller-partition-bd784/0        | failing | cloud_controller-partition-bd784 | 10.85.xx.xx  |
    | cloud_controller_worker-partition-bd784/0 | running | cloud_controller-partition-bd784 | 10.85.xx.xx  |
    | clock_global-partition-bd784/0            | running | clock_global-partition-bd784     | 10.85.xx.xx  |
    | nats-partition-bd784/0                    | running | nats-partition-bd784             | 10.85.xx.xx  |
    | router-partition-bd784/0                  | running | router-partition-bd784           | 10.85.xx.xx  |
    | uaa-partition-bd784/0                     | running | uaa-partition-bd784              | 10.85.xx.xx  |
    +-------------------------------------------+---------+----------------------------------+--------------+
    

  2. Perform the following steps for each Cloud Controller VM, excluding the Cloud Controller Database VM:

    1. SSH onto the VM:
      $ bosh ssh JOB-NAME
    2. From the VM, list the running processes:
      $ monit summary
    3. Start all processes that start with cloud_controller_:
      $ monit start PROCESS-NAME
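
For example, using the same illustrative process names as in the Stop Cloud Controller section (confirm the actual names with monit summary on your own VMs):

    $ bosh ssh cloud_controller-partition-bd784/0
    $ monit summary
    $ monit start cloud_controller_ng
    $ monit start cloud_controller_worker_local_1
    $ monit start cloud_controller_worker_local_2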