Troubleshooting BBR

This topic lists common troubleshooting scenarios and their solutions when using BOSH Backup and Restore (BBR) to back up and restore Pivotal Cloud Foundry (PCF).

Troubleshooting During a Backup

Symptom

BBR expects a backup script that doesn’t exist.

The Elastic Runtime backup fails with the following error:

The mysql restore script expects a backup script
which produces mysql-artifact artifact which
is not present in the deployment.

Explanation

To back up MySQL in Elastic Runtime, BBR needs the backup prepare node to be enabled.

Solution

Follow the procedures in the Step 4: Enable Backup Prepare Node section of the Backing Up Pivotal Cloud Foundry with BBR topic and re-run the BBR backup command.

Troubleshooting During a Restore

Symptom

The restore fails with a MySQL monit start timeout.

While running the BBR restore command, restoring the job mysql-restore fails with the following error:

1 error occurred:

* restore script for job mysql-restore failed on mysql/0.
...
Monit start failed: Timed out waiting for monit: 2m0s

Explanation

This happens when mariadb fails to start within the timeout period. The process ends up in an “Execution Failed” state, and monit does not try to start it again.

Solution

Ensure that your MySQL Server cluster has only one instance. If there is more than one instance of MySQL Server, the restore fails with a monit start timeout. Scale down to one instance and retry.

If your MySQL Server cluster is already scaled down to one node, it may have taken longer than normal to restart the cluster. Follow the procedure below to manually verify and retry.

  1. Start an SSH session without naming an instance, so that the BOSH CLI lists the VMs in your deployment:
    $ bosh -e DIRECTOR_IP --ca-cert /var/tempest/workspaces/default/root_ca_certificate \
    -d DEPLOYMENT_NAME \
    ssh
  2. Select the mysql VM to SSH into.
  3. From the mysql VM, run the following command to check whether the mariadb process is running:
    $ ps aux | grep mariadb
  4. Run the following command to check whether monit reports mariadb_ctrl in an “Execution Failed” state:
    $ sudo monit summary
  5. If so, run the following command from the mysql VM to disable monitoring of mariadb_ctrl:
    $ sudo monit unmonitor mariadb_ctrl
  6. Run the following command to re-enable monitoring:
    $ sudo monit monitor mariadb_ctrl
  7. After a few minutes, run the following command:
    $ sudo monit summary
    The command should report that all the processes are running.
  8. Re-attempt the restore with BBR.
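
The monit check in the procedure above can be sketched as a small shell helper. This is a minimal sketch, not part of BBR: the function name mariadb_ctrl_failed is hypothetical, and it assumes that monit summary prints the process name and its status on the same line.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads monit summary text on stdin.
# Returns 0 if the mariadb_ctrl line reports "Execution failed", 1 otherwise.
mariadb_ctrl_failed() {
  grep -i "mariadb_ctrl" | grep -qi "execution failed"
}

# Possible usage on the mysql VM (commented out so the sketch has no side effects):
# if sudo monit summary | mariadb_ctrl_failed; then
#   sudo monit unmonitor mariadb_ctrl
#   sudo monit monitor mariadb_ctrl
# fi
```

The match is case-insensitive because the exact capitalization of the status text may differ between monit versions.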

Symptom

The deployment does not match the structure of the backup.

The following error displays:

Deployment 'deployment-name' does not match the structure of the provided backup

Explanation

The instance groups with the restore scripts in the destination environment don’t match the backup metadata. For example, they may have the wrong number of instances of a particular instance group, or the metadata names an instance group that doesn’t exist in the destination environment.

Solution

BBR only supports restoring to an environment that matches your original environment. Pivotal recommends altering the destination environment to match the structure of the backup.

General Troubleshooting

Symptom

BBR displays an error message containing “SSH Dial Error” or another connection error.

Explanation

The jumpbox and the VMs in the deployment are experiencing connection problems.

Solution

Perform the following steps:

  1. Ensure your deployment is healthy by running bosh vms.
  2. Run bbr deployment backup-cleanup to remove the data that the failed backup left on the instances. Otherwise, further BBR commands will fail.
  3. Repeat the BBR operation.
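
The first two steps above could be scripted roughly as follows. This is a sketch under stated assumptions: the variable names, the run helper, and the recover_from_connection_error function are hypothetical, the placeholder values must be replaced with your own, and the BOSH client secret is assumed to be supplied through the BOSH_CLIENT_SECRET environment variable. By default the script only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Placeholder values (assumptions): replace them with values for your environment.
BOSH_TARGET="${BOSH_TARGET:-DIRECTOR_IP}"
BOSH_USER="${BOSH_USER:-BOSH_CLIENT}"
DEPLOYMENT="${DEPLOYMENT:-DEPLOYMENT_NAME}"
CA_CERT="${CA_CERT:-/var/tempest/workspaces/default/root_ca_certificate}"

# Print each command when DRY_RUN=1 (the default here); run it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Step 1: show the state of the VMs. Step 2: clean up after the failed backup.
recover_from_connection_error() {
  run bosh -e "$BOSH_TARGET" --ca-cert "$CA_CERT" -d "$DEPLOYMENT" vms &&
    run bbr deployment --target "$BOSH_TARGET" --username "$BOSH_USER" \
      --deployment "$DEPLOYMENT" --ca-cert "$CA_CERT" backup-cleanup
}
```

Call recover_from_connection_error with DRY_RUN=0 to run the commands for real. The script only checks that bosh vms exits successfully, so review its output yourself before you repeat the BBR operation.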

Symptom

BBR backup or restore fails with a metadata error:

1 error occurred:
error 1:
An error occurred while running metadata script for job redis-server on redis/0ce9f81f-1756-480b-8e3e-a4609b14b6a6: error from metadata

Explanation

The metadata script for one of the jobs failed, which indicates a problem with your PCF installation.

Solution

Contact Pivotal Support.