This topic lists common troubleshooting scenarios and their solutions when using BOSH Backup and Restore (BBR) to back up and restore Pivotal Cloud Foundry (PCF).
This section provides solutions for errors that occur during a restore.
While running the BBR restore command, restoring the job mysql-restore fails with the following error:
1 error occurred: * restore script for job mysql-restore failed on mysql/0. ... Monit start failed: Timed out waiting for monit: 2m0s
This happens when the MySQL job fails to start within the timeout period.
The job ends up in an Execution Failed state, and monit never tries to start it again.
Ensure that your MySQL Server cluster has only one instance.
If there is more than one instance of MySQL Server, the restore fails with a monit start timeout.
Scale down to one instance and retry the restore.
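One way to confirm the instance count before retrying is to count the matching rows in the VM listing. The sketch below assumes that each row of the bosh vms output begins with INSTANCE-GROUP/ID, and count_instances is a hypothetical helper name, not part of BOSH or BBR.

```shell
# count_instances GROUP
# Reads a VM listing on stdin and prints how many rows belong to GROUP.
# Assumption: each VM row starts with "GROUP/<id>", as in `bosh vms` output.
count_instances() {
  # grep -c prints 0 and exits non-zero when nothing matches, so mask the status
  grep -c "^$1/" || true
}

# Example use (not run here):
#   bosh -e BOSH-DIRECTOR-IP -d DEPLOYMENT-NAME vms | count_instances mysql
```

A result greater than 1 for the mysql instance group suggests the cluster still needs to be scaled down.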
If your MySQL Server cluster is already scaled down to one node, it may have taken longer than normal to restart the cluster. Follow the procedure below to manually verify and retry.
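To list the VMs in your deployment, run the following command:
bosh -e BOSH-DIRECTOR-IP --ca-cert /var/tempest/workspaces/default/root_ca_certificate -d DEPLOYMENT-NAME vms
SSH into the mysql VM by running the following command:
bosh -e BOSH-DIRECTOR-IP --ca-cert /var/tempest/workspaces/default/root_ca_certificate -d DEPLOYMENT-NAME ssh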
To check that the MySQL job process is running, perform one of the following from the mysql VM:
If you selected Internal Databases - MySQL - Percona XtraDB Cluster when you configured the PAS tile, run the following command:
ps aux | grep galera-init
If you selected Internal Databases - MySQL - MariaDB Galera Cluster when you configured the PAS tile, run the following command:
ps aux | grep mariadb_ctrl
For more information, see Configuring PAS.
Run the following command to check whether monit reports that the MySQL job process is in an Execution Failed state:
sudo monit summary
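If so, run the following command from the mysql VM to disable monitoring, where PROCESS-NAME is the name of the failed process as reported by monit summary, for example galera-init or mariadb_ctrl:
sudo monit unmonitor PROCESS-NAME
Run the following command to enable monitoring:
sudo monit monitor PROCESS-NAME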
After a few minutes, run the following command:
sudo monit summary
The command should report that all the processes are running.
Re-attempt the restore with BBR.
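The monit check in the procedure above can be sketched as a small filter over the monit summary output. This is a minimal sketch, assuming the summary prints one line per process in the form Process 'NAME' STATUS; failed_monit_processes is a hypothetical helper name.

```shell
# failed_monit_processes
# Reads `monit summary` output on stdin and prints the name of every process
# whose status is "Execution failed" (matched case-insensitively).
# Assumption: process lines look like:  Process 'galera-init'   Execution failed
failed_monit_processes() {
  grep -i 'execution failed' | sed -n "s/^Process '\([^']*\)'.*/\1/p"
}

# Example use on the mysql VM (not run here):
#   for p in $(sudo monit summary | failed_monit_processes); do
#     sudo monit unmonitor "$p"
#     sudo monit monitor "$p"
#   done
```

If the filter prints nothing, monit does not report any process in the failed state, and the restart steps are probably not what is blocking the restore.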
The following error displays:
Deployment 'deployment-name' does not match the structure of the provided backup
The instance groups with the restore scripts in the destination environment don’t match the backup metadata. For example, they may have the wrong number of instances of a particular instance group, or the metadata names an instance group that does not exist in the destination environment.
BBR only supports restoring to an environment that matches your original environment. Pivotal recommends altering the destination environment to match the structure of the backup.
This section provides solutions for general errors.
BBR displays an error message containing “SSH Dial Error” or another connection error.
The jumpbox and the VMs in the deployment are experiencing connection problems.
Perform the following steps:
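To ensure your deployment is healthy, run the following command:
bosh -e BOSH-DIRECTOR-IP --ca-cert /var/tempest/workspaces/default/root_ca_certificate -d DEPLOYMENT-NAME vms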
To clean up the data from the failed backup on the instances, run the following command:
bbr deployment backup-cleanup
Note: You must perform this step. Otherwise, further BBR commands fail.
Repeat the BBR operation.
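If connection errors are intermittent, the clean-up-then-repeat steps above can be wrapped in a small loop. This is a minimal sketch: retry_with_cleanup is a hypothetical helper, and the operation and cleanup arguments are names of shell functions that you define yourself to wrap your own bbr invocations with the flags your environment requires.

```shell
# retry_with_cleanup MAX_ATTEMPTS OPERATION CLEANUP
# Runs OPERATION until it succeeds or MAX_ATTEMPTS is reached.
# After each failed attempt it runs CLEANUP, and gives up if CLEANUP fails.
# OPERATION and CLEANUP are command or function names (hypothetical wrappers).
retry_with_cleanup() {
  max_attempts=$1
  operation=$2
  cleanup=$3
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    if "$operation"; then
      return 0
    fi
    echo "attempt $attempt failed; running cleanup" >&2
    "$cleanup" || return 1
    attempt=$((attempt + 1))
  done
  return 1
}

# Example use (not run here), with user-defined wrapper functions:
#   run_backup()  { bbr deployment ... backup; }          # fill in your own flags
#   run_cleanup() { bbr deployment ... backup-cleanup; }  # fill in your own flags
#   retry_with_cleanup 3 run_backup run_cleanup
```

Running the cleanup after every failed attempt mirrors the requirement that leftover backup data be removed before any further BBR command.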
BBR backup or restore fails with a metadata error:
1 error occurred: error 1: An error occurred while running metadata script for job redis-server on redis/0ce9f81f-1756-480b-8e3e-a4609b14b6a6: error from metadata
There is a problem with your PCF installation.
Contact Pivotal Support.