Recovering and Upgrading Ops Manager

This topic provides an overview of upgrading and recovering Ops Manager with Platform Automation Toolkit, including common errors.

Always Export your Installation

It is recommended to regularly persist the zip file exported by export-installation to an external file store (e.g., S3). If Ops Manager becomes non-functional, the exported installation can be used to restore it to a working state.
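A minimal sketch of such a periodic export, assuming the om CLI and the AWS CLI are installed and that env.yml is the environment file used elsewhere in this topic. The bucket name my-opsman-backups and the timestamped file naming are hypothetical choices for illustration, not something Platform Automation Toolkit prescribes.

```shell
# Hypothetical bucket name; replace with your own external file store.
BUCKET="${BUCKET:-my-opsman-backups}"

# Build a timestamped file name so older exports are not overwritten.
export_name() {
  echo "installation-$(date -u +%Y%m%dT%H%M%SZ).zip"
}

# Export the installation with om, then copy the zip file to S3.
backup_installation() {
  file="$(export_name)"
  om --env env.yml export-installation --output-file "$file" || return 1
  aws s3 cp "$file" "s3://${BUCKET}/${file}"
}
```

Calling backup_installation from a scheduled job (for example cron or a pipeline timer) keeps a recent export available for recovery.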

Upgrading Ops Manager

When upgrading Ops Manager, keep the following in mind:

  • Always export the installation before upgrading.
  • Persist the exported installation.
  • Installation is separate from upgrade.
  • An initial installation has already been done, and it maintains state.

Upgrade Flowchart

The upgrade-opsman task follows a flow based on the state of the Ops Manager VM. This flowchart gives a high-level overview of how the task makes decisions during an upgrade.

graph TD;
    versionChk[Is the existing Ops Manager version less than the Ops Manager image?];
    versionError[Version Check Error];
    delete[Delete the existing Ops Manager VM];
    create[Create a new Ops Manager VM];
    iaasErr[IAAS CLI error];
    import[Import the provided Installation];
    iaasErr2[IAAS CLI error];
    versionChk -- No --> versionError;
    versionChk -- Yes --> delete;
    delete -- Success --> create;
    delete -- Failure --> iaasErr;
    create -- Success --> import;
    create -- Failure --> iaasErr2;

On successive invocations, the task behaves differently depending on the outcome of the previous run. This helps in recovering from failures (e.g., IAAS errors) that occur during an upgrade.
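The decision flow can be sketched in shell as follows. This is an illustration of the flowchart only, not the task's actual implementation: delete_vm, create_vm, and import_installation are hypothetical placeholders for the real steps, and the version comparison assumes a sort that supports -V (version sort), as GNU coreutils does.

```shell
# Returns 0 when version $1 is strictly lower than version $2.
# Assumes `sort -V` (version sort) is available.
version_lt() {
  [ "$1" != "$2" ] &&
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Walks the flowchart: version check, delete, create, import.
# delete_vm, create_vm and import_installation are hypothetical
# placeholders that the caller must define.
upgrade_flow() {
  existing="$1"
  image="$2"
  if ! version_lt "$existing" "$image"; then
    echo "version check error"
    return 1
  fi
  delete_vm || { echo "IAAS CLI error (delete)"; return 1; }
  create_vm || { echo "IAAS CLI error (create)"; return 1; }
  import_installation
}
```

For example, upgrade_flow 2.9.1 2.10.3 proceeds to the delete step, while upgrade_flow 2.10.3 2.10.3 stops at the version check.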

Recovering the Ops Manager VM

The upgrade-opsman task always deletes the existing VM. This is done to create a consistent and simplified experience across IAASs. For example, some IAASs have IP conflicts if multiple Ops Manager VMs are present.

If there is an issue during the upgrade process, you may need to recover your Ops Manager VM. Recovering your VM can be done in two different ways. Both methods require an exported installation.

  1. Recovery using the upgrade-opsman task. Depending on the error, the VM may be recoverable by re-running upgrade-opsman. This may or may not require a change to the state file, depending on whether an ensure is set for the state file resource.

  2. Manual recovery. The VM can always be recovered manually by deploying the Ops Manager image (OVA, raw, or yml) from Tanzu Network.

Below is a list of common errors when running upgrade-opsman.

  • Error: The Ops Manager API is inaccessible. Rerun the upgrade-opsman task. The task will assume that the Ops Manager VM is not created, and will run the create-vm and import-installation tasks.

  • Error: The CLI for a supported IAAS fails (e.g., because of a bad network or an outage). The specific error is returned as output, but most errors can be fixed by re-running the upgrade-opsman task.
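Because both errors above are typically resolved by re-running the task, the re-run can be wrapped in a small retry helper. The sketch below is one possible approach, not part of Platform Automation Toolkit; the retry function, the attempt count, and the delay are all hypothetical choices.

```shell
# retry MAX DELAY CMD...: run CMD until it succeeds or MAX attempts are used.
retry() {
  max="$1"
  delay="$2"
  shift 2
  attempt=1
  while ! "$@"; do
    if [ "$attempt" -ge "$max" ]; then
      echo "giving up after $attempt attempts" >&2
      return 1
    fi
    echo "attempt $attempt failed; retrying in ${delay}s" >&2
    sleep "$delay"
    attempt=$((attempt + 1))
  done
}

# Example invocation (file names are placeholders):
# retry 3 30 docker run --rm -v "$PWD":/workspace -w /workspace platform-automation-image \
#   p-automator upgrade-opsman --state-file state.yml \
#   --config opsman.yml --image-file opsman-image.yml \
#   --installation installation.zip --env-file env.yml
```

The state file must persist between attempts (here it lives in the mounted working directory) so that each re-run can pick up from the outcome of the previous one.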

Restoring the Original Ops Manager VM

In some cases, you may want to restore a previous Ops Manager VM before completing the upgrade process.

It is recommended to restore a previous Ops Manager VM manually. The Running Commands Locally How-to Guide is a helpful resource to get started with the manual process below.

  1. Run delete-vm on the failed or unwanted Ops Manager VM, using state.yml if applicable. opsman.yml is required for this command.

    docker run -it --rm -v $PWD:/workspace -w /workspace platform-automation-image \
    p-automator delete-vm --state-file state.yml --config opsman.yml
    

  2. Run create-vm using either an empty state.yml or the state output by the previous step. This command requires the image file (yml, ova, or raw) from Tanzu Network for the version that was originally deployed. opsman.yml is required for this command.

    docker run -it --rm -v $PWD:/workspace -w /workspace platform-automation-image \
    p-automator create-vm --config opsman.yml --image-file original-opsman-image.yml \
    --state-file state.yml
    

  3. Run import-installation. This command requires the exported installation of the original Ops Manager and the env.yml used by Platform Automation Toolkit.

    docker run -it --rm -v $PWD:/workspace -w /workspace platform-automation-image \
    om --env env.yml import-installation --installation installation.zip
    

Alternatively, these steps can be completed with a single upgrade-opsman command. This command requires all of the inputs described above.

docker run -it --rm -v $PWD:/workspace -w /workspace platform-automation-image \
p-automator upgrade-opsman --state-file state.yml \
--config opsman.yml --image-file original-opsman-image.yml \
--installation installation.zip --env-file env.yml