Migrating to a TLS-Enabled Cluster

An existing PCC service instance that does not use TLS encryption can be migrated to a PCC service instance with TLS encryption enabled.

WARNING! This procedure will require downtime for the service instance during the migration.

Follow the procedure given here after these prerequisites have been met:

  • All steps within Preparing for TLS have been completed.
  • The service instance has been upgraded to PCC v1.5.2 or a more recent PCC version. There will be no PCC version change during the migration.

Follow this procedure to migrate the existing PCC service instance:

  1. As a PCF operator, stop all apps. First, list the apps to identify each APP_NAME.
    $ cf apps
    
    Then, stop each app with:
    $ cf stop APP_NAME
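
When a space holds many apps, the list-then-stop sequence in step 1 can be scripted. The following is a minimal sketch, not a supported tool. It assumes that the cf apps output contains a header row whose first column is name, followed by one row per app; the exact layout can differ between cf CLI versions, so check it against your own output first. The function names list_app_names and stop_all_apps are hypothetical.

```shell
#!/usr/bin/env bash

# Print the first column of every non-empty row that follows the header row.
# Assumption: the header row of `cf apps` starts with the word "name".
list_app_names() {
  awk 'seen && NF { print $1 } $1 == "name" { seen = 1 }'
}

# Stop every app in the currently targeted org and space.
stop_all_apps() {
  cf apps | list_app_names | while read -r app; do
    # Redirect stdin so cf cannot consume the remaining app names.
    cf stop "$app" < /dev/null
  done
}
```

After sourcing the file and targeting the correct org and space, run stop_all_apps. The loop stops every app in the targeted space, so review the output of cf apps before running it.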
    
  2. For all non-persistent regions, use the gfsh command line tool to export the data.

    WARNING! Without an export, all non-persistent region entries will be irretrievably lost.

    • Complete the steps within Accessing a Service Instance to acquire the correct version of gfsh, run it, and connect to the cluster using a role that is able to do cluster management operations.
    • List the regions.
      gfsh>list regions 
    • For each region, use gfsh describe to determine if the region is persistent or not and to acquire a server name.
      gfsh>describe region --name=REGION_NAME 
    • For each non-persistent region, use this single gfsh command to export all the data within the region. The SERVER_NAME identifies which GemFire server receives the export command and propagates the command to all other GemFire servers within the cluster.
      gfsh>export data --parallel --region=REGION_NAME --member=SERVER_NAME --dir=/var/vcap/store/gemfire-server 
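
With several non-persistent regions, it can help to generate the export commands from a list of region names rather than typing each one. This is a sketch under the assumption that you have already collected the region names and one SERVER_NAME from the describe region output; print_export_commands is a hypothetical helper name, and the directory is the one used in the export command above.

```shell
#!/usr/bin/env bash

# Print one gfsh "export data" command per region.
# $1 is the SERVER_NAME; the remaining arguments are region names.
print_export_commands() {
  local server="$1"
  shift
  local region
  for region in "$@"; do
    printf 'export data --parallel --region=%s --member=%s --dir=/var/vcap/store/gemfire-server\n' \
      "$region" "$server"
  done
}
```

For example, print_export_commands SERVER_NAME regionA regionB prints two commands that can be pasted at the gfsh prompt one at a time. Keep the list of exported regions, because the same names are needed for the import later in this procedure.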
  3. Your PCF operator needs to target the BOSH Director in order to acquire the DEPLOYMENT_NAME.
    • Run
      $ cf service SERVICE_INSTANCE_NAME 
      to acquire the digits that uniquely identify the service instance. The digits (XXX-XXX in the following instructions) are those between cloudcache- and the period (.) that follows.
    • Log in to the BOSH Director.
      $ bosh log-in 
    • The DEPLOYMENT_NAME will appear in the output of
      $ bosh deployments | grep XXX-XXX 
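
The lookup in step 3 can be sketched as a small filter that pulls the identifier out of the cf service output. This assumes, as described above, that the output contains a host name of the form cloudcache-XXX-XXX followed by a period. The function names extract_instance_id and find_deployment_line are hypothetical.

```shell
#!/usr/bin/env bash

# Print the text between "cloudcache-" and the next period on each input line.
# Lines that do not contain that pattern produce no output.
extract_instance_id() {
  sed -n 's/.*cloudcache-\([^.]*\)\..*/\1/p'
}

# Print the `bosh deployments` row that matches the service instance.
# $1 is the SERVICE_INSTANCE_NAME. Requires a prior `bosh log-in`.
find_deployment_line() {
  local id
  id="$(cf service "$1" | extract_instance_id | head -n 1)"
  if [ -z "$id" ]; then
    echo "no cloudcache- identifier found for $1" >&2
    return 1
  fi
  bosh deployments | grep "$id"
}
```

Running find_deployment_line SERVICE_INSTANCE_NAME prints the matching row, from which the DEPLOYMENT_NAME can be read.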
  4. Using PCF operator credentials, stop the BOSH deployment:
    $ bosh -d DEPLOYMENT_NAME stop 
    and type “y” when prompted.
  5. Acquire the BOSH manifest with:
    $ bosh -d DEPLOYMENT_NAME manifest > DEPLOYMENT_NAME-manifest.yml 
  6. Edit the acquired BOSH manifest. Additions are required at three locations within the manifest file: directly under the gemfire: key in the properties of the gemfire-locator job, directly after the port: 8080 line of the cloudcache route in the route_registrar job, and at the end of the file, after the features: block. The following anonymized portion of the manifest file shows the manifest before the additions. The first part of the manifest file is omitted, as its listed values change based on the PCC version. Within this example, real passwords have been replaced with the placeholder password, and user names have been replaced with the placeholder userX.
    instance_groups:
    - name: locator
      instances: 3
      jobs:
      - name: gemfire-locator
        release: gemfire
        properties:
          gemfire:  
            distributed-system-id: 0
            locator:
              bpm_enabled: true
              port: '55221'
              properties:
                enable-time-statistics: true
            persist-pdx: true
            security:
              internal_cluster_password: password
              internal_cluster_username: userX
              roles:
                cluster_operator:
                - CLUSTER:WRITE
                - CLUSTER:READ
                - DATA:MANAGE
                - DATA:WRITE
                - DATA:READ
                - CLUSTER:MANAGE:DEPLOY
                - CLUSTER:MANAGE
                - CLUSTER:MANAGE:GATEWAY
                developer:
                - CLUSTER:READ
                - DATA:WRITE
                - DATA:READ
                gateway:
                - DATA:WRITE
              users:
                cluster_operator_userX:
                  password: password
                  roles:
                  - cluster_operator
                developer_userX:
                  password: password
                  roles:
                  - developer
      - name: route_registrar
        release: routing
        consumes:
          nats:
            deployment: cf-NNNNNNNNNNN
            from: nats
        properties:
          route_registrar:
            routes:
            - name: cloudcache
              port: 8080  
              registration_interval: 20s
              uris:
              - cloudcache-XXX-XXX.example.com
      - name: bpm
        release: bpm
      vm_type: micro.cpu
      stemcell: stemcell
      persistent_disk_type: '10240'
      azs:
      - us-central1-f
      networks:
      - name: example-services-subnet
    - name: server
      instances: 4
      jobs:
      - name: gemfire-server
        release: gemfire
        properties:
          gemfire:
            server:
              bpm_enabled: true
              create-gateway-receiver: true
              development-mode: false
              properties:
                enable-time-statistics: true
                jmx-manager-start: true
              security:
                gateway_password: password
                gateway_username: gateway_sender_userX
      - name: prime-cluster-for-pcc
        release: gemfire
      - name: bpm
        release: bpm
      vm_type: medium.cpu
      stemcell: stemcell
      persistent_disk_type: '10240'
      azs:
      - us-central1-f
      networks:
      - name: example-services-subnet
    update:
      canaries: 1
      canary_watch_time: 1000-600000
      update_watch_time: 1000-600000
      max_in_flight: 32
      serial: true
    features:
      converge_variables: true  
    
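    Add lines to the BOSH manifest as shown in the following modified version of the manifest. The added lines are the seven lines from tls: true through - ((/services/tls_ca)) under gemfire:, the tls_port and server_cert_domain_san lines under the cloudcache route, and the entire variables: block at the end. Substitute the digits that uniquely identify your service instance for XXX-XXX within the added lines.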
    instance_groups:
    - name: locator
      instances: 3
      jobs:
      - name: gemfire-locator
        release: gemfire
        properties:
          gemfire:  
            tls: true
            truststore_password: ((trust-store-password))
            keystore_password: ((key-store-password))
            certificate: ((gemfire-certificate))
            trusted_certs:
            - ((/cf/diego-instance-identity-root-ca))
            - ((/services/tls_ca))
            distributed-system-id: 0
            locator:
              bpm_enabled: true
              port: '55221'
              properties:
                enable-time-statistics: true
            persist-pdx: true
            security:
              internal_cluster_password: password
              internal_cluster_username: userX
              roles:
                cluster_operator:
                - CLUSTER:WRITE
                - CLUSTER:READ
                - DATA:MANAGE
                - DATA:WRITE
                - DATA:READ
                - CLUSTER:MANAGE:DEPLOY
                - CLUSTER:MANAGE
                - CLUSTER:MANAGE:GATEWAY
                developer:
                - CLUSTER:READ
                - DATA:WRITE
                - DATA:READ
                gateway:
                - DATA:WRITE
              users:
                cluster_operator_userX:
                  password: password
                  roles:
                  - cluster_operator
                developer_userX:
                  password: password
                  roles:
                  - developer
      - name: route_registrar
        release: routing
        consumes:
          nats:
            deployment: cf-NNNNNNNNNNN
            from: nats
        properties:
          route_registrar:
            routes:
            - name: cloudcache
              port: 8080  
              tls_port: 8080
              server_cert_domain_san: cloudcache-XXX-XXX.example.com
              registration_interval: 20s
              uris:
              - cloudcache-XXX-XXX.example.com
      - name: bpm
        release: bpm
      vm_type: micro.cpu
      stemcell: stemcell
      persistent_disk_type: '10240'
      azs:
      - us-central1-f
      networks:
      - name: example-services-subnet
    - name: server
      instances: 4
      jobs:
      - name: gemfire-server
        release: gemfire
        properties:
          gemfire:
            server:
              bpm_enabled: true
              create-gateway-receiver: true
              development-mode: false
              properties:
                enable-time-statistics: true
                jmx-manager-start: true
              security:
                gateway_password: password
                gateway_username: gateway_sender_userX
      - name: prime-cluster-for-pcc
        release: gemfire
      - name: bpm
        release: bpm
      vm_type: medium.cpu
      stemcell: stemcell
      persistent_disk_type: '10240'
      azs:
      - us-central1-f
      networks:
      - name: example-services-subnet
    update:
      canaries: 1
      canary_watch_time: 1000-600000
      update_watch_time: 1000-600000
      max_in_flight: 32
      serial: true
    features:
      converge_variables: true  
    variables:
    - name: trust-store-password
      type: password
    - name: key-store-password
      type: password
    - name: gemfire-certificate
      type: certificate
      options:
        ca: /services/tls_ca
        common_name: gemfire-ssl
        alternative_names:
        - gemfire-ssl
        - cloudcache-XXX-XXX.example.com
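
Before redeploying, a quick text check can catch an addition that was missed while editing. The sketch below only looks for the key names from the modified manifest above; it does not validate YAML indentation or placement, so it complements a careful read of the file rather than replacing one. The function name check_tls_additions is hypothetical.

```shell
#!/usr/bin/env bash

# Report every expected TLS-related key that is absent from the manifest,
# and flag an unreplaced XXX-XXX placeholder.
# $1 is the path to the edited manifest. Returns 0 only if nothing is flagged.
check_tls_additions() {
  local manifest="$1" missing=0 key
  for key in 'tls: true' 'truststore_password:' 'keystore_password:' \
             'certificate:' 'trusted_certs:' 'tls_port:' \
             'server_cert_domain_san:' 'name: trust-store-password' \
             'name: key-store-password' 'name: gemfire-certificate'; do
    if ! grep -qF -- "$key" "$manifest"; then
      echo "missing: $key"
      missing=1
    fi
  done
  if grep -qF 'XXX-XXX' "$manifest"; then
    echo "placeholder XXX-XXX has not been replaced"
    missing=1
  fi
  return "$missing"
}
```

For example, check_tls_additions DEPLOYMENT_NAME-manifest.yml prints nothing and returns 0 when all of the listed keys are present and the placeholder has been substituted.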
    
  7. Redeploy using the edited BOSH manifest:
    $ bosh -d DEPLOYMENT_NAME deploy DEPLOYMENT_NAME-manifest.yml 
    and type “y” when prompted.
  8. Restart the cluster with a sequential BOSH start:
    $ bosh -d DEPLOYMENT_NAME start --max-in-flight=1 
    and type “y” when prompted.
  9. Run gfsh and follow the directions in Connect with gfsh over HTTPS to connect to the TLS-enabled cluster.
  10. Use gfsh to import all region data that was exported earlier in this procedure. For each earlier-exported region, do:
    gfsh>import data --parallel --region=REGION_NAME --member=SERVER_NAME --dir=/var/vcap/store/gemfire-server 
  11. Revise the app so that it works with a TLS-enabled PCC service instance by following the instructions within Developing an App Under TLS. Rebuild, redeploy, and start the app.