Troubleshooting for BOSH Operators

This topic provides troubleshooting information for BOSH operators.

For more troubleshooting information, see Troubleshooting for Ops Manager Operators.

Administer Service Instances

Pivotal recommends using the BOSH CLI to administer the deployments created by the on-demand broker (ODB), for example, to check VMs, SSH into instances, and view logs. For more information about installing the BOSH CLI, see Install.

Pivotal discourages using the BOSH CLI to update or delete ODB service deployments, because doing so causes cf update-service and cf delete-service operations to fail while the BOSH operation is in progress.

In addition, cf update-service and the upgrade-all-service-instances errand revert any changes you make to the deployment with the BOSH CLI. Make all updates to service instances using the upgrade-all-service-instances errand. For more information, see Upgrade All Service Instances.

Logs and Metrics

Logs

The ODB writes logs to a log file and to syslog.

The broker log contains error messages and non-zero exit codes returned by the service adapter, as well as the stdout and stderr streams of the adapter.

The log file is located at /var/vcap/sys/log/broker/broker.log. In syslog, logging is written with the tag on-demand-service-broker, under the facility user, with priority info.

If you want to forward syslog to a syslog aggregator, see Syslog Forwarding for Errand Logs below.

The ODB generates a UUID for each request and prefixes all log lines for that request with it. For example:

[on-demand-service-broker] [4d63080d-e038-45a3-85f9-93910f6b40b1] 2016/09/05 16:43:26.123456 a valid UAA token was found in cache, will not obtain a new one

Note: The ODB’s HTTP server and startup logs are not prefixed with a request ID.

All ODB logs have a UTC timestamp.
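Because every log line for a request carries the same UUID prefix, you can collect all lines for one request with standard shell tools. A minimal sketch, using the example line above; the sed pattern is an illustration, not part of the broker:

```shell
#!/bin/sh
# Example broker log line, taken from the docs above
LOG_LINE='[on-demand-service-broker] [4d63080d-e038-45a3-85f9-93910f6b40b1] 2016/09/05 16:43:26.123456 a valid UAA token was found in cache, will not obtain a new one'

# Extract the request UUID (the second bracketed field)
REQUEST_ID=$(printf '%s\n' "$LOG_LINE" | sed -n 's/^\[[^]]*\] \[\([^]]*\)\].*/\1/p')
echo "$REQUEST_ID"

# All lines for that request can then be pulled from the broker log:
#   grep "$REQUEST_ID" /var/vcap/sys/log/broker/broker.log
```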

Syslog Forwarding for Errand Logs

If you want to forward your errand logs to a syslog aggregator, Pivotal recommends colocating the syslog release with the errand job. For more information, see the syslog release repository in GitHub.

Example manifest:

- name: delete-all-service-instances-and-deregister-broker
  lifecycle: errand
  ...
  jobs:
  - name: delete-all-service-instances-and-deregister-broker
    release: on-demand-service-broker
    ...
  - name: syslog_forwarder
    release: syslog
    properties:
      syslog:
        address: ((syslog.address))
        port: ((syslog.port))
        transport: udp
        forward_files: false
        custom_rule: |
          module(load="imfile" mode="polling")
          input(type="imfile"
                File="/var/vcap/sys/log/delete-all-service-instances-and-deregister-broker/errand.stdout.log"
                Tag="delete-all-service-instances-and-deregister-broker")
          input(type="imfile"
                File="/var/vcap/sys/log/delete-all-service-instances-and-deregister-broker/errand.stderr.log"
                Tag="delete-all-service-instances-and-deregister-broker")

Note: The errand is configured to redirect stdout and stderr to /var/vcap/sys/log/ERRAND_NAME/errand.stdout.log and /var/vcap/sys/log/ERRAND_NAME/errand.stderr.log. When configuring your errand, be careful to match the actual log file paths in the custom_rule section.

Metrics

If you have configured broker metrics, the broker emits metrics to the Loggregator Firehose. For configuration instructions, see Configure Service Metrics.

You can consume these metrics by using the CF CLI Firehose plugin. See the firehose-plugin repository in GitHub.

Note: The broker must be registered with Cloud Foundry for metrics to be emitted successfully. For how to register the broker, see Register Broker.

Service-level Metrics

The broker emits a metric indicating the total number of instances across all plans. In addition, if a global quota is set for the service, the broker emits a metric showing how much of that quota remains. Service-level metrics use the following format:

origin:"BROKER-DEPLOYMENT-NAME" eventType:ValueMetric timestamp:TIMESTAMP deployment:"BROKER-DEPLOYMENT-NAME" job:"broker" index:"BOSH-JOB-INDEX" ip:"IP" valueMetric:<name:"/on-demand-broker/SERVICE-OFFERING-NAME/total_instances" value:INSTANCE-COUNT unit:"count" >
origin:"BROKER-DEPLOYMENT-NAME" eventType:ValueMetric timestamp:TIMESTAMP deployment:"BROKER-DEPLOYMENT-NAME" job:"broker" index:"BOSH-JOB-INDEX" ip:"IP" valueMetric:<name:"/on-demand-broker/SERVICE-OFFERING-NAME/quota_remaining" value:QUOTA-REMAINING unit:"count" >

Plan-level Metrics

For each service plan, the metrics report the total number of instances for that plan. If there is a quota set for the plan, the metrics also report how much of that quota is remaining. Plan-level metrics are emitted in the following format.

origin:"BROKER-DEPLOYMENT-NAME" eventType:ValueMetric timestamp:TIMESTAMP deployment:"BROKER-DEPLOYMENT-NAME" job:"broker" index:"BOSH-JOB-INDEX" ip:"IP" valueMetric:<name:"/on-demand-broker/SERVICE-OFFERING-NAME/PLAN-NAME/total_instances" value:INSTANCE-COUNT unit:"count" >
origin:"BROKER-DEPLOYMENT-NAME" eventType:ValueMetric timestamp:TIMESTAMP deployment:"BROKER-DEPLOYMENT-NAME" job:"broker" index:"BOSH-JOB-INDEX" ip:"IP" valueMetric:<name:"/on-demand-broker/SERVICE-OFFERING-NAME/PLAN-NAME/quota_remaining" value:QUOTA-REMAINING unit:"count" >

If quota_remaining is 0, increase your plan quota in the BOSH manifest.
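When watching these metrics in the Firehose plugin output, the value can be scraped with standard tools. A sketch using a hypothetical quota_remaining line for a service offering named redis; the service name and the sed pattern are illustrative, not part of the broker:

```shell
#!/bin/sh
# Hypothetical ValueMetric line for a service offering named "redis"
METRIC='valueMetric:<name:"/on-demand-broker/redis/quota_remaining" value:2 unit:"count" >'

# Pull out the numeric value
REMAINING=$(printf '%s\n' "$METRIC" | sed -n 's/.*value:\([0-9][0-9]*\).*/\1/p')
echo "$REMAINING"

# Alert when the quota is exhausted
if [ "$REMAINING" -eq 0 ]; then
  echo "quota exhausted: increase the plan quota in the BOSH manifest"
fi
```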

Secure Binding Credentials

If you have configured secure binding credentials, the broker stores credentials on runtime CredHub. For more information, see Enable Secure Binding.

You can view and consume these credentials using the CredHub CLI. For more information, see the credhub-cli repository in GitHub.

Note: Usually, CredHub is not accessible from outside the Cloud Foundry network. Use the CredHub CLI from within the internal network, or connect using an appropriate tunnel.

In failure scenarios, such as when CredHub is down or the CredHub client credentials are wrong, the broker writes the root cause to /var/vcap/sys/log/broker/broker.log. For more information, see Logs above.

Common Causes of Errors

The following are some reasons that you might get an error:

  • CredHub is down, the CredHub URL is wrong, or the URL cannot be accessed
  • Incorrect credentials for accessing CredHub
  • Problems with the CA certificates for CredHub or UAA
  • Binding credentials in an unsupported format (the broker only accepts string and string-map credentials)

Identify Deployments in BOSH

There is a one-to-one mapping between the service instance ID from Cloud Foundry and the deployment name in BOSH. The convention is that the BOSH deployment name is the service instance ID prefixed by service-instance_. To identify the BOSH deployment for a service instance, do the following:

  1. Determine the GUID of the service. Run the following command:

    cf service --guid SERVICE-NAME
    

    Where SERVICE-NAME is the name of your service.

    For example:

    $ cf service --guid my-service
    
    Record the GUID in the output of the command.

  2. Identify your deployment. Run bosh deployments and look for service-instance_GUID.

  3. (Optional) Get current tasks for your deployment. Run the following command:

    bosh tasks -d service-instance_GUID
    

    Where GUID is the GUID for your service instance, which you retrieved above.

    For example:

    $ bosh tasks -d \
    service-instance_30d4a67f-d220-4d06-9989-58a976b86b35
    
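The mapping in step 2 can also be scripted. A minimal sketch; the helper name deployment_for is an invention for illustration, and the GUID is the one from the example above:

```shell
#!/bin/sh
# Map a Cloud Foundry service instance GUID to its BOSH deployment name.
# By convention, the ODB names deployments "service-instance_" + GUID.
deployment_for() {
  printf 'service-instance_%s\n' "$1"
}

# GUID as returned by `cf service --guid my-service` in the example above
GUID="30d4a67f-d220-4d06-9989-58a976b86b35"
DEPLOYMENT=$(deployment_for "$GUID")
echo "$DEPLOYMENT"

# The result can then be passed to BOSH, for example:
#   bosh tasks -d "$DEPLOYMENT"
```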

Identify Tasks in BOSH

Most operations on the on-demand service broker API are implemented by launching BOSH tasks. If an operation fails, investigating the corresponding BOSH task can be useful. For more information about BOSH tasks, see Tasks in the BOSH documentation.

To identify tasks in BOSH, do the following:

  1. Determine the ID of the service for which an operation failed. Run the following command:

    cf service --guid SERVICE-NAME
    

    Where SERVICE-NAME is the name of your service.

    For example:

    $ cf service --guid my-service
    
    Record the GUID in the output of the command.

  2. SSH into the service broker VM. Run the following command:

    bosh -d BROKER-DEPLOYMENT-NAME ssh
    

    Where BROKER-DEPLOYMENT-NAME is the name of your broker deployment.

    For example:

    $ bosh -d my-broker ssh
    

  3. In the broker log, look for lines relating to the service, identified by the service ID. Lines recording the starting and finishing of BOSH tasks also have the BOSH task ID:

    on-demand-service-broker: [on-demand-service-broker] [4d63080d-e038-45a3-85f9-93910f6b40b1] 2016/04/13 09:01:50.793965 Bosh task id for Create instance 30d4a67f-d220-4d06-9989-58a976b86b35 was 11470
    on-demand-service-broker: [on-demand-service-broker] [4d63080d-e038-45a3-85f9-93910f6b40b1] 2016/04/13 09:06:55.793976 task 11470 success creating deployment for instance 30d4a67f-d220-4d06-9989-58a976b86b35: create deployment
    
    on-demand-service-broker: [on-demand-service-broker] [8bf5c9f6-7acd-4ab4-9214-363a6f6bef79] 2016/04/13 09:16:20.795035 Bosh task id for Update instance 30d4a67f-d220-4d06-9989-58a976b86b35 was 11473
    on-demand-service-broker: [on-demand-service-broker] [8bf5c9f6-7acd-4ab4-9214-363a6f6bef79] 2016/04/13 09:17:20.795181 task 11473 success updating deployment for instance 30d4a67f-d220-4d06-9989-58a976b86b35: create deployment
    
    on-demand-service-broker: [on-demand-service-broker] [af6fab15-c95e-438b-aa6b-bc4329d4154f] 2016/04/13 09:17:52.803824 Bosh task id for Delete instance 30d4a67f-d220-4d06-9989-58a976b86b35 was 11474
    on-demand-service-broker: [on-demand-service-broker] [af6fab15-c95e-438b-aa6b-bc4329d4154f] 2016/04/13 09:19:56.803938 task 11474 success deleting deployment for instance 30d4a67f-d220-4d06-9989-58a976b86b35: delete deployment service-instance_30d4a67f-d220-4d06-9989-58a976b86b35
    
  4. Use the task ID to obtain the task log from BOSH, adding flags such as --debug or --cpi as necessary. For example:

    $ bosh task 11470
    
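The task IDs in step 3 can be scraped from the broker log rather than read off by hand. A sketch over one of the example lines above; the sed pattern is an illustration, not part of the broker:

```shell
#!/bin/sh
# Broker log line recording the BOSH task for a create operation (from the docs above)
LINE='on-demand-service-broker: [on-demand-service-broker] [4d63080d-e038-45a3-85f9-93910f6b40b1] 2016/04/13 09:01:50.793965 Bosh task id for Create instance 30d4a67f-d220-4d06-9989-58a976b86b35 was 11470'

# Extract the trailing task ID
TASK_ID=$(printf '%s\n' "$LINE" | sed -n 's/.*Bosh task id for .* was \([0-9][0-9]*\)$/\1/p')
echo "$TASK_ID"

# The ID can then be fed to BOSH, for example:
#   bosh task "$TASK_ID" --debug
```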

Identify Issues When Connecting to BOSH or UAA

The ODB interacts with the BOSH Director to provision and deprovision instances, and is authenticated through the Director’s UAA. For an example configuration, see kafka-example-service-adapter-release in GitHub.

If BOSH or UAA are configured incorrectly in the broker’s manifest, then error messages are displayed in the broker’s log. These messages indicate whether the issue is caused by an unreachable destination or bad credentials.

For example:

on-demand-service-broker: [on-demand-service-broker]
[575afbc1-b541-481d-9cde-b3d3e67e87bf] 2016/05/18 15:56:40.100579
Error authenticating (401): {"error":"unauthorized","error_description":
"Bad credentials"}, ensure that properties.BROKER-JOB.bosh.authentication.uaa is
correct and try again.

List Service Instances

The ODB persists the list of ODB-deployed service instances and provides an endpoint for retrieving it. This endpoint requires basic authentication.

During disaster recovery, you can use this endpoint to assess the situation.

Request:

GET http://USERNAME:PASSWORD@ON-DEMAND-BROKER-IP:8080/mgmt/service_instances

Response:

200 OK

Example JSON body:

  [
    {
      "instance_id": "4d19462c-33cf-11e6-91cc-685b3585cc4e",
      "plan_id": "60476620-33cf-11e6-a841-685b3585cc4e",
      "bosh_deployment_name": "service-instance_4d19462c-33cf-11e6-91cc-685b3585cc4e"
    },
    {
      "instance_id": "57014734-33cf-11e6-ba8d-685b3585cc4e",
      "plan_id": "60476620-33cf-11e6-a841-685b3585cc4e",
      "bosh_deployment_name": "service-instance_57014734-33cf-11e6-ba8d-685b3585cc4e"
    }
  ]
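During recovery, the deployment names in the response can be pulled out with standard tools. A sketch over the example body above; the curl invocation in the comment assumes you know the broker's basic-auth credentials and IP:

```shell
#!/bin/sh
# In practice the body comes from the endpoint, for example:
#   curl -s -u "USERNAME:PASSWORD" "http://ON-DEMAND-BROKER-IP:8080/mgmt/service_instances"
# Here it is inlined from the example above:
RESPONSE='[{"instance_id":"4d19462c-33cf-11e6-91cc-685b3585cc4e","plan_id":"60476620-33cf-11e6-a841-685b3585cc4e","bosh_deployment_name":"service-instance_4d19462c-33cf-11e6-91cc-685b3585cc4e"},{"instance_id":"57014734-33cf-11e6-ba8d-685b3585cc4e","plan_id":"60476620-33cf-11e6-a841-685b3585cc4e","bosh_deployment_name":"service-instance_57014734-33cf-11e6-ba8d-685b3585cc4e"}]'

# List only the BOSH deployment names, one per line
DEPLOYMENTS=$(printf '%s' "$RESPONSE" | grep -o '"bosh_deployment_name":"[^"]*"' | cut -d'"' -f4)
echo "$DEPLOYMENTS"
```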

List Orphan Deployments

ODB provides an endpoint that compares the list of service instance deployments against the service instances registered in Cloud Foundry. When called, the endpoint returns a list of orphaned deployments, if any are present.

This endpoint is exercised in the orphan-deployments errand. For information about this errand, see Orphan Deployments. To call this endpoint without running the errand, use curl.

Request:

GET http://USERNAME:PASSWORD@ON-DEMAND-BROKER-IP:8080/mgmt/orphan_deployments

Response:

200 OK

Example JSON body:

  [
    {
      "deployment_name": "service-instance_d482abd3-8051-48d2-8067-9ccdf02327f3"
    }
  ]

Knowledge Base (Community)

Find the answer to your question and browse product discussions and solutions by searching the VMware Tanzu Knowledge Base.

File a Support Ticket

You can file a ticket with Support. Be sure to provide the error message from cf service YOUR-SERVICE-INSTANCE.

To expedite troubleshooting, provide your service broker logs and your service instance logs. If your cf service YOUR-SERVICE-INSTANCE output includes a task-id, provide the BOSH task output.