Managing Service Instances

See below for information about managing Data Flow service instances using the Cloud Foundry Command Line Interface (cf CLI). You can also manage Data Flow service instances using Pivotal Cloud Foundry® Apps Manager.

Note: To have read and write access to a Pivotal Spring Cloud Data Flow service instance, you must have the SpaceDeveloper role in the space where the service instance was created. If you have only the SpaceAuditor role in that space, you have read-only access to the service instance.
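
If you need the SpaceDeveloper role granted, a user with the appropriate permissions can assign it using cf set-space-role. A minimal sketch, reusing the user, org, and space names that appear in the examples below:

$ cf set-space-role user myorg development SpaceDeveloper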

Available Parameters

When creating or updating a Spring Cloud Data Flow service instance, you can configure the service instance using parameters passed to the cf CLI commands. See the following sections for information about the supported parameters.

Setting the Buildpack

Each Data Flow service instance can be given the name of a buildpack to use for deploying stream and task apps. You can set the buildpack for the service instance using a buildpack parameter given to cf create-service or cf update-service. To create a service instance that uses a buildpack named custom-java-buildpack to deploy apps, you might run:

$ cf create-service p-dataflow standard data-flow -c '{"buildpack": "custom-java-buildpack"}'

Setting Dependent Services

Each Data Flow service instance uses three dependent data services. Defaults for these services can be configured in the tile settings, and these defaults can be overridden for each individual service instance at create or update time.

Note: The service offerings with the plan proxy are proxy services used by Pivotal Spring Cloud Data Flow service instances. The Spring Cloud Data Flow service broker creates and deletes instances of these services automatically along with each Spring Cloud Data Flow service instance. Do not manually create or delete instances of these services.

General parameters used to configure dependent data services for a Data Flow service instance are listed below.

Parameter                       Function
relational-data-service.name    The name of the service to use for a relational database that stores Spring Cloud Data Flow metadata and task history.
relational-data-service.plan    The name of the service plan to use for the relational database service.
messaging-data-service.name     The name of the service to use for a RabbitMQ or Kafka server that facilitates event messaging.
messaging-data-service.plan     The name of the service plan to use for the RabbitMQ or Kafka service.
skipper-relational.name         The name of the service to use for a relational database used by the Skipper application.
skipper-relational.plan         The name of the service plan to use for a relational database used by the Skipper application.
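
For illustration only: assuming the broker accepts these parameters as a nested JSON object (mirroring the skipper and scheduler parameters described below), and assuming hypothetical offering and plan names p.mysql/db-small and p.rabbitmq/single-node, overriding the defaults might look like this:

$ cf create-service p-dataflow standard data-flow -c '{"relational-data-service": {"name": "p.mysql", "plan": "db-small"}, "messaging-data-service": {"name": "p.rabbitmq", "plan": "single-node"}}'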

Setting Maven Properties

Each Data Flow service instance can optionally specify Maven configuration properties. For the complete list of properties that can be specified, see the “Maven” section in the OSS Spring Cloud Data Flow documentation.

Maven configuration properties can be set for each Data Flow service instance using parameters given to cf create-service or cf update-service. To set the maven.remote-repositories.repo1.url property, you might use a command such as the following:

$ cf create-service p-dataflow standard data-flow -c '{"maven.remote-repositories.repo1.url": "https://repo.spring.io/libs-snapshot"}'

To configure a private Maven repository that requires authentication, you can provide a username and password, as in the following example:

$ cf create-service p-dataflow standard data-flow -c '{"maven.remote-repositories.repo1.url":"https://my.private.maven/repo","maven.remote-repositories.repo1.auth.username":"user","maven.remote-repositories.repo1.auth.password":"password"}'

Limiting Concurrent Tasks

Each Data Flow service instance can execute a maximum number of concurrently running tasks (the default limit is 10). You can configure this limit using a concurrent-task-limit parameter given to cf create-service or cf update-service:

$ cf create-service p-dataflow standard data-flow -c '{"concurrent-task-limit": 30}'

When the number of concurrent tasks reaches the specified limit, the Data Flow service instance will no longer launch new tasks until the number of running tasks is again below the limit.
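
To change the limit on an existing service instance, use cf update-service with the same parameter; for example, to raise the limit to 50 (an arbitrary illustrative value):

$ cf update-service data-flow -c '{"concurrent-task-limit": 50}'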

Enabling Task Support Only (No Streams)

Each Data Flow service instance can be configured to run tasks only (with stream support disabled). You can configure the service instance to enable only task support using a task-only parameter given to cf create-service or cf update-service:

$ cf create-service p-dataflow standard data-flow -c '{"task-only": true}'

With task-only set to true, the Spring Cloud Skipper backing app (with its associated relational database backing service instance and the messaging backing service instance) will not be deployed for the service instance, and the service instance’s dashboard (see Using the Dashboard) will not display the Streams tab.

Enabling Stream Support Only (No Tasks)

Each Data Flow service instance can be configured to run streams only (with task support disabled). You can configure the service instance to enable only stream support using a stream-only parameter given to cf create-service or cf update-service:

$ cf create-service p-dataflow standard data-flow -c '{"stream-only": true}'

With stream-only set to true, the service instance’s dashboard (see Using the Dashboard) will not display the Tasks tab.

Configuring Skipper Memory Allocation

When creating or updating a Data Flow service instance, you can set the memory allocation for the associated Spring Cloud Skipper server deployed to Pivotal Application Service (PAS). The default memory allocation for Skipper is 2 GB.

To configure a value for Skipper’s memory allocation, you can pass a skipper parameter (a JSON object with a single memory key) to the cf create-service or cf update-service command:

$ cf create-service p-dataflow standard data-flow -c '{"skipper": { "memory": "8G" }}'

Using Scheduler for PCF

You can use the Scheduler for PCF service with Pivotal Spring Cloud Data Flow to schedule task executions (see the Spring Cloud Data Flow OSS documentation on Scheduling Tasks). If you configure a Data Flow service instance to use Scheduler for PCF, the Data Flow broker will create a new Scheduler service instance in the Data Flow service instance’s backing space. This Scheduler service instance will be bound to the Data Flow server’s backing application.

To configure a Data Flow service instance to use Scheduler for PCF, pass a scheduler parameter to the cf create-service or cf update-service command. This parameter is a JSON object with the fields listed below.

Parameter   Function
name        The name of the scheduler service offering to use. Only the Scheduler for PCF service, scheduler-for-pcf, is supported at this time.
plan        The name of the service plan to use.

To create a Data Flow service instance that uses a Scheduler for PCF service instance with the standard plan, you might use a command such as the following:

$ cf create-service p-dataflow standard data-flow -c '{"scheduler": {"name": "scheduler-for-pcf", "plan": "standard"}}'
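
Because the -c flag takes a single JSON object, the parameters described in the previous sections can be combined in one command. A sketch combining the buildpack, concurrent-task-limit, and scheduler parameters (the combination itself is an assumption, not taken from the examples above):

$ cf create-service p-dataflow standard data-flow -c '{"buildpack": "custom-java-buildpack", "concurrent-task-limit": 30, "scheduler": {"name": "scheduler-for-pcf", "plan": "standard"}}'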

Creating an Instance

Begin by targeting the correct org and space.

$ cf target -o myorg -s development
api endpoint:   https://api.system.example.com
api version:    2.75.0
user:           user
org:            myorg
space:          development

You can view the available Data Flow service offerings using cf marketplace, and plan details for the Data Flow product using cf marketplace -s.

$ cf marketplace
Getting services from marketplace in org myorg / space development as user...
OK

service             plans    description
p-dataflow          standard Deploys Spring Cloud Data Flow servers to orchestrate data pipelines
p-dataflow-mysql    proxy    Proxies to the Spring Cloud Data Flow MySQL service instance
p-dataflow-rabbitmq proxy    Proxies to the Spring Cloud Data Flow RabbitMQ service instance

TIP:  Use 'cf marketplace -s SERVICE' to view descriptions of individual plans of a given service.

$ cf marketplace -s p-dataflow
Getting service plan information for service p-dataflow as user...
OK

service plan   description     free or paid
standard       Standard Plan   free

Create the service instance using cf create-service. To create a Data Flow service instance that sets the Maven maven.remote-repositories.repo1.url property to https://repo.spring.io/libs-snapshot, you might run:

$ cf create-service p-dataflow standard data-flow -c '{ "maven.remote-repositories.repo1.url": "https://repo.spring.io/libs-snapshot" }'
Creating service instance data-flow in org myorg / space development as user...
OK

Create in progress. Use 'cf services' or 'cf service data-flow' to check operation status.

As the command output suggests, you can use the cf services or cf service commands to check the status of the service instance. When the service instance is ready, the cf service command will give a status of create succeeded:

$ cf service data-flow

Service instance: data-flow
Service: p-dataflow
Bound apps:
Tags:
Plan: standard
Description: Deploys Spring Cloud Data Flow servers to orchestrate data pipelines
Documentation url: https://cloud.spring.io/spring-cloud-dataflow/
Dashboard: https://p-dataflow.apps.example.com/instances/f09e5c77-e526-4f49-86d6-721c6b8e2fd9/dashboard

Last Operation
Status: create succeeded
Message: Created
Started: 2017-07-20T18:24:14Z
Updated: 2017-07-20T18:26:17Z

Updating an Instance

You can update settings on a Data Flow service instance using the cf CLI. The cf update-service command can be given a -c flag with a JSON object containing parameters used to configure the service instance.

Note: If you upgrade a Data Flow service instance created using Pivotal Spring Cloud Data Flow v1.3.x to the version included in v1.4.0, the upgrade will delete the metrics app and Redis analytics backing service instance for the Data Flow service instance. The metrics app and Redis service are no longer used in Pivotal Spring Cloud Data Flow v1.4.0.

Begin by targeting the correct org and space.

$ cf target -o myorg -s development
api endpoint:   https://api.system.example.com
api version:    2.75.0
user:           user
org:            myorg
space:          development

You can view all service instances in the space using cf services.

$ cf services
Getting services in org myorg / space development as user...
OK

name                                           service              plan      bound apps  last operation
data-flow                                      p-dataflow           standard              create succeeded
mysql-b3e76c87-c5ae-47e4-a83c-5fabf2fc4f11     p-dataflow-mysql     proxy                 create succeeded
rabbitmq-b3e76c87-c5ae-47e4-a83c-5fabf2fc4f11  p-dataflow-rabbitmq  proxy                 create succeeded

Run cf update-service SERVICE_NAME -c '{ "PARAMETER": "VALUE" }', where SERVICE_NAME is the name of the service instance, PARAMETER is a supported parameter (see Available Parameters), and VALUE is the value for the parameter. To upgrade a service instance to the latest version included in the tile, include the parameter upgrade with value true.

$ cf update-service data-flow -c '{"upgrade": true}'
Updating service instance data-flow as user...
OK

Update in progress. Use 'cf services' or 'cf service data-flow' to check operation status.

As the output from the cf update-service command suggests, you can use the cf services or cf service commands to check the status of the service instance. When the Data Flow service instance has been updated, the cf service command will give a status of update succeeded:

$ cf service data-flow
Showing info of service data-flow in org myorg / space development as user...

name:            data-flow
service:         p-dataflow
bound apps:
tags:
plan:            standard
description:     Deploys Spring Cloud Data Flow servers to orchestrate data pipelines
documentation:
dashboard:       https://p-dataflow.apps.example.com/instances/1cf8ff5b-4a65-469d-bee7-36e6541ac241/dashboard

Showing status of last operation from service data-flow...

status:    update succeeded
message:   Updated
started:   2018-06-19T19:26:09Z
updated:   2018-06-19T19:29:17Z

Deleting an Instance

Deleting a Data Flow service instance also deletes all of its dependent service instances.

Begin by targeting the correct org and space.

$ cf target -o myorg -s development
api endpoint:   https://api.system.example.com
api version:    2.75.0
user:           user
org:            myorg
space:          development

You can view all service instances in the space using cf services.

$ cf services
Getting services in org myorg / space development as user...
OK

name                                           service              plan      bound apps  last operation
data-flow                                      p-dataflow           standard              create succeeded
mysql-b3e76c87-c5ae-47e4-a83c-5fabf2fc4f11     p-dataflow-mysql     proxy                 create succeeded
rabbitmq-b3e76c87-c5ae-47e4-a83c-5fabf2fc4f11  p-dataflow-rabbitmq  proxy                 create succeeded

Delete the Data Flow service instance using cf delete-service. When prompted, enter y to confirm the deletion.

$ cf delete-service data-flow

Really delete the service data-flow?>y
Deleting service data-flow in org myorg / space development as user...
OK

Delete in progress. Use 'cf services' or 'cf service data-flow' to check operation status.

The dependent service instances for the Data Flow server service instance are deleted first, and then the Data Flow server service instance itself is deleted.

As the output from the cf delete-service command suggests, you can use the cf services or cf service commands to check the status of the service instance. When the Data Flow service instance and its dependent service instances have been deleted, the cf services command will no longer list the service instance:

$ cf services
Getting services in org myorg / space development as user...
OK

No services found