Getting Started with Spring Cloud Data Flow for VMware Tanzu


This topic describes how to get started using Spring Cloud Data Flow for VMware Tanzu. The examples below use a Spring Cloud Data Flow service instance to quickly create a data pipeline.

Note: To have read and write access to a Spring Cloud Data Flow for VMware Tanzu service instance, you must have the SpaceDeveloper role in the space where the service instance was created. If you have only the SpaceAuditor role in that space, you have read but not write access to the service instance.
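For example, a user with the Org Manager role can grant the SpaceDeveloper role using the cf CLI. The user name below is a placeholder; the org and space names match the examples in this topic:

$ cf set-space-role user@example.com myorg dev SpaceDeveloper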

Consider installing the Spring Cloud Data Flow for VMware Tanzu and Service Instance Logs cf CLI plugins. See the Using the Shell and Viewing Service Instance Logs topics.
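If you have downloaded the plugin binaries, you can install each one with the cf CLI’s install-plugin command. The file names below are placeholders for the binaries you downloaded:

$ cf install-plugin ./spring-cloud-dataflow-plugin
$ cf install-plugin ./service-instance-logs-plugin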

The examples in this topic use the Spring Cloud Data Flow for VMware Tanzu cf CLI plugin.

Creating a Data Pipeline Using the Shell

Create a Spring Cloud Data Flow service instance (see the Creating an Instance section of the Managing Service Instances topic). If you use the default backing data services of VMware Tanzu SQL [MySQL] and RabbitMQ for VMware Tanzu, you can then import the Spring Cloud Data Flow OSS “RabbitMQ + Maven” stream app starters.
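As a sketch, assuming the service offering name p-dataflow and the plan defaults (run cf marketplace to confirm the names in your environment), you can create the service instance used in the examples below:

$ cf create-service p-dataflow defaults my-dataflow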

Start the Data Flow shell using the cf dataflow-shell command added by the Spring Cloud Data Flow for VMware Tanzu cf CLI plugin:

$ cf dataflow-shell my-dataflow
Attaching shell to dataflow service my-dataflow in org myorg / space dev as user...
Welcome to the Spring Cloud Data Flow shell. For assistance hit TAB or type "help".
Successfully targeted https://dataflow-9f45f80b-c6b6-43dd-a7d4-e43f14990ffd.apps.example.com
dataflow:>

Import the stream app starters using the Data Flow shell’s app import command:

dataflow:>app import https://dataflow.spring.io/rabbitmq-maven-latest
Successfully registered 65 applications from [source.sftp.metadata,
sink.throughput.metadata, ... sink.router.metadata, sink.mongodb]
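To confirm the import, you can list the registered applications using the shell’s app list command (output omitted here):

dataflow:>app list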

With the app starters imported, you can use three of them (the http source, the splitter processor, and the log sink) to create a stream that consumes data via an HTTP POST request, processes it by splitting it into words, and outputs the results in logs.

Create the stream using the Data Flow shell’s stream create command:

dataflow:>stream create --name words --definition "http | splitter --expression=payload.split(' ') | log"
Created new stream 'words'

Next, deploy the stream, using the stream deploy command:

dataflow:>stream deploy words
Deployment request has been sent for stream 'words'
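The stream deploy command also accepts deployment properties. As a sketch, assuming you want to run two instances of the log sink, you can set the standard Spring Cloud Data Flow deployer.<app>.count deployment property:

dataflow:>stream deploy words --properties "deployer.log.count=2"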

Creating a Data Pipeline Using the Dashboard

Create a Spring Cloud Data Flow service instance (see the Creating an Instance section of the Managing Service Instances topic). If you use the default backing data services of VMware Tanzu SQL [MySQL] and RabbitMQ for VMware Tanzu, you can then import the Spring Cloud Data Flow OSS “RabbitMQ + Maven” stream app starters.

In Apps Manager, visit the Spring Cloud Data Flow service instance’s page and click Manage to access its dashboard.

Service Instance Page

This will take you to the dashboard’s Apps tab, where you can import applications. Click Add Application(s).

Dashboard Apps Tab, With No Apps Imported

Select Bulk import application coordinates from an HTTP URI location, then click the Stream Apps (RabbitMQ/Maven) link (or manually enter https://dataflow.spring.io/rabbitmq-maven-latest in the URI field). Finally, click Import the application(s).

Bulk Importing Applications

With the app starters imported, visit the Streams tab. You can use three of the imported starter applications (the http source, the splitter processor, and the log sink) to create a stream that consumes data via an HTTP POST request, processes it by splitting it into words, and outputs the results in logs.

Dashboard Streams Tab, With No Streams Created

Click Create stream(s) to enter the stream creation view. In the left sidebar, search for the http source application, then drag it onto the canvas to begin defining a stream.

Dashboard Streams Tab: Creating a Stream

Search for and add the splitter processor application and log sink application.

Dashboard Streams Tab: Adding Applications to a Stream

Click the splitter application, then click the gear icon beside it to edit its properties. In the expression field, enter payload.split(' '). Click OK.

Dashboard Streams Tab: Editing an Application's Properties

Click and drag between the output and input ports on the applications to connect them and complete the stream.

Dashboard Streams Tab: A Completed Stream
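The text field above the canvas should now show a stream definition equivalent to the DSL used in the shell example earlier in this topic:

http | splitter --expression=payload.split(' ') | log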

Click the Create Stream button. Type the name “words”, then click Create the stream.

Dashboard Streams Tab: Naming a Stream

The Streams tab now displays the new stream. Click the button to deploy the stream.

Dashboard Streams Tab, With A Stream

Click Deploy stream.

Dashboard Streams Tab: Deploying a Stream

Using the Deployed Data Pipeline

You can run the cf apps command to see the applications deployed as part of the stream:

$ cf apps
Getting apps in org myorg / space dev as user...
OK

name                        requested state   instances   memory   disk   urls
RWSDZgk-words-http-v1       started           1/1         1G       1G     RWSDZgk-words-http-v1.apps.example.com
RWSDZgk-words-log-v1        started           1/1         1G       1G     RWSDZgk-words-log-v1.apps.example.com
RWSDZgk-words-splitter-v1   started           1/1         1G       1G     RWSDZgk-words-splitter-v1.apps.example.com

Run the cf logs command on the RWSDZgk-words-log-v1 application:

$ cf logs RWSDZgk-words-log-v1
Retrieving logs for app RWSDZgk-words-log-v1 in org myorg / space dev as user...

Then, in a separate terminal window, use the Data Flow shell (started using the cf dataflow-shell command) to send a POST request to the RWSDZgk-words-http-v1 application:

$ cf dataflow-shell dataflow
...
Welcome to the Spring Cloud Data Flow shell. For assistance hit TAB or type "help".
dataflow:>http post --target https://RWSDZgk-words-http-v1.apps.example.com --data "This is a test"
> POST (text/plain) https://RWSDZgk-words-http-v1.apps.example.com This is a test
> 202 ACCEPTED
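If you prefer not to use the Data Flow shell, you can send an equivalent request with curl. The route below matches the example output above:

$ curl -X POST -H "Content-Type: text/plain" -d "This is a test" https://RWSDZgk-words-http-v1.apps.example.com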

Watch for the processed data in the RWSDZgk-words-log-v1 application’s logs:

2018-06-07T16:47:08.80-0500 [APP/PROC/WEB/0] OUT 2018-06-07 21:47:08.808  INFO 16 --- [plitter.words-1] RWSDZgk-words-log-v1                     : This
2018-06-07T16:47:08.81-0500 [APP/PROC/WEB/0] OUT 2018-06-07 21:47:08.810  INFO 16 --- [plitter.words-1] RWSDZgk-words-log-v1                     : is
2018-06-07T16:47:08.82-0500 [APP/PROC/WEB/0] OUT 2018-06-07 21:47:08.820  INFO 16 --- [plitter.words-1] RWSDZgk-words-log-v1                     : a
2018-06-07T16:47:08.82-0500 [APP/PROC/WEB/0] OUT 2018-06-07 21:47:08.822  INFO 16 --- [plitter.words-1] RWSDZgk-words-log-v1                     : test