Talend Cloud How-to #4: Executing Data Integration Jobs with REST API

In the previous installment of the Talend Cloud How-to Series, we demonstrated how to use Talend Data Preparation to build data quality logic, then embedded that data preparation recipe in a Talend job and ran it manually as a flow. Now that our job is up and running, we will show you how to test a REST API with Swagger UI and Postman. Then we will build out an execution plan that includes scheduling.


Scheduling Jobs in the Cloud     

There are different ways to schedule data integration jobs or flows in Talend Cloud: single job, multi-job, webhooks, and the REST API. You can establish a schedule for a single flow (“Run it”). Alternatively, you can build an “execution plan” that ties multiple jobs together, so you can tell Talend Cloud to run the first job and, if it succeeds, run the second job, and so on. You can also build handlers to alert you via email if there are issues with a job.



A Note About Webhooks

Talend Cloud also supports execution via webhooks. Webhooks are custom HTTP callbacks that let one application push real-time data to another. Webhooks are great if you’re using Software as a Service (SaaS) applications. In Salesforce (https://www.talend.com/resources/integrating-with-salesforce/), for example, you might want to run a different job each time an account is updated, or a new account or a new opportunity is created. Salesforce can fire a webhook when that happens, and Talend Cloud can receive that webhook and execute a job in response.
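To make the mechanics concrete, “firing a webhook” is nothing more than the source application sending an HTTP POST with a JSON payload to a callback URL the receiver has registered. Here is a minimal sketch in Python, assuming a hypothetical callback URL and event payload; these are illustrative, not the actual Salesforce or Talend Cloud formats.

```python
import requests

# Hypothetical callback URL; in a real setup this would be the webhook URL
# generated by Talend Cloud and registered with the source application.
WEBHOOK_URL = "https://example.cloud.talend.com/webhooks/abc123"

# Hypothetical event payload; each SaaS application defines its own schema.
event = {
    "object": "Account",
    "action": "updated",
    "recordId": "0015g00000XXXXX",
}

# "Firing a webhook" is just this POST: the source application pushes the
# event to the callback URL, and the receiver reacts in real time, for
# example by starting the flow tied to the webhook.
response = requests.post(WEBHOOK_URL, json=event, timeout=10)
print(response.status_code)
```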

Executing Jobs with the RESTful API

Another way to execute your data integration jobs is via the REST API in Talend Cloud. Representational State Transfer (REST or RESTful) is an increasingly popular, developer-friendly architectural style for building APIs. Imagine you have an external dependency, such as a process that must complete before the job can run. You can use an enterprise scheduler tool to call the REST API in Talend Cloud and start the job when the time is right.

We open Talend Cloud and navigate to the job we created in Part 2, Moving Data from Salesforce to Snowflake. The job we built in Talend Open Studio has been running in the Talend Cloud environment on the remote engine, taking data from Salesforce and loading it into Snowflake. In Talend Cloud, any flow that has run successfully can also be executed through the REST API, and that is what we will show you now.


Talend Cloud uses Swagger UI, a very helpful open-source interface that makes it easy to see and interact with the endpoints in the REST API. Now we will show you how to run and test the Salesforce-to-Snowflake job through the API via the Swagger UI, using a simple REST call that passes the job’s ID. We scroll down to “POST /executions” to start the test.


We return to the Swagger UI and paste the ID into the “body” field, where we build our request as a JSON body. We then click “Try It” to kick off the REST call that starts our job.
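The body itself is tiny. Below is a sketch of what might be pasted into the body field, assuming the API expects the flow’s ID in a field named executable; the field name and ID value are illustrative, so copy the exact body shown on your own Swagger page.

```json
{
  "executable": "5d8f0a1b2c3d4e5f6a7b8c9d"
}
```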

swagger

Going back to the Talend Cloud screen, we click “Refresh” in the “RUN HISTORY” section to see all the data integration flows currently running or starting to execute on the remote engine. We keep refreshing until the flow finishes loading the data into Snowflake. Once it does, we go into our Snowflake dashboard, hit “Execute All” at the bottom, and see that the data we prepared in Part 3 of this series is now in Snowflake.

Testing the REST API with Postman

Next, we will run an external test on our REST API using the Postman UI. Postman is a powerful API testing tool that is available as a free download. We take the same JSON body containing our flow’s ID and paste it into Postman’s body field. At the top, we paste the HTTP URL that we copied from the Swagger tool.

In the Postman UI, we have also provided the “Headers” information needed to authenticate against the REST API, and the “Authorization” tab holds the basic authentication credentials. Everything looks ready to go, so we can now test the REST API we first exercised in Swagger from the external Postman tool. We click “SEND” to start the flow.
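Any HTTP client can make the same call Postman does. Here is a minimal sketch in Python using the requests library, assuming a hypothetical endpoint URL, basic-auth credentials, and flow ID; substitute the URL you copied from Swagger and your own values.

```python
import requests

# All values below are hypothetical; use the URL copied from the Swagger UI
# and your own Talend Cloud credentials and flow ID.
EXECUTIONS_URL = "https://api.us.cloud.talend.com/tmc/v1.0/executions"
FLOW_ID = "5d8f0a1b2c3d4e5f6a7b8c9d"

response = requests.post(
    EXECUTIONS_URL,
    json={"executable": FLOW_ID},           # the same JSON body we pasted into Postman
    auth=("user@example.com", "password"),  # basic authentication, as in Postman's Authorization tab
)
response.raise_for_status()
print(response.json())  # the response typically includes an ID you can use to track the run
```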


Next, we go back into Talend Cloud, hit “Refresh” on the “RUN HISTORY” section, and see that our flow is up and running and will keep loading more data from Salesforce to Snowflake for us. We have now used both Swagger and Postman to test the job and confirm that it runs successfully in Talend Cloud.
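If you trigger runs from a script rather than a browser, you can poll the execution status through the API instead of clicking Refresh. The sketch below assumes a hypothetical GET endpoint keyed by execution ID and a status field; check your own Swagger page for the exact path and the status values it returns.

```python
import time

import requests

# Hypothetical status endpoint and credentials.
EXECUTION_URL = "https://api.us.cloud.talend.com/tmc/v1.0/executions/abc-123"
AUTH = ("user@example.com", "password")

# Poll until the execution reaches a terminal state; the status values here
# are illustrative placeholders.
while True:
    status = requests.get(EXECUTION_URL, auth=AUTH).json().get("status", "")
    print("current status:", status)
    if status in ("execution_successful", "execution_failed"):
        break
    time.sleep(10)  # wait a bit, like clicking Refresh in Run History
```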

Building the Custom Execution Plan

Now we are going to build out the execution plan and the scheduling in Talend Cloud. An execution plan is a way of running multiple flows or jobs together: in sequence, in parallel, or a mix of both. The execution plan is very handy for orchestrating complex loading processes. For example, you can have things run sequentially and then kick off several flows all at the same time.

We start by going into Talend Cloud and clicking on “EXECUTION PLAN” at the top of the screen, then clicking “ADD EXECUTION PLAN” and giving it a name. We add “On Failure” and “Expected Results” and see all the “FLOWS/JOBS” available that have been run successfully. To run steps sequentially, we click on “Add Step”.

The first step we have is “Clean Table”, which runs a flow called “Truncate Snowflake” that truncates our Snowflake table. Next, we add a job that we’ve already built called “Compliance Analytics Snowflake”. If it fails, we have a flow called “Email Failure” that tells us which flow failed, identified by its execution ID. We click on “Save and Go” to run the plan on demand.


On the next screen we see an indication that our Snowflake table is being truncated. We then go into Snowflake, refresh the screen, and see that all the data is gone, proving that our truncate is working. Going back into the execution plan in Talend Cloud, we click refresh and see that it is starting to run the compliance process. After a moment, our two steps have completed and our data has reloaded. We have just done a simple truncate of our Snowflake data and reloaded it with fresh data from Salesforce.

Next, we want to do a little more work on the execution plan in Talend Cloud. Just like with flows, we can schedule an execution plan to run daily, weekly, or monthly, and we can pick which days we want it to run. We select Tuesday at 4:00 AM in the America/Chicago time zone in the U.S. and then click “Save and Go” to apply the schedule. We are taken back to the execution overview page and see the plan we just scheduled, with an alarm clock next to it indicating it’s ready to run.

Recap of Job Testing and Scheduling

With our Salesforce to Snowflake job up and running, we jumped right into the public REST API that is available in Talend Cloud. Using the REST API, we can use external tools and schedulers to trigger flows and run the data integration process. We used the Swagger UI to show you how to find the syntax you need for the APIs, and Postman to show how an external tool can call and execute flows as well. Finally, we went into the execution plan, which is used for scheduling within Talend Cloud, to demonstrate how to tie multiple flows together and orchestrate them to run sequentially or in parallel.

Next in the Talend Cloud How-to Series we will combine separate data preparation and data quality processes into a job in a Marketo campaign. 

Last Updated: August 13th, 2019