How to Go Serverless with Talend & AWS Lambda
Recently I found myself in a predicament that many of you can relate to: trying to update an aging application that had become too difficult to manage and too costly to keep operating. As we started to talk about what to do, we concluded it was time to start decomposing that application into smaller, more manageable pieces.
We spent the rest of the afternoon whiteboarding what we were going to do. That whole time I kept imagining a car engine that had been completely stripped, with all the parts lying on the table; somehow, we needed to get it to run that way, and to do so without costing more than it did when it was put together. After reviewing what we had, it became clear that, to support the goals we had set, v2 was moving to the cloud, and for the several parts that could be run on demand, v2 was going serverless.
For the rest of this post I'll walk you through how to set up a Function as a Service running on AWS Lambda and built using Talend. This example will demonstrate how you could take input from a Kinesis stream and send the results of your job to another Kinesis stream.
You’ll need a couple of things before we start in order to work along with me:
Talend Studio – Download your open source Talend studio here.
Eclipse – In the example below I used Eclipse Neon version 3 with the AWS Toolkit installed. You can download the toolkit here: (http://docs.aws.amazon.com/toolkit-for-eclipse/v1/user-guide/getting-started.html)
Now that you have everything installed, let's create our Talend job.
Step-by-Step: Creating the Talend Job
Below, I have created a simple job that reads a value from a context variable at run time, converts it to uppercase, and stores the result in a buffer. Here are the Talend components I used: tFixedFlowInput, tMap, and tBufferOutput.
Start by creating the context variable:
Next let's take a look at our tFixedFlowInput settings:
Now that we have our settings in for tFixedFlowInput and our context variables, it's time to set up our tMap component:
Finally, let's finish up by setting up the tBufferOutput component. Its schema should be filled in when you click on "Sync Columns":
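If it helps to think of the job in code, its data flow can be approximated in a few lines of plain Java. This is only an illustrative sketch and not the code Talend generates: the class name `UppercaseJobSketch`, the method `run`, and the sample value are hypothetical stand-ins for the context variable, the tMap expression, and the buffer that tBufferOutput fills.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in for the Talend job: context value -> uppercase -> buffer.
class UppercaseJobSketch {

    // Mirrors the three components in order:
    // tFixedFlowInput emits one row from the context variable,
    // tMap uppercases it, and tBufferOutput collects the result.
    static List<String[]> run(String contextValue) {
        List<String[]> buffer = new ArrayList<>();   // plays the role of tBufferOutput
        String inputRow = contextValue;              // tFixedFlowInput: one row, one column
        String mapped = inputRow == null
                ? null
                : inputRow.toUpperCase(Locale.ROOT); // tMap: uppercase expression
        buffer.add(new String[] { mapped });
        return buffer;
    }

    public static void main(String[] args) {
        // Print every row that ended up in the buffer.
        for (String[] row : run("hello from talend")) {
            System.out.println(row[0]);
        }
    }
}
```

The buffer is modeled as a list of string arrays because each row can hold several columns, even though this job only uses one.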
The final output of our Talend job looks like this:
Save and close your job. Simple, right? After creating the job in Talend, you can now export it as a standalone job as shown below. Right-click your job in the Repository and click on the “Build Job” option.
Next, choose your folder and click on Finish.
Creating the Lambda Function in AWS
Now that we have our Talend job built and component settings in proper order, it’s time to work with AWS to create the Lambda functionality of the job. First, open your Eclipse with the AWS Toolkit installed. Then, create your new AWS Lambda project. Here's how:
Click File -> New -> Project. When the New Project wizard opens, select AWS Lambda Java Project. The New AWS Lambda Java Project wizard then opens; change the settings as per the image below and click on Finish.
It might take some time, so wait while Eclipse creates the skeleton of your project. Okay, at this point you are almost 50% done. Pat yourself on the back, we are almost there. Once the project is created, the next window opens your class file.
If you encounter an error near the @Override annotation, simply right-click on your project in the Project Explorer -> choose Properties -> Java Compiler -> set the compliance settings as shown below and click OK. This should resolve your issue.
Preparing the Lambda Code
Now let’s prepare our Lambda code. Take the standalone Talend job you built earlier and unzip it. You will see a folder called “lib” and a folder with the same name as your job.
Go to Eclipse, right-click on the project name in the Project Explorer -> select Properties -> go to Java Build Path -> choose Add External JARs. Now go to the folder “lib”, select all the JAR files, and click OK. Follow the same procedure to add the JAR file under the folder “test_job” as shown. Press Apply before you click on OK to finish up.
Now let’s download the other external JARs for our Lambda function. Let's take a look at the list below:
AWS – SDK :
Add all of the required JAR files listed above to the build path in the same way as before. Now let’s begin writing the Lambda function. Let's look at the code below:
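As a rough idea of what such a handler does, here is a minimal sketch of its core logic using only the JDK, so it can be run without the AWS or Talend JARs. Several things here are assumptions rather than the actual project code: the class name `TalendLambdaSketch`, the context variable name `message`, and the `jobRunner` function, which stands in for the generated job class's `runJob(String[])` method. In the real project this logic would sit inside `handleRequest` of a class implementing `RequestHandler`, reading each payload from the incoming Kinesis event and putting each result on the output stream.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.function.Function;

// JDK-only sketch of the handler logic; all names here are hypothetical.
class TalendLambdaSketch {

    // Assumed context variable name; use whatever you defined in your job.
    static final String CONTEXT_VAR = "message";

    // Kinesis record payloads reach Java code as ByteBuffers.
    // Decoding a read-only view leaves the caller's buffer position untouched.
    static String decode(ByteBuffer data) {
        return StandardCharsets.UTF_8.decode(data.asReadOnlyBuffer()).toString();
    }

    // Talend jobs take context overrides as "--context_param name=value" arguments.
    static String[] buildJobArgs(String value) {
        return new String[] { "--context_param", CONTEXT_VAR + "=" + value };
    }

    // jobRunner stands in for the generated job's runJob(String[]) method,
    // which hands back the rows collected by tBufferOutput.
    static List<ByteBuffer> handle(List<ByteBuffer> records,
                                   Function<String[], String[][]> jobRunner) {
        List<ByteBuffer> output = new ArrayList<>();
        for (ByteBuffer record : records) {
            String[][] rows = jobRunner.apply(buildJobArgs(decode(record)));
            for (String[] row : rows) {
                // Each encoded row is what would be sent to the output stream.
                byte[] bytes = String.join(",", row).getBytes(StandardCharsets.UTF_8);
                output.add(ByteBuffer.wrap(bytes));
            }
        }
        return output;
    }

    public static void main(String[] args) {
        // Fake job runner that uppercases the value, imitating the Talend job.
        Function<String[], String[][]> fakeJob = jobArgs -> {
            String value = jobArgs[1].substring(jobArgs[1].indexOf('=') + 1);
            return new String[][] { { value.toUpperCase(Locale.ROOT) } };
        };
        List<ByteBuffer> input = new ArrayList<>();
        input.add(ByteBuffer.wrap("hello lambda".getBytes(StandardCharsets.UTF_8)));
        for (ByteBuffer result : handle(input, fakeJob)) {
            System.out.println(decode(result));
        }
    }
}
```

Passing the job in as a function keeps the sketch runnable on its own; in your project you would replace `fakeJob` with a call into the job class from the JAR you added to the build path.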
Upload your code to AWS Lambda:
Right-click on your code -> select AWS Lambda -> Upload function to AWS Lambda
Configure AWS Lambda
- Choose the AWS console region (remember that Lambda, the S3 bucket, and Kinesis should all be in the same region)
- Provide the name of the function and click OK
- Choose the IAM role access (full access for Lambda is recommended)
- Select your S3 bucket where you want to store your lambda function zip file
- Click on finish.
Test run your job:
Right-click on your code -> select AWS Lambda -> Run function on Lambda
Output will be:
Alright, your code got executed in Lambda! As you can see, with a few Talend components and some simple code, we can set up a Talend job to run serverless on AWS Lambda. Did you find this helpful? Let me know in the comments below, or ask there if you have questions.