Best Practices for Using Context Variables with Talend – Part 2

  • Richard Hall
    With more than 10 years of data integration consulting experience (much of it spent implementing Talend), Richard really knows his stuff. He’s provided solutions for companies on 6 of the 7 continents and has consulted across many different market verticals. Richard is a keen advocate of open source software, which is one of the reasons he first joined Talend in 2012. He is also a firm believer in engaging developers in “cool ways”, which is why he looks for opportunities to demonstrate Talend’s uses with technologies found around the home. Hooking his Sonos sound system to Twitter via Talend, getting Google Home to call Talend web services, and controlling his TV with Talend calling universal plug and play services are a handful of examples. Prior to 2019, Richard had been running his own business providing Talend solutions. During that time he became a prominent contributor on Talend Community, providing examples both of how to solve business problems and of how to do some of the cool stuff mentioned above. In 2019 he was invited to return to Talend as the Technical Community Manager.

First off, a big thank you to all those who have read the first part of this blog series! If you haven’t read it, I invite you to read it now before continuing, as Part 2 will build upon it and dive a bit deeper. Ready to get started? Let’s kick things off by discussing the implicit context load.

The Implicit Context Load

The Implicit Context Load is one of those pieces of functionality that can very easily be ignored but is incredibly valuable.

Simply put, the implicit context load is just a way of linking your jobs to a hardcoded file path or database connection to retrieve your context variables. That’s great, but you still have to hardcode your file path or connection settings, so how is it of any use if we want a truly environment-agnostic configuration?

Well, what is not shouted about as much as it probably should be is that the Implicit Context Load configuration variables do not have to be hardcoded: they can also be populated by Talend Routine methods. This opens up a whole new world of environment-agnostic functionality and makes Contexts completely redundant for configuring Context variables per environment.
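
To make that a little more concrete, here is a minimal sketch of the kind of routine method that could feed one of those configuration fields. Everything named in it is an assumption made for illustration: the class `ContextLocator`, the environment variable `TALEND_CONTEXT_FILE` and the fallback file name are hypothetical and are not supplied by Talend.

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical routine. In a Talend project a class like this would live in the "routines" package.
class ContextLocator {

    // Assumed name of an environment variable set on each server (not a Talend built-in).
    static final String ENV_VAR = "TALEND_CONTEXT_FILE";

    // Assumed fallback used when the variable is missing or blank.
    static final String DEFAULT_PATH = "context.properties";

    // Pure lookup against a supplied map, so the logic can be checked without touching the real environment.
    static String resolve(Map<String, String> env) {
        String path = env.get(ENV_VAR);
        return (path == null || path.trim().isEmpty()) ? DEFAULT_PATH : path.trim();
    }

    // The method a job would call instead of holding a hardcoded path.
    public static String getContextFilePath() {
        return resolve(System.getenv());
    }

    public static void main(String[] args) {
        // Depends on the machine this runs on, which is exactly the point.
        System.out.println(getContextFilePath());
        // The same lookup with an explicit, made-up environment.
        System.out.println(resolve(Collections.singletonMap(ENV_VAR, "/etc/talend/dev.properties")));
    }
}
```

With something along these lines in place, the file field of the Implicit Context Load would hold a call such as `ContextLocator.getContextFilePath()` rather than a quoted literal path, and each server decides for itself where its context file lives.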

The Talend documentation covers the Implicit Context Load. You will notice that it doesn’t say (at the moment... maybe an amendment is due :)) that each of the Implicit Context Load configuration fields can be populated by Talend routine methods instead of being hardcoded.

JASYPT

Before I go any further it makes sense to jump onto a slight tangent and mention JASYPT. JASYPT is a Java library which allows developers to add basic encryption capabilities to their projects with minimum effort, and without needing deep knowledge of how cryptography works. JASYPT is supplied with Talend, so there is no need to hunt around and download all sorts of Jars to use it. All you need to be able to do is write a little Java to obfuscate your values and prevent others from being able to read them in clear text.

Now, you won’t necessarily want all of your values to be obfuscated. This might actually be a bit of a pain. However, JASYPT makes this easy as well. It comes with built-in functionality that can ingest a file of parameters and decrypt only the values which are surrounded by ….

ENC(………)

This means a file with values such as below (example SQL server connection settings)…..

TalendContextAdditionalParams=instance=TALEND_DEV
TalendContextDbName=context_db
TalendContextEnvironment=DEV
TalendContextHost=MyDBHost
TalendContextPassword=ENC(4mW0zXPwFQJu/S6zJw7MIJtHPnZCMAZB)
TalendContextPort=1433
TalendContextUser=TalendUser

…..will only have the “TalendContextPassword” variable decrypted, the rest will be left as they are.
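
As a rough illustration of that selective behaviour, the sketch below loads a properties file and passes only the ENC(...) values to a decryptor, leaving everything else untouched. It uses nothing but the JDK, and the class `EncAwareProperties` and the stand-in decryptor are hypothetical names of my own. JASYPT ships its own class for this job (`org.jasypt.properties.EncryptableProperties`), so treat this as a picture of the idea rather than as JASYPT's implementation.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;
import java.util.function.UnaryOperator;

// Hypothetical helper that mimics the "decrypt only ENC(...) values" idea.
class EncAwareProperties {

    private static final String PREFIX = "ENC(";
    private static final String SUFFIX = ")";

    // Loads properties from text and applies the decryptor only to wrapped values.
    static Properties load(String text, UnaryOperator<String> decryptor) {
        Properties raw = new Properties();
        try {
            raw.load(new StringReader(text));
        } catch (IOException e) {
            throw new IllegalStateException("Could not read properties", e);
        }
        Properties resolved = new Properties();
        for (String name : raw.stringPropertyNames()) {
            String value = raw.getProperty(name);
            String trimmed = value.trim();
            if (trimmed.startsWith(PREFIX) && trimmed.endsWith(SUFFIX)) {
                // Strip the ENC( ) wrapper and hand the inner text to the decryptor.
                String inner = trimmed.substring(PREFIX.length(), trimmed.length() - SUFFIX.length());
                resolved.setProperty(name, decryptor.apply(inner));
            } else {
                // Plain values pass through unchanged.
                resolved.setProperty(name, value);
            }
        }
        return resolved;
    }

    public static void main(String[] args) {
        String file = "TalendContextHost=MyDBHost\n"
                + "TalendContextPassword=ENC(4mW0zXPwFQJu/S6zJw7MIJtHPnZCMAZB)\n";
        // Stand-in decryptor: it only marks the value so the effect is visible.
        Properties p = load(file, cipherText -> "decrypted[" + cipherText + "]");
        System.out.println(p.getProperty("TalendContextHost"));
        System.out.println(p.getProperty("TalendContextPassword"));
    }
}
```

In a real job the stand-in lambda would be replaced by a call to a properly configured encryptor's decrypt method.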

This piece of functionality is really useful in a lot of ways and often gets overlooked by people looking to hide values which need to be made easily available to Talend Jobs. I will demonstrate precisely how to make use of this functionality later, but first I’ll show you how simple using JASYPT is if you simply want to encrypt and decrypt a String.

Simple Encrypt/Decrypt Talend Job

The example I will give you in Part 3 of this blog series (I have to have something to keep you coming back) uses slightly more involved code than what follows. Below is an example job showing how simple it is to use the JASYPT functionality. This job could be used for manually encrypting whatever values you wish. Its layout is shown below….

 

Two components. A tLibraryLoad to load the JASYPT Jar and a tJava to carry out the encryption/decryption.

The tLibraryLoad is configured as below. Your included version of JASYPT may differ from the one I have used. Use whichever comes with your Talend version.

The tJava needs to import the relevant class we are using from the JASYPT Jar. This import is shown below…..

The actual code is….

import org.jasypt.encryption.pbe.StandardPBEStringEncryptor;

Now to make use of the StandardPBEStringEncryptor I used the following configuration….

The actual code (so you can copy it) is shown below….

// Configure the encryptor
StandardPBEStringEncryptor encryptor = new StandardPBEStringEncryptor();
encryptor.setAlgorithm("PBEWithMD5AndDES");
encryptor.setPassword("BOB");

// Set the String to encrypt and print it
String stringToEncrypt = "Hello World";
System.out.println(stringToEncrypt);

// Encrypt the String, store the result as the cipher String, then print it
String cipher = encryptor.encrypt(stringToEncrypt);
System.out.println(cipher);

// Decrypt the String just encrypted and print it out
System.out.println(encryptor.decrypt(cipher));

In the above it is all hardcoded. I am encrypting the String “Hello World” using the password “BOB” and the algorithm “PBEWithMD5AndDES”. When I run the job, I get the following output….

Starting job TestEcryption at 07:47 19/03/2018.

[statistics] connecting to socket on port 3711
[statistics] connected
Hello World
73bH30rffMwflGM800S2UO/fieHNMVdB
Hello World
[statistics] disconnected
Job TestEcryption ended at 07:47 19/03/2018. [exit code=0]
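
If you run a job like this more than once, you will most likely see a different cipher string each time, yet every one of them decrypts back to “Hello World”. The sketch below, written with only the JDK's javax.crypto classes, shows one way that can work: a fresh random salt is generated for each encryption and stored in front of the encrypted bytes. The class name `PbeRoundTripSketch`, the 8-byte salt, the 1000 iterations and the salt-first layout are my assumptions for illustration, and I am not claiming its output is interchangeable with what JASYPT produces.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import java.util.Base64;
import javax.crypto.Cipher;
import javax.crypto.SecretKey;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;
import javax.crypto.spec.PBEParameterSpec;

// Hypothetical JDK-only sketch of salted password-based encryption.
class PbeRoundTripSketch {

    private static final String ALGORITHM = "PBEWithMD5AndDES"; // same algorithm name as the job above
    private static final int SALT_BYTES = 8;                    // assumed salt size
    private static final int ITERATIONS = 1000;                 // assumed iteration count

    // Encrypts with a fresh random salt and returns Base64(salt + encrypted bytes).
    static String encrypt(String plain, char[] password) {
        try {
            byte[] salt = new byte[SALT_BYTES];
            new SecureRandom().nextBytes(salt);
            Cipher cipher = Cipher.getInstance(ALGORITHM);
            cipher.init(Cipher.ENCRYPT_MODE, key(password), new PBEParameterSpec(salt, ITERATIONS));
            byte[] body = cipher.doFinal(plain.getBytes(StandardCharsets.UTF_8));
            byte[] out = new byte[salt.length + body.length];
            System.arraycopy(salt, 0, out, 0, salt.length);
            System.arraycopy(body, 0, out, salt.length, body.length);
            return Base64.getEncoder().encodeToString(out);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("Encryption failed", e);
        }
    }

    // Reads the salt back from the front of the message, then decrypts the remainder.
    static String decrypt(String encoded, char[] password) {
        try {
            byte[] in = Base64.getDecoder().decode(encoded);
            byte[] salt = Arrays.copyOfRange(in, 0, SALT_BYTES);
            byte[] body = Arrays.copyOfRange(in, SALT_BYTES, in.length);
            Cipher cipher = Cipher.getInstance(ALGORITHM);
            cipher.init(Cipher.DECRYPT_MODE, key(password), new PBEParameterSpec(salt, ITERATIONS));
            return new String(cipher.doFinal(body), StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("Decryption failed", e);
        }
    }

    // Derives the secret key object from the password.
    private static SecretKey key(char[] password) throws GeneralSecurityException {
        return SecretKeyFactory.getInstance(ALGORITHM).generateSecret(new PBEKeySpec(password));
    }

    public static void main(String[] args) {
        char[] password = "BOB".toCharArray();
        String first = encrypt("Hello World", password);
        String second = encrypt("Hello World", password);
        System.out.println(first);   // differs from run to run
        System.out.println(second);  // differs from the line above
        System.out.println(decrypt(first, password));
        System.out.println(decrypt(second, password));
    }
}
```

Because the salt travels with the message, the decrypting side only needs the password and the algorithm name, which is why a stored value can be decrypted later by a separate job.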

These snippets of information are useful, but how do you knit them together to provide an environment agnostic Context framework to base your jobs on? I'll dive into that in Part 3 of my best practices blog. Until next week!

<<Continue Reading Part 3>>

2 Comments

  1. MDreamer says:

    Thanks for this blog guide. Can you please also elaborate on how to use it with Talend TMC? It looks like it is relevant only to the TAC.

    Also:
    1. How can I use Context Variables in Metadata -> Db connection?
    2. How can I use them in Stats catchers in project settings?

    Thanks

    • Richard Hall says:

      This blog has 2 further parts to it. Part 3 shows how to hook everything we’ve already seen together. Part 4 talks about doing this with the TMC.