Best Practices for Using Context Variables with Talend – Part 4


  • Richard Hall
    With more than 10 years of data integration consulting experience (much of it spent implementing Talend), Richard really knows his stuff. He’s provided solutions for companies on 6 of the 7 continents and has consulted across many different market verticals. Richard is a keen advocate of open source software, which is one of the reasons he first joined Talend in 2012. He is also a firm believer in engaging developers in “cool ways”, which is why he looks for opportunities to demonstrate Talend’s uses with technologies found around the home. Hooking his Sonos sound system to Twitter via Talend, getting Google Home to call Talend web services, and controlling his TV with Talend calling universal plug and play services are a handful of examples. Prior to 2019, Richard had been running his own business providing Talend solutions. During that time he became a prominent contributor on Talend Community, providing examples both of how to solve business problems and of how to do some of the cool stuff mentioned above. In 2019 he was invited to return to Talend as the Technical Community Manager.

Last night it occurred to me that everything in the last three parts of this blog series (Part 1 / Part 2 / Part 3) had been oriented towards the Talend on-premise solution. Many developers I have worked with over the years are moving to Talend Cloud, and I have moved much of my own code to the cloud too. In fact, to build a lot of the collateral for this blog, I had to go back to my old on-premise environment. I then got to thinking, “Would these best practices work in Talend Cloud?”

What's a Remote Engine?

Before I explain whether they will and, if so, how, I should point out that one of the benefits of Talend Cloud is that there is significantly less server admin work to carry out. If you run everything in the cloud, then all you need to worry about is building your jobs and configuring them to run.

I know that when I was a full-time developer, I always thought too much of my time was taken up dealing with server administration. This not only frustrated me, but it swallowed up time that could have been better spent doing what I was good at. Having the freedom to just develop is a massive benefit, but having everything in a cloud that is managed by another entity can sometimes limit flexibility... or so you might think. With everything running in the cloud, getting access to set operating system environment variables or create local properties files on the servers is likely to be a challenge, if it is possible at all. Talend recognized this and has overcome it with the Remote Engine.

The Remote Engine is what bridges the gap between an entirely cloud-hosted solution and an on-premise solution. You can still have total control of your Remote Engine’s config while also allowing Talend to handle the Management Console. You can also have your data processed where you need it processed (none of the data goes from the Remote Engine to anywhere you don’t expressly send it), which means that your current on-premise code can likely be easily migrated without implications caused by moving it away from other on-premise tools you may have.

The reason I have focused on the Remote Engine is that it is this that allows us to do pretty much exactly what has been described in the previous blog posts, using Talend Cloud. When I tried it out, there were a few subtle changes that I had to make, but I will go through these as I explain how I got it working.

However, first I feel I owe an apology to non-Windows users.

Environment Variables on Systems other than Windows

I started my investigations into achieving this with Talend Cloud by setting up on a Mac. I am a new convert to the world of Macs, and I haven’t yet worked through all of the situations on my Mac that I have experienced on Windows. Environment variables were an area I thought might be interesting. As it turned out, it went from interesting to downright silly. I spent a couple of hours trying to figure out why my variables, which worked in all of my terminals, would not be picked up by my Talend Studio.

It turns out that .profile, .bashrc, .bash_profile, etc. are all useless when you want a GUI app to pick up your variables. You need to use a plist file. I won’t go into detail about this here; I’ll just point you to this useful link. This process solved the issue on my Mac. Once I had set up my environment variables this way, Talend Studio was able to see them and I could use this functionality as I could on Windows.

However, there is a part 2 to this apology. I must also apologise to those of you who may have tried to configure an on-premise Talend Runtime on Linux. I’m kind of hoping that this is a very small number, but I suspect that someone may have (or will in the future, and will be pulling their hair out). The Talend Runtime is an Apache Karaf based OSGi container that runs as a system service on Linux. As such, any environment variables set in .profile, .bashrc, .bash_profile, etc. will be ignored by anything that runs inside it.

The Remote Engine is based upon Apache Karaf as well. However, we can get around this VERY easily. When you install the Talend Runtime or the Remote Engine as a service on Linux, you will make use of a wrapper.conf file. For the Talend Runtime it will be called something like Talend-ESB-Container-wrapper.conf and for the Remote Engine, it will be called something like Talend-Remote-Engine-wrapper.conf. The file will be located in the installation’s /etc folder. All you need to do is to stop the service from running and add a couple of lines to the beginning of the wrapper.conf file.

Look in the file for lines like these:

set.default.JAVA_HOME=${java.home}

set.default.KARAF_HOME=${karaf.home}

set.default.KARAF_BASE=${karaf.base}

set.default.KARAF_DATA=${karaf.data}

Then add the following, with the values you require for your variables:

set.default.FILEPATH=/home/Richard/Documents/env.txt

set.default.ENCRYPTIONKEY=12345678

These variables will be picked up in exactly the same way as system environment variables, by anything running inside the Talend Runtime or Remote Engine.
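A quick way to confirm that the variables really are visible is to ask the JVM for them. The following is a minimal sketch rather than anything Talend ships: the class name `EnvCheckSketch` and its `check` method are hypothetical, and it assumes the two variable names used above. As far as I know, the `set.default.` prefix in wrapper.conf only supplies a value when the variable is not already defined in the service's environment, so a check like this can also reveal an unexpected override. The same `System.getenv` call could be placed in a job component that runs Java code.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: reports which expected environment variables are visible to the JVM. */
public class EnvCheckSketch {

    /** Maps each name to true if the given environment holds a non-empty value for it. */
    public static Map<String, Boolean> check(Map<String, String> env, List<String> names) {
        Map<String, Boolean> result = new LinkedHashMap<>();
        for (String name : names) {
            String value = env.get(name);
            result.put(name, value != null && !value.isEmpty());
        }
        return result;
    }

    public static void main(String[] args) {
        // System.getenv() returns the environment of the current process.
        Map<String, Boolean> status =
                check(System.getenv(), Arrays.asList("FILEPATH", "ENCRYPTIONKEY"));
        // Print only whether each variable is set, so the key itself never reaches a log.
        status.forEach((name, set) ->
                System.out.println(name + (set ? " is set" : " is NOT set")));
    }
}
```

The check deliberately avoids printing the values, since one of them is an encryption key.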


So, how is this done using the Remote Engine?

Once we have all of the possible environment variable issues resolved, it is extremely easy to get this working by using the Remote Engine. First, we need to install a Remote Engine. If you haven’t done this, there are instructions which can be followed here.

Once the Remote Engine is installed and the updates to the wrapper.conf (described above) are implemented, we can configure our first Task. I’ll assume that you have created a job for this (following the instructions in Part 3 of this blog) and have uploaded it to the artifact repository. If so, you can follow the steps below to see this working in the Remote Engine.

1) Go to the Management Console and click on the “Operations” link in the left sidebar. Then click on the “View Tasks & Plans” button.

2) Click on the “Add” button and select “Task”.

3) Select the “Workspace”, “Artifact type” and “Artifact”. The job being set here is a test job that has been configured to use the Implicit Context Load.

4) Leave all of the context variables blank, because these will be set via the Implicit Context Load.

5) Select the “Runtime” and “Run type”. We are selecting the Remote Engine here. This is important. The “Run type” can be left as “Manual”, or you can set it to be scheduled if you want.

6) Once we click “Go Live”, the job will start (if we left the “Run type” as “Manual”). The next screen will show the job running on your Remote Engine.

7) If everything has been configured correctly, the next screen will show a success status.

Using the method described in this blog series, you can easily control your context variable usage across all of your environments, so long as you can add environment variables to your servers. If the Implicit Context Load settings are configured for your project, you needn’t ever think about which context is used. When you build a new job, it will automatically be set to use the Implicit Context Load, which will be controlled by the settings on the machines you use to run your jobs.
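To make the idea concrete, here is a rough sketch of what "the machine decides the context" amounts to: the job finds a file through the FILEPATH environment variable and reads key/value pairs from it. Talend's Implicit Context Load does this for you, so this is purely illustrative. The class name `ContextFileSketch`, the `key=value` line format and the `#` comment convention are all assumptions, not the format Talend requires.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of loading context values from a file named by FILEPATH. */
public class ContextFileSketch {

    /** Parses "key=value" lines; blank lines, "#" comments and lines without a key are skipped. */
    public static Map<String, String> parse(List<String> lines) {
        Map<String, String> context = new LinkedHashMap<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                continue;
            }
            int sep = trimmed.indexOf('='); // split on the first '=' only
            if (sep <= 0) {
                continue; // no separator, or an empty key
            }
            context.put(trimmed.substring(0, sep).trim(), trimmed.substring(sep + 1).trim());
        }
        return context;
    }

    public static void main(String[] args) throws IOException {
        // FILEPATH is the environment variable set on each machine (see wrapper.conf above).
        String filePath = System.getenv("FILEPATH");
        if (filePath == null) {
            System.err.println("FILEPATH is not set");
            return;
        }
        Map<String, String> context = parse(Files.readAllLines(Paths.get(filePath)));
        // Print the keys only; the values may include credentials.
        System.out.println("Loaded context keys: " + context.keySet());
    }
}
```

Because the job only ever refers to FILEPATH, moving it between environments means changing the file on each machine, not the job.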

I hope the series has been useful and that you learned a few new tricks. Until next time!
