An Introduction to Continuous Integration and Workflows with Talend and Jenkins – Part 2

  • Rekha Sree
    Rekha Sree is a Customer Success Architect, using her expertise in Data Integration, Data Warehouse and Big Data to help drive customer success at Talend. Prior to joining Talend, Rekha worked at Target Corporation India Pvt Ltd for more than a decade using her vast knowledge in building their enterprise and analytical data warehouse.


In the first part of this blog series, we looked at the different types of continuous workflows and introduced the concept of Continuous Integration. In this blog (part 2), I will review the daily routine of a developer across the Software Development Life Cycle (SDLC) phases with a centralized workflow model. I'll also show you how Talend helps an organization adopt continuous practices.

A Day in the Life of an SDLC Developer

As the industry moves towards agility, speed is becoming essential. A systematic daily routine helps a developer and the project achieve more in less time. Although the time to develop the code remains the same, code integration, testing and deployment can become considerably faster. Let’s go through the recommended steps for the smooth day-to-day activities of a Talend developer.

Development & Tests:

The steps in the development phase are:

Step 1: In Talend Studio, the developer pulls a local copy of the Master branch from the centralized Git Repository

Step 2: He or she then performs all the development/coding/job modifications in the local copy

Step 3: The changes are unit tested

Step 4: Changes are committed to the local copy (Note that the changes are still in the local repository and are not yet committed to the Master branch)

Step 5: When the developer is ready to commit the changes to the Master branch, he or she performs a "Pull and Merge"

Step 6: If no conflicts appear, then the code is committed to the Master branch

Step 7: If there are conflicts, the developer resolves them manually. If required, steps 2 to 5 are repeated before committing to the Master branch

QA Tests:

Once the developer commits the changes to Master branch, the Quality Assurance (QA) phase starts. Here's what that would look like, step-by-step.

Step 1: Beyond unit testing, perform functional testing of the newly modified job and of all the other jobs that are part of the requirement/module/project

Step 2: Perform the non-functional testing for the whole requirement/module/project.

Go Live:

If the job passes all the tests performed by the QA team, it is built and deployed to a centralized repository.

Step 1: Once the code is in a centralized repository, it could be moved into other environments like Test, Stage or Production.

Step 2: Ensure the code is stable so that it can serve as the base for developing the next requirement

Step 3: If the job fails in the QA phase, it is sent back to the developer for correction.

Automating with Talend CI Builder

You may notice that QA teams usually perform the same set of tests each time. Because most of these steps are repetitive, it makes sense to automate them. Talend CI Builder helps automate this whole process through Jenkins configuration. With Talend CI Builder, or when opting for Continuous Testing, the traditional QA steps would look something like this:

Step 1: The development and test phases are completed by the developer in Talend Studio.

Step 2: QA tests are automated. The build can be scheduled, and you can choose where the job is built and tested, using modules from both Talend and third parties.

Step 3: If the job passes the QA tests, the job build is automated. If not, an email notification is sent to the developer about the failed QA test, allowing them to correct the defects in the job.

Step 4: Once the job is built, it is automatically deployed to the central repository, from where the job can be released to different environments.
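The automated flow above amounts to a gate: the tests run first, and the build and deployment happen only if they pass. Here is a minimal Python sketch of that control flow. The callables (run_qa_tests, build_job, deploy, notify_developer) are hypothetical placeholders and not Talend CI Builder or Jenkins APIs; in a real setup, Jenkins orchestrates these stages.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class PipelineResult:
    """Records which stages ran and whether the pipeline succeeded."""
    stages_run: List[str] = field(default_factory=list)
    succeeded: bool = False


def run_pipeline(
    run_qa_tests: Callable[[], bool],
    build_job: Callable[[], None],
    deploy: Callable[[], None],
    notify_developer: Callable[[str], None],
) -> PipelineResult:
    """Run the QA tests, then build and deploy only if the tests pass."""
    result = PipelineResult()

    result.stages_run.append("qa_tests")
    if not run_qa_tests():
        # Failed tests stop the pipeline; the developer is told why.
        notify_developer("QA tests failed; the job was not built")
        result.stages_run.append("notify")
        return result

    build_job()
    result.stages_run.append("build")

    deploy()
    result.stages_run.append("deploy")

    result.succeeded = True
    return result
```

Passing the stages in as callables keeps the gate logic separate from whatever actually runs the tests or sends the email, so the same control flow can be exercised with stand-in functions.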

Continuous Testing with Talend: Step-by-Step

Now let’s look at the detailed process of automating testing with Talend. Continuous testing starts within the development process, where developers can use Talend Studio to test the functionality of their code. Tools such as GitHub or other repositories can be used to store and version the test cases together with the code. The same test cases can then be reused for integration or QA tests.

A test case has a set of test data, preconditions, expected results and postconditions, developed for a particular test scenario. Talend Studio comes with a test framework that allows you to create test cases, keeping your application ready and deployable at any point in time. It also enables developers to create test cases for different parts of the integration job. Test cases can be created by right-clicking the component you want to test and selecting ‘Create Test Case’ from the menu.

Talend enables developers to add many instances of a test case, which means that you can run as many test cases as you need with different input and reference/comparison files. These test cases can then be automated using Jenkins.

Once you create a test case, a default test skeleton is created, depending on the component selected. In my example, I am testing the tUniqRow component, so the default skeleton is built around that component.

Note that the skeleton generated depends on the component(s) selected in the job to create the test. Here, the test case aims at:

  • Reading input data files using tFileInputDelimited components,
  • Transforming data with an immutable set of INPUT and OUTPUT components based on the initial Job,
  • Writing the output data using a tFileOutputDelimited component,
  • Comparing the temporary output file (created by a tCreateTemporaryFile component) to a reference file you need to define, using a tFileCompare component,
  • Generating the test execution status, OK if it succeeds or Fail if it fails, using a tAssert component.
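To make the shape of such a test concrete, here is a rough Python analogue of the skeleton described above. It is only a sketch of the read, transform, write, compare and assert sequence, not what Talend generates (Talend produces Java). The function names, the ";" delimiter and the "first row per key wins" rule standing in for tUniqRow are assumptions made for illustration.

```python
import csv
import filecmp
import os
import tempfile


def unique_rows(rows, key_index=0):
    """Keep the first row seen for each key value (a stand-in for tUniqRow)."""
    seen = set()
    for row in rows:
        key = row[key_index]
        if key not in seen:
            seen.add(key)
            yield row


def run_test_case(input_path, reference_path, delimiter=";"):
    """Return "OK" if the transformed output matches the reference file, else "Fail"."""
    # Read the input data (the tFileInputDelimited role).
    with open(input_path, newline="") as src:
        rows = list(csv.reader(src, delimiter=delimiter))

    # Create a temporary output file (the tCreateTemporaryFile role).
    fd, output_path = tempfile.mkstemp(suffix=".csv")
    os.close(fd)
    try:
        # Write the transformed rows (the tFileOutputDelimited role).
        with open(output_path, "w", newline="") as dst:
            writer = csv.writer(dst, delimiter=delimiter, lineterminator="\n")
            writer.writerows(unique_rows(rows))

        # Compare byte for byte with the reference file (the tFileCompare role)
        # and report the status (the tAssert role).
        identical = filecmp.cmp(output_path, reference_path, shallow=False)
        return "OK" if identical else "Fail"
    finally:
        os.remove(output_path)
```

Because the comparison is byte for byte, the reference file must use the same delimiter and line endings as the generated output.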

A test case executes successfully only when the output file generated by the test and the reference file provided by the developer are identical. The panel displays the test case execution results, such as status, percentage of success and duration of the test case execution. It also shows the execution history.

If all the test cases are successful, the code is promoted to the next/higher environment(s). Continuous Deployment automatically deploys this code to Nexus using tools such as Jenkins or Bamboo.

To set up the automated environment, the following tools are needed:

  • A Jenkins server configured with JDK, Maven and Git plugins
  • A dedicated Talend command line (a separate one apart from the one already dedicated to TAC)
  • The Talend CI Builder plugin installed in local Maven repository
  • Access to Git and Nexus. Jenkins pulls the code from Git and stores the binaries in Nexus.

Let’s look at the automated steps of the CI process with Talend Jobs.

Step 1: All Jenkins jobs should be configured to trigger the CI process. Ideally, the process starts when code is committed from Talend Studio to the master branch of the Git repository (or to any other branch specified in the job configuration). Jenkins allows you to specify various conditions to automatically trigger the jobs. The code generated by Talend consists of XML files containing the items and the job properties.

Step 2: Once the Jenkins workflow is triggered, it checks out the source code (.xml files, custom Java code or routines) from the Git repository. The Jenkins jobs should be configured accordingly to check out the source code.

Step 3: Once the sources are checked out to the local workspace, the CommandLine service generates the Java code from the XML files.
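As an illustration of a branch-based trigger condition, the sketch below decides whether a push should start the CI job. This is not how Jenkins implements its triggers; the function name should_trigger and the shell-style branch patterns are assumptions chosen to show the idea.

```python
from fnmatch import fnmatchcase


def should_trigger(pushed_ref, watched_branches):
    """Return True if a push to pushed_ref should start the CI job."""
    prefix = "refs/heads/"
    if not pushed_ref.startswith(prefix):
        # Tags and other refs do not trigger the job in this sketch.
        return False
    branch = pushed_ref[len(prefix):]
    # Each watched entry is a shell-style pattern, e.g. "master" or "release/*".
    return any(fnmatchcase(branch, pattern) for pattern in watched_branches)
```

For example, with watched_branches set to ["master"], a push to refs/heads/master starts the job while a push to a feature branch does not.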

Step 4: The Java source code is then compiled as directed by a Maven POM file. A Project Object Model or POM is the fundamental unit of work in Maven. It is an XML file that contains information about the project and configuration details used by Maven to build the project. It also contains default values for most projects. Some of the configuration that can be specified in the POM is the project dependencies, the plugins or goals that can be executed, the build profiles, and so on. Other information such as the project version, description, developers, mailing lists and such can also be specified.
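To show the kind of information a POM carries, the sketch below embeds a minimal POM and reads its project coordinates and dependencies with Python's standard XML parser. The coordinates (org.example.talend, sample_job, 0.1.0) are hypothetical and are not what Talend generates; only the POM namespace and element names come from Maven itself.

```python
import xml.etree.ElementTree as ET

# A minimal POM with hypothetical coordinates, for illustration only.
POM_XML = """<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <groupId>org.example.talend</groupId>
  <artifactId>sample_job</artifactId>
  <version>0.1.0</version>
  <dependencies>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>4.13.2</version>
      <scope>test</scope>
    </dependency>
  </dependencies>
</project>
"""

# POM elements live in this XML namespace, so lookups must be qualified.
NS = {"m": "http://maven.apache.org/POM/4.0.0"}


def read_pom(pom_text):
    """Return (project coordinates, list of dependency coordinates) from a POM."""
    root = ET.fromstring(pom_text)
    fields = ("groupId", "artifactId", "version")
    coordinates = {name: root.findtext(f"m:{name}", namespaces=NS) for name in fields}
    dependencies = [
        tuple(dep.findtext(f"m:{name}", namespaces=NS) for name in fields)
        for dep in root.findall("m:dependencies/m:dependency", NS)
    ]
    return coordinates, dependencies
```

Calling read_pom(POM_XML) returns the project's groupId, artifactId and version, plus one tuple per declared dependency. Maven uses the same elements to resolve libraries and name the built artifact.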

Step 5: Once the code is compiled and binaries are created, this step will run any unit tests created in Talend (we ran through this test case creation earlier in the blog).

Step 6: If all the test cases pass, Jenkins creates a package and publishes the jobs. Packaging creates a zip file consisting of scripts, contexts, JVM parameters, and Java libraries. This zip archive is then published to an artifact repository. If the test cases do not pass, the Jenkins job aborts and the code is sent back to the developer for issue resolution.

Step 7: Once the jobs are published to Nexus, the binaries can be retrieved from Nexus, either manually or via the MetaServlet, then deployed to the Talend JobServer and scheduled to run.

Well, I hope this blog gives some clarity on how to use continuous integration and testing with Talend CI Builder. For details on the installation, configuration and Jenkins jobs, please refer to my knowledge article, Continuous Integration with Talend CI Builder, in the Talend Help Center.
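The packaging idea can be sketched in a few lines of Python: collect everything under a build directory into one zip archive, keeping paths relative so that the archive unpacks cleanly elsewhere. The function name package_job and the directory layout are assumptions for illustration; in the real process Maven and Talend CI Builder produce the archive. The archive path should sit outside the build directory so that the archive does not include itself.

```python
import os
import zipfile


def package_job(build_dir, archive_path):
    """Zip every file under build_dir, storing paths relative to build_dir."""
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for folder, _subdirs, files in os.walk(build_dir):
            for name in sorted(files):
                full_path = os.path.join(folder, name)
                # Relative names keep the archive independent of the build machine.
                relative = os.path.relpath(full_path, build_dir)
                archive.write(full_path, arcname=relative)
    return archive_path
```

A build directory holding a launch script, a contexts folder and a lib folder would end up as one archive with the same internal layout, ready to be published to an artifact repository.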

