Talend Data Integration Advanced

Talend Data Integration provides an extensible, highly scalable set of tools for accessing, transforming, and integrating data from any business system. This course enables you to use the more advanced features of Talend Data Integration as quickly as possible. Participants work in teams on projects shared on a remote repository and learn to monitor Jobs and database changes.

Duration: 1 day (7 hours)
Target audience: Anyone who wants to use Talend Data Integration to perform data integration and management tasks, including software developers and development managers
Prerequisites: Completion of Talend Data Integration Basics and knowledge of computing, including familiarity with Java or another programming language, SQL, and general database concepts

Course objectives

After completing this course, you will be able to:

  • Start Talend Studio and connect it to a remote repository
  • Use SVN branches in Studio
  • Run a Job in Studio or on a remote JobServer
  • Monitor host CPU and JVM memory in real time during Job execution
  • Use debugging features in Studio
  • Configure a Talend project to capture statistics and logs, and monitor them from Activity Monitoring Console (AMC)
  • Implement several methods of parallel execution in a Talend Job
  • Create Joblets
  • Create a unit test from a working Job
  • Configure a database to monitor and log changes in a separate change data capture (CDC) database
  • Use the CDC database to perform incremental updates between the source and target
  • Set up a reference project to use items from another project

Course agenda

Connecting to a remote repository

  • Creating a remote connection

SVN in Studio

  • Copying a Job to a branch
  • Comparing Jobs
  • Resetting a branch

Reference project

  • Setting up and using a reference project

Remote Job execution

  • Creating and running a Job remotely

Resource usage and basic debugging

  • Using Memory Run to view real-time resource usage
  • Debugging Jobs using Debug Run

Activity Monitoring Console (AMC)

  • Configuring statistics and logging
  • Using Activity Monitoring Console (AMC)

Parallel execution

  • Writing large files
  • Writing to databases
  • Automatic parallelization
  • Partitioning

Joblets

  • Creating a Joblet from an existing Job
  • Creating a Joblet from scratch
  • Triggering Joblets

Unit test

  • Creating a unit test

Change data capture

  • Examining databases
  • Configuring the CDC database
  • Monitoring changes
  • Updating a warehouse
  • Resetting the database