
Talend Data Integration Advanced

Talend Data Integration provides an extensible, highly scalable set of tools to access, transform, and integrate data from any business system. This course enables you to use the more advanced features of Talend Data Integration as quickly as possible. Participants work in teams on projects shared through a remote repository, and learn to monitor Jobs and track database changes.

Duration: 1 day (7 hours)
Target audience: Software developers, development managers, and anyone else who wants to use Talend Data Integration to perform data integration and management tasks
Prerequisites: Completion of Talend Data Integration Basics, plus general computing knowledge, including familiarity with Java or another programming language, SQL, and general database concepts
Course objectives
After completing this course, you will be able to:
  • Start and connect Talend Studio to a remote repository
  • Use SVN branches in Studio
  • Run a Job in Studio on a remote Job server
  • Monitor host CPU and JVM memory in real time during Job execution
  • Use debugging features in Studio
  • Configure a Talend project to capture statistics and logs, and monitor them from Activity Monitoring Console (AMC)
  • Implement several methods of parallel execution in a Talend Job
  • Create Joblets
  • Create a unit test from a working Job
  • Configure a database so that its changes are monitored and logged in a separate change data capture (CDC) database
  • Use the CDC database to perform incremental updates between the source and target
Course agenda

Connecting to a remote repository

  • Creating a remote connection

SVN in Studio

  • Copying a Job to a branch
  • Comparing Jobs
  • Resetting a branch

Remote Job execution

  • Creating and running a Job remotely

Resource usage and basic debugging

  • Using Memory Run to view real-time resource usage
  • Debugging Jobs using Debug Run

Activity Monitoring Console (AMC)

  • Configuring statistics and logging
  • Using Activity Monitoring Console (AMC)

Parallel execution

  • Writing large files
  • Writing to databases
  • Automating parallelization
  • Partitioning

Joblets

  • Creating a Joblet from an existing Job
  • Creating a Joblet from scratch
  • Triggering Joblets

Unit test

  • Creating a unit test

Change data capture

  • Examining databases
  • Configuring the CDC database
  • Monitoring changes
  • Updating a warehouse
  • Resetting the database