Talend Data Integration Basics

Talend Studio for Data Integration dramatically improves the efficiency of data integration Job design through an easy-to-use graphical development environment. With integrated connectors to source and target systems, it enables rapid deployment and reduces maintenance costs. It supports all types of data integration, migration, and synchronization.

This course helps you start using Talend Studio for Data Integration as quickly as possible. It focuses on the basic capabilities of Studio and how you can use them to build reliable, maintainable data integration tasks that solve practical problems, including extracting data from common database and file formats, transforming it, and integrating it into targets.

This course serves as a prerequisite for many other Talend courses, and the skills learned apply to most Talend products.

Duration: 2 days (14 hours)
Target audience: Anyone who wants to use Talend Studio to perform data integration tasks: software developers and development managers
Prerequisites: Basic knowledge of computing, including familiarity with Java or another programming language, SQL, and general database concepts

Course objectives

After completing this course, you will be able to:

  • Create a project
  • Create and run a Job that reads, converts, and writes data
  • Merge data from several sources within a Job
  • Save a schema for repeated use
  • Create and use metadata and context variables within Jobs
  • Connect to, read from, and write to a database from a Job
  • Access a web service from a Job
  • Work with master Jobs and subJobs
  • Build, export, and test-run Jobs outside Studio
  • Implement basic error-handling techniques
  • Use best practices for Job and component naming, hints, and documentation

Course agenda

Getting started with Talend Data Integration

  • Starting Talend Studio
  • Creating your first Job
  • Running a Job
  • Using the component help
  • Designing a Job using best practices
  • Documenting a Job

Working with files

  • Working with delimited files
  • Working with hierarchical files

Working with databases

  • Creating tables in MySQL databases
  • Reading data from MySQL database tables
  • Applying best practices

Using repository metadata

  • Using delimited file metadata
  • Using XML file metadata
  • Using database metadata
  • Using generic schemas
  • Updating metadata

Processing data

  • Mapping data using tMap
  • Joining data using tMap
  • Capturing join rejects
  • Filtering data and capturing filtering rejects
  • Using other data processing components

Using contexts and context variables

  • Creating a built-in context variable
  • Connecting to databases using context variables
  • Creating a context group in the repository
  • Loading context variables from a flow

Building executables and Docker images from data integration Jobs

  • Building a stand-alone Job
  • Building a new version of the Job
  • Building a Docker image

Controlling execution

  • Managing files
  • Processing files
  • Managing Job execution using a master Job

Handling errors

  • Detecting and handling basic errors
  • Raising a warning

Working with web services

  • Accessing a SOAP web service

Use case: Creating a master sales table from different data sources

  • Setting up a customer table
  • First challenge
  • Setting up a sales table
  • Joining data
  • Performing calculations
  • Second challenge
  • Creating a master Job