Talend Cloud Data Preparation

Talend Data Preparation is a self-service application that enables information workers to prepare data for analysis and other data-driven tasks. This course is designed to help you get started immediately with Talend Cloud Data Preparation, and it also covers the management of data flows in Talend Integration Cloud.

You learn how to create datasets and preparations to deliver cleansed, structured, and enriched data to business users. You also learn how to build Data Preparation Jobs in Talend Studio, publish them to the cloud, and schedule them in Talend Integration Cloud.

Duration: 1 day (7 hours)
Target audience: Data owners, DI developers, and administrators who want to deliver ready-to-use data to business users and administer data integration flows
Prerequisites: Completion of Introduction to Talend Studio or Talend Data Integration Basics, as well as a fundamental understanding of administrative tasks

Course objectives

After completing this course, you will be able to:

  • Use Talend Management Console (TMC) to create Data Preparation users
  • Create and share datasets and preparations
  • Handle large data volumes in Data Preparation
  • Execute a user-defined data preparation in a Talend Job
  • Design live and batch data flows and publish them as datasets to authorized users
  • Use Talend Integration Cloud (TIC) to create a remote engine and schedule data integration flows

Course agenda

Data Preparation in context

  • Concepts and purpose

Setup

  • Creating users

Creating a data preparation

  • Creating a data preparation and related dataset
  • Adding a join to a data preparation
  • Promoting the preparation

Working with large data volumes

  • Creating a dataset from a database
  • Using selective sampling
  • Exporting preparations

Using Data Integration (DI) for Data Preparation

  • Publishing a dataset to Data Preparation
  • Executing a preparation in Talend Studio

Implementing a live dataset

  • Implementing Live Dataset mode in Studio
  • Publishing a Job to Integration Cloud
  • Creating a remote engine
  • Creating a dataset from a Talend Job