Talend Data Preparation with Big Data
Talend Data Preparation is a self-service application that enables information workers to prepare data for analysis and other data-driven tasks. This course is designed to help you immediately access your data lake using Talend Data Preparation, and to combine preparation and integration tools to correct Big Data files stored in a Hadoop Distributed File System (HDFS).
You learn how to create datasets from data stored on HDFS and export the cleaned data back to the cluster. You build on your knowledge of Data Preparation by cleaning up Big Data files. You also learn how to use Talend Studio to execute preparations on the Hadoop cluster using the Spark framework.
|Duration|Half day (4 hours)|
|Target audience|Anyone who wants to use Talend Data Preparation to clean up and structure Big Data files|
|Prerequisites|Completion of Talend Data Integration Basics, Talend Data Preparation for Implementers, and Talend Big Data Basics|
After completing this course, you will be able to:
Talend Data Preparation in a Big Data context
Processing data on HDFS
Running a preparation in a Big Data batch Job
Running a preparation in a Big Data streaming Job