Full Resource Library

Data Lake vs Data Warehouse

Data lakes and data warehouses are both widely used for storing big data, but the terms are not interchangeable. A data lake is a vast pool of raw data whose purpose is not yet defined. A data warehouse is a repository for structured, filtered data that has already been processed for a specific purpose.

View Now

What is Data Preparation?

Data preparation is the process of cleaning and transforming raw data prior to processing and analysis. It is a time-consuming process, but the business intelligence benefits demand it. Today, self-service data preparation tools are making it easier and more efficient than ever.
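
As a rough illustration of what "cleaning and transforming" can mean in practice, the sketch below trims whitespace, normalises case, converts a text amount to a number, and drops incomplete or duplicate rows. The field names (name, country, revenue) and the sample rows are hypothetical and are not taken from any Talend tool.

```python
from typing import Optional

# Hypothetical raw records, as they might arrive from an export.
RAW_ROWS = [
    {"name": "  Alice ", "country": "us", "revenue": "1,200.50"},
    {"name": "Bob", "country": "US", "revenue": ""},
    {"name": "  Alice ", "country": "us", "revenue": "1,200.50"},  # duplicate
    {"name": "", "country": "fr", "revenue": "300"},  # missing name
]


def parse_revenue(text: str) -> Optional[float]:
    """Convert a string such as '1,200.50' to a float; return None if empty."""
    cleaned = text.replace(",", "").strip()
    return float(cleaned) if cleaned else None


def prepare(rows):
    """Trim whitespace, normalise case, convert types, drop bad and duplicate rows."""
    seen = set()
    prepared = []
    for row in rows:
        name = row["name"].strip()
        if not name:
            continue  # drop rows that have no name
        record = (name, row["country"].strip().upper(), parse_revenue(row["revenue"]))
        if record in seen:
            continue  # drop exact duplicates after cleaning
        seen.add(record)
        prepared.append({"name": record[0], "country": record[1], "revenue": record[2]})
    return prepared


if __name__ == "__main__":
    for row in prepare(RAW_ROWS):
        print(row)
```

Real preparation pipelines usually add validation rules and logging of rejected rows; this sketch only shows the basic shape of the work.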

View Now

Creating Cluster Connection Metadata from Configuration Files

In this tutorial, create Hadoop Cluster metadata by importing the configuration from the Hadoop configuration files.
This tutorial uses Talend Data Fabric Studio version 6 and a Hadoop cluster: Cloudera CDH version 5.4.
1. Create a new Hadoop cluster metadata definition
Ensure that the Integration perspective is selected.
In the Project Repository, expand Metadata, right-click Hadoop Cluster, and click Create Hadoop Cluster to open the wizard.
In the Name field of the Hadoop Cluster Connection wizard, type MyHadoopCluster_files. In the Purpose field, type Cluster connection metadata. In the Description field, type Metadata to connect to a Cloudera CDH 5.4 cluster. Then click Next.
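
For readers curious about what the Hadoop configuration files contain, the sketch below parses a file in the standard *-site.xml layout (a configuration element holding property entries with name and value children) into a dictionary. It is only an illustration of the file format and not how the Studio wizard is implemented; the sample host name, port, and the function name read_hadoop_properties are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical core-site.xml content; the host name and port are placeholders.
SAMPLE_CORE_SITE = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenode.example.com:8020</value>
  </property>
  <property>
    <name>hadoop.security.authentication</name>
    <value>simple</value>
  </property>
</configuration>
"""


def read_hadoop_properties(xml_text: str) -> dict:
    """Return a {name: value} mapping for every <property> in the XML text."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value", default="")
        if name:
            props[name.strip()] = value.strip()
    return props


if __name__ == "__main__":
    for key, value in read_hadoop_properties(SAMPLE_CORE_SITE).items():
        print(f"{key} = {value}")
```

Values such as fs.defaultFS are the kind of connection detail that cluster metadata needs, which is why importing them from the files saves typing them by hand.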

Watch Now

Running a Job on YARN

In this tutorial, create a Big Data batch Job that runs on YARN, reads data from HDFS, sorts it, and displays it in the Console.

Watch Now

Running a Job on Spark

Learn how to create a Big Data batch Job that uses the Spark framework to read data from HDFS, sort it, and display it in the Console.

Watch Now

Become a Data Leader with Talend and Snowflake

This session explores how Snowflake and Talend help their customers become data leaders in their industries. Their cloud-native approaches to cloud data warehouses and data lakes improve agility, reduce costs, and increase value, all while maintaining compliance.

Watch Now

Machine Learning Webinar Series

A Machine Learning webinar series covering three topics: principles and benefits of Machine Learning, democratising Machine Learning capabilities, and operationalising Machine Learning for business advantage.

Watch Now

