[Step-by-Step] Data Cleansing & Discovery with Talend Data Preparation Cloud

  • Mark Balkenende
    Mark Balkenende is a Sales Solution Architects Manager at Talend. Prior to joining Talend, Mark had a long career of mastering and integrating data at a number of companies, including Motorola, Abbott Labs and Walgreens. Mark holds an Information Systems Management degree and is also an extreme cycling enthusiast.


Whether it’s CRM data used for a marketing campaign or sales figures compiled for quarterly financial reporting, every company knows that data quality is the cornerstone of accurate and efficient work. Today, only 3% of companies’ data meets basic quality standards. But there is hope, as a simple solution exists to prepare data and improve data quality: Talend Data Preparation.

The following video shows how users can instantly correct, browse, visualize and share their data with Talend Data Preparation.

This data can come from any data source: an SQL table, a Salesforce.com module, S3, HDFS, or a local file. Once the data is uploaded to the Talend Data Preparation tool, users can check that it was profiled correctly by data type and edit invalid or empty values.

It is then possible to filter the data down to specific subsets, such as California-based leads or accounts owned by John Doe, and to apply functions to those subsets only. These functions range from detecting an invalid email address to reformatting a date. Users can then create “recipes”, which record the full sequence of step-by-step actions or functions used to adjust the data and manipulate the datasets so that they are displayed in a more insightful way. Data discovery is within reach of everyone!
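To make the idea of filtering a subset and applying functions only to it more concrete, here is a minimal sketch in plain Python. It is not Talend's API or implementation: the sample rows, the field names, and the helpers `is_valid_email`, `reformat_date` and `prepare` are all hypothetical, and the email check is a deliberately simple pattern rather than full address validation.

```python
import re
from datetime import datetime

# Hypothetical sample rows; the field names are assumptions for illustration.
leads = [
    {"name": "Ann Lee", "state": "CA", "email": "ann.lee@example.com", "created": "03/15/2016"},
    {"name": "Bob Ray", "state": "CA", "email": "bob.ray(at)example", "created": "04/01/2016"},
    {"name": "Cy Dunn", "state": "NY", "email": "cy.dunn@example.com", "created": "05/20/2016"},
]

# Simple shape check (something@something.tld), not full RFC 5322 validation.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def is_valid_email(value):
    """Return True when the value looks like an email address."""
    return bool(EMAIL_PATTERN.match(value))


def reformat_date(value):
    """Convert a MM/DD/YYYY date string to ISO 8601 (YYYY-MM-DD)."""
    return datetime.strptime(value, "%m/%d/%Y").strftime("%Y-%m-%d")


def prepare(rows, row_filter):
    """Apply the functions only to rows matching the filter; copy every row."""
    prepared = []
    for row in rows:
        new_row = dict(row)  # work on a copy so the input rows stay untouched
        if row_filter(row):
            new_row["email_valid"] = is_valid_email(row["email"])
            new_row["created"] = reformat_date(row["created"])
        prepared.append(new_row)
    return prepared


def california_only(row):
    """Filter for the 'California-based leads' subset."""
    return row["state"] == "CA"


result = prepare(leads, california_only)
```

In this sketch the New York row passes through unchanged, while the two California rows gain an `email_valid` flag and an ISO-formatted date.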

All recipes can be modified: users can deactivate or delete functions and change the preparation as they see fit. Once their dataset is ready to be shared, they can save and export the recipe. The data preparation can be shared with authorized colleagues. Preparations and recipes do not affect the original dataset, which remains unchanged. This means that the same dataset can be used by different users with different preparations!
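One way to picture a recipe whose steps can be deactivated without touching the source data is as an ordered list of functions applied to a copy of the dataset. The sketch below is an illustration under that assumption, not a description of how Talend Data Preparation works internally; `Step`, `apply_recipe`, `trim_name` and `uppercase_state` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Step:
    """One recipe step: a named row-level function that can be switched off."""
    name: str
    action: Callable[[dict], dict]
    enabled: bool = True


def trim_name(row):
    """Return a new row with surrounding whitespace removed from the name."""
    return {**row, "name": row["name"].strip()}


def uppercase_state(row):
    """Return a new row with the state code in upper case."""
    return {**row, "state": row["state"].upper()}


def apply_recipe(rows, steps):
    """Run every enabled step, in order, on a copy of the rows."""
    prepared = [dict(row) for row in rows]  # the original rows are never modified
    for step in steps:
        if step.enabled:
            prepared = [step.action(row) for row in prepared]
    return prepared


# Hypothetical dataset shared by two users.
dataset = [
    {"name": "  Ann Lee ", "state": "ca"},
    {"name": "Bob Ray", "state": "Ny"},
]

recipe = [Step("trim name", trim_name), Step("uppercase state", uppercase_state)]

full = apply_recipe(dataset, recipe)     # both steps applied

recipe[1].enabled = False                # deactivate a step instead of deleting it
partial = apply_recipe(dataset, recipe)  # only the trim step applied
```

Because each run starts from a copy, `dataset` is identical before and after, so two users could apply different recipes to the same source.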

Users can accelerate their data discovery process, even if data quality standards are not enforced at the enterprise level.
