Talend Data Quality Essentials

Talend Studio for Data Quality enables business users and data management teams to assess the quality of data in any data source. This product also lets you verify data completeness, accuracy, and integrity in preparation for data migration, instance consolidation, and data integration.

This course is designed to help you start using Talend Studio for Data Quality right away. You will learn how to evaluate data quality against a set of metrics and thresholds based on indicators, models, and rules for each data item to be analyzed or monitored.

Duration: 2 days (14 hours)
Target audience: Anyone who wants to use Talend Studio for Data Quality to assess data quality
Prerequisites: Completion of Introduction to Talend Studio or Talend Data Integration Basics, as well as familiarity with SQL

Course objectives

After completing this course, you will be able to:

  • Connect to a database and run an analysis on it
  • Examine the contents of a connection to a data source
  • Create, configure, and run a column analysis
  • Generate regular expressions for pattern matching in an analysis to test data quality
  • Define indicator thresholds so that violations are flagged in analysis results
  • Create, configure, and run different types of table analysis
  • Define a SQL business rule and set up an analysis to identify rows that conflict with your rule
  • Create, configure, and run a table match analysis to search for duplicates
  • Use advanced matching to enhance identification of duplicates
  • Ensure data privacy by shuffling and masking customer data
  • Display analysis reports in PDF files and on the Data Quality Portal

Course agenda

Data quality in context

  • Concepts
  • Analysis summary

Data quality analysis

  • Creating a database connection
  • Performing structural analyses
  • Performing a basic column analysis
  • Adding regular expressions
  • Defining indicator thresholds
  • Applying advanced statistics
  • Generating Jobs from an analysis
  • Using a column set analysis
  • Using a business rule analysis
  • Using redundancy analysis

Advanced matching

  • Getting ready for match analysis
  • Reviewing the match analysis process
  • Performing a match analysis
  • Configuring additional settings for the table match analysis
  • Using a matching integration Job
  • Deduplicating addresses

Data cleansing

  • Cleansing email addresses
  • Standardizing country codes

Data privacy

  • Shuffling data for privacy
  • Masking data for privacy
  • Masking data based on a pattern