Talend Data Quality Basics

Talend Studio for Data Quality enables business users and data management teams to assess the quality of data in any data source. This product also lets you verify data completeness, accuracy, and integrity in preparation for data migration, instance consolidation, and data integration.

This course is designed to help you start using Talend Studio for Data Quality right away. It teaches you how to evaluate the quality of data in an information system against a set of metrics and thresholds, using indicators, models, and rules defined for each data item to be analyzed or monitored.

Duration: 2 days (14 hours)
Target audience: Anyone who wants to use Talend Studio for Data Quality to assess data quality
Prerequisites: Completion of Talend Data Integration Basics, familiarity with SQL
Course objectives
After completing this course, you will be able to:
  • Connect to a database or delimited file data source and run an analysis on it
  • Examine the contents of a connection to a data source
  • Run a data analysis using catalog and schema analysis tools
  • Create, configure, and run every type of data quality analysis offered in the Studio on several sample data sets, and interpret the results. This includes profiling data with the following categories of analysis: structural, column, table, cross-table, and correlation
  • Generate regular expressions for pattern matching within an analysis to test data quality
  • Define indicator thresholds so that violations are flagged in the analysis results
  • Create and apply a set of business rules to separate compliant data from noncompliant data
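The last three objectives (pattern matching, indicator thresholds, and business rules) can be illustrated outside the Studio with a minimal Python sketch. This is not how Talend Studio implements these features. The sample records, the email pattern, the 90% threshold, and the helper names `pattern_match_rate`, `violates_threshold`, and `split_by_rule` are all assumptions made for illustration.

```python
import re

# Hypothetical sample records; not taken from the course data sets.
rows = [
    {"id": 1, "email": "ana@example.com", "age": 34},
    {"id": 2, "email": "bob(at)example.com", "age": 27},
    {"id": 3, "email": "", "age": 151},
    {"id": 4, "email": "li@example.org", "age": 45},
]

# Deliberately simple email pattern (an assumption, not a Talend built-in).
EMAIL_PATTERN = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def pattern_match_rate(rows, column, pattern):
    """Return the fraction of rows whose column value fully matches the pattern."""
    if not rows:
        return 0.0
    matches = sum(1 for row in rows if pattern.fullmatch(row[column]))
    return matches / len(rows)


def violates_threshold(value, lower=None, upper=None):
    """Return True when the indicator value falls outside the allowed range."""
    if lower is not None and value < lower:
        return True
    if upper is not None and value > upper:
        return True
    return False


def split_by_rule(rows, rule):
    """Separate rows that satisfy a business rule from rows that do not."""
    compliant = [row for row in rows if rule(row)]
    noncompliant = [row for row in rows if not rule(row)]
    return compliant, noncompliant


# Indicator: share of well-formed email values in the column.
rate = pattern_match_rate(rows, "email", EMAIL_PATTERN)

# Threshold: flag the indicator when fewer than 90% of values match.
flagged = violates_threshold(rate, lower=0.9)

# Business rule: age must lie between 0 and 120 inclusive.
valid, invalid = split_by_rule(rows, lambda row: 0 <= row["age"] <= 120)

print(f"match rate={rate:.2f}, flagged={flagged}, noncompliant ids={[r['id'] for r in invalid]}")
```

In the Studio, these steps are configured through the analysis editor rather than written as code. The sketch only shows the underlying idea: compute an indicator, compare it with a threshold, and partition rows by a rule.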
Course agenda
Connections
  • Creating database and delimited file connections
Structural analysis
  • Using connection overview analysis
  • Using catalog overview analysis
Column analysis
  • Performing a basic column analysis
  • Adding regular expressions
  • Defining indicator thresholds
  • Running additional basic column analyses
  • Running and reconfiguring predefined column analyses
Semantic discovery analysis
  • Configuring and using a semantic discovery analysis
Table analysis
  • Using a column set analysis
  • Using a match analysis
  • Using a business rule analysis
  • Using a functional dependency analysis
Cross-table analysis
  • Using redundancy analysis
Correlation analysis
  • Using numerical correlation analysis
  • Using time correlation analysis
  • Using nominal correlation analysis
Tasks
  • Defining and managing tasks in the profiling perspective
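To make two of the agenda topics more concrete, the following Python sketch approximates a functional dependency check (table analysis) and a redundancy check between two tables (cross-table analysis). It is an illustration only and assumes nothing about how the Studio computes these results. The sample data, the column names, and the functions `functional_dependency_strength` and `missing_references` are hypothetical.

```python
from collections import defaultdict


def functional_dependency_strength(rows, determinant, dependent):
    """Share of distinct determinant values that map to exactly one dependent value."""
    seen = defaultdict(set)
    for row in rows:
        seen[row[determinant]].add(row[dependent])
    if not seen:
        return 1.0
    consistent = sum(1 for values in seen.values() if len(values) == 1)
    return consistent / len(seen)


def missing_references(child_rows, child_key, parent_rows, parent_key):
    """Return child key values that have no matching value in the parent table."""
    parent_values = {row[parent_key] for row in parent_rows}
    child_values = {row[child_key] for row in child_rows}
    return sorted(child_values - parent_values)


# Hypothetical address data: the postal code is expected to determine the city.
addresses = [
    {"zip": "75001", "city": "Paris"},
    {"zip": "75001", "city": "Paris"},
    {"zip": "69001", "city": "Lyon"},
    {"zip": "69001", "city": "Lyons"},  # inconsistent spelling breaks the dependency
]

# Hypothetical customer and order tables for the cross-table check.
customers = [{"customer_id": 1}, {"customer_id": 2}, {"customer_id": 3}]
orders = [{"customer_id": 1}, {"customer_id": 2}, {"customer_id": 5}]

strength = functional_dependency_strength(addresses, "zip", "city")
orphans = missing_references(orders, "customer_id", customers, "customer_id")

print(f"zip -> city strength: {strength:.2f}")
print(f"order customer ids missing from customers: {orphans}")
```

A strength below 1.0 indicates that at least one determinant value maps to several dependent values, and a non-empty list of missing references indicates rows in one table that have no counterpart in the other.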