
Talend Data Quality Advanced

This course covers the tools used to isolate, monitor, and correct noncompliant values in a data set. It builds on the concepts of Talend Data Quality Basics to cover advanced data cleansing techniques.

Duration: 1 day (7 hours)
Target audience: Anyone who wants to use Talend Studio for Data Quality to assess data quality
Prerequisites: Completion of Talend Data Integration Basics, familiarity with SQL
Course objectives
After completing this course, you will be able to:
  • Isolate noncompliant data for examination, assessment, and clean-up
  • Remove invalid data from a data set
  • Analyze, standardize, and consolidate data before sending it to a target
  • Clean a data set so it contains only compliant values
  • Use the Data Stewardship Console
  • Use the Data Quality Portal
  • Use dashboards to monitor data quality
  • Create and verify reports
Course agenda

Starting Talend

  • Extracting table schemas

Identifying invalid data

  • Identifying invalid data in the profiling perspective
  • Identifying invalid data in the integration perspective

Parsing data

  • Parsing data in the profiling perspective
  • Parsing data in the integration perspective

Creating a lookup table

  • Creating and using lookup tables

Standardizing data

  • Building an integration Job
  • Consolidating data

Identifying duplicate records

  • Identifying duplicates in the profiling perspective
  • Exporting a match rule
  • Identifying duplicates in the integration perspective

Resolving conflicts

  • Creating conflict resolution tasks
  • Resolving two matching records
  • Updating the database

Reports

  • Configuring the data quality database
  • Creating and running single analysis and multiple analysis reports
  • Using evolution reports to view changes over time

Data Quality Portal

  • Accessing the Data Quality Portal
  • Running reports