Talend Data Quality

Data quality entails more than helping companies get correct data into their information systems; it also means getting rid of bad, corrupted, or duplicate data. Clean data is a key element when integrating information across systems, because misinformation can proliferate quickly, internally of course, but also to business partners. In today's interconnected information systems, poor-quality data spreads the way viruses are carried by travelers: erroneous information in one application quickly reaches others. The cost of compromised data is incalculable: it includes lost sales, wasted productivity, loss of reputation or goodwill, and missed opportunities.

All functionality is completely integrated with Talend Integration Suite, Talend's leading open source enterprise data integration solution, ensuring that data quality is built into the open source integration processes during the design phase.

Want to learn more about open source Talend Data Quality? Then watch an online demo or check out our users' testimonials.

Not sure if you need open source Talend Open Profiler or Talend Data Quality? Check out the features comparison matrix.

Data Profiling

Talend Data Quality: open source Data Profiling

The first step in improving the quality of an enterprise's data is to profile, or evaluate, that data. Sophisticated yet easy to use, the data profiler is an advanced UI-based system that does not require an understanding of database engines or file structures. Through the open source data profiling tool, business analysts or other non-technical personnel can define a set of indicators, patterns and business rules for each data element that needs to be analyzed or monitored. The indicators range from simple or advanced statistics to pattern and Soundex frequencies, as well as text string and numeric analysis, including summary data and statistical distributions of records. The patterns are preset or customized expressions that define the expected form of the data being analyzed, and the open source data quality business rules help define custom business thresholds and value ranges.
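To make the indicators described above more concrete, here is a minimal Python sketch of what a profiler might compute for one column: row and null counts, distinct values, pattern frequencies, and the share of values matching an expected pattern. It is not Talend's implementation or API; the function names `shape` and `profile_column`, the sample ZIP codes, and the `\d{5}` pattern are all assumptions chosen for illustration.

```python
import re
from collections import Counter


def shape(value: str) -> str:
    """Reduce a value to its character pattern: A = uppercase, a = lowercase, 9 = digit."""
    out = []
    for ch in value:
        if ch.isdigit():
            out.append("9")
        elif ch.isalpha():
            out.append("A" if ch.isupper() else "a")
        else:
            out.append(ch)  # keep punctuation and spaces as-is
    return "".join(out)


def profile_column(values, expected_pattern):
    """Compute a few simple indicators for one column (hypothetical helper)."""
    non_null = [v for v in values if v not in (None, "")]
    matcher = re.compile(expected_pattern)
    matches = sum(1 for v in non_null if matcher.fullmatch(v))
    return {
        "row_count": len(values),
        "null_count": len(values) - len(non_null),
        "distinct_count": len(set(non_null)),
        "pattern_frequency": Counter(shape(v) for v in non_null),
        "match_rate": matches / len(non_null) if non_null else 0.0,
    }


# Illustrative sample data: a column expected to hold five-digit codes.
zip_codes = ["75011", "7501A", None, "75011", "92-100", ""]
report = profile_column(zip_codes, r"\d{5}")
```

In this sketch the pattern frequency table shows at a glance which shapes deviate from the expected `99999` form, and `match_rate` is the kind of metric a business rule could compare against a threshold.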

By reviewing the metrics on a regular basis and following their trend, a company can use data profiling to track the improvement or degradation of its data quality over time.

Other functionalities include:

  • History of data profiling analyses
  • Batch analyzing
  • Report stylesheet customization
  • Various report formats including PDF, HTML and XML.

 


Data Cleansing

Talend Data Quality: open source Data Cleansing

Once the problem areas are identified, the data must be corrected. All data goes through a "data quality firewall". Records with missing values, values that are improperly formatted or that do not match other values in the record or in other data sources, duplicates, duplicates with synonyms, even simple typos: all need to be brought into alignment. This is done by cross-checking against other databases and reference data.
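The checks listed above can be sketched in a few lines of Python. This is an illustration only, not how Talend Data Quality works internally: the field names, the simplified e-mail pattern, the `REFERENCE_COUNTRIES` set, and the `check_records` function are hypothetical. Each record is tested for missing values, formatting, agreement with reference data, and duplication after normalization.

```python
import re

# Deliberately simple e-mail shape check (assumption, not a full validator).
EMAIL_RE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")
# Hypothetical reference data to cross-check against.
REFERENCE_COUNTRIES = {"FR", "US", "DE"}


def normalize_key(record):
    """Build a comparison key that ignores case and extra whitespace."""
    name = " ".join((record.get("name") or "").lower().split())
    email = (record.get("email") or "").strip().lower()
    return (name, email)


def check_records(records):
    """Split records into accepted ones and rejected ones with their problems."""
    accepted, rejected, seen = [], [], set()
    for rec in records:
        problems = []
        if not rec.get("name") or not rec.get("email"):
            problems.append("missing value")
        elif not EMAIL_RE.fullmatch(rec["email"].strip()):
            problems.append("bad email format")
        if rec.get("country") not in REFERENCE_COUNTRIES:
            problems.append("unknown country")
        key = normalize_key(rec)
        if not problems and key in seen:
            problems.append("duplicate")
        if problems:
            rejected.append((rec, problems))
        else:
            seen.add(key)
            accepted.append(rec)
    return accepted, rejected


records = [
    {"name": "Ada Lovelace", "email": "ada@example.com", "country": "US"},
    {"name": "ada  lovelace", "email": "ADA@example.com ", "country": "US"},
    {"name": "Bob", "email": "bob-at-example.com", "country": "FR"},
    {"name": "", "email": "eve@example.com", "country": "DE"},
    {"name": "Carl", "email": "carl@example.com", "country": "ZZ"},
]
accepted, rejected = check_records(records)
```

Here only the first record is accepted; the other four are rejected as a duplicate, a badly formatted value, a missing value, and a mismatch with the reference data, in that order. Synonym matching and typo correction would need fuzzier comparison than this exact-key approach.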

 


Data Enrichment

Talend Data Quality: open source Data Enrichment

Open Source Data Enrichment adds value-added information to the data. The variety of this information is limitless: it can include incorporating a company's Dun & Bradstreet information or a consumer's credit score, getting the longitude and latitude of an address to help plan delivery routes, or collecting census data to target demographics or income categories.
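As a rough illustration of the longitude and latitude example above, enrichment can be thought of as a lookup against a reference source keyed on an existing field. The sketch below assumes a small in-memory table, `GEO_REFERENCE`, and a hypothetical `enrich` function; a real deployment would call a geocoding service or reference database instead.

```python
# Hypothetical reference table: city name -> approximate (latitude, longitude).
GEO_REFERENCE = {
    "paris": (48.8566, 2.3522),
    "berlin": (52.5200, 13.4050),
}


def enrich(record, reference=GEO_REFERENCE):
    """Return a copy of the record with latitude/longitude added when known."""
    enriched = dict(record)  # leave the input record untouched
    city = (record.get("city") or "").strip().lower()
    coords = reference.get(city)
    enriched["latitude"], enriched["longitude"] = coords if coords else (None, None)
    return enriched


customer = enrich({"name": "Ada", "city": " Paris "})
unknown = enrich({"name": "Bob", "city": "Oslo"})
```

Records whose city is not in the reference table keep `None` coordinates, so unenriched rows stay easy to find and report on.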

Analytical Portal

Talend Data Quality: Analytical Portal

Data Quality Portal provides customizable web-based data quality monitoring and reporting to help organizations apply tangible data profiling and data quality metrics and support data quality reporting enterprise-wide.

Data Quality Portal provides specialized portlets tailored to different types of users and gives access to many categories of analytical tools: Reporting, TDQ Dashboard, User Dashboard, Analytical Processing (OLAP), and Adhoc Query. It also provides access to a predefined set of reports and global quality gauges.