Data quality entails more than helping companies get correct data into their information systems; it also means getting rid of bad, corrupted, or duplicate data. Clean data is a key element when integrating information across systems, because misinformation can proliferate quickly, both internally and to business partners. With today's interconnected information systems, poor-quality data spreads the way viruses are spread by travelers: erroneous information in one application can quickly reach others. The cost of compromised data is hard to calculate and includes lost sales, wasted productivity, loss of reputation or goodwill, and missed opportunities.
Want to learn more about open source Talend Data Quality? Then watch an online demo or check out our users' testimonials.
Not sure if you need open source Talend Open Profiler or Talend Data Quality? Check out the features comparison matrix.
Data Profiling
The first step in improving the quality of an enterprise's data is to profile, or evaluate, that data. Sophisticated yet easy to use, the data profiler does not require an understanding of database engines or file structures. Business analysts or other non-technical personnel can define a set of indicators, patterns and business rules for each data element that needs to be analyzed or monitored through the open source data profiling tool. The indicators range from simple or advanced statistics to pattern and soundex frequencies, as well as text string and numeric analysis, including summary data and statistical distributions of records. The patterns are preset or customized expressions that define the expected form of the data being analyzed, and the open source data quality business rules define custom business thresholds and value ranges.
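To make the idea of indicators, patterns and thresholds more concrete, here is a minimal Python sketch of column profiling. It is an illustration only and does not show how Talend Data Quality works internally: the function names (`shape`, `profile_column`), the sample ZIP code data, the expected pattern and the 95% threshold are all assumptions chosen for the example.

```python
import re
from collections import Counter


def shape(value: str) -> str:
    """Reduce a value to its pattern: uppercase -> 'A', lowercase -> 'a', digit -> '9'."""
    out = []
    for ch in value:
        if ch.isdigit():
            out.append("9")
        elif ch.isalpha():
            out.append("A" if ch.isupper() else "a")
        else:
            out.append(ch)  # keep spaces and punctuation as they are
    return "".join(out)


def profile_column(values, expected_pattern):
    """Compute a few simple indicators for one column of string values."""
    non_null = [v for v in values if v not in (None, "")]
    expected = re.compile(expected_pattern)
    matching = sum(1 for v in non_null if expected.fullmatch(v))
    return {
        "row_count": len(values),
        "null_count": len(values) - len(non_null),
        "distinct_count": len(set(non_null)),
        "duplicate_count": len(non_null) - len(set(non_null)),
        "pattern_frequency": Counter(shape(v) for v in non_null),
        "match_ratio": matching / len(non_null) if non_null else 0.0,
    }


# Hypothetical sample column: five-digit ZIP codes with a few defects.
zip_codes = ["94105", "10001", None, "9410", "94105", "SW1A 1AA", ""]
report = profile_column(zip_codes, r"\d{5}")

# Hypothetical business rule: at least 95% of non-empty values must match.
THRESHOLD = 0.95
rule_violated = report["match_ratio"] < THRESHOLD
```

In this sketch the pattern frequency table surfaces the two malformed values (`9999` and `AA9A 9AA`) without anyone having to know in advance what the defects look like, and the threshold turns the match ratio into a pass or fail signal that can be tracked over time.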
By reviewing the metrics regularly and following their trends, a company can track the evolution (improvement or degradation) of the quality of its data. Talend Data Quality includes other profiling and reporting functionalities:
Data Cleansing
Once the problem areas have been identified, the data must be corrected. For data that does not conform to your standards, Talend Data Quality has powerful tools for repairing and cleansing it. Talend Data Quality allows you to use reference data to set the standards for values, regular expressions to set standards for data shape and size, and matching algorithms to find and repair duplicates and near duplicates in your data. Set up cleansing processes using a wide range of dedicated data integration and quality components. These dedicated components, such as name and address cleansing components and fuzzy deduplication components, are natively available in Talend Data Quality.
Data Enrichment
Data Enrichment fills in the missing pieces in your data so that you can reach your business goals. The variety of this information is limitless: it can include incorporating a company's Dun & Bradstreet information or a consumer's credit score, getting the longitude and latitude of an address to help plan delivery routes, or collecting census data to target demographics or income categories. The intuitive development environment helps users develop seamless processes in one single environment, to consolidate, merge or simply insert data into any target system.
Analytical Portal
Data Quality Portal provides customizable web-based data quality monitoring and reporting to help organizations keep watch over crucial data quality metrics that may impact important business processes. Data Quality Portal delivers customized key quality indicators (KQIs) to a web-based portal where teams can collaborate on the process of improving data quality across the enterprise. It includes PDF report generation, user-customized dashboards, ad hoc queries and time-based monitoring of KQIs. The Data Quality Portal also provides access to a predefined set of reports and global quality gauges that watch for the violation of data quality thresholds.
Data Quality and Data Integration
Since all Talend products are part of the same unified platform, all data quality functionality is seamlessly integrated with Talend Integration Suite and with Talend MDM, providing users with consistent ergonomics, a short learning curve and a high level of reusability. This offers unrivaled benefits in terms of resource optimization and utilization, and project consistency. Key features of this integrated platform include: