What is Data Quality? Definition, Examples, and Tools
Data quality is the process of conditioning data to meet the specific needs of business users. Data is your organization’s most valuable asset, and decisions based on flawed data can have a detrimental impact on the business. That is why you must have confidence in your data quality before it is shared with everyone who needs it.
The impact of poor data quality
The insights that a business can extract out of data are only as good as the data itself. Bad data can come from every area of your organization in many forms, and can lead to difficulties in mining for insights and ultimately poor decision-making.
Data quality worries many executives. According to the Forbes Insights and KPMG “2016 Global CEO Outlook,” 84% of executives are concerned about the quality of the data they’re using for business intelligence. Poor data quality can also be costly: a study conducted by MIT Sloan notes that bad data can cost as much as 15-25% of total revenue.
The good news is that you don’t have to allow bad data to cost your company any more time and money. Keeping the six data quality metrics at the forefront of your data collection initiatives will promote optimal performance of business systems and support user faith in the data’s reliability.
Setting data quality expectations
Regardless of its size, function, or market, every organization needs to pay attention to data quality to understand its business and to make sound business decisions. The kinds and sources of data are extremely numerous, and the quality of that data will affect the business differently depending on what it’s used for and why. That is why your business needs to set its own agreed-upon expectations, decided collaboratively, for each of the six data quality metrics, based on what you hope to get out of the data.
Data’s value comes primarily when it underpins a business process or decision-making based on business intelligence. Therefore, the agreed data quality rules should take account of the value that data can provide to an organization. If data has a very high value in a certain context, more rigorous data quality rules may be required in that context. Companies must therefore agree on data quality standards based not only on the data quality dimensions themselves (and, of course, any external standards that data quality must meet) but also on the impact of not meeting them.
The high cost of ignoring data quality
The cost of doing nothing explodes over time. Poor data quality is much easier to mitigate if it is caught at its point of origin, before the data is used. As a rule of thumb, if you verify or standardize data at the point of entry, before it makes it into your back-end systems, it costs about $1 to standardize it. If you cleanse that data later, matching and cleansing it in all the different places it has reached, it costs about $10 in time and effort for every $1 it would have cost at entry. And if you just leave that bad data to sit in your system, where it continually degrades the information you use to make decisions, send out to customers, or present to your company, it costs about $100 compared to the $1 it would have cost to deal with that data at its entry point. The cost gets greater the longer bad data sits in the system. The goal, therefore, is to catch bad data before it ever enters your systems.
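The $1 / $10 / $100 comparison above can be worked through as simple arithmetic. The following Python sketch is only an illustration: the stage names, the `remediation_cost` function, and the choice of "one bad record" as the unit of cost are assumptions made for the example, not part of the rule as stated above.

```python
# Relative cost of dealing with one unit of bad data at each stage,
# using the 1 : 10 : 100 ratio described in the text.
# Treating a single bad record as the "unit" is an assumption for illustration.
RELATIVE_COST = {
    "at_entry": 1,        # verified or standardized at the point of entry
    "cleansed_later": 10, # matched and cleansed after reaching back-end systems
    "never_fixed": 100,   # left in place, degrading decisions and communications
}


def remediation_cost(bad_records: int, stage: str) -> int:
    """Return the relative cost in dollars of handling bad records at a stage."""
    return bad_records * RELATIVE_COST[stage]


if __name__ == "__main__":
    # Hypothetical volume of 5,000 bad records.
    for stage in RELATIVE_COST:
        print(f"{stage}: ${remediation_cost(5000, stage):,}")
```

Under these assumptions, the same batch of bad records costs ten times more to cleanse after the fact than at entry, and a hundred times more if it is never fixed.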
A winning approach to data quality
To do this, you need to establish a pervasive, proactive, and collaborative approach to data quality in your company. Every team (not just the technical ones) has to be responsible for data quality; it has to cover every system; and it has to have rules and policies that stop bad data before it ever gets in.
Does this sound impossible? It’s not. Here’s your roadmap to develop this approach:
- Build your interdisciplinary team: Recruit data architects, business people, data scientists, and data protection experts as a core data quality team. It should be managed by a deployment leader who should be both a team coach and a promoter of data quality projects.
- Set your expectations from the start: Why data quality? Find your data quality answers among business people. Make sure you and your team know your finish line. Make sure you set goals with a high business impact.
- Anticipate regulation changes and manage compliance: Use your data quality core team to confront short-term compliance initiatives such as GDPR. You will then gain immediate value and strategic visibility.
- Establish impactful and ambitious objectives: When establishing your data quality plan, don’t hesitate to set bold business-driven objectives. Your plan will retain the attention of the board and stretch people’s capabilities.
- Still deliver quick wins: Quick wins start by engaging the business in data management. Examples include onboarding data, migrating data faster to the cloud, or cleansing your Salesforce data.
- Be realistic: Define and actively use measurable KPIs accepted and understood by everyone. Data quality is tied to the business, so drive your projects using business-driven indicators such as ROI or Cost-Saving Improvement Rate.
- Celebrate success: When finishing a project with measurable results, make sure you take time to make it visible to key stakeholders. Know-how is good. It’s better with good communication skills.
Managing data across the enterprise
A proactive approach to data quality allows you to check and measure the quality of data before it gets into your core systems. Accessing and monitoring that data across internal, cloud, web, and mobile applications is a big task. The only way to scale that kind of monitoring across all of those systems is through data integration, which makes it necessary to control data quality in real time.
Avoiding the propagation of erroneous data by inserting control rules into your data integration processes is key. With the right data quality tools and integrated data, you can create whistleblowers that detect some of the root causes of overall data quality problems. You will then need to track data across your landscape of applications and systems, which allows you to parse, standardize, and match the data in real time and to check that the data is correct whenever needed.
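As a rough illustration of a control rule inside an integration step, the Python sketch below parses and standardizes incoming records, rejects those that fail a rule, and matches duplicates before anything is loaded. It does not show how any particular data quality tool works: the record layout (`name`, `email`), the function names, the simplified email pattern, and the choice to match on email address are all assumptions made for the example. It requires Python 3.9 or later.

```python
import re
from typing import Optional

# Deliberately simple, illustrative email check; real validation rules vary.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def standardize(record: dict) -> Optional[dict]:
    """Parse and standardize one raw record; return None if it fails the rule."""
    # Collapse repeated whitespace and normalize capitalization in the name.
    name = " ".join(str(record.get("name", "")).split()).title()
    # Trim and lowercase the email so equivalent addresses compare equal.
    email = str(record.get("email", "")).strip().lower()
    if not name or not EMAIL_PATTERN.match(email):
        return None
    return {"name": name, "email": email}


def load(raw_records: list[dict]) -> tuple[list[dict], list[dict]]:
    """Split incoming records into accepted (deduplicated) and rejected lists."""
    accepted: list[dict] = []
    rejected: list[dict] = []
    seen_emails: set[str] = set()
    for raw in raw_records:
        clean = standardize(raw)
        if clean is None:
            rejected.append(raw)  # stopped before reaching core systems
        elif clean["email"] not in seen_emails:
            # Match step: records with the same email count as one entity.
            seen_emails.add(clean["email"])
            accepted.append(clean)
    return accepted, rejected


if __name__ == "__main__":
    incoming = [
        {"name": "  ada   lovelace ", "email": " Ada@Example.com "},
        {"name": "Ada Lovelace", "email": "ada@example.com"},
        {"name": "Bob", "email": "not-an-email"},
    ]
    accepted, rejected = load(incoming)
    print("accepted:", accepted)
    print("rejected:", rejected)
```

In this sketch, matched duplicates are dropped silently and rejected records are returned to the caller, so they can be reported back to the source system rather than discarded.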
The cost of bad data quality can be counted in lost opportunities, bad decisions, and the time it takes to hunt down, cleanse, and correct errors. Collaborative data management and the tools to correct errors at the point of origin are the clear ways to ensure data quality for everyone who needs it. Learn about the numerous apps Talend Data Fabric offers to help achieve both those goals.