Making Data a Team Sport: Muscle your Data Quality by Challenging IT & Business to Work Together
Data Quality is often perceived as the solo task of a data engineer. In fact, nothing could be further from the truth. People close to the business are eager to resolve data-related issues, as they’re the first to be impacted by bad data. But they are often reluctant to update data, either because Data Quality apps are not really made for them or simply because they are not allowed to use them. That’s one of the reasons bad data keeps increasing. According to Gartner, the cost of poor data quality rose by 50% in 2017, reaching an average of $15 million per year per organization. This cost will explode in the upcoming years if nothing is done.
But things are changing: Data Quality is increasingly becoming a company-wide strategic priority involving professionals from different horizons. A sports team is a fair analogy for the key ingredients needed to win any data quality challenge:
- As in team sports, you will hardly succeed with a solo approach; you need to tackle the problem from all angles
- As in team sports, there are practices that help the team succeed and win
- As in team sports, Business and IT teams need the right tools, the right approach, and the right people to tackle the data quality challenge
That said, it is not as difficult as one might imagine. You just need to take up the challenge and do things the right way from the get-go.
- The right tools. How to fight complexity with simple but interconnected apps
There is a plethora of data quality tools on the market. Register for a big data tradeshow and you will discover plenty of data preparation and stewardship tools offering several benefits to fight bad data. But only a few of them cover Data Quality for all. On one side, you will have sophisticated tools requiring deep expertise for a successful deployment.
These tools are often complex and require in-depth training to be deployed. Their user interface is not suitable for everyone, so only IT people can really manage them. If you have short-term data quality priorities, you will miss your deadline. That would be like trusting a rookie to pilot a jumbo jet whose flight instruments are obviously too sophisticated for the flight to end successfully.
On the other side, you will find simple and powerful apps that are often too siloed to be injected into a data quality process. Even if they successfully focus on business people with a simple UI, they miss a big piece of the puzzle: collaborative data management. And that’s precisely the challenge: success relies not only on the tools and capabilities themselves, but on their ability to simply talk to one another. For that, you need a platform-based solution whose apps share, operate on, and transfer data, actions, and models together. That’s precisely what Talend provides.
You will confront multiple use cases where it will be next to impossible to manage your data successfully alone. By working together, users will empower themselves through the full data lifecycle, giving your business the power to overcome traditional obstacles such as cleaning, reconciling, matching, or resolving your data.
- The right approach
It all starts with a simple three-step approach to managing data better together: analyze, improve, and control.
Analyze your Data Environment:
Start by getting the big picture and identifying your key data quality challenges. Rather than profiling the data alone with the data profiling features in Talend Studio, a data engineer could simply delegate that task to a business analyst who knows customers best. In that case, Data Preparation offers simple yet powerful features that help the team get a glimpse of data quality through in-flight indicators, such as the quality of every column in a data set. Data Preparation allows you to easily create a preparation based on a data set.
Let’s take the example of a team wishing to prepare a marketing campaign together with sales but suffering from bad data in the Salesforce CRM system. With Data Preparation, you can profile and browse business data coming from Salesforce, both automatically and interactively. Once connected to Salesforce through Data Preparation, you will get a clear picture of your data quality. Once you have identified the problem, you can solve it on your own with simple but powerful operations. But you’ve only just scratched the surface. That’s where you need the expertise of a data engineer to go deeper and improve your data quality flows.
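To make the idea of per-column quality indicators concrete, here is a minimal Python sketch. It is not how Data Preparation works internally; the sample rows, the `VALIDATORS` rules, and the `profile` function are all hypothetical and only illustrate counting valid, empty, and invalid values for each column.

```python
import re

# Hypothetical sample rows standing in for CRM contact data.
ROWS = [
    {"name": "Ada Lovelace", "email": "ada@example.com"},
    {"name": "", "email": "not-an-email"},
    {"name": "Grace Hopper", "email": ""},
    {"name": "Alan Turing", "email": "alan@example.org"},
]

# Deliberately simple email check; real-world validation is stricter.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

# Per-column validity rules (assumptions made for this sketch).
VALIDATORS = {
    "name": lambda value: True,  # any non-empty name counts as valid
    "email": lambda value: EMAIL_RE.match(value) is not None,
}


def profile(rows, validators):
    """Count valid, empty, and invalid values for each column."""
    report = {}
    for column, is_valid in validators.items():
        counts = {"valid": 0, "empty": 0, "invalid": 0}
        for row in rows:
            value = (row.get(column) or "").strip()
            if not value:
                counts["empty"] += 1
            elif is_valid(value):
                counts["valid"] += 1
            else:
                counts["invalid"] += 1
        report[column] = counts
    return report


for column, counts in profile(ROWS, VALIDATORS).items():
    print(column, counts)
```

A report like this is the kind of at-a-glance view a business analyst can act on: columns with many empty or invalid values are the first candidates for a cleaning effort.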
Improve your Data with in-depth tools and start remediation by designing stewardship campaigns
Using Talend Studio as your data quality engine, the data engineers of your IT department get access to a wide array of very powerful features. For example, you can separate the wheat from the chaff with a simple filter component such as tFilterRow, identifying wrong email patterns or excluding improper domain addresses from your domain list. At that stage, you need to make sure bad data is isolated within your data quality process. Once filtering is done, you will continue to improve your data, and for that you will call on others for help. Talend Studio works as the pivot of your data quality process. From Talend Studio, you can log on to Talend Cloud with your credentials and extend your data quality process to users close to the business. Whether you’re a business user or a data engineer, Data Stewardship, now in the cloud, allows you to launch cleaning campaigns and solve the bad data challenge with your extended team. This starts with designing your campaign.
Using the same UI look and feel as Talend Data Preparation, Talend Data Stewardship offers the same easy-to-use capabilities that business users love. As it’s fully operationalized and connected to Talend Studio, it enables IT or business process people to extend data quality operations to new people who are unfamiliar with technical tools but keen on cleaning data with simple apps, relying on their business knowledge and experience.
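The filtering step described above can be sketched in a few lines of Python. This is an illustration of the idea only, not a Talend component: the `split_contacts` function, the `EXCLUDED_DOMAINS` set, and the sample contacts are hypothetical, and the email pattern is intentionally simplistic.

```python
import re

# Simplistic pattern: something@something.something, no spaces.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

# Hypothetical list of domains the campaign should not target.
EXCLUDED_DOMAINS = {"mailinator.com", "example.invalid"}


def split_contacts(rows):
    """Return (accepted, rejected); each rejected row carries a reason."""
    accepted, rejected = [], []
    for row in rows:
        email = (row.get("email") or "").strip().lower()
        if not EMAIL_RE.match(email):
            rejected.append({**row, "reason": "bad email pattern"})
        elif email.rsplit("@", 1)[1] in EXCLUDED_DOMAINS:
            rejected.append({**row, "reason": "excluded domain"})
        else:
            accepted.append(row)
    return accepted, rejected


CONTACTS = [
    {"id": 1, "email": "ada@example.com"},
    {"id": 2, "email": "grace(at)example.com"},
    {"id": 3, "email": "temp@mailinator.com"},
]

accepted, rejected = split_contacts(CONTACTS)
print([row["id"] for row in accepted])
print([(row["id"], row["reason"]) for row in rejected])
```

Keeping the rejected rows, together with the reason they were rejected, is what makes the next step possible: those isolated records are exactly what you would hand over to stewards in a cleaning campaign.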
That’s the essence of collaborative data management: one app for each dedicated operation but seamlessly connected on a single platform that manages your data from ingestion to consumption.
As an example, feel free to view this webinar to learn how to use cloud-based tools to make data better for all: https://info.talend.com/en_tld_better_dataquality.html
Control your data quality process to the last mile with the whole network of stewards
Once you have designed your stewardship campaign, you need to call on stewards for help and run the campaign so that they check the data at their disposal. Talend Data Stewardship plays a massive role here. Unlike other tools on the market, it lets you extend your data quality process to stewards through UI-friendly applications, which makes it easier to resolve your data and ensures that key business contributors are engaged in an extended data resolution campaign. They will feel comfortable resolving business data using simple apps.
Engaging business people in your data quality process brings several other benefits too. You will get more accurate results, as business analysts have the experience and skills required to choose the proper data. You will soon realize that they feel committed and are eager to cooperate and work with you, since they are ultimately the ones most affected by data quality.
Machine learning acts here as a virtual companion of your data-driven strategy: as stewards complete missing details, the machine learning capabilities of Talend Data Quality solutions learn from them and predict future matching records based on the initial records the stewards resolved. As the system learns from users, it frees you up to pursue other stewardship campaigns and reinforce the impact and control of your data processes.
Finally, you will build a data flow from your stewardship campaign back to your Salesforce CRM system, so that the bad data cleaned and resolved by stewards is reinjected into Salesforce. Such operations can only be achieved simply if your apps are connected together on a single platform. You will also be able to mark data sets as certified directly in a business app like Data Preparation, so that users accessing the data get cleaned and trusted data to work with.
Remember that this three-step approach is a continuous improvement process that will only get better with time.
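As a rough intuition for how a system can learn from steward decisions, consider the toy Python sketch below. It is emphatically not Talend's algorithm: it only derives a similarity cut-off from pairs that stewards have already labeled, using the standard library's `difflib`. The `STEWARD_DECISIONS` data and the `learn_threshold` and `predict_match` helpers are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Case-insensitive string similarity between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def learn_threshold(labeled_pairs):
    """Pick a cut-off halfway between the weakest confirmed match
    and the strongest confirmed non-match."""
    match_scores = [similarity(a, b) for a, b, same in labeled_pairs if same]
    other_scores = [similarity(a, b) for a, b, same in labeled_pairs if not same]
    return (min(match_scores) + max(other_scores)) / 2


def predict_match(a, b, threshold):
    """Suggest whether two records refer to the same entity."""
    return similarity(a, b) >= threshold


# Hypothetical decisions made by stewards: (record A, record B, same entity?)
STEWARD_DECISIONS = [
    ("Acme Corp", "ACME Corp.", True),
    ("Globex Inc", "Globex Incorporated", True),
    ("Acme Corp", "Apex Group", False),
    ("Globex Inc", "Initech", False),
]

threshold = learn_threshold(STEWARD_DECISIONS)
print(predict_match("Initech", "Initech LLC", threshold))
print(predict_match("Initech", "Umbrella", threshold))
```

A production matching engine would use far richer features and models, but the principle is the same: every pair a steward resolves becomes a labeled example that sharpens future suggestions.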
To learn more about Data Quality, please download our Definitive Guide to Data Quality.