What is Data Governance and Why Do You Need It?

Data governance is a requirement in today's fast-moving and highly competitive enterprise environment. Now that organizations can capture massive amounts of diverse internal and external data, they need a discipline to maximize the value of that data, manage its risks, and reduce its cost.

What is Data Governance?

Data governance is a collection of processes, roles, policies, standards, and metrics that ensure the effective and efficient use of information in enabling an organization to achieve its goals. It establishes the processes and responsibilities that ensure the quality and security of the data used across a business or organization. Data governance defines who can take what action, upon what data, in what situations, using what methods.
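The "who, what action, what data, what situation, what method" definition can be pictured as a set of explicit rules. The following is a minimal sketch, not a description of any particular product: the `PolicyRule` class, the role names, and the dataset names are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PolicyRule:
    """One governance rule; every field name here is an illustrative assumption."""
    role: str      # who
    action: str    # what action
    dataset: str   # upon what data
    purpose: str   # in what situation
    method: str    # using what method


# Hypothetical rules, for illustration only.
RULES = {
    PolicyRule("billing_analyst", "read", "invoices",
               "monthly_reporting", "bi_dashboard"),
    PolicyRule("data_steward", "update", "customer_master",
               "data_correction", "stewardship_console"),
}


def is_allowed(role, action, dataset, purpose, method):
    """Return True only if an explicit rule grants this exact combination."""
    return PolicyRule(role, action, dataset, purpose, method) in RULES
```

In this sketch anything not explicitly granted is denied, which is one possible design choice; a real governance program may express its policies quite differently.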

A well-crafted data governance strategy is fundamental for any organization that works with big data, and will explain how your business benefits from consistent, common processes and responsibilities. Business drivers highlight what data needs to be carefully controlled in your data governance strategy and the benefits expected from this effort. This strategy will be the basis of your data governance framework.

For example, if a business driver for your data governance strategy is to ensure the privacy of healthcare-related data, patient data will need to be managed securely as it flows through your business. Retention requirements (e.g. history of who changed what information and when) will be defined to ensure compliance with relevant government requirements, such as the GDPR.
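As a rough illustration of "who changed what information and when", a change history can be modeled as a list of records that is filtered by a retention window. This is only a sketch under stated assumptions: the `ChangeRecord` fields are hypothetical, and the six-year `RETENTION` value is a placeholder, not a figure taken from the GDPR or any other regulation.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class ChangeRecord:
    """One audit entry: who changed what, and when (hypothetical fields)."""
    changed_by: str
    field_name: str
    old_value: str
    new_value: str
    changed_at: datetime


# Assumed retention window for the example; the real value must come
# from your own legal and compliance requirements.
RETENTION = timedelta(days=6 * 365)


def records_to_keep(records, now):
    """Return the audit entries that are still inside the retention window."""
    return [r for r in records if now - r.changed_at <= RETENTION]
```

A governance policy would typically also say what happens to the records that fall outside the window, for example archiving or deletion.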

Data governance ensures that roles related to data are clearly defined, and that responsibility and accountability are agreed upon across the enterprise. A well-planned data governance framework covers strategic, tactical, and operational roles and responsibilities.

What Data Governance is Not

Data governance is frequently confused with other closely related terms and concepts, including data management, master data management, and data stewardship.

Data Governance is Not Data Management

Data management refers to the management of the full data lifecycle needs of an organization. Data governance is the core component of data management, tying together nine other disciplines, such as data quality, reference and master data management, data security, database operations, metadata management, and data warehousing.

Data Governance is Not Master Data Management

Master data management (MDM) focuses on identifying an organization's key entities and then improving the quality of this data. It ensures you have the most complete and accurate information available about key entities such as customers, suppliers, and medical providers. Because those entities are shared across the organization, master data management is about reconciling fragmented views of those entities into a single view, a discipline that goes beyond data governance.

However, there is no successful MDM without proper governance. For example, a data governance program will define the master data models (what is the definition of a customer, a product, etc.), detail the retention policies for data, and define roles and responsibilities for data authoring, data curation, and access.

Data Governance is Not Data Stewardship

Data governance ensures that the right people are assigned the right data responsibilities. Data stewardship refers to the activities necessary to make sure that the data is accurate, in control, and easy to discover and process by the appropriate parties. Data governance is mostly about strategy, roles, organization, and policies, while data stewardship is all about execution and operationalization.

Data stewards take care of data assets, making certain that the actual data is consistent with the data governance plan, linked with other data assets, and in control in terms of data quality, compliance, or security.

Benefits of Data Governance

An effective data governance strategy provides many benefits to an organization, including:

  • A common understanding of data — Data governance provides a consistent view of, and common terminology for, data, while individual business units retain appropriate flexibility.
  • Improved quality of data — Data governance creates a plan that ensures data accuracy, completeness, and consistency.
  • Data map — Data governance provides an advanced ability to understand the location of all data related to key entities. Like a GPS that represents a physical landscape and helps people find their way through unfamiliar territory, data governance makes data assets usable and easier to connect with business outcomes.
  • A 360-degree view of each customer and other business entities — Data governance establishes a framework so an organization can agree on “a single version of the truth” for critical business entities and create an appropriate level of consistency across entities and business activities.
  • Consistent compliance — Data governance provides a platform for meeting the demands of government regulations, such as the EU General Data Protection Regulation (GDPR) and the US Health Insurance Portability and Accountability Act (HIPAA), and industry requirements such as the Payment Card Industry Data Security Standard (PCI DSS).
  • Improved data management — Data governance brings the human dimension into a highly automated, data-driven world. It establishes codes of conduct and best practices in data management, making certain that the concerns and needs beyond traditional data and technology areas — including areas such as legal, security, and compliance — are addressed consistently.

Cloud Data Governance

As more businesses and organizations realize the benefits of moving some or all of their data storage and processes to the cloud, using cloud integration strategies and iPaaS, the need for effective data governance grows with them.

Moving to the cloud is all about delegating certain tasks, such as infrastructure management, application development, and security, to third parties. Cloud is also about virtualization of technical resources, which can create data sovereignty challenges, for example with regulations that mandate that data resides in a certain place or country. In addition, cloud-first strategies generally encourage decentralization, allowing lines of business or workgroups to roll out their own systems independently, which could result in uncontrolled data sprawl.
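A data residency rule of this kind can be expressed as a simple check. The sketch below is illustrative only: the dataset names, region names, and the `ALLOWED_REGIONS` mapping are assumptions, not part of any specific cloud provider's API.

```python
# Hypothetical policy: where each governed dataset is allowed to reside.
ALLOWED_REGIONS = {
    "patient_records": {"eu-west", "eu-central"},
    "web_logs": {"eu-west", "us-east"},
}


def residency_violations(placements):
    """Return the (dataset, region) pairs that break the residency policy.

    `placements` is an iterable of (dataset, region) pairs. A dataset with
    no policy entry is always reported, which also surfaces ungoverned
    systems that were rolled out independently.
    """
    return [
        (dataset, region)
        for dataset, region in placements
        if region not in ALLOWED_REGIONS.get(dataset, set())
    ]
```

Flagging datasets that have no policy at all is a deliberate choice in this sketch, since unknown datasets are exactly what uncontrolled data sprawl produces.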

That’s where governance finds its place. First, a strategic data governance plan is crucial for migrating data to the cloud. Whether an organization is moving to a hybrid or a fully cloud-based data model, the data migration process will enjoy all the benefits of an overall data governance plan, and the migration itself will be more efficient and secure.

Additionally, moving data processes to the cloud adds a layer of complexity regarding security and access. While a completely on-premises data solution still needs a robust data governance strategy, stakeholders especially appreciate the value of data governance when that data is moving through the cloud.

Data Governance Tools

In order to find the right data governance approach for your organization, look for open source, scalable tools that can be quickly and economically integrated with the organization’s existing environment.

Additionally, a cloud-based platform will allow you to quickly plug into robust capabilities that are cost-efficient and easy to use. Cloud-based solutions also avoid the overhead required for on-premises servers.

As you start comparing and selecting data governance tools, focus on selecting tools that will help you realize the business benefits laid out in your data governance strategy.

These tools should help you:

  • Capture and understand your data through discovery, profiling, and benchmarking tools and capabilities. For example, the right tools can automatically detect a piece of personal data, like a social security number, in a new data set and trigger an alert.
  • Improve the quality of your data with validation, data cleansing, and data enrichment.
  • Manage your data with metadata-driven ETL and ELT, and data integration applications, so data pipelines can be tracked and traced with end-to-end data lineage.
  • Control your data with tools that actively review and monitor it.
  • Document your data so that it can be augmented by metadata to increase its relevance, searchability, accessibility, linkability, and compliance.
  • Empower the people who know the data best to contribute to data stewardship tasks with self-service tools.
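To make the discovery capability concrete, here is a minimal sketch of detecting a social security number in a new data set and raising an alert. It assumes the common dashed US format (for example 123-45-6789) and uses hypothetical function names; real profiling tools rely on far more robust detection than a single regular expression.

```python
import re

# Matches only the dashed form NNN-NN-NNNN; other layouts are not detected.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def find_ssn_columns(rows):
    """Return the sorted names of columns where any value looks like a US SSN."""
    flagged = set()
    for row in rows:
        for column, value in row.items():
            if isinstance(value, str) and SSN_PATTERN.search(value):
                flagged.add(column)
    return sorted(flagged)


def build_alerts(columns):
    """Turn flagged column names into alert messages (stand-in for a real alert)."""
    return [f"ALERT: possible SSN in column '{c}'" for c in columns]
```

In practice the alert would be routed to a data steward, who decides whether the column needs to be masked, restricted, or removed.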

Talend understands data governance and provides useful, cloud-based tools that can help any size organization move from ungoverned data to active data governance. Talend’s data quality, data and metadata management, and data stewardship tools are robust and easy to use, allowing you to quickly and effectively address your data governance needs.

Data Governance Is Not Optional

Organizations today have incredible amounts of data about customers, clients, suppliers, patients, employees, and more. An organization that uses this information properly to better understand the market and its target audience will be more successful. Data governance ensures that this data is trusted, well-documented, and easy to find and access within your organization, and that it is kept secure, compliant, and confidential.

Make certain that your organization is positioned to maximize its data governance investments and minimize the risk of data breaches. Take a look at our data governance solutions when you’re ready to get started.
