What is data mesh?
Data mesh is a data platform architecture that lets end users access important data without transporting it to a data lake or data warehouse and without needing expert data teams to intervene. Data mesh focuses on decentralization: it distributes data ownership among teams that can manage data as a product independently and securely. This reduces bottlenecks and silos in data management and enables scalability without sacrificing data governance.
Simply put? Data mesh makes your data discoverable, widely accessible, secure, and interoperable — giving you better decision-making power and faster time to value.
What you need to know about data mesh
Developed by Thoughtworks' Zhamak Dehghani, data mesh leverages a domain-driven, self-serve data infrastructure.
This addresses the flaws in monolithic data warehouse models by rethinking the human side of technology and reorganizing data ownership, so your company can reach its data faster and more easily.
Like the microservice architecture that came before it, data mesh builds on domain-driven design, and the possibilities it opens for democratizing data are exciting.
The four essential pillars of data mesh are:
- Unite diverse data sources — giving your organization a single source of truth, despite scattered data assets in different systems that may not communicate with each other
- Safeguard data through data governance
- Achieve the highest data quality regardless of data volume, even as demand grows for instantaneous data access and fast response times
- Enable self-service without the need for data team intervention, promoting effective data management and collaboration between data engineers, data scientists, and data consumers alike
Data mesh democratizes data management
Data architectures are constantly improving and evolving to fit your data management needs, but centralizing your data is difficult no matter where you store it. Data lakes are a cost-effective way to store data, but the drawbacks are clear:
- Accessing the data you need is slow and difficult.
- Data is locked into proprietary formats, tacking on fees and limiting access control.
- Costs climb quickly (storage, software, and data teams to move and copy data and maintain pipelines).
- Keeping all data on a single data platform becomes unmanageable due to ingestion limitations.
Problems with centralized data ownership
- A centralized data team has to import and transport data to a central data lake, which can quickly become time-consuming and expensive. Data mesh solution: The distributed data architecture of data mesh treats data as a product, with each business unit owning the data in its own domain. This decentralized ownership model reduces the time to value and empowers your teams with discoverable data.
- As data volume increases, queries become more complicated and require changes across the entire data pipeline, which does not scale and slows down the response time and agility of your team as a whole. Data mesh solution: Data mesh delegates dataset ownership from a central team to the domains (individual teams or business users), enabling agility and scalability. Data mesh architecture powers real-time decision-making in businesses.
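The contrast above, copying everything into a central lake versus leaving data with the domain that owns it, can be sketched in a few lines. This is only an illustration under stated assumptions: two SQLite files stand in for stores owned by hypothetical sales and customer domains, and SQLite's `ATTACH DATABASE` stands in for whatever federated query layer a real data mesh platform would provide. The table names, columns, and the `revenue_by_region` helper are invented for the example.

```python
import os
import sqlite3
import tempfile


def build_domain_db(path, ddl, insert_sql, rows):
    # Each domain team creates and maintains its own store.
    con = sqlite3.connect(path)
    con.execute(ddl)
    con.executemany(insert_sql, rows)
    con.commit()
    con.close()


def revenue_by_region(sales_path, customers_path):
    # The consumer attaches both domain stores and joins them where they
    # live; nothing is copied into a central lake first.
    con = sqlite3.connect(sales_path)
    con.execute("ATTACH DATABASE ? AS customers", (customers_path,))
    rows = con.execute(
        """
        SELECT c.region, SUM(o.total)
        FROM orders AS o
        JOIN customers.customer AS c ON c.id = o.customer_id
        GROUP BY c.region
        ORDER BY c.region
        """
    ).fetchall()
    con.close()
    return rows


with tempfile.TemporaryDirectory() as tmp:
    sales_path = os.path.join(tmp, "sales.db")          # owned by the sales domain
    customers_path = os.path.join(tmp, "customers.db")  # owned by the customer domain
    build_domain_db(
        sales_path,
        "CREATE TABLE orders (customer_id INTEGER, total REAL)",
        "INSERT INTO orders VALUES (?, ?)",
        [(1, 40.0), (2, 25.0), (1, 10.0)],
    )
    build_domain_db(
        customers_path,
        "CREATE TABLE customer (id INTEGER, region TEXT)",
        "INSERT INTO customer VALUES (?, ?)",
        [(1, "EMEA"), (2, "APAC")],
    )
    result = revenue_by_region(sales_path, customers_path)

print(result)  # [('APAC', 25.0), ('EMEA', 50.0)]
```

Each store keeps its own schema and lifecycle, and the consumer only needs to know where the two products live and how they join.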
Data products: data mesh vs data fabric
Data mesh and data fabric have many similarities and overlaps, but there are a few key differences to be aware of.
Data mesh connects siloed data to help enterprises move towards automated analytics at scale. It allows businesses to slash the operational inefficiencies of monolithic data architectures and save themselves from massive operational and storage costs. This new distributed approach aims to clear the data access bottlenecks of centralized data ownership by giving data management and ownership to domain-specific business teams.
Data mesh architecture:
- Gives individual teams control over certain datasets (domain-driven data ownership, rather than relying on a central data team)
- Focuses on data as a product
- Emphasizes a self-service infrastructure
- Ensures data governance and security
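As a rough illustration of how these four traits fit together, the sketch below models a data product descriptor and a shared self-service catalog. Everything in it is hypothetical: `DataProduct`, `Catalog`, the field names, and the role-based access rule are assumptions made for the example, not part of any particular data mesh product or standard.

```python
from dataclasses import dataclass


@dataclass
class DataProduct:
    """Hypothetical descriptor for one dataset published as a product."""
    name: str
    domain: str   # owning business domain, e.g. "sales"
    owner: str    # contact for the owning team
    schema: dict  # column name -> type: the published contract
    allowed_roles: frozenset = frozenset()  # governance: roles that may read it


class Catalog:
    """Hypothetical self-service catalog shared by all domains."""

    def __init__(self):
        self._products = {}

    def register(self, product):
        # Each domain publishes its own products, namespaced by domain.
        key = f"{product.domain}.{product.name}"
        if key in self._products:
            raise ValueError(f"{key} is already registered")
        self._products[key] = product
        return key

    def discover(self, role):
        # Consumers browse only the products their role is allowed to read.
        return sorted(
            key for key, p in self._products.items() if role in p.allowed_roles
        )

    def get(self, key, role):
        # Governance check is applied on every access, not just at discovery.
        product = self._products[key]
        if role not in product.allowed_roles:
            raise PermissionError(f"role {role!r} may not read {key}")
        return product


catalog = Catalog()
catalog.register(DataProduct(
    name="orders", domain="sales", owner="sales-data@example.com",
    schema={"order_id": "int", "total": "float"},
    allowed_roles=frozenset({"analyst", "finance"}),
))
catalog.register(DataProduct(
    name="payroll", domain="hr", owner="hr-data@example.com",
    schema={"employee_id": "int", "salary": "float"},
    allowed_roles=frozenset({"hr"}),
))

print(catalog.discover("analyst"))  # ['sales.orders']
```

The domain teams stay responsible for what they register, while the catalog gives consumers one place to find products and applies the same access rule to all of them.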
Benefits of data mesh in data management
- Agility and scalability: Data mesh powers decentralized data operations, improving time-to-market, scalability, and business domain agility.
- Flexibility and independence: Enterprises that take on data mesh architecture avoid becoming locked into one data platform or data product.
- Faster access to critical data: Data mesh offers easy access through shared infrastructure with a self-service model, allowing for faster data access and SQL queries.
- Transparency for cross-functional use across teams: Centralized data ownership on traditional data platforms leaves expert data teams isolated and heavily depended upon, creating a lack of transparency. Data mesh decentralizes data ownership and distributes it among cross-functional domain teams.
Update your data products with data mesh
Data mesh opens up possibilities for businesses across a range of consumption scenarios, including behavior modeling, data analytics, and business intelligence. These use cases benefit your data team, engineering team, and data scientists alike.