Data is like food to every organization. Every business function depends on the data produced and used in its field. As that data becomes increasingly distributed, organizations need an approach that keeps it easily accessible and interconnected across the entire business.
Many vendors, educators, and data pundits have attempted to define the term data mesh. Broadly, it is often described as one of the most disruptive trends in the data, artificial intelligence (AI), and analytics worlds. In fact, according to Google Trends, the term “data mesh” has overtaken “data lakehouse,” which had gained immense popularity in the industry over the last few years.
Data mesh is a new approach to designing modern data architectures that combines organizational constructs with data-centric ones such as data management and governance. It aims to make data easily accessible and interconnected across the organization.
The term was introduced by Zhamak Dehghani, a ThoughtWorks consultant, in her research paper in 2019. According to Dehghani, data mesh is a type of data platform architecture that embraces the ubiquity of data in the enterprise through a domain-oriented, self-serve design. Data mesh arguably marks the next big architectural shift in data.
Traditional monolithic data infrastructures rely on a central data lake to handle the consumption, storage, transformation, and output of data. A data mesh, by contrast, supports distributed, domain-specific data consumers and treats “data as a product.” Here, each domain handles its own data pipelines.
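To make the “data as a product” idea more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the `DataProduct` and `Catalog` classes, the field names, and the example domains and contacts are all hypothetical and do not come from any particular data mesh tool.

```python
from dataclasses import dataclass


@dataclass
class DataProduct:
    """A dataset published by a domain team and described like a product (hypothetical model)."""
    name: str
    domain: str        # owning business domain, e.g. "sales"
    owner: str         # contact responsible for the product's quality
    schema: dict       # column name -> type, acting as a simple contract
    tags: tuple = ()   # keywords that help consumers discover the product


class Catalog:
    """A shared registry that lets consumers discover products across domains."""

    def __init__(self):
        self._products = {}

    def register(self, product):
        key = (product.domain, product.name)
        if key in self._products:
            raise ValueError(f"{product.domain}.{product.name} is already registered")
        self._products[key] = product

    def find(self, tag):
        # Return fully qualified names of every product carrying the tag.
        return sorted(
            f"{p.domain}.{p.name}" for p in self._products.values() if tag in p.tags
        )

    def owner_of(self, domain, name):
        return self._products[(domain, name)].owner


# Each domain registers and maintains its own products (illustrative values).
catalog = Catalog()
catalog.register(DataProduct(
    name="orders", domain="sales", owner="sales-team@example.com",
    schema={"order_id": "int", "amount": "float"}, tags=("revenue",),
))
catalog.register(DataProduct(
    name="clickstream", domain="web", owner="web-team@example.com",
    schema={"session_id": "str", "url": "str"}, tags=("behavior",),
))
```

A consumer could then call `catalog.find("revenue")` to discover the sales domain's product and `catalog.owner_of("sales", "orders")` to find out who is accountable for it, without going through a central data team.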
Data mesh aims to eliminate the challenges related to data availability and accessibility, and both business users and data scientists stand to benefit from it. It lets them access, analyze, and operationalize business insights from virtually any data source, irrespective of location, without depending on expert data teams.
Data mesh helps organizations make data accessible, available, discoverable, secure, and interoperable. Faster access to query data translates directly into faster time to value, without the need to move data.
Need for data mesh
According to reports, global data creation is projected to exceed 180 zettabytes within five years. Present data platforms have limitations that hamper enterprise data processing and hold back business growth.
Current data platforms are facing the following challenges –
Challenge #1 – At present, most organizations rely on a centralized strategy to manage large amounts of data spanning many sources, use cases, and types. To analyze data, users must first move it from edge locations to a central data lake and then send the results back out. This makes the process expensive and time-consuming.
Solution: Data mesh provides a distributed architecture that treats data as a product, with each business unit owning its own domain. This decentralized ownership model helps reduce time-to-insights and time-to-value by empowering business units and operational teams to access and analyze “non-core” data quickly and easily.
Challenge #2 – While global data volume keeps increasing, querying in a centralized management model requires changes to the entire data pipeline, so the model fails to respond at scale. As the number of sources grows, response times for new customers or data sources slow down, which badly affects the agility the business needs to obtain useful data and react to change.
Solution: Data mesh moves dataset ownership from the central team to the domains (individual teams or business users) to enable business agility and change at scale. The architecture moves enterprises towards real-time decision-making by closing the gap in time and space between an event and its consumption and processing for analysis.
Challenge #3 – Data transfers are often subject to data residency and privacy guidelines that restrict migration when the data sits in a particular location or legal jurisdiction. For instance, it can be difficult for a user residing in North America to access data stored in an EU country. Complying with data governance regulations takes time and makes processes tedious. It can even significantly delay the data processing that analysis teams carry out for critical business intelligence, which helps maintain a competitive advantage.
Solution: In decentralized data management, the domains take control of the quality, security, and transfer of their data products. Here, data mesh acts as a connectivity layer that gives technical and non-technical users direct access and query capabilities to data sets where they reside, avoiding costly data transfers and easing residency concerns.
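The idea of querying data where it resides can be sketched in a few lines of Python. This is a simplified illustration, not a real mesh implementation: the domain names, the `orders` table, and the `local_total` and `mesh_total` helpers are hypothetical, and in-memory SQLite databases stand in for each domain's regional store. Running an aggregate locally does not by itself satisfy any particular regulation; it only shows that raw rows need not be copied to a central location.

```python
import sqlite3


def make_store(rows):
    """Create an in-memory SQLite database standing in for one domain's local store."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (order_id INTEGER, amount REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    conn.commit()
    return conn


# Hypothetical domains, each keeping its data in its own region.
DOMAINS = {
    "sales_eu": {"region": "EU", "store": make_store([(1, 100.0), (2, 50.0)])},
    "sales_na": {"region": "NA", "store": make_store([(3, 70.0)])},
}


def local_total(domain):
    """Run the aggregate inside the owning domain; only a single number leaves it."""
    store = DOMAINS[domain]["store"]
    (total,) = store.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM orders"
    ).fetchone()
    return float(total)


def mesh_total():
    """Combine per-domain results without copying any raw rows centrally."""
    return sum(local_total(domain) for domain in DOMAINS)
```

Here, `mesh_total()` asks each domain for its own aggregate and adds up the answers, so the order rows themselves stay in the EU and NA stores.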
Data mesh benefits
- Makes business agile and scalable – Data mesh enables decentralized data operations and independent team performance, and provisions data infrastructure as a service. This results in improved time-to-market, scalability, and business domain agility. It also helps remove process complexity and IT backlog, which reduces operating and storage costs.
- Can access data faster with prompt delivery – Data mesh delivers an easily governable and centralized infrastructure that’s premised on a self-service model and hides the underlying complexity, for faster data access and accurate delivery. It becomes easier for businesses to access data from anywhere through SQL queries with much lower latency. The distributed architecture of the data mesh also reduces the processing and intervention layers that delay time to insight.
- Maintains flexibility and gives independence – Organizations adopting data mesh architecture become vendor-agnostic businesses that are not completely dependent on one data platform. The distributed infrastructure gives companies unparalleled flexibility and choice because it connects to multiple systems.
- Maintains data governance for end-to-end compliance – A distributed architecture handles data ingestion across its many sources, formats, and volumes, allowing organizations to enforce security at the source system. Decentralized data operations make it easier to comply with global data governance rules for data quality and data accessibility.
- Cross-functional teams improve transparency – Centralized data ownership in traditional data platforms isolates expert teams, causes a lack of transparency, and fails to provide a backup plan in the event of data control/ownership loss. Data mesh decentralizes data ownership through its domain-oriented strategy by spreading it among cross-functional domain teams, including domain experts, business teams, IT, and agile virtual teams, for enhanced transparency and data quality.
- Ensures platform connectivity and data security – With a decentralized structure, cloud applications can connect in real time to sensitive data that stays on-site, streams live, or even resides on devices. Data mesh lets analytics run where the data resides, without asking users to make a copy and send it over a public network to a data warehouse.
It also reduces the risk of information loss or data breach, which improves security and lowers latency. These factors help improve overall performance in use cases such as live streaming, financial trading, and online gaming that rely on platform connectivity in a distributed model.
Data mesh is here to stay
When put into action, data mesh helps organizations get more out of distributed data. It opens up possibilities for businesses in multiple consumption scenarios, including behavior modeling, analytics, and data-intensive applications. The data involved could be core data, such as business sales data, and/or non-core data, such as web data and clickstream. The distributed data architecture makes data easier to access and enables faster delivery without locking the business into an expensive enterprise warehouse.