Ebook, 693 pages, 4 hours

Data Mesh in Action

By Jacek Majchrzak, Sven Balnojan and Marian Siwiak

About this ebook

Revolutionize the way your organization approaches data with a data mesh! This new decentralized architecture outpaces monolithic lakes and warehouses and can work for a company of any size.

In Data Mesh in Action you will learn how to:

    Implement a data mesh in your organization
    Turn data into a data product
    Move from your current data architecture to a data mesh
    Identify data domains, and decompose an organization into smaller, manageable domains
    Set up the central governance and local governance levels over data
    Balance responsibilities between the two levels of governance
    Establish a platform that allows efficient connection of distributed data products and automated governance

Data Mesh in Action reveals how this groundbreaking architecture looks for both small startups and large enterprises. You won’t need any new technology—this book shows you how to start implementing a data mesh with flexible processes and organizational change. You’ll explore both an extended case study and multiple real-world examples. As you go, you’ll be expertly guided through discussions around Socio-Technical Architecture and Domain-Driven Design with the goal of building a sleek data-as-a-product system. Plus, dozens of workshop techniques for both in-person and remote meetings help you onboard colleagues and drive a successful transition.

About the technology
Business increasingly relies on efficiently storing and accessing large volumes of data. The data mesh is a new way to decentralize data management that radically improves security and discoverability. A well-designed data mesh simplifies self-service data consumption and reduces the bottlenecks created by monolithic data architectures.

About the book
Data Mesh in Action teaches you pragmatic ways to decentralize your data and organize it into an effective data mesh. You’ll start by building a minimum viable data product, which you’ll expand into a self-service data platform, chapter-by-chapter. You’ll love the book’s unique “sliders” that adjust the mesh to meet your specific needs. You’ll also learn processes and leadership techniques that will change the way you and your colleagues think about data.

What's inside

    Decompose an organization into manageable domains
    Turn data into a data product
    Set up central and local governance levels
    Build a fit-for-purpose data platform
    Improve management, initiation, and support techniques

About the reader
For data professionals. Requires no specific programming stack or data platform.

About the authors
Jacek Majchrzak is a hands-on lead data architect. Dr. Sven Balnojan manages data products and teams. Dr. Marian Siwiak is a data scientist and a management consultant for IT, scientific, and technical projects.

Table of Contents

PART 1 FOUNDATIONS
1 The what and why of the data mesh
2 Is a data mesh right for you?
3 Kickstart your data mesh MVP in a month
PART 2 THE FOUR PRINCIPLES IN PRACTICE
4 Domain ownership
5 Data as a product
6 Federated computational governance
7 The self-serve data platform
PART 3 INFRASTRUCTURE AND TECHNICAL ARCHITECTURE
8 Comparing self-serve data platforms
9 Solution architecture design

Category: Computers
Language: English
Publisher: Manning
Release date: Mar 21, 2023
ISBN: 9781638351849

Author

Jacek Majchrzak

Jacek Majchrzak is a hands-on lead architect in the area of drug discovery where he implements the data mesh idea. Jacek is a workshop facilitator with a strong focus on domain-driven design, software architecture and socio-technical systems design.



Techfastly
Article
Why Is ELT Better For Cloud Data Warehousing?
Apr 1, 2021
2 min read
Step In To Your Time Machine
MacLife
Article
Step In To Your Time Machine
Sep 13, 2022
SAVED THE WRONG edit of a document? Time Machine! Spent hours on a project only to find someone else has saved an old version over it? Time Machine! Time Machine isn’t just a great way to retrace your steps. It’s an efficient backup app too; once set
3 min read
Cloudy With No Chance Of Erp
Architectural Review Asia Pacific
Article
Cloudy With No Chance Of Erp
Nov 11, 2019
ERP (enterprise resource planning) was born around the time the first ‘[Something] for Dummies’ book was published*. It’s typically inflexible, uncompromising software designed for large businesses, like banks, large corporations, manufacturing and s
2 min read
A.I.-POWERED RASPBERRY Pi
Linux Format
Article
A.I.-POWERED RASPBERRY Pi
Sep 19, 2023
1 min read
What is ELT?
Techfastly
Article
What is ELT?
Apr 1, 2021
It stands for extract, load, and transform- the processes a data pipeline uses for replicating the data from a source system into a target system such as a cloud data warehouse. 1. Extraction is the first step in which data is copied from the source
6 min read
Us Senators Call Out Big Tech’s New Approach To Poaching Talent, Products From Smaller AI Startups
TechLife News
Article
Us Senators Call Out Big Tech’s New Approach To Poaching Talent, Products From Smaller AI Startups
Jul 20, 2024
4 min read
We Need an FDA For Algorithms
Nautilus
Article
We Need an FDA For Algorithms
Nov 1, 2018
In the introduction to her new book, Hannah Fry points out something interesting about the phrase “Hello World.” It’s never been quite clear, she says, whether the phrase—which is frequently the entire output of a student’s first computer program—is
10 min read
AI Summit Apple Is Entering The Artificial Intelligence Race
AppleMagazine
Article
AI Summit Apple Is Entering The Artificial Intelligence Race
Feb 24, 2023
5 min read
Generative AI: What Leaders Need To Know
Rotman Management
Article
Generative AI: What Leaders Need To Know
Jan 1, 2024
12 min read
The Winter Getaway That Turned the Software World Upside Down
The Atlantic
Article
The Winter Getaway That Turned the Software World Upside Down
Dec 8, 2017
13 min read
How Netflix’s OTT Architecture Functions?
Techfastly
Article
How Netflix’s OTT Architecture Functions?
May 1, 2022
With so many OTT platforms in the market today, Netflix has managed to capture a majority of the audience on a global scale. Netflix has become the go-to source of so much entertainment for consumers in less than 20 years. It can even be said that Ne
4 min read
Seeing The Light
Linux Format
Article
Seeing The Light
Jun 28, 2022
7 min read
The Fundamental Limits of Machine Learning
Nautilus
Article
The Fundamental Limits of Machine Learning
Sep 20, 2016
5 min read
The Art Of Data Interrogation
Rotman Management
Article
The Art Of Data Interrogation
May 1, 2023
12 min read
Saxo Bank And Thoughtworks: Enabling Data Democratization At A Global Investment Bank
Business Today
Article
Saxo Bank And Thoughtworks: Enabling Data Democratization At A Global Investment Bank
Jan 20, 2023
2 min read
The Network NAS appliances 2024
PC Pro Magazine
Article
The Network NAS appliances 2024
Apr 4, 2024
4 min read
Business NAS appliances 2022
PC Pro Magazine
Article
Business NAS appliances 2022
Apr 10, 2022
4 min read
Network-monitoring software 2024
PC Pro Magazine
Article
Network-monitoring software 2024
Feb 8, 2024
4 min read
Cloud File Sharing And Collaboration
PC Pro Magazine
Article
Cloud File Sharing And Collaboration
Mar 9, 2023
4 min read
Buyer’s Guide Network Monitoring
PC Pro Magazine
Article
Buyer’s Guide Network Monitoring
Feb 9, 2023
4 min read
What Should You Know About Cloud Security Solutions?
HWM Singapore
Article
What Should You Know About Cloud Security Solutions?
Apr 9, 2021
3 min read
Hybrid Backup For Business
PC Pro Magazine
Article
Hybrid Backup For Business
Apr 8, 2021
4 min read
One Tree To Rule Them All
Family Tree
Article
One Tree To Rule Them All
Apr 19, 2022
7 min read
Is My Data Really Safe? Your Questions About Cloud-Based Storage, Answered.
Entrepreneur
Article
Is My Data Really Safe? Your Questions About Cloud-Based Storage, Answered.
Nov 1, 2014
2 min read
Cloud Configuration
PC Pro Magazine
Article
Cloud Configuration
Sep 10, 2020
2 min read
Opinion
Linux Format
Article
Opinion
Aug 20, 2024
Italo Vignoli is one of the founders of LibreOffice and the Document Foundation. “Think about the personal and confidential information in your office suite documents; it’s essential your office suite respects user privacy. LibreOffice does not ask y
3 min read
There’s A New Career In Town
True Love
Article
There’s A New Career In Town
Oct 21, 2019
2 min read
BUYER'S GUIDE TO Cloud File Sharing In 2021
PC Pro Magazine
Article
BUYER'S GUIDE TO Cloud File Sharing In 2021
Jan 7, 2021
4 min read
Network monitoring 2022
PC Pro Magazine
Article
Network monitoring 2022
Feb 10, 2022
4 min read
“The Biggest Problem I See When People Are Working From Home Is A Poorly Designed Network”
PC Pro Magazine
Article
“The Biggest Problem I See When People Are Working From Home Is A Poorly Designed Network”
Jun 8, 2023
6 min read

Book preview

Data Mesh in Action - Jacek Majchrzak

inside front cover

Data mesh development elements—data product development cycle details

Data Mesh in Action

Jacek Majchrzak, Sven Balnojan, and Marian Siwiak, with Mariusz Sieraczkiewicz

Foreword by Jean-Georges Perrin


Manning

Shelter Island

For more information on this and other Manning titles go to

www.manning.com

Copyright

For online information and ordering of these and other Manning books, please visit www.manning.com. The publisher offers discounts on these books when ordered in quantity.

For more information, please contact

Special Sales Department

Manning Publications Co.

20 Baldwin Road

PO Box 761

Shelter Island, NY 11964

Email: orders@manning.com

No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by means electronic, mechanical, photocopying, or otherwise, without prior written permission of the publisher.

Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in the book, and Manning Publications was aware of a trademark claim, the designations have been printed in initial caps or all caps.

♾ Recognizing the importance of preserving what has been written, it is Manning’s policy to have the books we publish printed on acid-free paper, and we exert our best efforts to that end. Recognizing also our responsibility to conserve the resources of our planet, Manning books are printed on paper that is at least 15 percent recycled and processed without the use of elemental chlorine.

ISBN: 9781633439979

brief contents

Part 1. Foundations

1 The what and why of the data mesh

2 Is a data mesh right for you?

3 Kickstart your data mesh MVP in a month

Part 2. The four principles in practice

4 Domain ownership

5 Data as a product

6 Federated computational governance

7 The self-serve data platform

Part 3. Infrastructure and technical architecture

8 Comparing self-serve data platforms

9 Solution architecture design

Appendix A.

Appendix B.

Appendix C.

Appendix D.

Front matter

foreword

preface

acknowledgments

about this book

about the authors

about the cover illustration

Part 1. Foundations

1 The what and why of the data mesh

1.1 Data mesh

1.2 Why the data mesh?

Alternatives

Data warehouses and data lakes inside the data mesh

Data mesh benefits

1.3 Use case: A snow-shoveling business

1.4 Data mesh principles

Domain-oriented decentralized data ownership and architecture

Data as a product

Federated computational governance

Self-serve data infrastructure as a platform

1.5 Back to snow shoveling

1.6 Socio-technical architecture

Conway’s law

Team topologies

Cognitive load

1.7 Data mesh challenges

Technological challenges

Data management challenges

Organizational challenges

2 Is a data mesh right for you?

2.1 Analyzing data mesh drivers

Business drivers

Organizational drivers

Domain-data drivers

Minor organizational drivers

Is a data mesh a good fit for me?

2.2 Data mesh alternatives and complementary solutions

Enterprise data warehouse

Data lake

Data lakehouse

Data fabric

Data mesh vs. the rest of the world

2.3 Understanding a data mesh implementation effort

The data mesh development cycle

Development cycle in the shoveling example

Enabling the team

Development cycle in detail

3 Kickstart your data mesh MVP in a month

3.1 Getting the lay of the land

Drawing a system landscape diagram

Performing stakeholder analysis

3.2 Identifying candidates for the MVP implementation team

Choosing development teams

Choosing the cooperation model

Choosing a data governance team

3.3 Setting up MVP governance

Defining data mesh value statement(s)

Defining data governance policies

Federating data governance

3.4 Developing minimal data products

Identifying domain-oriented datasets

Choosing data product owners

Deciding on the minimum viable data product description

Developing the simplest tools to expose your data

3.5 Setting up the minimal platform

Ensuring platform-forced governability

Ensuring platform security

Part 2. The four principles in practice

4 Domain ownership

4.1 Capturing and analyzing domains

Domain-driven design 101

Invite the right people

Choose the correct workshop technique

4.2 Applying ownership using domain decomposition

Domain, subdomain, and business capability

Decompose domains using business capability modeling

How are domains and business capabilities related to data?

Assign responsibilities to the data-product-owning team

Choose the right team to own data

4.3 Applying ownership using data use cases

Data use cases

Model and bounded context

Set up boundaries of use-case-driven data products

Choose the right team to own data

4.4 Applying ownership using design heuristics

What is a heuristic?

Using design heuristics

Designing heuristics and possible boundaries

4.5 Final landscape: The mesh of interconnected data products

Messflix data mesh

Data products form a mesh

Is it already a data mesh?

5 Data as a product

5.1 Applying product thinking

Product thinking analysis

Data product canvas

5.2 What is a data product?

Data product definition

Product, not project

What can be a data product?

5.3 Data product ownership

Data product owner

Data product owner responsibilities

An Agile DevOps team as a base for data product dev team

Data product owner and product owner

5.4 Conceptual architecture of a data product

External architecture view

Internal architecture view

5.5 Data product fundamental characteristics

Self-described data product

Introduction to metadata

Metadata as code

Data product metadata

Domain dataset metadata

Other kinds of metadata

5.6 Additional data product characteristics: FAIR and immutability

Findability

Accessibility

Interoperable

Reusable

Immutable

5.7 Data contracts and sharing agreements inside the data mesh

Data contracts and sharing agreements

Implementing data contracts and sharing agreements

6 Federated computational governance

6.1 Data governance in a nutshell

6.2 Benefits of data governance

Business value perspective

Data usability perspective

Data control perspective

6.3 Planning data governance outcomes

Hierarchy of data governance outcomes

Strategic-level outcomes

Tactical-level outcomes

Implementation-level outcomes

6.4 Federating data governance

Thinking of data governance in terms of sliders

Extreme ends of data governance models

Federated data governance model

Setting-up governance team operations

6.5 Making data governance computational

Making policies computational

Automating policy checks

7 The self-serve data platform

7.1 The MVP platform

Platform definition

Platform thinking

7.2 Improvements with X as a service

X as a service explained

X as a service applied

7.3 Improvements with platform architecture

Platform architecture explained

Platform architecture applied

7.4 Improvements for the data producers

Part 3. Infrastructure and technical architecture

8 Comparing self-serve data platforms

8.1 Data mesh on Google Cloud Platform

Self-serve data platform architecture

Identifying the components of the platform

Identifying the components of the data product

Workflows

Variations

Relation to data mesh ideas

GCP architecture summary

8.2 Data mesh on AWS

Self-serve data platform architecture

Identifying the components of the platform

Identifying the components of the data products

Workflows

Relation to data mesh ideas

Variations

AWS architecture summary

8.3 Data mesh on Databricks

Self-serve data platform architecture

Identifying the components of the platform

Identifying the components of the data product

Workflow considerations

Variations

Databricks architecture summary

8.4 Data mesh on Kafka

Self-serve data platform architecture

Identifying the components

Considerations

Kafka architecture summary

9 Solution architecture design

9.1 Capturing and understanding the current state

What is software architecture?

How to document architecture: The C4 model

9.2 Understanding architectural drivers of a data product design

Architectural drivers

Capturing architectural drivers for a data-product design

9.3 Designing the future architecture of a data product and related systems

Design session

File-based data product: Spreadsheet

From monolith and microservice to a data product

Exposing data for stream processing and batch processing

Appendix A.

Appendix B.

Appendix C.

Appendix D.

index

front matter

foreword

The data mesh is to data as agile is to software engineering, or as microservices are to architecture patterns. It will be an essential component of your future data strategy. Data Mesh in Action addresses both the technology of the data mesh and the methodology your organization can follow to implement it.

This book teleports you into the seat of the chief architect on a data mesh project. The authors will coach you through the chaotic process of your first data product. As you gain more and more of those components, your mesh will build itself. The authors’ collective experience drives this transformation. Your responsibility will be to pick, choose, and adapt this framework to your needs and organization.

The data mesh is based on four key principles: domain ownership, data as a product, federated computational governance, and the self-serve data platform. The book details the organizational impact of these principles, as well as their technology, at great length. Individually, all those principles are well-known to engineers and architects; the real (r)evolution of the data mesh is its ability to combine them and deliver a global approach to building modern data platforms.

In my more than 15 years of building hybrid data platforms, I have always been missing something. Whether it was due to the strict approach of ingesting data in a warehouse or the lack of governance of a lake, to name two popular patterns, there was always this feeling of “it ain’t gonna work.” The mesh is different. It does not focus solely on technology; it puts governance and quality at the center and allocates ownership to the real owner, not some central commanding and demanding group. As a result, with adequate self-service tools, the data mesh will liberate the forces of innovation in your organization. And that is what this book will help you achieve.

—Jean-Georges Perrin,

Intelligence platform lead at PayPal,

president and cofounder of AIDAUG,

and Lifetime IBM Champion

preface

Each one of us authors has experienced—at length and at different companies—the old way of doing data, usually through centralized data lakes and data warehouses in combination with a set of central teams organized inside an analytics function. The old way basically looked like this:

Multiple decentralized development teams have data that is accessible through storage systems like a shared drive, a decentralized database, a Representational State Transfer (REST) API, or any other interface.

One or more centralized data teams are tasked with collecting this data into one monolithic pot. This is either a data lake or a data warehouse.

The same set of teams is tasked with transforming this data into something useful.

Multiple decentralized analysts, development teams, or machine learning (ML) teams pick up that transformed data and convert it into value in the form of reports, recommendation systems, or anything else they can think of.

We learned the hard way that this concept has its limits, producing a bottleneck in terms of both technology and team capacities. We all saw companies struggling to get the flow from data to value to be as productive as the companies needed it to be. Then the data mesh and the ideas behind it appeared on the horizon.

The data mesh is a decentralization paradigm. It decentralizes the ownership of data, its transformation into information, and its serving. By these means, it aims to remove bottlenecks in the data value stream and so increase the value extracted from data.

The concept of the data mesh appeared on the stage in 2019 and has since lit not just the data world, but the whole technology world, on fire. The data mesh concept breaks with the current world of data, which usually treats data as a by-product of software components. This new approach turns the spotlight on data producers and gives them the responsibility to handle the data just as they would handle their software.

With this, the data mesh takes the same journey software components have taken, with microservices architectures and with the DevOps movement. It takes the same journey frontends are currently taking with microfrontends. And just as in these examples, we believe that the data mesh is the right approach to finally gain the flexibility to extract value from our data at scale, be that in business intelligence (BI), ML, or any other use case you can think of.

The data mesh concept is often referred to as a socio-technical paradigm shift: its core is not about technology but about the alignment of people, processes, and organizations. This significant complexity is why we wrote this book. However, we don’t just present the available theoretical knowledge that is out there; we focus on parts of the data mesh that are, in our experience, critical for successful implementation. We have organized those parts into a digestible resource to help you put a data mesh in action!

To guide you through the process, we’ve prepared hands-on examples with a lot of architecture sketches, describing various technologies, workshop techniques, team organization forms, and the like. After reading this book, you should be able to do the following:

Evaluate whether a data mesh will suit your organization’s business needs

Lay the groundwork for data mesh development

Develop a minimal data mesh to start your journey

Keep iteratively developing and expanding your data mesh

Don’t expect to find a lot of code in this book, other than a little JavaScript Object Notation (JSON) here and there. That’s because we truly believe the magic is not in the technology, but in the people, processes, and organizations. But, of course, you can expect to find a lot of technology inside this book in the form of deep architecture sketches with reference to various technologies and cloud providers, explanations, and blueprints inspired by multiple real-world examples.

That said, we don’t believe in a black-and-white implementation of the data mesh idea. This book will help you adjust the data mesh idea to your company by offering a lot of degrees of freedom, shortcuts, and a healthy level of pragmatism.

To tie together our experience, we will use an imaginary company called Messflix LLC, which resembles a lot of what we’ve seen out there in the data world. This company will be our go-to example as we go through the mess-to-mesh journey; however, since we also focus on making the data mesh adaptable to many types of companies, not just one, this is not the only example we utilize throughout the book. Later in this front matter, we provide a brief introduction to Messflix by taking a look at the data mess the company has gotten itself into.

acknowledgments

First, we would like to express our gratitude to the community engaged with data mesh development. Their discussions and openness about problems and challenges helped us broaden our perspectives and put our particular experiences into the generalized framework you’ll find in this book.

We owe our thanks to the wonderful people at Manning who made this book possible: Publisher Marjan Bace, Development Editor Ian Hough, and last but not least, Acquisitions Editor Andrew Waldron. Without their patience with our ever-evolving view on the data mesh, and their ability to make us synthesize it into a coherent view, we wouldn’t have been able to finish Data Mesh in Action in a form we could so proudly present to you. We would also like to thank the marketing, editorial, and production teams, without whom this book would gather dust in a Manning drawer.

A heartfelt thanks also to Michael Jensen and Al Krinker for technical reviews, which allowed us to further condense and clarify data mesh concepts.

We would also like to thank all our reviewers, who trusted us and invested their time in reading this book, even when no one was sure it would make it to publication. To Alain Couniot, Arnaud Castelltort, Arnaud Estève, Jean-Georges Perrin, Juan Gabriel Guzmán Guerra, Mary Anne Thygesen, Massimo dr, Matthias Busch, Mike Fowler, Milan Sarenac, Nathan B. Crocker, Pradeep Bhattiprolu, Rahul Jain, Richard Vaughan, Salil Athalye, Sampath Chaparala, Shiroshica Kulatilake, Simon Tschöke, Stefano Ongarello, Sumih Damodaran, Suriyanto Bongso, and Yi Wei, your suggestions helped make this a better book.

about this book

This book serves two purposes. First, it organizes and presents knowledge about the new socio-technological paradigm of the data mesh. Second, it will help you implement a data mesh. From considering whether the data mesh is a suitable solution for your organization, to laying the groundwork, to developing a minimum viable product (MVP), to implementing data mesh principles, this book provides the tools needed to get you well on your way on your data mesh journey.

Who should read this book?

The most general description of our reader is someone who is involved in extracting value from data. However, because that describes almost everyone in our modern economy, we’ll outline the benefits this book will bring to various audiences.

The first group is people involved in creating, managing, and utilizing data within companies that have the following:

High socio-technological complexity (e.g., big corporations)

Complex data use cases

Many and diverse data sources

This encompasses, but is not limited to, roles including data architects, data engineers, software architects, tech leads, and senior developers.

The more you feel these qualifiers apply to your business, the more likely it is that a data mesh could be a good solution. This book will help you understand data mesh concepts, including whose cooperation you need to secure, and what steps to take in both your organization and technical environment to move from a data mess to a data mesh.

Beyond that, as the data mesh is a company-wide transformation process, the book’s content will be directly useful to executive-level personnel, including the technical C-suite, engineering directors and managers, enterprise architects, chief and lead architects, and solution/program owners. This book will help you decide to what extent and level of priority you should shift your company’s data environment into a data mesh direction, and help you plan the change management.

How this book is organized: A road map

While the book is meant to be read linearly, it is broken into three main parts and allows you to skip sections. The first part is a quick and hands-on introduction, the second explains the four principles of the data mesh in detail, and the third tackles the technical side of things in detail as well as the complete enterprise journey.

Part 1: Foundations

The goal of the first part of the book is to familiarize you with the data mesh paradigm as quickly as possible. To do so, we first go through the basics of the data mesh and then get our hands dirty by building our first data mesh within a month.

Chapter 1: The what and why of the data mesh

This chapter gives the overview needed to put the rest of the book into the proper context, including why you might want to consider following the data mesh mindset shift as well as a short explanation of the four key principles detailed in part 2.

Chapter 2: Is a data mesh right for you?

This chapter provides you with the context of the data mesh implementation and the drivers to consider when deciding on the transformation. It helps you decide whether you want to start the journey now and to identify your place on the data maturity scale. This helps you to match your data mesh journey to your particular situation.

Chapter 3: Kickstart your data mesh MVP in a month

This chapter is a hands-on example of how to go about building an MVP. The Messflix MVP focuses a lot on the organizational challenges and stays light on the technology side of things, which an MVP should. The technology details will be picked up later. The chapter provides you with tools like stakeholder mappings and FAIR principles (findable, accessible, interoperable, reusable) to get you started.

Part 2: The four principles in practice

The goal of the second part of the book is to provide you with the tools to tackle the four principles of the data mesh so you can advance your data mesh beyond the first month.

Chapter 4: Domain ownership

This chapter is all about domains and business capabilities and how you can identify suitable owners for data inside a company. It provides you with a lot of workshop techniques, including domain storytelling.

Chapter 5: Data as a product

Data is often treated as a by-product. This chapter is about changing to a product perspective called data as a product. The chapter provides examples of data products from Messflix and explains in detail concepts like the data product canvas and data ports.

Chapter 6: Federated computational governance

This chapter tackles data governance in the data mesh context. Inside data meshes, this is called federated computational governance, because of the balance of central and distributed governance aspects as well as an automated execution needed to unfold the data mesh. This chapter contains a discussion of centralized versus decentralized aspects, hands-on examples from Messflix, and a guide for setting up a governance team.

Chapter 7: The self-serve data platform

The last chapter on data mesh principles covers the platform, the enabling technology that makes the data mesh work. The chapter works through three iterations on our data platform for Messflix and explains important concepts like platform thinking along with these examples.

Part 3: Infrastructure and technical architecture

The third part focuses on all things technical. We break out of the Messflix example to highlight various architectures and discuss multiple options for moving from your existing structure to a data mesh.

Chapter 8: Comparing self-serve data platforms

This chapter explains blueprints for data mesh platforms that fit various cloud providers as well as different sizes of companies.

Chapter 9: Solution architecture design

In this chapter, we focus on the migration from your existing system to various kinds of architectures step by step and component by component. We talk about data lakes, data warehouses, REST APIs, and more.

How to use this book

We don’t want to present just another theory of the data mesh. This book is more of a structured, collective diary of actions leading to data mesh development in various environments. The emphasis is on “actions leading to.” We arrived at the data mesh after a long and often painful journey through multiple other solutions. Over the years, we’ve been testing, researching, discussing, and, last but not least, failing a lot in the process. In this book, we share with you the summary of “I wish someone had told me earlier” insights. We hope you will be able to immediately put the information you’ll get out of it, well, in action.

Depending on your goal, there are a few areas you could focus on while reading this book. If your interest is purely informational, and your goal is to be able to explain the concepts to your team, your management, or your company, we recommend you put a lot of focus on chapters 1 and 2, which provide a quick overview, as well as the MVP presented in chapter 3. In addition, by reading through chapter 9 for a deeper dive into the reasons for this paradigm shift and a lighter look into part 2, you will be well equipped to explain the data mesh paradigm to someone else.

If you want to launch a larger initiative inside your company, you’ll need to be convincing. In that case, we recommend you take a deep dive into the entirety of chapter 9 and pay close attention to chapter 2, which offers insight into the question of whether you should start this journey at all. Chapter 3, presenting the full-scale data mesh MVP development, and chapter 1, offering a quick glance into a lightweight application of data mesh principles, will allow you to balance the big-picture view with notes on the requirements of quick implementation and getting results fast. Altogether, this should equip you with enough convincing material to get top-level buy-in.

If you’re interested in the technical side of things, like automated governance and the self-serve platform, chapters 5 to 8 will provide you with a lot of interesting content to dig through.

If you work inside a development team, we particularly recommend that you turn your attention to chapter 4. This chapter explains exactly what is broken in the current mode of thinking and should also help you advance your ways of working without ever touching the data mesh concept. Additionally, we recommend chapter 8, as it explains possible architecture alternatives for serving data from a development team’s point of view.

If you want to advance the way you work inside your data team, you could focus on chapters 3 and 4 to deeply understand the source of your current troubles. You could also focus on chapter 6 to understand what platform thinking in a data context means. Both could help you advance your ways of working without actually adopting a full data mesh approach inside the company.

We’re sure there are many more reasons for you to open this book; these are simply a few possible ways you could put it to use.

The Messflix case study

To help you conceptualize the practical aspects of putting a data mesh in action, we combined our experiences and merged them into a single data mesh journey of Messflix LLC.

Messflix, a movie- and TV-show streaming platform, just hit a wall. A data wall. The company has all the data in the world but complains about not even being able to build a proper recommendation system for its movies and shows. The competition seems to be able to get it done; in fact, the competition is famous for being the first movers in a lot of technology sectors.

Other companies in equally complex industries seem to be able to put their data to work. Messflix does work with data, and analysts are able to get some insights from it, but the organization’s leaders don’t feel like they can call themselves data driven.

The data science trial runs seem to all end in pretty prototypes with no clear business value. The data scientists tell their managers that it’s because the product team just doesn’t want to put these great prototypes on the roadmap, or, in another instance, because the data from the source is way too messy and inconsistent.

In short, Messflix hopefully sounds like your average business, which for some reason doesn’t feel like it’s able to let the right data flow to the right use cases. The data landscape, just like the technology landscape, has grown organically over time and has become quite complex.

The two key technology components of Messflix are its Messflix Streaming Platform and Hitchcock Movie Maker. The streaming platform does just what it says: enable subscribers to watch shows and movies. The movie maker is a set of tools helping the movie production teams choose good movie topics, themes, and content.

Additionally, Messflix has a data lake with an analytics platform on top of it taking data from everywhere. A few teams manage these components. Teams Orange and White together operate a few of the Hitchcock Movie Maker tools. Team Green is all about the subscriptions, the log-in processes, etc., and team Yellow is responsible for getting things on the screen inside the streaming platform. Figure 1 shows a rough architecture sketch of a few of these components. After it, we briefly discuss how data is currently handled at Messflix.

Figure 1 The main Messflix software components. The Data team handles a large variety of data sources and responsibilities.

The Data team gets data into the data warehouse from a few different places—for example, cost statements from the Hitchcock Movie Maker and subscriptions from the subscriptions service. The team also gets streaming data and subscription profiles from the data lake.

Then the Data team does some number crunching to transform this data into information for fraud analysis and business decisions.

Finally, this information is used by decentralized units to make those business decisions and for other use cases. Currently, this is a centralized workflow: the Data team sits in the middle.
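The three steps just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the source and dataset names follow the text, but the `CentralDataTeam` class, its methods, the sample rows, and the toy aggregates are all hypothetical and do not come from Messflix or any real system. The point it tries to show is that every dataset and every consumer passes through one team.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class CentralDataTeam:
    """Hypothetical stand-in for the Messflix Data team."""
    warehouse: dict = field(default_factory=dict)

    def ingest(self, source: str, dataset: str, rows: list) -> None:
        # Step 1: the Data team pulls every dataset into the warehouse itself.
        self.warehouse[(source, dataset)] = rows

    def transform(self) -> dict:
        # Step 2: "number crunching", reduced here to two toy aggregates.
        costs = self.warehouse.get(("hitchcock_movie_maker", "cost_statements"), [])
        streams = self.warehouse.get(("data_lake", "streaming_data"), [])
        minutes_per_user = defaultdict(int)
        for row in streams:
            minutes_per_user[row["user"]] += row["minutes"]
        return {
            "total_cost": sum(row["cost"] for row in costs),
            "minutes_per_user": dict(minutes_per_user),
        }

    def serve(self, consumer: str) -> dict:
        # Step 3: every consuming unit asks the same team for its numbers.
        return {"consumer": consumer, **self.transform()}


# Invented sample rows, for illustration only.
team = CentralDataTeam()
team.ingest("hitchcock_movie_maker", "cost_statements",
            [{"movie": "A", "cost": 100}, {"movie": "B", "cost": 250}])
team.ingest("subscriptions_service", "subscriptions", [{"user": 1}, {"user": 2}])
team.ingest("data_lake", "streaming_data",
            [{"user": 1, "minutes": 30}, {"user": 1, "minutes": 45},
             {"user": 2, "minutes": 10}])
team.ingest("data_lake", "subscription_profiles", [{"user": 1, "plan": "basic"}])

report = team.serve("fraud_analysis")
print(report)
```

In this sketch, adding a new source or a new consumer always means changing `CentralDataTeam`, which is the kind of bottleneck a centralized workflow creates.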

No matter where you’re coming from and where you want to go, you will find yourself somewhere along the Messflix journey. So let’s take one final look at the complete journey Messflix is going through.

No data journey is a simple straight line. Likewise, we don’t pretend that the Messflix journey is a simple linear progression of a series of steps. You’ll see different approaches in the chapters and ways to make the data mesh fit your company, even though the Messflix example illustrates one main thread to guide you.

You can follow that main thread used by Messflix throughout chapters 2 through 6 and chapter 9. Table 1 gives you an overview of the company’s stages along two dimensions of the journey to a data mesh. The first is the number of organizational units and teams affected. The second is the types of company responsibilities that are decentralized.

The core of the data mesh paradigm shift is the decentralization of the responsibility for data. But responsibility for data today is practically split into multiple parts, all of which need to be decentralized. Thus we highlight all four kinds of responsibility for data in table 1; each corresponds to one of the principles presented in part 2.

Table 1 The Messflix journey

Data Mesh in Action

brief contents

contents

Part 1. Foundations

Part 2. The four principles in practice

Part 3. Infrastructure and technical architecture
