UPDATED 16:03 EDT / JUNE 12 2022


Data mesh creator: Standards needed to avoid ‘potential for abuse and misuse’

Having fired the imagination of the data analytics industry with her concept of a data mesh, Zhamak Dehghani has taken the natural next step and written a book about it.

“Data Mesh: Delivering data-driven value at scale” covers not just the technology and architectural aspects of the distributed and federated alternative to data warehousing that Dehghani (pictured) envisions, but also the organizational changes required to reimagine data as a shared responsibility and product.

Dehghani, who is a principal consultant at Thoughtworks Holding Inc., took time out recently for a catch-up interview with SiliconANGLE.

How has the reception to the book been?

Amazing. It’s O’Reilly’s third best-selling book since publication in March. That’s no credit to me; it’s an indication of interest in the topic and a lack of deep knowledge in the area. People want to try it.

You said in the preface that you wrote this book a bit earlier than you would have liked. Why?

I felt I was forced to write this early because the concept went viral very quickly and I know it’s going to be unrecognizable before long.

I would have liked to point to a North Star that has already been reached, a great data mesh execution that has fully manifested and has a paved path to get there. But we’re not there. We have a rugged path, a vision, and a lot of people on the journey but there are still lessons to be learned. We are moving from the technologically advanced, happy-to-play-with leading edge to enterprises and even federal organizations. We haven’t yet reached the peak before the trough of disillusionment.

What will the trough of disillusionment look like?

There is a mad rush where we all have to have data mesh and quickly. I think organizations are going to realize that this is a transformation and takes time. The ones who don’t have the skills or patience will fall off early.

What are some of the challenges early adopters have encountered?

Bootstrapping the right way is the main challenge, along with ambiguity and figuring out how to do it right. The early adopters think they have their heads around the concept and are now migrating away from an approach they’ve built over years. One of the challenges is knowing organizationally where to start and who should lead the effort.

Who are the best leaders of a data mesh initiative?

It’s the visionaries in the organization. It may be a combination of the [chief technology officer] or chief data officer and the CEO. Often they are newly appointed because previous approaches haven’t worked. But this isn’t something they can do alone. They need to engage a wider team.

What surprises have you encountered as organizations begin to build data meshes?

People have found many different ways to execute. Everyone implements it slightly differently because they want to leverage the millions in investments they’ve made in technology over the years. There’s no blueprint for how to build it.

As data mesh is co-opted by vendors and adapted to their own needs, do you worry that the concept will become corrupted?

If there’s one thing that keeps me up at night it’s the potential for abuse and misuse of the concept. Data mesh was introduced as an approach to solving some blatantly obvious pain points. The market showed interest and vendors had to respond in a short span of time. They already had a nucleus of a strategy that fit their last paradigm, so they extended their existing products by adopting the language of a data mesh. They’re saying “Buy my product and you have data mesh.” Some of those products won’t be able to scale and it can become a big mess.

What has to happen for a more structured definition of data mesh to be created?

The standards in place are sufficient for where we are now, but if we want data to be shared throughout the organization, some standards are necessary.

I envisage a federated approach with a set of companies that attack different areas of standardization. For example, data sharing in a mesh model is very different from a data lake or data warehouse model. We need to agree on what the contract looks like for sharing to occur.

Have you been approached about creating a data mesh foundation?

I’ve been approached but not by the right people.

What’s your opinion of the concept of the “citizen data scientist?”

The idea of anyone in an organization, regardless of where their skills fall on the spectrum, being empowered to hypothesize and test is a great one to strive for. However, for hypotheses to become sustainable and resilient, you need engineering and development.

I have yet to see low-code/no-code platforms result in resilient and long-term solutions that can be mature and adaptable. If you’re talking about building an end-to-end business solution with citizen data scientists, that’s a bit far-fetched.

You recently told SiliconANGLE’s David Vellante, “We are moving from reason-based, logical algorithmic decision-making to model-based computation and decision-making where we exploit the patterns and signals within the data.” What is the difference between the two?

There are a few niche areas where that transformation has already happened, such as risk analysis and fraud detection. They react to behavior rather than using if/then/else reasoning. But to be widely adopted, the next generation of developers will have to be familiar with the statistical and mathematical models needed to think that way.

For example, one retail client has a rules-based system that tells them when and where to move products between warehouses and shops. But someone has to think of all the “ifs” and “elses.” Alternatively, there could be a mathematical model of optimization that’s trained with data to figure out the optimized route for any particular item. You exploit the model for the best results.
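The contrast Dehghani draws can be sketched in a few lines of Python. This is a minimal illustration, not the retail client's system: the function names, the stock thresholds and the sales figures are all hypothetical, and a simple least-squares trend line stands in for the "mathematical model of optimization" she describes.

```python
def rule_based_transfer(shop_stock, season):
    """Hand-written rules: someone has to think of every 'if' and 'else'.

    The thresholds and quantities here are invented for illustration.
    """
    if season == "holiday":
        if shop_stock < 50:
            return 100
        return 20
    if shop_stock < 20:
        return 40
    return 0


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def model_based_transfer(shop_stock, weekly_sales):
    """Learn the sales trend from past data and ship the forecast shortfall.

    No rules are written by hand; the decision follows from the data.
    """
    weeks = list(range(len(weekly_sales)))
    slope, intercept = fit_line(weeks, weekly_sales)
    forecast = slope * len(weekly_sales) + intercept  # next week's demand
    return max(0, round(forecast - shop_stock))


# Hypothetical history: units sold per week at one shop.
history = [10, 20, 30, 40]
rules_answer = rule_based_transfer(35, "regular")
model_answer = model_based_transfer(35, history)
```

With this made-up history, the rules see a shop with adequate stock and ship nothing, while the fitted trend expects demand to outrun stock and ships the difference. Changing the behavior of the first function means editing code; changing the second means feeding it different data.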

Photo: ThoughtWorks
