UPDATED 22:25 EST / AUGUST 02 2023

BIG DATA

What is a data product?

We have had data on the brain for the past few months. Building on the discussion of the brewing battle for data platform dominance, we thought it would be good to unpack what a data product is.

We are basing this definition, which we will continue to evolve, on conversations with Shinji Kim, founder and chief executive of Select Star Inc.; with Lior Gavish, co-founder and chief technology officer of Monte Carlo; and with other prominent data companies. It also draws on personal experience and on discussions with data teams at companies building data products in financial services, online gaming, bio-life sciences and pharma, and tech software.

We can define a data product as a purpose-built entity that goes beyond a simple data set. A data product needs to be designed to deliver value to end users or customers by putting data into their hands for meaningful insights and decision-making. In past decades, a data product would have been in the form of quarterly reports or binders prepared manually and submitted physically to executives. However, with the rapid advancements in technology, the landscape has evolved significantly.

In the modern context, data products come in various forms and are accessible to a wide range of users within organizations. They can take the form of interactive dashboards in tools such as Tableau or Looker, a recommendation engine behind a retail website, or large language models that interact with users directly.

Data products may also be used by data scientists for specific analyses, or integrated into applications that automate business processes such as marketing spend optimization (for example, return on advertising spend), pricing analysis and risk assessment. With the proliferation of data products within many companies and their rapidly increasing use, it becomes essential to address scale, reliability and usability issues.

Data product managers, a relatively new and growing role inside companies, need to consider how to expose these products in a repeatable, trusted and easily discoverable manner, similar to how traditional physical products are made known to consumers in the real world. Ensuring that data products are findable, well-documented and efficiently maintained is crucial for fostering effective data-driven decision-making across the organization.

By adopting a product management mindset and applying principles of software engineering, organizations can optimize their data products to meet the diverse needs of users and deliver value at scale. This involves treating data as a valuable asset, implementing data contracts, cataloging data products and establishing robust data modeling practices to facilitate efficient collaboration between data teams and business stakeholders.

Here is the hierarchy of data products as we see it (pictured). Let’s break it down:

  1. Data sets: These are raw or structured collections of data that primarily reflect a state or fact. Data sets are fundamental building blocks but lack the purposeful attributes that define a product.
  2. Data features: Data features are higher-level constructs that add purpose to data sets. They may include specific data transformations or manipulations that make the data more valuable and usable for certain tasks.
  3. Data products: Data products take data features to the next level by combining them with models or algorithms to achieve a specific goal. Examples of data products include recommendation engines, fraud detection models and personalization models. These products can be applied in various contexts and augmented or iterated upon to enhance their capabilities.
  4. Data apps: These are collections of data products and data features that are integrated into an application. A data app may have various components working together to provide insights or drive specific decisions based on the data. It may involve data analysis, visualizations and data-driven functionality to serve end users effectively.
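To make the four levels concrete, here is a minimal Python sketch of how they might nest. Every class, field and function name in it is hypothetical and chosen only to mirror the list above; a trivial "most popular item" rule stands in for a real recommendation model.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class DataSet:
    """Level 1: a collection of records that reflects a state or fact."""
    name: str
    rows: List[Dict[str, Any]]


@dataclass
class DataFeature:
    """Level 2: a purposeful transformation of a data set."""
    name: str
    source: DataSet
    transform: Callable[[List[Dict[str, Any]]], Any]

    def compute(self) -> Any:
        return self.transform(self.source.rows)


@dataclass
class DataProduct:
    """Level 3: features combined with a model or algorithm for one goal."""
    name: str
    features: List[DataFeature]
    model: Callable[[Dict[str, Any]], Any]

    def run(self) -> Any:
        inputs = {f.name: f.compute() for f in self.features}
        return self.model(inputs)


@dataclass
class DataApp:
    """Level 4: data products integrated into an application."""
    name: str
    products: List[DataProduct]

    def render(self) -> Dict[str, Any]:
        return {p.name: p.run() for p in self.products}


# Hypothetical example data: three purchase records.
purchases = DataSet("purchases", [
    {"user": "a", "item": "hat"},
    {"user": "b", "item": "hat"},
    {"user": "b", "item": "scarf"},
])


def count_by_item(rows: List[Dict[str, Any]]) -> Dict[str, int]:
    """Count how many times each item was purchased."""
    counts: Dict[str, int] = {}
    for row in rows:
        counts[row["item"]] = counts.get(row["item"], 0) + 1
    return counts


def recommend_most_popular(inputs: Dict[str, Any]) -> str:
    """Stand-in 'model': recommend the item with the highest count."""
    counts = inputs["item_popularity"]
    return max(counts, key=counts.get)


popularity = DataFeature("item_popularity", purchases, count_by_item)
recommender = DataProduct("recommender", [popularity], recommend_most_popular)
storefront = DataApp("storefront", [recommender])

result = storefront.render()
print(result)
```

The point of the sketch is the direction of dependency: the app knows only about products, products know only about features, and features know only about data sets, so each level can be owned and iterated on separately.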

It’s important to note that although data products can be powerful and versatile, they may not always stand alone as individual products. In fact, we see that data products are built on top of other data products and features. Data products often power data applications, where their capabilities are exposed and utilized by end users or other systems.

Incorporating the perspective of data product managers, who adopt a product management mindset and apply software engineering principles, can help organizations treat data as a valuable asset. By maintaining data contracts, cataloging data and managing data requests through well-defined processes, data product managers can ensure the efficient and effective use of data products within the organization. This is especially important when data apps are built hierarchically on multiple data products and data features that other data teams may own.
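As an illustration of what a data contract can look like in practice, the following Python sketch assumes a contract is simply a named owner plus an expected schema that downstream teams can check records against. The `DataContract` class, its fields and the sample records are all hypothetical; real contracts typically also cover things like freshness and semantics, which are left out here.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass(frozen=True)
class DataContract:
    """Hypothetical contract: who owns a data product and what shape it promises."""
    product: str
    owner: str
    schema: Dict[str, type]

    def violations(self, rows: List[Dict[str, Any]]) -> List[str]:
        """Return a human-readable list of rows that break the schema."""
        problems: List[str] = []
        for i, row in enumerate(rows):
            for column, expected in self.schema.items():
                if column not in row:
                    problems.append(f"row {i}: missing '{column}'")
                elif not isinstance(row[column], expected):
                    actual = type(row[column]).__name__
                    problems.append(
                        f"row {i}: '{column}' is {actual}, expected {expected.__name__}"
                    )
        return problems


# Hypothetical contract published by the team that owns the feature.
contract = DataContract(
    product="item_popularity",
    owner="merchandising-data",
    schema={"item": str, "count": int},
)

# Hypothetical records a consuming team received from that feature.
rows = [
    {"item": "hat", "count": 2},
    {"item": "scarf", "count": "1"},  # wrong type
    {"item": "glove"},                # missing column
]

problems = contract.violations(rows)
for problem in problems:
    print(f"{contract.product} (owner: {contract.owner}): {problem}")
```

Keeping the owner in the contract itself means a consuming team that hits a violation knows which team to raise it with, which matters most when an app depends on products owned elsewhere.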

The challenge lies in bridging the gap between data teams and business stakeholders. Data product managers and analysts must work closely with business stakeholders to understand their requirements and domain-level knowledge. On the other hand, business stakeholders should familiarize themselves with data models and how data analysis is conducted to facilitate effective collaboration and decision-making.

In conclusion, we will continue digging into data products. We define them as purpose-built entities that go beyond raw data sets, incorporating specific features and attributes for a particular goal. They form the foundation for data apps, which are collections of data products and features integrated into applications to provide valuable insights and drive decision-making processes. By adopting a product management mindset and promoting collaboration between data teams and business stakeholders, organizations can effectively leverage data products to achieve their objectives.

Feel free to reach out and stay connected through robs@siliconangle.com, read @realstrech on Twitter, and comment on our LinkedIn posts.

Image: Rob Strechay
