UPDATED 13:23 EDT / NOVEMBER 07 2023

BIG DATA

AnalystANGLE: dbt Labs’ vision and strategy: a bellwether for data platforms

Introduction

In the ever-evolving landscape of data platforms, dbt Labs Inc. stands out as a pivotal player, driving the modern data stack’s transformation and offering innovative solutions that bridge the gap between data engineering and data analytics. We see lines in the sand being drawn as organizations build data features and products on top of data lakes and warehouses. The fights are over whether to do extract, transform and load (ETL) or extract, load and transform (ELT), whether to use one warehouse or several, and what role artificial intelligence plays in all of this.

At the company’s Coalesce 2023 conference, we sat down with founder and Chief Executive Tristan Handy and Vice President of Product Luis Maldonado to gain insights into the company’s vision and strategy. We delved into topics such as its investments in both open source and commercial offerings, enhancements to dbt Cloud, the significance of the semantic layer, and its role in fostering a robust data ecosystem.

Dbt Labs: a central figure in the data ecosystem

The conversation kicked off with Handy emphasizing dbt Labs’ role in facilitating seamless interactions between various data teams. While the company may not consider itself the sole API to data platforms, dbt Labs is crucial in bridging the gaps between teams within larger organizations. These teams, often divided by responsibilities such as finance and marketing, must be able to rely on each other’s data, and on data engineering teams, without disruptions and conflicts.

Handy explains that dbt Labs aims to help these teams create data interfaces and data contracts, much like those championed by Amazon.com Inc.’s two-pizza teams and service-oriented architecture, where each team owns its code base and is responsible for providing stable interfaces to other teams. This approach ensures trust and minimizes disruptions, fostering collaboration and efficient data utilization.


Dbt Core and dbt Cloud: scaling the investment

When discussing the transition from dbt Core to dbt Cloud, Handy emphasized that dbt Cloud is the ideal platform for scaling investments in dbt. He acknowledged that users have varying preferences regarding writing code and managing development environments. Dbt Cloud addresses this by catering to power users through the newly released command-line interface and newcomers via an intuitive integrated development environment and browser-based interface. This flexibility makes it easier for larger organizations to adopt dbt, accommodating users with diverse needs.

One of the key themes discussed during the interview was the continuous enhancement of dbt Cloud. Maldonado pointed out that dbt Cloud is evolving to become the connective tissue in the modern data stack. We agree that this could be true, as dbt Labs’ customer base shifts from individuals running dbt Core on a laptop to organizations looking for more scale, visibility and governance across dbt projects. Maldonado also highlighted the delicate balance between open-source and commercial offerings at dbt Labs.

“We’ve got a thriving open-source community, and we’re committed to investing in both the open-source and commercial sides of our business,” he said. “We want to ensure that our open-source users continue to benefit from our innovations while providing added value to our commercial customers.”

At the same time, he added, “we’re investing heavily in dbt Cloud to make it a central hub that seamlessly integrates with various data warehouses and data lakes. Our goal is to simplify the data analytics process and empower organizations to build modern data platforms.”

Indeed, it is a delicate balance that the dbt Labs team will continue to navigate. Handy stated from the stage that dbt Core would remain under the Apache license, a jab at other companies that have felt compelled to switch to a Business Source License, which carries significant restrictions.

Semantic layer and data contracts

A key highlight of the discussions with Handy and Maldonado was the semantic layer introduced by dbt Labs. This semantic layer acts as a set of guardrails, ensuring data governance and providing clear data contracts between teams. By defining these data contracts, dbt Labs helps organizations avoid conflicts and disruptions when different teams interact with shared data. It provides a structured framework for data features and products to be built on top of dbt Cloud.

Data contracts and semantic layers aren’t new, but their integration within the modern data stack is. Dbt resides at the crucial transform layer, which gives it insight into data model changes. Imagine an organization defining all attributes of the digital representation of its “customer”; dbt sits where this transformation occurs, giving it visibility on both ends. This is vital for governance and consistency in data products, which rely on reusable data features and a single source of truth. With dbt, coupled with a semantic layer tool, business intelligence and visualization maintain uniformity across data products.
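To make the idea of a data contract concrete, here is a minimal sketch in plain Python. It is not dbt's implementation or API: the `ColumnSpec` class, the `CUSTOMER_CONTRACT` list and the `check_contract` function are hypothetical names we introduce for illustration only. The sketch assumes a contract is simply a list of promised columns with a type and a nullability rule, and that a producing team checks its output rows against that list before downstream teams consume them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnSpec:
    """One column promised by the contract (hypothetical structure)."""
    name: str
    dtype: type
    nullable: bool = False


# Hypothetical contract for a shared "customer" model.
CUSTOMER_CONTRACT = [
    ColumnSpec("customer_id", int),
    ColumnSpec("email", str),
    ColumnSpec("lifetime_value", float, nullable=True),
]


def check_contract(rows, contract):
    """Return a list of readable violations; an empty list means the rows conform."""
    violations = []
    for i, row in enumerate(rows):
        for col in contract:
            # A promised column must be present in every row.
            if col.name not in row:
                violations.append(f"row {i}: missing column '{col.name}'")
                continue
            value = row[col.name]
            if value is None:
                # Nulls are allowed only where the contract says so.
                if not col.nullable:
                    violations.append(f"row {i}: '{col.name}' is null")
            elif not isinstance(value, col.dtype):
                violations.append(
                    f"row {i}: '{col.name}' expected {col.dtype.__name__}, "
                    f"got {type(value).__name__}"
                )
    return violations


# Illustrative sample data, not taken from any real system.
good_rows = [{"customer_id": 1, "email": "a@example.com", "lifetime_value": 120.5}]
bad_rows = [{"customer_id": "1", "email": None}]

print(check_contract(good_rows, CUSTOMER_CONTRACT))
for problem in check_contract(bad_rows, CUSTOMER_CONTRACT):
    print(problem)
```

The point of the sketch is the division of responsibility: the team that owns the "customer" model publishes the contract, and any change that breaks it surfaces as a violation on the producer's side rather than as a broken dashboard downstream.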

The interview shed light on the importance of the semantic layer in dbt Labs’ strategy. Maldonado emphasized that the recently announced Semantic Layer, along with dbt Cloud’s native integrations to the BI layer, such as Tableau, Google Sheets and Hex, simplifies data access for business analysts. One missing piece, which may surprise many, is Microsoft Excel as a data source and target. Excel is still a critical heritage application in the data world.

“We want to make data more accessible and empower business analysts to harness the full potential of data without the need for complex coding or transformations,” Maldonado said.


Dbt Labs and AI integration

Handy touched on the potential of integrating AI into dbt workflows, thanks to its code-first approach. The ease of integrating AI into dbt’s code-first environment opens up possibilities for AI models to generate dbt code. That could simplify complex data transformations and accelerate data engineering tasks, making dbt even more accessible to a broader audience.

This could be as simple as using existing large language models to write dbt code, as Handy has done with OpenAI LP’s ChatGPT. Security, specifically which queries are exposed to a public model versus specific language models, or SLMs, much like co-pilots, will surely be a discussion point in the future. This is a place to watch in dbt Labs’ strategy to expand beyond mere data modeling in the transformation layer.

Company sustainability and growth

In his keynote, Handy showcased dbt Labs’ impressive growth and sustainability. He discussed the challenges of transitioning from an open-source, community-driven project to a sustainable commercial venture. He emphasized the importance of weaving open-source ideals into a commercial story, ensuring that dbt remains rooted in its community while thriving as a business.

Handy praised the passionate dbt community and noted the cultural changes within the company to embrace a commercial mindset without compromising on open-source values. The company’s growth has allowed it to invest in dbt Core and dbt Cloud, balancing innovation and user experience.

Expanding the ecosystem

Maldonado discussed dbt Labs’ commitment to expanding its ecosystem by integrating various data analytics tools and platforms. He mentioned that dbt Labs is actively working on building connections with popular data warehouses and data lake solutions, including AWS, Google, Databricks, Snowflake, Starburst and Microsoft Azure, which is soon to be available in private preview. Scaling the data platform ecosystem is a key competitive advantage that dbt Cloud will focus on, and it is essential to the company’s strategy.

We anticipate those platforms will include other major players, such as MongoDB and Couchbase.

Our ANGLE

Looking ahead, dbt Labs will continue to iterate on the innovations introduced this year. With new offerings such as dbt Mesh and Explorer, as well as the continued development of the semantic layer, which came from the acquisition of Transform, dbt Labs is enhancing its dbt Cloud platform and the data ecosystem. As it aims to make data engineering and analytics accessible to a broader range of users while scaling with the needs of larger organizations, it will run into more co-opetition with vendors it has partnered with in the past.

The semantic layer, dbt Mesh and Explorer features are definitive indications of this. Ecosystem partners will have to look at where dbt is present in an organization and where it is not, positioning themselves as a broader solution rather than treating dbt alone as their market. We have been saying for a while that there are too many data tools and that the modern data stack is too complex. Organizations will continue to have more than one data platform to meet different organizational use cases.

The rise of generative AI and the movement from LLMs to SLMs, which segment data for IP protection and governance, show that data silos will persist and that organizations will not rely solely on the transformation and semantic layers provided by data platform vendors. As organizations continue to create data product groups as part of their product strategies, SLMs and LLMs will increasingly be part of the solutions for the data apps that are built.

We call this “Uber for all,” where organizations continue to build digital representations of their companies, which will speed up with the introduction of AI as data features of data products. Organizations will look to standardize the toolsets above the data platform layer, leading to dbt Labs being in a great position to be a key competitor in that market.

Dbt Labs is still early in its journey from a community-driven project to a thriving commercial venture. It’s in a very different space than HashiCorp, meaning it can create enough value in dbt Cloud to allow dbt Core to stay open source. As dbt Labs continues to innovate and grow, it remains a central figure in the evolving data ecosystem, empowering organizations to harness the power of their data.

All statements made regarding companies or securities are strictly beliefs, points of view and opinions held by SiliconANGLE Media, other guests on theCUBE and guest writers. Such statements are not recommendations by these individuals to buy, sell or hold any security. The content presented does not constitute investment advice and should not be used as the basis for any investment decision. You and only you are responsible for your investment decisions.

Disclosure: Many of the companies cited in AnalystANGLE are sponsors of theCUBE and/or clients of Wikibon. None of these firms or other companies have any editorial control over or advanced viewing of what’s published in Breaking Analysis.

