UPDATED 09:00 EDT / JUNE 28 2023

Databricks’ Delta Lake 3.0 bridges compatibility gaps with Apache Iceberg and Hudi

Databricks Inc. today released the latest version of Delta Lake, the storage framework that it donated to open source a year ago.

Version 3.0 adds support for the Apache Iceberg and Apache Hudi data lake platforms through a universal format that allows data stored in Delta Lake to be read as if it were stored in either of them. The move is intended to simplify the often complicated integration work required to build a lakehouse, an open, hybrid architecture that combines elements of a data warehouse and a data lake.

The market for lakehouses is crowded and fast-growing. Although no lakehouse-specific forecasts could be found, SNS Insider Pvt Ltd. estimates that the data lake market was valued at just over $12 billion last year and is expected to grow more than 21% annually, to $57 billion by 2030. Databricks said Delta Lake is the most widely used lakehouse storage format in the world, with more than 1 billion downloads per year.
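As a rough check on those market figures, the sketch below compounds the starting value at the stated rate. It assumes that "last year" means 2022 and that growth compounds annually through 2030; neither assumption is spelled out in the estimate, and the inputs are the rounded numbers quoted above, so the result only approximates the forecast.

```python
# Sanity check of the cited forecast. Assumptions (not from the source):
# the base year is 2022 and growth compounds annually through 2030.
BASE_YEAR = 2022
TARGET_YEAR = 2030
base_value_bn = 12.0   # "just over $12 billion"
annual_growth = 0.21   # "more than 21% annually"
forecast_bn = 57.0     # cited 2030 figure

years = TARGET_YEAR - BASE_YEAR

# Value reached by compounding the rounded inputs.
projected_bn = base_value_bn * (1 + annual_growth) ** years

# Growth rate implied by the cited start and end values.
implied_rate = (forecast_bn / base_value_bn) ** (1 / years) - 1

print(f"Projected {TARGET_YEAR} value: ${projected_bn:.1f}B")
print(f"Implied annual growth: {implied_rate:.1%}")
```

With these rounded inputs the projection lands slightly below $57 billion and the implied rate slightly above 21%, which fits the "just over" and "more than" qualifiers in the estimate.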

Metadata mismatch

Iceberg and Hudi are two of the most popular open-source lakehouse options. Like Delta Lake, both work with the Apache Parquet open-source format, but “they all generate different metadata,” said Databricks Marketing Vice President Joel Minnick. “How you interact with that metadata affects the type of connectors in the engines that connect to those platforms. We could end up in a format war that slows down the adoption of the lakehouse because we’ve created different ecosystems.”

Delta Lake 3.0 can generate metadata automatically in all three formats and understands the source used by connectors. “By building for Delta Lake, you can build for every platform,” Minnick said.

Data stored in Delta Lake can now be read as if it were Iceberg or Hudi. Databricks’ UniForm universal format automatically generates the metadata needed for Iceberg or Hudi, so manual conversion between the formats isn’t needed.
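The idea can be illustrated with a toy model: one set of Parquet data files described by several metadata layouts. The sketch below is purely illustrative. The dictionary shapes and the function names (`delta_view`, `iceberg_view`, `hudi_view`, `referenced_paths`) are hypothetical simplifications, not the actual Delta Lake, Iceberg or Hudi metadata specifications, and not any UniForm API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataFile:
    """One Parquet file in the table (hypothetical, simplified)."""
    path: str
    num_rows: int


def delta_view(files):
    # Simplified stand-in for a transaction log of "add" actions.
    return {"format": "delta", "log": [{"add": f.path} for f in files]}


def iceberg_view(files):
    # Simplified stand-in for a snapshot whose manifest lists data files.
    return {"format": "iceberg", "snapshot": {"manifest": [f.path for f in files]}}


def hudi_view(files):
    # Simplified stand-in for a single commit on a timeline.
    return {"format": "hudi", "timeline": [{"commit": [f.path for f in files]}]}


def referenced_paths(view):
    """Extract the data file paths from any of the toy metadata views."""
    if view["format"] == "delta":
        return {action["add"] for action in view["log"]}
    if view["format"] == "iceberg":
        return set(view["snapshot"]["manifest"])
    return {p for commit in view["timeline"] for p in commit["commit"]}


files = [DataFile("part-0000.parquet", 1000), DataFile("part-0001.parquet", 850)]
views = [delta_view(files), iceberg_view(files), hudi_view(files)]

# Every view points at the same files: only the metadata layout differs.
same_files = all(referenced_paths(v) == {f.path for f in files} for v in views)
print(same_files)
```

In this toy version the data files are written once and each reader picks the metadata view it understands, which is the property that makes manual conversion unnecessary.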

A component called Delta Kernel provides a single stable application programming interface for connectors that bridge different data management engines. Connectors that are built against a core Delta library and that implement Delta specifications don’t need to be updated with each new version or protocol change, the company said.

A new layout called Liquid Clustering provides cost-efficient data clustering as data grows to help ensure that read and write performance requirements are met, Databricks said.

Delta Lake also supports Delta Sharing, an open protocol for secure data exchange that the company says is used by more than 6,000 data consumers.

Databricks is advocating for the Hudi and Iceberg communities to adopt its approach. “Customers use all of these different systems and they are asking for ways to make the translation between all these different systems much easier,” Minnick said. “By making the format effectively irrelevant, adoption of the lakehouse can be rapidly accelerated.”

Photo: Niklas Tschöpe/Wikimedia Commons

