UPDATED 16:16 EDT / JUNE 05 2024

Sanjeev Mohan, principal at SanjMo, and George Gilbert, senior analyst at theCUBE Research discuss data governance with theCUBE’s Dave Vellante. BIG DATA

Balancing metadata and security: Snowflake leads the charge for data governance

The data cloud industry is facing a pivotal moment as it grapples with the challenges of standardizing data governance across various compute engines.

With the rise of open table formats, such as Apache Iceberg, companies are striving to balance technical metadata management with strong security measures. Snowflake Inc. and Databricks Inc. are at the forefront of this battle, each developing unique strategies to enhance data governance and interoperability.


“Snowflake actually provides this Polaris Catalog, which is a technical metadata catalog,” said Sanjeev Mohan (pictured, right), principal at SanjMo. “If you want to do [role-based access control], row-level security, column-level security … you need Horizon. If you don’t have a Horizon, then every single compute engine, like if it’s Spark or Dremio or Trino or Starburst, they have to figure out how to apply data access governance onto Iceberg. That’s why Horizon is so important.”

Mohan, along with George Gilbert (left), senior analyst at theCUBE Research, spoke with Dave Vellante, chief analyst at theCUBE Research, at Data Cloud Summit, during an exclusive broadcast on theCUBE, SiliconANGLE Media’s livestreaming studio. They discussed the challenges in standardizing governance across various compute engines and the critical role of interoperability and governance in shaping the industry’s future. (* Disclosure below.)

The challenge of governance in open table formats

Snowflake’s strategy involves open-sourcing Polaris for technical metadata while relying on Horizon for advanced governance features. Meanwhile, Databricks’ acquisition of Tabular highlights the industry’s focus on enhancing policy engines and governance. These developments underscore the ongoing battle for supremacy in the data cloud market.

“The whole point is you can’t separate compute from storage, as you guys were saying, until you put the security, but the security now it’s no longer enough,” Gilbert said, concerning the Tabular acquisition. “The next level is can you apply this policy? I think what Databricks is buying is not just the table interoperability, but someone who’s building a policy engine.”

The analysts also discussed the complexities of governing open table formats such as Apache Iceberg. The basic technical metadata provided by a catalog such as Polaris falls short of comprehensive governance, which leaves a critical gap in current standards, according to Mohan.

“If you look at Iceberg, Iceberg Table Spec, that’s what it is, it’s a table format, does not specify certain things like permission,” Mohan said. “There are no permissions, there’s no security. Security has to be applied above in a different catalog.”
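The separation Mohan describes can be pictured with a minimal Python sketch. It is illustrative only and does not use the API of Polaris, Horizon, Unity or Iceberg; the names `TableMetadata`, `Policy` and `GovernanceLayer`, along with the sample table and roles, are hypothetical. The table metadata carries no permissions, and a separate layer applies column-level and row-level rules on behalf of whichever engine reads the data.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

Row = dict[str, object]


@dataclass
class TableMetadata:
    """Technical metadata only (hypothetical): no permissions live here."""
    name: str
    columns: list[str]
    location: str


@dataclass
class Policy:
    """Access rules kept outside the table format (hypothetical structure)."""
    allowed_columns: set[str]                           # column-level security
    row_filter: Optional[Callable[[Row], bool]] = None  # row-level security


@dataclass
class GovernanceLayer:
    """Maps (role, table name) to a policy and denies access by default."""
    policies: dict[tuple[str, str], Policy] = field(default_factory=dict)

    def grant(self, role: str, table_name: str, policy: Policy) -> None:
        self.policies[(role, table_name)] = policy

    def read(self, role: str, table: TableMetadata, rows: list[Row]) -> list[Row]:
        policy = self.policies.get((role, table.name))
        if policy is None:
            raise PermissionError(f"{role} has no access to {table.name}")
        # Keep only the columns this role may see, in table order.
        visible = [c for c in table.columns if c in policy.allowed_columns]
        return [
            {c: row[c] for c in visible}
            for row in rows
            if policy.row_filter is None or policy.row_filter(row)
        ]


# Assumed sample data: any engine would call the same layer instead of
# implementing its own access checks.
orders = TableMetadata("orders", ["id", "region", "card_number"],
                       "s3://example-bucket/orders")
governance = GovernanceLayer()
governance.grant("analyst", "orders",
                 Policy({"id", "region"}, lambda row: row["region"] == "EU"))

sample_rows = [
    {"id": 1, "region": "EU", "card_number": "4111"},
    {"id": 2, "region": "US", "card_number": "4222"},
]
analyst_view = governance.read("analyst", orders, sample_rows)
```

In this sketch the policy sits in one place, so a Spark-style or Trino-style reader would get the same filtered view without reimplementing the rules, which is the gap the analysts say a governance catalog is meant to fill.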

The battle for supremacy in data governance

The competition between Snowflake and Databricks is heating up, with both companies vying to provide the most comprehensive and user-friendly data governance solutions. The acquisition of Tabular by Databricks was positioned as an AI announcement, but its real significance lies in its potential to enhance data governance, according to Gilbert.

“The next step is to take the security parts and attach it to the table. That’s what the Tabular guys were building. That’s what Starburst guys are building, and that’s what today is in Unity and Horizon,” Gilbert said. “If the security is attached to the data, then you have interoperability for the data policy and the governance. But then you go to the value add for all the lineage and the observability and the quality.”

Attaching security to the data rather than to a single engine makes interoperability central to both vendors’ plans. By decoupling technical metadata from governance features, both companies aim to offer more flexible and scalable solutions, Gilbert pointed out.

“If you wanted to read and write to the open table formats, you had to take the entire Databricks catalog,” he said. “By getting everyone to agree that Polaris or some other catalog is enough, then they can break that link.”

Here’s the complete video interview, part of SiliconANGLE’s and theCUBE Research’s coverage of Data Cloud Summit:

(* Disclosure: TheCUBE is a paid media partner for Data Cloud Summit. Neither Snowflake Inc., the sponsor of theCUBE’s event coverage, nor other sponsors have editorial control over content on theCUBE or SiliconANGLE.)

Photo: SiliconANGLE
