Strategic shift: Snowflake co-founder Benoit Dageville on embracing open standards with open-source Polaris
Snowflake Inc. is navigating a pivotal shift in the data landscape with its ambitious move to evolve the Snowflake Data Platform into an AI data cloud.
The company’s strategic dilemma has centered on balancing solutions integration with the adoption of open standards. Addressing the increasing significance of enterprise metadata, Snowflake has taken decisive action by making Polaris open-source, aiming to foster greater interoperability and governance within its ecosystem.
“It has full governance at that layer, at the metadata layer,” said Benoit Dageville (pictured), co-founder and president of product at Snowflake. “The other governance it has is we support all the three clouds. So Amazon, Google storage, Azure blob storage. We also vend credentials for storage, so not only is it logical access control policies, but also at the storage layer, when you access files on Iceberg tables.”
Dageville spoke with theCUBE Research’s Dave Vellante at the Supercloud 7: Get Ready for the Next Data Platform event, during an exclusive broadcast on theCUBE, SiliconANGLE Media’s livestreaming studio. They discussed Polaris’ role in an expanding Snowflake ecosystem as the vision of an AI data cloud with a thriving marketplace of applications is realized.
Polaris on the Snowflake Data Platform: The AI data cloud emerges
Polaris going open source has fueled considerable momentum for Snowflake. Now available under the Apache 2.0 license, Polaris offers developers the ability to access and contribute to its codebase freely. This move reinforces the commitment to open-source principles and fosters a collaborative approach to enhancing the product, according to Dageville.
“Why we are embracing Iceberg and open data and Polaris, pushing that all the way to the catalog, is because our customers, or a class of our customers, is asking us for interoperability between tools,” he said. “Many tools are accessing the same data, and they want Snowflake to be one of these tools or one of the systems accessing this open data. They ask us to come to their data versus their data coming to us, and they want to avoid vendor lock-in.”
In tandem with the open-source release, Polaris has been integrated into the Snowflake Data Platform. Users can now create Polaris accounts and deploy Polaris within Snowflake, demonstrating the product’s strength and seamless functionality. This integration positions Polaris as a “first-class citizen” within the Snowflake ecosystem, operating as an independent catalog that can be utilized without relying on the entirety of Snowflake’s suite, according to Dageville.
Polaris brings many features to the table, particularly in metadata governance. It supports role-based access control, allowing administrators to “create catalog roles and give privilege to whoever [has] access [to] this catalog. It has full governance at that layer, at the metadata layer, the other governance … we support all the three clouds,” he added.
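To make the catalog-role idea concrete, here is a minimal Python sketch of role-based access control at the catalog level. It is a toy model only, not Polaris' actual API: the `Catalog` and `CatalogRole` classes, the method names, the privilege strings ("read", "write") and the table name are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogRole:
    """A named bundle of (privilege, table) pairs. Hypothetical model."""
    name: str
    privileges: set = field(default_factory=set)


@dataclass
class Catalog:
    """Toy catalog that tracks roles and which principals hold them."""
    name: str
    roles: dict = field(default_factory=dict)   # role name -> CatalogRole
    grants: dict = field(default_factory=dict)  # principal -> set of role names

    def create_role(self, role_name):
        role = CatalogRole(role_name)
        self.roles[role_name] = role
        return role

    def grant_privilege(self, role_name, privilege, table):
        # Attach a privilege on one table to an existing role.
        self.roles[role_name].privileges.add((privilege, table))

    def assign_role(self, principal, role_name):
        if role_name not in self.roles:
            raise KeyError(f"unknown catalog role: {role_name}")
        self.grants.setdefault(principal, set()).add(role_name)

    def is_allowed(self, principal, privilege, table):
        # Access is granted if any role held by the principal carries the privilege.
        return any(
            (privilege, table) in self.roles[role_name].privileges
            for role_name in self.grants.get(principal, ())
        )


# Illustrative usage with made-up names.
catalog = Catalog("analytics")
catalog.create_role("reader")
catalog.grant_privilege("reader", "read", "sales.orders")
catalog.assign_role("alice", "reader")

print(catalog.is_allowed("alice", "read", "sales.orders"))
print(catalog.is_allowed("alice", "write", "sales.orders"))
print(catalog.is_allowed("bob", "read", "sales.orders"))
```

The point of the sketch is that privileges attach to roles rather than to individual users, so an administrator changes access by editing role membership instead of touching each table.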
Open data environments still need strong governance, and that means balancing accessibility against control, an area where Snowflake has particular expertise, according to Dageville.
“You need to provide governance on top of these tables, which is very important,” he said. “We want to bring governance to the open and it’s kind of a contradictory world. On one hand you want to open your data, but on the other hand, you want to have full governance and control. We have unique expertise in the governance area, and we think that we can have people benefit from this expertise and put this expertise toward the open data.”
Horizon vs. Polaris and market trends
Polaris should be distinguished from Horizon, which offers more granular access controls and governance features than Polaris currently provides. For instance, Horizon allows precise policies at the table level, including masking columns and restricting a group to a subset of the data, according to Dageville.
“You can define a rule-based policy to say, for example, that group can only see this subset of data and you can mask columns,” he said. “So all these things are not in Polaris and they’re not in Polaris [not] because we didn’t want to open source that part. [It’s] just because we first need to agree between all the tools that this is the way to define this policy.”
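As a rough illustration of the kind of fine-grained policy Dageville describes, the Python sketch below restricts each group to a subset of rows and masks selected columns. It is not Horizon's syntax or implementation; the `apply_policy` function, the policy layout, the group names and the sample data are all hypothetical.

```python
MASK = "***"


def apply_policy(rows, group, policy):
    """Return the rows a group may see, with its masked columns hidden.

    `policy` maps a group name to a dict with a `row_filter` predicate and a
    set of `masked_columns`. Groups with no entry see nothing. Hypothetical
    layout, for illustration only.
    """
    rule = policy.get(group)
    if rule is None:
        return []
    visible = [row for row in rows if rule["row_filter"](row)]
    return [
        {col: (MASK if col in rule["masked_columns"] else val) for col, val in row.items()}
        for row in visible
    ]


# Made-up sample data and policy.
rows = [
    {"region": "EU", "customer": "Acme", "card_number": "4111-0000"},
    {"region": "US", "customer": "Globex", "card_number": "4222-0000"},
]
policy = {
    "eu_analysts": {
        "row_filter": lambda row: row["region"] == "EU",
        "masked_columns": {"card_number"},
    },
}

eu_view = apply_policy(rows, "eu_analysts", policy)
print(eu_view)
```

A policy like this only works across engines if every tool reading the table interprets it the same way, which is the agreement Dageville says has to come first.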
The rest of the industry has moved quickly as well. Competitors such as Databricks Inc. have made significant moves of their own, open-sourcing the Unity Catalog and acquiring Tabular. That these developments came within a short span highlights how competitive and fast-moving data management and governance have become. Snowflake’s approach is to meet customers where they are, giving them the tools to manage open data while ensuring strong governance, Dageville explained.
“A critical part of why our name is the AI data cloud is because a cloud is somewhere where applications are running,” he said. “This is the difference with a platform where applications are running on top of a platform. We want to be a cloud where applications are directly executing in our cloud. We started with the data and we wanted to support all types of data, from structured data to completely unstructured data.”
Here’s the complete video interview, part of SiliconANGLE’s and theCUBE Research’s coverage of the Supercloud 7: Get Ready for the Next Data Platform event:
Photo: SiliconANGLE