UPDATED 09:00 EDT / JUNE 03 2024

BIG DATA

Snowflake catalog supports cross-engine access to Iceberg data

Snowflake Inc. is expanding its support for the Apache Iceberg open-source table format with today’s announcement of Polaris Catalog, described as a vendor-neutral, open catalog implementation for Iceberg and other data architectures.

Polaris is intended to provide centralized, cross-engine access to data. Snowflake said the catalog, which will be released to open source within the next 90 days, will give customers more flexibility and control over their data. It will have cross-engine read and write capability, strong security and interoperability with major cloud infrastructure companies, data lakehouse provider Dremio Corp. and streaming data processing firm Confluent Inc.

“Very critical to us is that we are focused on integrating with other query engines to give customers the choice to mix-and-match multiple query engines with read and write capability and without lock-in,” Christian Kleinerman, executive vice president of product at Snowflake, said in a statement ahead of Snowflake’s Data Cloud Summit this week in San Francisco. “Polaris Catalog extends Snowflake’s commitment to Apache Iceberg as the open standard of choice.”

Originally developed by Netflix Inc., Iceberg has surged in popularity, with 31% of respondents to Dremio’s 2024 State of the Data Lakehouse report saying they’re using Apache Iceberg now and 29% planning to adopt it in the next three years. Among its features are the ability to evolve table schemas over time without rewrites, flexible partitioning and a feature called “time travel” that allows queries to be run against historical data.

Polaris Catalog provides a single source for any engine to find and access an organization’s Iceberg tables with consistent security and interoperability, Snowflake said. It uses Iceberg’s representational state transfer (REST) protocol, so data can be accessed and retrieved from any engine that supports the Iceberg REST API, including Apache Flink, Apache Spark, Dremio, Python and the Trino open-source query engine.
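To make the cross-engine idea concrete, here is a minimal Python sketch of the kind of HTTP request a client might build to look up a table through an Iceberg REST catalog. It is an illustration under stated assumptions, not Snowflake’s implementation: the catalog URL, the token and the helper name load_table_request are hypothetical, and the path layout follows the load-table route of the Iceberg REST catalog specification as commonly documented. The request is only constructed, never sent, and only single-level namespaces are handled.

```python
from urllib.parse import quote
from urllib.request import Request


def load_table_request(base_url: str, namespace: str, table: str,
                       token: str, prefix: str = "") -> Request:
    """Build (but do not send) a load-table request for an Iceberg REST catalog.

    Hypothetical helper for illustration. Assumed path layout:
    {base}/v1/{prefix}/namespaces/{namespace}/tables/{table}
    """
    parts = [base_url.rstrip("/"), "v1"]
    if prefix:
        # Some catalogs route requests under a per-warehouse prefix.
        parts.append(quote(prefix, safe=""))
    parts += [
        "namespaces", quote(namespace, safe=""),
        "tables", quote(table, safe=""),
    ]
    return Request(
        "/".join(parts),
        headers={
            "Authorization": f"Bearer {token}",  # placeholder credential
            "Accept": "application/json",
        },
        method="GET",
    )


# Hypothetical endpoint and names; substitute a real catalog's values.
req = load_table_request(
    "https://catalog.example.com/api/catalog", "sales", "orders",
    token="PLACEHOLDER",
)
print(req.get_method(), req.full_url)
```

Because every engine addresses the same catalog endpoint, the table lookup does not depend on which engine later runs the query. In practice a client library for the engine in question would normally handle these requests rather than hand-built HTTP calls.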

Snowflake pointed to the recent expansion of its partnership with Microsoft Corp. as an example of the interoperability Iceberg enables. The two companies are working to enable bidirectional data access between Snowflake and the Microsoft Fabric data analytics platform. Snowflake said both organizations will use Polaris Catalog to allow users to access their data anywhere for artificial intelligence applications and model development.

The company has been expanding its support for Iceberg over the past two years. Last August it introduced Unified Iceberg Tables, which let Snowflake customers work with their own Iceberg data managed by Snowflake.
