UPDATED 16:56 EST / AUGUST 27 2024

Tomer Shiran, Dremio Corp. - AWS re:Invent 2022

BIG DATA

Dremio says it has dramatically improved query performance on Iceberg data lakes

Data lakehouse company Dremio Corp. today announced a set of advanced analytics performance capabilities that it says significantly speed queries on Apache Iceberg tables while reducing the need for user intervention.

The two major new features are Live Reflections and Result Set Caching. Dremio Reflections are a feature of the company’s data lake engine that accelerates query performance by creating optimized, precomputed data representations. They’re similar in concept to materialized views but are more flexible and integrated with Dremio’s architecture. As a result, they enable faster and more interactive querying of large datasets stored in data lakes without data movement or duplication.

Live Reflections ensure that materialized views and aggregations are automatically updated whenever changes are made to base Iceberg tables. Users can accelerate queries without maintenance overhead, and the system recommends the Reflections that provide the best value and system-wide performance.

“It used to be that you had to figure out which Reflections you wanted to create and then manage the refresh cycle,” said Dremio Founder Tomer Shiran (pictured). “You had to logically figure out what aggregations you needed, how to sort the table and how frequently to refresh. We’ve now solved both of those problems.”

Recommended Reflections essentially monitor activity across the entire data lake and learn what queries are being used most often and how they can be accelerated. Any updates to a table automatically refresh all the downstream Reflections incrementally, even if joins cross multiple tables.
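To make the idea of incremental refresh concrete, here is a minimal Python sketch of a precomputed aggregate that is updated only with newly appended rows instead of being rebuilt from the full table. It is an illustration under stated assumptions, not Dremio's implementation: the class `AggregateReflection`, its methods and the sample data are all hypothetical, and the sketch handles appends only.

```python
from collections import defaultdict


class AggregateReflection:
    """Hypothetical precomputed SUM/COUNT per group (not a Dremio API)."""

    def __init__(self):
        self.totals = defaultdict(float)
        self.counts = defaultdict(int)
        self.snapshot_id = None  # table version this aggregate reflects

    def apply_appended_rows(self, rows, new_snapshot_id):
        # Fold in only the rows added since the last refresh.
        # Assumption: the change is append-only; deletes or updates
        # would need extra bookkeeping that this sketch omits.
        for group, amount in rows:
            self.totals[group] += amount
            self.counts[group] += 1
        self.snapshot_id = new_snapshot_id

    def average(self, group):
        return self.totals[group] / self.counts[group]


reflection = AggregateReflection()
reflection.apply_appended_rows([("east", 10.0), ("west", 4.0)], new_snapshot_id=1)
# A later commit adds one row; only that row is processed.
reflection.apply_appended_rows([("east", 20.0)], new_snapshot_id=2)
print(reflection.average("east"))
```

The cost of each refresh in this sketch scales with the number of new rows rather than the size of the table, which is the general reason incremental updates tend to be cheap.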

Shiran said Apache Iceberg’s embedded change-tracking features make this possible. “You can note that the version of this table that was used for this query is the same as the version currently being queried,” he said. “I don’t have to worry that something may have changed. I know with certainty that it won’t return a different result than what the user expects.”

Result Set Caching can accelerate query responses up to 28-fold across all data sources by storing frequently accessed query results rather than just the queries, Dremio claimed. “People often query the same data,” Shiran said. “The optimizer takes the query plan, and asks if it can use one of the existing Reflections. The user isn’t aware of it.”

Storing query results instead of queries in the database consumes more storage, but “object storage is cheap,” Shiran said. “Compute is expensive.”
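The link between table versions and cached results can be sketched in a few lines of Python. The example below keys each cached result on the query text plus the snapshot ID of every table it reads, so a result is reused only while those versions are unchanged. This is a simplified illustration, not Dremio's design: `ResultSetCache`, `get_or_compute`, the `run_query` callback and the snapshot numbers are hypothetical, and stale entries are never evicted here.

```python
class ResultSetCache:
    """Hypothetical result cache keyed by query text and table snapshots."""

    def __init__(self):
        self._entries = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(sql, snapshot_ids):
        # snapshot_ids maps table name -> snapshot ID at query time.
        return (sql.strip(), tuple(sorted(snapshot_ids.items())))

    def get_or_compute(self, sql, snapshot_ids, run_query):
        key = self._key(sql, snapshot_ids)
        if key in self._entries:
            self.hits += 1
            return self._entries[key]
        # Either the query is new or a table has a newer snapshot.
        self.misses += 1
        result = run_query(sql)
        self._entries[key] = result
        return result


cache = ResultSetCache()
run = lambda sql: [("east", 30.0)]  # stand-in for real query execution
sql = "SELECT region, SUM(amount) FROM sales GROUP BY region"

cache.get_or_compute(sql, {"sales": 101}, run)  # miss: first execution
cache.get_or_compute(sql, {"sales": 101}, run)  # hit: same table version
cache.get_or_compute(sql, {"sales": 102}, run)  # miss: table has changed
print(cache.hits, cache.misses)
```

Because the snapshot ID is part of the key, a changed table never matches an old entry, so the sketch cannot return a result computed against outdated data.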

A new merge-on-read feature speeds Iceberg table writes and ingestion operations by up to 85%. Notification-based auto ingest keeps tables fresh by monitoring object storage for new files and ingesting them automatically when a notification is received.

“It’s all incremental and live, unlike in the past when you had to manually schedule an operation,” Shiran said. “Now you just insert the records automatically, and because all the updates are incremental, they’re cheap.”
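For readers unfamiliar with merge-on-read, the following toy Python model shows the general technique: a delete is recorded as a small marker rather than by rewriting the data file, and the markers are applied when the table is scanned. It is loosely modeled on the positional delete files described in the Iceberg table specification, but everything here is a hypothetical in-memory illustration, including the `MergeOnReadTable` class and its methods, and it says nothing about how Dremio implements the feature.

```python
from dataclasses import dataclass, field


@dataclass
class MergeOnReadTable:
    """Toy in-memory model of merge-on-read (hypothetical, for illustration)."""

    data_files: list = field(default_factory=list)  # each file is a list of rows
    deletes: set = field(default_factory=set)       # (file_index, row_position)

    def append(self, rows):
        # Writes only add new files; existing files are never rewritten.
        self.data_files.append(list(rows))

    def delete_where(self, predicate):
        # Record positions of matching rows instead of rewriting the file.
        for file_index, rows in enumerate(self.data_files):
            for position, row in enumerate(rows):
                if predicate(row):
                    self.deletes.add((file_index, position))

    def scan(self):
        # Readers merge data files with delete markers at query time.
        for file_index, rows in enumerate(self.data_files):
            for position, row in enumerate(rows):
                if (file_index, position) not in self.deletes:
                    yield row


table = MergeOnReadTable()
table.append([{"id": 1, "status": "new"}, {"id": 2, "status": "new"}])
# An update is expressed as a delete marker plus a newly appended row.
table.delete_where(lambda row: row["id"] == 1)
table.append([{"id": 1, "status": "shipped"}])
print(list(table.scan()))
```

The trade-off in this model is that writes stay small while reads do a little extra work to filter out deleted positions.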

Photo: SiliconANGLE

