UPDATED 08:00 EDT / NOVEMBER 28 2023

BIG DATA

Starburst simplifies analytics on data lakes

Starburst Data Inc., the commercial developer of a distributed query engine based on the open-source Trino project, today announced a set of new features intended to make it easier for organizations to build data-intensive applications on top of data lakes.

The enhancements provide for unified data ingestion, data governance and data sharing on a single platform.

“This is all about building interactive data-driven applications,” said Starburst co-founder and Chief Executive Justin Borgman. “We see customers building embedded analytics into their own applications and are increasingly using data lakes to store data from multiple sources.”

Among the new features, which are part of a standard update cycle, is support for real-time analytics with streaming ingestion. Customers can leverage open-source Apache Kafka or the commercial version of Kafka from Confluent Inc. to hydrate a data lake in near-real time to ensure that applications have the most up-to-date information.

New data arriving in the lake is stored in the Apache Iceberg open-source table format. Starburst also supports Apache Parquet and the Delta Lake format created by Databricks Inc., but “we think Iceberg is going to win this battle,” Borgman said. “We see Iceberg being embraced by a broad ecosystem whereas we only see Delta being embraced by Databricks.”

Automated classification

Machine learning models in Starburst’s Gravity cross-cloud data access and analytics layer automatically apply classifications and access policies for certain categories and classes. Gravity can identify personal information and restrict access automatically.

Automated data maintenance abstracts away common management tasks such as data compaction, which merges many small files into fewer, larger ones, and data vacuuming, which removes obsolete files that tables no longer reference. Starburst said the capability enables users to maintain warehouse-like performance without adding manual processes.

Gravity can also be used to package data sets into shareable and secured data products regardless of the source, format or cloud provider, Borgman said. “Our approach is data source-agnostic,” he said. “You can curate a data set for sharing that can span any data source you have, such as a table from Oracle, Hadoop, [Amazon Web Services Inc.’s] S3 and Redshift and stitch them together into a data product that can reside anywhere.” Data doesn’t physically move and access is enforced by role-based controls.

Starburst is also adding some basic self-service analytics features to Galaxy, such as text-to-SQL processing, in an effort to shift some exploratory analytics from data teams to business users. “You can say ‘show sales from last month’ and it will create a well-formed SQL query,” Borgman said. “You can also give it a SQL query and it will tell you what the query does.” The technology leverages a fine-tuned version of OpenAI LP’s ChatGPT generative artificial intelligence engine.

Starburst said the new features will be available on AWS’ fastest hardware, including Graviton3, and will integrate with other AWS tools such as QuickSight analytics and the Bedrock service for training foundational AI models.

Photo: photopin

