UPDATED 11:45 EDT / DECEMBER 03 2024

AWS expands Amazon S3 with features to support Apache Iceberg and metadata management

Amazon Web Services Inc. announced at its re:Invent conference today significant updates to its Amazon Simple Storage Service that are designed to make S3 the first cloud object store with fully managed support for Apache Iceberg.

Native Apache Iceberg support in S3 is intended to deliver faster analytics and make it easier to store and manage tabular data at any scale. Other new features include automatically generated, queryable metadata, which simplifies data discovery and helps customers unlock the value of their data in S3.

Amazon S3 Tables makes S3 the first cloud object store with built-in Apache Iceberg table support, introducing a specialized bucket type optimized for storing and querying tabular data. According to AWS, S3 Tables delivers up to three times faster query performance and up to 10 times higher transactions per second than general-purpose S3 buckets, along with automated maintenance to simplify analytics workloads.

The release of Amazon S3 Tables seeks to address the issue of managing large-scale tabular data, which customers typically organize using Apache Parquet, a file format optimized for data queries. As Parquet emerges as one of the fastest-growing data types in Amazon S3, AWS customers increasingly rely on open table formats such as Apache Iceberg to efficiently organize, update and query their data across billions of files.

Though Iceberg has become the leading open table format for managing Parquet files, AWS argues that its complexity often requires dedicated teams to handle maintenance tasks such as data compaction and access control. Building and running those systems is also resource-intensive and costly, which makes scaling harder and diverts valuable expertise from strategic analytics efforts.

Amazon S3 Tables addresses these issues by providing a purpose-built solution for managing Apache Iceberg tables in data lakes. Optimized for analytics workloads, S3 Tables delivers faster query performance and higher transactions per second than general-purpose S3 buckets. The service automates key maintenance tasks such as data compaction and snapshot management to continuously optimize query performance and storage costs as data lakes grow.

Customers using S3 Tables can create dedicated table buckets that streamline the storage and querying of tabular data in fully managed Iceberg tables. The service also offers advanced Iceberg features like row-level transactions, queryable snapshots via time travel and schema evolution. Additionally, table-level access controls provide robust security that allows customers to define and manage permissions easily.

Announced alongside S3 Tables today was Amazon S3 Metadata, another new service that streamlines data discovery by automatically capturing queryable object metadata, along with custom metadata supplied through object tags. S3 Metadata stores that metadata in S3 Tables to accelerate analytics across data lakes.

Amazon S3 Metadata automatically generates queryable object metadata in near real time, simplifying data discovery and improving customers' understanding of their data. Doing so eliminates the need for customers to build and maintain complex metadata systems, allowing them to query, locate and use data for business analytics, real-time inference and other applications.

By capturing system-defined details such as object size and source and integrating metadata into S3 Tables, S3 Metadata ensures an up-to-date view of data as objects are added or removed.
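As a rough illustration of what an up-to-date, queryable metadata view means in practice, the sketch below uses Python's built-in sqlite3 module as a local stand-in. It is not the S3 Metadata service or its schema: the table name, column names, object keys and helper functions are all hypothetical, chosen only to show metadata rows tracking objects as they are added and removed.

```python
import sqlite3

# Local stand-in for a metadata table. The table and column names are
# hypothetical and are not taken from the S3 Metadata schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE object_metadata ("
    "key TEXT PRIMARY KEY, size_bytes INTEGER, source TEXT)"
)

def record_put(key, size_bytes, source):
    """Add or refresh the metadata row when an object is written."""
    conn.execute(
        "INSERT OR REPLACE INTO object_metadata VALUES (?, ?, ?)",
        (key, size_bytes, source),
    )

def record_delete(key):
    """Drop the metadata row when an object is removed."""
    conn.execute("DELETE FROM object_metadata WHERE key = ?", (key,))

def large_objects(min_bytes):
    """Return the keys of objects at least min_bytes in size, sorted by key."""
    rows = conn.execute(
        "SELECT key FROM object_metadata WHERE size_bytes >= ? ORDER BY key",
        (min_bytes,),
    )
    return [row[0] for row in rows]

# Hypothetical objects: two are added and kept, one is added and then removed.
record_put("logs/a.parquet", 5_000_000, "ingest-job")
record_put("logs/b.parquet", 120_000_000, "ingest-job")
record_put("raw/c.csv", 300_000_000, "upload")
record_delete("raw/c.csv")

print(large_objects(100_000_000))  # ['logs/b.parquet']
```

The deleted object no longer appears in the query result, which is the behavior the article describes: the view reflects the current set of objects without a separate catalog to maintain.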

Using the service, customers can also enrich their data with custom metadata, using object tags to annotate objects with business-specific details such as product SKUs, transaction IDs or content ratings. The metadata, which can be queried with simple SQL, enables efficient data preparation for analytics, artificial intelligence and machine learning workflows, and storage optimization. The capabilities also support diverse tasks such as fine-tuning foundation models to integrate with data warehouse workflows and performing retrieval-augmented generation.
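For readers who want to see what annotating an object with tags might look like, here is a minimal sketch that builds a tag payload in the `TagSet` shape accepted by S3 object tagging. The helper name `build_tagging` and the sample SKU and transaction values are hypothetical. The limits checked are the standard S3 object tag limits (10 tags per object, 128-character keys, 256-character values). The code only constructs and validates the payload and does not call AWS.

```python
# Standard S3 object tag limits.
MAX_TAGS = 10
MAX_KEY_LEN = 128
MAX_VALUE_LEN = 256

def build_tagging(details):
    """Turn a dict of business details into an S3-style tagging payload.

    build_tagging is a hypothetical helper written for this example.
    """
    if len(details) > MAX_TAGS:
        raise ValueError(f"S3 allows at most {MAX_TAGS} tags per object")
    tag_set = []
    for key, value in details.items():
        if not key or len(key) > MAX_KEY_LEN:
            raise ValueError(f"tag key must be 1-{MAX_KEY_LEN} characters: {key!r}")
        if len(value) > MAX_VALUE_LEN:
            raise ValueError(f"tag value exceeds {MAX_VALUE_LEN} characters: {key!r}")
        tag_set.append({"Key": key, "Value": value})
    return {"TagSet": tag_set}

# Hypothetical business details for one object.
tagging = build_tagging({"product_sku": "SKU-1042", "transaction_id": "TX-778"})
print(tagging)
```

With boto3 installed and credentials configured, a payload like this could be passed as the `Tagging` argument to the S3 client's `put_object_tagging` call, together with a bucket name and object key of your own.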

“We have seen the rapid rise of tabular data and, increasingly, customers want to query across tables, improve query performance and understand and organize troves of data so they can easily find exactly what they need,” Andy Warfield, vice president of storage and distinguished engineer at AWS, said in a statement. “AWS S3 Tables and S3 Metadata remove the overhead of organizing and operating table and metadata stores on top of objects, so customers can shift their focus back to building with their data.”

“Our perspective on the new S3 Tables buckets and S3 Metadata is very exciting to the platform engineering teams required to manage these massive open-source data lakes based on Apache Iceberg,” said Rob Strechay, managing director of theCUBE Research. “But the proof will be in the pudding of how this impacts the ‘compute engines’ that manage those tables.”

Image: SiliconANGLE/Ideogram
