

Google LLC’s cloud unit today introduced new features for its BigLake and BigQuery services, which enable companies to run analyses on large datasets.
Both updates focus on an open-source technology called Apache Iceberg, which suggests that Google is making the announcements to get out ahead of two rivals in data management. Snowflake Inc. is holding its user and developer conference next week and Databricks Inc. will follow suit with its own event, both in San Francisco.
Iceberg organizes data in tables, which are collections of spreadsheet-like rows and columns. It collects information on how those tables change over time, which is useful for many data management tasks. It also provides fast query times and SQL support.
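To illustrate what tracking table changes over time makes possible, here is a minimal sketch in Python. It is a toy model, not the Apache Iceberg API: the `SnapshotTable` class and its methods are hypothetical names invented for this example. It assumes that every write produces a new immutable snapshot, so earlier versions of the table remain readable.

```python
from typing import List, Optional, Tuple

Row = Tuple[int, str]  # (row id, value); a hypothetical two-column table


class SnapshotTable:
    """Toy table that keeps every past version (not the Iceberg API)."""

    def __init__(self) -> None:
        # Snapshot 0 is the empty table; each write appends a new snapshot.
        self._snapshots: List[Tuple[Row, ...]] = [()]

    def append(self, rows: List[Row]) -> int:
        """Add rows and return the id of the snapshot that was created."""
        self._snapshots.append(self._snapshots[-1] + tuple(rows))
        return len(self._snapshots) - 1

    def delete(self, row_id: int) -> int:
        """Drop rows with the given id and return the new snapshot id."""
        kept = tuple(r for r in self._snapshots[-1] if r[0] != row_id)
        self._snapshots.append(kept)
        return len(self._snapshots) - 1

    def read(self, snapshot_id: Optional[int] = None) -> List[Row]:
        """Read the latest version, or an older one if an id is given."""
        if snapshot_id is None:
            snapshot_id = len(self._snapshots) - 1
        return list(self._snapshots[snapshot_id])


table = SnapshotTable()
first = table.append([(1, "a"), (2, "b")])
second = table.delete(1)
latest_rows = table.read()        # current state of the table
earlier_rows = table.read(first)  # state before the delete
```

Because old snapshots are never modified, reading an earlier version of the table is a lookup rather than a reconstruction. That is the property that makes auditing and rollback straightforward in this toy model.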
BigLake allows companies to store and analyze Iceberg-based datasets. The search and cloud giant says BigLake is now better supported by the storage infrastructure that underpins Google Cloud. As a result, BigLake can now automatically move infrequently accessed data to slower, less expensive storage hardware.
Loading data into BigLake from external sources is now easier as well. Google has added features that automate some of the work involved in moving records to the service from Hadoop and systems that use the Delta data format. Delta is an open-source alternative to Iceberg. Customers with stringent cybersecurity requirements can secure the data they load into BigLake using their own encryption keys.
Going forward, BigLake will work better with the other services in Google Cloud. AlloyDB for PostgreSQL, one of the company’s managed database services, can now read and write BigLake-managed datasets. The BigQuery data warehouse can now likewise run queries on those datasets.
The improved BigLake integration is one of several enhancements rolling out to BigQuery. The data warehouse is also receiving a set of features designed to speed up user queries. The feature collection is known as the BigQuery advanced runtime.
The BigQuery advanced runtime can perform data pruning, the process of excluding records that a query doesn’t need, to make queries more efficient. It then vectorizes the remaining records to further boost query speeds.
Vectorization makes it possible to carry out data operations on multiple data points at once rather than one after another. According to Google, data updates, deletions and short-duration queries will all become faster.
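The two ideas can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Google's implementation: the block layout with minimum and maximum statistics, and the function names, are hypothetical. Pure Python also cannot issue the hardware instructions a real vectorized engine relies on, so the batched version only shows the shape of the technique, which is to work on a whole block of values per step instead of one record per step.

```python
from typing import List, Tuple

# Hypothetical layout: a column split into blocks, each stored with its
# minimum and maximum value so that a query can skip blocks it cannot match.
Block = Tuple[int, int, List[int]]  # (min value, max value, values)


def make_blocks(values: List[int], size: int) -> List[Block]:
    """Split a column into fixed-size blocks and record min/max per block."""
    blocks = []
    for start in range(0, len(values), size):
        chunk = values[start:start + size]
        blocks.append((min(chunk), max(chunk), chunk))
    return blocks


def prune(blocks: List[Block], threshold: int) -> List[Block]:
    """Keep only blocks whose maximum could satisfy `value > threshold`."""
    return [block for block in blocks if block[1] > threshold]


def sum_row_at_a_time(blocks: List[Block], threshold: int) -> int:
    """Baseline: visit every record in every block, one after another."""
    total = 0
    for _, _, chunk in blocks:
        for value in chunk:
            if value > threshold:
                total += value
    return total


def sum_batched(blocks: List[Block], threshold: int) -> int:
    """Prune first, then filter and sum each surviving block as one batch."""
    return sum(
        sum(value for value in chunk if value > threshold)
        for _, _, chunk in prune(blocks, threshold)
    )


column = list(range(100))
blocks = make_blocks(column, size=10)
surviving = prune(blocks, threshold=79)
baseline_total = sum_row_at_a_time(blocks, threshold=79)
batched_total = sum_batched(blocks, threshold=79)
```

In this example, only two of the ten blocks survive pruning, and both approaches return the same total. The saving comes from the eight blocks that are never read.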
“The BigQuery advanced runtime (Preview), can automatically accelerate analytical workloads, using enhanced vectorization and short query optimized mode, without requiring any user action or code changes,” Google Cloud executives Andi Gutmans and Yasmeen Ahmad wrote in a blog post today.
For developers, the company is rolling out an enhanced version of BigQuery Notebooks. It’s a coding tool that can be used to find patterns in datasets, visualize them and perform related analytics tasks. Google is adding programming assistance features powered by its Gemini large language models.
The BigLake and BigQuery upgrades are rolling out alongside a tool called Dataplex Universal Catalog. It aggregates metadata, information about a company’s records that describes details such as when they were created. The tool can collect metadata from BigLake-native Iceberg tables, BigQuery datasets and other sources.
“AI automates metadata curation, infers hidden relationships between data elements, proactively recommends insights from data backed by complex queries, and enables semantic search with natural language,” Gutmans and Ahmad wrote.
Apache Spark is also a focus of today’s updates. Google is rolling out a technology called the Lightning Engine that it says can speed up the open-source analytics platform. The technology does so using a combination of vectorization, caching and optimizations to data shuffling, the step in which Spark redistributes records across the nodes of a cluster. Google says that Lightning Engine can provide a more than threefold increase in processing speeds.