UPDATED 01:00 EST / APRIL 06 2022

BIG DATA

Google Cloud boosts big-data accessibility with new innovations at Data Summit

Google Cloud early Wednesday announced a number of new products and innovations aimed at making data-driven insights and digital transformation more accessible to companies of all sizes.

The new products and services, unveiled at Google Cloud Data Summit, include BigLake and Spanner change streams to unify customer data while ensuring it’s delivered in real time, Vertex AI Workbench and Model Registry for companies looking to feed artificial intelligence, and a new unified business intelligence experience.

The backdrop for the raft of new announcements is the rising importance of, and difficulty in, using reams of data in business. “The data-to-value gap is increasing,” Gerrit Kazmaier, vice president and general manager of databases, data analytics and Looker business intelligence at Google Cloud, told reporters in a briefing.

He said that’s because of outdated data architectures and ideas, requiring several “paradigm changes.” They include every company realizing it’s a big-data company, understanding that data workloads and types of data continue to expand and realizing that the impact of data reaches everyone around an organization, such as customers, partners and suppliers. “We believe in limitless data,” Kazmaier said. “So we need a limitless Data Cloud.”

Among the new services to address this situation is BigLake (pictured), a data lake storage engine now available in preview that aims to unify data lakes and warehouses. In a blog post, Kazmaier explained that managing data across disparate lakes and warehouses creates silos and increases the risk and cost of storing data, especially in cases where that information needs to be moved.

With BigLake, companies can unify their data warehouses and lakes and analyze data in place, without concerning themselves with its underlying storage format or system. By doing this, Google is removing the need to duplicate or move data. The service provides fine-grained access controls with an application programming interface that spans Google Cloud and open-source processing engines such as Apache Spark, plus open file formats such as Parquet.

Spanner change streams, meanwhile, is an extension to Google Cloud Spanner, a distributed SQL database management and storage service. It will be available soon and will further reduce data limits for customers, Kazmaier said, by allowing them to track changes within their Spanner databases in real time.

“Spanner change streams tracks Spanner inserts, updates, and deletes to stream the changes in real time across a customer’s entire Spanner database,” Kazmaier explained in the blog post. “This ensures customers always have access to the freshest data as they can easily replicate changes from Spanner to BigQuery for real-time analytics, trigger downstream application behavior using Pub/Sub, or store changes in Google Cloud Storage (GCS) for compliance.”

Powering AI

For customers building AI models, Google said it’s making Vertex AI Workbench generally available starting today, bringing data and machine learning systems into a single interface so teams will have a common toolset to use across their data analytics, data science and machine learning initiatives. Vertex AI Workbench offers native integrations with services such as BigQuery, Serverless Spark and Dataproc, enabling teams to build, train and deploy machine learning models up to five times faster than when using traditional notebooks, Google said.

Vertex AI Workbench also enables customers to regularly update their machine learning models with ease. To ensure smoother model maintenance, Google is introducing new MLOps capabilities with Vertex AI Model Registry, now in preview. This acts as a central repository for customers to discover, use and govern machine learning models. From here, data scientists can share machine learning models more easily, while application developers have an easier way to access them.

Boosting accessibility

Kazmaier also announced the debut of Connected Sheets for Looker, Google’s data analytics and business intelligence service. With Connected Sheets, customers can now interact with their data in whatever way they choose, be it through Looker Explore, Google Sheets or using the drag-and-drop Data Studio interface. Kazmaier said that will make it easier for all kinds of workers to access and unlock insights from data and make data-driven decisions.

Mercado Libre Inc., a Latin American e-commerce giant, has already adopted Connected Sheets for Looker to provide broader access to data for its workers using a spreadsheet interface they were already familiar with. By lowering the barrier to entry, it claims, it has managed to build a data-driven culture wherein everyone can inform their decision-making with data.

Expanded partner ecosystem

In addition, Google is expanding its partner ecosystem for companies that want to deliver new products that are tightly integrated and optimized with its BigQuery service. To that end, it announced a new validation called Google Cloud Ready – BigQuery, which recognizes that partner solutions meet a core set of functional and interoperability requirements. At launch, more than 25 partners have already been recognized under the new validation, Kazmaier said.

For example, Informatica Inc. now offers Google Cloud-validated connectors that can help customers streamline data transformations and rapidly shift data from any software-as-a-service application, on-premises database or big-data source, to Google BigQuery. Meanwhile, Tableau Software Inc. is now integrated with Google Cloud to enable customers to analyze billions of rows of data in seconds, without writing a single line of code and with no server-side management.

Data Cloud Alliance

In a final announcement, Kazmaier said Google Cloud is one of the founding members of the new Data Cloud Alliance, along with companies including Accenture Plc., Confluent Inc., Databricks Inc., Dataiku Inc., Deloitte Touche Ltd., Elastic Inc., Fivetran Inc., MongoDB Inc., Neo4j Inc., Redis Ltd., and Starburst Data Inc.

The alliance said its mission is to ensure that businesses across the globe have more seamless access to, and insights into, the data that’s required for digital transformation. To that end, the founding members said, they’re committed to making data more portable and accessible across disparate business systems, platforms and environments to ensure that data access is never a barrier to digital transformation.

The alliance will work to accelerate adoption of data analytics, AI and machine learning best practices across all industries through common data models, open standards and integrated processes, it said. It will also work to reduce challenges around data governance, privacy, compliance and security. Its members will do this by providing infrastructure, APIs and integration support, and by creating new, common industry data models, processes and platform integrations to improve data portability and reduce complexity.

“Data is the common foundation for all digital transformations,” Kazmaier said. “By committing to open data standards, access, and integration between the most popular data platforms and applications today, we believe we can significantly accelerate business transformations and close the data-to-value gap.”

With reporting from Robert Hof

Images: Google
