UPDATED 04:00 EDT / OCTOBER 09 2024

BIG DATA

Google Cloud brings Gemini models to Looker to create conversational AI agents using private data

Just a week after it updated a host of its cloud database services, Google Cloud is rolling out yet more data-focused updates, this time aimed at helping companies build artificial intelligence agents that can perform tasks on behalf of their users.

Gerrit Kazmaier, Google Cloud’s general manager and vice president of data analytics, revealed various other capabilities too, including support for open data table formats, expanded vector search features, more governance controls and a semantic search experience in BigQuery.

Gemini-powered data agents

To simplify the process of building AI agents, Google is rolling out a series of new conversational application programming interfaces designed to work with its Gemini large language models to create chatbots that not only understand the user’s intent, but can also act on what is asked of them.

In a briefing, Peter Bailis, vice president of engineering for Looker and AI, showed SiliconANGLE a demo of a new coffee shop AI agent powered by the conversational APIs, explaining that users can ask simple questions, such as which of the company’s drinks are the top sellers, and receive an immediate response. Users can then ask more complex follow-up questions, such as “What are the average sales for first-time visitors?” and get an instant answer.

Its capabilities also expand to making predictions, Bailis said. For instance, the coffee shop manager could ask it what the total sales forecast for the next 30 days looks like, and it will automatically run a forecast based on existing data to generate a response.
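To make the idea of a 30-day forecast concrete, here is a minimal sketch in Python. It is not how the Looker agent or BigQuery produces its forecast, which the briefing did not detail. It assumes the simplest possible method: every future day is expected to match the average of the most recent week. The function name, parameters and sales figures are all hypothetical.

```python
from statistics import mean


def forecast_total_sales(daily_sales, horizon_days=30, window=7):
    """Naive forecast: assume each future day equals the average of the
    last `window` days, then sum over `horizon_days`.

    This is an illustrative baseline only, not Google's forecasting method.
    """
    if not daily_sales:
        raise ValueError("daily_sales must not be empty")
    recent = daily_sales[-window:]  # most recent observations
    return mean(recent) * horizon_days


# Hypothetical daily sales figures for one week at the coffee shop.
history = [100.0, 120.0, 110.0, 130.0, 125.0, 115.0, 140.0]
print(forecast_total_sales(history))  # total expected sales over 30 days
```

A production forecast would typically account for trend and seasonality, such as weekday versus weekend traffic, which this baseline ignores.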

The key thing here is that it works by combining Google’s analytics tools, such as Looker, with the power of its Gemini models, which are grounded in private enterprise data. “This is what makes it an agent,” Kazmaier said. “It’s not just retrieval of existing information. It’s just like you’re chatting with your data analyst. That’s what’s ground-breaking.”

Bailis said in future it should be possible to create a data preparation agent within BigQuery that does all of the time-consuming data prep work. In the case of the coffee shop, it would be able to organize the most relevant data into a new table file that’s optimized for specific searches, such as questions about coffee sales.

“It basically takes away the biggest pain point of all this data and how to find what I want in it,” Bailis added.

The company said it has already used these APIs itself to build the conversational semantic search experience in Looker, combining the capabilities of Gemini with Looker’s enterprise-scale semantic layer.

Expanded data access in BigQuery

Most of the other updates today were related to BigQuery, Google Cloud’s flagship serverless cloud data warehouse, which serves as a hub for data analytics workloads. Kazmaier said it’s getting a host of new capabilities, including a managed experience that’s designed to make life easier for users of the Iceberg, Hudi and Delta formats, which are open-source, standard table formats for working with very large datasets in a performant way. Along with this, those data sources also gain support for multimodal data types in BigQuery, including information from artificial intelligence applications such as document understanding, vision AI and text-to-speech processing.

As for the new semantic search tools, these are primarily aimed at customers looking to use data in BigQuery to power AI applications built on the Vertex AI platform, which is Google’s primary AI app development tool.

Kazmaier said that in a recent update, BigQuery added support for retrieval-augmented generation techniques and vector embeddings, making it possible for AI applications to perform inference directly on the unstructured data it stores. At the time, the company also announced an integration with LangChain to simplify the job of pre-processing that data so it can be transformed into vector embeddings.

Building on those updates, Kazmaier said BigQuery is now getting enhanced vector search capabilities with support for the ScaNN vector index, which is a tool for making unstructured data such as videos searchable by representing it as vector embeddings. ScaNN is the same technology that powers video and image searches in Google Search itself, and it also powers YouTube’s search engine.
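For readers unfamiliar with vector search, the following Python sketch shows the basic idea: each item is stored as an embedding, and a query is answered by returning the items whose embeddings are most similar to the query’s. This is an exact, brute-force search using cosine similarity, not ScaNN (which is an approximate nearest-neighbor method built for much larger collections) and not the BigQuery API. The item names and three-dimensional vectors are made up for illustration; real embeddings have hundreds of dimensions or more.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def top_k(query, index, k=2):
    """Return the ids of the k items most similar to `query`.

    Brute force: scores every item, so cost grows linearly with index size.
    """
    scored = [(cosine_similarity(query, vec), item_id)
              for item_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [item_id for _, item_id in scored[:k]]


# Hypothetical embeddings for three videos.
index = {
    "latte_video": [0.9, 0.1, 0.0],
    "espresso_video": [0.8, 0.2, 0.1],
    "bakery_video": [0.0, 0.1, 0.9],
}
print(top_k([1.0, 0.0, 0.0], index, k=2))
```

Scoring every item works for a handful of vectors but not for billions, which is the gap that approximate indexes are designed to close.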

Enhanced data search, protection and governance

Moving on to data governance, Kazmaier said Google is implementing various new features designed to help companies make their most sensitive information accessible to AI without compromising it.

First, it’s making it easier for customers to process data using familiar Python application programming interfaces via BigQuery DataFrames. With DataFrames, users have a simple way to generate synthetic data based on their proprietary datasets, so that it can be used to train AI in lieu of the genuine, highly sensitive information.

Second, Google announced the general availability of BigQuery’s unified catalog, a data discovery tool that works by automatically ingesting, harvesting and indexing metadata from across a company’s entire data estate, including AI models, business intelligence tools and databases. And to query those assets, users can take advantage of the new BigQuery catalog semantic search capability that’s in preview now.

That makes it possible to ask questions about the data in natural language. BigQuery will understand the user’s intent and retrieve the most relevant results, making it much easier to find what they’re looking for, Kazmaier said.

Third, there’s a new BigQuery metastore feature that further reduces data complexity, enabling multiple engines to run on a single copy of data spread across both structured and unstructured object data tables. That has the effect of providing a single data plane for policy enforcement and performance management.

Fourth, BigQuery is getting new governance tools specifically for those who use the service in tandem with Looker, Google’s primary business intelligence tool. What it’s offering is a fully managed, self-service experience for connecting and ingesting metadata from Looker, with no need to maintain a data connector.

Finally, Google is adding encryption and disaster recovery features to BigQuery to ensure customers have sufficient failover and redundant compute capacity for their most critical workloads.

With reporting from Robert Hof

Image: SiliconANGLE/Microsoft Designer
