UPDATED 04:00 EDT / OCTOBER 09 2024

BIG DATA

Google Cloud brings Gemini models to Looker to create conversational AI agents using private data

Just a week after it updated a host of its cloud database services, Google Cloud is rolling out more data-focused updates, this time aimed at helping companies build artificial intelligence agents that can perform tasks on behalf of their users.

Gerrit Kazmaier, Google Cloud’s general manager and vice president of data analytics, revealed various other capabilities too, including support for open data table formats, expanded vector search features, more governance controls and a semantic search experience in BigQuery.

Gemini-powered data agents

To simplify the process of building AI agents, Google is rolling out a series of new conversational application programming interfaces designed to work with its Gemini large language models to create chatbots that, the company says, not only understand the user’s intent but can also act on their requests.

In a briefing, Peter Bailis, vice president of engineering for Looker and AI, showed SiliconANGLE a demo of a coffee shop AI agent powered by the conversational APIs, explaining that users can ask simple questions, such as what the company’s top-selling drinks are, and receive an immediate response. Users can then ask more complex follow-up questions, such as “What are the average sales for first-time visitors?” and get an instant answer.

Its capabilities also expand to making predictions, Bailis said. For instance, the coffee shop manager could ask it what the total sales forecast for the next 30 days looks like, and it will automatically run a forecast based on existing data to generate a response.
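To make the forecasting question concrete, here is a minimal sketch of the kind of calculation involved. It is not Google's method, which the briefing did not detail. It assumes a naive moving-average baseline, and the function name, the seven-day window and the sales figures are all hypothetical.

```python
from statistics import mean


def forecast_total_sales(daily_sales, horizon_days=30, window=7):
    """Project total sales over the next `horizon_days` days.

    Assumption: every future day is expected to match the mean of the
    last `window` observed days (a naive moving-average baseline).
    """
    if not daily_sales:
        raise ValueError("daily_sales must not be empty")
    recent = daily_sales[-window:]
    return mean(recent) * horizon_days


# Hypothetical daily sales for a coffee shop, in dollars (two weeks).
history = [410, 395, 430, 450, 520, 610, 580,
           420, 400, 445, 460, 530, 620, 590]

print(forecast_total_sales(history))
```

A production forecast would normally also account for trend and weekly seasonality, which this baseline ignores.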

The key thing here is that it works by combining Google’s analytics tools, such as Looker, with the power of its Gemini models, which are grounded in private enterprise data. “This is what makes it an agent,” Kazmaier said. “It’s not just retrieval of existing information. It’s just like you’re chatting with your data analyst. That’s what’s ground-breaking.”

Bailis said that in the future it should be possible to create a data preparation agent within BigQuery that does all of the time-consuming data prep work. In the case of the coffee shop, it would be able to organize the most relevant data into a new table that’s optimized for specific searches, such as questions about coffee sales.

“It basically takes away the biggest pain point of all this data and how to find what I want in it,” Bailis added.

The company said it has already used these APIs itself to build the conversational semantic search experience in Looker, combining the capabilities of Gemini with Looker’s enterprise-scale semantic layer.

Expanded data access in BigQuery

Most of the other updates today were related to BigQuery, which is Google Cloud’s flagship serverless cloud data warehouse that serves as a hub for data analytics workloads. Kazmaier said it’s getting a host of new capabilities, including a managed experience that’s designed to make life easier for users of the Iceberg, Hudi and Delta table formats, which are open-source standards for working with very large datasets in a performant way. Along with this, those data sources also gain support for multimodal data types in BigQuery, including information from artificial intelligence applications such as document understanding, vision AI and text-to-speech processing.

As for the new semantic search tools, these are primarily aimed at customers looking to use data in BigQuery to power AI applications built on the Vertex AI platform, which is Google’s primary AI app development tool.

Kazmaier said that in a recent update, BigQuery added support for retrieval-augmented generation techniques and vector embeddings, making it possible for AI applications to perform inference directly on the unstructured data it stores. At the time, the company also announced an integration with LangChain to simplify the job of pre-processing that data so it can be transformed into vector embeddings.

Building on those updates, Kazmaier said BigQuery is now getting enhanced vector search capabilities with support for the ScaNN vector index, which is a tool for making unstructured data such as videos searchable by representing it as vector embeddings. ScaNN is the same technology that powers video and image searches in Google Search itself, and it also powers YouTube’s search engine.
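For readers unfamiliar with vector search, the underlying idea can be sketched in a few lines. This is an exact, brute-force comparison, not ScaNN itself, which is an approximate index built to avoid scanning every vector at scale. The three-dimensional embeddings and item names below are made up for illustration; real embeddings come from a model and have hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query, index, k=2):
    """Return the ids of the k stored vectors most similar to `query`."""
    scored = [(cosine_similarity(query, vec), item_id)
              for item_id, vec in index.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [item_id for _, item_id in scored[:k]]


# Hypothetical embeddings for three videos.
index = {
    "latte_art_video": [0.9, 0.1, 0.0],
    "espresso_tutorial": [0.8, 0.3, 0.1],
    "bike_repair_video": [0.0, 0.2, 0.9],
}
query = [1.0, 0.2, 0.0]  # hypothetical embedding of a coffee-related query

print(top_k(query, index))
```

Brute force compares the query against every stored vector, so its cost grows linearly with the collection. Approximate indexes trade a little accuracy for much faster lookups.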

Enhanced data search, protection and governance

Moving on to data governance, Kazmaier said Google is implementing various new features designed to help companies make their most sensitive information accessible to AI without compromising it.

First, it’s making it easier for customers to process data using familiar Python application programming interfaces via BigQuery DataFrames. With DataFrames, users have a simple way to generate synthetic data based on their proprietary datasets, which can then be used to train AI in lieu of the genuine, highly sensitive information.
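As a rough illustration of what generating synthetic data means, the sketch below uses only the Python standard library. It does not use the BigQuery DataFrames API, whose calls are not described here. The `synthesize` function and the sample rows are hypothetical, and the approach assumes each column can be modeled independently.

```python
import random
from statistics import mean, stdev


def synthesize(rows, n, seed=0):
    """Generate n synthetic rows that mimic per-column statistics of `rows`.

    Assumptions: columns are independent; numeric columns are roughly
    normal; `rows` holds at least two records with the same keys.
    """
    rng = random.Random(seed)
    synthetic = [{} for _ in range(n)]
    for col in rows[0].keys():
        values = [row[col] for row in rows]
        if all(isinstance(v, (int, float)) for v in values):
            # Numeric column: sample from a fitted normal distribution.
            mu, sigma = mean(values), stdev(values)
            for out in synthetic:
                out[col] = round(rng.gauss(mu, sigma), 2)
        else:
            # Categorical column: sample in proportion to observed frequency.
            for out in synthetic:
                out[col] = rng.choice(values)
    return synthetic


# Hypothetical "sensitive" source records.
real_rows = [
    {"drink": "latte", "amount": 4.50},
    {"drink": "espresso", "amount": 3.00},
    {"drink": "latte", "amount": 4.75},
    {"drink": "mocha", "amount": 5.25},
]

for row in synthesize(real_rows, 3):
    print(row)
```

Sampling columns independently discards correlations between them and offers no formal privacy guarantee, so this should be read as a teaching example only.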

Second, Google announced the general availability of BigQuery’s unified catalog, a data discovery tool that works by automatically ingesting, harvesting and indexing metadata from across a company’s entire data estate, including AI models, business intelligence tools and databases. And to query those assets, users can take advantage of the new BigQuery catalog semantic search capability that’s in preview now.

That makes it possible to ask questions about the data in natural language. BigQuery will understand the user’s intent and retrieve the most relevant results, making it much easier to find what they’re looking for, Kazmaier said.

Third, there’s a new BigQuery metastore feature that further reduces data complexity, enabling multiple engines to run on a single copy of data spread across both structured and unstructured object data tables. That has the effect of providing a single data plane for policy enforcement and performance management.

Fourth, BigQuery is getting new governance tools specifically for those who use the service in tandem with Looker, Google’s primary business intelligence tool. What it’s offering is a fully managed, self-service experience for connecting and ingesting metadata from Looker, with no need to maintain a data connector.

Finally, Google is adding encryption and disaster recovery features to BigQuery to ensure customers have sufficient failover and redundant compute capacity for their most critical workloads.

With reporting from Robert Hof

Image: SiliconANGLE/Microsoft Designer
