UPDATED 08:00 EST / AUGUST 29 2023

Google expands Vertex AI with new models and enterprise-ready tooling

Google LLC’s cloud unit Google Cloud is continuing to give users broader access to generative artificial intelligence capabilities, adding new enterprise-ready models and tools to Vertex AI that let customers customize AI for intelligent apps using their own data.

Vertex AI is a constellation of cloud services that companies can use to build machine learning models and customize generative AI models that can perform a variety of conversational, text and image generation tasks. Google Cloud released generative AI capabilities in Vertex AI six months ago, which gave developers access to foundational models and customization capabilities.

“We understand that based on customers’ use cases and their own technical skill sets they will consume generative AI differently,” June Yang, vice president of cloud AI and industry solutions at Google Cloud, said in a press briefing ahead of its Next conference this week in San Francisco. “For example, business users can consume it directly such as the translation hub like any other SaaS solution. No technical expertise required. For developers, they get access to foundational model in Vertex AI via API directly. And data scientists can leverage the AI platform to tune foundation models with enterprise data.”

A big part of Vertex AI is the Model Garden, where customers can access foundational AI models through application programming interfaces for tasks such as conversation, code generation and image generation. As part of today’s announcement, Google said it has added Meta Platforms Inc.’s new Llama 2 and the Technology Innovation Institute’s Falcon. Anthropic’s Claude 2 text model will also be coming soon. The Model Garden now has more than 100 enterprise-ready large models available.

Users will also have access to a new version of PaLM 2, the second iteration of Google’s Pathways Language Model, which is now generally available in 38 languages. It has also gained a much larger context window for handling extremely long question-and-answer chats and examining large documents such as research papers or books. The new version of PaLM 2 can handle up to 32,000 tokens, which is large enough to read an 85-page document in a single prompt.

Although 32,000 tokens isn’t the largest context window out there, Nenshad Bardoliwalla, product lead at Vertex AI, explained that it offers the best price-to-performance ratio. “Our customers are striving to balance the flexibility of the modeling that they’re able to do with large models with the cost of inference and the ability to fine tune,” he said. “We felt at this time given the evolution of the market the results with 32K are quite impressive.”

For comparison, Claude 2’s context window is 100,000 tokens or about 75,000 words. This is large enough for the model to read “The Great Gatsby” if a user wanted it to.
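The context-window figures above can be put side by side with some back-of-the-envelope arithmetic. The sketch below is illustrative only: it assumes the roughly 0.75 words-per-token ratio implied by the Claude 2 figure also holds for PaLM 2, which is an assumption, since real tokenizers differ between models and languages. The function names are made up for this example.

```python
# Rough context-window arithmetic using only figures quoted in the article.
# Assumption: the ~0.75 words-per-token ratio implied by the Claude 2 figure
# (100,000 tokens ~ 75,000 words) also applies to PaLM 2; real tokenizers vary.

WORDS_PER_TOKEN = 75_000 / 100_000  # 0.75, from the Claude 2 comparison


def tokens_to_words(tokens: int) -> int:
    """Estimate how many English words fit in a given token budget."""
    return round(tokens * WORDS_PER_TOKEN)


def words_per_page(tokens: int, pages: int) -> float:
    """Implied words per page if `tokens` tokens cover `pages` pages."""
    return tokens_to_words(tokens) / pages


palm2_words = tokens_to_words(32_000)     # 24000 words under this assumption
claude2_words = tokens_to_words(100_000)  # 75000 words, matching the article
density = words_per_page(32_000, 85)      # roughly 282 words per page
```

Under that assumption, the 32,000-token window works out to about 24,000 words, or roughly 282 words per page across an 85-page document.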

Codey, a foundation model capable of generating computer code and providing code chat, has been improved by over 25% for major supported programming languages. Codey can generate code in more than 20 languages, including Python, C, JavaScript and Java.

Google’s image-generating AI foundation model Imagen has also been improved with added capabilities for brands. Image generators take a user’s prompt, such as “Give me an image of a truck,” and produce matching images, in this case pictures of trucks.

However, a brand may want those images produced in a particular theme or style based on the company’s branding. Imagen can be tuned to match this style with 10 images or fewer. According to Google, this is one of the smallest tuning sets in the industry.

For example, if a brand wants its trucks rendered against washed-out sunny backgrounds, Imagen could be tuned with 10 photos in that style. After tuning, the AI will spell out how to adjust the prompt to apply the style, such as by adding “in golden photo style,” according to the user’s settings.

Vertex AI gets tooling upgrades for developers

Although foundational models themselves are extremely powerful, they receive no new information after training. With Vertex AI Extensions, developers can connect models to external data sources through APIs for real-time data and also let them automate real-world actions.

With Extensions, developers can build powerful generative AI applications that access databases and enterprise information in real time to keep up with changing information. As a result, AI apps can benefit from up-to-date information without needing to have it fed to them through prompts.
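The general pattern of connecting a model to live data can be sketched in a few lines. This is not the Vertex AI Extensions API: every name below (`EXTENSIONS`, `inventory_lookup`, `build_grounded_prompt`) is hypothetical, and the in-memory dictionary stands in for a real database call. The sketch only illustrates the idea that the application, rather than the person typing the prompt, retrieves current data at request time and hands it to the model.

```python
from typing import Callable, Dict

# Hypothetical registry mapping an extension name to a function that fetches
# fresh data. None of these names come from the Vertex AI SDK.
Extension = Callable[[str], str]


def inventory_lookup(query: str) -> str:
    """Stand-in for a live database call, such as a query against a warehouse."""
    stock = {"truck": 12, "van": 0}  # made-up data for illustration
    item = query.lower()
    return f"{item}: {stock.get(item, 0)} in stock"


EXTENSIONS: Dict[str, Extension] = {"inventory": inventory_lookup}


def build_grounded_prompt(question: str, extension: str, query: str) -> str:
    """Fetch current data through an extension and prepend it to the question."""
    context = EXTENSIONS[extension](query)
    return f"Context (retrieved at request time): {context}\nQuestion: {question}"


prompt = build_grounded_prompt("Can we ship a truck today?", "inventory", "Truck")
```

In this sketch the resulting `prompt` string would then be sent to whichever model the application uses; that call is deliberately left out because its exact form depends on the provider.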

Vertex AI has prebuilt extensions for Google Cloud services such as BigQuery and AlloyDB, as well as for popular databases such as DataStax, MongoDB and Redis. Developers will also be able to integrate with the AI toolkit LangChain, a framework for building large language model chatbots.

Building on tools for developers, Google also announced that Vertex AI Search and Conversation are now generally available. Search allows organizations to build Google Search-quality, multimodal search powered by foundation models directly into their apps, grounded securely in their own data. Conversation allows developers to embed chatbot and voicebot capabilities quickly and directly into apps.

With these two offerings, enterprise users will have a much larger variety of tooling for bringing AI capabilities into their businesses and building immersive conversational experiences into their apps. These tools let developers embed conversational AI capabilities into their software so that the user interface can take context from the data on screen and from enterprise sources and provide insights to the user.

By putting generative AI to use, developers will also no longer have to manage indices of information and conversational decision trees for interactions. The chatbot or voicebot can do that for them by using the context and data fed to it to produce results. That alone will lift a great burden from the production of interactive user interfaces.

Image: Google

