UPDATED 19:14 EDT / OCTOBER 27 2024

Google reportedly developing new AI that can automate web browsing tasks in Chrome

Google LLC is developing an advanced artificial intelligence system designed to autonomously operate web browsers, and it could make its debut in December, The Information reported Saturday.

The new AI, internally known as “Project Jarvis,” is expected to enhance user productivity by automating routine tasks such as online shopping, research and booking flights.

Project Jarvis is reportedly powered by Google’s Gemini 2.0 large language model, which promises substantial improvements in understanding and generating humanlike text. Sources told The Information the AI is specifically engineered for Google Chrome and includes capabilities to interpret screenshots, click buttons and input text, simulating user interactions within the browser to complete various web-based actions.

However, the AI currently takes “a few seconds” between actions, according to sources. Whether the final release will have similar delays remains to be seen.

The news comes less than a week after Anthropic PBC introduced new models along with a capability, available in public beta, that lets them interact with computers. Anthropic’s Claude Sonnet model can operate the user interface by moving the mouse, typing text and clicking buttons.

Anthropic’s take differs from what Google is reportedly working on in that Anthropic’s AI can control a computer, while Project Jarvis can only access webpages within Google Chrome.

The move toward AIs that can either interact with or see what’s on a computer is a growing trend, with other companies working on similar systems, such as Microsoft with Copilot Vision. First revealed by Microsoft on Oct. 1 but not yet available, Copilot Vision can analyze the images on a webpage and answer questions about them.

Apple Inc. is also working on similar AI-driven interactions through its upcoming Apple Intelligence platform. Unlike Project Jarvis, which operates primarily through Chrome to handle tasks across the web, Apple’s approach integrates AI directly into device features such as Siri, enabling contextual responses and actions based on on-screen content.

Though different companies may have different takes and abilities when it comes to AI being able to interact with or analyze what’s on a screen, what is clear is that AI agents that can interact and undertake tasks are quickly becoming the next wave of AI development.

Image: SiliconANGLE/Ideogram
