UPDATED 18:52 EDT / DECEMBER 25 2023

Apple quietly launched an open-source multimodal LLM called Ferret

Artificial intelligence researchers from Apple Inc. and Cornell University quietly released an open-source, multimodal large language model called Ferret in October. The model is said to accept regions of an image as part of a query.

According to VentureBeat, the release of Ferret on GitHub in October went completely under the radar, with no announcement made. It has since attracted considerable attention from AI researchers, however. Bart De Witte, who operates a non-profit focused on open-source AI in medicine, posted on X that the release of Ferret “solidifies Apple’s place as a leader in the multimodal AI space.”

Ferret works by examining a specific region of an image, determining which elements within it could be useful in answering a query, identifying those elements, and drawing a bounding box around each of them. It can then use the identified elements as part of a query, which it answers in the same way a conventional LLM would.

For instance, if a user highlights an image of an animal within a larger image, then asks the LLM what the animal is, it will respond to that query by identifying what species the creature is. It can then use the context of other elements it detects within the image to provide further responses or provide context on what the animal is doing.
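The region-based flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Ferret's actual code or API: the `Box` type, the `elements_in_region` and `build_region_query` helpers, the prompt format and the hard-coded detections are all hypothetical stand-ins for what a real model would produce.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box in pixel coordinates (hypothetical type)."""
    x1: int
    y1: int
    x2: int
    y2: int

    def area(self) -> int:
        return max(0, self.x2 - self.x1) * max(0, self.y2 - self.y1)


def intersection_area(a: Box, b: Box) -> int:
    """Area of the overlap between two boxes, or 0 if they do not overlap."""
    w = min(a.x2, b.x2) - max(a.x1, b.x1)
    h = min(a.y2, b.y2) - max(a.y1, b.y1)
    return max(0, w) * max(0, h)


def elements_in_region(region, detections, min_overlap=0.5):
    """Return labels of detections that lie mostly inside the user's region.

    `detections` is a list of (label, Box) pairs. In a real system these
    would come from the model; here they are supplied by the caller.
    """
    labels = []
    for label, box in detections:
        if box.area() == 0:
            continue  # skip degenerate boxes to avoid dividing by zero
        if intersection_area(region, box) / box.area() >= min_overlap:
            labels.append(label)
    return labels


def build_region_query(question: str, region: Box) -> str:
    """Attach the highlighted region's coordinates to a text question.

    The bracketed format is an assumption made for this sketch.
    """
    return f"{question} [region: {region.x1}, {region.y1}, {region.x2}, {region.y2}]"


# The user highlights part of the image and asks about it.
region = Box(10, 10, 110, 110)
# Made-up detections standing in for model output.
detections = [
    ("ferret", Box(20, 20, 100, 100)),
    ("tree", Box(200, 0, 300, 150)),
    ("ball", Box(100, 100, 140, 140)),
]
print(elements_in_region(region, detections))
print(build_region_query("What is this animal?", region))
```

In this sketch only the "ferret" detection falls mostly inside the highlighted region, so it alone would be passed along with the question; the other detections would remain available as surrounding context.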

The open-source Ferret model is a system that can “refer and ground anything anywhere at any granularity,” said Apple AI research scientist Zhe Gan in an earlier post on X.

AI researchers claim the release of Ferret is important as it demonstrates a surprising openness from Apple, which is in direct contrast to the company’s usual secretive nature.

The open-source approach may suit Apple in the AI industry, however, as the company is struggling to compete with rivals such as Microsoft Corp. and Google LLC due to a lack of computing resources. According to tech blogger Ben Dickson, Apple’s infrastructure is not designed to serve up LLMs at scale, which means the company cannot expect to compete with models such as ChatGPT. Apple therefore has to choose between partnering with a cloud hyperscaler on its AI efforts and sharing its work with the open-source community, similar to the approach taken by Meta Platforms Inc.

Photo: Pexels/Pixabay
