Apple publishes research to bring AI models to iPhones and make videos into 3D avatars
New artificial intelligence research from Apple Inc. shows that the company is forging ahead with generative AI tools that could bring significant advances to the industry, including the ability to run AI models on devices with limited memory.
The two research papers were first reported by VentureBeat. The first describes a more efficient use of memory for running generative AI large language models on devices such as iPhones. The second introduces new techniques for scanning video of human beings to produce 3D virtual avatars for digital experiences such as virtual reality and consumer applications.
Large language models, the technology behind AI chatbots such as OpenAI’s ChatGPT, are enormous and take up a lot of memory, so they often cannot be deployed on handheld devices such as iPhones, which have limited RAM. In July, Apple was reported to have developed its own ChatGPT-like chatbot, referred to internally as Apple GPT, and now the company is researching ways to bring its AI onto smartphones.
In the paper, titled “LLM in a flash: Efficient Large Language Model Inference with Limited Memory,” Apple states that it can store an entire LLM on a device and still run inference with the limited memory available on the iPhone. “Model inference” is the stage at which an AI model carries out its predictive and computational work. The “flash” in the title refers to the phone’s flash storage, which holds the model’s parameters and from which they are transferred into memory as the AI runs.
“Our method involves constructing an inference cost model that harmonizes with the flash memory behavior, guiding us to optimize in two critical areas: reducing the volume of data transferred from flash and reading data in larger, more contiguous chunks,” the researchers wrote.
According to the paper, the new technique allows LLMs to run up to 25 times faster than naive loading approaches on devices with limited memory, which would allow advanced AI to be deployed on handhelds such as iPhones and iPads and on wearables.
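To see why the two optimizations the researchers name matter, consider a rough sketch of a flash read cost model in Python. This is not the model from Apple's paper. It assumes, for illustration only, that each read carries a fixed latency on top of the time needed to move the bytes, and the function name, bandwidth, latency and sizes are all hypothetical.

```python
def flash_read_time(total_bytes, chunk_bytes, bandwidth_bytes_per_s, per_read_latency_s):
    """Estimate the seconds needed to read total_bytes from flash in chunks.

    Assumed toy model: every read pays a fixed latency, plus transfer time
    proportional to the number of bytes moved.
    """
    num_reads = -(-total_bytes // chunk_bytes)  # ceiling division
    return num_reads * per_read_latency_s + total_bytes / bandwidth_bytes_per_s


# Hypothetical device characteristics, chosen only for illustration.
BANDWIDTH = 1e9   # 1 GB/s sustained read throughput
LATENCY = 1e-4    # 100 microseconds of overhead per read

# Read 4 GB of weights in small 4 KiB chunks.
small_chunks = flash_read_time(4_000_000_000, 4 * 1024, BANDWIDTH, LATENCY)
# Read the same data in larger, more contiguous 1 MiB chunks.
large_chunks = flash_read_time(4_000_000_000, 1024 * 1024, BANDWIDTH, LATENCY)
# Read only a tenth of the data, still in 1 MiB chunks.
less_data = flash_read_time(400_000_000, 1024 * 1024, BANDWIDTH, LATENCY)

print(f"small chunks: {small_chunks:.2f} s")
print(f"large chunks: {large_chunks:.2f} s")
print(f"less data:    {less_data:.2f} s")
```

Under these assumptions, larger chunks cut the per-read overhead and a smaller transfer cuts the time spent moving bytes, which are the two areas the researchers say they optimized.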
The second research paper proposes an AI system called Human Gaussian Splats, or HUGS, a generative AI technology that can create digital human avatars from video. The system takes ordinary video of a person in a static scene and produces a fully animatable 3D model of that person, which can then serve as a digital avatar in virtual reality environments.
According to the researchers, the AI model behind HUGS needs only a small number of frames of a person moving around in a video, approximately 50 to 100, which amounts to roughly two to four seconds of video at 24 frames per second. The researchers said it takes around 30 minutes to take the scene and rebuild it into a 3D avatar.
The end result is a realistic 3D model of the human body that captures as many details as possible and is built on a body model called Skinned Multi-Person Linear, or SMPL. However, the researchers warned that it cannot model every detail, such as cloth and hair, and may deviate from the real-world individual where it is unable to match them.
These avatars could be used anywhere, such as in a video game, a virtual reality environment or a consumer-facing application. The breakthrough here is that making the avatar requires only the video, whereas many other scanning technologies require expensive cameras and much longer processing times.
According to the researchers, the technique is capable of rendering at 60 frames per second while being 100 times faster to train than other methods.
Currently, HUGS does not have any practical applications, but 3D digital avatars could find a place in apps for the company’s mixed reality headset, the Vision Pro. Numerous applications in mixed and virtual reality, where people use virtual environments to see other people represented by virtual personas, would benefit from full, animated 3D renderings of colleagues and friends.
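The frame counts convert to clip length with simple division. The short Python check below is only a sketch of that arithmetic, and the helper name `clip_seconds` is made up for this example.

```python
def clip_seconds(frames, fps=24):
    """Return the duration in seconds of a clip with the given frame count."""
    return frames / fps


for frames in (50, 100):
    print(f"{frames} frames at 24 fps = {clip_seconds(frames):.2f} s")
# prints:
# 50 frames at 24 fps = 2.08 s
# 100 frames at 24 fps = 4.17 s
```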
Image: Apple