UPDATED 09:00 EDT / NOVEMBER 18 2022

Meta’s MultiRay platform aggregates training for high-quality, large-scale AI models

Facebook parent company Meta Platforms Inc. today lifted the lid on its latest innovation in artificial intelligence training. It’s called MultiRay, a new platform for running the most powerful AI models at large scale, more efficiently and at lower cost.

Meta explained in a blog post that, until now, many companies have been forced to make compromises on their AI systems. To obtain the best possible results, an AI system that processes text, images and other modalities must be trained on an immense data set, and then specialized for a specific task like identifying hate speech.

The result is a high-quality but extremely expensive one-trick pony: The model might be excellent at spotting hate speech, but that’s all it can do. So the process becomes prohibitively expensive for teams that want to use AI to solve multiple problems. As a result, the most capable AI models are rarely used in the real world, and companies generally rely on smaller, simpler and less capable algorithms instead.

MultiRay changes this by making it possible to reuse the output of a single large model across multiple different tasks. Numerous AI models trained for specific tasks can be run on the same input, sharing the cost of the expensive processing between them. The result is a much lower per-model processing cost, which makes more powerful AI models practical to use.
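
In code terms, the pattern looks roughly like the sketch below: one expensive, shared encoder runs once per input and produces an embedding, while several cheap task-specific heads consume that same embedding. The class names, dimensions and tasks here are illustrative assumptions, not Meta’s actual architecture.

```python
# Minimal sketch (not Meta's code): one large "universal" encoder runs once per
# input, and several small task-specific heads reuse its embedding.
import torch
import torch.nn as nn

EMBED_DIM = 256  # assumed embedding size; MultiRay's real size is not stated here

class UniversalEncoder(nn.Module):
    """Stand-in for a large foundation model that is expensive to run."""
    def __init__(self, vocab_size=30_000, embed_dim=EMBED_DIM):
        super().__init__()
        self.embed = nn.EmbeddingBag(vocab_size, embed_dim)

    def forward(self, token_ids):
        return self.embed(token_ids)  # (batch, embed_dim) shared embedding

class TaskHead(nn.Module):
    """Small, cheap per-task classifier that consumes the shared embedding."""
    def __init__(self, num_classes, embed_dim=EMBED_DIM):
        super().__init__()
        self.fc = nn.Linear(embed_dim, num_classes)

    def forward(self, embedding):
        return self.fc(embedding)

encoder = UniversalEncoder()
heads = {
    "hate_speech": TaskHead(num_classes=2),
    "topic_tagging": TaskHead(num_classes=50),
}

token_ids = torch.randint(0, 30_000, (4, 16))  # a batch of 4 toy posts
embedding = encoder(token_ids)                 # expensive step, done once
results = {name: head(embedding) for name, head in heads.items()}  # cheap steps
```

The expensive forward pass happens once per input, and each additional task only adds the cost of a small head, which is the amortization the article describes.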

“Doing this helps us optimize the total cost of performing these AI tasks,” Meta’s AI team wrote in a blog post. “We can more easily introduce AI accelerators due to the concentration of company-wide computation into a single model, and we can also trade off between compute power and storage at the company level.”
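
The “compute power and storage” trade-off mentioned in the quote is, in essence, caching: store the universal model’s output so that a repeated input costs a lookup rather than a second forward pass. The sketch below is an assumed illustration of such a cache, not MultiRay’s actual implementation.

```python
# Hedged sketch of trading compute for storage: cache embeddings keyed by a
# hash of the input text. The layout and hashing scheme are assumptions.
import hashlib

def _key(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

class EmbeddingCache:
    def __init__(self, encoder, max_items=100_000):
        self.encoder = encoder   # callable: text -> embedding
        self.store = {}          # storage cost grows with the cache size
        self.max_items = max_items

    def get(self, text: str):
        k = _key(text)
        if k in self.store:          # cache hit: pay storage, save compute
            return self.store[k]
        emb = self.encoder(text)     # cache miss: pay the full forward pass
        if len(self.store) < self.max_items:
            self.store[k] = emb
        return emb

# Toy usage with a stand-in encoder:
cache = EmbeddingCache(encoder=lambda text: [float(len(text))])
first = cache.get("hello world")   # miss: runs the encoder
second = cache.get("hello world")  # hit: served from storage
```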

MultiRay creates what Meta calls “universal models” that have been trained to perform strongly across a wide range of tasks and domains. These jack-of-all-trades models have been shown to deliver higher-quality results, allowing Meta’s teams to improve and iterate quickly on all manner of machine learning models for numerous types of applications, such as topic tagging for posts, hate speech detection and fake news detection. Meta’s first such model is called TextRay, and it has been up and running since 2020 to support various text understanding applications.

Meta is using MultiRay to create AI systems around more modalities than text alone. For instance, some Facebook posts might contain text, images and a video. In that case, its AI systems need to analyze those elements separately and assess them in the context of the others. Normally, this would involve combining several compute-intensive models into one much larger, even more intensive model.

“The resulting increase in compute and power consumption slows down our efforts to bring the most advanced ML models into production for our products and services,” Meta explained.

To solve this challenge, Meta created PostRay, which brings text and image understanding capabilities into a single model. Because PostRay models incorporate multiple capabilities into a single model, they’re more complex to train, deploy and maintain. However, by using MultiRay, Meta said, it only has to perform these tasks once, and that model can then be reused by dozens of different teams within the company.
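
Meta has not published PostRay’s internals in this article, but the idea of folding text and image understanding into one model can be sketched as below. The concatenate-and-project fusion and all dimensions are assumptions made for illustration only.

```python
# Illustrative multimodal fusion sketch (not PostRay's real architecture):
# combine a text embedding and an image embedding into one joint
# representation that downstream task heads can share.
import torch
import torch.nn as nn

class PostFusion(nn.Module):
    def __init__(self, text_dim=256, image_dim=512, joint_dim=256):
        super().__init__()
        self.project = nn.Linear(text_dim + image_dim, joint_dim)

    def forward(self, text_emb, image_emb):
        joint = torch.cat([text_emb, image_emb], dim=-1)  # combine modalities
        return torch.relu(self.project(joint))            # shared post embedding

fusion = PostFusion()
text_emb = torch.randn(4, 256)    # stand-in output of a text encoder
image_emb = torch.randn(4, 512)   # stand-in output of an image encoder
post_emb = fusion(text_emb, image_emb)  # reusable by many downstream task heads
```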

“A centralized system serving a jack-of-all-trades model allows us to work directly with cutting-edge research teams and bring their work to production soon after it is published,” Meta’s researchers said.

Meta said there are two key advantages to centralizing AI models, the first being amortization across multiple teams. Normally, training powerful models places a huge demand on resources such as graphics processing units, and each model must be trained separately. With MultiRay, teams can train multiple models at once and split the bill among them, since they all benefit from the same shared resources.
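
A back-of-the-envelope example, with entirely made-up numbers, shows why this amortization matters: paying for one large model plus many small task heads is far cheaper than every team paying for its own large model.

```python
# Hypothetical cost units per million requests; the figures are illustrative only.
LARGE_MODEL_COST = 100.0   # cost of running the big universal model
SMALL_HEAD_COST = 2.0      # cost of one team's small task-specific model
num_teams = 10

separate = num_teams * (LARGE_MODEL_COST + SMALL_HEAD_COST)  # each team runs its own big model
shared = LARGE_MODEL_COST + num_teams * SMALL_HEAD_COST      # MultiRay-style sharing

print(f"separate: {separate:.0f} units, shared: {shared:.0f} units")
# separate: 1020 units, shared: 120 units
```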

A second advantage is that MultiRay enables a simpler development and operations process. “MultiRay serves a small number of large centralized models, allowing a single team to handle the majority of the operations and optimization,” the company explained. “Client teams own smaller, task-specific models that are easier to manage. This allows many teams that didn’t have the bandwidth to train, deploy and manage cutting-edge AI to use that technology.”
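
From a client team’s perspective, the division of labor described in that quote might look something like the following sketch, in which the central team runs the big model behind a service and each client team owns only a tiny task model. The endpoint URL, the `fetch_embedding` helper and the scoring model are all hypothetical.

```python
# Hypothetical client-side view (not Meta's API): fetch a shared embedding from
# a central service, then apply a small, client-owned task model to it.
import requests
import numpy as np

CENTRAL_ENDPOINT = "https://multiray.internal.example/embed"  # assumed URL

def fetch_embedding(text: str) -> np.ndarray:
    resp = requests.post(CENTRAL_ENDPOINT, json={"text": text}, timeout=1.0)
    resp.raise_for_status()
    return np.asarray(resp.json()["embedding"], dtype=np.float32)

class TinyTaskModel:
    """Client-owned logistic scorer over the shared embedding."""
    def __init__(self, dim=256):
        rng = np.random.default_rng(0)
        self.weights = rng.normal(size=dim).astype(np.float32)

    def score(self, text: str) -> float:
        emb = fetch_embedding(text)
        return float(1.0 / (1.0 + np.exp(-emb @ self.weights)))
```

The client team can retrain or redeploy its small model whenever it likes, without ever touching the centralized large model.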

Meta admitted that implementing MultiRay introduced many new challenges around client management, quotas and cost attribution that it had not previously had to solve. Because query size and cache hit rates both affect the energy required to process queries, things like quotas become more complex.
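
To see why quotas get harder, consider a toy cost-attribution function in which the charge for a query depends on its size and on whether the shared embedding was already cached. The formula and constants here are invented purely to make the point concrete.

```python
# Toy cost attribution: a cached query is almost free, an uncached one costs
# roughly in proportion to its length. Constants are illustrative assumptions.
def attributed_cost(num_tokens: int, cache_hit: bool,
                    cost_per_token: float = 1.0,
                    cache_lookup_cost: float = 0.05) -> float:
    """Return the cost charged to a client for one query."""
    if cache_hit:
        return cache_lookup_cost                              # hit: only the lookup
    return cache_lookup_cost + num_tokens * cost_per_token    # miss: full forward pass

# Two clients issuing the same number of queries can consume very different
# amounts of compute, so a simple queries-per-second quota would misprice them.
miss_heavy = sum(attributed_cost(128, hit) for hit in [False] * 100)
hit_heavy = sum(attributed_cost(128, hit) for hit in [True] * 90 + [False] * 10)
print(miss_heavy, hit_heavy)
```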

In addition, splitting the costs of training high-quality MultiRay models only works if each of the models is widely used, so they must all provide state-of-the-art quality across multiple use cases. To ensure this, Meta has had to invest heavily in regular model refreshes and in new model architectures and training flows that reduce its research-to-production time.

Meta didn’t say anything about open-sourcing the code that powers MultiRay, or if it will make it available to other organizations or researchers. Still, Meta has a history of making much of its AI research available to the community, so others might be able to benefit from MultiRay’s capabilities before long.

Image: Freepik
