UPDATED 12:30 EST / AUGUST 26 2024

Startup FuriosaAI debuts RNGD chip for LLM and multimodal AI inference

FuriosaAI Inc., a semiconductor startup that’s laser-focused on artificial intelligence, has unveiled a new accelerator chip it says is geared for large language models and multimodal AI.

Its new chip is called RNGD, pronounced “Renegade,” and it was unveiled today at the Hot Chips 2024 conference at Stanford University. It’s sampling to early-access customers now, with broader availability slated for next year.

According to Furiosa, the RNGD chip is an extremely efficient data center accelerator that’s designed to support high-performance LLMs and multimodal model inference. The company is positioning it as an alternative to Nvidia Corp.’s graphics processing units.

RNGD is based on what the company calls a Tensor Contraction Processor, or TCP, architecture, which it says strikes a strong balance of efficiency, programmability and performance. It boasts some formidable specifications, with a thermal design power of 150 watts, compared with more than 1,000 watts for some of the leading GPUs on the market today.
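
For context, a tensor contraction is essentially a generalized matrix multiplication that sums over shared indices of multidimensional arrays; it’s the core primitive behind attention and feed-forward layers. The short snippet below merely illustrates that mathematical operation with NumPy and says nothing about how Furiosa’s hardware actually schedules it:

```python
# Illustrative only: a tensor contraction expressed with NumPy's einsum.
# This shows the mathematical primitive the "Tensor Contraction Processor"
# name refers to, not how Furiosa's silicon executes it.
import numpy as np

# An attention-style contraction: queries [batch, heads, seq, dim] against
# keys [batch, heads, seq, dim], summing over the shared "dim" axis.
q = np.random.rand(2, 8, 128, 64).astype(np.float32)
k = np.random.rand(2, 8, 128, 64).astype(np.float32)

scores = np.einsum("bhqd,bhkd->bhqk", q, k)
print(scores.shape)  # (2, 8, 128, 128)
```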

Furiosa also touts strong performance. The chip packs 48 gigabytes of high-bandwidth memory, which the company says makes it possible to run open-source LLMs such as Meta Platforms Inc.’s Llama 3.1 8B efficiently on a single card.
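
As a rough sanity check, and to be clear this is our own back-of-envelope arithmetic rather than a Furiosa figure, an 8-billion-parameter model stored at 16-bit precision needs about 16 gigabytes for its weights alone, so 48 gigabytes of HBM leaves substantial headroom for the key-value cache and activations:

```python
# Back-of-envelope arithmetic (ours, not Furiosa's): why an 8B-parameter
# model fits comfortably on a single 48 GB card at 16-bit precision.
params = 8e9                  # Llama 3.1 8B parameter count (approximate)
bytes_per_param = 2           # FP16/BF16 weights
hbm_capacity_gb = 48          # RNGD's stated HBM capacity

weights_gb = params * bytes_per_param / 1e9
headroom_gb = hbm_capacity_gb - weights_gb
print(f"Weights: ~{weights_gb:.0f} GB, headroom: ~{headroom_gb:.0f} GB")
# Weights: ~16 GB, headroom: ~32 GB
```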

The RNGD chip was built on Taiwan Semiconductor Manufacturing Co.’s five-nanometer process. It runs at a frequency of 1 gigahertz and offers 1.5 terabytes per second of memory bandwidth, with 256 megabytes of on-chip static random-access memory and a PCIe Gen5 x16 interconnect that supports roughly 64 gigabytes per second of throughput.
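
That roughly 64-gigabyte-per-second figure follows directly from the PCIe Gen5 specification: 32 gigatransfers per second per lane across 16 lanes, minus the 128b/130b line-encoding overhead. The quick calculation below is our own, not a vendor benchmark:

```python
# Rough arithmetic (not a vendor figure): per-direction bandwidth of a
# PCIe Gen5 x16 link, the source of the ~64 GB/s number.
gt_per_s_per_lane = 32            # PCIe Gen5 signaling rate, GT/s
lanes = 16
encoding_efficiency = 128 / 130   # 128b/130b line encoding

gbits_per_s = gt_per_s_per_lane * lanes * encoding_efficiency
gbytes_per_s = gbits_per_s / 8
print(f"~{gbytes_per_s:.0f} GB/s per direction")  # ~63 GB/s
```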

Programmability is enabled by a “robust compiler” that’s co-designed and optimized for the TCP architecture, treating an entire AI model as a single fused operation. This means RNGD chips can be customized to run almost any LLM or multimodal AI workload, the company said.
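
To make the “single fused operation” idea concrete, the sketch below contrasts op-by-op evaluation, which writes an intermediate buffer after every step, with the same computation written as one chained expression of the kind a fusing compiler lowers to a single kernel. It’s a generic illustration of operator fusion, not Furiosa’s compiler or software stack:

```python
# Generic illustration of operator fusion, not Furiosa's compiler or SDK.
import numpy as np

x = np.random.rand(512, 512).astype(np.float32)
w = np.random.rand(512, 512).astype(np.float32)
b = np.random.rand(512).astype(np.float32)

def op_by_op(x, w, b):
    t1 = x @ w               # intermediate buffer written to memory
    t2 = t1 + b              # another intermediate buffer
    return np.maximum(t2, 0.0)

def fused_form(x, w, b):
    # The same matmul + bias + ReLU chain as one expression. NumPy still
    # evaluates it step by step, but this is the shape of computation a
    # fusing compiler would emit as a single kernel with no intermediates
    # spilled to off-chip memory.
    return np.maximum(x @ w + b, 0.0)

assert np.allclose(op_by_op(x, w, b), fused_form(x, w, b))
```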

What all of these numbers mean is that the Furiosa RNGD chip (pictured) is extremely capable when it comes to running some of the best-known LLMs. Indeed, the startup claims some impressive results on industry-standard benchmarks with models such as EleutherAI’s GPT-J 6B, where it was able to process 15.13 queries per second.

Furiosa has a decent pedigree. It was founded in 2017 by three hardware and software engineers who previously worked for chipmaking giants such as Advanced Micro Devices Inc., Qualcomm Inc. and Samsung Electronics Co. Ltd.

Since its founding, the company has focused on a strategy of rapid iteration and product delivery. Its first-generation chip, known as Warboy, is a high-performance data center accelerator designed specifically for computer vision workloads, and it compares well with some of Nvidia’s older GPU designs in the ResNet-50 image classification and SSD-MobileNetV1 object detection benchmarks.

Furiosa co-founder and Chief Executive June Paik said RNGD is the result of years of innovation at the startup. “RNGD is a sustainable and accessible AI computing solution that meets the industry’s real-world needs for inference,” he said. “With our hardware now running LLMs at full speed, we’re entering an exciting phase of continuous advancement.”

Featured image: SiliconANGLE/Microsoft Designer
