Exploring AI’s deepening role in modern supercomputing — Nvidia, Weka and Run:ai weigh in
The application of artificial intelligence to age-old problems across the technology landscape proves that AI is no longer just a buzzword; it is a transformative force with real-world implications. Supercomputing is one arena plainly witnessing that transformation, where new AI infrastructure developments and ecosystem partnerships highlight the innovation underway.
“We created this reference architecture called Weka AI RAG Reference Platform,” said Shimon Ben-David (pictured, second from left), chief technology officer of WekaIO Inc. “Weka is a high-performance data platform … we are seeing customers still struggling with how to implement RAG inferencing. It has a lot of moving components. Honestly, there’s no real blueprint or protocols defined yet for that. We created this environment that shows all of the layers that are needed. We’re heavily using Run:ai and the Nvidia stack also, the GPUs, but also the software frameworks.”
Ben-David, alongside Ronen Dar (left), co-founder and chief technology officer of Runai Labs USA Inc., and Dion Harris (right), director of accelerated data center GTM at Nvidia Corp., spoke with theCUBE Research’s Savannah Peterson at SC24, during an exclusive broadcast on theCUBE, SiliconANGLE Media’s livestreaming studio. They discussed the three companies joining forces to create cutting-edge AI infrastructure solutions that are more than the sum of their parts. (* Disclosure below.)
Weka’s WARRP redefines AI infrastructure
WARRP has been designed to simplify the implementation of retrieval-augmented generation workflows. While RAG enables enterprises to customize AI models by integrating proprietary data, thus enhancing their relevance and utility, deploying these systems at scale has remained a significant challenge. Underpinned by Nvidia GPUs and Run:ai’s orchestration solutions, the platform provides fertile ground for enterprises to integrate and scale AI effortlessly, according to Harris.
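The retrieve-then-augment pattern behind RAG can be sketched in a few lines. This is a deliberately simplified illustration, not WARRP itself: the keyword-overlap scoring, function names and toy documents are all hypothetical stand-ins for the vector database, GPU inference servers and orchestration layers a production platform would use.

```python
# Illustrative RAG flow: retrieve proprietary context, then augment the
# prompt sent to a model. Scoring and data are hypothetical toy examples.

def retrieve(query: str, documents: list[str], top_k: int = 2) -> list[str]:
    """Rank documents by naive keyword overlap with the query."""
    q_terms = set(query.lower().split())
    scored = sorted(
        documents,
        key=lambda d: len(q_terms & set(d.lower().split())),
        reverse=True,
    )
    return scored[:top_k]

def build_prompt(query: str, context: list[str]) -> str:
    """Augment the user query with the retrieved context."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Context:\n{joined}\n\nQuestion: {query}"

docs = [
    "Weka provides a high-performance data platform.",
    "Run:ai orchestrates GPU workloads on Kubernetes.",
    "Nvidia GPUs accelerate model inference.",
]
query = "Which component orchestrates GPU workloads?"
prompt = build_prompt(query, retrieve(query, docs))
```

In a real deployment, `retrieve` would query an embedding index and `build_prompt` would feed a GPU-served LLM; the point is that the proprietary data reaches the model at inference time rather than through retraining.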
For its part, Nvidia has been building toward AI for nearly two decades, with inroads as early as 2006 through foundational tools such as CUDA. Today, those tools are vital for enterprises as they transition from AI experimentation to large-scale deployment.
“RAG is the way to customize foundational models to incorporate proprietary data or data that you care about and want to be represented in your AI models,” Harris said. “As it relates to Nvidia, we’ve been down this path of trying to be a proponent of AI and help customers adopt AI. [We’re] providing more guidelines, blueprints, templates and APIs that make it easy to plug and play and leverage these tools. Working with Weka and Run:ai is a great example of doing exactly just that.”
Run:ai has emerged as a crucial player in optimizing AI workloads. As organizations deploy open-source large language models to maintain control over data, costs and intellectual property, the platform’s orchestration solutions ensure efficient scaling and GPU utilization. With an emphasis on “tokenomics,” or the economics of AI token production, Run:ai’s tools address the increasing demand for cost-effective and scalable AI operations, according to Dar.
“When you scale your application, GPU utilization, the cost of those LLMs and the cost to serve those LLMs becomes a real problem,” he said. “As we all move forward and LLMs become more and more important, GPU utilization will become more and more important — just increasing GPU utilization and reducing the cost of serving LLMs.”
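Dar's point about utilization and serving cost can be made concrete with back-of-the-envelope arithmetic. All figures below are hypothetical, chosen only to show the relationship: cost per token falls in direct proportion to how much of the GPU's peak throughput an orchestrator actually keeps busy.

```python
# Back-of-the-envelope "tokenomics": GPU utilization vs. cost to serve
# an LLM. Hourly cost and throughput figures are hypothetical.

def cost_per_million_tokens(gpu_hourly_cost: float,
                            peak_tokens_per_sec: float,
                            utilization: float) -> float:
    """Dollars per one million generated tokens at a given utilization."""
    effective_rate = peak_tokens_per_sec * utilization  # tokens/sec actually served
    tokens_per_hour = effective_rate * 3600
    return gpu_hourly_cost / tokens_per_hour * 1_000_000

# Same GPU, same workload; only the scheduler's achieved utilization differs.
idle_heavy = cost_per_million_tokens(gpu_hourly_cost=4.0,
                                     peak_tokens_per_sec=5000,
                                     utilization=0.25)
well_packed = cost_per_million_tokens(gpu_hourly_cost=4.0,
                                      peak_tokens_per_sec=5000,
                                      utilization=0.85)
# Raising utilization from 25% to 85% cuts cost per token by 3.4x.
```

This is the economic argument for workload orchestration: the hardware bill is fixed per hour, so every idle GPU cycle inflates the effective price of each token served.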
Here’s the complete video interview, part of SiliconANGLE’s and theCUBE Research’s coverage of SC24:
(* Disclosure: WekaIO Inc. sponsored this segment of theCUBE. Neither Weka nor other sponsors have editorial control over content on theCUBE or SiliconANGLE.)
Photo: SiliconANGLE