UPDATED 14:00 EST / NOVEMBER 18 2024

Jonathan Martin, president of WekaIO, and Shimon Ben-David, CTO of WekaIO, talk to theCUBE about enterprise AI during a CUBE Conversation.

Unlocking AI’s potential: Inference, scalability and sustainability in focus

Artificial intelligence is reshaping the business landscape, with enterprise AI driving a shift from experimental models to actionable insights.

As organizations transition from training large language models to deploying AI in real-world scenarios, the focus has turned to unlocking proprietary data’s value, optimizing inference and addressing critical challenges, such as scalability, performance density and sustainability, according to Shimon Ben-David (pictured, left), chief technology officer of WekaIO Inc.

“We’re future-proofing the environment by looking at where customers are actually going to be in the next two, three, four years and making sure that Weka can actually accommodate that,” Ben-David said. “We’re seeing how these large-scale AI projects in production, which are a year or two ahead of the market, even are doing things. We take that and we help customers accommodate for their new environments using that experience.”

Ben-David and Jonathan Martin (right), president of WekaIO, spoke with theCUBE Research’s Dave Vellante for a special CUBE Conversation about AI’s second wave, as part of theCUBE’s SC24 pre-event coverage. They discussed how enterprise AI is transitioning from training LLMs to deploying scalable, sustainable solutions for real-world applications and what’s next for AI in 2025. (* Disclosure below.)

From training to inference: Navigating the next wave

As AI continues to advance, enterprises are shifting their focus from the large-scale training of LLMs to inference. The accessibility of pre-trained LLMs is lowering the barriers to adoption, enabling organizations to implement AI at scale, according to Ben-David.

“In the past you had to train your models and you had to have a whole practice of data scientists around it,” he said. “Today, it’s very easy to just take models, existing models, pre-trained LLMs and run them in your environment. Enterprises that will implement it are looking for an outcome and we’re seeing that going through 2025, they will actually look for a better ROI on their investment because they need to benefit from these LLMs and they need to do it in a way that actually allows them to get more revenue than not doing it.”

The rise of retrieval-augmented generation further illustrates the challenges enterprises face. While promising, RAG pipelines require significant know-how to transition from proof-of-concept to secure, scalable deployment. Companies such as Nvidia Corp. are streamlining these processes with tools such as NeMo, but the lack of established blueprints keeps organizations in an exploratory phase, according to Ben-David.

“The way to do it is enterprises are now exploring whether to continue using or start using inferencing services as a service in cloud environments or maybe build their own enterprise inferencing environment in a GPU cloud or on-prem,” he said. “If they’re actually going to do it, how are they going to do it? We’re seeing Nvidia dominating that market. We’re seeing that trend increasing with maybe additional players coming into play.”

Looking ahead: Enterprise AI and its transformative potential

The growing volume of enterprise data is driving the need for exascale infrastructure: computing capable of at least a quintillion calculations per second, and storage measured in exabytes. While this scale was once confined to supercomputing labs, it is now becoming a necessity for large corporations, Martin explained.

“This is the year where exascale has become very real,” he said. “At the start of this year, we had no customers that were over an exabyte of storage capacity. We’ll end this year with five customers over an exabyte and one customer almost 10 exabytes.”

Sustainability also looms as a critical concern. Training AI models consumes immense resources, with figures such as the reported $100 million cost of training GPT-4 hinting at the energy involved and the environmental impact that comes with it. It is important to balance enterprise AI advancements with sustainability goals, Martin stressed. By leveraging software optimization and cloud-based scaling, Weka aims to make AI deployment more energy-efficient and environmentally responsible.

“Every time you’re on one of these new generative sites and you’re typing in a prompt to create an image, every time that image gets created, it’s the same power consumed as a full charge of an iPhone,” he said. “It’s maybe no surprise that people are beginning to wake up that while AI may solve the sustainability and environmental challenges on the planet, it’s probably also equally possible that it’s going to be the thing that melts it.”

The trajectory of enterprise AI suggests that organizations embracing AI-native frameworks will lead the market in the coming years. Companies will integrate diverse models, both general and specialized, into orchestrated systems to generate actionable insights, Martin predicted. Enterprises that build strong data pipelines and refine their inferencing capabilities will distinguish themselves as industry leaders.

“I think for the more progressive customers that we’re working with that is the nut that they’re trying to crack. If they can crack that, then I think you’re going to find in just a few years there is going to be two types of companies on the planet,” Martin said. “There’s going to be companies that are AI native at the core that have built a solid data pipeline, and they are industrializing the process of ingesting large volumes of data and transforming that data through an ensemble of models into tokens and into insights.”

The complete video interview is part of SiliconANGLE’s and theCUBE Research’s coverage of SC24.

(* Disclosure: WekaIO Inc. sponsored this segment of theCUBE. Neither WekaIO nor other sponsors have editorial control over content on theCUBE or SiliconANGLE.)

Photo: SiliconANGLE
