

Data is a fast-moving field, and expanding data needs demand rapid innovation in compute, networking and analysis. WekaIO Inc., with its accelerated data platform designed for artificial intelligence and high-performance computing workloads, offers an environment that eliminates traditional trade-offs in storage performance.
“We built Weka to create an environment, a file system at the core that has no compromises,” said Shimon Ben-David (pictured), chief technology officer of WekaIO. “Looking back, we saw that there were multiple storage environments, [and] it was a game of compromises. Do I want a high volume of files? Do I want a high amount of small files at low latency? We created Weka to service the new AI workloads in a very efficient manner to ensure you can get the most utilization out of your high-performance compute [graphics processing unit] environments.”
Ben-David spoke with theCUBE’s Rob Strechay for the Tech Innovation CUBEd Awards 2025 interview series, during an exclusive broadcast on theCUBE, SiliconANGLE Media’s livestreaming studio. They discussed how Weka is rethinking storage performance from the ground up to create a solution that’s faster, more efficient, scalable and future-proof.
In recognition of its efforts in data storage, specifically for AI, Weka received a CUBEd award for “Most Innovative Data Platform.” At its core is a distributed, shared file system that functions as an extension of local GPU memory. This allows AI and high-performance computing environments to maximize their computing power without bottlenecks, enhancing efficiency and speed across GPU-based workloads. By eliminating the need to move data manually between different storage environments, the platform ensures smooth, high-speed access to massive datasets, according to Ben-David.
“We see customers copying data to the local GPU servers [Non-Volatile Memory Express] and working with it,” he said. “Imagine Weka is that local NVMe, only faster and distributed and shared across all of your GPU environments. So now all of your GPU servers are working with a shared, distributed and protected Weka environment that’s faster than their local NVMe.”
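The difference Ben-David describes can be pictured in a short sketch. Below is a minimal, hypothetical Python example (the mount points and file layout are illustrative, not Weka specifics): in the staging pattern, every GPU server copies the dataset to its local NVMe before reading it; in the shared pattern, every node reads the same POSIX mount directly and the copy step disappears.

```python
import os

# Hypothetical paths: a shared file-system mount and a per-node NVMe scratch disk.
SHARED_MOUNT = "/mnt/weka/datasets/train"   # one copy, visible to every GPU server
LOCAL_NVME = "/mnt/nvme/datasets/train"     # per-node copy the staging pattern needs

def staged_read(filename: str) -> bytes:
    """Traditional pattern: stage the file onto local NVMe, then read it locally."""
    src = os.path.join(SHARED_MOUNT, filename)
    dst = os.path.join(LOCAL_NVME, filename)
    if not os.path.exists(dst):             # staging step every node repeats
        os.makedirs(LOCAL_NVME, exist_ok=True)
        with open(src, "rb") as s, open(dst, "wb") as d:
            d.write(s.read())
    with open(dst, "rb") as f:
        return f.read()

def shared_read(filename: str) -> bytes:
    """Pattern Ben-David describes: read the shared mount directly, no staging."""
    with open(os.path.join(SHARED_MOUNT, filename), "rb") as f:
        return f.read()
```

The sketch ignores throughput and caching details; the point is simply that the shared approach removes the per-node copy, which is the manual data movement the platform is said to eliminate.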
One of Weka’s key innovations is in AI inferencing, where it drastically improves the efficiency of large-scale AI deployments. AI inferencing demands immense computing power and memory bandwidth. Weka tackled this challenge by developing a distributed key-value cache system, leveraging its low-latency, high-performance platform as an extension of GPU memory, according to Ben-David.
“Inferencing is a memory-limited challenge, and GPUs are limited in the amount of memory and … in the number of tokens per second that they can create and the value that they can provide,” he said. “What we did is [that] we hypothesized how to use Weka as an extension to the GPU memory as the next stage, the next tier of the GPU memory — which is something that we imagined that we could do because of the Weka low latency, high-performance environment capabilities.”
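The interview doesn't detail Weka's implementation, but the general idea of tiering an inference key-value cache out of GPU memory can be sketched. Here is a minimal, hypothetical Python example (the class and method names are illustrative, not Weka's API): when the GPU-resident tier fills, the least recently used entries spill to a larger shared tier and are promoted back on reuse.

```python
from collections import OrderedDict

class TieredKVCache:
    """Toy two-tier key-value cache: a small 'GPU' tier backed by a larger
    shared tier, standing in for the GPU-memory-plus-shared-storage
    arrangement described in the article. Values are plain bytes here; in a
    real inference server they would be attention key/value tensors."""

    def __init__(self, gpu_capacity: int):
        self.gpu_capacity = gpu_capacity           # max entries held "on GPU"
        self.gpu_tier: OrderedDict[str, bytes] = OrderedDict()
        self.shared_tier: dict[str, bytes] = {}    # stand-in for the shared store

    def put(self, key: str, value: bytes) -> None:
        self.gpu_tier[key] = value
        self.gpu_tier.move_to_end(key)             # mark as most recently used
        while len(self.gpu_tier) > self.gpu_capacity:
            # Spill the least recently used entry to the shared tier instead
            # of discarding it and recomputing later on the GPU.
            old_key, old_value = self.gpu_tier.popitem(last=False)
            self.shared_tier[old_key] = old_value

    def get(self, key: str) -> bytes | None:
        if key in self.gpu_tier:
            self.gpu_tier.move_to_end(key)         # refresh recency
            return self.gpu_tier[key]
        if key in self.shared_tier:
            # Promote back into the GPU tier on reuse (may trigger a spill).
            value = self.shared_tier.pop(key)
            self.put(key, value)
            return value
        return None                                # miss: recompute upstream
```

The win is that a hit in the shared tier avoids recomputing that portion of the cache on the GPU; the sketch deliberately omits the tensor layouts, latency budgets and networking a production system would need.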
Here’s the complete video interview, part of SiliconANGLE’s and theCUBE Research’s coverage for the Tech Innovation CUBEd Awards 2025 interview series: