UPDATED 09:00 EDT / OCTOBER 16 2024

Big-data startup UltiHash unveils advanced deduplication to boost AI storage efficiency

Big-data startup UltiHash GmbH is looking to disrupt the artificial intelligence storage market with the launch of a “unified storage layer” that combines a new and more sophisticated deduplication algorithm with the scalability of data lakes and the querying power of data warehouses.

UltiHash reckons its unified storage layer is essential for tackling some of the most pressing challenges around AI storage, which is driving up infrastructure costs and having a significant environmental impact.

The company cites a Forbes report projecting that the market for AI will grow to more than $407 billion by 2027. As the AI industry grows, so does the need for storage, as training AI requires vast amounts of information that must be processed on the highest-performing data infrastructure.

According to UltiHash, storing this data is both expensive and resource-intensive, yet it’s essential, acting as a kind of “gas tank” for AI. Existing gas tanks are highly inefficient, the company says, hence the need for something superior.

The most compelling feature of UltiHash’s platform is its new deduplication algorithm, which the startup claims can cut data volumes by up to 60% in some cases by analyzing and eliminating redundant data at the byte level, easing what it calls “data bloat.”

UltiHash co-founder and Chief Executive Tom Lüdersdorf told SiliconANGLE that the company takes a different approach to deduplication that’s focused on ensuring minimal overheads on writes, while maintaining peak performance on reads, as opposed to traditional techniques that prioritize space savings.

“Our approach fragments files into variable-sized binary fragments, allowing us to find similarities of files across the entire storage clusters,” Lüdersdorf explained. “It allows us to limit the computing tradeoffs to write operations, which occur less frequently than reads. This allows for high-throughput data handling, especially on reading data, without the tradeoffs seen in traditional deduplication methods that use compression techniques that slow down data retrieval.”
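
The fragment-and-match idea Lüdersdorf describes resembles content-defined chunking, a widely used deduplication technique. What follows is a minimal Python sketch of that general technique, not UltiHash's actual algorithm, which the article does not detail. The window size, boundary mask, chunk limits and the names `split_chunks` and `DedupStore` are all assumptions made for illustration.

```python
import hashlib
from typing import Dict, List

# All constants below are illustrative assumptions, not UltiHash parameters.
WINDOW = 16      # number of trailing bytes hashed to decide on a boundary
MASK = 0x3F      # cut when the low 6 bits of the hash are zero
MIN_CHUNK = 16   # never cut a fragment shorter than this
MAX_CHUNK = 256  # always cut once a fragment reaches this size


def split_chunks(data: bytes) -> List[bytes]:
    """Split data into variable-sized fragments at content-defined boundaries."""
    chunks = []
    start = 0
    for i in range(len(data)):
        size = i - start + 1
        if size < MIN_CHUNK:
            continue
        # The boundary decision depends only on the last WINDOW bytes,
        # so inserting bytes early in a file does not shift later boundaries.
        # (A production system would use a rolling hash for speed.)
        window = data[i - WINDOW + 1 : i + 1]
        h = int.from_bytes(hashlib.blake2b(window, digest_size=4).digest(), "big")
        if (h & MASK) == 0 or size >= MAX_CHUNK:
            chunks.append(data[start : i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])
    return chunks


class DedupStore:
    """Hypothetical store that keeps each unique fragment exactly once."""

    def __init__(self) -> None:
        self.fragments: Dict[str, bytes] = {}   # digest -> fragment bytes
        self.files: Dict[str, List[str]] = {}   # file name -> ordered digests

    def put(self, name: str, data: bytes) -> None:
        # Chunking and hashing happen on the write path only.
        recipe = []
        for chunk in split_chunks(data):
            digest = hashlib.sha256(chunk).hexdigest()
            self.fragments.setdefault(digest, chunk)
            recipe.append(digest)
        self.files[name] = recipe

    def get(self, name: str) -> bytes:
        # Reads just concatenate stored fragments; nothing is decompressed.
        return b"".join(self.fragments[d] for d in self.files[name])

    def stored_bytes(self) -> int:
        return sum(len(c) for c in self.fragments.values())
```

In this sketch the extra work of chunking and hashing lands entirely in `put`, while `get` is a sequence of lookups, which mirrors the write-versus-read tradeoff described in the quote. Two files that share long byte runs, even at different offsets, end up referencing many of the same fragments.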

According to Lüdersdorf, this novel deduplication technique enables UltiHash to deliver a vastly superior level of performance in terms of read and write operations, with a 250% boost in read speeds compared to Amazon Web Services Inc.’s S3 storage service. Moreover, because the platform possesses the capabilities of both data lakes and data warehouses, it can handle a more diverse range of data types and formats, while scaling to support petabytes of information.

The platform also happens to be AWS S3-compatible, thanks to an application programming interface that ensures smooth integration with that service. It boasts a “Kubernetes-native design,” which makes it easy to integrate with existing cloud and on-premises data architectures.

“This makes it highly flexible and deployable across various environments,” Lüdersdorf said. “It can be deployed on cloud platforms like AWS using AWS EKS, as well as in private or colocation clouds.”

Finally, UltiHash provides integrated “erasure coding” capabilities that protect data from hardware failures, ensuring it can be recovered quickly and easily in the event of any system disruptions.

“In the event of partial outages, such as a hard drive failure, erasure coding enables the system to reconstruct lost data from remaining data fragments,” Lüdersdorf explained. “This ensures continuity and data integrity, providing a robust layer of protection that keeps data accessible even in the face of hardware failures.”
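
To make the reconstruction step concrete, here is a deliberately simplified Python sketch of erasure coding using a single XOR parity shard, which can survive the loss of any one shard. This is an assumption chosen for brevity: the article does not say which scheme UltiHash uses, and production systems typically rely on codes such as Reed-Solomon that tolerate multiple failures. The function names are hypothetical.

```python
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, k: int) -> List[bytes]:
    """Split data into k equal-size data shards plus one XOR parity shard."""
    shard_len = -(-len(data) // k)  # ceiling division
    padded = data.ljust(shard_len * k, b"\0")
    shards = [padded[i * shard_len : (i + 1) * shard_len] for i in range(k)]
    parity = bytes(shard_len)
    for shard in shards:
        parity = xor_bytes(parity, shard)
    return shards + [parity]


def reconstruct(shards: List[Optional[bytes]]) -> List[bytes]:
    """Rebuild at most one missing shard (marked None) from the survivors."""
    missing = [i for i, s in enumerate(shards) if s is None]
    if len(missing) > 1:
        raise ValueError("single-parity coding can recover only one lost shard")
    result = list(shards)
    if missing:
        survivors = [s for s in shards if s is not None]
        rebuilt = bytes(len(survivors[0]))
        for s in survivors:
            rebuilt = xor_bytes(rebuilt, s)
        result[missing[0]] = rebuilt
    return result


def decode(shards: List[Optional[bytes]], original_len: int) -> bytes:
    """Return the original data, repairing one lost shard if necessary."""
    full = reconstruct(shards)
    return b"".join(full[:-1])[:original_len]
```

If each shard lives on a different drive, losing one drive is equivalent to setting one entry to `None`; XOR-ing the surviving shards yields the lost one, so the data stays readable while the drive is replaced.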

Analyst Steve McDowell of NAND Research Inc. told SiliconANGLE that UltiHash is interesting because it’s targeting one of the most important segments of the storage industry, with object storage becoming the de facto format for the vast majority of cloud-native, AI and data analytics workloads.

The analyst said UltiHash’s approach is quite different from most object storage solutions, as it brings a number of features that are more closely associated with traditional block and file storage.

“Chief among these capabilities is its deduplication features, which promises about a 2:1 compression ratio,” McDowell said. “This is less than the dedupe you’d expect from a filer, but better than you see from most object storage solutions. It’s a nice differentiator that can bring cost savings to cloud customers as their unstructured data grows.”
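
The article quotes two savings figures in different units: the company's "up to 60%" reduction and McDowell's "about 2:1" ratio. The short Python sketch below converts between the two so they can be compared. The function names are hypothetical, and the comparison assumes both figures describe the same quantity (stored size relative to original size), which the article does not confirm.

```python
def reduction_to_ratio(reduction: float) -> float:
    """Convert a fractional size reduction (0.6 for 60%) to an N:1 ratio."""
    if not 0 <= reduction < 1:
        raise ValueError("reduction must be in [0, 1)")
    return 1.0 / (1.0 - reduction)


def ratio_to_reduction(ratio: float) -> float:
    """Convert an N:1 ratio (2.0 for 2:1) to a fractional size reduction."""
    if ratio < 1:
        raise ValueError("ratio must be at least 1")
    return 1.0 - 1.0 / ratio


# The two figures quoted in the article, expressed in each other's units.
best_case_ratio = reduction_to_ratio(0.60)   # "up to 60%" as an N:1 ratio
typical_reduction = ratio_to_reduction(2.0)  # "about 2:1" as a fraction
```

Under that assumption, a 60% reduction corresponds to a 2.5:1 ratio and a 2:1 ratio to a 50% reduction, so the analyst's figure is consistent with the vendor's number being a best case.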

Those cost savings may be offset by a small impact on write performance, McDowell said, noting that the company doesn’t provide any specific numbers for its write speeds, focusing instead on its read performance. But he said this won’t be a problem for most enterprises, as object storage workloads typically read data much more often than they write it. He added that the startup has clearly put a lot of effort into optimizing its caching and metadata architecture to deliver its superior read performance.

“This is exactly what you want for AI and analytics workflows where time-to-value from data trumps most other considerations,” McDowell said. “It looks like a nice offering targeting a growing market, though UltiHash is competing in a crowded place, with various other compelling software-defined object storage solutions. The challenge will be showing enough differentiation to drive adoption. But its dedupe capabilities and the read performance make it worth evaluating.”

With its storage layer becoming generally available today, the ambitious company said it’s looking to engage with clients in industries spanning AI, telecommunications, manufacturing, automotive, research and more, so it can refine its offering for different use cases. That said, AI remains the company’s major area of focus.

“The AI revolution is generating data at an unprecedented rate and traditional storage solutions are struggling to keep pace,” Lüdersdorf said. “The future of storage will make it possible to avoid ballooning data costs without compromising on speed.”

Image: SiliconANGLE/Microsoft Designer
