UPDATED 11:30 EDT / MARCH 26 2024

BIG DATA

Activeloop raises $11M to grow its specialized tensor database for AI training and inference

Activeloop, creator of a database platform that’s designed to cater specifically to artificial intelligence workloads, said today it has closed on an $11 million early-stage funding round, bringing its total amount raised to about $20 million.

Today’s Series A round was led by Streamlined Ventures and saw participation from Y Combinator, Samsung Next, Alumni Ventures and Dispersion Capital, the company said.

The startup, which is officially known as Snark AI Inc., has created a specialized database called Deep Lake that’s designed to streamline the flow of unstructured information, such as audio, video, image, text files and embeddings, into machine learning and large language models. It also provides data storage and knowledge retrieval capabilities for managing complex datasets for AI. Deep Lake is an open source platform that has been downloaded more than 1 million times.

Activeloop founder and Chief Executive Davit Buniatyan told SiliconANGLE that Deep Lake tackles the problem of unlocking multimodal data for AI, which is largely inaccessible to traditional databases. It does this by storing unstructured data files in what’s known as a “tensor format,” a machine learning-native mathematical representation that makes this information readily available to AI algorithms.
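To make the idea of a “tensor format” concrete, here is a minimal sketch in plain Python. It is not Deep Lake’s actual storage format or API; the `Tensor` class and the `image_to_tensor` and `text_to_tensor` helpers are hypothetical names invented for illustration. The sketch assumes only that a tensor is a block of numbers plus a shape, and shows how an image and a piece of text can both be reduced to that same representation.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Tensor:
    """Hypothetical minimal tensor: a flat list of numbers plus a shape."""
    shape: Tuple[int, ...]
    data: List[float]

    def __post_init__(self) -> None:
        # The number of stored values must match the product of the dimensions.
        expected = 1
        for dim in self.shape:
            expected *= dim
        if expected != len(self.data):
            raise ValueError(f"shape {self.shape} needs {expected} values, got {len(self.data)}")


def image_to_tensor(pixels: Sequence[Sequence[Tuple[int, int, int]]]) -> Tensor:
    """Flatten rows of RGB pixels into a (height, width, 3) tensor."""
    height, width = len(pixels), len(pixels[0])
    flat = [float(channel) for row in pixels for pixel in row for channel in pixel]
    return Tensor((height, width, 3), flat)


def text_to_tensor(text: str) -> Tensor:
    """Encode text as a one-dimensional tensor of UTF-8 byte values."""
    codes = [float(byte) for byte in text.encode("utf-8")]
    return Tensor((len(codes),), codes)


# Two different modalities end up in the same numeric representation.
image = image_to_tensor([[(255, 0, 0), (0, 255, 0)],
                         [(0, 0, 255), (255, 255, 255)]])
caption = text_to_tensor("cat")
```

Once both samples share this form, the same tooling can slice, filter and batch them, which is the property the article attributes to tensor storage.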

“This format also enables users to query complex data just like querying a structured table with SQL,” the CEO said. “Activeloop provides a way to visualize large datasets, manage dataset versions like Git, and query them with a SQL-like Tensor Query Language.”

Buniatyan said Activeloop has also developed a faster data loader that streams information efficiently to graphics processing units, shortening AI model training. He said this is a key innovation because the enormous datasets required for AI training cannot fit into GPU memory all at once.

“Instead, what companies sometimes do is they physically copy the data from its storage to the location of the GPU, which for 100 gigabytes could be hours of idle GPU time,” Buniatyan explained. “Instead, Activeloop enables companies to hand off just enough data to compute for [the GPU] to be fully utilized.”
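The streaming approach Buniatyan describes can be sketched in a few lines of Python. This is an illustration of the general technique, not Activeloop’s loader: `stream_batches`, `fetch_sample` and `fake_remote_fetch` are hypothetical names, and a background thread with a small bounded queue stands in for whatever prefetching the real product does. The point is that only a few batches are held in memory at any time, rather than the whole dataset being copied first.

```python
import queue
import threading
from typing import Callable, Iterator, List


def stream_batches(
    fetch_sample: Callable[[int], bytes],
    num_samples: int,
    batch_size: int,
    prefetch: int = 2,
) -> Iterator[List[bytes]]:
    """Yield batches while a background thread fetches the next ones.

    At most `prefetch` finished batches wait in the queue, so memory use
    stays bounded no matter how large the dataset is.
    """
    buffer: queue.Queue = queue.Queue(maxsize=prefetch)
    done = object()  # sentinel marking the end of the stream

    def producer() -> None:
        try:
            batch: List[bytes] = []
            for index in range(num_samples):
                batch.append(fetch_sample(index))
                if len(batch) == batch_size:
                    buffer.put(batch)  # blocks while the queue is full
                    batch = []
            if batch:
                buffer.put(batch)  # final partial batch
            buffer.put(done)
        except Exception as exc:
            buffer.put(exc)  # hand fetch errors to the consumer

    threading.Thread(target=producer, daemon=True).start()

    while True:
        item = buffer.get()
        if item is done:
            return
        if isinstance(item, Exception):
            raise item
        yield item


def fake_remote_fetch(index: int) -> bytes:
    """Stand-in for reading one sample from remote storage."""
    return f"sample-{index}".encode()


batches = list(stream_batches(fake_remote_fetch, num_samples=5, batch_size=2))
```

In a real training loop the consumer would pass each batch to the GPU as it arrives instead of collecting them in a list, so fetching the next batch overlaps with computing on the current one.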

Another innovation developed by Activeloop is its Tensor Query Language, which enables companies to iterate much faster on the unstructured data they have collected. Buniatyan said these fast iteration cycles are the key to rapid AI deployments.

“In a nutshell, Activeloop offers the advantage of a traditional data lake with a crucial distinction: it stores complex data in the form of tensors, facilitating rapid streaming of data to Tensor Query Language and an in-browser visualization engine, without sacrificing GPU utilization,” Buniatyan said.

The startup says it has seen rapid adoption of its database platform among Fortune 500 customers in highly regulated industries such as biopharma, life sciences, medical tech, automotive and legal. One of its earliest adopters is Bayer Radiology, a subsidiary of the pharmaceutical giant Bayer AG, which is using Deep Lake to train and fine-tune LLMs and deep learning algorithms using retrieval-augmented generation techniques.

Steffen Vogler, principal imaging technology scientist at Bayer Radiology, explained that Activeloop’s technology helped the company to get around the time-consuming process of preparing data for AI models. Previously, its developers were forced to tangle with complex multimodal data subsets, control data versions and continually integrate new data as it became available.

Deep Lake allowed Bayer’s team to unify these different data modalities into a single data source, significantly reducing data pre-processing times. “It’s next-level,” Vogler said. “We’ve enabled a new human-machine interface that is natural to use and yields high-accuracy results for end-users.”

Activeloop said its database can enhance data retrieval accuracy and reduce LLM errors by as much as 22% compared with other types of databases. What’s more, its AI-native embedded architecture enables it to be set up in on-premises environments with just a few lines of code, making it ideal for enterprises seeking to use confidential data for generative AI workloads.

Streamlined Ventures General Partner Ullas Naik said enterprises are quickly catching on that the only way to unlock the value within their complex data is to use systems such as Activeloop’s Deep Lake. “Given their solid track record, we trust the team to execute its vision and are excited to invest again,” he said.

Activeloop said the funds will be used to onboard more enterprise customers to its database for AI and hire more staff to expand its engineering team.

Images: Activeloop
