UPDATED 16:52 EDT / MAY 15 2024

LanceDB raises $8M to speed up AI models with its open-source vector database

LanceDB Inc., the developer of a database optimized for artificial intelligence models, today disclosed that it has raised an $8 million seed round.

CRV led the investment with participation from Essence VC and Swift Ventures. According to LanceDB, its outside funding now stands at $11 million.

Before processing data, an AI model turns it into mathematical objects known as vector embeddings, or vectors for short. Those objects make it possible to represent individual pieces of information as points on a kind of map. If two information snippets are connected to one another in some way, such as because they describe the same topic, the points that represent them will be located close together on the map.
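The "map" idea can be made concrete with a small sketch. The Python snippet below is only an illustration and is not taken from LanceDB: the three-dimensional vectors are made-up values rather than output from a real embedding model, and the helper names `cosine_similarity` and `most_similar` are hypothetical. It uses cosine similarity, one common way to measure how close two vectors are.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings: real models emit hundreds or thousands of dimensions.
embeddings = {
    "vector database funding": [0.9, 0.1, 0.0],
    "startup raises seed round": [0.8, 0.2, 0.1],
    "recipe for banana bread": [0.0, 0.1, 0.9],
}

def most_similar(query, items):
    """Return the key whose vector lies closest to the query's vector."""
    query_vec = items[query]
    candidates = [key for key in items if key != query]
    return max(candidates, key=lambda key: cosine_similarity(query_vec, items[key]))

print(most_similar("vector database funding", embeddings))
```

In this toy data, the two business-related snippets point in nearly the same direction, so they score as close neighbors, while the unrelated recipe snippet scores near zero.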

AI models’ use of vectors is one of the reasons they can perform complex reasoning tasks. Vectors’ ability to capture information about data relationships, such as similarity between snippets of text, makes it easier for neural networks to draw conclusions. But there are also tradeoffs: Information stored in this format can be challenging to manage with traditional databases.

San Francisco-based LanceDB offers an open-source database, also called LanceDB, that is specifically geared toward storing vectors. The company says the software can hold billions of vectors for AI applications. Moreover, LanceDB promises to boost those applications’ performance.

After companies turn a dataset into vectors to let AI models process it, they typically don’t discard the original information but rather store it for later use. Such raw information usually has to be kept in a separate system. LanceDB says its database can store vectors and the raw files that were used to generate them in one place, which simplifies data management tasks.

LanceDB keeps the information it holds in a custom file format called Lance. The technology can be used to store not only vectors but also raw data such as text, images and videos. LanceDB says that Lance allows AI models to retrieve information up to 100 times faster than with Parquet, a file format commonly used in machine learning projects.

The company is also promising other benefits. Lance includes a built-in versioning tool, which makes it easier to manage the different versions of a record that an AI model generates while processing it. Another time-saving feature allows developers to turn Parquet files into the format with two lines of code.

LanceDB’s namesake database combines Lance’s performance optimizations and versioning features with a number of other capabilities. According to the company, the software provides integrations with popular data science tools from the open-source ecosystem. It also enables developers to interact with their data using multiple programming languages including Python.

“LanceDB is able to deliver unparalleled scalability for semantic search using an order of magnitude less infrastructure than vector databases,” LanceDB co-founder and Chief Executive Chang She detailed in a blog post. “It supports interactive data exploration on petabyte-scale AI data. And it drastically reduces the cost of managing multimodal datasets for training and fine tuning.”

The seed round LanceDB announced today will enable the company to hire more employees as it gears up to launch its paid offering, LanceDB Cloud, into general availability. It’s a managed version of the database that removes the need for customers to maintain the underlying infrastructure. The company also offers a second paid edition, LanceDB Enterprise, that provides additional capabilities including an enhanced set of cybersecurity features.

Image: Unsplash
