UPDATED 13:46 EDT / APRIL 06 2023


Chroma bags $18M to speed up AI models with its embedding database


Database provider Chroma Inc. today announced that it has raised $18 million in seed funding to accelerate its growth and expansion plans.

The investment was led by Quiet Capital. Executives from Hugging Face Inc. and more than a half-dozen other tech companies contributed as well.

San Francisco-based Chroma is led by co-founder and Chief Executive Officer Jeff Huber. (Note: In a previous version of this story, SiliconANGLE mistakenly identified Huber as the Jeff Huber who previously worked at Google LLC.) Huber was previously a co-founder at Standard Cyborg, which built computer vision systems.

Chroma develops an open-source database that is specifically designed to power artificial intelligence applications. Since its release less than two months ago, the database has been downloaded more than 35,000 times.

AI models have a data bank that they draw on to make decisions. A shopping recommendation model, for example, may maintain a database of the latest product listings. Neural networks built for cybersecurity tasks store information about hacker activity.

AI models don’t store their data in its raw form but rather as abstract mathematical structures called vectors. Such a vector representation of a piece of data is known as an embedding. Chroma’s open-source database, which is also called Chroma, is specifically built to store AI models’ embeddings.
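At its simplest, an embedding database maps document identifiers to vectors and answers "which stored items are closest to this query vector?" The toy sketch below illustrates that idea in plain Python; it is purely illustrative and is not Chroma's actual API.

```python
import math

# Toy in-memory "embedding store": maps document IDs to vectors, loosely
# mirroring what a purpose-built embedding database manages at scale.
# (Illustrative only -- not Chroma's real interface.)
store = {}

def add(doc_id, vector):
    store[doc_id] = vector

def query(vector, n_results=1):
    """Return the IDs of the n_results stored vectors closest to the query."""
    ranked = sorted(store, key=lambda doc_id: math.dist(vector, store[doc_id]))
    return ranked[:n_results]

add("doc-a", [0.1, 0.9])
add("doc-b", [0.8, 0.2])
add("doc-c", [0.15, 0.85])

print(query([0.12, 0.88], n_results=2))  # the two nearest: doc-a, doc-c
```

A real system replaces the linear scan with an approximate nearest-neighbor index so queries stay fast over millions of vectors.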

“Developers use Chroma to give LLMs pluggable knowledge about their data, facts, tools, and prevent hallucinations,” Huber and his co-founder Anton Troynikov wrote in a blog post today. “Many developers have said they want ‘ChatGPT but for my data’ — and Chroma provides the ‘for my data’ bridge through embedding-based document retrieval.”

One of the most important features of embeddings is their ability to highlight similarities between the data points they represent. Embeddings of handsets, for example, can reveal which handsets have a similar price. The opposite is also true: they can surface handsets with a significant price difference.
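Similarity between embeddings is typically measured with a metric such as cosine similarity, where values near 1.0 indicate similar items and values near 0 indicate dissimilar ones. The vectors below are made up for illustration; real embeddings have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: near 1.0 means similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hypothetical handset embeddings where nearby vectors imply similar prices.
budget_phone = [0.9, 0.1]
budget_phone_2 = [0.85, 0.2]
flagship_phone = [0.1, 0.95]

print(cosine_similarity(budget_phone, budget_phone_2))  # high: similar price
print(cosine_similarity(budget_phone, flagship_phone))  # low: big price gap
```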

Embeddings’ ability to highlight similarity between data points is essential to the functioning of AI models. Recommendation models, for example, generate shopping suggestions by analyzing what items a user has bought in the past and finding similar merchandise. Neural networks that detect malware look for network activity that resembles known hacking tactics.

Embeddings also make it possible to detect when two items are dissimilar, which is likewise useful for AI applications. In the cybersecurity market, some AI-powered breach prevention tools work by mapping out how customers typically interact with an application. Such tools then look for activity that is dissimilar to a customer’s usual access patterns.
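One common way to operationalize "dissimilar to usual behavior" is to summarize past activity as a centroid vector and flag anything that lands too far from it. The sketch below uses made-up activity vectors and an arbitrary threshold; it shows the shape of the idea, not any vendor's actual detection logic.

```python
import math

def centroid(vectors):
    """Average the vectors dimension by dimension."""
    return [sum(dim) / len(vectors) for dim in zip(*vectors)]

def is_anomalous(activity, baseline, threshold=0.5):
    """Flag activity whose distance from the usual pattern exceeds a threshold."""
    return math.dist(activity, baseline) > threshold

# Hypothetical embeddings of a customer's past interactions with an app.
usual_activity = [[0.2, 0.1], [0.25, 0.12], [0.18, 0.09]]
baseline = centroid(usual_activity)

print(is_anomalous([0.22, 0.11], baseline))  # close to the baseline
print(is_anomalous([0.9, 0.8], baseline))    # far from usual access patterns
```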

Many traditional databases are not specifically designed to store AI model embeddings, which complicates the work of developers. Chroma’s database is designed to address that challenge. According to the startup, its platform is specifically optimized to store AI embeddings and can consequently provide a relatively simple developer experience.

Turning the data an AI model ingests into embeddings is handled by specialized algorithms. According to Chroma, its database provides features that make it easier to use such algorithms, reducing manual work for software teams.

Chroma supports several open-source embedding generation algorithms. It can also make it easier to use a number of commercial tools in the category, including OpenAI LLC’s cloud-based service for creating embeddings. Developers with more advanced requirements can deploy their own custom algorithms.
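Supporting open-source, commercial, and custom embedding algorithms side by side usually comes down to treating the embedding step as a pluggable function: any callable that turns raw data into a vector can be swapped in. The class and hash-based embedder below are hypothetical illustrations of that pattern, not Chroma's API.

```python
def toy_embedder(text, dims=4):
    """Deterministic stand-in for a real embedding model (illustrative only)."""
    vector = [0.0] * dims
    for i, ch in enumerate(text):
        vector[i % dims] += ord(ch) / 1000.0
    return vector

class EmbeddingStore:
    """Store that accepts any embedding function -- open-source, commercial,
    or custom -- as long as it maps raw data to a vector."""

    def __init__(self, embed_fn):
        self.embed_fn = embed_fn  # custom algorithms plug in here
        self.vectors = {}

    def add(self, doc_id, text):
        self.vectors[doc_id] = self.embed_fn(text)

store = EmbeddingStore(toy_embedder)
store.add("doc-1", "hello")
print(len(store.vectors["doc-1"]))  # 4-dimensional toy vector
```

Swapping in a hosted service such as OpenAI's embeddings endpoint would mean passing a callable that wraps the API call instead of `toy_embedder`.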

To speed up queries, Chroma offers an in-memory mode. Databases usually store information on disk or flash storage and bring it into memory only when it’s actively used. An in-memory system keeps information in RAM from the outset, which skips the process of retrieving data from storage and thereby speeds up computations.
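The contrast can be sketched with two lookup paths over the same data: one that reads from storage on every query and one that loads everything into RAM once. This is a simplified illustration of the trade-off, not Chroma's implementation.

```python
import json
import os
import tempfile

# A small set of vectors, persisted to a JSON file to stand in for "storage."
records = {"doc-a": [0.1, 0.9], "doc-b": [0.8, 0.2]}
path = os.path.join(tempfile.mkdtemp(), "vectors.json")
with open(path, "w") as f:
    json.dump(records, f)

def lookup_from_disk(doc_id):
    """Disk-backed lookup: hits storage on every call."""
    with open(path) as f:
        return json.load(f)[doc_id]

# In-memory mode: one load up front, then every lookup is pure RAM access.
cache = json.load(open(path))

def lookup_in_memory(doc_id):
    return cache[doc_id]

print(lookup_in_memory("doc-a") == lookup_from_disk("doc-a"))  # same result
```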

Chroma says it will use its newly announced funding round to build new features. The startup is planning, among other additions, a capability that will allow developers to determine if information retrieved by the database is relevant to a given query. It’s also developing a commercial, managed version of its database that’s set to launch in the third quarter. 

Image: Unsplash
