DataStax adds vector search to boost support for generative AI workloads
Database startup DataStax Inc. has announced general availability of its new vector search capability for every version of Astra DB, a popular database-as-a-service offering that’s built atop the open-source Apache Cassandra product.
Announced today, the new capability makes Astra DB much more suitable for hosting data that’s used to train artificial intelligence models, including generative AI-powered chatbots.
Astra DB is an enhanced version of the distributed Apache Cassandra database that’s geared toward managing enormous volumes of data. Cassandra can store petabytes, or quadrillions of bytes, and provides an extremely high level of resiliency with its ability to withstand outages, so long as a single server within a deployment stays online.
With Astra DB, companies get additional functionality such as simplified deployment and day-to-day management features. Because Astra DB is a serverless cloud service, users don’t have to worry about the underlying infrastructure that hosts the database. Customers can choose to host Astra DB on Amazon Web Services, Microsoft Azure, Google Cloud or another platform.
DataStax said the new vector search feature makes Astra DB an ideal platform for AI projects because it enables the database to store data as vector embeddings. Unstructured data such as documents, videos, images and user behaviors can now all be converted into vectors, which are arrays of numbers that AI algorithms can compare and search more easily.
With AI models, inference is often a matter of finding which vectors are nearest or most similar to others. Because they are built for this kind of similarity search, vector databases have become an essential element for any AI initiative whose models need to work with proprietary data.
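As a rough illustration of what finding the "nearest" vectors means, the sketch below ranks a few stored embeddings by cosine similarity to a query vector. It is a brute-force toy in plain Python, not Astra DB's API: the function names, the three-dimensional vectors and the document labels are all made up for this example, and real embeddings usually have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(query, embeddings, k=1):
    """Return the names of the k stored vectors most similar to the query."""
    scored = [(cosine_similarity(query, vec), name)
              for name, vec in embeddings.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:k]]


# Hypothetical, hand-written embeddings standing in for model output.
embeddings = {
    "refund policy": [0.9, 0.1, 0.0],
    "shipping times": [0.1, 0.9, 0.1],
    "privacy notice": [0.0, 0.2, 0.9],
}

query = [0.8, 0.2, 0.1]  # assumed embedding of a user question
print(nearest(query, embeddings, k=2))
```

Scanning every stored vector like this works only for small collections, which is why databases that serve this kind of query at scale generally rely on specialized indexes instead.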
DataStax Chief Product Officer Ed Anuff said vectors can be thought of as the “language” of large language models that power generative AI. “Every company is looking for how they can turn the promise and potential of generative AI into a sustainable business initiative,” he said. “Databases that support vectors are crucial to making this happen.”
Vector search was first made available in preview on Astra DB on Google Cloud earlier this year, and it now comes to the AWS and Azure versions too. The on-premises and self-managed version, DataStax Enterprise, will also get the same capability within one month, DataStax said.
DataStax pointed out there are several reasons for enterprises to consider moving or starting new AI initiatives on Astra DB, including its global scale and reliability, and its support for the most stringent enterprise standards for managing sensitive data, such as the Protected Health Information, Payment Card Industry and Personally Identifiable Information data standards.
According to Anuff, most enterprises pursuing generative AI will need a vector database that scales to trillions of vectors, so they’ll require a platform with unlimited horizontal scalability. “Astra DB is the only vector database on the market today that can support massive-scale AI projects with enterprise-grade security on any cloud platform,” he said.
DataStax has shown strong interest in expanding the AI capabilities of its database platform. Earlier this year, it acquired the AI development platform startup Kaskada Inc. for an undisclosed fee. With that acquisition, DataStax added important feature engineering tools to its platform that enable developers to improve the accuracy of their AI models.
Image: GarryKillian/Freepik