Pioneering the future of generative AI: A deep dive with Databricks co-founder Matei Zaharia
The explosive growth of interest in generative artificial intelligence has led to a belief among some tech industry leaders that the massive datasets and common access to information provided by engines such as ChatGPT will lead to a new wave of customization in the enterprise.
That view was echoed by Matei Zaharia, co-founder and chief technologist of Databricks Inc., in an exclusive conversation with theCUBE, SiliconANGLE Media’s livestreaming studio. While ChatGPT can return amazing results based on a lightning-fast search of the web, enterprises will need something more specific to the business at hand.
“The needs of the enterprise are very different, and this is exactly the thing that we’re specializing in,” Zaharia said. “In the enterprise … you need a level of precision and reliability that’s quite a bit higher. We think enterprises would want to control it in very domain-specific ways, and we’re building the governance tools for AI based on the rich governance tools we already have for data.”
Zaharia spoke with theCUBE industry analyst John Furrier at the Databricks Data + AI Summit. They discussed how recent product updates and a timely acquisition are positioning Databricks for the generative AI future.
Tracing data lineage
To achieve an optimal level of precision and reliability through GenAI models, Databricks has been fine-tuning its portfolio to more easily discover, query and govern data across a wide range of platforms.
“We have something called Unity Catalog, which is the only data catalog in the industry that spans unstructured files and gives you very rich controls, lineage quality across them,” Zaharia said. “You really want to trace exactly what data went into this and be able to fix that as you release your applications. It is a new set of use cases where this matters.”
In June, Databricks signed an agreement to acquire GenAI startup MosaicML Inc. for $1.3 billion. Databricks has indicated that MosaicML’s technology will be delivered through its Lakehouse platform and provide customers with a way to serve and customize GenAI models.
“You can submit a job, and they’ll learn it; and they’ll have a big pool of GPUs that you can assign to different workloads,” Zaharia said. “We really like that model. It’s basically a serverless model, which is also what we’ve been doing with data warehousing.”
Databricks has also made recent news in the vector database arena by announcing Vector Search, a capability intended to help developers improve the accuracy of generative AI responses.
“Vector search is important,” Zaharia said. “It’s something that will be incorporated into many technologies in the same way that most modern data processing engines have.”
Databricks is also looking down the road toward a future in which new tools, such as LakehouseIQ, will power natural language access for a broad range of business applications.
“The thing I’m really excited about is opening data to less technical users, to business users directly with LakehouseIQ, which is this knowledge engine that learns how you query your data,” Zaharia said. “Lots of folks in the industry are working on this, but if this succeeds, I think it will open up data AI to way more users.”