Vaultree open-sources its technology for computation on encrypted data
Data encryption startup Vaultree Ltd. said today that it has released and open-sourced elements of its data encryption technology stack that permit operations to be performed on encrypted data without the need for a decryption stage.
The Cork, Ireland-based company’s technology is designed to address the scalability limitations of Fully Homomorphic Encryption schemes, which permit arbitrary computations, including unlimited additions and multiplications, on encrypted data.
Homomorphic encryption works by applying an encryption scheme that turns plaintext and other data types into a ciphertext that can only be unscrambled with a decryption key. Computations performed on that ciphertext produce results that, once decrypted, match the outcome of the same operations performed on the plaintext, with no decryption stage required along the way. That makes it useful in situations where privacy protection is paramount, such as those involving healthcare or financial records.
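To make the idea concrete, here is a minimal sketch of a homomorphic property using a toy version of the Paillier cryptosystem, a well-known additively homomorphic scheme. It is not Vaultree's method and is not fully homomorphic. The tiny primes and the function names (`encrypt`, `decrypt`, `add_encrypted`) are assumptions chosen for illustration only, and the parameters are far too small to be secure.

```python
import math
import random

# Toy Paillier parameters: tiny primes, insecure, for illustration only.
p, q = 293, 433
n = p * q
n_sq = n * n
g = n + 1
lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
mu = pow(lam, -1, n)  # modular inverse of lam mod n (Python 3.8+)


def encrypt(m):
    """Encrypt an integer 0 <= m < n using fresh randomness r."""
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n_sq) * pow(r, n, n_sq)) % n_sq


def decrypt(c):
    """Recover the plaintext from ciphertext c with the private values lam and mu."""
    l_value = (pow(c, lam, n_sq) - 1) // n
    return (l_value * mu) % n


def add_encrypted(c1, c2):
    """Multiplying two ciphertexts yields an encryption of the plaintext sum (mod n)."""
    return (c1 * c2) % n_sq


# The party doing the addition never sees 1200 or 345 in the clear.
c_sum = add_encrypted(encrypt(1200), encrypt(345))
print(decrypt(c_sum))
```

Only the holder of the private values can read the result. The party that combines the two ciphertexts works entirely on encrypted data, which is the property the article describes.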
FHE is more versatile than other homomorphic encryption options but is also more computationally intensive and slower.
Privacy is becoming an increasingly important issue in training generative artificial intelligence models. A Deloitte LLP report earlier this year found that 72% of IT professionals ranked data privacy as a top-three concern about generative AI, up from 22% last year.
Fostering feedback
Co-founder and Chief Executive Ryan Lasmaili said the company decided to open-source its inventions, for which it has already received 27 patents, to foster transparency and encourage feedback from the encryption community.
“The largest tech companies have been trying to solve the FHE scalability problem for 40 years and a lot of claims have been made,” he said. “We’re interested in being transparent about what works and how it solves a problem.”
Vaultree described its approach in a paper it released last summer. Its method generates a unified key structure that enables constant ciphertext size and execution time for encrypted computations regardless of the data set size. The approach addresses size and noise accumulation issues that have traditionally hindered scalability and use in multiuser environments and is only 10% to 15% slower than plaintext operations. That’s a significant improvement over some FHE schemes, which are up to 40 times slower, Lasmaili said.
Previous approaches to FHE “have been developed by brilliant minds but only in an academic setting, and only a handful of people know these technologies very well,” he said. “You have to go back to the drawing board and solve the math problems of creating scalable and production-ready FHE.”
Python libraries
Vaultree’s VENum technology includes Vaultree Encrypted Numerical Python, an internal FHE library that facilitates secure and scalable machine learning operations. NumPy is a popular open-source Python library for numerical computation that supports multidimensional arrays and matrices, as well as a wide range of mathematical functions.
Another element, VENum Machine Learning, is a Python library based on Vaultree’s encryption scheme that specifically addresses machine learning and enables users without advanced data science skills to perform advanced ML tasks securely. “We don’t expect cryptographers to get data science degrees,” Lasmaili said.
VENum allows for searching and ranking of encrypted files for enhanced information discovery. Healthcare institutions can use it to pool patient data for modeling without violating privacy, and financial organizations can securely share encrypted data for use cases such as fraud prevention.
The company said its technology is designed to minimize performance impacts. The library supports multiple data formats, including images, tabular data, unstructured data, graphs and time-series. It integrates with leading key management systems such as HashiCorp Inc.’s Vault and Google LLC’s Cloud Key Management Service. This allows for dynamic management of encryption keys while ensuring that data remains encrypted.
Vaultree, which has raised $16 million, sells proprietary tools and services based on its now-open-sourced technology. Lasmaili said the firm will let community feedback drive its development agenda. The current roadmap includes support for vector databases and multiplication depth, which is the number of sequential multiplication operations performed in a computational process.
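As a rough illustration of multiplication depth, the sketch below counts the longest chain of dependent multiplications in a small expression tree. The tuple representation and the `mult_depth` helper are hypothetical and are not part of any Vaultree library. The example shows that the same result, x to the eighth power, can require very different depths depending on how the multiplications are arranged.

```python
def mult_depth(expr):
    """Return the longest chain of sequential multiplications in an expression.

    An expression is either a variable name (a string) or a tuple
    (op, left, right) where op is "*" or "+". Additions do not add depth.
    """
    if isinstance(expr, str):
        return 0
    op, left, right = expr
    depth = max(mult_depth(left), mult_depth(right))
    return depth + 1 if op == "*" else depth


x = "x"

# x^8 as a left-to-right chain: each product depends on the previous one.
chain = x
for _ in range(7):
    chain = ("*", chain, x)

# x^8 by repeated squaring: each level multiplies the previous level by itself.
squared = x
for _ in range(3):
    squared = ("*", squared, squared)

print(mult_depth(chain))    # 7
print(mult_depth(squared))  # 3
```

In many FHE schemes each sequential multiplication adds noise to the ciphertext, so an arrangement with lower depth can be evaluated where a deeper one cannot.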
Lasmaili said enabling encrypted data to be used in model training solves another problem. “Large language models will exhaust available public data by 2026,” he said. “Now we can use private data.”