UPDATED 10:00 EDT / DECEMBER 25 2013

RainStor release offers security, encryption and fast search on Hadoop

RainStor has been delivering enterprise data storage for the last five years. Its latest version adds enhanced security and faster search capabilities. RainStor’s database runs natively on Hadoop, with no requirement to move data from another repository into HDFS.

As data needs in highly regulated industries, such as banking and telecommunications, continue to double and triple, businesses are looking for ways not only to store this data, but also to access it quickly and securely while keeping storage costs low. Additionally, a solution that does not require systems to be rewritten in order to scale is becoming a necessity.

RainStor takes advantage of HDFS combined with its compression capabilities to shrink the storage footprint up to 90%, according to the company. RainStor also provides SQL capabilities so data can be moved without needing to rewrite years of queries.

“Today, one of the things that the industry faces is, without having a true SQL capability, you have to rewrite everything. That’s not trivial,” says John Bantleman, CEO of RainStor. “Similarly, if I put my data onto Hadoop, and it’s not secure, I can’t use it. It may work perfectly technically, but if my data is at risk, I can’t use [it]. If I have to rewrite all my queries, I can’t use it. If I can’t back it up, I can’t use it.”

Privacy & security

RainStor has added Kerberos, LDAP, Active Directory and PAM support to provide standard authorization and authentication capabilities within a Hadoop environment. “You can centralize your authentication services, and RainStor will integrate fully with those existing services,” says Mark Cusack, chief architect at RainStor. “To make things even more simple, on a server-by-server basis, we’ve added Linux pluggable authentication module support.”

Addressing the need for privacy and data integrity, RainStor has added on-disk encryption and data masking. Encryption has traditionally not been popular in production environments because of the compute overhead it adds. RainStor, however, compresses the data first, then encrypts it before writing it out to storage. Because the compressed data is smaller, there is proportionally less to encrypt and decrypt. “It goes beyond that,” Cusack says. “If your SQL query is only pulling out one column, RainStor only pulls in that one column and decrypts that one column of the data.”
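
The compress-then-encrypt, column-at-a-time idea can be illustrated with a minimal Python sketch. This is not RainStor’s implementation: the function names, the newline-joined column layout, and the key are all hypothetical, and `keystream_xor` is a toy stand-in for a real cipher that must not be used to protect actual data.

```python
from __future__ import annotations

import hashlib
import zlib


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy stand-in cipher (NOT secure): XOR data with a SHA-256 counter keystream."""
    out = bytearray()
    for start in range(0, len(data), 32):
        counter = (start // 32).to_bytes(8, "big")
        pad = hashlib.sha256(key + counter).digest()
        out.extend(b ^ p for b, p in zip(data[start:start + 32], pad))
    return bytes(out)


def store_columns(table: dict[str, list[str]], key: bytes) -> dict[str, bytes]:
    """Compress each column first, then encrypt the (smaller) compressed bytes."""
    stored = {}
    for name, values in table.items():
        raw = "\n".join(values).encode("utf-8")
        stored[name] = keystream_xor(key, zlib.compress(raw))
    return stored


def read_column(stored: dict[str, bytes], name: str, key: bytes) -> list[str]:
    """Decrypt and decompress only the column the query asks for."""
    compressed = keystream_xor(key, stored[name])
    return zlib.decompress(compressed).decode("utf-8").split("\n")


# Hypothetical two-column table; values are assumed to contain no newlines.
table = {
    "account": [f"ACC{i:06d}" for i in range(1000)],
    "status": ["ACTIVE"] * 1000,
}
key = b"hypothetical-demo-key"

stored = store_columns(table, key)
raw_size = sum(len("\n".join(v).encode("utf-8")) for v in table.values())
stored_size = sum(len(blob) for blob in stored.values())
print("raw bytes:", raw_size, "bytes encrypted:", stored_size)
print(read_column(stored, "status", key)[:3])
```

Because each column is stored as its own encrypted blob, a query that touches only `status` never decrypts `account`, which is the effect Cusack describes.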

Key management is a big issue. RainStor says it manages its own data encryption keys and key encryption keys. “The PCI standards, for example, are very clear about the separation of key encryption keys and data encryption keys,” says Cusack. “You could encrypt data encryption keys, and those can be placed on the Hadoop cluster, but the key encryption key can’t be. We would prime a RainStor Hadoop cluster by sending the key encryption key out over secure communication channels where it resides in memory and then decrypts the data encryption key when required. We’re also building out support for external hardware security modules, which would give you a further level of integration, in terms of key management in an enterprise.”
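
The separation Cusack describes resembles what is often called envelope encryption: the data encryption key (DEK) is stored on disk only in encrypted form, and the key encryption key (KEK) is held only in memory. The Python sketch below shows that flow under stated assumptions. `ClusterNode`, `prime` and `keystream_xor` are hypothetical names, and the XOR keystream is a toy stand-in for a real cipher, not something to use in production.

```python
import hashlib
import secrets


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy stand-in cipher (NOT secure): XOR data with a SHA-256 counter keystream."""
    out = bytearray()
    for start in range(0, len(data), 32):
        counter = (start // 32).to_bytes(8, "big")
        pad = hashlib.sha256(key + counter).digest()
        out.extend(b ^ p for b, p in zip(data[start:start + 32], pad))
    return bytes(out)


class ClusterNode:
    """Hypothetical node: the wrapped DEK lives on disk, the KEK only in memory."""

    def __init__(self, wrapped_dek: bytes):
        self.disk = {"wrapped_dek": wrapped_dek}  # safe to persist: it is encrypted
        self._kek = None                          # never written to disk

    def prime(self, kek: bytes) -> None:
        """Receive the KEK (assumed to arrive over a secure channel)."""
        self._kek = kek

    def _dek(self) -> bytes:
        """Unwrap the DEK on demand using the in-memory KEK."""
        if self._kek is None:
            raise RuntimeError("node has not been primed with a KEK")
        return keystream_xor(self._kek, self.disk["wrapped_dek"])

    def encrypt(self, plaintext: bytes) -> bytes:
        return keystream_xor(self._dek(), plaintext)

    def decrypt(self, ciphertext: bytes) -> bytes:
        return keystream_xor(self._dek(), ciphertext)


kek = secrets.token_bytes(32)           # held by a key administrator
dek = secrets.token_bytes(32)           # protects the data itself
wrapped_dek = keystream_xor(kek, dek)   # only this form is placed on the cluster

node = ClusterNode(wrapped_dek)
node.prime(kek)
message = b"hypothetical sensitive record"
ciphertext = node.encrypt(message)
print(node.decrypt(ciphertext) == message)
```

An external hardware security module, which the company says it is building support for, would take over the role played here by the in-memory KEK.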

Compliance

For industries concerned with regulatory compliance, audit trails and tamper-proofing, RainStor logs and tracks all data. “When data gets loaded into RainStor, we take hash-based fingerprints of the data,” Cusack says. “Those can be stored out to the scope of RainStor and be used to compare for any tampering that may occur against the data. We’ve also added record level delete to RainStor running on Hadoop, which is, I think, a first. I don’t think anyone else has got that degree of fine granularity over data disposition in Hadoop.”
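
A hash-based fingerprint of this kind can be sketched in a few lines of Python. The article does not say which hash function or record format RainStor uses, so SHA-256, the length-prefixed encoding, and the sample records below are assumptions made purely for illustration.

```python
import hashlib


def fingerprint(records: list) -> str:
    """Return a SHA-256 fingerprint over an ordered batch of text records."""
    digest = hashlib.sha256()
    for record in records:
        encoded = record.encode("utf-8")
        # Length-prefix each record so record boundaries affect the hash.
        digest.update(len(encoded).to_bytes(8, "big"))
        digest.update(encoded)
    return digest.hexdigest()


# Hypothetical batch, fingerprinted at load time.
loaded = ["2013-12-01,ACC000001,250.00", "2013-12-01,ACC000002,75.10"]
external_fingerprint = fingerprint(loaded)  # kept outside the data store

# Later: recompute and compare to detect tampering.
tampered = list(loaded)
tampered[1] = "2013-12-01,ACC000002,7500.10"

print(fingerprint(loaded) == external_fingerprint)    # True
print(fingerprint(tampered) == external_fingerprint)  # False
```

Keeping the fingerprint somewhere other than the store it describes is what makes the comparison useful: someone who alters the data cannot also quietly replace the reference value.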

In the telco world, it’s more about performance data. Companies need to review data use over a period of time. “We have a client who could only store three months’ worth of data before maxing out their capacity. Now they can keep it for multiple years, so they can actually look for all the patterns over two- to three-year time frames.”

A growing need in many industries is efficient deletion of data. RainStor has added the ability to configure data disposal down to the record level.

“What we’re seeing with banks, in terms of compliance risk and regulatory requirements, is the need to keep years of history on low-cost commodity hardware,” Diedre …point out. “But after a regulated period of time, say five years, the bank is at risk if they keep the data any longer. RainStor provides the ability to search through all that data and automatically delete the parts that need to be removed. Right now, it’s really more, I’ve got so much data and I need to quickly find that needle, because I have a legal requirement to do so.”
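
Record-level disposal against a retention period can be pictured as a simple date filter. The Python sketch below is illustrative only: the record layout, the five-year default, and the function names are assumptions, not RainStor’s configuration syntax.

```python
from datetime import date


def retention_cutoff(today: date, years: int) -> date:
    """Return the date exactly `years` before `today`."""
    try:
        return today.replace(year=today.year - years)
    except ValueError:
        # Feb 29 has no counterpart in a non-leap year; fall back to Feb 28.
        return today.replace(year=today.year - years, day=28)


def dispose_expired(records: list, today: date, retention_years: int = 5):
    """Split records into (kept, deleted) based on their creation date."""
    cutoff = retention_cutoff(today, retention_years)
    kept = [r for r in records if r["created"] >= cutoff]
    deleted = [r for r in records if r["created"] < cutoff]
    return kept, deleted


# Hypothetical records with a creation date each.
records = [
    {"id": 1, "created": date(2008, 12, 24)},
    {"id": 2, "created": date(2008, 12, 25)},
    {"id": 3, "created": date(2012, 6, 1)},
]
kept, deleted = dispose_expired(records, today=date(2013, 12, 25))
print([r["id"] for r in kept])     # [2, 3]
print([r["id"] for r in deleted])  # [1]
```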

Search

In terms of search capabilities, RainStor says it has boosted query performance 10-100 times. It has done this by essentially focusing the search based on the type of data being searched and the likely places that information is stored. “RainStor has always had the ability to apply indexes to fields within tables. We call them dynamic filters. They’re not really traditional database indexes. They’re powered by an approach called Bloom filters. They give you a probabilistic mapping of the data in your tables. They will allow you rapidly, given a search term, to identify which parts of the range of data could possibly contain that phrase. What we’ve really done here is extended our Bloom filter support to cope with keywords, parts of words, substrings, etc. What we’ve seen in our labs and in early tests with customers is a potential 100 times speed increase compared to making those searches in traditional RainStor.”
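
To show how a Bloom filter can rule out parts of the data before a search, here is a minimal Python sketch. It is a generic illustration, not RainStor’s dynamic filters: the partition names, the trigram approach to substring matching, and the filter sizes are all assumptions.

```python
import hashlib


class BloomFilter:
    """Probabilistic set: may report false positives, never false negatives."""

    def __init__(self, size_bits: int = 4096, num_hashes: int = 3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, item: str):
        # Derive several bit positions by hashing the item with different seeds.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


def trigrams(text: str) -> list:
    """Break text into overlapping 3-character pieces for substring matching."""
    text = text.lower()
    return [text[i:i + 3] for i in range(len(text) - 2)]


def build_filters(partitions: dict) -> dict:
    """Build one filter per data partition from the trigrams of its values."""
    filters = {}
    for name, values in partitions.items():
        bloom = BloomFilter()
        for value in values:
            for gram in trigrams(value):
                bloom.add(gram)
        filters[name] = bloom
    return filters


def candidate_partitions(term: str, filters: dict) -> list:
    """Partitions that could contain the term; all others are safely skipped.

    Terms shorter than three characters yield no trigrams, so every
    partition remains a candidate (conservative, but never wrong).
    """
    grams = trigrams(term)
    return [name for name, bloom in filters.items()
            if all(bloom.might_contain(g) for g in grams)]


# Hypothetical partitions of a text column.
partitions = {
    "part-0": ["payment received", "balance updated"],
    "part-1": ["found the needle here", "account closed"],
}
filters = build_filters(partitions)
print(candidate_partitions("needle", filters))
```

Only the candidate partitions need to be read and scanned. A filter can occasionally flag a partition that does not contain the term, but it never misses one that does, which is why this kind of pruning is safe for search.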

RainStor on Hadoop offers a compelling solution for high-risk, compliance-based data. The ability to use SQL for data access will help companies maintain low-cost storage with the added benefits of data encryption and fast data retrieval.

