RainStor, a provider of online data retention solutions, has released RainStor 4.5 for Cloudera’s Distribution including Apache Hadoop. According to the company, the combination improves efficiency in several respects, including a physical footprint at least 97 percent smaller when retaining and accessing data on the Hadoop Distributed File System (HDFS). RainStor 4.5 is designed for Hadoop deployments that process petabytes of data, and the company says it delivers up to 8 times the compression that binary methods provide.
“…most Hadoop deployments rely on the use of binary compression (such as LZO), which typically yields on average 5 to 1 compression and comes with a re-inflation performance penalty upon access. In contrast, RainStor achieves compression rates of 40 to 1 or greater and allows data access without re-inflation.”
As an example, RainStor said that using RainStor 4.5 for Cloudera’s Distribution including Apache Hadoop to store 2 petabytes of raw data for 6 months would save 5.85 petabytes of physical storage.
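The 5.85-petabyte figure is not broken down in the announcement, but it can be reproduced with a short back-of-the-envelope calculation. The sketch below assumes the baseline is uncompressed data stored with HDFS's default replication factor of 3 (an assumption, not something the company states) and applies the quoted 40 to 1 compression ratio. The function and variable names are illustrative only.

```python
# Back-of-the-envelope check of the quoted storage savings.
# RAW_PB and RAINSTOR_RATIO come from the article; REPLICATION is an
# assumption (the HDFS default), not a figure stated by RainStor.
RAW_PB = 2.0           # raw data retained, in petabytes
REPLICATION = 3        # assumed HDFS replication factor
RAINSTOR_RATIO = 40.0  # "40 to 1" compression from the quote


def footprint_pb(raw_pb, replication, ratio=1.0):
    """Physical storage in PB for raw data at a given replication and compression ratio."""
    return raw_pb * replication / ratio


uncompressed = footprint_pb(RAW_PB, REPLICATION)
compressed = footprint_pb(RAW_PB, REPLICATION, RAINSTOR_RATIO)
savings = uncompressed - compressed
reduction_pct = 100 * savings / uncompressed

print(f"Uncompressed footprint: {uncompressed:.2f} PB")
print(f"Compressed footprint:   {compressed:.2f} PB")
print(f"Savings:                {savings:.2f} PB ({reduction_pct:.1f}% smaller)")
```

Under these assumptions the savings come to about 5.85 petabytes, a reduction of roughly 97.5 percent, which lines up with both the example and the "at least 97 percent smaller" claim. A deployment already using binary compression such as LZO would start from a smaller baseline, so its savings would be lower.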
Big data management and Hadoop have been in the spotlight recently, and alongside the news from RainStor comes a fresh announcement from IBM. The tech industry veteran announced 20 new data analytics services, as well as Hadoop-based software. InfoSphere BigInsights and Streams incorporate more than 50 patents, including Watson-like technologies, with an emphasis on digital marketing optimization. IBM also offers a free ‘Basic’ version of the new software.
Open source is one major trend to watch in the big data analytics space, but data visualization may not be far behind. Companies like Google have launched some relatively small-scale initiatives in this field, and data analytics solutions provider Kitenga has just launched an enterprise-scale offering designed for that purpose. ZettaVox is a no-programming-required application that takes a fairly broad approach to simplifying the whole process.