Revolution R 5.0: Hadoop Integration, New Features

Revolution Analytics has launched Revolution R 5.0, the latest version of its software suite for analytics projects based on the R programming language. The release lets users tap two of the biggest buzzwords in the big data analytics space today: Hadoop, and operation at scale.

Erik Segur, Michigan State University’s Information Technologist in the Department of Statistics and Probability, said, “The Revolution R Enterprise 5.0 environment has delivered order of magnitude performance improvements, which has allowed our department to process four times the amount of analytics jobs. The researchers are now able to run, evaluate, modify and re-run their models multiple times to get more precise conclusions – it’s been amazing.”

Version 5.0 is built on the R 2.13.2 release, which includes a byte-compiler that can come in handy for users looking to extend the environment with open-source components. Revolution R also features support for MapReduce in R, as well as integration with the Hadoop Distributed File System (HDFS) and HBase, the non-relational database that sits on top of it. The company announced last month that it had joined the Hadoop ecosystem, and one of the resulting products will be a tool called RevoConnectR for Apache Hadoop, designed to help developers carry out MapReduce jobs directly from R.

Dave Champagne talked on theCube during Hadoop World 2011 about what his company has been doing around integration and the expansion of its offering.

In addition to this extended functionality, Revolution added new data cleaning and import tools, alongside the latest version of RevoDeployR Server, which ships with new provisioning capabilities.

The final, and one of the more significant, aspects of the Revolution R 5.0 release is that it can now run in parallel, meaning it can deliver much greater performance by leveraging the same functionality Hadoop has been using.