

Jeffrey Breen of Atmosphere Research Group presented at tk on how to use Apache Hadoop with the statistical programming language R via RHadoop. Hadoop has become practically synonymous with big data, and R has become a language of choice for many data scientists, so it’s natural to want to use the two together.
Breen has made his presentations available on SlideShare and the accompanying code and configuration files available on GitHub.
The first tutorial explains how to install Hadoop on a local virtual machine to help you get familiar with Hadoop.
The second guides you through the process of setting up R and RStudio on an Amazon Web Services EC2 instance.
The final presentation demonstrates how to launch a Hadoop cluster on EC2 using Apache Whirr.
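For readers who have not used Whirr, a cluster is described in a properties file and then launched from the command line. The fragment below is a minimal sketch modeled on the sample Hadoop-on-EC2 recipe that ships with Whirr, not Breen's actual configuration. The cluster name, node counts, and key file paths are illustrative assumptions, and the AWS credentials are assumed to be set in environment variables.

```properties
# Hypothetical cluster name; pick any label you like.
whirr.cluster-name=rhadoop-demo

# One master (NameNode + JobTracker) and two workers (DataNode + TaskTracker).
# The worker count of 2 is an assumption for illustration only.
whirr.instance-templates=1 hadoop-namenode+hadoop-jobtracker,2 hadoop-datanode+hadoop-tasktracker

# Run on Amazon EC2, reading credentials from the environment.
whirr.provider=aws-ec2
whirr.identity=${env:AWS_ACCESS_KEY_ID}
whirr.credential=${env:AWS_SECRET_ACCESS_KEY}

# SSH key pair used to log in to the launched nodes (paths are assumptions).
whirr.private-key-file=${sys:user.home}/.ssh/id_rsa
whirr.public-key-file=${sys:user.home}/.ssh/id_rsa.pub
```

If this file were saved as hadoop.properties (a hypothetical file name), the cluster would be started with `whirr launch-cluster --config hadoop.properties` and torn down with `whirr destroy-cluster --config hadoop.properties`. Remember that EC2 instances are billed while they run.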