

Hadoop has become an extremely big name here at SiliconANGLE, being one of the premier open source cloud-storage and -computing projects. If you’re a Java developer and you haven’t had a chance to take it for a test drive, there’s a very easy tutorial by Carlo Scarioni covering Hadoop basics.
Hadoop is an open source project for processing large datasets in parallel on low-cost commodity machines.
Hadoop is built on two main parts: a special file system called the Hadoop Distributed File System (HDFS) and the Map Reduce framework.
HDFS is a file system optimized for distributed processing of very large datasets on commodity hardware.
The Map Reduce framework processes the data in two main phases: the Map phase and the Reduce phase.
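To make the two phases concrete, here is a minimal sketch in plain Java that imitates them on a single machine by counting words. It does not use the Hadoop API at all; the class name `MapReduceSketch` and its methods are hypothetical names chosen for illustration, and the grouping step between the phases is simulated with an in-memory map.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Single-machine imitation of the Map and Reduce phases (not Hadoop code).
class MapReduceSketch {

    // Map phase: turn one input line into (word, 1) pairs.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(new SimpleEntry<>(word, 1));
            }
        }
        return pairs;
    }

    // Reduce phase: combine every value collected for one key.
    static int reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    static Map<String, Integer> run(List<String> lines) {
        // Grouping step between the phases: collect values by key.
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            for (Map.Entry<String, Integer> pair : map(line)) {
                grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                       .add(pair.getValue());
            }
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : grouped.entrySet()) {
            result.put(entry.getKey(), reduce(entry.getKey(), entry.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // prints {big=2, cluster=1, data=2, node=1}
        System.out.println(run(Arrays.asList("big data big cluster", "data node")));
    }
}
```

In a real cluster the map calls would run on many machines at once and the framework would handle the grouping; the sketch only shows the shape of the two functions.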
The tutorial shows a developer where to download the source files from Apache and how to unpack the helper executables, and it provides a small set of Java code.
The code implements a dictionary translation by taking a series of compiled dictionaries (English-Spanish, English-Italian, English-French) and outputting a single dictionary that displays each English word followed by every translation. Under normal circumstances, the code would start with an English word and then search every file for each instance. Hadoop speeds this up by distributing the files and the processing.
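The dictionary merge described above can be sketched in plain Java as follows. This is not the tutorial's code and it does not call Hadoop; it is a single-machine illustration under stated assumptions: each dictionary line is assumed to be an English word, a tab, and one translation, and the merged translations are joined with `|`. The class name `DictionaryMergeSketch` and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Single-machine illustration of merging dictionaries by English word.
class DictionaryMergeSketch {

    // Map step: split "english<TAB>translation" into {key, value}.
    // Returns null for lines that do not match the assumed format.
    static String[] mapLine(String line) {
        String[] parts = line.split("\t", 2);
        if (parts.length < 2 || parts[0].trim().isEmpty()) {
            return null;
        }
        return new String[] { parts[0].trim(), parts[1].trim() };
    }

    // Reduce step: join every translation collected for one English word.
    static String reduce(List<String> translations) {
        return String.join("|", translations);
    }

    static Map<String, String> merge(List<List<String>> dictionaries) {
        // Group translations by English word across all dictionaries.
        Map<String, List<String>> grouped = new TreeMap<>();
        for (List<String> dictionary : dictionaries) {
            for (String line : dictionary) {
                String[] keyValue = mapLine(line);
                if (keyValue == null) {
                    continue; // skip malformed lines
                }
                grouped.computeIfAbsent(keyValue[0], k -> new ArrayList<>())
                       .add(keyValue[1]);
            }
        }
        Map<String, String> merged = new TreeMap<>();
        for (Map.Entry<String, List<String>> entry : grouped.entrySet()) {
            merged.put(entry.getKey(), reduce(entry.getValue()));
        }
        return merged;
    }

    public static void main(String[] args) {
        List<String> spanish = Arrays.asList("dog\tperro", "cat\tgato");
        List<String> italian = Arrays.asList("dog\tcane", "cat\tgatto");
        List<String> french = Arrays.asList("dog\tchien");
        // prints:
        // cat -> gato|gatto
        // dog -> perro|cane|chien
        merge(Arrays.asList(spanish, italian, french))
            .forEach((english, translations) ->
                System.out.println(english + " -> " + translations));
    }
}
```

Each file is read once and every word is grouped as it is seen, which is what lets the work be split across machines instead of searching every file for every word.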
The code uses a cloud-storage mechanism to speed up the hash mapping of the various dictionaries, but it does not use cloud processing to accelerate the work itself. Since this is only a basic tutorial series, Carlo mentions that he’ll hit that up later.
So, if you know Java and want to play around with Hadoop, here is an excellent place to begin.
Also, it’s a good way to get an understanding of how this framework can give you a jumpstart on the cloud computing revolution.