DeveloperANGLE: AllThingsHadoop Podcast with Cube Alum Arun Murthy on Hadoop

The allthingshadoop blog had a great podcast and blog post with Hortonworks cofounder Arun Murthy, a Cube Alum.

Here is the podcast:  Episode #8 of the Podcast is a talk with Arun C. Murthy.

The main topics they covered: Hortonworks HDP1 (the first release from Hortonworks), Apache Hadoop 2.0, NextGen MapReduce (YARN), and HDFS Federation.

Hortonworks Data Platform (HDP)

Hortonworks Data Platform (HDP) is a 100% open source data management platform based on Apache Hadoop. It allows you to load, store, process and manage data in virtually any format and at any scale.

Apache Hadoop 2.0

Apache Hadoop 2.x includes significant improvements over the previous stable release (hadoop-1.x). Two of the main areas are HDFS Federation and MapReduce NextGen, aka YARN.

HDFS Federation:  

HDFS Federation uses multiple independent Namenodes/Namespaces to scale the name service horizontally. The Namenodes are federated, that is, the Namenodes are independent and don't require coordination with each other. The datanodes are used as common storage for blocks by all the Namenodes. Each datanode registers with all the Namenodes in the cluster. Datanodes send periodic heartbeats and block reports to the Namenodes and handle commands from them.
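To make the registration pattern concrete, here is a minimal toy sketch in Python. It is not Hadoop code: the `Namenode` and `Datanode` classes, their methods, and the namespace names are all hypothetical, chosen only to illustrate that each datanode registers with every Namenode while the Namenodes never talk to each other.

```python
# Toy model of HDFS Federation. All class and method names are illustrative
# assumptions, not Hadoop's real API.

class Namenode:
    """Owns one independent namespace and never contacts other Namenodes."""

    def __init__(self, namespace):
        self.namespace = namespace
        self.datanodes = set()    # IDs of datanodes registered with this Namenode
        self.block_reports = {}   # datanode ID -> blocks held for this namespace

    def register(self, datanode_id):
        self.datanodes.add(datanode_id)

    def receive_block_report(self, datanode_id, blocks):
        self.block_reports[datanode_id] = sorted(blocks)


class Datanode:
    """Common block storage shared by every Namenode in the cluster."""

    def __init__(self, datanode_id, namenodes):
        self.datanode_id = datanode_id
        self.namenodes = list(namenodes)
        # Blocks are tracked separately per namespace.
        self.blocks = {nn.namespace: set() for nn in self.namenodes}
        # Register with all Namenodes, not just one.
        for nn in self.namenodes:
            nn.register(datanode_id)

    def store_block(self, namespace, block_id):
        self.blocks[namespace].add(block_id)

    def send_block_reports(self):
        # Each Namenode only hears about blocks in its own namespace.
        for nn in self.namenodes:
            nn.receive_block_report(self.datanode_id, self.blocks[nn.namespace])


namenodes = [Namenode("/user"), Namenode("/logs")]
datanodes = [Datanode(f"dn{i}", namenodes) for i in range(3)]

datanodes[0].store_block("/user", "blk_1")
datanodes[0].store_block("/logs", "blk_2")
for dn in datanodes:
    dn.send_block_reports()

for nn in namenodes:
    print(nn.namespace, sorted(nn.datanodes), nn.block_reports["dn0"])
```

Note that adding a third `Namenode` to the list would not require any change to the existing ones, which is the sense in which the name service scales horizontally in this sketch.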

MapReduce NextGen:  

The new architecture, introduced in hadoop-0.23, divides the two major functions of the JobTracker, resource management and job life-cycle management, into separate components. The new ResourceManager manages the global assignment of compute resources to applications, and the per-application ApplicationMaster manages the application's scheduling and coordination.
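The split of responsibilities can be sketched with a small toy model in Python. This is only an illustration under simplifying assumptions: the classes below borrow the names ResourceManager and ApplicationMaster, but the methods (`allocate`, `release`, `run`), the notion of a fixed container count, and the task names are hypothetical and do not reflect the real YARN API.

```python
# Toy model of the YARN split. Method names and the container-count model
# are illustrative assumptions, not the real YARN interfaces.

class ResourceManager:
    """Cluster-wide: only decides how many containers each application gets."""

    def __init__(self, total_containers):
        self.free = total_containers
        self.allocations = {}  # application ID -> containers currently granted

    def allocate(self, app_id, requested):
        granted = min(requested, self.free)
        self.free -= granted
        self.allocations[app_id] = self.allocations.get(app_id, 0) + granted
        return granted

    def release(self, app_id):
        self.free += self.allocations.pop(app_id, 0)


class ApplicationMaster:
    """Per-application: schedules and tracks this application's own tasks."""

    def __init__(self, app_id, tasks, resource_manager):
        self.app_id = app_id
        self.pending = list(tasks)
        self.completed = []
        self.rm = resource_manager

    def run(self):
        granted = self.rm.allocate(self.app_id, len(self.pending))
        if granted == 0:
            raise RuntimeError("no containers available")
        # Run tasks in waves sized by the number of granted containers.
        while self.pending:
            wave = self.pending[:granted]
            self.pending = self.pending[granted:]
            self.completed.extend(wave)
        self.rm.release(self.app_id)
        return self.completed


rm = ResourceManager(total_containers=4)
tasks = ["map0", "map1", "map2", "map3", "map4", "reduce0"]
am = ApplicationMaster("app_a", tasks, rm)
done = am.run()
print(done, rm.free)
```

The point of the sketch is that `ResourceManager` knows nothing about tasks and `ApplicationMaster` knows nothing about other applications, so job life-cycle logic can change without touching cluster-wide resource management.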

For more details check out the full post at