

Apache Hadoop has a new release, its first in more than a year. According to the Cloudera blog, the new release, Apache Hadoop 0.23.0, has a number of new features and improvements, including HDFS federation and a new MapReduce framework.
But as the Cloudera blog states, before you go too far with this new release, it is important to note that 0.23.0 is not a production release. Cloudera warns that it should not be put on a production cluster.
According to Cloudera:
HDFS federation improves HDFS scalability by allowing multiple independent namenodes, each managing a portion of the namespace. Each datanode in the cluster can provide storage to all the namenodes (which means datanodes do not, for example, belong to a single namenode). Note that HDFS federation is not to be confused with HDFS High Availability, which will be coming in a future 0.23 release.
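To make the federation description above concrete, here is a minimal toy model in Python. It is a sketch, not Hadoop's implementation: the class names, the path-prefix mount table and the round-robin block placement are all assumptions chosen for illustration. It shows only the two properties described: each namenode owns a portion of the namespace, and every datanode can hold blocks for every namenode.

```python
from dataclasses import dataclass, field


@dataclass
class Namenode:
    """Toy namenode that owns one portion of the namespace (hypothetical)."""
    name: str
    files: dict = field(default_factory=dict)  # path -> list of block ids


@dataclass
class Datanode:
    """Toy datanode that stores blocks for any namenode, keyed by namenode name."""
    name: str
    blocks: dict = field(default_factory=dict)  # namenode name -> set of block ids


class FederatedNamespace:
    """Hypothetical mount table mapping a path prefix to the namenode that owns it."""

    def __init__(self, mounts):
        # Check longer prefixes first so the most specific mount wins.
        self.mounts = sorted(mounts.items(), key=lambda kv: len(kv[0]), reverse=True)

    def resolve(self, path):
        for prefix, namenode in self.mounts:
            if path == prefix or path.startswith(prefix.rstrip("/") + "/"):
                return namenode
        raise KeyError(f"no namenode mounted for {path}")


def write_file(namespace, datanodes, path, block_ids):
    """Record metadata on the owning namenode and spread blocks over all datanodes."""
    namenode = namespace.resolve(path)
    namenode.files[path] = list(block_ids)
    for i, block_id in enumerate(block_ids):
        datanode = datanodes[i % len(datanodes)]  # simple round-robin placement
        datanode.blocks.setdefault(namenode.name, set()).add(block_id)
    return namenode


# Two independent namenodes share the same pool of datanodes.
nn1, nn2 = Namenode("nn1"), Namenode("nn2")
namespace = FederatedNamespace({"/user": nn1, "/data": nn2})
datanodes = [Datanode("dn1"), Datanode("dn2")]

write_file(namespace, datanodes, "/user/a.txt", ["b1", "b2"])
write_file(namespace, datanodes, "/data/b.txt", ["b3", "b4"])
```

After the two writes, each namenode knows only about the file under its own prefix, while both datanodes hold blocks belonging to both namenodes.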
MapReduce 2 (“next gen”) is a re-write of the MapReduce runtime to overcome scalability bottlenecks in the jobtracker. It is based on a new framework called YARN for cluster resource management, and a MapReduce “application” which runs users’ jobs on YARN. In this design MapReduce becomes a user-space library, and also allows other parallel applications to run on Hadoop clusters, beside MapReduce applications.
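The split described above, a generic resource manager with MapReduce layered on top as an ordinary application, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the YARN API: `ResourceManager`, `run_mapreduce` and the notion of a simple container count are hypothetical names invented here.

```python
from collections import defaultdict


class ResourceManager:
    """Toy cluster-wide resource manager that hands out a fixed pool of containers."""

    def __init__(self, total_containers):
        self.free = total_containers

    def allocate(self, requested):
        # Grant as many containers as are free, up to the request.
        granted = min(requested, self.free)
        self.free -= granted
        return granted

    def release(self, count):
        self.free += count


def run_mapreduce(rm, records, map_fn, reduce_fn, wanted_containers=4):
    """MapReduce written as a plain library function on top of the resource manager."""
    containers = rm.allocate(wanted_containers)
    if containers == 0:
        raise RuntimeError("no containers available")
    try:
        # One chunk of input per granted container; each chunk stands in for a map task.
        chunks = [records[i::containers] for i in range(containers)]
        grouped = defaultdict(list)
        for chunk in chunks:
            for record in chunk:
                for key, value in map_fn(record):
                    grouped[key].append(value)
        # Reduce phase: combine all values collected for each key.
        return {key: reduce_fn(key, values) for key, values in grouped.items()}
    finally:
        rm.release(containers)


# Word count as one application; any other function could request containers too.
rm = ResourceManager(total_containers=3)
counts = run_mapreduce(
    rm,
    ["a b a", "b c"],
    map_fn=lambda line: [(word, 1) for word in line.split()],
    reduce_fn=lambda key, values: sum(values),
)
```

Because the resource manager knows nothing about map or reduce, a different parallel application could call `allocate` and `release` in the same way, which is the point of moving MapReduce into user space.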
There are some additional changes to MapReduce that should be noted. See the Cloudera blog post for more detail.
One criticism of Hadoop is its slow pace of development. Getting to this new release is important because it lets the community build on developments such as BigTop.