

Few would disagree that software-defined networking (SDN) holds a lot of potential. It promises numerous benefits to organizations that run large numbers of virtual networks and machines, most obviously automation and flexibility.
Even so, there are plenty of critics who say that SDN just doesn’t have that many applications – after all, there are few data centers that have to deal with the enormous traffic problems SDN can solve.
But could SDN be used to solve other problems? Enter a group of researchers in China and their paper “Bandwidth-Aware Scheduling with SDN in Hadoop: A New Trend for Big Data”, which proposes using SDN to solve a common Big Data headache.
According to the authors, from the Huazhong University of Science and Technology in Wuhan, Hadoop possesses multiple task schedulers, but these fail to take the available bandwidth into consideration. As a result, Big Data operations using Hadoop often lose “optimized opportunities for task assignment.”
Parallelism is one of Hadoop’s biggest advantages, but that advantage is eroded whenever a chance to slot in a task is missed. Hence, the authors attempted to tackle the following question: “Can we combine the bandwidth control capability of SDN with Hadoop system to exploit an optimized task scheduling solution that has high efficiency and agility in terms of job completion time for big data processing?”
They believe they can, thanks to a proposed task scheduler dubbed “Bandwidth-Aware Scheduling with SDN in Hadoop”, or “BASS” for short.
BASS interfaces with an OpenFlow controller, which lets it learn about the Hadoop cluster and the bandwidth available on its network. With that information, it can allocate tasks according to how quickly the network can carry them to the Hadoop nodes. According to the authors, their tests show that BASS is much faster than other kinds of task schedulers. They even suggest an improvement called “Pre-BASS”, which makes queues even more efficient by grooming them first.
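To make the idea concrete, here is a minimal sketch of bandwidth-aware task assignment. It is not the authors' BASS algorithm, only an illustration of the general principle of picking the node with the earliest estimated finish time once transfer time is taken into account. All names (`Node`, `Task`, `estimated_finish`, `schedule`) are hypothetical, and the bandwidth figures stand in for values that a real system would obtain from an SDN controller.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Node:
    name: str
    bandwidth_mbps: float    # assumed: available bandwidth reported by a controller
    busy_until: float = 0.0  # time (s) at which the node becomes idle


@dataclass
class Task:
    name: str
    input_mb: float                   # size of the input data to move
    compute_s: float                  # processing time once the data is present
    local_node: Optional[str] = None  # node that already holds the input data


def estimated_finish(task: Task, node: Node) -> float:
    """Estimate when `task` would finish on `node`.

    Simplification: the transfer starts only once the node is idle,
    and no transfer is needed if the data is already local.
    """
    if task.local_node == node.name:
        transfer_s = 0.0
    else:
        transfer_s = task.input_mb * 8 / node.bandwidth_mbps
    return node.busy_until + transfer_s + task.compute_s


def schedule(tasks: List[Task], nodes: List[Node]) -> Dict[str, str]:
    """Greedily assign each task to the node with the earliest estimated finish."""
    assignment: Dict[str, str] = {}
    for task in tasks:
        best = min(nodes, key=lambda n: estimated_finish(task, n))
        best.busy_until = estimated_finish(task, best)
        assignment[task.name] = best.name
    return assignment


if __name__ == "__main__":
    nodes = [Node("A", bandwidth_mbps=100), Node("B", bandwidth_mbps=1000)]
    tasks = [
        Task("t1", input_mb=1000, compute_s=10, local_node="A"),
        Task("t2", input_mb=500, compute_s=10, local_node="A"),
    ]
    print(schedule(tasks, nodes))
```

In this example both tasks have their data on node A. The first stays local, but the second is sent to node B, because moving 500 MB over a 1000 Mbps link (4 seconds) is quicker than waiting for node A to become free. A scheduler that looked only at data locality would miss that opportunity.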
The authors detail a series of tests they ran on a six-node Hadoop cluster spread over five physical hosts. Admittedly, most Hadoop clusters operate at a far greater scale than this, but the authors believe it should be possible to scale BASS to almost any size.