

We’re all familiar with the Hadoop skills gap by now, so I won’t delve into the details of why we’re facing this conundrum in this post. Rather, let’s focus on solutions to the problem.
It’s human nature to see the world in black and white. In the Hadoop world, this means there is one camp that says the only way to spur Hadoop adoption and close the skills gap is to develop better, more intuitive, easier-to-use tools. Let’s lower the barrier to entry by making the tools so easy to use that even a mediocre DBA could master them, goes this argument (no offense meant toward mediocre DBAs, by the way).
The opposing camp argues that the Hadoop skills gap can only be closed by training and educating both existing and aspiring data management pros on the intricacies of writing MapReduce jobs, tuning clusters and writing applications optimized for parallel processing. We need smarter, better-trained people, not idiot-proof Big Data tools, goes this line of thinking.
So which is it? Better tools or better education and training? The answer, as is often the case in life, is both. Like Big Data technologies themselves (MPP analytic DBs vs. Hadoop), the two options for addressing the Hadoop skills gap are not mutually exclusive. Indeed, they complement one another, much as Hadoop complements existing data warehousing environments. What we need are dynamic, innovative vendors (both start-ups and others) building the next generation of easy-to-use Big Data tools, along with accessible, effective Big Data education and training opportunities to build the talent base.
Hadoop Tools Abstract Away Complexity
Hadoop needs more tools that abstract away the underlying complexity of the framework and associated technologies to allow non-expert users to interact with the platform. In many cases, this means applying graphical user interfaces and other visualization techniques in place of writing complex code.
We are already seeing a number of such tools emerge both from the open source community – e.g. Datameer’s Hadoop-based business intelligence suite – and from the commercial market – such as the new visual Big Data integration capabilities from Informatica.
Big Data Expertise
At the same time, the more engineers, admins and developers who understand the intricacies of MapReduce and other Big Data processes and technologies, the more innovative and effective uses of Hadoop we are likely to see. Indeed, the enterprises that apply Hadoop in the most novel ways are the ones with the best chance of gaining significant competitive advantage.
Hadoop and Big Data education must become an accepted university-level course of study so that more college graduates can move seamlessly from academia to enterprises that rely on Big Data and analytics to drive innovation and improve efficiency. The good news is that this process has already begun, with programs sprouting up at USC, N.C. State, NYU and elsewhere, but more programs are needed.
But we can’t ignore existing DBAs, systems administrators, and application developers. They and other working data management pros need training opportunities that let them learn Big Data both on the job and in more formal classroom and lab environments. Again, we are seeing progress in this area, with training seminars and courses from Hadoop providers like Hortonworks (see Hortonworks’ VP of Marketing John Kreisa discuss Hadoop training and tools at Hadoop Summit 2012 in the video below) as well as industry organizations like TDWI.
It’s also critical that CIOs, IT management and executives give data management pros the leeway they need to take advantage of these resources, both by allowing time off from work for training and by providing tuition reimbursement. After all, well-trained Hadoop pros will return significant value to the enterprise, so the enterprise should bear some of the burden of training them.
Only by fighting the Hadoop skills gap war on two fronts – better tools, and better education and training – will we as a community emerge victorious.