UPDATED 12:01 EDT / APRIL 22 2014

The academic + corporate perspectives on Big Data education: Cloudera goes to school

Cloudera may have received a hefty investment from Intel, a move that validated Big Data efforts and cloud-driven service models for an emerging market, yet the platform provider is continuing to make significant investments of its own, namely in Big Data education.

Among the first companies to offer Big Data management services, Cloudera has every reason to encourage the development of skill sets required in data science. This burgeoning sector is high on job openings but low on applicants, and the gap is only growing. More colleges and universities are now offering Big Data courses, and interested parties like Cloudera and IBM are teaming up directly with education organizations to develop curricula and extend learning opportunities in the classroom and online.

Just last week Cloudera announced a hands-on training course for designing and building Big Data applications. The program teaches developers to analyze and solve real-world problems using Apache Hadoop and associated tools in its Enterprise Data Hub. The four-day course walks developers through the entire process of designing and building solutions, including ingesting data, determining the appropriate file format for storage, processing the stored data, and presenting the results to the end user in an easy-to-digest form.

In the heart of Silicon Valley, Cloudera has also teamed up with San Jose State University (SJSU) to provide students the hands-on experience of working with Big Data technologies. Today we hear from Cloudera’s Senior Product Manager Ryan Goldman, and Peter Zadrozny, Founder and CTO of Opallios and Adjunct Professor of Big Data Analytics at SJSU.

The two discuss the need for more technical training in the Big Data sector, the obstacles faced by enterprises and educational groups, and how SJSU is responding to trends such as the fragmentation of Big Data educational programs designed for MBAs, math majors and beyond.

The corporate perspective

How does Cloudera seek and retain Big Data engineers, and how have your company’s online Big Data education tools been received since launching?

Cloudera’s Senior Product Manager Ryan Goldman

Goldman: Cloudera’s firm belief is that most organizations already have the manpower to get started, but need help training existing developers, administrators, and data analysts to use the tools of the enterprise data hub with maximum impact and minimum friction. Cloudera University classes focus on hands-on experience with real data that simulates the most common problems our customers see in the wild, and on the best practices to overcome the learning curve. The ultimate goal is to treat Hadoop (and your data) like an enterprise resource, not a sandbox environment.

The online resources focus on both ends of the adoption funnel: the very top, where people early in their Big Data journey are trying to learn about Hadoop and develop their own professional use cases by working through introductory materials and tutorials, and the more advanced practitioners at the bottom of the funnel who have been using the tools of the enterprise data hub in development or production but are looking to advance their skills and keep up with the state-of-the-art on new tools, new features, new functionality, and new power-user insights.

Cloudera University’s online training resources range from Introduction to Hadoop and MapReduce and Introduction to YARN and MapReduce 2 to Writing UDFs for Hive and Pig and our newly released Data Science Challenge Solution Kit.

What are you hearing from clients regarding obstacles in finding Hadoop/Big Data engineers?

Goldman: Hadoop developers and administrators are now the most sought-after technical professionals in the world (according to BusinessInsider.com). Because there are still relatively few certified Big Data professionals available to hire and because there is such a high premium for Hadoop experience (25 percent salary bump, according to Dice.com), the sentiment from the industry is that internal training, complemented by professional architecture services and consulting, is a requirement for success.

Although there’s often a predisposition towards learning the tools on the job, customers who have trained their teams and engaged Cloudera’s Solution Architects onsite tell us that proactively enabling their Big Data operations, development, and analytics teams is the key to realizing the objectives for which Hadoop was adopted in the first place: saving money through operational and data management efficiencies and driving towards new revenue opportunities through information-driven projects that require a level of strategy and experience to deliver on a proposed use case.

How is Cloudera building in functionality, user interfaces and automation capabilities to democratize data for employees that aren’t skilled in Hadoop and advanced analytics?

Goldman: Cloudera’s recent introduction of an enterprise data hub featuring Hadoop at its core focuses on enabling the business end-user — from BI analysts to marketers to members of the CFO’s office — to become more information-driven and to use data as a competitive advantage without the complexity usually associated with learning new technologies. In fact, the benefit of an enterprise data hub is that customers can use the skills and resources they already know and love to perform more varied queries, on bigger, more diverse data sets, faster, interactively, when they want, without the pain of requesting the data, moving the data to the query, or having to build out new systems, all with full governance and data security.

This is all possible because the enterprise data hub centralizes more data of all structures and formats and makes it accessible via a huge variety of familiar business tools already in use: MicroStrategy, Tableau, QlikView, Splunk, SAS, Revolution, etc.

Moreover, for the early-stage user, including college and graduate students in analytics, software innovations around interactive SQL via Impala and interactive search via Cloudera Search make Cloudera’s platform the most appropriate starting point for the journey to Big Data.

The academic perspective

What’s SJSU’s corporate partnership strategy to determine employer needs, and how is that converted into academic programs?

Peter Zadrozny, Founder and CTO of Opallios and Adjunct Professor of Big Data Analytics at SJSU

Zadrozny: While I cannot answer for all of SJSU, I can say that the Computer Science department is focused on incorporating courses related to the areas of Big Data, cyber-security and gaming. In addition to the team of professors the department has, it uses industry experts as lecturers to bring the most recent technological advances into the classroom, and so better prepare the students to successfully contribute when they go to work.

For example, the Big Data analytics course I teach was designed with the employer in mind. After interviewing a number of CIOs and VPs of Engineering, it became clear that the potential employers want to have candidates that have gone through the worst of the learning curve of the most popular Big Data tools, in this case Hadoop and Splunk.

What trends are you seeing as far as education program fragmentation for niche areas of Big Data, such as Business Analytics or Big Data for Supply Chain Management?

Zadrozny: Every area or domain of expertise wants to take advantage of Big Data technologies, but we feel that it isn’t scalable to have specialized courses for every area. The solution we have developed is to break that problem into two parts. The first one, which is the focus of the Big Data analytics course, is to create what I call data wranglers. At the end of the course, the students have hands-on experience with Hadoop and Splunk; they know how to use these tools and how to (and how not to) analyze the data at hand.

The second part is where we team up these data wranglers with the domain experts, who have a deep understanding of the data that is being analyzed and can provide direction to the data wranglers. Together, they can achieve incredible results.

What type of change (increase or decrease) have you seen from students regarding Big Data engineering programs?

Zadrozny: The growth has been unbelievable. When I started teaching the Big Data analytics course two years ago we barely got a couple dozen students to register. This semester the 30 slots of the course were sold out in no time and the waiting list had 21 students.

To get the student’s perspective, we asked Pradeep Roy, a master’s student in Computer Science, why he chose SJSU.

The student perspective

Why did you choose SJSU?

Roy: I cannot emphasize enough what being in the heart of Silicon Valley does to an individual. You gain a wider perspective, breathe innovation, watch the world change and eventually change it yourself. San Jose State University accords you this while grooming you with its quality, practice-oriented courses and exposure. That is precisely why I chose SJSU for my master’s after my undergrad and a brief work stint in India. Big Data has fascinated me for a while now and as the technology evolves, SJSU is one of the few schools that could offer me the desired infrastructure along with its robust industrial relations with several Silicon Valley companies currently migrating to Big Data.

Overall, SJSU has everything it takes for a Computer Science major to learn the tricks of the trade, grasp the intricacies of the ever-changing industry domains and make it big for yourself (and perhaps several others). A Spartan, through and through!

Big Data and education are two terms that play very well together. The more we understand about the world, the better our schools and educational processes will be. Equally important is the focus being put on educating both businesses and people on how to analyze Big Data. The jobs in analyzing and predicting with these new, expansive data sets, and the people who fill them, will be imperative to the next wave of technological innovation.

feature image : dcJohn via photopin cc
