Alistair Croll and Edd Dumbill are the co-chairs of the O’Reilly Strata Conference. They dropped in on theCube at Strata-Hadoop World 2012 to talk Big Data. The conference has seen incredible growth and maturation, which illustrates the rising importance of the topics covered here. Croll notes that this is right where the event should be: anticipating the curve and finding the spot where humans touch technology. The event is now in the phase where the hype is behind it, and real-world issues and applications dominate the discussion, as the technology has climbed into the “people stack”.
Navigating that path depends on recognizing the importance of data. The concept of “garbage in, garbage out” is critical. Because data can certainly lead to bad things such as privacy issues, civil rights issues and more, curating it is a central strategy here: if you can feed an organization good data and nourish it, data can help the organization be healthy and wise. Edd Dumbill notes that machines exist in service to people. Data tells us things about people, so a focus on design, and thus on user experience, is extremely important.
Data has things to conquer, and the approach is adapting to these situations. One element is design, defined here as helping people arrive at an outcome that you want. The simpler the element you are dealing with, the easier it is to line up with the business. As we simplify, cloud computing becomes a tool and then just computing, cloud storage becomes just storage, and Big Data becomes just data. Historically, organizations made decisions based on guesses. What they now possess is the ability to ask good questions, get analysis, and base decisions on it. The organizations that win are those that can turn that information into an outcome. This then sits in the realm of organizational structure and decision theory, and is at the heart of cultural change in a company. Analytics often need to be accompanied by coaches who can say such things as:
“You’re getting closer to that outcome”
“These are the people resisting that outcome”
“Here are the obstacles between you and that new outcome”
Thus, programming the organization to pursue what the data suggests remains a completely unresolved problem. Over time, if great metadata and support traced throughout the system can help people understand why they made decisions, then understanding what decisions were made, and how, as data flows becomes the norm.
Dumbill showed off a small conference project, with multi-interface data sensors placed throughout the Strata event. These fifty sensors have been streaming data up through a wireless mesh network and feature a slew of sensor types: PIR, temperature, noise, and humidity. The project is a working example of data collection, tracking and delivering data to a team of data scientists. Along with photos, hotel data, and other points of information, the lab’s results and the analysis drawn from them are planned to be shared at next year’s conference.
Strata is looking to the future, where a metadata wave is expected. Metadata goes beyond Hadoop and raises some important considerations. The shift to metadata reinforces Cloudera’s statements that interactive access to data is crucial. The need to cope with streaming data will continue to be important, as will design. The focus will turn to developing a strategy for discovering valid enterprise data architectures as they actually start to emerge. These will come from ongoing case studies, and harvesting data easily from these models will drive consumption in new ways.