NASA Talks About Big Data, Hadoop


NASA scientist Chris Mattmann sat down in theCube at Hadoop Summit 2012 with Stuart Miniman and Abhi Mehta, co-founder of Tresata. Mattmann holds a Ph.D. and is, among other things, a senior computer scientist at the NASA Jet Propulsion Laboratory. Throughout the interview, Mattmann discusses scientific use cases for Big Data, some of the massive computing challenges that Big Data can help solve, and how these relate to the world outside of NASA.

One of the first applications Mattmann mentions is the Square Kilometer Array, a massive undertaking composed of a network of radio telescopes that will collectively form the largest radio telescope ever built. This international project is the next-generation radio astronomy instrument and will require an astonishing data rate of 700TB per second. Mehta adds that this is 15 times greater than the requirements of the Large Hadron Collider, the particle accelerator project that is producing scientific findings through an international effort. Mattmann states that it will take decades of research to support that data rate. Also discussed is Big Data’s utility in the US National Climate Assessment. NASA is trying to motivate the development of technologies and the use of solutions like Hadoop to answer these challenges.

Hadoop’s roots are described as a cottage industry nurtured on the web over the last 6 years and grown to the level it is at today. Mattmann was one of the original Nutch committers in the open source world. Hadoop was originally hard to install and difficult to use, and the changes in the current generation are significant: many more people are able to install it, deploy it, and gain knowledge from its use. That is how far Hadoop has evolved.

NASA’s applications for Big Data relate to business in many ways, but they differ in the types of data they handle. For NASA, much of the focus is on remote sensing instruments. The tools that operate on data from these instruments are quite different, and the scientists who use them rarely have any formal programming training. So there are differences in the variety and velocity of information. Predictive analytics and Big Data have further applications that are not yet fully realized, as in the case of work being done by Dr. Tom Painter, a snow hydrologist who is measuring snowpack levels. This relates to climate change tracking, and the effects extend to parks, recreation, and snowfall scenarios, to name a few. Using Big Data in climate study can produce some real answers and draw on the retrospective model that the body of information provides to advance technology and thinking on this topic.