UPDATED 13:21 EDT / MARCH 23 2011

The Role of the Data Scientist: Infographic

Data science has been around for some time, but only recently has it become popular enough to give rise to a new breed of professionals called data scientists. As the term suggests, a data scientist specializes in analyzing and interpreting data, then building products from that data to keep a particular consumer base interested. “The sexiest job in the next 10 years will be statisticians,” said Google chief economist Hal Varian.

According to this Wikibon infographic, data science can be broken down into four parts: mining data, statistics, interpret, and leverage. Mining data is essentially collecting and formatting information; statistics is the analysis of that information; interpret is the representation or visualization of data in infographics, graphs, charts, etc.; and leverage is applying the data, seeing how it interacts with other data, and eventually coming up with predictions from studying it.

The role of the data scientist across these four parts comes down to scouring, organization, extraction, and expansion. Scouring means data scientists keep their eyes on information around the web; organization is the voice that asks what they hope to accomplish by the end of the project; extraction is pulling out the information they want and organizing it using mathematical methods, including factor analysis, regression analysis, correlation, and time series analysis; and finally come expansion and application.
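To make two of the methods named above more concrete, here is a minimal sketch of correlation and simple regression analysis in plain Python. The data set (advertising spend versus sales) and all function and variable names are hypothetical, made up for illustration; they do not come from the infographic.

```python
from math import sqrt


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def linear_regression(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    slope = cov / var_x
    intercept = my - slope * mx
    return slope, intercept


# Hypothetical data: advertising spend and resulting sales (arbitrary units).
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [2.1, 3.9, 6.2, 7.8, 10.1]

r = pearson_correlation(ad_spend, sales)
slope, intercept = linear_regression(ad_spend, sales)

# "Expansion and application": use the fitted line to predict an unseen point.
predicted_sales = slope * 6.0 + intercept

print(f"correlation: {r:.3f}")
print(f"fit: sales = {slope:.2f} * ad_spend + {intercept:.2f}")
print(f"predicted sales at ad_spend = 6: {predicted_sales:.2f}")
```

A correlation near 1 suggests the two series move together, and the fitted line is what lets the analyst go from describing the data to predicting from it.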

Since data is the basis for formulating new theories and predictions, it is important that data scientists ask questions that stretch the data beyond hard numbers and facts, apply information in a useful, innovative manner, and promptly process the terabytes of data flowing in to prevent pile-up and missed opportunities.

Data scientists need skills that serve not only themselves but the entire team. The designer depends on the information architect, the information architect depends on statistics from the statistician, and so on. A data scientist has to have more than one skill, and the essential skills for these professionals are hacking and computer science; expertise in mathematics, statistics, and data mining; and creativity and insight.

Facts and statistics can be very useful, but they can be accurate or inaccurate, and even damaging, depending on how they are presented. They become destructive when facts are left out and a selective collection of information is used to favor a certain opinion.

A little history of data science: it started back in 1970 with the U.S. Census’ big data collection project. The first hard drive was a 5-megabyte server the size of a refrigerator. Collecting massive amounts of data involved human input through a process called crowdsourcing, as with Amazon’s Mechanical Turk. Today, we have 32-gigabyte microSD cards that measure around 5/8 x 3/8 inch and weigh about 0.5 grams, and we use cloud computing to collect big data. “The computing and processing of data is literally 100 to 1000 times faster and cheaper than before,” said Greenplum’s Scott Yara.

