Breaking Analysis: The Big Data Lottery Ticket – Why Data Science Isn’t Easy

Using big data has big payoffs, if you know how to be a data scientist. According to GigaOm, becoming one is easier than you think. Not so fast, says blogger and developer Joseph Misiti. I had to comment on this on my Breaking Analysis program this morning.

Is data science easy? No. In fact, it’s hard.

Here is my video from SiliconANGLE TV Breaking Analysis program.

Data Science is hard.

Here are the two posts that caught my attention and that I commented on in today’s Breaking Analysis program.

GigaOm story: Why Becoming a Data Scientist Might Be Easier than you think

Joseph Misiti’s (developer/blogger) reaction to the GigaOm article: Why Data Scientist is Not Easy Actually


The issue is that the article and the commentary both assume that a data scientist is one person. As we well know, it takes a village. A data scientist realistically shouldn’t be one person: computer programming, statistics, and business acumen are very separate specialties.

Everyone in the tech business is talking about the mechanics and tools, but there is little real discussion about knowing what to do with the data once you have it. Every company, startup, or developer needs someone who knows what “gold” looks like in order to find “nuggets” in the data.

The Big Trend – It’s not about hitting the Lottery

The programming/mechanics piece will be a commodity in 5 years. What will differentiate one real data scientist from another is the ability to come up with a hypothesis, formulate a research plan to test it, and make money from the outcome. Instincts about a market are still critical for coming up with the hypothesis. Running a bunch of tests and looking for significant relationships, by contrast, is fraught with statistical error: test enough relationships and some will look significant by chance alone. I call this the Big Data lottery ticket mindset: scratch away until you hit something.
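To make the lottery ticket problem concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not something taken from either post: the function names, the 200 features, the 100 rows, and the 0.05 threshold are all hypothetical choices. Every feature is pure noise with no real relationship to the outcome, yet testing each one at the 0.05 level should still flag roughly 5% of them as "significant". The Bonferroni correction shown at the end is one common way to account for the number of tests; it is included only for contrast.

```python
import math
import random
from statistics import NormalDist


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def p_value(r, n):
    """Approximate two-sided p-value for a correlation r over n rows.

    Uses the Fisher z-transform: with no real relationship,
    atanh(r) * sqrt(n - 3) is approximately standard normal.
    """
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2 * (1 - NormalDist().cdf(abs(z)))


def scratch_tickets(num_features=200, num_rows=100, alpha=0.05, seed=0):
    """Test many pure-noise features against a pure-noise outcome.

    Returns (hits with no correction, hits after Bonferroni correction).
    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    outcome = [rng.gauss(0, 1) for _ in range(num_rows)]
    p_values = []
    for _ in range(num_features):
        feature = [rng.gauss(0, 1) for _ in range(num_rows)]
        p_values.append(p_value(pearson_r(feature, outcome), num_rows))
    naive_hits = sum(p < alpha for p in p_values)
    # Bonferroni: divide the threshold by the number of tests performed.
    corrected_hits = sum(p < alpha / num_features for p in p_values)
    return naive_hits, corrected_hits


if __name__ == "__main__":
    naive, corrected = scratch_tickets()
    print(f"'Significant' with no correction: {naive} of 200 noise features")
    print(f"Still significant after Bonferroni: {corrected} of 200")
```

Starting from a hypothesis means running one planned test instead of 200 exploratory ones, which is why the uncorrected count here says nothing about where the "gold" is.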

Bottom Line: Today it’s very hard and the hurdles are high. Parallel programming is hard, but map-reduce techniques, new innovation, and better tools should reduce this complexity over time.

We all want it to be simpler and easier – that is what the top entrepreneurs are working on.

About John Furrier

Founder and CEO of SiliconAngle.com.

2 Responses to Breaking Analysis: The Big Data Lottery Ticket – Why Data Science Isn’t Easy

  1. jilldyche says:

    Interesting post, John, but the whole point of Big Data is to avoid the “instincts” you write about. In fact, the beauty of Big Data is that–arguably for the first time ever–it’s more practical to perform raw exploration on unstructured data with lightning-fast performance via Hadoop and *develop* a hypothesis based on that data discovery. In other words, Big Data provides us with the wherewithal to dispense with (subjective and often-fraught) instinct and make data-driven decisions about our customers, products, patients, genomes, cancer cells, land parcels, weather patterns, and all the other data you’re hearing about at #strataconf. 
     
    ‘Bout time, I’d say.
     
    Jill Dyche
    @jilldyche

  2. Jeff Frick says:

    Seems like you’ve got to start with some type of idea, and then let the process and results take you on a “random” walk to discovery.
