UPDATED 16:56 EDT / DECEMBER 09 2009


Clean vs. Dirty Data: Data as the New Developer Kit [Twitter and Facebook]

Facebook is racing to open up their privacy settings. Why? Their $15 billion valuation and their future depend on it.

The faster Facebook opens up the data, the faster a few things happen. First, Facebook can start implementing and then scaling a search offering. Second, Facebook can start rolling out new ad products that command massive premiums from advertisers for that once-locked data. This is huge; more on that later.

Third, Facebook can start fostering a healthy and profitable ecosystem of third-party developers to build much-needed new applications and tools that provide better user experiences. Facebook can’t do it all on their own, and they want to have a developer ecosystem. If you’re interested in Facebook’s engineering vision, read this interview that I did with their VP of Engineering, Mike Schroepfer.

The New Developer Kit – DATA

I just posted my Angle on the Twitter Firehose Myth: What you need to know about Twitter’s APIs.

Twitter’s big developer focus stems mainly from their accidental success with developers, which came from their clean (or unstructured) data. I think that the Twitter data is a big win for developers, and I’m glad to see them reinforce their position as “friendly” to developers. In this cloud- and social-media-infested market of innovation, all creative developers love access to data, and tons of it.

One thing that isn’t being talked about in Twitter’s announcement about their firehose is the quality of the data. From a developer’s perspective, there is clean data and there is dirty data. Let me elaborate.

Clean Data: Twitter

Twitter data is easy to work with. We have seen massive innovation and new venture creation around the data on Twitter. This is in direct contrast to Facebook. Facebook is moving fast to change this (and rightly so) with today’s announcement that they are going to open up their privacy settings. Translation: Facebook’s data, although huge, has been closed and hence messy for developers. Twitter’s data is huge and very open, hence great for creative developers.

Dirty Data: Facebook

Facebook data is massive. At Supernova it was said that Facebook has over 10 billion shares per day, and that’s just on the sharing. However, Facebook’s own success (an invite-only social graph) has been their biggest Achilles’ heel. Developers have had a hard time dealing with their data due to all the privacy settings. I’ve also heard that the privacy settings have prevented Facebook from really “killing it” in deploying search and selling huge, scalable advertising deals (not CPM deals but data-based deals).

We can see the evidence of this in the success (or lack thereof) of the Facebook platform. Frankly, it has been problematic (just ask Scott Rafer and others). That is why we are seeing Facebook shift quickly to pushing and expanding on Facebook Connect. Facebook Connect is a much cleaner value proposition for developers and users, and a better move for Facebook.

Having open data and clean data is a wonderful thing for developers. Let’s hope Facebook can get there fast.
