UPDATED 08:41 EDT / JULY 17 2013

We Take On the Next Big Data Mystery @ MIT: Quality vs. Quantity [LIVE Broadcast]

Quality vs. Quantity: this is the looming question when it comes to, well, almost anything. The same question can be directed at the busy Big Data industry, where large-scale data collection is fast becoming an expected standard. Now that we can collect and analyze more data than we ever imagined, how do we determine the quality of that data? How would you define data quality in Big Data, and what are the criteria?

First off, data are considered high quality if “they are fit for their intended uses in operations, decision-making and planning,” a definition easily applied to the emerging applications of Big Data.

Big Data should indeed undergo data cleaning (also called data cleansing), in which corrupt or inaccurate records in a record set, table, or database are detected and then corrected by replacing, modifying, or deleting them. It’s easier said than done, especially when you’re dealing with huge amounts of data. One of the best ways to go about this is to cluster the data so that records with similar features are grouped together, ultimately making the data more useful for data scientists.
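To make the detect-then-correct idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the sample records, the `temp_c` field, the plausible range, and the choice of the median as a replacement value are hypothetical, not a method described by anyone quoted in this article.

```python
from statistics import median

# Hypothetical records: None marks a missing reading, 999.0 a corrupt one.
records = [
    {"id": 1, "temp_c": 21.5},
    {"id": 2, "temp_c": None},
    {"id": 3, "temp_c": 999.0},
    {"id": 4, "temp_c": 22.1},
    {"id": None, "temp_c": 20.9},
]

VALID_RANGE = (-50.0, 60.0)  # assumed plausible range for this field


def is_valid(value):
    """Return True if the reading is present and inside the assumed range."""
    return value is not None and VALID_RANGE[0] <= value <= VALID_RANGE[1]


def clean(rows):
    """Delete rows without an id; replace bad readings with the median of good ones."""
    good = [
        r["temp_c"]
        for r in rows
        if r["id"] is not None and is_valid(r["temp_c"])
    ]
    fill = median(good)
    cleaned = []
    for r in rows:
        if r["id"] is None:
            continue  # delete: the record cannot be identified
        # replace: swap a missing or out-of-range reading for the fill value
        temp = r["temp_c"] if is_valid(r["temp_c"]) else fill
        cleaned.append({"id": r["id"], "temp_c": temp})
    return cleaned


cleaned = clean(records)
for row in cleaned:
    print(row)
```

Whether to delete a bad record or replace its value depends on the intended use of the data. At real Big Data volumes the same checks would typically run inside a distributed processing framework rather than over an in-memory list.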

Wikibon Principal Research Contributor Jeff Kelly explains in a recent post, titled “Big Data Adds Complexity, Nuance to the Data Quality Equation,” how Big Data quality can be overlooked in some industries, while in others it can mean saving lives.

“Big Data evangelists maintain that the sheer volume of data in Big Data scenarios mitigate the effects of occasional poor data quality,” Kelly writes. “If you’re exploring petabytes of data to identify historical trends, a few data input errors will barely register as a blip on a dashboard or report. Is it even worth the time and effort, then, to apply data quality measures in such a scenario? Probably not.

“But that doesn’t mean data quality isn’t important to Big Data. This is particularly true in real-time transactional scenarios. Big Data applications that recommend medicines and doses for critically ill patients, for one, better be relying on good data. Same goes for Big Data operational applications that support commercial aviation, the power grid and other Industrial Internet use cases,” Kelly says.

Watch today’s LIVE broadcast from MIT’s IQ Symposium

The MIT Chief Data Officer and Information Quality (CDOIQ) Symposium kicks off today in Cambridge, Massachusetts, and runs through the 19th. The symposium will focus on the importance of good data to the success of Big Data, with sessions such as How To Avoid The Most Common Big Data Problems, A Practical Approach To Data Governance, IQ and Compliance, The Role Of IQ In Performance Excellence, Human Factors In Information Quality, IQ Issues In Public Sector, Government, Healthcare, Finance, and The Latest Information Quality Research From MIT.

SiliconANGLE’s premier video production, theCUBE, will be at the event, extracting the signal from the noise. You can watch our coverage at SiliconANGLE.tv or tune in for updates here on SiliconANGLE, on Wikibon, and on Twitter at @SiliconANGLE, @CDOIQ, and @Wikibon.

Joining Kristin Feledy on this morning’s Live NewsDesk Show is SiliconANGLE Senior Managing Editor Kristen Nicole, discussing some of the topics we’ll be investigating at the MIT event. In the video below, Kristen provides her Breaking Analysis on how data practitioners should go about selecting good, high-quality data for analysis.

“This is an ongoing debate in the industry right now and we’ve reached a point… where we collected all these data, we’ve created ways to analyze it and now there’s a lot of data that’s here and we have to determine if that data is worth our time, if it’s not how we can determine the best data out of all these information that we are now able to collect and analyze.

“For businesses looking to make those determinations, there are certainly some emerging standards that are coming into the industry now, and that’s one of the topics we’ll be looking at closely throughout the rest of this year particularly for the MIT event that kicks off today.  Because this is going to be an increasing importance as more companies look to use data in their everyday practices, decision making, things of that nature,” Kristen explained.

