UPDATED 16:00 EDT / AUGUST 10 2012

Building Big Data: How Can Connected Cars Cut Down on Data Costs?

What’s the most connected device you currently own? Most people would point to their iPhone or iPad, or some kind of Android device. But in a few years’ time they could well be pointing to something much bigger – their cars.

If Verizon has its way, cars in the (near) future will feature embedded LTE that will allow them not only to send and receive data, but possibly even to act on the information they receive.

This sounds great, of course, but there’s one problem that needs to be tackled. Wi-Fi itself is free, but mobile data comes at a very big cost. Unless something is done to minimize the costs of mobile data, the dream of being liberated by an always-connected vehicle could well be shackled by an enormous bill.

Lucky for us then that the researchers at MIT, working alongside experts from Georgetown University and the National University of Singapore, are trying to find an answer to the problem.


It’s not an easy problem to solve because, by their nature, cars are constantly on the move, forming new connections and breaking off old ones, drifting in and out of different networks. By definition, a distributed network of cars is a very messy one, and getting all of these unpredictable nodes to cooperate with one another in order to reduce data costs is a tall order.

Still, it might just be possible to do so with some rather complicated mathematical equations, according to MIT graduate Alex Cornejo.

Cornejo and his team have been working on an algorithm that would allow for the internet-bound data of hundreds of different cars to be aggregated and compressed, before being sent over a single LTE connection – something that would substantially reduce the bandwidth (and hence, the costs) being used by all of the cars in that network.

The theory largely depends on proximity. Two cars need to be in Wi-Fi range, looking to hook up to the internet to send an email, download content or whatever. One of these cars would send its data to the other, instead of sending it directly to the server. Then, as these cars move through the network, patterns begin to emerge, which determine which of the vehicles is to become an aggregation node for the entire network.

Initially, the car chosen as the aggregator would be selected at random, but as Cornejo explains, the choice would then become more biased as individual vehicles start amassing more and more data:

“Cars that have already aggregated a lot will start ‘winning’ more and more, and you get this chain reaction. The more people you meet, the more likely it is that people will feed their data to you.”

Once that car has enough data aggregated, it would then establish a connection to download and upload content to the internet, before redistributing this throughout its network.
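To make that chain reaction concrete, here is a minimal Python sketch of the idea. It is not the researchers’ actual algorithm: the rule that a car wins a hand-off with probability proportional to the data it already holds is an assumption made for illustration, and the names pick_aggregator and simulate are hypothetical.

```python
import random


def pick_aggregator(load_a, load_b, rng):
    """Return 0 if car A should receive the data, 1 if car B should.

    Assumed rule (not taken from the article): each car wins with
    probability proportional to the data it has already aggregated.
    With equal loads this is a fair coin flip, i.e. a random choice.
    """
    total = load_a + load_b
    return 0 if rng.random() < load_a / total else 1


def simulate(num_cars, num_meetings, seed=0):
    """Simulate random pairwise meetings; return each car's aggregated load."""
    rng = random.Random(seed)
    loads = [1] * num_cars  # every car starts with one unit: its own data
    for _ in range(num_meetings):
        # Only cars that still hold data can take part in a hand-off.
        active = [i for i, load in enumerate(loads) if load > 0]
        if len(active) < 2:
            break
        a, b = rng.sample(active, 2)
        if pick_aggregator(loads[a], loads[b], rng) == 0:
            winner, loser = a, b
        else:
            winner, loser = b, a
        # The losing car feeds everything it holds to the winner.
        loads[winner] += loads[loser]
        loads[loser] = 0
    return loads


if __name__ == "__main__":
    loads = simulate(num_cars=1000, num_meetings=995)
    holders = [load for load in loads if load > 0]
    print(len(holders), "cars hold all of the data:", sorted(holders, reverse=True))
```

Each meeting empties one car, so 995 meetings among 1,000 cars leave exactly five cars holding everything. A real network would also have to cope with cars that only meet their neighbours and with cars that drive off mid-transfer, neither of which this toy model covers.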

Cornejo explains that the nature of the data itself would determine how much time is spent aggregating before a connection is established. Emails, for example, have a much longer ‘shelf-life’ than other types of data, and so these could be bounced between hundreds of different vehicles before finally finding their way onto the web. Obviously, real-time connections such as Skype wouldn’t be able to put up with this kind of delay, but Cornejo claims that it would still be possible for a handful of vehicles making VoIP calls to share a connection.
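One way to picture the ‘shelf-life’ idea is as a deadline attached to every piece of data. The Python sketch below is an illustration only: the Item class, the should_connect function and the shelf-life figures are all assumptions, not details from the research.

```python
from dataclasses import dataclass


@dataclass
class Item:
    kind: str          # e.g. "email" or "voip"
    created_at: float  # time the data was produced, in seconds
    shelf_life: float  # how long it can wait before it must be sent, in seconds


def should_connect(items, now, margin=0.0):
    """Return True once any buffered item is at (or within `margin` of) its deadline."""
    return any(now + margin >= item.created_at + item.shelf_life for item in items)


# Illustrative values only: an email can wait an hour, a voice packet cannot wait.
email = Item(kind="email", created_at=0.0, shelf_life=3600.0)
voice = Item(kind="voip", created_at=0.0, shelf_life=0.15)

print(should_connect([email], now=60.0))         # email alone: keep aggregating
print(should_connect([email, voice], now=60.0))  # voice packet overdue: connect now
```

Under this assumption an aggregator holding only email can keep collecting data for a long time, while a single real-time item forces a connection almost immediately.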

It could well be possible to aggregate all of the data from 1,000 cars into just five connections, even when taking into account those vehicles which suddenly disconnect and disappear with all of the data they had stored up. To hold everything together, so to speak, the algorithm needs to define distinct ‘clusters’ of vehicles within all of that traffic that are likely to remain in the same network long enough to establish a connection and make sure everybody’s data is sent out and received.

This is where the researchers’ biggest problem lies at the moment. Should that distinction between ‘clusters’ break down, the whole system would fall apart with it. You can have 1,000 cars in a single cluster aggregating data, but if that cluster comes across another one in its vicinity, and data starts passing back and forth between them, aggregation could well become impossible.

