UPDATED 10:00 EDT / FEBRUARY 15 2012

April’s the Best Month to Fly: Rearden, Hadoop Make Sense of Flight Data

Big data’s predictive power has countless applications, and recent studies suggest it can forecast things like property markets better than real-estate analysts themselves. Now next-generation ecommerce platform Rearden Commerce is using Hadoop to analyze flight performance and related factors. Rearden uses the data to recommend travel plans for consumers via its Deem Travel app, combining sentiment analysis with other research to determine your best flight options. With this latest study, Rearden has also created a nifty animation to illustrate flight data in the US (see video below).

Best days to fly

The findings reveal that over the 10-year period studied, a flight had a 13 percent chance of being delayed, cancelled or diverted. The worst day of the year to fly is Dec. 23, with 1 in 4 flights late, diverted or cancelled, and December is the worst month to fly. On the other hand, Oct. 3 was said to be the best day, with only seven in a thousand arrivals or departures late, cancelled or diverted. While October may have the best day, April is hailed as the best month to fly, followed by September through Thanksgiving, with some regional exceptions.

Winter and summer seasons are full of flight delays. Surprisingly, however, July 4th is an exception as it displays stellar on-time performance. At the end of the school year, better than average on-time performance turns sour, with delays sweeping from southeast towards northwest.

Best airports for departures, layovers

Among the many airports in the United States, EWR in Newark, New Jersey has the worst arrival performance, with a 1 in 4 chance of arriving late. DUT in Dutch Harbor, Alaska was said to be the overall worst, with a 1 in 3 chance of arriving late. Let’s just be thankful some flights were still taking off. It’s Alaska, after all.

Getting the numbers

“We wanted to take a look at what the Bureau of Transportation has on on-time performance for domestic flights, and what they had to say about helping users pick up a connecting hub during tough times of the year,” says Steve Bernstein, head of analytics at Rearden.

“Conventional wisdom says don’t fly through Chicago in winter, but there’s probably other choices for end users, so we want to recommend flight times such that we can fit one piece of the puzzle with predictive analytics for choices for travel that best fit their needs.”

• The source of the data is the Bureau of Transportation Statistics (BTS), covering all 67 million flights from 1/1/2001 through 12/31/2010 as reported to the BTS by the airlines themselves. Only airlines with at least 1% US domestic market share in dollar terms are required to report, a total of 24 carriers.

• Average on-time arrival and on-time departure performance by airport across the entire ten year period.

• Aggregated hourly arrival and departure on-time performance by hour of the year (8,760 hours in a year) across the ten year period, normalizing the leap years 2004 & 2008 such that 2/29 = 3/1.

• For every hour of the year, we compare each airport’s performance in that hour to its overall ten year average to show when it’s out- or under-performing the “typical” case.

• We smoothed the performance with a 168 hour (= one week) central moving average to dampen the strong daily pattern in on-time performance. We did this because our objective was to uncover the seasonal patterns, not the daily patterns.

• We removed ten days of data starting 9/11/2001 because of the 100% cancellations for the three days after the attacks and the very high cancellation incidence in the aftermath.

• Finally we “rotated” the time period to start at the end of May, a relatively quiet traffic time, rather than January 1 so we could see the winter pattern in one contiguous go instead of breaking it up by setting the “edge” there.
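
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not Rearden’s actual Hadoop job: the function names are hypothetical, the smoothing window is assumed to wrap around the year end (the article does not say how the edges were handled), and “end of May” is assumed to mean midnight on May 31.

```python
from datetime import datetime

HOURS_PER_YEAR = 8760  # 365 days * 24 hours


def hour_of_year(ts):
    """Map a timestamp to an index 0..8759, folding Feb 29 onto Mar 1."""
    month, day = ts.month, ts.day
    if month == 2 and day == 29:
        month, day = 3, 1
    # A fixed non-leap reference year gives every year the same index.
    ref = datetime(2001, month, day, ts.hour)
    return (ref - datetime(2001, 1, 1)).days * 24 + ref.hour


def deviation_from_average(hourly_rates, overall_rate):
    """Difference between each hour's on-time rate and the long-run average."""
    return [rate - overall_rate for rate in hourly_rates]


def circular_moving_average(values, window=168):
    """Centered moving average over `window` hours.

    Assumption: the series is treated as circular, so the window wraps
    from late December back into early January.
    """
    n = len(values)
    if not 0 < window <= n:
        raise ValueError("window must be between 1 and len(values)")
    back = window // 2  # hours taken from before the current hour
    total = sum(values[(i - back) % n] for i in range(window))
    smoothed = []
    for i in range(n):
        smoothed.append(total / window)
        # Slide the window forward by one hour.
        total += values[(i - back + window) % n] - values[(i - back) % n]
    return smoothed


def rotate(values, start_index):
    """Re-order the series so that it begins at `start_index`."""
    return values[start_index:] + values[:start_index]


# Assumed rotation point: midnight on May 31.
END_OF_MAY = hour_of_year(datetime(2001, 5, 31, 0))
```

Given a list of 8,760 hourly on-time rates for one airport, the calls would be chained as `rotate(circular_moving_average(deviation_from_average(rates, avg)), END_OF_MAY)`, which keeps the winter months together in one contiguous run.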

Big data takes flight

Big data can predict the rise and fall of airfare, and which airports offer good value. It’s an important area of research for every member of the airline industry, from executive to consumer.  Rearden turns to Hadoop for analyzing sentiment, adding new layers of context to travel research and planning.  For Bernstein and his team, this is the “golden age for data.”  And consumers are on the winning side of this trend.

“If we combine all that info with flight data, we can see in reality for our users what their experience has been in respect to being late or diverted,” Bernstein says.  “We’re working for those who would benefit from us saying, ‘even though this flight through Chicago is cheaper, it may be worthwhile to fly through Denver,’ as the cost of not arriving on time to a meeting may be more valuable.”

Contributors: Kristina Farrah
