What comes after real-time for Big Data? | #WomenInTech


We know what Big Data is, we even figured out how to analyze large chunks of it in real-time. So what’s next for Big Data? And what current trends foster innovation for Big Data’s remaining gaps?

As we prepare for upcoming industry events focusing on Big Data, including #BigDataSV 2016, this week SiliconANGLE’s Women in Tech Wednesday celebrates the data innovators we have interviewed on theCUBE, from the SiliconANGLE Media team. The following women are not only established in current data projects but are also pushing the boundaries of where Big Data will go in the future.

Evolving to real-time data

Tendu Yogurtcu, Ph.D., GM of Big Data at Syncsort, Inc., spoke with theCUBE at Spark Summit East 2016 about the process of moving Big Data off legacy systems to achieve real-time analytics.

“I think they [businesses] are evolving to real-time, and it’s a process within most organizations because there are many different owners and groups and business units involved. Most of the Big Data applications start from the business … however, there is a lot of enterprise data held by more legacy groups, and it requires that collaboration at the business unit and sometimes IT drives that as well.

“So we see that transformation across organizations to more real-time, and one of the most common use cases we see is operational intelligence, with operational data from those legacy platforms being processed in the Big Data analytics. For example, Spark has that advantage because it became very popular. It has the promise of being that single compute platform for both streaming analytics and advanced analytics with machine learning, as well as batch analytics.”
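Yogurtcu’s “single compute platform” point can be sketched in plain Python (this is illustrative only, not actual Spark code, and the data is hypothetical): the same analytics function serves both a batch dataset and a simulated stream of micro-batches, which is roughly how a unified engine lets one code path cover both modes.

```python
from collections import Counter

def word_counts(records):
    """Core analytics logic, shared by the batch and streaming paths."""
    counts = Counter()
    for line in records:
        counts.update(line.lower().split())
    return counts

# Batch path: process a full historical dataset at once.
batch_data = ["error in payment service", "payment retried", "error cleared"]
batch_result = word_counts(batch_data)

# Streaming path: the same function applied to micro-batches as they
# arrive, merging results incrementally.
running = Counter()
for micro_batch in (["error in payment service"],
                    ["payment retried", "error cleared"]):
    running.update(word_counts(micro_batch))

assert running == batch_result  # one code path, two execution modes
```

The design choice being illustrated is that only the execution strategy differs between batch and streaming; the analytic logic itself is written once.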

Watch theCUBE interview with Yogurtcu below:

Angling data into intelligent insights

Anjul Bhambhri, VP of Big Data for IBM Analytics, joined theCUBE at Hadoop Summit 2015 to talk about how IBM is working with open-source projects to help customers gather insights from Big Data.

“We’ve all heard this from every customer: that they are spending 70 to 80 percent of their time shaping the data, getting it ready, so they can get value out of that data. So there is work happening in IBM, as well as outside, around data wrangling, data shaping and being able to do it programmatically, as well as using things like SQL for transformation. Using things like text analytics (and) machine learning. And then you need absolutely, very powerful visualization. We are embracing D3 (D3.js, a JavaScript library for creating data visualizations in the browser) as an extensible framework from a visualization standpoint, just to make it easier and easier to shape that data.
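The 70-to-80-percent figure refers to preparing raw data before any analysis can happen. A minimal, stdlib-only sketch of that kind of programmatic wrangling (the field names and cleaning rules here are hypothetical, not any IBM or D3 API):

```python
from datetime import datetime

# Raw records with typical quality problems: stray whitespace,
# inconsistent casing, mixed date separators, missing values.
raw_rows = [
    {"name": "  Alice ", "joined": "2015-03-01", "spend": "120.50"},
    {"name": "BOB",      "joined": "2015/04/15", "spend": ""},
    {"name": "carol",    "joined": "2015-05-20", "spend": "87"},
]

def clean(row):
    """Normalize one record: trim and title-case names, parse dates,
    and default missing spend to zero."""
    joined = row["joined"].replace("/", "-")
    return {
        "name": row["name"].strip().title(),
        "joined": datetime.strptime(joined, "%Y-%m-%d").date(),
        "spend": float(row["spend"]) if row["spend"] else 0.0,
    }

shaped = [clean(r) for r in raw_rows]
```

Once records are shaped into a consistent schema like this, they can be handed off to SQL transformations, machine learning or a visualization layer such as D3.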

“From an IBM standpoint, we have our predictive and prescriptive analytics portfolio around SPSS (Statistical Product and Service Solutions) so we are going to be leveraging things like projects from Hadoop and, of course, Spark to be able to scale out the predictive and prescriptive models and algorithms.”

Watch theCUBE interview with Bhambhri below:

What’s next for Big Data?

Hilary Mason, founder and CEO of Fast Forward Labs, spoke to theCUBE at the Grace Hopper Celebration of Women in Technology about the advancements in Big Data and where it will be going next.

“If you think back even just five or six years ago, getting Hadoop to work on a large dataset in a reliable way with a good response time was in itself a technical challenge. Now I can spin up an Elastic MapReduce cluster on Amazon Web Services in one line of code, and it works most of the time. So the reduction in cost, in time and in friction that we’ve seen over the last five years has been remarkable.

“When you look at what’s coming next, you can sort of think about it coming up the stack. First, we had to get the data in one place. Then we had to build a way to analyze it, but we didn’t really care how long it took; then we cared how long it took; now it’s real-time and in memory. The next thing, I think, is we will start to see some algorithms commoditized in that process, so you no longer have to write your own code for a recommendation algorithm or a variance model. A lot of that is happening in the open-source community.
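Mason’s example of a commoditized algorithm is a good one to make concrete. The item-based cosine-similarity recommender below is a generic textbook technique sketched in plain Python with hypothetical toy data (it is not code from any project discussed here); the point is how little custom logic such an algorithm needs once the pattern is standard.

```python
import math

# User -> item ratings (hypothetical toy data).
ratings = {
    "ana":  {"book_a": 5, "book_b": 3, "book_c": 4},
    "ben":  {"book_a": 4, "book_b": 5},
    "cara": {"book_b": 4, "book_c": 5},
}

def cosine(u, v):
    """Cosine similarity between two users' rating vectors."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v)

def recommend(user):
    """Score items the user hasn't rated, weighted by how similar
    each other user is, and return them best-first."""
    scores = {}
    for other, their_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine(ratings[user], their_ratings)
        for item, rating in their_ratings.items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + sim * rating
    return sorted(scores, key=scores.get, reverse=True)
```

In practice, as Mason notes, even this much hand-written code is disappearing: open-source libraries ship tuned versions of these algorithms, so teams compose them rather than reimplement them.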

“We’re starting to see the same thing around analyzing data that was previously too difficult to analyze, whether because it was too dirty (so we have products emerging to clean data) or because it was too complex, as in the case of image data.”

Watch theCUBE interview with Mason below:

Photo by Pixabay