NEWS
Spark's reach in data management and associated utilities is still growing from an integration perspective, but for the companies trying to raise its profile among business users, there are still hard roads ahead.
Stephen Sit, director of product management for Hadoop and Spark offerings at IBM, joined John Furrier (@furrier) and Jeff Frick (@jefffrick), cohosts of theCUBE, from the SiliconANGLE Media team, at BigDataSV 2016, where theCUBE is celebrating #BigDataWeek, including news and events from the #StrataHadoop conference. Their conversation centered on the linkages between Spark and Hadoop, how IBM is carving out Spark development as its own focus, and how to leverage open source effectively.
The discussion began with Sit looking back to the first serious considerations of Big Data being handled by Hadoop. “As we were first looking at the adoption of Hadoop … there was a mass amount of semi-structured data,” he explained. “[And] over time we started to see the emerging pattern around datalakes.”
While the approach soon produced solid results, difficulties beyond the infrastructural aspect of Hadoop quickly presented themselves. “It’s a great environment … but getting wider adoption from the line of businesses … has been somewhat challenging,” Sit noted.
The effort to move to a more accessible model has gotten a boost from Spark, however. “We certainly believe that to get Big Data to the next stage, with a wider adoption from the conventional business organizations, it needs to have something more interactive and consumable, and Spark is a big part of that,” Sit said.
Sit went into further detail on how he saw Spark and Hadoop relating to each other, currently and in the near future: “I think, certainly, I do want to give credit to Hadoop; it’s a great system. … I think what Spark brings to the table is not just about the performance … but from my point of view, the other key value is that unified set of APIs and allowing the data engineers and data scientists and applications developers to work more seamlessly together.”
That streamlining and reduction of “moving parts” seems to be one of the biggest draws, if not the biggest, that Spark has in its favor. “Hadoop is very scalable, is great, but on the drawback side, has all [these] pieces in the ecosystem,” Sit said, contrasting it with how “Spark gives you the opportunity to tighten and unify.” He also felt that Spark was much more conducive to “the way that data scientists actually work.”
The interview also touched on the role of CEOs trying to come to terms with these new technologies and on what IBM sees as the future of apps. On the first point, Sit said, “I think you’ve got to think about the end results of what you want to accomplish [as a CEO] … they shouldn’t focus on the infrastructure side … unless they have a really clear view of where they want to go.”
As to apps, he said, “We believe that the future application is all about intelligence built into the application.” Citing the development of a Spark Technology Center at IBM and its twofold effort to establish a strong open-source foundation and then leverage it through its own services, Sit provided a confident outlook on IBM’s plans for the different sides and approaches to Big Data.