BIG DATA
As pretty as it sounds, “streaming data” in an actual database is not the elegant, unbroken chain that data scientists and developers wish it were, according to Itamar Ankorion (pictured, right), chief marketing officer at Attunity Inc.
“You’re inherently building an architecture that takes what was originally a database, but you’re kind of, in a sense, breaking it apart into partitions as you’re loading it over time,” he explained in an interview at DataWorks Summit in San Jose, California.
Ankorion and Arvind Rajagopalan (pictured, left), director of global technology services at Verizon Communications Inc., spoke with Lisa Martin (@Luccazara) and George Gilbert (@ggilbert41), co-hosts of theCUBE, SiliconANGLE Media’s mobile livestreaming studio. (* Disclosure below.)
The partitions that result from ingesting continuous or incremental data streams can be awkward to pull together for cohesive analytics, Ankorion explained. Attunity is attempting to solve this problem with the announcement of Compose for Hive (Hive is Apache’s data warehouse software), which automates data lake processes.
“It reassembles these partitions, and it then creates analytic-ready datasets back in Hive. It can create operational data stores; it can create historical data stores, so then the data becomes formatted in a manner that’s more easily accessible for users who want to use analytic tools, BI [business intelligence] tools,” Ankorion said.
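The idea Ankorion describes can be illustrated with a minimal sketch. It is not Attunity’s implementation or Compose for Hive’s API; the record layout, the `build_stores` function and the sample data are all hypothetical, chosen only to show how incrementally loaded change partitions might be folded into a current-state (operational) view and a full-history view.

```python
from typing import Dict, List, Tuple

# Hypothetical change record: (sequence, op, key, row).
# This layout is an assumption for illustration, not Attunity's format.
Change = Tuple[int, str, str, dict]


def build_stores(partitions: List[List[Change]]):
    """Fold change partitions into a current-state store and a history store."""
    # Flatten all partitions and order the changes by sequence number,
    # since partitions may arrive with records out of order.
    changes = sorted((c for p in partitions for c in p), key=lambda c: c[0])

    current: Dict[str, dict] = {}  # operational view: latest row per key
    history: List[Change] = []     # historical view: every change, in order

    for seq, op, key, row in changes:
        history.append((seq, op, key, row))
        if op == "delete":
            current.pop(key, None)
        else:  # "insert" or "update": merge new values over the old row
            current[key] = {**current.get(key, {}), **row}
    return current, history


# Two partitions loaded at different times; the second is out of order.
partitions = [
    [(1, "insert", "A", {"qty": 5}), (2, "insert", "B", {"qty": 1})],
    [(4, "delete", "B", {}), (3, "update", "A", {"qty": 7})],
]
current, history = build_stores(partitions)
```

Here `current` ends up holding only the latest state of each surviving key, while `history` keeps every change in sequence order, which is roughly the distinction between the operational and historical data stores mentioned in the quote.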
Verizon struggled for some time to bring its three separate Enterprise Resource Planning systems’ data together, Rajagopalan said.
Standard reporting in each of the ERP systems was mature and adequate. Combining them was another matter: “When you want to look at combining all of the data, it’s very hard,” he said.
The company tried both Oracle Corp. and SAP SE’s Hana in-memory data platform; results in both cases were disappointing due to high cost, manual tasks and minimal self-service options, according to Rajagopalan.
Verizon ultimately decided on Attunity’s data integration and management software. “It has the intelligence to look at and understand the proprietary data structures of the ERPs, and it’s able to bring all the data from the ERP source systems directly into [Apache] Hadoop without any stops or staging databases along the way,” he said. (Hadoop is open-source software used for storing, processing and analyzing big data.)
Rajagopalan added that Attunity’s Replicate CDC (change data capture) tool helps developers perform ad-hoc analytics.
Watch the complete video interview below, and be sure to check out more of SiliconANGLE’s and theCUBE’s independent editorial coverage of DataWorks Summit. (* Disclosure: Attunity Inc. sponsored this DataWorks Summit segment on SiliconANGLE Media’s theCUBE. Neither Attunity nor other sponsors have editorial control over content on theCUBE or SiliconANGLE.)