UPDATED 10:00 EDT / APRIL 22 2015

NEWS

WANdisco’s new Fusion tool syncs Hadoop clusters

WANdisco Plc. has just announced the release of its new WANdisco Fusion tool, designed to distribute large datasets across multiple Hadoop clusters while keeping them in sync and up to date.

WANdisco Fusion uses active replication technologies to deliver up-to-date data from one Hadoop cluster to another, regardless of where those clusters are physically located. According to WANdisco’s Randy DeFauw, director of product marketing, the new technology should enable enterprises to roll out Hadoop production servers globally.

“The fundamental ability to use the same data from everywhere, as if everyone was running in the same cluster in the same place, this solves a lot of the key challenges the enterprise Hadoop architects were worrying about,” DeFauw told SDTimes.

If this sounds a little similar to WANdisco’s earlier “NonStop Hadoop” product, well, that’s the intention. WANdisco’s NonStop Hadoop was built to provide extremely fast and reliable data replication for enterprise customers like banks, which require high availability and the best disaster recovery capabilities. The software is extremely powerful but also fairly invasive, and it has some limitations, DeFauw admitted in an interview with Datanami. For one thing, NonStop Hadoop was installed on the NameNode, which meant it was quite tricky to get up and running.

“Any tweaks made to the underlying Hadoop cluster or NameNode configuration could throw replication, which necessitated a deep level of certification work between WANdisco and the Hadoop distributors,” notes Datanami. “Because of this work, WANdisco focused its certification work with the major open source players who used HDFS: Cloudera and Hortonworks.”

Rather than being installed on the NameNode, WANdisco Fusion is installed on a server adjacent to the Hadoop cluster it’s working on, which makes it far less invasive. This effectively makes WANdisco Fusion an evolution of NonStop Hadoop. “It’s still active-active replication, but we’re sitting at a much higher level in the Hadoop stack,” DeFauw told Datanami. “Instead of working deeply at the NameNode level, it actually works as a proxy application to the Hadoop file system.”

WANdisco Fusion provides other benefits too: it can transfer data to AWS to gain additional processing power in the cloud when it’s required. WANdisco Fusion can also sync different Hadoop distributions.

“The new architecture also means it has the ability to replicate between different types of Hadoop distributions,” DeFauw told SDTimes. “You can not only replicate between two Hortonworks clusters, you can replicate between Hortonworks and Cloudera and EMC’s Isilon storage systems.”

Finally, WANdisco Fusion can also sync HBase servers, though this requires more technical knowledge than simple HDFS syncing, DeFauw noted.

“[With HBase] The coordination happens for the writes, and each region server maintains its own write log,” DeFauw told SDTimes. “When it comes time to flush the memstore onto disk and write an HFile, every region server can have its own HFile. It writes to its local server, but which region server should write to HDFS? We have a coordinated flush, where we choose a specific server that will write the file on the underlying file system.”
