UPDATED 09:00 EDT / JUNE 02 2016

NEWS

IBM launches system for building and managing data lakes

The growing interest in so-called data center operating systems such as OpenStack and Mesosphere Inc.’s DCOS has spurred IBM Corp. to join the fray today with its own automation platform. Dubbed Spectrum Conductor, the software promises to do away with much of the duplicate componentry and expenses that burden IT departments.

According to the vendor, this feat is made possible by a homegrown access mechanism that provides the ability to share information among the applications that need it instead of having to make a separate copy for each. Spectrum Conductor thus reduces storage requirements and eliminates the infrastructure necessary to support duplicate records, which can save a lot of resources in a large company. Yet as appealing as it is, IBM’s value proposition will likely meet some skepticism due to the challenges that plagued past attempts to pull off such an arrangement.

In fact, creating a data lake, as the model is often called, has proven so difficult that Gartner Inc. all but deemed it impractical two years ago. However, two years is a long time in the technology world. Spectrum Conductor comes with automated configuration tools that IBM says can ease the task of configuring applications to exploit its data access mechanism. And the software also simplifies day-to-day management from there onwards with a policy-based provisioning feature borrowed from the company’s storage systems.

The functionality makes it possible to ensure that every workload runs on the infrastructure best suited to meet its requirements. For instance, an administrator can have Spectrum Conductor store an application’s most frequently used records on flash drives while sending everything else to a cheaper disk system. IBM sees the capability coming in particularly handy for analytic workloads, which is why it’s pairing the platform with an optional extension designed to ease the deployment of Spark clusters. The combination provides a throughput improvement of up to 58 percent over vanilla implementations of the engine, according to the company.
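
To make the flash-versus-disk example concrete, here is a minimal sketch of a frequency-based placement policy in Python. It is not IBM's implementation or the Spectrum Conductor API; the function name `assign_tiers`, the `flash_capacity` parameter and the access log format are all hypothetical, chosen only to illustrate the idea of keeping the most-read records on the faster tier.

```python
from collections import Counter


def assign_tiers(access_log, flash_capacity):
    """Map each record to "flash" or "disk" based on how often it was read.

    access_log: iterable of record identifiers, one entry per read (hypothetical format).
    flash_capacity: how many distinct records fit on the flash tier (assumed limit).
    """
    counts = Counter(access_log)
    # most_common() sorts by descending read count; ties keep first-seen order.
    hot = {record for record, _ in counts.most_common(flash_capacity)}
    return {record: ("flash" if record in hot else "disk") for record in counts}


# Example: "a" and "b" are read most often, so they land on flash.
placement = assign_tiers(["a", "b", "a", "c", "a", "b"], flash_capacity=2)
print(placement)
```

A real policy engine would presumably also weigh factors such as record size, recency and cost, which this sketch ignores.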

Much of the credit goes to File Placement Optimizer, a set of low-level data management features included in Spectrum Conductor that accelerate read and write operations. IBM says that the benefits become especially pronounced in environments with multiple Spark instances, where its software can move infrastructure resources around as usage patterns change. When one cluster is inactive, the hardware allocated to it is made available for the others to help speed their work. And important data can be shared as well to save analysts the delay of recalculating results that have already been readied by a colleague.

IBM plans on contributing key parts of the technology to the upstream Spark community as part of its $300 million effort to foster adoption of the engine. Spectrum Conductor, meanwhile, will be made available commercially as an on-premises offering and in the public cloud.

Image via Geralt
