UPDATED 09:00 EDT / JUNE 02 2016


IBM launches system for building and managing data lakes

The growing interest in so-called data center operating systems such as OpenStack and Mesosphere Inc.’s DCOS has spurred IBM Corp. to join the fray today with its own automation platform. Dubbed Spectrum Conductor, the software promises to do away with much of the duplicate componentry and expenses that burden IT departments.

According to the vendor, this feat is made possible by a homegrown access mechanism that lets applications share the information they need instead of making a separate copy for each. Spectrum Conductor thus reduces storage requirements and eliminates the infrastructure necessary to support duplicate records, which can save a lot of resources in a large company. Yet as appealing as it is, IBM’s value proposition will likely meet some skepticism due to the challenges that plagued past attempts to pull off such an arrangement.

In fact, creating a data lake, as the model is often called, has proven so difficult that Gartner Inc. all but deemed it impractical two years ago. However, two years is a long time in the technology world. Spectrum Conductor comes with automated configuration tools that IBM says can ease the task of setting up applications to exploit its data access mechanism. The software also simplifies day-to-day management from there onwards with a policy-based provisioning feature borrowed from the company’s storage systems.

The functionality makes it possible to ensure that every workload runs on the infrastructure best suited to meet its requirements. For instance, an administrator can have Spectrum Conductor store an application’s most frequently used records on flash drives while sending everything else to a cheaper disk system. IBM sees the capability coming in particularly handy for analytic workloads, which is why it’s pairing the platform with an optional extension designed to ease the deployment of Spark clusters. The combination provides a throughput improvement of up to 58 percent over vanilla implementations of the engine, according to the company.
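To make the flash-versus-disk example concrete, the kind of rule an administrator might express can be sketched in a few lines of Python. This is only an illustration of policy-based placement in general, not IBM’s interface: the `Record` type, the `place_records` function, the read-count threshold and the tier names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Record:
    key: str
    reads_per_day: int  # hypothetical access-frequency metric


def place_records(records, hot_threshold=100):
    """Map each record key to a storage tier.

    Hypothetical policy: records read at least `hot_threshold`
    times a day go to flash, everything else goes to cheaper disk.
    """
    placement = {}
    for record in records:
        if record.reads_per_day >= hot_threshold:
            placement[record.key] = "flash"
        else:
            placement[record.key] = "disk"
    return placement


records = [
    Record("orders", 5000),
    Record("archive_2009", 3),
    Record("sessions", 100),
]
placement = place_records(records)
```

A real policy engine would presumably weigh more than one metric, but the shape is the same: a declarative rule evaluated per record, with the system rather than the application deciding where data lives.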

Much of the credit goes to File Placement Optimizer, a set of low-level data management features included in Spectrum Conductor that accelerate read and write operations. IBM says that the benefits become especially pronounced in environments with multiple Spark instances, where its software can move infrastructure resources around as usage patterns change. When one cluster is inactive, the hardware allocated to it is made available for the others to help speed their work. And important data can be shared as well to save analysts the delay of recalculating results that have already been readied by a colleague.
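The idea of lending an idle cluster’s hardware to its busier neighbors can be illustrated with a short Python sketch. It assumes a deliberately simple model in which each cluster owns a fixed number of CPU cores and idle cores are split evenly among the active clusters; the `rebalance` function and the cluster names are hypothetical and do not reflect how Spectrum Conductor actually schedules resources.

```python
def rebalance(allocations, active):
    """Redistribute cores from idle clusters to active ones.

    allocations: dict of cluster name -> cores currently assigned
    active: set of cluster names that are running work
    Returns a new dict; idle clusters drop to 0 cores and their
    cores are split as evenly as possible among active clusters.
    """
    busy = sorted(name for name in allocations if name in active)
    if not busy:
        # Nothing is running, so leave the assignment untouched.
        return dict(allocations)

    idle_cores = sum(
        cores for name, cores in allocations.items() if name not in active
    )
    share, remainder = divmod(idle_cores, len(busy))

    result = {name: 0 for name in allocations}
    for i, name in enumerate(busy):
        # The first `remainder` clusters each get one extra core.
        result[name] = allocations[name] + share + (1 if i < remainder else 0)
    return result


allocations = {"etl": 16, "ml": 16, "reports": 8}
new_allocations = rebalance(allocations, active={"etl", "ml"})
```

In this toy run the inactive "reports" cluster gives up its eight cores, which are divided between the two active clusters. The total number of cores never changes; only who gets to use them does.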

IBM plans to contribute key parts of the technology to the upstream Spark community as part of its $300 million effort to foster adoption of the engine. Spectrum Conductor, meanwhile, will be made available commercially as an on-premises offering and in the public cloud.

Image via Geralt
