UPDATED 23:04 EDT / FEBRUARY 18 2016

NEWS

Big Blue touts an easier way to deploy Apache Spark

Big Blue is hoping to smooth the passage for organizations looking to deploy Apache Spark with a new offering aimed at helping them to get the data processing engine up and running as swiftly and as easily as possible.

The company’s new Platform Conductor for Spark was announced yesterday at the IBM PartnerWorld Leadership Conference in Orlando, Florida, alongside a new hybrid offering that fits into its Storwize storage family, the V5000.

A new way to deploy Apache Spark

The new Spark offering is by far the bigger deal. According to IBM, organizations are struggling to mine data that’s being generated at unprecedented rates today. Previously, most organizations relied on Hadoop and MapReduce to collect, crunch and analyze their Big Data, but today data is being generated so fast (2.5 quintillion bytes of data per day, according to IBM) that these older systems can no longer keep up.

That’s why Apache Spark was developed. According to the Apache Software Foundation, Spark can process data up to 100 times faster than MapReduce, which makes it far more useful for organizations gathering terabytes and petabytes of data every single day. The only problem with Spark is that, just like Hadoop and MapReduce, it isn’t easy for customers to deploy atop their existing infrastructure. Many companies lack personnel with suitable skills and experience using Spark, and that’s what IBM’s trying to address with today’s release. The IBM Platform Conductor for Spark is targeted at customers in the financial services, life sciences, oil and gas, design automation, and retail industries, and provides an enterprise-grade, multi-tenant solution for quickly deploying Spark.

Big Blue touts a number of benefits when using Platform Conductor for Spark, including faster time-to-results on Big Data analytics due to its multi-tenancy feature that allows jobs to be run on resources that would otherwise be left idle. The product features what IBM calls “high-efficiency resource scheduling technology” to ensure Spark maximizes the potential of whatever infrastructure is in place.

IBM also claims its product helps to simplify deployment and management of Spark jobs.

“This is an end-to-end integrated solution incorporating resource scheduling, data management, monitoring, alerting, reporting and diagnostics as well as Spark,” the company said in a statement.

Flexible storage

IBM is also touting a new hardware offering designed to make data mining tasks easier and more efficient. Mining data is a great way to gain a competitive edge, the company said, but one problem enterprises face is finding somewhere to store that data while it’s being mined. IBM believes its new Storwize V5000 storage solution is the answer. The system offers efficiency, flexibility, performance and scale advantages together with simplified management and unified virtualization.

Storwize V5000 packs up to 2 petabytes of capacity, or up to 4 petabytes with two-way clustered systems, allowing companies to start off small and grow their data mining projects over time.

Both solutions are available from today.

Image credit: ClkerFreeVectorImages via pixabay
