UPDATED 10:00 EDT / JUNE 15 2015

IBM commits 3,500 engineers to Apache Spark

Amid the new features and integrations introduced at its third annual community meetup this morning, Apache Spark is marking a landmark endorsement from IBM, which has decided to back the project with more than 3,500 engineers who will actively participate in developing new functionality. The opening contribution of the initiative is a machine learning library called SystemML.

The technology is one of the latest innovations to have emerged from the company’s ongoing work on Watson, which has seen its use expand from answering trivia questions to extracting complicated patterns out of vast quantities of unstructured data over the last few years. To keep up, SystemML provides a language that directly exposes the capabilities of the artificial intelligence for data scientists to harness.

Queries written in the syntax, which is deliberately modeled after the widely used R statistical programming framework, are automatically executed according to the most efficient mode of operation for the specific workload and operational characteristics of a Spark cluster. Needless to say, that has the potential to provide a tremendous boost for the project’s machine learning capabilities.

But SystemML represents only the tip of the iceberg for IBM’s plans. The bulk of its efforts will focus on integrating Spark into its analytics arsenal, beginning with none other than Watson. The cloud-based incarnation of the artificial intelligence that the company released for the healthcare sector earlier this year is first in line to be standardized on the framework, with other versions presumably due to follow suit later on.

At the same time, IBM is also embedding Spark into its Bluemix platform-as-a-service stack, which will make the capabilities of the framework accessible on demand for developers and data scientists. Through a number of education partnerships announced alongside the move, the company hopes to bring the total number of professionals skilled in using the project to over a million within a few years, users who it hopes will tilt toward its implementation over the competition as a result.

Added up, IBM’s commitment to Spark arguably represents the biggest milestone for the project since its inception at UC Berkeley four years ago. The framework is already a fixture of the analytics discussion thanks to its speed and extensibility, but if Big Blue’s past kingmaking role in other open-source projects such as Linux is anything to go by, its addition to the fray could take that to a whole different level.

Photo: Ariel Zambelich/Wired
