How Spark and RapidMiner are helping to create ‘citizen data scientists’ | #SparkSummit


On one side of the aisle in Big Data, you have the glass-half-empty folks who warn that this is new technology and actual use-cases are still sparse. On the other side, you have glass-half-full people, like Peter Lee, CEO of Rapid-I Inc. (dba RapidMiner), who sees the day coming when “citizen data scientists” will be as common as Excel users.

Lee told Jeff Frick and George Gilbert, cohosts of theCUBE, from the SiliconANGLE Media team, that his company’s latest release, RapidMiner 7, offers incredible opportunities for both coders and non-coders to utilize predictive analytics.

“RapidMiner really expands the universe of people that can take advantage of this transformational paradigm shift in technology,” he said. He explained that RapidMiner is code optional, so even with little knowledge of code, “you can still use RapidMiner to help exploit all of the capabilities of Spark, but through a visual interface.”

Spark and RapidMiner tag team

Lee spoke about the importance of Spark in RapidMiner 7. “It’s a release that has a lot of meaning in the Spark community,” he said. He stated that 250 of its machine learning libraries and something like 1,000 data prep methods utilized Spark.

The number one use-case for RapidMiner, according to Lee, is to move the needle toward more granular customer insights and to embed predictions in them. He said that with predictive analytics of this kind, “safety rails” are needed to validate the right predictive insights.

Watch the full interview below, and be sure to check out more of SiliconANGLE and theCUBE’s coverage of Spark Summit East 2016.

Photo by SiliconANGLE