With new solutions, StreamSets aims to simplify data integration and Apache Spark

StreamSets Inc. today unveiled a cloud version of its namesake data integration platform and a new tool for Apache Spark that’s aimed at helping enterprises put their information to work faster.

Five-year-old StreamSets is a relative newcomer to the data integration market, going up against incumbents such as IBM Corp., but it has already established itself as a major player. The startup, which claims dozens of Fortune 500 customers, is backed by blue-chip investors such as Accel. So far, StreamSets has raised more than $67 million in funding.

The startup’s new cloud offering, StreamSets Cloud, provides its data integration platform in the form of a fully managed service. The offering enables companies to take information from sources such as industrial sensors and stream it to their back-end analytics infrastructure for processing. During transit, an organization can manipulate the data to make it easier to process.

A manufacturer looking to keep better track of plant productivity, for instance, could use StreamSets to aggregate readings from its factory equipment, filter errors and convert the disparate records into a unified format. A cybersecurity team can use the same features to integrate threat information from across the corporate network.

StreamSets Cloud removes the need for companies to maintain their deployments of the data integration platform manually. Under the hood, the service uses Kubernetes to scale capacity up and down automatically based on how much information is being processed.

The other solution StreamSets unveiled today is Data Transformer, an interface tool that allows users to work with Apache Spark. Spark is one of the most widely adopted and versatile analytics engines on the market, but it isn’t known for being particularly easy to use. StreamSets said Data Transformer flattens the learning curve by letting users create data processing workflows for the engine using drag-and-drop commands.

The tool also has a few other features that the startup said will save time for analytics teams. Among them is what StreamSets describes as “progressive error handling.” Data Transformer lets users troubleshoot their Spark workflows and iron out errors even if they haven’t learned to interpret Spark’s notoriously complex log files, which reduces the need for specialized know-how in data projects.

“With StreamSets Transformer, Apache Spark is finally available to a wide range of users, enabling visibility, monitoring and reporting for mission-critical workloads,” said StreamSets Chief Technology Officer Arvind Prabhakar.

Photo: StreamSets
