With new solutions, StreamSets aims to simplify data integration and Apache Spark

StreamSets Inc. today unveiled a cloud version of its namesake data integration platform and a new tool for Apache Spark that’s aimed at helping enterprises put their information to work faster.

Five-year-old StreamSets is a relative newcomer to the data integration market, going up against incumbents such as IBM Corp., but it has already established itself as a major player. The startup, which claims dozens of Fortune 500 customers, is backed by blue-chip investors such as Accel. So far, StreamSets has raised more than $67 million in funding.

The startup’s new cloud offering, StreamSets Cloud, provides its data integration platform in the form of a fully managed service. The offering enables companies to take information from sources such as industrial sensors and stream it to their back-end analytics infrastructure for processing. During transit, an organization can manipulate the data to make it easier to process.

A manufacturer looking to keep better track of plant productivity, for instance, could use StreamSets to aggregate readings from its factory equipment, filter errors and convert the disparate records into a unified format. A cybersecurity team can use the same features to integrate threat information from across the corporate network.
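To make the manufacturing example concrete, here is a minimal Python sketch of the three steps described: converting disparate records into a unified format, filtering errors and aggregating readings. It does not use StreamSets or its APIs. The field names, machine names, the -999 sentinel and the valid temperature range are all assumptions invented for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical raw readings from two kinds of factory equipment, each
# reporting in its own format. All names and values here are invented.
RAW_READINGS = [
    {"machine": "press-1", "temp_c": 71.5},
    {"machine": "press-1", "temp_c": 72.5},
    {"device_id": "lathe-7", "temperature_f": 167.0},
    {"device_id": "lathe-7", "temperature_f": None},  # missing reading
    {"machine": "press-1", "temp_c": -999.0},         # assumed error sentinel
]


def normalize(record):
    """Convert either source format into one unified record (Celsius)."""
    if "temp_c" in record:
        return {"source": record["machine"], "temp_c": record["temp_c"]}
    if "temperature_f" in record:
        fahrenheit = record["temperature_f"]
        celsius = None if fahrenheit is None else (fahrenheit - 32) * 5 / 9
        return {"source": record["device_id"], "temp_c": celsius}
    return None  # unrecognized format


def is_valid(record):
    """Drop unrecognized records, missing values and implausible readings."""
    if record is None or record["temp_c"] is None:
        return False
    return -50 <= record["temp_c"] <= 500  # assumed plausible range


def aggregate(records):
    """Average the cleaned readings per machine."""
    by_source = defaultdict(list)
    for record in records:
        by_source[record["source"]].append(record["temp_c"])
    return {source: round(mean(values), 2) for source, values in by_source.items()}


def run_pipeline(raw):
    unified = (normalize(record) for record in raw)
    clean = [record for record in unified if is_valid(record)]
    return aggregate(clean)


if __name__ == "__main__":
    print(run_pipeline(RAW_READINGS))
```

In a streaming platform these steps would run continuously on records in transit rather than on an in-memory list, but the order of operations is the same.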

StreamSets Cloud removes the need for companies to maintain their deployments of the data integration platform manually. Under the hood, the service uses Kubernetes to scale capacity up and down automatically based on how much information is being processed.

The other solution StreamSets unveiled today is Data Transformer, an interface tool that allows users to work with Apache Spark. Spark is one of the most widely adopted and versatile analytics engines on the market, but it isn’t known for being particularly easy to use. StreamSets said Data Transformer flattens the learning curve by letting users create data processing workflows for the engine with drag-and-drop commands.

The tool also has a few other features that the startup said will save time for analytics teams. Among them is what StreamSets describes as “progressive error handling.” Data Transformer lets users troubleshoot their Spark workflows and iron out errors even if they haven’t learned to interpret Spark’s notoriously complex log files, which reduces the need for specialized know-how in data projects.

“With StreamSets Transformer, Apache Spark is finally available to a wide range of users, enabling visibility, monitoring and reporting for mission-critical workloads,” said StreamSets Chief Technology Officer Arvind Prabhakar.

