
With new solutions, StreamSets aims to simplify data integration and Apache Spark

StreamSets Inc. today unveiled a cloud version of its namesake data integration platform and a new tool for Apache Spark that’s aimed at helping enterprises put their information to work faster.

Five-year-old StreamSets is a relative newcomer to the data integration market, going up against incumbents such as IBM Corp., but it has already established itself as a major player. The startup, which claims dozens of Fortune 500 customers, is backed by blue-chip investors such as Accel. So far, StreamSets has raised more than $67 million in funding.

The startup’s new cloud offering, StreamSets Cloud, provides its data integration platform in the form of a fully managed service. The offering enables companies to take information from sources such as industrial sensors and stream it to their back-end analytics infrastructure for processing. During transit, an organization can manipulate the data to make it easier to process.

A manufacturer looking to keep better track of plant productivity, for instance, could use StreamSets to aggregate readings from its factory equipment, filter errors and convert the disparate records into a unified format. A cybersecurity team can use the same features to integrate threat information from across the corporate network.
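
The manufacturing scenario can be sketched in a few lines of plain Python. This is a rough illustration only and does not use StreamSets' actual interfaces: the `Reading` record, the field names (`machine`, `machine_id`, `temp_f`, `temp_c`) and the sample data are all assumptions made up for the example. It gathers readings from two hypothetical production lines, drops records that are malformed and converts the rest into one common format.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional


@dataclass(frozen=True)
class Reading:
    """Unified record format (hypothetical, for illustration only)."""
    machine_id: str
    temperature_c: float


def normalize(raw: dict) -> Optional[Reading]:
    """Convert one raw record to the unified format, or return None if it is unusable."""
    # Different lines are assumed to name the machine field differently.
    machine_id = raw.get("machine") or raw.get("machine_id")
    if not machine_id:
        return None
    try:
        if "temp_f" in raw:
            # Convert Fahrenheit to Celsius so every record uses one unit.
            temperature_c = (float(raw["temp_f"]) - 32.0) * 5.0 / 9.0
        else:
            temperature_c = float(raw["temp_c"])
    except (KeyError, TypeError, ValueError):
        # Missing or non-numeric temperature: treat the record as an error.
        return None
    return Reading(str(machine_id), round(temperature_c, 2))


def pipeline(sources: Iterable[Iterable[dict]]) -> Iterator[Reading]:
    """Aggregate all sources into one stream, filtering out bad records."""
    for source in sources:
        for raw in source:
            reading = normalize(raw)
            if reading is not None:
                yield reading


# Made-up sample data: two lines reporting in different shapes and units.
line_a = [{"machine": "press-1", "temp_f": 212}, {"machine": "press-1", "temp_f": "ERR"}]
line_b = [{"machine_id": "lathe-7", "temp_c": 41.5}, {"temp_c": 39.0}]

readings = list(pipeline([line_a, line_b]))
for reading in readings:
    print(reading)
```

Of the four sample records, two survive: the one with the unreadable temperature and the one with no machine identifier are filtered out, and the rest share one schema and one unit.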

StreamSets Cloud removes the need for companies to maintain their deployments of the data integration platform manually. Under the hood, the service uses Kubernetes to scale capacity up and down automatically based on how much information is being processed.

The other solution StreamSets unveiled today is Data Transformer, an interface tool that allows users to work with Apache Spark. Spark is one of the most widely adopted and versatile analytics engines on the market, but it isn’t known for being particularly easy to use. StreamSets said Data Transformer flattens the learning curve by letting users build data processing workflows for the engine with drag-and-drop commands.

The tool also has a few other features that the startup said will save time for analytics teams. Among them is what StreamSets describes as “progressive error handling.” Data Transformer lets users troubleshoot their Spark workflows and iron out errors even if they haven’t learned to interpret Spark’s notoriously complex log files, which reduces the need for specialized know-how in data projects.

“With StreamSets Transformer, Apache Spark is finally available to a wide range of users, enabling visibility, monitoring and reporting for mission-critical workloads,” said StreamSets Chief Technology Officer Arvind Prabhakar.

