UPDATED 12:00 EDT / MARCH 02 2021

BIG DATA

Kaskada data science automation platform aims to speed machine learning models into production

More than a year after announcing plans to automate the feature engineering phase of artificial intelligence projects, Seattle-based startup Kaskada Inc. is bringing its first product to market.

Kaskada says it aims to democratize feature engineering, an often laborious process that requires data scientists to select, clean and validate the data to be fed into machine learning training models prior to moving them into production.

A model intended to predict housing prices, for example, would be feature engineered with predictor data such as the square footage of properties, number of bedrooms and location. The larger and more complete the training data set, the better the results.
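To make the housing example concrete, here is a minimal sketch of the select, clean and validate steps in plain Python. The records, field names and validation rules are all assumptions made up for illustration; they are not taken from Kaskada's product.

```python
# Hypothetical raw listings; the field names and values are illustrative only.
RAW_LISTINGS = [
    {"sqft": 1500, "bedrooms": 3, "location": "Seattle", "price": 750_000},
    {"sqft": None, "bedrooms": 2, "location": "Tacoma", "price": 400_000},  # missing sqft
    {"sqft": 900, "bedrooms": 2, "location": "Tacoma", "price": 380_000},
    {"sqft": -50, "bedrooms": 1, "location": "Seattle", "price": 300_000},  # invalid sqft
]


def is_valid(row):
    """Validate: every predictor is present and physically plausible."""
    return (
        row.get("sqft") is not None and row["sqft"] > 0
        and row.get("bedrooms") is not None and row["bedrooms"] >= 0
        and bool(row.get("location"))
    )


def build_features(rows):
    """Select the predictors, drop invalid rows and one-hot encode location."""
    clean = [r for r in rows if is_valid(r)]
    locations = sorted({r["location"] for r in clean})
    features, targets = [], []
    for r in clean:
        one_hot = [1.0 if r["location"] == loc else 0.0 for loc in locations]
        features.append([float(r["sqft"]), float(r["bedrooms"]), *one_hot])
        targets.append(float(r["price"]))
    return features, targets, locations


features, targets, locations = build_features(RAW_LISTINGS)
```

Two of the four listings survive validation, and each becomes a numeric vector of square footage, bedroom count and a one-hot location flag that a training routine could consume.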

The resources required to collect data and move machine learning models into production can be so significant that the capabilities are out of reach of all but the largest companies. Kaskada says its platform features a collaborative interface for team engineering and a proprietary data infrastructure for computing across event-based data and serving features in production.

“We are focused on building the bridge between training and production,” said Davor Bonaci, Kaskada’s chief executive and a former software engineer at Google LLC and Microsoft Corp. “We are launching a self-service platform to help data scientists get work into production by automating infrastructure. You can onboard and don’t have a big adoption curve or need to get everybody in your organization to agree to try it.”

The company’s self-service platform is a self-contained data science studio with pre-built machine learning models and the feature vectors needed to support them provided via an application programming interface. “You get up-to-the-moment feature vectors for functions like real-time fraud detection,” Bonaci said. “You don’t have to write data pipelines or process streaming data. We run the data processing needed for the model.”

Event-driven focus

Kaskada’s platform has undergone some changes since it was announced, the most significant of which is a greater focus on event-driven data collection. That’s a type of processing that makes decisions in response to real-time events such as mouse clicks and transactions.

Event-driven processing is especially useful in scenarios like predicting the probability that a customer will buy a product or that a credit card transaction will be fraudulent. Real-time data handling requires an efficient data infrastructure to calculate features at arbitrary points in time and to deliver them to both training and production environments. “We have built a lot of functionality to think in terms of time,” Bonaci said.
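The idea of calculating a feature "at an arbitrary point in time" can be sketched as follows. This is an illustration under stated assumptions, not Kaskada's API: the event log, the `features_as_of` helper and the one-hour window are all hypothetical. The point is that the same function can serve training (called with historical timestamps) and production (called with the current time), and that it only ever looks at events strictly before the requested moment.

```python
from datetime import datetime, timedelta

# Hypothetical event log of (timestamp, customer_id, amount) tuples.
EVENTS = [
    (datetime(2021, 3, 1, 9, 0), "c1", 20.0),
    (datetime(2021, 3, 1, 9, 30), "c1", 35.0),
    (datetime(2021, 3, 1, 9, 50), "c2", 10.0),
    (datetime(2021, 3, 1, 10, 15), "c1", 500.0),
]


def features_as_of(events, customer_id, as_of, window=timedelta(hours=1)):
    """Summarize a customer's transactions in the window ending just before as_of."""
    start = as_of - window
    amounts = [
        amount
        for ts, cid, amount in events
        # Strictly earlier than as_of, so the event being scored is never included.
        if cid == customer_id and start <= ts < as_of
    ]
    return {"txn_count": len(amounts), "txn_total": sum(amounts)}


# Training: the feature values as they stood when the 10:15 transaction arrived.
at_event_time = features_as_of(EVENTS, "c1", datetime(2021, 3, 1, 10, 15))
# Production: the same function evaluated at a later moment.
later = features_as_of(EVENTS, "c1", datetime(2021, 3, 1, 10, 30))
```

Excluding events at or after `as_of` means a training example never sees data that would not have been available when the prediction was made.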

The company has also focused more of its attention on automating the data science process rather than data engineering. Those two functions are supposed to work in tandem but frequently fail to communicate effectively because data scientists are focused on data and engineers on getting models into production.

“There can be friction getting into production because science and engineering teams have different values,” Bonaci said. “We reduce the friction needed to get work into production.”

Kaskada is a cloud-native service that customers can deploy in their own cloud instances, run as a managed service or install on local infrastructure. The company offers a distinctive pricing model that includes a free tier with limited data capacity, curated public datasets, sample projects and individual commit and version histories. Paid plans support team development, batch data uploads, direct data connection and real-time features. Pricing details weren’t provided.

Image: Starline/Freepik
