UPDATED 08:00 EDT / APRIL 06 2021

Yellowbrick ventures into the cloud with Kubernetes support, distributed data warehouse manager

Data analytics startup Yellowbrick Data Inc. today announced a major expansion of its data warehousing platform along with a consolidated management dashboard and a new version of its multiprocessor-based data warehouse appliance.

The enhancements for the first time separate the Yellowbrick software from the underlying hardware and permit multiple instances to be installed on customer-owned hardware, cloud platforms and edge devices.

Specifically, Yellowbrick has recast its platform to run in software containers, which are portable operating environments that can be managed with the popular Kubernetes orchestration platform. “This is the first stage toward a distributed cloud strategy,” said Chief Technology Officer Mark Cusack. “It will take into next year to get there.”

The changes “allow us to deploy in containerized form on our own hardware or on public or private clouds in a hybrid configuration,” Cusack said. “The software is very well set up for this containerization down to the point that you can run a single container with Yellowbrick on a ruggedized edge device scaling up to thousands of containers in a petabyte-sized cloud.”

The change also gives customers the flexibility to run Yellowbrick on any combination of dedicated on-premises hardware, local Kubernetes stacks such as OpenShift, public clouds and edge platforms such as Amazon Web Services Inc.’s Wavelength.

Cross-platform compromises

Yellowbrick positions itself as the world’s fastest data warehousing engine. Enabling deployment on other platforms requires making some performance compromises, but customers can still expect “all the optimizations at the software level even running on other hardware,” Cusack said. “As instance types get more capable, our software adapts to take advantage of them.”

Yellowbrick also doesn’t position its architecture as a distributed database but rather as a constellation of discrete instances that process data locally and can replicate and federate that data across a network. Cusack said competitors such as Databricks Inc. and Starburst Data Inc., which process distributed data in place, “make a bunch of compromises in how they store data in large-file formats. They approach it not from a data warehouse perspective but as a data lake query engine that is grabbing more of the data warehouse workloads.”
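To make the "discrete instances" idea concrete, here is a minimal Python sketch of the general pattern: each instance aggregates its own rows where they live, and only the small results are combined. It is an illustration, not Yellowbrick's implementation. The site names, the sample rows and the helpers `local_totals` and `federated_totals` are all hypothetical.

```python
from collections import Counter

# Hypothetical sample data: each "instance" holds its own rows locally.
SITES = {
    "edge-1": [("sensor_a", 3), ("sensor_b", 5)],
    "edge-2": [("sensor_a", 4)],
    "cloud": [("sensor_b", 1), ("sensor_c", 7)],
}


def local_totals(rows):
    """Aggregate one instance's rows in place, without moving them."""
    totals = Counter()
    for key, value in rows:
        totals[key] += value
    return totals


def federated_totals(sites):
    """Merge the per-instance summaries rather than the raw rows."""
    combined = Counter()
    for rows in sites.values():
        combined.update(local_totals(rows))  # Counter.update adds counts
    return combined


result = federated_totals(SITES)
print(dict(result))
```

Only the per-instance summaries cross the network in this sketch, which is the usual argument for processing data where it lands.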

As part of its foray into the cloud, Yellowbrick is also announcing support for cloud object stores such as AWS’ S3, Microsoft Corp.’s Azure Data Lake Storage and Google LLC’s Cloud Storage, as well as on-premises storage. Support will initially be limited to the CSV and Parquet file formats but will later be expanded to include others, such as Apache Avro and JavaScript Object Notation.

Single manager

To support the more distributed architecture, the company today also debuted Yellowbrick Manager, a web-based control panel that provides consistent management across all data warehouse deployments in distributed clouds. It can be used to manage multiple warehouse instances across platforms, to load data, and to write and edit SQL queries.

“This is a way to manage all instances from a single place and replicate data between them,” Cusack said. “The goal is to control and link instances from the edge to the center.”

Finally, there’s a new version of Yellowbrick’s hardware appliance, called Andromeda, that runs queries three times faster than the first-generation platform introduced four years ago. Performance enhancements at the hardware level include dual proprietary scan accelerators that operate at multiple terabytes per second, a threefold increase in network performance and new 64-core central processing units, compared with 18-core processors in the earlier version.

Andromeda can scale up to 6 petabytes across 40 nodes, the company said. Pricing will be in line with the current generation, starting at a subscription fee of $10,000 per month.
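For a rough sense of scale, the quoted figures can be turned into per-node and per-year numbers. The short Python calculation below assumes capacity is spread evenly across nodes and uses decimal units (1 petabyte = 1,000 terabytes). Neither assumption comes from the company.

```python
TOTAL_PB = 6              # maximum capacity quoted by the company
NODES = 40                # node count at that maximum
MONTHLY_FEE_USD = 10_000  # quoted starting subscription fee
TB_PER_PB = 1_000         # assumption: decimal units

# Assumption: capacity is divided evenly across all nodes.
per_node_tb = TOTAL_PB * TB_PER_PB / NODES

# Twelve months at the starting subscription price.
annual_min_usd = MONTHLY_FEE_USD * 12

print(f"{per_node_tb:.0f} TB per node")
print(f"${annual_min_usd:,} per year at the starting price")
```

On those assumptions, the maximum configuration works out to about 150 terabytes per node, and the entry-level subscription to $120,000 a year.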

Yellowbrick will attempt to lure more customers into the fold by rolling out a seven-day free trial of its platform running on an AWS platform optimized with solid-state storage and the Non-Volatile Memory Express storage backplane. Tire-kickers will be able to provision multiple instances and load up to 60 terabytes of data in CSV format from their own data stores. The free trial also includes Yellowbrick Manager.
