UPDATED 08:00 EDT / APRIL 02 2019

MapR separates Kubernetes storage and compute to boost container flexibility

MapR Technologies Inc. is stepping up its efforts to hasten its customers’ moves to software containers with a new set of features in its MapR Data Platform that separates computing from storage.

MapR described the enhancements, announced today, as deep integrations with the core components of Kubernetes, which is the open-source software that orchestrates applications running in containers.

Containers make it simple to encapsulate applications in a form that’s easy to run on any computing environment in companies’ data centers or in public clouds. Kubernetes, which some people have called the operating system for the cloud, is expected to be in use by 90 percent of enterprises by the end of the year.

MapR said the new features make it easier for organizations to manage highly elastic workloads by enabling them to separately scale compute and storage. The platform will initially support Apache Spark and Apache Drill, which are two popular open-source analytics frameworks, “but this is just the beginning,” said Suzy Visvanathan, a senior director at MapR. “We will continue to build this out. The idea is to have a whole ecosystem.”

MapR has been on a campaign to align itself closely with Kubernetes since it announced support for persistent storage and stateful containerized applications a year ago. “There are things Kubernetes doesn’t do well, like provisioning, multitenancy and snapshots,” Visvanathan said. “We’re giving customers the ability to run Kubernetes in a production environment.”

Separating compute and storage enables workloads to be more appropriately provisioned according to the needs of each use case, she said. “Let’s say one of your users suddenly has a peak workload; how do you make sure others aren’t throttled when one user has 90 percent of the CPU?” she said. “You need to separate compute and storage subscriptions.”

The enhancements enable Spark and Drill processing engines to be deployed within compute containers orchestrated by Kubernetes. Each workload in a Kubernetes cluster is independent of where the data is stored or managed. Independent versions of Spark can be deployed in separate pods, the Kubernetes term for a group of containers that are deployed together on the same host. This enables multiple stages of development, testing and quality assurance to coexist within a cluster.

The company is introducing an approach it calls multitenancy in the compute layer to make such intricacies “agnostic to the end user,” Visvanathan said. The technology makes it possible for users to create tenant namespaces for compute applications, enabling each container to get the resources it needs without infringing upon other containers in the cluster. Tenants can point to a storage cluster located elsewhere.

“If Sally needs four cores, 256 gigabytes of memory and a 2-terabyte volume, those resources can be provided only for user Sally and only user Sally knows it,” Visvanathan said. “You can scale in and scale out your compute jobs independent of scaling your storage.”

That’s in contrast to the approach Hadoop took in the early days of big data by closely aligning storage and compute. The intent was to minimize latency, but the ultimate effect was to create a lot of excess computing capacity to accommodate duplicate data for tasks such as disaster recovery, Visvanathan said.

The enhancements are set to ship in the second quarter. The company hasn’t yet decided whether to make them available as a separate product or as an in-line enhancement to the data platform.

