

Amazon Web Services is hoping to bolster customers’ cluster-management options by integrating Apache Mesos and the Marathon framework with its EC2 Container Service (also known as ECS), according to a blog post by ECS founder Deepak Singh.
The company has just unveiled a new Apache Mesos scheduler driver as a proof-of-concept integration with Marathon. The idea is to demonstrate how workloads on ECS can be scheduled using Marathon. Customers will be able to use the ECS schedulers, integrate with third-party schedulers or even write their own using the new driver, explained Singh.
For developers building distributed applications in the cloud, cluster management is becoming a crucial issue.
“A common example of developers interacting with a cluster management system is when you run a MapReduce job via Apache Hadoop or Apache Spark,” wrote Singh. “Both these systems typically manage a coordinated cluster of machines working together to perform a large task. In the case of Hadoop or Spark, these tasks are most often data-analysis jobs or machine learning.”
Singh pointed to the extreme complexity of managing cluster state as one of the major difficulties developers face with cluster management.
“Software like Hadoop and Spark typically has a Leader, or a part of the software that runs in one place and is in charge of coordination,” said Singh. “They’ll then have many, often hundreds or even thousands of Followers, or a part of the software that receives commands from the Leader, executes them, and reports state of their sub-task.”
The problem, according to Singh, is that when machines fail, the Leader must detect the failures, replace the machines, and restart the Followers that receive commands. “This can be a significant portion of code written for applications which need access to a large pool of resources,” he noted.
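To give a sense of the bookkeeping Singh describes, here is a minimal Python sketch of a Leader that tracks Follower heartbeats and moves tasks off Followers that have gone quiet. It is an illustration only, not how Hadoop, Spark, Mesos or ECS implement failure handling; the `Leader` class, its method names and the timeout value are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Leader:
    """Toy leader: hypothetical names, not any real framework's API."""
    timeout: float  # seconds of silence before a follower is presumed dead
    last_seen: dict = field(default_factory=dict)    # follower id -> last heartbeat time
    assignments: dict = field(default_factory=dict)  # task id -> follower id

    def heartbeat(self, follower, now):
        # A follower reports in; record when we last heard from it.
        self.last_seen[follower] = now

    def assign(self, task, follower):
        self.assignments[task] = follower

    def detect_failed(self, now):
        # Followers whose last heartbeat is older than the timeout.
        return sorted(f for f, t in self.last_seen.items() if now - t > self.timeout)

    def recover(self, now):
        # Forget failed followers, then move their tasks to healthy ones
        # in round-robin order. Returns {task: new_follower}.
        for follower in self.detect_failed(now):
            del self.last_seen[follower]
        healthy = sorted(self.last_seen)
        moved = {}
        if not healthy:
            return moved  # nowhere to reschedule; tasks stay orphaned for now
        i = 0
        for task, follower in sorted(self.assignments.items()):
            if follower not in self.last_seen:  # follower is unknown or dead
                target = healthy[i % len(healthy)]
                self.assignments[task] = target
                moved[task] = target
                i += 1
        return moved
```

Even this toy version omits replacing the failed machine, retrying partial work and surviving a Leader crash, which is part of why the real thing accounts for so much application code.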
Cluster management systems also present a second challenge, in that applications generally assume full ownership of the machine where their tasks are running. As a result, developers will often end up with numerous clusters of machines, each one dedicated to the management system being used.
“This can lead to inefficient distribution of resources, and jobs taking longer to run than if a shared pool of resources could be used,” Singh explained.
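A toy calculation shows why dedicated clusters can be slower than a shared pool. The Python sketch below assumes every task takes one time unit on one machine and that jobs can use any free machine in their pool; the job sizes and machine counts are invented for illustration and do not come from Amazon.

```python
import math


def dedicated_makespan(jobs, machines_per_cluster):
    """Each job gets its own cluster; the slowest job sets the finish time."""
    return max(math.ceil(tasks / machines_per_cluster) for tasks in jobs.values())


def shared_makespan(jobs, total_machines):
    """All jobs draw from one pool; idle machines can pick up any task."""
    return math.ceil(sum(jobs.values()) / total_machines)


# Hypothetical workload: task counts per job.
jobs = {"hadoop": 12, "spark": 4}

dedicated = dedicated_makespan(jobs, machines_per_cluster=4)  # two clusters of 4
shared = shared_makespan(jobs, total_machines=8)              # one pool of 8
```

In this made-up case the two dedicated clusters need three time units, because the smaller job's machines sit idle after the first round, while the shared pool of the same eight machines finishes in two.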
Amazon says that the Marathon driver is purely for demonstration purposes, which means it’s not recommended for production use just yet.
“We are working with the Mesos community to develop a more robust integration between Apache Mesos and Amazon ECS,” the company says, suggesting that a production-ready version is planned.