There is a lot of movement, excitement and energy around Docker and containers and a lot of eagerness to adopt this new paradigm at least in development and testing environments. So with all this excitement, and developer adoption, organizations are right to start looking at how they will manage their container chaos.
Our goal is to help clarify where things stand and what we can expect in the coming months.
One of the difficulties with the current tools is that they overlap only in some areas and cannot be compared one-to-one across their entire feature sets. While many of the tools can be substituted for each other, there are other cases where they can be used in conjunction. To clear some of the fog (ship pun intended), we assessed the tools against five key attributes: high availability, scalability and real-world usage, ease of adoption, networking, and extensibility beyond containers. In the last category we looked for an ability to integrate with various development projects, including those that are not container-centric or even container-related.
High Availability
Three features were selected here for their ease of comparison and for their importance to implementing HA, specifically for clustering. First we look at master failover: the ability of a cluster master to fail over and of the cluster to elect an existing machine as the new master. The next feature we examined is health checking, or the ability of the system to know when the application and cluster are truly available. Lastly, we looked at automatic rescheduling on failure. If part of the cluster fails, we want to know that our container workload will be rescheduled and restarted.
Mesos has a high availability mode which uses Apache ZooKeeper under the hood. In HA mode multiple masters are run. One active master, called the leader, is elected. If the active master fails, a new master is elected. Swarm and Kubernetes both also support a “Highly Available” master setup. It’s not available out of the box, but there are guides for Swarm and Kubernetes clusters.
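To make the failover idea concrete, here is a toy model of it. This is a minimal sketch, not how Mesos or ZooKeeper is implemented: it assumes the simple rule "the live master with the lowest sequence number leads", loosely inspired by election recipes built on sequential nodes. The `MasterPool` class and the master names are hypothetical.

```python
class MasterPool:
    """Toy model of master failover: the lowest-numbered live master leads."""

    def __init__(self):
        self._next_seq = 0
        self._live = {}  # sequence number -> master name

    def register(self, name):
        """A master joins the pool and receives the next sequence number."""
        seq = self._next_seq
        self._next_seq += 1
        self._live[seq] = name
        return seq

    def fail(self, name):
        """Remove a failed master, as if its session had expired."""
        self._live = {s: n for s, n in self._live.items() if n != name}

    def leader(self):
        """Return the current leader, or None if no masters are alive."""
        if not self._live:
            return None
        return self._live[min(self._live)]


pool = MasterPool()
for name in ("master-a", "master-b", "master-c"):
    pool.register(name)

first_leader = pool.leader()   # the master that registered first
pool.fail("master-a")          # simulate the active master failing
second_leader = pool.leader()  # a standby takes over
```

The point of the sketch is only that standbys are already registered, so a new leader can be chosen from existing machines as soon as the active one disappears.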
Swarm, Mesos, and Kubernetes all have TCP/HTTP methods for health checking. Command health checking takes it a step further, allowing specific commands to be executed and their results checked to determine health. K8s has supported this type of check natively for a while now. Further, Kubernetes has built-in constructs for Services, and the health checks can be used to evaluate which hosts are healthy enough to receive service traffic. Mesos has support for command health checks as well and recently fixed some of the issues that previously prevented their use.
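The three kinds of check mentioned above can be sketched with nothing but the Python standard library. This is an illustration of the general idea, not the API of Swarm, Mesos, or Kubernetes. The function names, timeouts, and the "2xx or 3xx means healthy" rule are our assumptions.

```python
import socket
import subprocess
import sys
import urllib.error
import urllib.request


def tcp_check(host, port, timeout=2.0):
    """Healthy if a TCP connection can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def http_check(url, timeout=2.0):
    """Healthy if the endpoint answers with a 2xx or 3xx status (assumed rule)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (urllib.error.URLError, OSError, ValueError):
        return False


def command_check(argv, timeout=5.0):
    """Healthy if the command exits with status 0."""
    try:
        result = subprocess.run(
            argv,
            timeout=timeout,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


# Usage: a command check that always succeeds.
healthy = command_check([sys.executable, "-c", "raise SystemExit(0)"])
```

A command check is the most flexible of the three because the command can test anything the application cares about, while the TCP and HTTP checks only confirm that something is listening and responding.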
Finally, Mesos and K8s will both reschedule containers. In the event of a host failure, a new healthy host is found and containers are rescheduled there. Swarm has container rescheduling planned for the next release.
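Rescheduling on failure can be pictured as moving every container from the failed host onto the remaining healthy ones. The sketch below assumes a "least-loaded host first" placement rule, which is our simplification and not the actual policy of Mesos or Kubernetes. The host and container names are made up.

```python
def reschedule(placements, failed_host, healthy_hosts):
    """Move containers off a failed host onto the least-loaded healthy hosts.

    placements maps host name -> list of container names.
    Returns a new mapping; the input is not modified.
    """
    if not healthy_hosts:
        raise ValueError("no healthy hosts available")
    result = {h: list(placements.get(h, [])) for h in healthy_hosts}
    for container in placements.get(failed_host, []):
        # Assumed policy: pick the healthy host running the fewest containers.
        target = min(healthy_hosts, key=lambda h: len(result[h]))
        result[target].append(container)
    return result


cluster = {
    "node-1": ["web-1", "web-2"],
    "node-2": ["db-1"],
    "node-3": [],
}
after = reschedule(cluster, "node-1", ["node-2", "node-3"])
```

Real schedulers also weigh CPU, memory, and placement constraints, but the basic loop is the same: detect the failure, find healthy capacity, and restart the workload there.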
Scalability and Battle Tested (Real World Usage)
As we mentioned in our container webinar series, Mesos is used at a lot of big name companies. Use cases exist at Twitter, eBay, Netflix, and the list goes on. That said, these are real-world use cases for Mesos itself which has a wide ecosystem of frameworks beyond containers. Kubernetes and Docker’s Swarm are being used by the development community and have already seen production deployment at a variety of organizations. I expect to see more public references as we go forward.
It’s also worth noting that Kubernetes is backed by Google. Google has been running its entire business on containers for a decade and launches 2 billion containers a week. K8s has made significant strides in the past six months in terms of scalability. With the 1.1 release it now supports clusters with as many as a thousand nodes and one million queries per second.
Swarm has also recently reached “Production Ready” status. In testing it has also reached the thousand-node mark. It has already seen adoption by some organizations. Most notable is Rackspace, which uses Swarm under the hood of its newest container offering. Hortonworks is also using it with its new Cloudbreak project, a provider-independent Hadoop as a Service offering.
While Mesos itself has been widely used in the community, Mesosphere and Microsoft are also working together on the new Azure Container Service. This service was announced at MesosCon and should be available in the next few months.
Ease of Adoption (Skill set required)
When evaluating solutions for managing Docker containers, nothing is easier than working within the same Docker API your team is already used to. The Swarm, Compose, and Machine trio is built into the core Docker ecosystem, so a Docker team should have no trouble picking it up. Kubernetes doesn’t require a lot of extra skills to learn, and you can even try it out for yourself with this how-to. The biggest difference will be the mental model constructs of Pods, Services, and Controllers. That said, it’s a pretty short learning curve, and in return you’ll be able to manage at application and service granularity. Mesos, on the other hand, is not for the faint of heart. It can be quite raw and requires a larger skill set for deployment. Mesosphere has made big strides in easing the setup and use of Mesos. Its DCOS moved to General Availability early this fall. They have built a great management UI and dashboard that integrates with Marathon directly. You can try it out for yourself, for free, on their site.
Networking
Kubernetes has some of the most straightforward container networking of the bunch. Network and port management can be a nightmare and is certainly a key area of container management complexity. K8s has some basic tenets in its networking that keep things simple for the developer and reduce the risk in this area considerably. First, it requires that all containers communicate with each other without the use of NAT (Network Address Translation). It also requires that nodes communicate with containers without NAT. Lastly, K8s requires that a container’s internal IP address is the same IP that other containers and nodes use to communicate with it. These tenets eliminate many of the port mapping, address translation, and service discovery issues that arise in large-scale operations.
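The difference these tenets make is easiest to see side by side. The sketch below is a toy address plan, not Kubernetes code. It contrasts a port-mapped model, where containers share the host IP and each needs a mapped host port, with a flat model where every container has its own IP and keeps its native port. The subnet, host IP, and starting port are arbitrary values chosen for illustration.

```python
import ipaddress


def port_mapped_endpoints(host_ip, containers, first_host_port=31000):
    """Port-mapped model: containers share the host IP, one host port each."""
    endpoints = {}
    for offset, (name, _container_port) in enumerate(containers):
        endpoints[name] = (host_ip, first_host_port + offset)
    return endpoints


def flat_endpoints(subnet, containers):
    """Flat model: each container gets its own IP and keeps its native port."""
    addresses = iter(ipaddress.ip_network(subnet).hosts())
    return {name: (str(next(addresses)), port) for name, port in containers}


containers = [("web", 80), ("api", 80), ("db", 5432)]
mapped = port_mapped_endpoints("192.0.2.10", containers)
flat = flat_endpoints("10.244.1.0/24", containers)
```

In the port-mapped plan, `web` and `api` can no longer both be reached on port 80, so something has to track which host port belongs to which container. In the flat plan, the address a container sees for itself is the address everyone else uses, so that bookkeeping goes away.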
Docker has made tremendous strides over the past six months in how it handles networking. The just-released 1.9 version has graduated the Docker Networking plugins out of experimental support and into “Production Ready” status. The new networking model allows you to create virtual networks independent of the containers. In this way, groups of containers can be placed in specific networks as needed. Additionally, the networking in Docker allows for the use of plugins if you prefer another model. There are already a number of third-party plugins available from Weave, Project Calico, VMware, Cisco, Midokura and Microsoft.
Mesos adopts a networking strategy that is similar to the older Docker model and requires port mapping and an IP address per node. There is definitely room for improvement for Mesos with regard to container networking. However, there was a demo at MesosCon in August of new experimental support for using Project Calico. While Calico can also be used with K8s and Docker, it’s particularly helpful for Mesos, where there are fewer networking options widely available. Find out more about the calico-mesos project here.
Extending Beyond Containers
All projects are open source, so they are easy to fork and contribute to (Here, Here or Here). Mesos is built as a platform and has many frameworks for non-container solutions. These frameworks include Hadoop, Cassandra, Jenkins, etc. While you need the commercial Mesosphere product to get the improved install and management experience, the organization has done a great job contributing back to the Mesos open-source projects. Kubernetes and Swarm are focused on containers for the time being. K8s can be run with non-Docker containers, but it is still container focused.
As we discuss in the container webinar series, these tools are not always a one-to-one comparison, so metrics are chosen to allow for comparison in their current state. In some cases these products should be combined for the best result. For example, the Mesos-Swarm and Mesos-Kubernetes projects give you the potential of having a highly available system with master-failover capabilities, while simultaneously providing the features of Swarm or Kubernetes!
While it’s still early, we can see from these comparisons a few areas where each tool shines.
Mesos is used in big production deployments, and if you want to run a cluster of a variety of services, not just containers, Mesos is the way to go. This is especially true if you are building distributed systems or have “Big Data” components, as there are already Mesos frameworks for Hadoop, Spark, Cassandra, and many more.
Kubernetes is well rounded in most areas of functionality. Recent updates have made large scale deployments a reality and improved the existing smooth update process. Combine that with the backing of an organization that starts 2 billion containers a week and has been working in the container world for ten years and you have a formidable product that is already maturing rapidly.
Swarm is the youngest of the bunch, but Docker has made huge strides in the past year. Docker has made big improvements to networking, and the talent there is sure to continue to evolve Swarm over the next year. If you’re a small Docker-only shop, you may not need the extra benefits of the other tools. Thanks to the swarm-kubernetes integration work, you can always adopt and combine another orchestration layer as you grow.
|Attribute|Swarm|Kubernetes|Mesos|
|---|---|---|---|
|Application Health checking|3|5|4|
|Auto Reschedule on Failure|1|5|5|
|Scalable and Battle Tested|3.5|3.5|4.3|
|Large Scale|4|4.5|5 / 5|
|Battle Tested (Real world use)|3|2.5|5 / 2|
|Extensibility Beyond Containers|3.0|3.0|4.3|
|Open-Source|5|5|5 / 2|
|Uses Beyond Containers|1|1|5 / 5|
|Ease of Adoption|4.0|3.8|3.0|
|Variety of Skills Required|4|3.8|2 / 4|
Figure 2 – Current state of orchestration tools
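The category rows in the table appear to be the plain average of the sub-scores listed beneath them. The sketch below checks that reading for the "Scalable and Battle Tested" rows. It rests on two assumptions of ours: that averaging is how the category scores were produced, and that a cell such as "5 / 2" contributes two separate scores. Columns are referred to by position only.

```python
from statistics import mean


def category_score(sub_scores):
    """Average a category's sub-scores (assumed scoring rule)."""
    return mean(sub_scores)


# "Large Scale" and "Battle Tested" sub-scores, by column position.
first_column = category_score([4, 3])         # table shows 3.5
second_column = category_score([4.5, 2.5])    # table shows 3.5
third_column = category_score([5, 5, 5, 2])   # "5 / 5" and "5 / 2"; table shows 4.3
```

Under these assumptions the first two columns match the table exactly, and the third comes out at 4.25, which is consistent with the 4.3 shown once rounded to one decimal place.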
In the original version of this article we had some future state predictions. However, the speed of development is incredible, so we will come back and update this article as things evolve.
The container landscape is still very young and rapidly evolving. Swarm and Machine are not even a year past initial beta, and 1.0 for Swarm only happened at the beginning of November. Only about 18 months have passed since the Kubernetes project was released. Even Mesos is relatively new to the scene with its Docker container frameworks. Despite this, we’ve seen tremendous growth and evolution.
For an in-depth discussion on the best current orchestration tools as well as deep dives with individual platforms, watch our container webinar series.
Still overwhelmed and just want something you don’t need to worry about?
Both Google and AWS provide managed container services. Google Container Engine (GKE) and Amazon’s Elastic Container Service (ECS) both offer a managed orchestration approach. Watch my blog for more on these services over the next few months.