Every Docker host runs a client and a daemon process. The daemon implements the Docker Remote API endpoints, and the client talks to the daemon through HTTP-based API calls whenever you run commands such as docker run, docker ps, and docker info. Because Swarm implements most of those API endpoints, most of the regular old Docker commands you already know still work.
There are slight differences, of course. For example, rather than returning information specific to a single Docker host, the docker info and docker ps commands return information about the entire cluster when run against a Swarm cluster.
This is great. The learning curve is so small it's barely worth calling a curve.
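To make the "same commands, same API" point concrete, here is a rough Python sketch of how a few familiar commands map onto Remote API calls. The endpoint paths are the standard Remote API ones, and docker run is shown in its simplest form (create, then start). The host names and ports are made up for illustration, and api_calls is a hypothetical helper, not part of any Docker tooling.

```python
# Rough mapping from familiar CLI commands to Docker Remote API calls.
# Host names and ports below are illustrative assumptions.
CLI_TO_API = {
    "docker info": [("GET", "/info")],
    "docker ps": [("GET", "/containers/json")],
    # At its simplest, docker run is two calls: create the container, then start it.
    "docker run": [("POST", "/containers/create"), ("POST", "/containers/{id}/start")],
}


def api_calls(command, endpoint):
    """Return the HTTP requests a client would send to `endpoint` for `command`."""
    return [f"{method} http://{endpoint}{path}" for method, path in CLI_TO_API[command]]


# Same command, different target: a single daemon or a Swarm manager.
single_host = api_calls("docker ps", "docker-host.example:2375")
swarm = api_calls("docker ps", "swarm-manager.example:3375")
```

The only thing that changes between the two calls is the address the client points at, which is why the existing commands carry over.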
Scheduling is simple
Launching containers in a Swarm cluster is usually called scheduling. Swarm currently has three algorithms to help it decide which nodes in the cluster should receive new containers: Spread, BinPack, and Random.
Spread is the default. It tries to balance containers evenly across all nodes in the cluster. To do so, it takes into account each node's available CPU and RAM, as well as the number of containers it's already running.
BinPack is the opposite of Spread. It works by scheduling all containers on a single node until that node is fully utilized. Then it starts scheduling containers on the next node in the cluster. A major goal of BinPack is to use as little infrastructure as possible, which is great if you're paying for instances in the cloud. It gets its name from the fact that its modus operandi is similar to how we fill bins (trash cans): fill one to the top before starting to fill the next.
Random is, well, random.
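The three strategies can be sketched in a few lines of Python. This is a deliberately simplified model, not Swarm's actual implementation: it tracks only RAM and container count (no CPU), and the Node class, node names, and helper functions are all hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    total_ram_mb: int
    used_ram_mb: int = 0
    containers: int = 0

    def fits(self, ram_mb):
        return self.used_ram_mb + ram_mb <= self.total_ram_mb

    def load(self):
        return self.used_ram_mb / self.total_ram_mb


def _candidates(nodes, ram_mb):
    # Only nodes with enough free RAM are eligible.
    fitting = [n for n in nodes if n.fits(ram_mb)]
    if not fitting:
        raise RuntimeError("no node has enough free RAM")
    return fitting


def spread(nodes, ram_mb):
    # Least-loaded node first; ties go to the node with fewer containers.
    return min(_candidates(nodes, ram_mb), key=lambda n: (n.load(), n.containers))


def binpack(nodes, ram_mb):
    # Most-loaded node that still has room, so one node fills before the next.
    return max(_candidates(nodes, ram_mb), key=lambda n: (n.load(), n.containers))


def random_pick(nodes, ram_mb, rng=random):
    return rng.choice(_candidates(nodes, ram_mb))


def schedule(nodes, ram_mb, strategy):
    """Place one container using `strategy` and return the chosen node's name."""
    node = strategy(nodes, ram_mb)
    node.used_ram_mb += ram_mb
    node.containers += 1
    return node.name


spread_nodes = [Node("node-1", 4096), Node("node-2", 4096)]
spread_result = [schedule(spread_nodes, 1024, spread) for _ in range(4)]

binpack_nodes = [Node("node-1", 4096), Node("node-2", 4096)]
binpack_result = [schedule(binpack_nodes, 1024, binpack) for _ in range(5)]
```

With two empty 4GB nodes and 1GB containers, this toy Spread alternates between the nodes, while the toy BinPack puts the first four containers on node-1 and only then moves on to node-2.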
Scheduling is powerful
One of my favorite Swarm features is Constraints. Swarm lets you tag nodes in the cluster with your own custom labels. You can then use these labels to constrain which nodes a container can start on.
For example, you can label nodes according to geographic location such as "London," "NYC," and "Singapore." But it doesn't stop there. Each node can be tagged with multiple labels. You could keep going and tag nodes according to the zone they're deployed in, such as "production" or "development." You might even add another label for platform: "Azure," "AWS," "on-prem," and so on.
Leveraging these labels, you can easily schedule containers to start only on nodes tagged as "London," "production," and "on-prem," all via the usual docker run command.
Labels are insanely simple, but massively powerful.
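The idea behind constraints boils down to a label filter, which the Python sketch below models. The label keys (location, zone, platform), the node names, and the matching_nodes function are assumptions made up for illustration; Swarm's own filter is more capable than plain equality checks.

```python
def matching_nodes(node_labels, constraints):
    """Names of nodes whose labels satisfy every key == value constraint."""
    return [
        name
        for name, labels in node_labels.items()
        if all(labels.get(key) == value for key, value in constraints.items())
    ]


# Hypothetical cluster: each node carries several labels.
NODE_LABELS = {
    "node-1": {"location": "London", "zone": "production", "platform": "on-prem"},
    "node-2": {"location": "London", "zone": "development", "platform": "AWS"},
    "node-3": {"location": "NYC", "zone": "production", "platform": "Azure"},
}

eligible = matching_nodes(
    NODE_LABELS,
    {"location": "London", "zone": "production", "platform": "on-prem"},
)
```

Here only node-1 satisfies all three constraints. On the command line, standalone Swarm expressed this kind of filter as environment variables on docker run, along the lines of -e constraint:location==London; check the documentation for your version for the exact syntax.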
What's missing in Swarm
OK, Swarm is awesome, but it's by no means the finished article. Here I'll point out what I think are the most important features yet to be added.