Docker Ecosystem

Feb 02, 2019

Containers, Compose, Swarm, Desktop?

Docker isn't simply a container or a virtual machine, it's an entire ecosystem. Luckily, each component is a layer that builds on the one below it. There are definitely a lot of moving pieces, but from a bird's-eye perspective there are three main components.

This post is only meant to be a very brief overview. For more in-depth information, check out the official getting started docs.

Docker - A single image

A docker image is very much like a virtual machine image. If you haven't worked with virtual machines before, it's basically your computer, without a desktop. You can't run around clicking on Start or Finder icons, but you can install packages, start web servers, create users, run applications, and spin up databases, all without installing any software on your local machine. Besides docker, of course!

Examples

  •  Web Applications - Python Flask, Node.js Express, static HTML sites, etc.
  •  Machine Learning Applications - models trained with TensorFlow, data preprocessing pipelines, predictive applications, etc.
  •  Databases - MySQL, PostgreSQL, or MongoDB databases to store your data.
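
To make the first example concrete, here is a rough sketch of a Dockerfile for a tiny Flask app. Everything in it is hypothetical: the file names (app.py, requirements.txt), the flask-demo tag, and the assumption that the app listens on port 5000.

```dockerfile
# Hypothetical Dockerfile for a small Flask app
FROM python:3.7-slim

WORKDIR /app

# Install dependencies first so they are cached as their own layer
COPY requirements.txt .
RUN pip install -r requirements.txt

# Copy the application code and declare how to run it
COPY app.py .
EXPOSE 5000
CMD ["python", "app.py"]
```

Build the image and run it as a container with:

```bash
docker build -t flask-demo .
docker run -p 5000:5000 flask-demo
```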

Docker Compose - A set of images composed into services

Now, this is where things start to get interesting. This is where you can string together multiple applications, databases, or job queues, and they can all talk to each other. Depending on how they are set up, they can also talk to resources in the outside world, or other stacks created with docker-compose. Each one of those services is a single docker container. You see the layering?

From a practical standpoint, a lot of this functionality comes down to networking, but I kind of hate having to deal with networking, so I prefer to leave it as docker magic. If you also hate networking, a big bonus of using docker is that you almost never have to deal with it, beyond knowing that it exists.

Examples

  •  A WordPress website that needs to communicate with a MySQL database.
  •  A Node.js application that needs to communicate with a MongoDB database.
  •  A Python job queue built with Celery that must talk to a message broker (RabbitMQ).
  •  A Spark cluster, where a manager node must communicate with worker nodes.
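
As a sketch of the first example, a docker-compose.yml along these lines wires the two services together. It loosely follows the official WordPress example; the passwords, port, and image versions are just placeholders.

```yaml
# docker-compose.yml - sketch of a WordPress site backed by MySQL
version: "3.3"

services:
  db:
    image: mysql:5.7
    environment:
      MYSQL_ROOT_PASSWORD: changeme      # placeholder credentials
      MYSQL_DATABASE: wordpress
      MYSQL_USER: wordpress
      MYSQL_PASSWORD: wordpress
    volumes:
      - db_data:/var/lib/mysql           # keep the data around between restarts

  wordpress:
    image: wordpress:latest
    depends_on:
      - db
    ports:
      - "8000:80"                        # site available at localhost:8000
    environment:
      WORDPRESS_DB_HOST: db:3306         # "db" resolves over Compose's default network
      WORDPRESS_DB_USER: wordpress
      WORDPRESS_DB_PASSWORD: wordpress

volumes:
  db_data:
```

A single docker-compose up -d starts both containers on a shared network, and WordPress finds the database just by the hostname db. That is the networking magic I was happy to hand off to docker.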

Docker Swarm - Scale your stacks to infinity and beyond!

This is where we get into one of my favorite subjects, world domination!

Imagine you have a large web application with a bazillion people viewing and clicking on things. That would completely overload a single server, right? Instead of setting up some cron job that restarts your server every 5 minutes (that would be a total hack, and I have absolutely never done such a thing), you use docker swarm to string together a set of servers and replicate your application across all of them, or even run multiple instances on a single server. Cool! You can even tell docker to restart individual containers if (when) they crash. You accomplish this magic by putting a load balancer in front of all your services, which is basically a cool piece of tech that distributes requests evenly across your swarm cluster.

Examples

  •  A job queue that will process a bazillion tasks that your boss wanted completed yesterday. Throw a REST API on that nonsense and distribute it to the whole cluster.
  •  A web application the whole universe will use. Ensure not everyone is hitting the same page rendered from the same physical server, but that they all THINK they are clicking on the same page.
  •  A Spark cluster that will see some serious computing. Spread the computations out across multiple nodes.
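
Here's a rough sketch of what that can look like, reusing the hypothetical flask-demo image from earlier (on a multi-node swarm the image would need to live in a registry the other nodes can pull from):

```yaml
# stack.yml - sketch: replicate one web service across a swarm
version: "3.3"

services:
  web:
    image: flask-demo
    ports:
      - "80:5000"
    deploy:
      replicas: 6                  # six copies spread across the cluster
      restart_policy:
        condition: on-failure      # restart individual containers when they crash
```

```bash
docker swarm init                          # turn this machine into a swarm manager
docker stack deploy -c stack.yml mystack   # spread the replicas across the swarm
docker service ls                          # see where everything landed
```

Swarm's built-in routing mesh plays the load balancer role here, spreading traffic on the published port across the replicas.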

DEPLOY

Now we are back to my favorite talking point: deployment. When you develop with docker (or virtual machines in general, but docker is my fave, so go with it!), you can deploy very easily. It doesn't matter whether you are deploying to a remote server in your own datacenter, to your own computer (although where is the fun in that??), or to a cloud provider such as Amazon Web Services or Google Cloud. Under the hood they are all computers, and we can abuse them all in the same fashion with DOCKER.
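
As a sketch of what that can look like with a remote server (the registry path, user, and host name are all placeholders, and the ssh:// transport needs Docker 18.09 or newer):

```bash
# Push the image somewhere the remote server can pull it from
docker tag flask-demo myregistry.example.com/flask-demo
docker push myregistry.example.com/flask-demo

# Point the local docker CLI at the remote machine over SSH
export DOCKER_HOST=ssh://deploy@my-remote-server

# The same commands you ran locally now execute on that server
# (stack.yml would reference the pushed image at this point)
docker stack deploy -c stack.yml mystack
```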
