AppScale has strived to be the platform that runs on any infrastructure, and historically, we’ve done a pretty good job there. We started off on Xen virtual machines (avoiding bare metal so that all the stuff we install doesn’t conflict with the stuff you rely on) way back in 2008, and added support for Amazon EC2 shortly after. We expanded onto Eucalyptus private clouds, KVM virtual machines, and (quite recently) Google Compute Engine! Since all you really need to run AppScale is an Ubuntu Precise VM, in theory it can run anywhere those can be run (like OpenStack and CloudStack) and even in places we didn’t anticipate. For example, I develop on my MacBook Air, so I run VirtualBox/Vagrant and develop on that, which works great! But wait! There’s something new and cool that everyone on the internet is talking about called Docker! So the questions are: what is Docker, and can AppScale run on it? Spoiler alert: yes!
What is Docker?
The tagline on the homepage says it perfectly:
Docker is an open-source project to easily create lightweight, portable, self-sufficient containers from any application. The same container that a developer builds and tests on a laptop can run at scale, in production, on VMs, bare metal, OpenStack clusters, public clouds and more.
For the more technical-minded among us, Docker is to LXC what Vagrant is to VirtualBox. This, of course, is awesome, especially since I never used VirtualBox until Vagrant came along, and never used LXC until Docker showed up (despite VirtualBox and LXC having been around for quite some time). So Docker gives us a nice pretty interface to LXC, but what does that mean for developers?
Basically, it means that you can easily create isolated containers, develop in them, save them, and share them with others. For example, you can do development in a container on your laptop, package it up, and run it in production on an OpenStack cluster. The promise of Docker is that you don’t need to change anything to make that migration happen – Docker abstracts away everything below the container level.
For someone like me, who normally develops on virtual machines, this is not too much of a transition. A container is conceptually similar to a virtual machine, but there are some cases where a container is better. The main case that the website evokes is the scenario of what to do when you need a second container or VM. For virtual machines, you have to copy the entire operating system, filesystem, etc. But for a container, you don’t need to make a copy. They share the same operating system, and they can share the same filesystem. Whenever one of the containers needs to write to the FS, it looks like they do a copy-on-write so that only differences need to be saved. This gets more awesome as you add more containers, naturally.
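To make the copy-on-write idea concrete, here is a toy model in Python. It is only a sketch of the concept, not Docker's actual storage code: the "image" is a plain dictionary mapping paths to contents, and the file names and the `new_container` helper are made up for illustration. Each container gets its own empty write layer on top of the shared image, so a write in one container never touches the image or the other container.

```python
from collections import ChainMap

# Shared base image: path -> file contents (hypothetical files).
base_image = {"/etc/hostname": "base", "/bin/sh": "<shell binary>"}


def new_container(image):
    """Return a container view: a private, empty write layer over the image."""
    # ChainMap reads fall through the layers in order, but writes always
    # land in the first mapping, which is this container's private layer.
    return ChainMap({}, image)


c1 = new_container(base_image)
c2 = new_container(base_image)

# Writing in c1 stores only the difference in c1's own layer.
c1["/etc/hostname"] = "container-1"

print(c1["/etc/hostname"])  # container-1
print(c2["/etc/hostname"])  # base
print(c1.maps[0])           # {'/etc/hostname': 'container-1'}
```

Both containers still read `/bin/sh` from the single shared copy, which is why adding a third or fourth container costs almost nothing until it starts writing.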
Why Run AppScale on Docker?
The obvious answer is: why not? The longer version of this is that AppScale and Docker add a tremendous amount of value to each other. On the Docker side of things, I saw a bunch of cool examples on the Docker site (and on the slides cited earlier). They set up web servers or databases, but I didn’t really see anything that set up a full software stack (load balancer, application servers, database) with autoscaling. So clearly I thought of AppScale here, since it provides all those things out of the box!
But it’s not a one-way street – there’s a lot that AppScale gets out of running on Docker. Docker lets AppScale easily deploy to bare metal and OpenStack (two targets that haven’t seen as much love as others), and since Docker doesn’t use hardware virtualization, it promises improved performance versus its competitors.
For me, Docker solves a much more important problem. Whenever I send out a pull request on appscale or appscale-tools, I have to demo to someone that my change (1) actually works, and (2) doesn’t break anything that was previously working. Normally a one-machine demo is fine, but about once a week, I have to do a demo on multiple virtual machines. This is fine, except that the hard drive on my MacBook Air is perpetually 99% full. This means that making another VM is a huge pain in the ass, because now I have to go hunt down free space across my computer, which makes me do regrettable things like delete Team Fortress 2, one of our company’s pastimes.
But with Docker, this problem is trivially solved. I don’t have to copy the whole VM! I just make a new container, and we’re good to go! In fact, if I need another container on top of that, I just run the same docker command again, and I’ve got another container! No more hunting down gigs of disk to free up like a madman.
Running AppScale on Docker
Making AppScale work on Docker was actually pretty straightforward. All I really had to do on the AppScale side of things was not muck around with the /etc/hosts file (which we used to do for Hadoop, but we don’t need anymore), and make sure that we install packages that we previously assumed were always there (but weren’t on the extremely minimal Ubuntu image that Docker provides). It’s a pretty small pull request, if that’s the kind of thing you’re into.
Of course, you’re probably much more interested in how to actually run AppScale on Docker. Well, it’s easy! First, make sure you know how to use Docker. This getting started guide is surprisingly awesome, so do that first. Now that you know how to use Docker, just download this Dockerfile wherever you have Docker installed:
Next, build an AppScale container with this Dockerfile:
The last line of the output there will tell you the ID of your new container, so be sure to commit that to a new image:
You can then fire up an AppScale container just like any other container:
Now you’re logged into your container. AppScale uses SSH to talk to your containers, so begin by setting a root password, starting sshd, and getting the IP address of your container:
Next, go to root’s home directory and put this AppScalefile there to tell the tools how to start AppScale. Assuming your IP address is 192.168.10.2, your AppScalefile would look like this:
Then, you can just start up AppScale with the usual:
And you’re up and going with AppScale on Docker! You can create additional containers by running:
Each of those containers will have a different IP address, so just repeat the process above to set the root password, start sshd, and get the IP address for the AppScalefile, which will have all the IPs as follows (assuming you now have four containers):
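If you'd rather not type the layout by hand, a small Python helper can write it out from a list of container IPs. Treat this as a sketch under assumptions: the `ips_layout` key and the `master`, `appengine`, `database`, and `zookeeper` role names are what I'd expect an AppScalefile to use, so check them against your AppScale version. The `build_appscalefile` function and the round-robin assignment are my own invention, and every IP other than 192.168.10.2 is hypothetical.

```python
# Role names are an assumption; verify them against your AppScale version.
ROLES = ["master", "appengine", "database", "zookeeper"]


def build_appscalefile(ips):
    """Return AppScalefile text that spreads the roles across the given IPs.

    With one IP, every role lands on that container. With more, roles are
    handed out round-robin, so four IPs get one role each.
    """
    if not ips:
        raise ValueError("at least one IP address is required")
    lines = ["---", "ips_layout:"]
    for index, role in enumerate(ROLES):
        lines.append(f"  {role}: {ips[index % len(ips)]}")
    return "\n".join(lines) + "\n"


# One container: all roles on 192.168.10.2.
print(build_appscalefile(["192.168.10.2"]))

# Four containers (the last three addresses are made up for the example).
print(build_appscalefile(
    ["192.168.10.2", "192.168.10.3", "192.168.10.4", "192.168.10.5"]
))
```

Write the returned text to a file named AppScalefile in root's home directory, and the usual startup command picks it up from there.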
So that’s AppScale running on Docker! It’s pretty new at the moment (meaning I got it going today), and there are some more things to do here:
- Getting my docker modifications for appscale merged into the appscale/master branch
- Starting sshd on container start would be nice
- Maybe a script to start containers, grab their IPs, set the root password, and start AppScale?
- Prepackaging up the finished AppScale container into a docker image and talking about how to use it instead of building one from scratch
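As a first step toward that helper script, here is a rough Python sketch of the "grab their IPs" part. It assumes you have already captured the JSON that `docker inspect` prints for your containers, and that each entry exposes its address under `NetworkSettings` and `IPAddress`; that field layout may differ between Docker versions, so treat it as an assumption. The `container_ips` function, the container IDs, and the sample JSON are all made up for illustration.

```python
import json


def container_ips(inspect_json):
    """Extract IP addresses from captured `docker inspect` JSON output."""
    data = json.loads(inspect_json)
    # Accept either a single object or an array of objects.
    containers = [data] if isinstance(data, dict) else data
    ips = []
    for container in containers:
        # Assumed field layout: {"NetworkSettings": {"IPAddress": "..."}}.
        settings = container.get("NetworkSettings") or {}
        ip = settings.get("IPAddress")
        if ip:
            ips.append(ip)
    return ips


# Hand-written, heavily abbreviated sample; real output has many more fields.
sample = """
[
  {"Id": "aaa111", "NetworkSettings": {"IPAddress": "192.168.10.2"}},
  {"Id": "bbb222", "NetworkSettings": {"IPAddress": "192.168.10.3"}}
]
"""

print(container_ips(sample))  # ['192.168.10.2', '192.168.10.3']
```

Setting the root password and starting sshd would still have to happen inside each container, so this covers only one piece of the script.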
But for now, it should be good to go – enjoy!