Experiences from Dockerizing Alaveteli for High Availability


Caleb Tutty

Apr 19, 2015, 8:43:26 PM
to alavet...@googlegroups.com
Hi all,

First of all I want to say thank you to everyone who has contributed to such a useful piece of civic tech. In New Zealand it has been (and will continue to be) an amazing public resource, and I know that is because of the huge amount of hard work that has gone in to it.
At my previous job (http://carnivalmobile.com) we started using Docker for deployment, because containerising all dependencies made for faster and less brittle deployments, and a rollback meant just picking a previous image version.

I'd like to give a brief overview of what our thinking has been, and I'll try to flesh this out a bit more in a blog post (and post a link).


FYI.org.nz went down when the Open Knowledge Foundation found that one of their servers (shared by a number of different groups) had potentially been exploited. In this situation we thought that the first priority was to make sure we didn't miss any email while we figured out how we wanted to rebuild the server(s).


After seeing that Postfix piped to the mailin script, and that the mailin script used rails runner to boot the Rails environment and pass STDIN to the receive method of RequestMailer, I thought that as minimal a Postfix installation as possible, one that just saves email, would be the best way to go. I'm not an expert with Postfix so I thought that the safest thing to do at the time was to save mail to S3.

I wrote a very simple Golang app (https://github.com/nzherald/s3putter) because it could work with an already available Docker Postfix image (which I then tweaked slightly: https://github.com/nzherald/docker-postfix), and because a compiled binary meant that I wouldn't need to manage a Ruby version dependency. Using Go is probably unnecessary (especially with our traffic), but I also learnt a thing or two about how tricky it is to get Postfix pipes to respect environment variables.
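
The capture step can be sketched in a few lines. s3putter itself is written in Go and uploads to S3; the Ruby below is only an illustration of the same pattern, and it writes to a local directory instead of a bucket. The method name `spool_message` and the `.eml` naming scheme are assumptions made for this sketch, not taken from s3putter.

```ruby
require 'digest'
require 'fileutils'

# Hypothetical helper: store one raw message under a name derived from its
# contents. A local directory stands in for the S3 bucket here.
def spool_message(raw, spool_dir)
  FileUtils.mkdir_p(spool_dir)

  # Content-derived name: the same bytes always map to the same file.
  path = File.join(spool_dir, "#{Digest::SHA256.hexdigest(raw)}.eml")

  # Write to a temporary name first, then rename, so a reader never
  # sees a half-written message.
  tmp = "#{path}.tmp"
  File.binwrite(tmp, raw)
  File.rename(tmp, path)
  path
end
```

A pipe target would call it as `spool_message($stdin.binmode.read, spool_dir)`, since Postfix hands the message to a piped command on standard input. Naming the file after a hash of its contents means a retried delivery of the same bytes overwrites itself rather than creating a duplicate.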

The ultimate goal was to be in a position where we would have redundancy, and services split on logical seams. However, there were a few more challenges.

We Dockerized the dependencies with a Ruby 2.1 base image, with packages from Debian Jessie, and our Docker image is designed to run a script which should make it agnostic to most configurations (and potentially reusable by others in the future) but it does make some trade-offs:


This docker image also copies over a generic general.yml which uses environment variables injected when starting docker:

(e.g. docker run --rm --name alaveteli  --env-file=/data/alaveteli/.env -v /data:/data nzherald/alaveteli)

There are some minor changes on our fork (such as allowing ERB to be processed in the general.yml), and we've tried to avoid some of the bash scripts (like the commonlib script, which did a similar thing but required pyyaml and php to parse the general.yml file, and which broke once the file included ERB).

This means that others can use this by setting up a mounted volume and putting their environment variables (including theme) in a file.
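
A minimal sketch of how ERB in a YAML config can pick up injected environment variables. The keys and defaults below are illustrative and may not match the general.yml on the nzherald fork; `load_config` is a hypothetical helper, not Alaveteli's actual config loader.

```ruby
require 'erb'
require 'yaml'

# Illustrative excerpt of a general.yml that reads its values from the
# environment, falling back to a default when a variable is unset.
GENERAL_YML_TEMPLATE = <<~YAML
  DOMAIN: <%= ENV.fetch('DOMAIN', 'localhost') %>
  SITE_NAME: <%= ENV.fetch('SITE_NAME', 'Alaveteli') %>
  FORCE_SSL: <%= ENV.fetch('FORCE_SSL', 'false') %>
YAML

# Hypothetical helper: render the ERB first, then parse the result as YAML.
def load_config(template)
  rendered = ERB.new(template).result
  YAML.safe_load(rendered)
end
```

With `DOMAIN=fyi.example` in the env file passed to `docker run --env-file`, `load_config(GENERAL_YML_TEMPLATE)['DOMAIN']` would be `"fyi.example"`; unset variables fall back to the defaults. Values containing YAML-significant characters (a colon, for instance) would need quoting in the template.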


The app server is completely isolated from the container receiving email, and currently running on a separate service. Cron tasks (and an additional mail-from-s3 processing task) are run in another docker image using a sidekiq branch of our fork (https://github.com/nzherald/alaveteli/tree/sidekiq/app/workers). The mounted volume includes a Unicorn socket (which an nginx container reads), the public folder (precompiled by the Alaveteli image each launch), the raw emails and the Xapian database.
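
The mail-from-s3 task is not shown in this thread, so the following is only a sketch of the general shape such a worker might take, using a local directory as a stand-in for the bucket. `drain_spool` and its block are hypothetical; in Alaveteli the block would presumably hand the raw message to the receive method of RequestMailer.

```ruby
# Hypothetical worker step: hand every stored message to the given block
# and delete it only once the block has returned without raising.
# Returns the number of messages processed successfully.
def drain_spool(spool_dir)
  processed = 0
  Dir.glob(File.join(spool_dir, '*.eml')).sort.each do |path|
    raw = File.binread(path)
    begin
      yield raw
    rescue StandardError => e
      # Keep the file so the next run can retry it.
      warn "leaving #{path} for retry: #{e.message}"
      next
    end
    File.delete(path)
    processed += 1
  end
  processed
end
```

Deleting only after a successful hand-off means a crash mid-run leaves unprocessed mail in place, at the cost that a message could be processed twice if the delete itself fails.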

To replicate between app servers we've used a Dropbox docker image which also serves as a backup.


Problems with this current setup include:
  • Dropbox replication may be problematic. Currently the strategy is that if app-01 is knocked out, Xapian indexing and all other worker jobs stop (as Sidekiq runs there). Fleet (as part of CoreOS) can also manage relaunching services with a bit of configuration.
  • Dropbox replication could be made redundant if we move to more evented background jobs.
  • Sidekiq can be replaced with delayed_job and used in non-dockerised deployments without the need for Redis. Sidekiq was chosen more for speed and familiarity, and because of its monitoring UI.
  • The Alaveteli docker image migrates the database each time. This should be a no-op if no changes are needed, but a more solid zero-downtime strategy when migrations are required probably takes a bit more thought. Perhaps one-shot containers.
  • The Alaveteli docker image also precompiles assets each launch, which is slow. It would be nice to be able to have a prebuilt, quickly ready container for speedy rollbacks. This is a trade-off with being able to be used with any theme.


This isn't perfect, and probably not advisable for others to use just yet, but I just wanted to share this high-level overview of how we approached this problem. We will also try to work on improving this setup.

I'll make sure to include a link to a more detailed blog post when I've written it.


Thanks,


Caleb

Gareth Rees

Apr 21, 2015, 6:45:36 AM
to alavet...@googlegroups.com
Hey Caleb,

Thanks for all the hard work you've been putting in to this. We've noticed 
fyi.org.nz is up and running again and looking great!

I really like the direction you're going in; I think running an app in a 
multi-server environment really forces you to have a good systems architecture. 
Alaveteli could certainly do with some improvement here.

It's worth keeping in mind that most re-users of Alaveteli currently run the 
whole stack on a single server – often shared with other services. At 
mySociety, we only run a single app server with a remote database server. “At 
the moment, you need more systems expertise to use Docker, not less.” [1]

I'm pretty confident we're going to see an explosion of container tools in the 
next few years. CoreOS have released their own [2], and both RedHat [3] and 
Microsoft are getting in on the action [4]. I also have reservations about 
getting locked in to Docker. They're clearly trying to build a platform around 
Docker-the-tool, which might become problematic in the future.

I think the approach we should take is to make Docker an extension of 
Alaveteli, rather than a key component in its install and run procedure. I 
think the difficulty at the moment is that it's pretty hard to run specific 
parts of Alaveteli independently. Extracting some of those is definitely a 
great first step, as you've mentioned [5].

I'm sure there will be some work to get to that point, but I think it would 
benefit both the single-server re-user and those who want a service-oriented 
environment.

I just want to re-emphasise that we're really excited about the work you're 
doing. It's totally the direction we want to go in so don't take our 
reservations as negatives. I'm certain that moving forward with Docker will 
improve Alaveteli for everyone – even if they don't end up using Docker to run 
it.

Cheers,

Gareth

Andrei Cristian Petcu

Apr 23, 2015, 7:52:09 AM
to alavet...@googlegroups.com
Hi Caleb,

Thank you for working on this container! I am trying to build it right
now :)

How do you manage your docker containers? Do you use Docker Fleet (I see
you mentioning it)? I want to try to use the same tools in production as
you are.

I will probably try to use it with no S3 storage and see how that goes.

Thank you,
Andrei


Caleb Tutty

Apr 26, 2015, 7:16:05 AM
to alavet...@googlegroups.com
Sorry for the delay replying:

@Andrei:

I've been using CoreOS as the Linux distribution, and using systemd (http://www.freedesktop.org/wiki/Software/systemd/) unit files with CoreOS' tool 'fleet' (https://coreos.com/using-coreos/clustering/).

systemd looks to be the new standard in Debian and Ubuntu, replacing the old style of Upstart and init scripts. Not everyone likes it (http://www.pcworld.com/article/2841873/meet-systemd-the-controversial-project-taking-over-a-linux-distro-near-you.html).

Here are my unit files:


They use a trick: I create my servers with a metadata key called 'name', and fleet only runs each unit on the server whose 'name' matches. So I create 'app-01' and 'app-02' as two servers in two different availability zones.

I then use:

fleetctl submit alaveteli@app-01
fleetctl submit alaveteli@app-02

then 

fleetctl start alaveteli@app-01
fleetctl start alaveteli@app-02

Whatever goes after the "@" symbol is accessible as "%i" in those unit files. (https://coreos.com/docs/launching-containers/launching/getting-started-with-systemd/#instantiated-units)

If app-01 dies for some reason, I can create a new server with that metadata and fleet will automatically restart that service on a new machine with that name property.


@Gareth

Completely agree, and I understand the need to support reusers of this software. 

If the work I've done is useful for others, then that's great, but I understand that there'll be a range of different environments Alaveteli is deployed into and how important it is to make the process of updating as easy as possible.

There may be some advantages with a more mature implementation which could see automated containerised deployments able to be up and running in minutes, but that could be something to look more deeply at further down the line.
