Docker


Steve Fisher

Apr 5, 2016, 12:15:08 PM
to icatgroup
In order to debug the icat.server issue 150, "Load balancing does not work properly with long lasting queries (503 error)", I decided to have a look at Docker instead of getting hold of a lot of machines (virtual or otherwise). I am new to Docker, so I have not made use of the "swarm", which I think is probably the right way to manage a set of containers. Instead I have three Dockerfiles - glassfish, mysql-client and icat (built on top of glassfish). A script builds the images and runs things so that at the end I have three containers, two with an icat and one running mysql. I am now in a position to start looking at bug 150!

The code is not beautiful and is currently in a personal repository. The Docker assumption is that each container runs one process, which is quite different from the "normal" VM approach.
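
To give a flavour, the icat Dockerfile built on top of the glassfish one is roughly of the following shape. This is only a sketch: the base image tag, file names, ports and setup commands are placeholders rather than the actual contents of my repository.

    # Sketch only: names, paths and ports are placeholders, not the real repository contents.
    FROM glassfish:latest              # the locally built glassfish image

    # Hypothetical icat.server distribution and its configuration
    COPY icat.server-distro.zip setup.properties /opt/icat/

    # Unpack and install against the GlassFish domain provided by the base image
    # (assumes unzip is already present in that image)
    RUN cd /opt/icat && \
        unzip icat.server-distro.zip && \
        cd icat.server && \
        cp ../setup.properties . && \
        ./setup install

    # GlassFish default HTTP and HTTPS ports
    EXPOSE 8080 8181

    # Keep the single foreground process the container is built around
    CMD ["asadmin", "start-domain", "--verbose"]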

Steve

Peter Parker

Apr 5, 2016, 5:10:45 PM
to icatgroup

Hi Steve


Thanks for that.


I have also been looking into Docker recently and, like you, have reached a stage where I can run an ICAT container alongside a MySQL container.  For now, my code is also in a private repository.


My overall aim with this is to ease (read: completely automate) the deployment of ICAT instances at SNS and HFIR, not just for testing and development purposes, but for production purposes, too.  I see the following benefits to this kind of automation:


  • Spawning one-off ICAT stacks on a development machine becomes trivial.  These help solve problems like the one you’ve described, but they could also be used to test new versions of components before they are deployed to production, or as a test bed against which to develop external software that makes calls to ICAT.
  • The developer desktop environment resembles the production environment as closely as is possible, so what works for the developer should more often than not work in production.
  • Maintenance tasks on production ICATs could be carried out separately, away from the deployed instances, and the deployed instance could then be replaced quickly whenever it is appropriate to do so.
  • Configuration is under source control, so changes can be tracked and well understood by everybody.


While I do have ORNL-specific use cases in mind, I am trying to code in a facility-agnostic way so that it would be easy enough for other facilities to use, should that ever be deemed sensible.


Puppet


As you know I originally started trying to solve this problem with Puppet, but since then my experiences have not all been positive.  My personal opinion is that:


  • There is a relatively steep learning curve.  While Puppet module users have a simple declarative syntax to use when interacting with modules, Puppet module authors are in for a tougher time since they are expected to understand the wider Puppet ecosystem and toolsets.  A reasonably-sized module will likely have:
    • “under-the-hood” code written in Ruby;
    • BDD-style tests written with RSpec;
    • acceptance tests written for Beaker;
    • templates written in EPP or ERB syntax;
    • Gemfiles, Rakefiles and Puppetfiles; and
    • a large number of dependencies on external Puppet modules.
  • The problem is actually made more complex by Puppet, rather than less.  The principle behind Puppet is that you should be able to idempotently converge to the desired state of the system, given whatever state the system is currently in.  Unfortunately this can get very complex very quickly given that there are so many possible system states, and even something simple like updating a component to a new version can be tricky to do in a robust way.  The difference between developer machines and production machines is another source of extra states to be considered and accounted for.
  • The development process of iteratively making a small change to the Puppet code and then testing the result of that change can often be painfully slow.  Converging to a desired state is hard, but it is made even harder when the current state of the system is either invalid or unknown to the developer.  This means that deleting the GlassFish folder and running the entire script again is often the easiest course of action.
  • It feels like there are too many extra layers of complexity between the developer and what is actually going on, and this can make it hard to debug problems.  Puppet manifests are compiled into catalogs which are then applied by Puppet, and in a development environment this will be done on a VirtualBox (or similar) VM managed by Vagrant which in turn is managed by Beaker.  Now let’s say the Puppet script falls over; is it a problem with the contents of the script, or is it a problem with the configuration of one of the half-dozen tools in-between the script and the ICAT being configured?
  • Last but not least is the compiler itself, which can also be difficult to work with.  It can be quite inconsistent at times (strict in some places but not in others), and also quite opaque.  A careful developer is therefore left with no option but to adopt the recommended practice of layering everything in comprehensive BDD-style tests, and while this gives one more confidence in understanding what the compiler is doing, the tests end up being very brittle and do not lend themselves well to rapid development.


In short: I’ve found Puppet to be quite problematic for this kind of development.  I have seen it work very well for relatively straightforward configuration management across many servers, but I think that getting a similar payoff for more complex configuration management across a few servers is much harder.


Docker


My (admittedly very limited) experience with Docker has led me to think that it could be a strong alternative to a Puppet-provisioned system:


  • Like Puppet, Docker has its own inner workings, toolsets and best practices that newcomers have to wrap their heads around, but at the end of it all what you end up with is relatively simple Dockerfiles that would be instantly recognizable to any ICAT developer.
  • Rather than converging from absolutely any possible state to a very particular state that you desire, Docker allows for building upon a single, known and controlled state, which is a far simpler problem to solve.
  • Docker’s caching mechanism means that well-organized and thought-out Dockerfiles can make for very quick iterative development (see the sketch after this list).
  • It is far clearer what went wrong when a line of a Dockerfile causes a build to fall over, since that exact line is what was run on the system.
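
As a purely hypothetical illustration of the caching point (none of the names below are taken from my repository), the trick is to order the Dockerfile so that the slow, rarely-changing steps come first; editing the frequently-changed files near the bottom then only invalidates the last few layers, and a rebuild takes seconds rather than minutes:

    # Hypothetical cache-friendly ordering; all names are placeholders.
    FROM openjdk:8-jdk

    # Slow, rarely-changing steps first: these layers are cached and reused
    # across rebuilds as long as the instructions themselves do not change.
    RUN apt-get update && apt-get install -y unzip && rm -rf /var/lib/apt/lists/*
    COPY glassfish.zip /opt/
    RUN unzip -q /opt/glassfish.zip -d /opt

    # Frequently-edited configuration last: changing it only rebuilds the
    # layers from this point onwards.
    COPY icat.properties setup.sh /opt/icat/
    RUN sh /opt/icat/setup.sh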


Peter

Steve Fisher

Apr 5, 2016, 6:17:15 PM
to Peter Parker, icatgroup
Peter,

The thing I like about Puppet is that you specify the state you want and then the system tries to get itself into that state. I don't think it works well as an add-on, though; it is really meant for completely managing a large set of machines, many of which are almost identical. We have used it for the IJP, where we have a group of worker nodes to configure, but even there it is complex, confuses people and has to be used alongside a bunch of other tools. I also found the lack of cross-platform Puppet modules annoying. For example, I had to use different modules to manage the firewall on Ubuntu and on CentOS, when the declarative style of Puppet should make it really easy to say what firewall settings you want independently of the OS.

I found I had to do a couple of things with Docker which I did not like very much and which certainly need a comment in the code. I find Docker good for setting things up quickly for testing, but I don't see much benefit in using it in production, nor for developing individual components, for which I much prefer to install manually. Once you have done it a couple of times this is very quick. There is also the installer I have written, which will set up an initial system with whatever set of components you want; you then tweak it by hand.

Steve


Rolf Krahl

Apr 8, 2016, 3:52:21 AM
to icat...@googlegroups.com
Hi Steve,

On Tuesday, 5 April 2016 at 09:15:08, Steve Fisher wrote:
> In order to debug the icat.server issue 150 "Load balancing does not work
> properly with long lasting queries (503 error)" instead of getting a lot of
> machines (virtual or otherwise) I decided to have a look at docker. I am
> new to docker so have not made use of the "swarm" which I think is probably
> the right way to manage a set of containers. Instead I have three docker
> files - glassfish, mysql-client and icat (built on top of glassfish). The
> script builds the images and runs things so that at the end I have three
> containers, two with an icat and one running mysql. I am now in a position
> to start looking at bug 150!

This is an interesting coincidence, because I also decided a few days
ago that I should have a look at Docker. I'm also new to it and have
no practical experience so far, but from what I have learned about it
so far it looks very promising. I will play around with it and
evaluate it. If it lives up to its promise, I will consider using it
for our ICAT deployment.

I'm not at that point yet, but I'd like us to share experiences in
the near future.

Best,
Rolf

--
Rolf Krahl <rolf....@helmholtz-berlin.de>
Helmholtz-Zentrum Berlin für Materialien und Energie (HZB)
Albert-Einstein-Str. 15, 12489 Berlin
Tel.: +49 30 8062 12122