Warden vs Docker, LXC etc.


Glyn Normington

Feb 25, 2014, 11:20:36 AM2/25/14
to vcap...@cloudfoundry.org
I'd like to add three topics to this discussion: serviceability, diagnostics, and functional control. Apologies if these have been touched on in the past, but I believe they are very important.

In the following the term "Warden server" refers to the core function of Warden or Garden (which is currently equivalent in both projects).

Serviceability

If we replaced the Warden server with a third party project such as Docker or LXC, we would still need to be able to support users of Cloud Foundry. This would require considerable investment to learn the internals of the chosen third party project and to put committers in place with the skills to produce fixes on a schedule required by our users.

Diagnostics

It seems that Docker and LXC have not been designed with diagnostics as a major design goal.

A recent PR against the Warden server aims to improve its diagnostics in "out of memory" situations. The details of this particular change (for which see the commit message) are not important here, but it's crucial to ask ourselves how easy it would be to make such changes to Docker, LXC, etc. Not only would we need at least one or two core committers on such a project, but we would need to persuade the relevant community of the importance of first class diagnostics so that such changes could be made where necessary in the codebase.
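For context, the kind of cgroup-level OOM diagnostic this change deals with ultimately reduces to reading kernel pseudofiles. A minimal, hypothetical sketch (not the actual PR's code) that parses the cgroup v1 `memory.oom_control` file to detect an out-of-memory condition:

```python
def parse_oom_control(text):
    """Parse the cgroup v1 memory.oom_control pseudofile into a dict.

    Typical contents:
        oom_kill_disable 0
        under_oom 0

    Exact fields vary by kernel version; this is an illustrative sketch,
    not the Warden change discussed above.
    """
    fields = {}
    for line in text.splitlines():
        key, _, value = line.partition(" ")
        if key:
            fields[key] = int(value)
    return fields


def container_under_oom(text):
    """True if the cgroup is currently in an out-of-memory state."""
    return parse_oom_control(text).get("under_oom", 0) == 1
```

The point is less the parsing than the access: making a third-party project surface this data at the right moments is where the committer influence described above becomes necessary.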

Diagnostics are essential to enable Cloud Foundry to be serviced and supported cost-effectively. This is important to those vendors building commercial products, especially on-premise products, where "first failure data capture" of diagnostic data can be a life-saver for a support organisation, especially for rare, severe failures.

Functional Control

The other main reason for preferring Warden server over third party projects is to be masters of our own destiny in terms of the functional direction. Whether we are talking about exploiting additional cgroup subsystems, absorbing/exploiting the cgroup re-architecture if/when it arrives, or using cgroup statistics for monitoring and managing Cloud Foundry, these functions are critical to Cloud Foundry's ability to meet its customers' requirements.

Also Warden server has several functional advantages over Docker and LXC (detailed in the comparison document [2]) and so it would be necessary to port these features to a replacement third party project, which would again require a substantial investment in order to gain the necessary influence.

Dmitry

Feb 25, 2014, 11:38:43 AM2/25/14
to vcap...@cloudfoundry.org
I know my question may provoke a negative reaction from many folks, but I think we should re-evaluate the general approach here: are we sure we really need a second level of virtualization? Are hypervisors still not efficient enough?



James Bayer

Feb 25, 2014, 11:46:28 AM2/25/14
to vcap...@cloudfoundry.org
Hypervisor-only support may be suitable for some use cases where it makes sense to sacrifice provisioning time, efficiency, and large scale-out for hypervisor isolation and performance. We also hear that people want to run CF on bare metal, and that is just further out along the tradeoff curve. We're focused on provisioning/recovery speed, isolation, large scale-out, and efficiency in the tradeoffs we've intentionally made. Running a DEA with a max app instance count of 1 is essentially like getting a dedicated VM for your application.
--
Thank you,

James Bayer

Dmitry

Feb 25, 2014, 12:01:44 PM2/25/14
to vcap...@cloudfoundry.org
Some of the problems above could be solved by efficient resource management (with resources managed in pools). In that case, provisioning/recovery speed would be the same as it is today.
From a large scale-out perspective, the benefits of a shared kernel are negligible relative to the network penalties of a second level of virtualization.
For bare-metal installations, the container-based approach makes complete sense. The best option would be to support both approaches, managed by a "density controller".

Alex Suraci

Mar 3, 2014, 2:12:02 AM3/3/14
to vcap...@cloudfoundry.org
My thoughts on the whole Warden vs. Docker discussion:

Trying to convince an engineer who knows Docker and Warden to switch to Docker
is like trying to convince a manual transmission driver to buy an automatic
because it's what your neighbor knows.

The real magic happens in the kernel. Both Docker and Warden are sugar on top of
it, with APIs tailored for different things.

Docker provides reproducible environments. Warden provides agility on top of
predefined environments.

Further: Docker is a tool for bootstrapping images and running single commands
with a static environment. Warden is a tool for taking predefined images and
spinning up dynamically configurable containers that can run any number of
commands.

On Diego team we actually use both Docker and Warden. We use Docker to build our
base images[1], and we use Warden to run apps with them. We also use Docker for
having reproducible CI environments with Drone[2]. We opt for the best tool for
the job.

Diego's architecture demands a flexible, cross-platform container abstraction.
Our components cannot assume Linux, let alone LXC. Given this, the Warden
architecture serves an important function: a generic API over a pluggable
backend. Having this abstraction layer allows us to design components that will
"Just Work" when you swap out Linux for Windows, or, hell, our own Linux backend
with a Docker backend.

So, here's what would really bug me about swapping out all of Warden: you can
cleanly define Docker's container API in terms of Warden's, but not the other
way around. By switching we lose flexibility, not just control.
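To illustrate the claim that Docker's container API composes out of Warden-style primitives (but not the other way around), here is a rough sketch. The backend class and method names are hypothetical, not the real Garden or Docker APIs:

```python
# Hypothetical sketch: expressing a Docker-style one-shot `run` in terms of
# Warden-style primitives (create / set_limit / run / destroy).

class WardenStyleBackend:
    """Minimal in-memory stand-in for a Warden-like container backend."""

    def __init__(self):
        self._containers = {}
        self._next_id = 0

    def create(self, rootfs):
        handle = f"container-{self._next_id}"
        self._next_id += 1
        self._containers[handle] = {"rootfs": rootfs, "limits": {}}
        return handle

    def set_limit(self, handle, resource, value):
        # Warden-style: limits can be (re)set on a live container at any time.
        self._containers[handle]["limits"][resource] = value

    def run(self, handle, command):
        # Stand-in for spawning a process inside the container.
        return f"ran {command!r} in {handle}"

    def destroy(self, handle):
        del self._containers[handle]


def docker_style_run(backend, image, command, memory=None):
    """Docker-style `run`: all config fixed up front, one command, then done."""
    handle = backend.create(rootfs=image)
    if memory is not None:
        backend.set_limit(handle, "memory", memory)  # set once, at creation
    try:
        return backend.run(handle, command)
    finally:
        backend.destroy(handle)
```

The reverse mapping fails because a Docker-style `run` exposes no handle on which to later inject a second command or adjust a limit.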

As a challenge, I've started working on a Docker backend for Garden[3]. So far
you can create a container, copy files in and out, run and stream commands, and
set memory/CPU limits. That's about as far as I think I can get without
rewriting half of our Linux backend; abstractions are springing leaks left and
right.

The transition is lossy, and what would it technically buy us? We wouldn't have
to maintain our Linux backend anymore? It requires next to no maintenance as it
is. Again, the magic happens in the kernel; we just have to turn some knobs. I'm
afraid of having those knobs boxed away and having GitHub issues and Google
searches and more external dependencies in their place. There does not need to
be one single application for the Linux kernel's containerization APIs.



Dmitry

Mar 3, 2014, 6:41:57 AM3/3/14
to vcap...@cloudfoundry.org
Hi Alex,
I'm trying to figure out why Docker and Warden aren't trying to solve the same problem, as you described above.
If the droplet + execution context were fused into a Docker image, couldn't we then operate that image via the Docker API exactly as we create Warden containers?
Thanks,
Dmitry




Alex Suraci

Mar 3, 2014, 11:07:12 AM3/3/14
to vcap...@cloudfoundry.org
Not exactly. Having a base image is only a small part of a container's lifecycle. Docker's API lets you take an image and `run` some command in it with static configuration. Warden's API lets you take an image, create a container, inject commands, dynamically change its resource configuration, etc. This is what I mean by describing Docker's API as a composition of Warden's.

If you know your requirements upfront and they're never going to change, Docker works. But the API doesn't allow for more interesting things, like letting users attach to a running container or dynamically reconfiguring resource limits (e.g. scaling memory without restarting). Some things are flat-out not defined, like disk quotas and bandwidth limits.

They're trying to solve the same problem in that they're both containerization products. Much like a cargo ship and an airplane are trying to solve transportation.

Dr Nic Williams

Mar 3, 2014, 11:24:56 AM3/3/14
to vcap...@cloudfoundry.org
Alex, afaik Docker is as flexible with opening volumes to external configuration folders/files as Warden is. E.g. the -v flag on docker run. This failed for you?
--
Dr Nic Williams
Stark & Wayne LLC - consultancy for Cloud Foundry users
twitter @drnic

Alex Suraci

Mar 3, 2014, 11:26:46 AM3/3/14
to vcap...@cloudfoundry.org

By configuration I mean container level configuration; CPU limits, disk limits, memory limits, exposed ports, etc.

Dr Nic Williams

Mar 3, 2014, 11:46:15 AM3/3/14
to vcap...@cloudfoundry.org
The "docker run" command includes the following flags. Do they solve the container-level config problem you're describing?

-p is for port mapping http://docs.docker.io/en/latest/reference/run/#expose-incoming-ports
-m & -c are for RAM & CPU limits (I've not played with them) http://docs.docker.io/en/latest/reference/run/#runtime-constraints-on-cpu-and-memory
-lxc-conf for generic lxc configuration (I've not played with it) http://docs.docker.io/en/latest/reference/run/#runtime-privilege-and-lxc-configuration

For disk, @chenyf comments on how they set disk quotas based on the ideas from warden https://github.com/dotcloud/docker/issues/471#issuecomment-22373948
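For concreteness, those flags compose into a single `docker run` invocation. A small sketch (illustrative values only) assembling such a command line; note that every limit here is fixed at container creation:

```python
def docker_run_argv(image, command, ports=None, memory=None, cpu_shares=None):
    """Assemble a `docker run` argv using the flags discussed above
    (-p for port mapping, -m for memory, -c for CPU shares).

    Values are illustrative. All of these constraints are supplied
    when the container is created, which is the static-configuration
    limitation raised earlier in the thread.
    """
    argv = ["docker", "run"]
    for host_port, container_port in (ports or []):
        argv += ["-p", f"{host_port}:{container_port}"]
    if memory is not None:
        argv += ["-m", str(memory)]
    if cpu_shares is not None:
        argv += ["-c", str(cpu_shares)]
    argv.append(image)
    argv += command
    return argv
```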

Alex Suraci

Mar 3, 2014, 11:48:17 AM3/3/14
to vcap...@cloudfoundry.org

Statically. You cannot change them after the container is created.

Dr Nic Williams

Mar 3, 2014, 11:54:49 AM3/3/14
to vcap...@cloudfoundry.org
Ok.

Is that something we do in CF? I didn't know we dynamically changed disk or RAM or ports, etc.

Alex Suraci

Mar 3, 2014, 1:42:15 PM3/3/14
to vcap...@cloudfoundry.org
It's not something we currently use. There's a lot of Warden we don't use until, at some point, a story comes up and we use or implement it. For example, setting inode limits or CPU shares.

It'd be a nice user feature to not have to restart apps when scaling down* memory/disk. There's no technical reason to require a restart; it's just echoing to a cgroups pseudofile. And being able to dynamically change exposed ports would be great for hooking into a live container for, e.g., an SSH session.

* Scaling up is harder due to available capacity potentially running out.
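As a sketch of the "echoing to a cgroups pseudofile" point: adjusting a live container's memory limit under cgroup v1 is a single file write. The path layout and handle naming below are assumptions for illustration, not Warden's actual hierarchy:

```python
import os


def set_memory_limit(container_handle, limit_bytes,
                     cgroup_root="/sys/fs/cgroup/memory"):
    """Adjust a live container's memory limit by writing the cgroup v1
    memory.limit_in_bytes pseudofile.

    The cgroup_root default and per-handle directory layout are
    illustrative assumptions; real Warden manages its own hierarchy.
    No restart of the contained processes is required.
    """
    path = os.path.join(cgroup_root, container_handle,
                        "memory.limit_in_bytes")
    with open(path, "w") as f:
        f.write(str(limit_bytes))
```

A dynamic-limits API is essentially a thin, authenticated wrapper over writes like this one, which is why exposing it through Warden is cheap.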

Glyn Normington

Mar 4, 2014, 4:53:44 AM3/4/14
to vcap...@cloudfoundry.org
Another use case for dynamically adjusting resource controllers is to cope automatically with temporary peaks in demand by a particular application.

Aristoteles Neto

Mar 4, 2014, 3:39:18 PM3/4/14
to vcap...@cloudfoundry.org
One of the reasons I originally liked the idea of using Docker (or rather LXC) is that, from what I know, OpenVZ is the foundation of the codebase.

What this means is that all the features we say are currently unavailable will probably be there eventually, since they were already available in OpenVZ. Making the switch to LXC and helping with its codebase would essentially allow that to happen sooner, but I completely understand the case for using Warden + Docker images, as that currently gets the best of both worlds.

I’m not sure I’d agree with using hypervisors instead of containers - containers allow for higher density, and better provisioning agility.

Aristoteles Neto



Rao Pathangi

May 2, 2014, 5:57:08 AM5/2/14
to vcap...@cloudfoundry.org

Thanks for explaining. In your view, do you see Docker evolving to acquire the Warden capabilities it currently lacks? Also, in terms of elasticity, doesn't dynamically provisioning resources (CPU, RAM, etc.) to an already-running VM deliver the same results as spinning up a new VM on demand, albeit with a static configuration?

Do you feel the ability to dynamically change RAM or CPU is necessary because CF's elasticity is really bounded, in that the number of VMs can grow from a minimum of 'x' to a maximum of 'y'?

Mike Heath

May 2, 2014, 12:10:56 PM5/2/14
to vcap...@cloudfoundry.org
We dynamically change networking settings using Warden. This is a customization that we've made to implement application level firewalls. This is HUGE for us as it allows our CF network zone to route to sensitive network zones but only specific applications can actually route outside of Warden to those zones. This is actually one of the big reasons why CF has gained so much traction in our org.
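The approach Mike describes can be sketched as generating per-container firewall rules; the interface names, chain choice, and rule shapes below are hypothetical illustrations, not the actual CF customization:

```python
def firewall_rules(container_iface, allowed_cidrs):
    """Generate iptables commands that let a single container's virtual
    interface reach specific sensitive network zones while everything
    else from that container stays blocked.

    Interface and chain names are hypothetical; this only illustrates
    the per-application allow-list idea described above.
    """
    rules = []
    for cidr in allowed_cidrs:
        rules.append(
            f"iptables -A FORWARD -i {container_iface} -d {cidr} -j ACCEPT"
        )
    # Default deny for anything else leaving this container.
    rules.append(f"iptables -A FORWARD -i {container_iface} -j DROP")
    return rules
```

Because Warden creates a distinct virtual interface per container, rules like these can be scoped to one application rather than the whole DEA host.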