Hi,
We're looking into Nomad as a way to schedule batch jobs, but we've run into a problem. We're using the exec driver because we want to change as little as possible from our current setup: just introducing Nomad is a big step, and we'd like to avoid changing too much at the same time.
The problem is that Nomad appears to copy everything in the chroot_env list (/usr, /etc, /lib, and so on) into each allocation's directory. Since /usr on our systems (AWS Linux 2016.03.3) is 1 GB, each allocation takes at least 1 GB of disk. Nomad also doesn't seem to be very quick at cleaning up after jobs have run, so we can't run very many jobs before the machines fill up their disks and become unusable. I guess we could trim the chroot_env list, but that would also limit what our jobs have access to (I'm not sure at this point how limiting it would be, but it wouldn't be without complications).
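For reference, trimming the list would look roughly like the client configuration below. This is an untested sketch: the particular set of paths kept is an assumption for illustration only, and whether a given job still works with it would have to be checked case by case.

```hcl
# Nomad client configuration (sketch, untested).
# The paths listed here are assumptions for illustration, not a recommended set.
client {
  enabled = true

  # When chroot_env is set, it is used instead of the default list, so only
  # these source paths are made available inside each allocation's chroot.
  chroot_env {
    "/bin"       = "/bin"
    "/etc"       = "/etc"
    "/lib"       = "/lib"
    "/lib64"     = "/lib64"
    "/sbin"      = "/sbin"
    "/usr/bin"   = "/usr/bin"
    "/usr/lib64" = "/usr/lib64"
  }
}
```

Mapping only subdirectories of /usr like this would shrink each allocation, but anything a job needs from elsewhere under /usr would then be missing inside the chroot.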
Is this really how the exec driver is supposed to work?
How do people avoid running out of disk?
How quickly can we expect Nomad to clean up old allocations?
Since the java driver seems to work similarly to the exec driver, does it have the same issue?
yours,
Theo