Restart nomad client without disturbing running jobs?


dma...@istreamplanet.com

Jul 23, 2018, 1:38:40 PM
to Nomad
Hi everyone!

Is it possible to restart the Nomad agent without disturbing the running jobs? Whenever I do this, the jobs get redistributed to other nodes, i.e., Nomad treats the node as dead. That makes sense, because the Nomad agent IS down, but is there a way to tell it not to redistribute the jobs for a while?

Said another way: I'm sure my jobs can live for 5 minutes without the nomad agent, so is it possible to take advantage of that and leave the jobs running while I restart nomad?

Thanks!
-Dylan

msch...@hashicorp.com

Jul 23, 2018, 5:44:14 PM
to Nomad
Hi Dylan,

Nomad client nodes periodically heartbeat to the servers to tell them they're alive, healthy, and available for scheduling work. As you stated, if a Nomad node is down long enough, the servers consider it "lost" and reschedule its work on other nodes (if possible). Currently the only way to configure this behavior is by setting heartbeat_grace on the servers. We've discussed implementing a "maintenance mode" to temporarily raise the heartbeat grace period, but we don't have any firm plans on how or when to implement it. Please feel free to open an issue if you have a use case that a maintenance mode would help!
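
To make the timing concrete, here is a rough Python sketch of the decision described above. It is only a simplified model, not Nomad's actual code: the function name, the 30-second TTL, and the 2-minute restart window are all made up for illustration. The one real setting is heartbeat_grace, which goes in the server block of the agent configuration and is added on top of the heartbeat TTL.

```python
from datetime import datetime, timedelta


def node_is_lost(last_heartbeat: datetime, now: datetime,
                 heartbeat_ttl: timedelta, heartbeat_grace: timedelta) -> bool:
    """Simplified model: a node counts as lost once it has been silent
    for longer than its heartbeat TTL plus the servers' grace period."""
    deadline = last_heartbeat + heartbeat_ttl + heartbeat_grace
    return now > deadline


# Hypothetical numbers, chosen only for illustration.
last_seen = datetime(2018, 7, 23, 10, 0, 0)
ttl = timedelta(seconds=30)
agent_downtime = timedelta(minutes=2)
after_restart = last_seen + agent_downtime

# Short grace period: the 2-minute restart overshoots the deadline.
print(node_is_lost(last_seen, after_restart, ttl, timedelta(seconds=10)))  # True

# Grace period raised to 5 minutes: the same restart fits inside the window.
print(node_is_lost(last_seen, after_restart, ttl, timedelta(minutes=5)))   # False
```

Under this model, the grace period has to cover the longest restart you expect. The trade-off is that a node that really has died also takes that much longer to be noticed.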

Hope that helps!

Cameron Davison

Jul 24, 2018, 9:10:01 AM
to Nomad
If you are using systemd to run Nomad, make sure to look at the example unit file: https://github.com/hashicorp/nomad/blob/master/dist/systemd/nomad.service. Pay particular attention to the KillMode setting, because otherwise systemd will kill everything running under the Nomad service when it stops or restarts.
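
As a quick sanity check, something like the following Python sketch can report which KillMode a unit file sets. The helper name and the sample unit text (including the ExecStart path) are hypothetical, and the parsing only handles simple unit files. If KillMode is absent, systemd defaults to control-group, which stops every process in the service's control group.

```python
import configparser


def kill_mode(unit_text: str) -> str:
    """Return the KillMode from a unit file's [Service] section,
    or "control-group" (systemd's default) when it is not set."""
    parser = configparser.ConfigParser(
        strict=False,        # unit files may repeat keys
        interpolation=None,  # treat '%' specifiers as plain text
        delimiters=("=",),
    )
    parser.optionxform = str  # keep systemd's case-sensitive key names
    parser.read_string(unit_text)
    return parser.get("Service", "KillMode", fallback="control-group")


# Hypothetical unit file contents, for illustration only.
EXAMPLE_UNIT = """\
[Unit]
Description=Nomad

[Service]
ExecStart=/usr/local/bin/nomad agent -config /etc/nomad.d
KillMode=process
"""

print(kill_mode(EXAMPLE_UNIT))  # process
```

With KillMode=process, systemd signals only the main agent process on stop or restart, so the task processes it started are left alone.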

dma...@istreamplanet.com

Jul 24, 2018, 1:00:17 PM
to Nomad
Thanks folks! Both replies are helpful!

