Upstart process used with consul lock exits every few hours


an...@systeminsights.com

Oct 28, 2016, 2:54:53 AM
to Consul
I have a process that uses the consul lock command to elect a leader, so that a particular process runs only on the node that acquires the lock. All nodes have the same upstart configuration:

description "MyApp"

emits myapp-up

start on (local-filesystems and net-device-up IFACE!=lo)
stop on runlevel [016]

respawn
respawn limit 10 10

kill timeout 10
kill signal INT

setuid appuser
setgid appgroup

env SHELL=/bin/bash

script
    exec consul lock -name myapp apps/myapp /opt/myapp/venv/bin/python /opt/myapp/app.py --config /etc/app/config.yaml
end script



I find that this service exits every few hours and does not respawn on the node that has the lock. The upstart logs for this service say:

Error running handler: signal: terminated
signal: terminated
Shutdown triggered or timeout during lock acquisition
Error running handler: signal: terminated
signal: terminated

For the node which had the lock:

Error running handler: signal: terminated
signal: terminated

The logs of the Python application do show that it received a kill in the form of a KeyboardInterrupt.
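
A KeyboardInterrupt in Python is what an unhandled SIGINT looks like, and the upstart job above sets kill signal INT, so one way to narrow this down is to log every termination signal explicitly with a timestamp and compare it against the upstart and Consul agent logs. The following is only a minimal sketch of that idea: the names log_signal, install_handlers and received are hypothetical and not part of the application, and the handler only records the signal, so a real application would still need to shut down cleanly after logging.

```python
import os
import signal
import time

received = []  # hypothetical: names of the signals seen so far, in arrival order


def log_signal(signum, frame):
    """Record and print the signal instead of letting the default handling run."""
    name = signal.Signals(signum).name
    received.append(name)
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    print(f"{stamp} pid {os.getpid()} received {name}", flush=True)


def install_handlers():
    # SIGINT is what Python normally turns into KeyboardInterrupt;
    # SIGTERM normally ends the process without any traceback.
    for sig in (signal.SIGINT, signal.SIGTERM):
        signal.signal(sig, log_signal)


if __name__ == "__main__":
    install_handlers()
    os.kill(os.getpid(), signal.SIGINT)  # simulate the interrupt locally
    time.sleep(0.1)  # give the interpreter a moment to run the handler
    print(received)
```

Calling install_handlers() early in app.py would make it visible whether the process is being sent SIGINT, SIGTERM, or both, and at what time.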

I referred to https://github.com/hashicorp/consul/issues/985, but it did not give me much idea of why this is failing. The output of consul --version on all three nodes is:

Consul v0.7.0
Protocol 2 spoken by default, understands 2 to 3 (agent will automatically use protocol >2 when speaking to compatible agents)

Where could I be going wrong?

Barry Kaplan

Nov 3, 2016, 4:09:21 AM
to Consul
Any idea on this? We are losing all our elastalert instances. Pretty sure that this problem started after the upgrade from 6.x to 7.x.

James Phillips

Dec 19, 2016, 5:21:17 PM
to consu...@googlegroups.com
Hi,

Sorry for the late reply on this one. Do you see anything in your
Consul agent logs when this happens? I haven't seen similar reports or
experienced anything like this in local testing, so this is a weird one.

-- James
