Re: [consul] Help with consul KV store sessions and requests from load balancers


Armon Dadgar

unread,
Sep 24, 2015, 9:41:44 PM
to consu...@googlegroups.com, anton...@gmail.com
Hey,

I’m not sure the load balancer is the culprit here. The error “Lock already held”
indicates the client tried to “acquire” a lock on a key that is already locked.
Meaning client A holds the lock, and client B tried to acquire the lock and got
an error (which is expected). Consul does not use the client IP; instead it
tracks lock ownership by the session ID passed with the acquire request.

Without more information it’s hard to tell what is happening, but based on what
you have described so far, it doesn’t seem like a Consul bug. More likely the
client’s interaction with KV locking is flawed.

Best Regards,
Armon Dadgar

From: anton...@gmail.com <anton...@gmail.com>
Reply: consu...@googlegroups.com <consu...@googlegroups.com>
Date: September 24, 2015 at 10:18:36 AM
To: Consul <consu...@googlegroups.com>
Subject:  [consul] Help with consul KV store sessions and requests from load balancers

Hi

I've been trying to set up a docker swarm cluster with a highly available swarm manager and a consul discovery backend as (poorly) documented here:

For the most part I've got things working but I'm having some issues with the consul KV store and sessions - basically I don't understand how the sessions work.

For reasons I won't go in to here (unless you really want to ask), my consul cluster is behind a load balancer and I have an internal DNS record which points to it.
My services that need to register with the consul cluster use this DNS address, e.g. consul.my-domain.private.
That works fine in most cases, but docker swarm throws up an interesting problem.
The docker swarm manager uses the consul KV store to help manage its own leadership elections.
When setting up my swarm manager nodes, I use my consul DNS record (let's just assume the IPs are unknown, but the DNS record always resolves to a healthy node).
When the swarm manager elects a leader, it creates a KV record with the IP of the leader and has a lock session.
This is where my understanding is hazy.
The first time a leader is elected, the record doesn't exist. The docker swarm manager makes a request via the load balancer using the DNS name, e.g. consul.my-domain.private, and the KV record is created and visible in the consul UI. About a minute later it makes another request, which I assume tries to update this value, but it fails with an error about "Lock already held".
I think the problem is because the request comes through the load balancer and therefore consul sees it as originating from a different IP each time.
I assume the lock session has something going on which prevents the original value being updated.

Can anyone help shed light on why the record can't be updated when the request comes from a different IP, and whether there's anything I can do to get this working?

I have tried changing my swarm manager to use a specific IP instead of going through the load balancer and this works successfully.
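(Editor's note for later readers: Consul tracks lock ownership by the session ID the client sends with each acquire request, not by the source address, so a load balancer changing the apparent client IP shouldn't by itself cause this. A small simulation of that point, with made-up key, session, and IP values, not Consul's actual code:)

```python
# Simulated lock table keyed by session ID. The requesting IP is
# accepted as an argument but never consulted when deciding whether
# an acquire succeeds.

locks = {}  # key -> owning session ID

def acquire(key, session, source_ip):
    """Acquire succeeds or fails based on the session, regardless of source_ip."""
    holder = locks.get(key)
    if holder is None or holder == session:
        locks[key] = session
        return True
    return False

# The same session arriving via two different load-balancer paths is fine.
print(acquire("swarm/leader", "session-A", "10.0.0.1"))  # True: first acquire
print(acquire("swarm/leader", "session-A", "10.0.0.2"))  # True: owner re-acquires
# A *different* session fails no matter which IP it comes from.
print(acquire("swarm/leader", "session-B", "10.0.0.1"))  # False: lock already held
```

If the failure only appears behind the load balancer, a more likely suspect is the client creating a fresh session per request (so the second acquire comes from a "different" session), rather than the changed source IP itself.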
--
This mailing list is governed under the HashiCorp Community Guidelines - https://www.hashicorp.com/community-guidelines.html. Behavior in violation of those guidelines may result in your removal from this mailing list.
 
GitHub Issues: https://github.com/hashicorp/consul/issues
IRC: #consul on Freenode
---
You received this message because you are subscribed to the Google Groups "Consul" group.
To unsubscribe from this group and stop receiving emails from it, send an email to consul-tool...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/consul-tool/0e9a633b-7a32-40d6-980e-80de9e614462%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.