data replication


Goverdhan R

May 8, 2016, 6:46:04 PM
to Consul
How does data replication happen in a consul cluster?
Can someone point me to any documentation?

I have set up a cluster, but replication is not happening.
I need to understand how replication works.

David Adams

May 9, 2016, 8:16:14 AM
to consu...@googlegroups.com
You'll need to tell us more about what you have done and what you expect to see that isn't happening.

If you have one or more active servers and all your nodes have joined successfully, then they should all see the same KV database and service catalog. If you have multiple clusters connected through a WAN gossip pool, nothing is replicated across it (except ACLs, which aren't really replicated but are just homed in one datacenter).
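
One way to confirm that the servers really are joined into a single cluster is to ask each one who it thinks the leader is. The sketch below assumes the HTTP API is reachable on port 8500 of every server and uses the /v1/status/leader endpoint. The addresses in SERVERS are placeholders, not values from this thread.

```shell
# Placeholder server addresses (assumption); replace with your own.
SERVERS="${SERVERS:-10.0.0.1 10.0.0.2 10.0.0.3}"

# Print the Raft leader as seen by one server.
# /v1/status/leader returns a JSON string such as "10.0.0.1:8300".
leader_of() {
  curl -s "http://$1:8500/v1/status/leader"
}

# Return 0 only if every server reports the same, non-empty leader.
check_leaders() {
  first=""
  for host in $SERVERS; do
    current="$(leader_of "$host")"
    echo "$host reports leader $current"
    if [ -z "$current" ] || [ "$current" = '""' ]; then
      echo "$host has no leader" >&2
      return 1
    fi
    if [ -z "$first" ]; then
      first="$current"
    elif [ "$current" != "$first" ]; then
      echo "servers disagree about the leader" >&2
      return 1
    fi
  done
}
```

Running check_leaders after sourcing this prints one line per server. If the servers disagree, or one of them has no leader, the catalog and KV data cannot be expected to match.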


--
This mailing list is governed under the HashiCorp Community Guidelines - https://www.hashicorp.com/community-guidelines.html. Behavior in violation of those guidelines may result in your removal from this mailing list.

GitHub Issues: https://github.com/hashicorp/consul/issues
IRC: #consul on Freenode
---
You received this message because you are subscribed to the Google Groups "Consul" group.
To unsubscribe from this group and stop receiving emails from it, send an email to consul-tool...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/consul-tool/62bbd879-b368-4b74-9662-5e0d212c9a90%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Goverdhan R

May 9, 2016, 11:57:51 AM
to Consul
I have a three-server cluster. I started the servers with -bootstrap-expect, and they joined successfully and elected a leader.
All three servers are running in the same datacenter.
I am using consul for service discovery, and registered some services using registrator.

I pointed registrator to one of the servers in my cluster (I tried pointing it to both the leader and a follower), but the service catalog just stays with the server I pointed registrator to. It doesn't get replicated to the other servers. If I bring down the server which has the service catalog, all the data is lost.

I am trying to figure out why data replication isn't happening. Is there some configuration I am missing, or is it an issue with the network?

Below is how I started up the servers:

./consul agent -server \
-data-dir=/data/consul -node=hostname_consul -bind=$IP -ui -client=$IP \
-bootstrap-expect 3 \
-encrypt \
-retry-join $Peer_IP \
-retry-join $Peer_IP

I am using Consul v0.6.4 on RHEL 6.7 kernel 2.6.32

I need to understand how data gets replicated to the other servers in the cluster, so that I can figure out what's wrong in my setup.

David Adams

May 9, 2016, 12:06:14 PM
to consu...@googlegroups.com
Services are registered per-node. If you register service "X" on node 5, then the information that "service X is on node 5" gets replicated through the Consul cluster. But Consul will not assume you are running service X on all your nodes. That's not how Consul is meant to be used. Each node running a service should inform Consul that it is running that service.
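
As a rough illustration of per-node registration, a service can be registered with the agent running on the same node through the /v1/agent/service/register endpoint. This is only a sketch: the AGENT address, the service name "web" and the port 8080 are hypothetical and do not come from this thread.

```shell
# Address of the agent on this node (assumption).
AGENT="${AGENT:-localhost:8500}"

# Build the JSON body for the register call: service name and port.
service_json() {
  printf '{"Name": "%s", "Port": %s}' "$1" "$2"
}

# Register the service with the local agent; the catalog then records
# it as running on this agent's node.
register_service() {
  service_json "$1" "$2" |
    curl -s -X PUT -d @- "http://$AGENT/v1/agent/service/register"
}
```

Calling register_service web 8080 on node 5 is what produces the "service web is on node 5" entry that the servers replicate. Tools such as registrator make an equivalent call on your behalf.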

--
David Adams | Systems Administrator

Goverdhan R

May 9, 2016, 12:40:57 PM
to Consul
Thanks for the response.

Yes, I understand and agree with that.

"service X is on node 5" gets replicated through the Consul cluster -- this is not happening in my case.

My services are running on two different nodes. This information is held only by the one Consul server to which I am pointing registrator; it's not being passed to the other Consul servers in the cluster.

I am trying to find out how replication actually happens, so that I can see what the issue is with my setup.

David Adams

May 9, 2016, 12:49:34 PM
to consu...@googlegroups.com
Can you post what commands you are running, and their output?

Goverdhan R

May 9, 2016, 1:13:56 PM
to Consul
Commands to start up the consul servers?

David Adams

May 9, 2016, 1:59:05 PM
to consu...@googlegroups.com
The commands you are running to determine that "This information is being held with only one of the consul servers to which I am pointing the registrator, it's not passing the information to the other consul servers in the cluster."

How are you deciding that is the case?

Goverdhan R

May 9, 2016, 2:04:43 PM
to Consul
Below is the command I used to start up.


./consul agent -server \
-data-dir=/data/consul -node=hostname_consul -bind=$IP -ui -client=$IP \
-bootstrap-expect 3 \
-encrypt \
-retry-join $Peer_IP \
-retry-join $Peer_IP

I have my services running in docker containers, and I am using gliderlabs/registrator to register the services with consul.

I am using the below command to run registrator container.

docker run -d -h 'hostname' --name='hostname'_registrator --volume=/var/run/docker.sock:/tmp/docker.sock gliderlabs/registrator:latest -cleanup consul://$host:8500

I have registrators running on two nodes where I have multiple services running.

I am pointing registrator only to the leader, so that I can make sure the service catalog gets replicated across the cluster.

The services get registered with the leader, and the leader knows what nodes they are running on. But it holds that information within itself; it's not replicating it to the follower servers.

I did check the data directory on the file system, and the followers are not in sync with the leader.

I tried pointing the registrators to one of the followers. In this case too, the service catalog stays with the follower server to which I pointed the registrator; it's not replicating to the leader or the other follower.

When I kill the consul server which has the service information, the cluster loses all the data.

When I look up consul info for each of the servers, this is the raft section for the leader:

raft:
applied_index = 80
commit_index = 80
fsm_pending = 0
last_contact = never
last_log_index = 80
last_log_term = 1
last_snapshot_index = 0
last_snapshot_term = 0
num_peers = 2
state = Leader
term = 1

This is for the follower where I loaded the data:

raft:
applied_index = 85
commit_index = 85
fsm_pending = 0
last_contact = 44.081724ms
last_log_index = 85
last_log_term = 1
last_snapshot_index = 0
last_snapshot_term = 0
num_peers = 2
state = Follower
term = 1


The applied index is 85 for the follower and 80 for the leader. So they are not in sync.
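
For comparing these numbers across servers, it can help to pull single fields out of the consul info output instead of reading them by eye. The helper below is a small sketch that reads consul info output on stdin and prints one field from the raft section. The function name raft_field is made up for this example.

```shell
# Print the value of one "key = value" line from the raft section of
# `consul info` output read on stdin.
raft_field() {
  awk -v key="$1" '
    /^raft:/     { in_raft = 1; next }   # entering the raft section
    /^[a-z_]+:/  { in_raft = 0 }         # any other section header ends it
    in_raft && $1 == key { print $3; exit }
  '
}
```

For example, consul info | raft_field commit_index prints just the commit index. On a 0.6.x agent started with -client=$IP, the CLI may need -rpc-addr=$IP:8400 to reach the agent. The indexes advance with every write, so readings taken on different servers at different moments can differ simply because new entries were committed in between. Sampling all servers as close together as possible gives a fairer comparison.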

Goverdhan R

May 9, 2016, 2:18:29 PM
to Consul
I am checking that through the UI. When I kill the server which has the service catalog, the cluster loses all the information.

Also, the data directories are not in sync. The services directory inside the consul data-dir is created only on the server to which I point the registrator. The other two servers in the cluster don't have the services directory at all.

Darron Froese

May 9, 2016, 2:32:26 PM
to Consul
Goverdhan,

This actually sounds like it's operating like it should - I think there's some confusion as to how the replication works.

First off - if you have a cluster and your cluster has quorum - then everything is "replicated" and all services should be visible from all nodes.

BUT if you register services on a particular node and then kill that node - those services will disappear from all nodes.

Services have to be active on a node to be visible in your cluster - you can't kill the node where you registered it.

The best way to get this working and reliable:

1. Set up your 3 Consul server nodes - once they're set up and working, leave them alone.
2. Add Consul agent nodes and run Registrator on each of them. When your services come up - register them with the local agent.
3. Then you should be able to see the service listed on all nodes.

You can verify that services are listed on all nodes by using a tool like consul-cli https://github.com/CiscoCloud/consul-cli to view the catalog. I think the command you'll want is:

consul-cli catalog services

Also - don't bother looking at the data_dir to "verify" whether or not things are replicated - that's not a good way to do it.
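
If consul-cli is not installed, the same catalog check can be approximated with curl against the /v1/catalog/services endpoint on each server. This is a sketch that assumes the HTTP API listens on port 8500. The addresses in SERVERS are placeholders.

```shell
# Placeholder server addresses (assumption); replace with your own.
SERVERS="${SERVERS:-10.0.0.1 10.0.0.2 10.0.0.3}"

# Fetch the service catalog as seen by one server.
catalog_services() {
  curl -s "http://$1:8500/v1/catalog/services"
}

# Print one "host: catalog" line per server for side-by-side comparison.
list_catalogs() {
  for host in $SERVERS; do
    printf '%s: %s\n' "$host" "$(catalog_services "$host")"
  done
}
```

In a healthy cluster, list_catalogs should show the same set of services on every line, regardless of which server the registration went through.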

You can also test if your cluster works by adding an item to the KV store - then checking for the same value on other nodes:

Set it on one node:

consul-cli kv write testing "This is the value"

Then read on the others:

consul-cli kv read testing
This is the value

If your cluster doesn't work correctly - you won't be able to do that. If it does - they should work fine.
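
The same write-then-read test can be sketched with curl against the /v1/kv endpoint, assuming the HTTP API is on port 8500. The helper names and the host arguments are placeholders for this example.

```shell
# Write a value to the KV store through one server.
# Usage: kv_put HOST KEY VALUE
kv_put() {
  curl -s -X PUT -d "$3" "http://$1:8500/v1/kv/$2"
}

# Read the raw value of a key through another server.
# Usage: kv_get HOST KEY
kv_get() {
  curl -s "http://$1:8500/v1/kv/$2?raw"
}
```

Writing with kv_put 10.0.0.1 testing "This is the value" and then reading with kv_get 10.0.0.2 testing should return the value that was written, provided the two servers are part of the same working cluster.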

Does that help at all?

Goverdhan R

May 9, 2016, 3:44:48 PM
to Consul
Thanks Darron, let me set this up and try.

I didn't think I needed agents, as my services were supposed to run on only three nodes, so I decided to just have a three-server cluster with one server on each of those nodes.

I will add agents and register the services with the local agents and see how it works.

Thanks

Goverdhan R

May 9, 2016, 4:00:41 PM
to Consul
And to clarify, you mean consul client when you say local agent, right?

Thanks

Darron Froese

May 9, 2016, 4:05:02 PM
to Consul
Goverdhan,

And yes - consul agent / consul client - they're the same thing to me - they just need to NOT be server nodes.

This way - they're properly disposable and connected to the service that you register.

Goverdhan R

May 10, 2016, 2:36:05 PM
to Consul
Hi Darron

I did set up a 3 server cluster and a consul client on a 4th node.

I added a test kv pair, and it gets replicated across the cluster.

But the behaviour with services remains the same.
I think I was not clear about how replication/service discovery works in a cluster.

Let me explain how I understand it now, and correct me if I am wrong.

1. Each node hosting a service should have at least a consul client or consul server running.
2. All the services running on any particular host must be registered only with the local consul client/server.
3. In case the consul client/server on any node/host goes down, it's going to deregister all the services running on that node.
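
To see which nodes a given service is tied to, the /v1/catalog/service/<name> endpoint can be queried through any agent. The sketch below assumes the compact JSON that the endpoint returns by default and pulls out the node names with grep and sed. AGENT and the helper name are placeholders.

```shell
# Address of any agent in the cluster (assumption).
AGENT="${AGENT:-localhost:8500}"

# Print the name of every node that has the given service registered.
# Usage: service_nodes SERVICE_NAME
service_nodes() {
  curl -s "http://$AGENT/v1/catalog/service/$1" |
    grep -o '"Node":"[^"]*"' |
    sed 's/"Node":"\(.*\)"/\1/'
}
```

Each registered instance appears under the node whose agent registered it, which is why a service follows the fate of that node's agent.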

Thanks

Darron Froese

May 10, 2016, 6:39:18 PM
to Consul
Goverdhan,

You're understanding it and 1-3 sound correct.

Does that help make things clearer?

Goverdhan R

May 10, 2016, 9:57:02 PM
to Consul
Yes, Thanks a lot for the help!