CONSUL Road map

XavM

May 15, 2014, 6:05:28 AM

to consu...@googlegroups.com

Hello,

Could you give some hints about the road map for Consul?

We have read that:

- v0.3 could introduce some "leader election support" for services

- Handlers, events and query (similar to Serf) could be introduced later

- envconsul has just been announced

What other things do you have in mind?

What could the timing be for all this?

Regards,

Xavier

Dr Nic Williams

May 15, 2014, 11:11:36 AM

to XavM, consu...@googlegroups.com

Just last night I started pondering what my options were around Consul and/or Serf to manage leader/slave elections and triggering scripts. The event handler style from Serf looked interesting, except for a service tracking its cluster and knowing whether it has been made the leader.


Mitchell Hashimoto

May 15, 2014, 12:07:42 PM

to Dr Nic Williams, XavM, consu...@googlegroups.com

Xavier,

Besides what we say publicly, we don't like to make public promises of
features unless they're already in the pipeline.

As you bullet pointed out, we have some locking primitives coming in
0.3 to help facilitate leader election and some other use cases. The
primitives we're working on are very powerful and we're excited to see
how they're used. It has taken us some time to work on these though
since it has required quite a bit of research for what the tradeoffs
are in a distributed system with regards to locks.

If you have a specific feature request, I'd be happy to comment on it.

Best,
Mitchell

XavM

May 15, 2014, 7:50:48 PM

to consu...@googlegroups.com, Dr Nic Williams, XavM

Thank you for your answer, Mitchell.

My question about the road map is not really about one specific feature request, but more about where you plan to go, when, and how (and where you know you will not go).

That being said, the specific feature request I have would be to expose in some way the event handler and query interface available in Serf.

I know about the blocking queries already available, but as far as I have seen, they work pretty well for KV but are not so friendly for catalog and health:

"X-Consul-Index" just keeps incrementing every second or so, even when no node or service has been "changed".

Those idempotent writes make blocking queries not very useful for detecting changes.
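One client-side workaround for the issue described above can be sketched as follows: compare the content of each blocking-query response with the previous one, and only react when it differs. This is a minimal sketch, not Consul's own behavior. The `fetch` callable is hypothetical; it stands in for one blocking HTTP request that passes the last seen index as the `index` query parameter and returns the new `X-Consul-Index` value together with the decoded JSON body.

```python
import hashlib
import json
from typing import Any, Callable, Iterator, Tuple

# Hypothetical signature: fetch(index) performs one blocking query and
# returns (new_index, decoded_json_body).
Fetch = Callable[[int], Tuple[int, Any]]


def fingerprint(body: Any) -> str:
    """Return a stable hash of a decoded JSON body, independent of key order."""
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def real_changes(fetch: Fetch, max_polls: int) -> Iterator[Any]:
    """Yield a response body only when its content differs from the last one.

    The index still advances on every poll, so idempotent writes wake the
    loop up, but they do not produce an event for the caller.
    """
    index = 0
    last = None
    for _ in range(max_polls):
        index, body = fetch(index)
        current = fingerprint(body)
        if current != last:
            last = current
            yield body
```

The first poll always yields, since there is nothing to compare against yet. The cost of this approach is that every wake-up still transfers and hashes the full response.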

Another concern is the distinction that surfaces between KV on the one hand and "nodes + services" on the other.

Both are really useful, but I still don't see how they will be "glued" together.

If they are not, I feel that we could end up with one great product (Consul), but two distinct functionalities and workflows (KV vs. n+s).

Do you plan to allow services and nodes to be bound to a subset of keys?

This would make it possible to discover services, including the node:port pairs they are deployed on and their associated states, and to discover their configuration as well.

Ex:

cat ${data_dir}/myService.json

"service": {
  "name": "myService",
  "kv": "/services/common/conf", // KV could be generic for the service
  "tags": [ // or they could be specific to a tag, overriding the generic
    {
      "name": "A",
      "kv": [ "/services/common/A/conf/", "/services/myService/A/conf/" ]
    }
  ],
  ...

GET /v1/catalog/service/myService?tag=A

[
  {
    "Node": "node1",
    "Address": "10.0.0.1",
    "ServiceID": "myService",
    "ServiceName": "myService",
    "ServiceTags": [
      "A"
    ],
    "ServicePort": 80,
    "keys": [
      "user=appUser", // This kv comes from "/services/common/conf"
      "apiVersion=xxxx", // This one comes from "/services/common/A/conf"
      "timeOut=3ms", // This one comes from "/services/myService/A/conf"
    ]
  },
  ...

I do not claim this example is the right way to do it; I am just wondering out loud how KV and n+s could be tightly integrated.
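The override behavior in the example above (generic keys first, then tag-specific keys replacing them) could be sketched roughly like this. It is only an illustration of the proposal, not an existing Consul feature. The function name `resolve_keys` and the flat dictionary standing in for the KV store are assumptions made for the sketch.

```python
from typing import Dict, List


def resolve_keys(kv: Dict[str, str], prefixes: List[str]) -> Dict[str, str]:
    """Merge the keys found under each prefix, in order.

    Later prefixes override earlier ones, so the generic configuration
    should be listed first and the most specific one last.
    """
    merged: Dict[str, str] = {}
    for prefix in prefixes:
        base = prefix.rstrip("/") + "/"
        for path, value in kv.items():
            if path.startswith(base):
                # Keep only the part of the path below the prefix.
                merged[path[len(base):]] = value
    return merged
```

For instance, with `timeOut` defined under both `/services/common/conf` and `/services/myService/A/conf`, listing the prefixes in that order makes the service-specific value win.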

Let's pretend that we are in a perfect world. What I would love to have:

Any change triggers an event, with a change being any of the following:

- A service is registered or deregistered

- Service tags have changed

- The pool of underlying nodes that expose this service and/or tag.service has changed (new nodes, fewer nodes, failing checks, etc.)

- Some of the KV entries associated with this service or tag.service have changed (CUD)

One more question: do you plan to implement KV store replication between datacenters?

Anyway, congrats on the great job you have already done with all the awesome HashiProducts.

Regards,

Xavier

Brian Lalor

May 15, 2014, 9:34:58 PM

to consu...@googlegroups.com

On May 15, 2014, at 7:50 PM, XavM <mail...@gmail.com> wrote:

I know about the blocking queries already available, but as far as I have seen, they work pretty well for kv, but are not so friendly for catalog and health :

"X-Consul-Index" just keep incrementing every second or so, even when no node or service has been “changed"

I just wanted to address this separately. I also had this problem, but it was because the check output was different on every run. I’ve modified my checks (I’m using the TTL style) so that the notes/output come from a fairly small, discrete set of values. For example, I initially included the time taken to load an HTTP health check for one of my servers in the output, something like “request took 12ms”. Well, that’s pretty variable (anywhere between 5 and 100ms, say), and each time Consul saw that the output had changed it’d trigger an update. Instead I just set the notes to “ok” (or don’t set them at all). It’s less useful, but it cuts down on the number of events my monitoring system needs to handle.
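The idea above, reporting a note from a small fixed set instead of a raw measurement, might look like the sketch below. The function name `classify` and both thresholds are made up for illustration; only the status names passing, warning and critical come from Consul.

```python
from typing import Tuple


def classify(latency_ms: float,
             warn_ms: float = 100.0,
             crit_ms: float = 1000.0) -> Tuple[str, str]:
    """Map a measured latency to a (status, note) pair.

    The note is drawn from a fixed set of strings, so two healthy runs
    with different latencies report identical output. The thresholds are
    hypothetical defaults, not values from the thread.
    """
    if latency_ms >= crit_ms:
        return ("critical", "too slow")
    if latency_ms >= warn_ms:
        return ("warning", "slow")
    return ("passing", "ok")
```

With this, a 12 ms run and an 87 ms run both report "ok", so the stored output does not change between them.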

--

Brian Lalor

bla...@bravo5.org

Armon Dadgar

May 15, 2014, 9:35:07 PM

to XavM, consu...@googlegroups.com, XavM, Dr Nic Williams

Hey,

Sorry, I’ve been a bit delayed to this thread. At this point, the public roadmap for Consul is the following:

0.2.X:
- UI improvements
- Bug fix release
- Expected in the next few weeks

0.3:
- Experimental support for locking / leader election
- DNS performance knobs (TTL + Stale reads)
- Expected in the next few weeks

0.4:
- Support for handler system (à la Serf)
- Potentially exposing some of the Serf features (Event/Query)
- Refine locking / leader election
- Expected probably several weeks after 0.3

In terms of some of the other questions raised, here is a short list:

* KV Replication: I plan on releasing a daemon in the next few weeks that operates independently to do this. It will take a source + destination DC with a key prefix, and replicate from the source to the destination. This will allow you to specify a particular DC as authoritative for a key space and replicate master/slave style to other DCs. There are no plans to support master-master replication, as that is a huge can of worms.

* Integration of Catalog+KV data: I admit a design flaw on my part with this. If I could do it again, all Consul data would be exposed over a “/proc”-like file system, where some keys are just magically populated while others are standard file-like entries. I don’t think it’s too late, and a v2 API could introduce a lot more unification in how data is exposed.

* Service Configs: It seems we need a stronger convention around this. We use envconsul with some conventions internally, but I can see the use for tighter integration with some conventions on use. I sent an email to the list about this, and would love to get feedback before committing to anything. I do, however, think a 0.3 or 0.4 could introduce better support for this.
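To make the KV replication item concrete, here is a minimal sketch of one-way, prefix-scoped replication from an authoritative source to a destination. It is not the daemon mentioned above. Plain dictionaries stand in for the KV stores of the two datacenters, the name `replicate_prefix` is hypothetical, and mirroring deletions is an assumption of this sketch rather than something stated in the thread.

```python
from typing import Dict


def replicate_prefix(source: Dict[str, str],
                     dest: Dict[str, str],
                     prefix: str) -> int:
    """Make dest match source for every key under prefix.

    Keys outside the prefix are never touched. Returns the number of
    writes and deletes applied to dest.
    """
    changes = 0
    wanted = {k: v for k, v in source.items() if k.startswith(prefix)}

    # Copy new or modified keys from the authoritative side.
    for key, value in wanted.items():
        if dest.get(key) != value:
            dest[key] = value
            changes += 1

    # Assumption: keys removed from the source are removed downstream too.
    stale = [k for k in dest if k.startswith(prefix) and k not in wanted]
    for key in stale:
        del dest[key]
        changes += 1

    return changes
```

Because the destination is only written when a value differs, running the function again on an already synchronized pair performs no writes.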

Lastly, Xavier, the X-Consul-Index does not auto-increment, so there must be lots of idempotent writes happening in your use case. Probably worth starting a thread about that, since it shouldn’t be the case.

Best Regards,

Armon Dadgar

May 15, 2014, 9:37:29 PM

to consu...@googlegroups.com, Brian Lalor

Glad you brought this up! We were just talking about this. So I agree that you want to be able to provide verbose output and you especially care when a check transitions from passing -> critical or any other transition.

However, what we are thinking is a flag that controls how often “Output” is updated on the servers if the state is quiescent. As an example, if the check remains in the passing state, only update the output every 5 minutes. This way, you can have the verbose output you want, and when a check transitions you get that output immediately, but for a stable check you get relatively up-to-date output.

Thoughts?
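The rule described above could be sketched as follows: sync immediately on any state transition, otherwise sync the output at most once per interval. This is only an illustration of the idea under discussion, not Consul's implementation. The class name `OutputThrottle` and its method are hypothetical, and the 300-second default mirrors the "every 5 minutes" example.

```python
from typing import Optional


class OutputThrottle:
    """Decide whether a check result should be pushed to the servers."""

    def __init__(self, min_interval_s: float = 300.0) -> None:
        self.min_interval_s = min_interval_s
        self._status: Optional[str] = None
        self._last_sync: Optional[float] = None

    def should_sync(self, status: str, now: float) -> bool:
        """Return True on a state transition or when the interval elapsed.

        `now` is a timestamp in seconds, passed in by the caller so the
        logic stays deterministic.
        """
        changed = status != self._status
        due = (self._last_sync is None
               or now - self._last_sync >= self.min_interval_s)
        if changed or due:
            self._status = status
            self._last_sync = now
            return True
        return False
```

A check that stays passing would then update its output once per interval, while a passing -> critical transition goes through right away.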

Best Regards,

Armon Dadgar

Brian Lalor

May 15, 2014, 9:57:27 PM

to Armon Dadgar, consu...@googlegroups.com

On May 15, 2014, at 9:37 PM, Armon Dadgar <armon....@gmail.com> wrote:

Glad you brought this up! We were just talking about this.

I know. I quoted part of Xav’s message. ;-)

However, what we are thinking is a flag that controls how often “Output” is updated on the servers
if the state is quiescent. As an example, if the check remains in the passing state, only update
the output every 5 minutes. This way, you can have the verbose output you want, and when a check
transitions you get that output immediately, but for a stable check you get relatively up-to-date output.

I think that seems reasonable. As much as I’d like to be able to show up-to-the-moment information on my monitoring dashboard, the fact of the matter is that processing results for 10,000 checks takes some seconds to execute. It’s not reasonable to update the data in real time like that. But I think this behavior should be laid out in the docs, as even ping checks and disk free reports can be quite variable.

Even better would be if there were a way to have Consul return what changed when a blocking query returns, rather than the current state of, say, a health check. As it is, whenever a service changes state, I have to poll all the health states to determine the new state (critical → passing? passing → warning?). I’m up to about 10,000 checks right now, so that’s a fair bit for my monitoring system to ingest. What would *really* be useful would be something like /v1/health/state/changes that blocks and returns the state changes.
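Until something like that exists, the comparison has to happen on the client. A rough sketch of that step is below: given two snapshots that map a check ID to its status, it returns only the transitions. The function name `state_changes` and the snapshot shape are assumptions made for the sketch; building the snapshots from the health endpoints is left out.

```python
from typing import Dict, List, Optional, Tuple

Change = Tuple[str, Optional[str], Optional[str]]


def state_changes(old: Dict[str, str], new: Dict[str, str]) -> List[Change]:
    """Return (check_id, before, after) for every check whose status differs.

    A status of None means the check was absent from that snapshot, which
    covers newly registered and deregistered checks.
    """
    changes: List[Change] = []
    for check_id in sorted(set(old) | set(new)):
        before = old.get(check_id)
        after = new.get(check_id)
        if before != after:
            changes.append((check_id, before, after))
    return changes
```

This still requires fetching the full state on every wake-up; it only reduces what the monitoring system has to process afterwards.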

--

Brian Lalor

bla...@bravo5.org
