RFC: etcd service discovery implementation

David Wragg

Feb 10, 2016, 9:18:36 AM
to prometheus...@googlegroups.com
Hi,

I have written an etcd service discovery implementation. It needs some
further polishing to be submitted as a PR, but it works:

https://github.com/dpw/prometheus/tree/etcd-sd

I noticed that there is a blog post from last year describing how to do
etcd service discovery externally via file_sd_configs
(<https://prometheus.io/blog/2015/08/17/service-discovery-with-etcd/>).
That approach would be a bit awkward in our context, which is why
I implemented direct support instead.

Before proceeding towards a PR, I wanted to check whether it would be
mergeable in principle. Are the core prometheus developers still happy
to extend the set of built-in service discovery implementations?

Thanks,
David

Fabian Reinartz

Feb 10, 2016, 9:36:11 AM
to David Wragg, prometheus...@googlegroups.com
Hi David,

Quite a bit of thought went into whether it makes sense to natively support generic key-value stores in some way.
While they fit very well to do service discovery, they also allow endless different approaches to structure the data.

Your implementation probably fits your use case perfectly. The next user might store data slightly differently and thus cannot use your implementation at all.
So it makes only sense to support widely used/standardized formats (e.g. serversets for Zookeeper) natively.
We are happy to extend native service discovery support, but with lack of standardization it's unfortunately not feasible for generic key-value stores.

Does that make sense?

Out of curiosity: What would've been awkward about the file-based approach for your use case?


Cheers,
Fabian

Brian Brazil

Feb 10, 2016, 9:37:48 AM
to David Wragg, Prometheus Developers
Our understanding is that there is no canonical way to do service discovery with etcd, and thus we can't usefully offer anything as part of Prometheus as making it sufficiently generic wouldn't be sane.

For example, your implementation seems to use the value in etcd as a straight host:port. I presume it's possible that some organisations may have JSON or other data in there instead (such as with the two Zookeeper SDs we have), or have a multi-file approach.

Put another way, to be called the etcd service discovery inside Prometheus we'd need it to be the way that everyone does service discovery with etcd. I lack experience with etcd to know whether that's the case. If this is a widely used and well defined format, we may be able to accept it under a more specific name.
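
To make the format concern concrete, here is a minimal Python sketch of a reader that accepts either a bare host:port string or a JSON object. The function name parse_target_value and the JSON shape with "host" and "port" fields are assumptions made up for illustration; neither etcd nor Prometheus defines them.

```python
import json


def parse_target_value(value):
    """Return (host, port) from a value stored as 'host:port' or as JSON.

    The JSON shape {"host": ..., "port": ...} is a hypothetical example,
    not a format defined by etcd or Prometheus.
    """
    value = value.strip()
    if value.startswith("{"):
        data = json.loads(value)
        return data["host"], int(data["port"])
    # Split on the last colon so the port is always the final component.
    host, _, port = value.rpartition(":")
    if not host or not port.isdigit():
        raise ValueError(f"not a host:port value: {value!r}")
    return host, int(port)


# Both conventions yield the same target, but only because this reader
# happens to know about both of them.
print(parse_target_value("10.0.0.5:9100"))
print(parse_target_value('{"host": "10.0.0.5", "port": 9100}'))
```

Every additional convention (extra fields, several keys per service, and so on) would need another branch like this, which is the genericity problem described above.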

David Wragg

Feb 10, 2016, 12:08:54 PM
to Fabian Reinartz, prometheus...@googlegroups.com
Fabian Reinartz <fab.re...@gmail.com> writes:
> Quite a bit of thought went into whether it makes sense to natively support
> generic key-value stores in some way.
> While they fit very well to do service discovery, they also allow endless
> different approaches to structure the data.
>
> Your implementation probably fits your use case perfectly. The next user
> might store data slightly differently and thus cannot use your
> implementation at all.
> So it makes only sense to support widely used/standardized formats (e.g.
> serversets for Zookeeper) natively.
> We are happy to extend native service discovery support, but with lack of
> standardization it's unfortunately not feasible for generic key-value
> stores.
>
> Does that make sense?

I understand the argument, but it's not one I would agree with.

You are right that an organization might have the relevant service
information in etcd in some in-house format. But by the same token they
might also have the information in a file, but in a different format to
the one supported by file_sd_config. I don't think it is unreasonable
for prometheus to dictate the format of the data in etcd, just as it
does for file_sd_config (or perhaps support a small number of different
conventions, as it does for dns_sd_config).

I don't have a strong view on what the format of the target information
in etcd should be. In our use, prometheus is consuming target
information from an etcd directory that exists just for prometheus; no
other part of the system looks at it, so any format is fine. I'd be
fine with changing the convention to match the adapter code in the blog
post, or a format that supports arbitrary labels as the file_sd_config
format does.

> Out of curiosity: What would've been awkward about the file-based approach
> for your use case?

We're running prometheus in a docker container. Running a separate
container to host the etcd file exporter program would be unwelcome. So
we'd have to derive a container image that runs both prometheus and the
exporter program. Which means adding something like supervisord to the
image.... Maintaining such a thing is not obviously preferable to
maintaining a non-mainline prometheus branch with built-in etcd service
discovery.

In other words, it doesn't matter *how* prometheus implements etcd
service discovery. It matters whether it implements it out of the box.

David

Fabian Reinartz

Feb 10, 2016, 12:33:59 PM
to David Wragg, prometheus...@googlegroups.com
I see what you mean. Your setup works in a different direction from that of most users. Most users have an existing service discovery that Prometheus has to deal with as it is.
The file format is not meant to be a way to do service discovery. It's meant as a plugin mechanism to translate your custom SD to something Prometheus can understand.
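
As a rough illustration of that translation step, the Python sketch below turns a set of discovered entries into the JSON target-group layout that the file-based SD reads (a list of objects with "targets" and optional "labels") and writes the file atomically. The function names, the example keys, and the assumption that each entry is a plain host:port string are all hypothetical; fetching the entries from etcd is deliberately left out.

```python
import json
import os
import tempfile


def to_target_groups(entries, labels=None):
    """Convert {key: "host:port"} entries into one file SD target group."""
    group = {"targets": sorted(entries.values())}
    if labels:
        group["labels"] = dict(labels)
    return [group]


def write_atomically(path, target_groups):
    """Write the JSON to a temp file, then rename it over the real path.

    The rename keeps a reader from ever seeing a half-written file.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        json.dump(target_groups, handle, indent=2)
    os.replace(tmp_path, path)


# Hypothetical entries, as a helper might have read them from a custom SD.
entries = {
    "/prometheus/targets/node-a": "10.0.0.5:9100",
    "/prometheus/targets/node-b": "10.0.0.6:9100",
}
print(json.dumps(to_target_groups(entries, {"job": "node"}), indent=2))
```

A helper along these lines would call write_atomically whenever the discovered set changes.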

You might not have a strong opinion on what the format in etcd should be. But if we say "this is what Prometheus can read from etcd, and this is how you have to do SD with etcd in your whole company", many others will have very strong opinions. We would implicitly attempt to define a standard. This is simply not in scope of Prometheus as a project.

Running small helper components next to the main application is ubiquitous in the Prometheus universe (every exporter has this requirement). Efforts like the Application Container Specification and Kubernetes with their idea of pods indicate that this is a reasonable approach.

Implementing something out of the box cannot require users to bend the rest of their infrastructure to suit our opinion.


Ben Kochie

Feb 11, 2016, 4:02:37 AM
to David Wragg, Fabian Reinartz, Prometheus Developers

> You are right that an organization might have the relevant service
> information in etcd in some in-house format. But by the same token they
> might also have the information in a file, but in a different format to
> the one supported by file_sd_config. I don't think it is unreasonable
> for prometheus to dictate the format of the data in etcd, just as it
> does for file_sd_config (or perhaps support a small number of different
> conventions, as it does for dns_sd_config).

I agree here. Supporting a basic discovery method that is similar to file_sd_config would be a perfectly valid way to support etcd without the need for sidecars. If someone has a more complicated etcd method than this, they're welcome to build a sidecar for their use case.

Another way to put this, instead of implementing etcd support, what if file_sd_config simply supported etcd paths?

Björn Rabenstein

Feb 16, 2016, 6:19:12 AM
to Ben Kochie, David Wragg, Fabian Reinartz, Prometheus Developers
On 11 February 2016 at 10:02, Ben Kochie <sup...@gmail.com> wrote:
> Another way to put this, instead of implementing etcd support, what if
> file_sd_config simply supported etcd paths?

Without having deeper insights into the details, this sounds pretty good to me.

--
Björn Rabenstein, Engineer
http://soundcloud.com/brabenstein

SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany
Managing Director: Alexander Ljung | Incorporated in England & Wales
with Company No. 6343600 | Local Branch Office | AG Charlottenburg |
HRB 110657B

David Wragg

Feb 16, 2016, 6:51:32 AM
to Björn Rabenstein, Ben Kochie, Fabian Reinartz, Prometheus Developers
Hi,

Björn Rabenstein <bjo...@soundcloud.com> writes:
> On 11 February 2016 at 10:02, Ben Kochie <sup...@gmail.com> wrote:
>> Another way to put this, instead of implementing etcd support, what if
>> file_sd_config simply supported etcd paths?
>
> Without having deeper insights into the details, this sounds pretty
> good to me.

It's possible. I think it would still be desirable to have a distinct
etcd_sd_config, rather than overloading file_sd_config. But it could be
made to mirror file_sd_config quite closely.

I think the most likely mode of use would still be to have a single etcd
value per target (in order to avoid compare-and-swap operations, and for
expiry of target information using TTLs). But the file_sd_config
approach would not prevent that.
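
A minimal Python sketch of how one value per target could still produce file_sd_config-style output, assuming (hypothetically) that each etcd value holds a small JSON target group with "targets" and optional "labels". The function name and key layout are illustrative only. Keys that expire through their TTL simply drop out of the input on the next read, so no compare-and-swap on a shared value is needed.

```python
import json


def merge_target_groups(values):
    """Merge per-key JSON target groups, grouping targets by label set.

    `values` maps an etcd key to the JSON string stored under it.
    """
    merged = {}
    for key in sorted(values):
        group = json.loads(values[key])
        # Use the sorted label pairs as a hashable identity for the group.
        label_key = tuple(sorted(group.get("labels", {}).items()))
        merged.setdefault(label_key, set()).update(group["targets"])
    return [
        {"targets": sorted(targets), "labels": dict(label_key)}
        for label_key, targets in sorted(merged.items(), key=lambda item: item[0])
    ]


# Hypothetical directory contents: one key per target.
values = {
    "/prometheus/targets/node-a": '{"targets": ["10.0.0.5:9100"], "labels": {"job": "node"}}',
    "/prometheus/targets/node-b": '{"targets": ["10.0.0.6:9100"], "labels": {"job": "node"}}',
}
print(json.dumps(merge_target_groups(values), indent=2))
```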

I'm not sure this approach will swing the argument, which did not seem
to hinge on such format details. But I'd be happy to resume the
discussion if I'm wrong about that.

David

Fabian Reinartz

Feb 19, 2016, 4:45:15 PM
to David Wragg, Björn Rabenstein, Prometheus Developers
A general purpose etcd config is unfeasible due to reasons described earlier.
Should there ever be an established format, this will change (cf. SD for serversets and nerve).

Reading our file SD format from different backends is already possible.
There's a tool for that called confd.

To me it's pretty clear that the existing native integrations we have come with a significant maintenance burden already. Reimplementing and maintaining our own confd plus configuration entry points for all these backends is simply not feasible if there's a well-working tool already.

If someone writes a corresponding example confd template, we can make it visible in one of our repositories.