Many names, few labels vs. few names, many labels

Jan Algermissen

Oct 6, 2016, 5:23:47 PM

to Prometheus Developers

Hi,

I am wondering what the difference is between having a small set of metric names and differentiating using labels, and differentiating based on name with fewer labels.

Suppose I have

- services A,B,C,D,E,....

- components in these services, e.g. "DB", "QUEUE", "FTP", "ODBC", representing connections to upstream systems

- a list of errors possibly occurring when interacting with these upstreams ("TIMEOUT", "LOGIN_FAILED", "UNKNOWN_HOST", ...)

Now, I could create metric names like these (many names, no labels):

A_QUEUE_TIMEOUT_COUNTER

B_QUEUE_TIMEOUT_COUNTER

C_QUEUE_TIMEOUT_COUNTER

...

F_ODBC_UNKNOWN_HOST_COUNTER

But I could also create names like these (few names, many labels)

TIMEOUT_COUNTER{service=A,component=DB}

In my understanding, in the many-names-few-labels case, I am losing the ability to efficiently aggregate, e.g. all timeouts. Is that correct?
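To make the aggregation point concrete, here is a minimal Python sketch using only the standard library. It models series as plain dictionaries and tuples rather than using any Prometheus client, and the sample values are invented for illustration. With labels, "all timeouts" is a sum over one metric name; with names only, the same total requires pattern matching on the names themselves.

```python
import re

# Hypothetical sample values; the numbers are invented for illustration.
# Scheme 1: many names, no labels.
by_name = {
    "A_QUEUE_TIMEOUT_COUNTER": 3,
    "B_QUEUE_TIMEOUT_COUNTER": 5,
    "A_DB_TIMEOUT_COUNTER": 2,
    "A_DB_LOGIN_FAILED_COUNTER": 7,
}

# Scheme 2: few names, many labels. Each series is (name, labels, value).
by_label = [
    ("TIMEOUT_COUNTER", {"service": "A", "component": "QUEUE"}, 3),
    ("TIMEOUT_COUNTER", {"service": "B", "component": "QUEUE"}, 5),
    ("TIMEOUT_COUNTER", {"service": "A", "component": "DB"}, 2),
    ("LOGIN_FAILED_COUNTER", {"service": "A", "component": "DB"}, 7),
]


def total_timeouts_by_name(series):
    """Sum all timeouts by pattern-matching on metric names."""
    pattern = re.compile(r".+_TIMEOUT_COUNTER$")
    return sum(v for name, v in series.items() if pattern.match(name))


def total_by_label(series, name, **matchers):
    """Sum all series of one metric whose labels match the given values."""
    return sum(
        value
        for metric, labels, value in series
        if metric == name and all(labels.get(k) == v for k, v in matchers.items())
    )


all_timeouts_names = total_timeouts_by_name(by_name)
all_timeouts_labels = total_by_label(by_label, "TIMEOUT_COUNTER")
queue_timeouts = total_by_label(by_label, "TIMEOUT_COUNTER", component="QUEUE")
```

Both schemes give the same overall total, but only the labelled form answers the per-component question without parsing names. In PromQL the rough equivalents would be `sum(TIMEOUT_COUNTER)` for the labelled scheme and a regex match on `__name__`, such as `sum({__name__=~".+_TIMEOUT_COUNTER"})`, for the name-based one.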

On the other hand, I would prefer the latter variant, with a few names and using labels for variations, to be prepared for a diversity of aggregations / views.

My question: is there a performance or other impact of having few metrics with lots of labels?

How can I determine the tipping point? IOW, how do I know I am using too many labels and should rather vary on names?

Or does it make no difference whatsoever, because inside Prometheus any unique combination becomes an equally significant time series anyhow?

(I understand that having an unbounded number of labels is a problem. That is, however, not the issue here: the eventual number of combinations would be exactly the same in my case, either way.)
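The claim that both schemes end up with the same number of combinations can be checked with a small Python sketch. The dimension lists below are shortened, illustrative versions of the ones in the question (which are elided there), and a series is modelled simply as a metric name plus its label pairs, which is how Prometheus identifies a time series.

```python
from itertools import product

# Illustrative dimensions; the original lists are open-ended.
services = ["A", "B", "C"]
components = ["DB", "QUEUE", "FTP", "ODBC"]
errors = ["TIMEOUT", "LOGIN_FAILED", "UNKNOWN_HOST"]

# Scheme 1: one metric name per combination, no labels.
names_only = {
    f"{s}_{c}_{e}_COUNTER" for s, c, e in product(services, components, errors)
}

# Scheme 2: one metric name per error, service and component as labels.
# A series is identified by its name plus its label pairs.
labelled = {
    (f"{e}_COUNTER", (("component", c), ("service", s)))
    for s, c, e in product(services, components, errors)
}

series_names_only = len(names_only)
series_labelled = len(labelled)
distinct_metric_names = len({name for name, _ in labelled})
```

Under these assumptions both schemes produce the same number of series; only the number of distinct metric names differs.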

Jan

Brian Brazil

Oct 6, 2016, 6:38:49 PM

to Jan Algermissen, Prometheus Developers

On 6 October 2016 at 22:23, Jan Algermissen <algermi...@gmail.com> wrote:

> My question: is there a performance or other impact of having few metrics with lots of labels?

They're broadly the same performance-wise.

The main thing to consider is if they all have the same semantics, which usually means you're instrumenting them all via one single common library that acts as a natural chokepoint.
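One way to picture the "single common library as a chokepoint" idea is the following Python sketch. Everything in it is hypothetical: the class name, the allowed values, and the way series keys are built are assumptions for illustration, not a real Prometheus client API. The point is only that when every service records errors through one shared helper, the label names and values keep the same meaning everywhere.

```python
from collections import Counter

# Hypothetical shared vocabulary; values taken from the example in the thread.
ALLOWED_COMPONENTS = {"DB", "QUEUE", "FTP", "ODBC"}
ALLOWED_ERRORS = {"TIMEOUT", "LOGIN_FAILED", "UNKNOWN_HOST"}


class UpstreamErrors:
    """Single place where a service records upstream errors."""

    def __init__(self, service):
        self.service = service
        self.counts = Counter()

    def record(self, component, error):
        # Validate here so all callers share the same label vocabulary.
        if component not in ALLOWED_COMPONENTS:
            raise ValueError(f"unknown component: {component}")
        if error not in ALLOWED_ERRORS:
            raise ValueError(f"unknown error: {error}")
        key = (
            f"{error}_COUNTER",
            ("component", component),
            ("service", self.service),
        )
        self.counts[key] += 1


errors_a = UpstreamErrors("A")
errors_a.record("DB", "TIMEOUT")
errors_a.record("DB", "TIMEOUT")
errors_a.record("QUEUE", "LOGIN_FAILED")
```

If services instead instrumented themselves independently, a label such as `component` could drift in meaning between them, and sharing one metric name would be misleading.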

Brian

--
You received this message because you are subscribed to the Google Groups "Prometheus Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to prometheus-developers+unsub...@googlegroups.com.
To post to this group, send email to prometheus-developers@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/prometheus-developers/3e0e4110-df46-4779-8c72-4936c455b932%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

--

Brian Brazil

www.robustperception.io

Matthias Rampke

Oct 7, 2016, 4:44:45 AM

to Brian Brazil, Jan Algermissen, Prometheus Developers

Just as a data point, SoundCloud started out with "many names, few labels".

A few months ago we (painfully) converted everything using our standard internal framework to be "few names, many labels". We did it to be able to re-use rules files (before, we had many copy-pasted, slightly divergent ones) and have a standard set of Grafana dashboards.

The latter in particular, using templating variables populated from queries, is invaluable to us now. A new service gets a dashboard simply by virtue of there being time series with that label, no configuration required.
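The "dashboard by virtue of there being time series with that label" effect can be sketched in Python as follows. The data and the `label_values` helper are hypothetical stand-ins: in Grafana, a query-backed template variable plays this role, and the function below is plain Python written for illustration, not Grafana's API.

```python
# Hypothetical snapshot of the label sets currently present for one metric.
series = [
    {"service": "A", "component": "DB"},
    {"service": "A", "component": "QUEUE"},
    {"service": "B", "component": "QUEUE"},
]


def label_values(series, label):
    """Return the sorted distinct values of one label across all series."""
    return sorted({labels[label] for labels in series if label in labels})


before = label_values(series, "service")

# A new service starts exporting the same metric with its own label value.
series.append({"service": "C", "component": "FTP"})
after = label_values(series, "service")
```

Because the list of services is derived from the data, the new service shows up in the dashboard selector without anyone editing the dashboard. With one metric name per service, there would be no shared label to enumerate.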

/MR

