How to design a good target label scheme?


Allenzh li

Dec 17, 2020, 9:54:48 PM
to Prometheus Users
Hi

I use Prometheus to monitor services.

At first it was simple. I used four labels (team, product, service, instance) to identify a service, so a metric looked like
```
metric_name{team="t1", product="p1", instance="i1", service="s1", other business label} 
```
I also use these four labels together with other labels (such as cpu="xx") to create alerting rules.

As the business has grown and teams have been reorganized, some teams now have child teams and some services are deployed in different cities. The four labels need to be extended accordingly, and different scenarios may require different numbers of target labels.

Now I am considering two options:

Option 1

I would extend the target labels for each specific scene.
For every scene, I would store a list of target labels in a MySQL database, so different metrics would have different target labels, like
```
metric_name{scene="s1", team="t1", product="p1", instance="i1", service="s1", ...} 
metric_name{scene="s2", team="t2", product="p2", instance="i2", service="s2", city="c2", ...} 
```
Disadvantage: I have to maintain the target labels for every scene.

Option 2

I would follow the CMDB, which uses a tree to organize teams, products, services, cities and the relations between them. The number of tree layers is not fixed, so the target labels would contain only the node ID and the parent ID. Every node has its own metrics. A metric would look like
```
metric_name{leafNodeId="n1", pid="p1", ...}
```
Disadvantage: aggregating data at higher layers is not easy, since a metric does not carry information about all of its parent nodes.
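
To make the aggregation problem concrete, here is a minimal Python sketch of resolving a leaf node's full ancestor path outside Prometheus. The `NODES` table, its node IDs and the `ancestor_labels` helper are hypothetical stand-ins for a CMDB export, not part of any real API.

```python
# Hypothetical CMDB rows: node id -> (parent id or None, kind, name).
NODES = {
    "n0": (None, "team", "t1"),
    "n1": ("n0", "product", "p1"),
    "n2": ("n1", "service", "s1"),
    "n3": ("n2", "city", "c1"),
}


def ancestor_labels(node_id, nodes):
    """Walk parent links from node_id up to the root, collecting kind -> name."""
    labels = {}
    seen = set()
    current = node_id
    while current is not None:
        if current in seen:
            raise ValueError(f"cycle detected at node {current}")
        seen.add(current)
        parent, kind, name = nodes[current]
        # If a kind repeats (e.g. a child team under a team), keep the nearest one.
        labels.setdefault(kind, name)
        current = parent
    return labels


labels = ancestor_labels("n3", NODES)
print(labels)
```

With only `leafNodeId` and `pid` on the metric, this walk cannot be expressed in a single query, which is why the higher-layer information has to be resolved somewhere before or alongside ingestion.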

Are there any other best practices?




Stuart Clark

Dec 18, 2020, 3:16:44 AM
to Allenzh li, Prometheus Users
Rather than attaching the team, etc. labels to every metric, it can work
well to instead have separate metrics with that information, similar to
the machine role metrics described here:


https://www.robustperception.io/how-to-have-labels-for-machine-roles
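
As a rough illustration of that pattern, the Python sketch below renders one sample of a constant-value info metric and builds a PromQL expression that joins its labels onto another metric at query time. The metric name `service_info`, the choice of `instance` as the join label and both helper functions are assumptions made for the example; label values are not escaped here.

```python
def info_metric_line(name: str, labels: dict[str, str]) -> str:
    """Render one sample of a constant-1 info metric in the text exposition format."""
    body = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
    return f"{name}{{{body}}} 1"


def join_expr(metric: str, info: str, on: list[str], bring: list[str]) -> str:
    """Build a PromQL expression that copies the `bring` labels from `info` onto `metric`."""
    return f"{metric} * on({', '.join(on)}) group_left({', '.join(bring)}) {info}"


# Hypothetical info metric carrying the organizational labels for one instance.
line = info_metric_line(
    "service_info",
    {"instance": "i1", "team": "t1", "product": "p1", "service": "s1"},
)
# Multiplying by the constant 1 keeps the original value and adds the labels.
expr = join_expr("metric_name", "service_info", ["instance"], ["team", "product", "service"])

print(line)
print(expr)
```

Because the info metric always has the value 1, the multiplication leaves `metric_name` unchanged, and adding a new organizational label later only means changing the info metric rather than every series.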

Allenzh li

Dec 18, 2020, 5:16:55 AM
to Prometheus Users
Understood.

Thank you very much.

Ben Kochie

Dec 18, 2020, 5:21:40 AM
to Allenzh li, Prometheus Users
Locality information is good to have, but you don't want to go too wild with it.

For example, if you have a large network, where services are running in multiple networks/regions/zones/etc, you will likely have or want multiple Prometheus servers.

In our case, we run Prometheus servers in each cloud provider zone/region, which avoids cross-zone issues.

For this kind of identification, we use external labels, rather than labels on each metric.

We then use Thanos as a top-level query service, which uses those external labels to route queries to the right location.
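
A minimal Python sketch of what per-server external labels could look like: it renders the `global.external_labels` section of a Prometheus configuration file for one server. The label names `region` and `zone`, their values and the helper function are assumptions for illustration only.

```python
def external_labels_yaml(labels: dict[str, str]) -> str:
    """Render the global.external_labels section of prometheus.yml for one server."""
    lines = ["global:", "  external_labels:"]
    # Quote values so that strings such as "1" or "true" stay strings in YAML.
    lines += [f'    {key}: "{value}"' for key, value in sorted(labels.items())]
    return "\n".join(lines)


# Hypothetical labels identifying where this Prometheus server runs.
fragment = external_labels_yaml({"region": "r1", "zone": "z1"})
print(fragment)
```

Each server gets its own values, so the locality is attached once per server instead of being repeated as a target label on every series.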
