Hi, as you might infer from the PRs I posted recently, I have been looking at the implementation of labels.Labels. I wanted to post a comprehensive design document, but it's been a few months and I still haven't finished it, so I thought I would briefly describe the work here.
Motivation: when I look at profiles of Prometheus and Cortex(ish) systems, labels account for 30-40% of Go heap memory. This translates directly into cloud costs or, if you have a machine of a fixed size, into how many series it can handle.
Brief accounting: labels.Labels is a slice, so it takes up a 24-byte slice header, plus a 16-byte string header for each name and each value, plus the contents of the strings.
Let's take as an example a series with 10 name/value pairs, where each name and value string averages 10 bytes long. Fundamentally the strings need 200 bytes, but labels.Labels takes up 24 + 10 * 2 * (16 + 10) = 544 bytes.
That's about 170% overhead: 344 extra bytes on top of the 200 bytes of string data.
I have some ideas about what to replace this with, but don't want to get into that detail right now.
I will observe that the slice nature of labels.Labels is hard-coded in about 1,000 places in the Prometheus code. I have done all the work to replace those with an abstraction: all tests pass, and benchmarks run at broadly the same speed.
If there is interest I will make a PR containing just the above change, with no change yet to the structure of labels.Labels itself.
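To illustrate what replacing direct slice access with an abstraction might look like, here is a minimal sketch. It is not the actual change: the method names `Len`, `Get` and `Range` and the constructor `New` are assumptions chosen for this example. The point is only that callers go through methods rather than indexing or ranging over the slice, so the representation behind them could change later.

```go
package main

import "fmt"

// Label is a stand-in for a name/value pair.
type Label struct {
	Name, Value string
}

// Labels hides its representation behind methods (hypothetical API).
// Today it wraps a slice; callers never see that.
type Labels struct {
	ls []Label
}

// New copies the given labels so callers cannot alias the internal slice.
func New(ls ...Label) Labels {
	cp := make([]Label, len(ls))
	copy(cp, ls)
	return Labels{ls: cp}
}

// Len returns the number of name/value pairs.
func (l Labels) Len() int { return len(l.ls) }

// Get returns the value for name, or "" if name is not present.
func (l Labels) Get(name string) string {
	for _, lb := range l.ls {
		if lb.Name == name {
			return lb.Value
		}
	}
	return ""
}

// Range calls f for each pair in order, replacing "for _, l := range labels".
func (l Labels) Range(f func(Label)) {
	for _, lb := range l.ls {
		f(lb)
	}
}

func main() {
	ls := New(Label{"__name__", "up"}, Label{"job", "node"})
	ls.Range(func(l Label) {
		fmt.Printf("%s=%q\n", l.Name, l.Value)
	})
	fmt.Println(ls.Len(), ls.Get("job"))
}
```

Code written against methods like these does not need to change if the internal slice is later swapped for a more compact encoding.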
Regards,
Bryan Boreham