Potential big performance gain in common/expfmt/TextParser


Antoine Pultier

Jun 23, 2023, 3:35:56 AM
to Prometheus Developers
Hi,

I am parsing a large number of metrics, and I noticed that the Prometheus expfmt.TextParser takes a significant amount of CPU time on my machine.

I also noticed that VictoriaMetrics has an entirely different parsing implementation that is faster on my machine. I have not conducted extensive benchmarking; I'm unsure if I want to. But you can find a small comparison at the end of the email with a small string to parse and a 5MB string full of metrics and labels to parse.

I read both implementations (both are open source under the Apache 2.0 license), and I guess the main difference is the extensive use of strings.IndexByte in the VictoriaMetrics parser. Go provides a fast implementation to look for a byte in a string, which is much faster than scanning and comparing byte by byte (on common CPU architectures).
Example for arm64: https://github.com/golang/go/blob/e45202f2154839f713b603fd6e5f8a8ad8d527e0/src/internal/bytealg/indexbyte_arm64.s
I discovered the existence of such optimisations while reading this article about ripgrep: https://blog.burntsushi.net/ripgrep/#literal-optimizations
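To make the comparison concrete, here is a minimal sketch of the two ways to find a separator in a line. The helper names indexByteLoop and splitSample are hypothetical and come from neither parser; the "name value" line format is a simplification that ignores labels, timestamps and escaping.

```go
package main

import (
	"fmt"
	"strings"
)

// indexByteLoop is the naive approach: compare one byte per iteration.
func indexByteLoop(s string, c byte) int {
	for i := 0; i < len(s); i++ {
		if s[i] == c {
			return i
		}
	}
	return -1
}

// splitSample splits a simplified "name value" line at the first space.
// It uses strings.IndexByte, which dispatches to an optimised routine on
// common architectures. Both results are substrings of line, so no new
// string data is allocated.
func splitSample(line string) (name, value string, ok bool) {
	i := strings.IndexByte(line, ' ')
	if i < 0 {
		return "", "", false
	}
	return line[:i], line[i+1:], true
}

func main() {
	line := "http_requests_total 1027"
	// Both calls return the same index; only the speed differs.
	fmt.Println(indexByteLoop(line, ' ') == strings.IndexByte(line, ' '))
	name, value, ok := splitSample(line)
	fmt.Println(name, value, ok)
}
```

The gain from strings.IndexByte depends on how far away the separator is, so it should matter more for long label values than for short metric names.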

I'm not a Prometheus developer, and I would guess that completely replacing the parser with another one is not on the table, but making some changes to the existing one could be possible.

However, it seems that gaining performance would require significant changes. I'm wondering whether the Prometheus project would welcome substantial changes inside the parser at this point. One change would be to load more data at once, perhaps the whole input into a string in memory like VictoriaMetrics does, which has some implications. Another would be to use strings.IndexByte and slices instead of constructing many strings byte by byte. These changes will probably make the parser less elegant, which may or may not be worth it.
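A rough sketch of what "whole input in memory plus slicing" could look like, under heavy assumptions: the sample type and parseAll function are hypothetical, the value is left unparsed, and lines with timestamps or escaped characters are not handled. This is not the code of either parser, only an illustration of the idea.

```go
package main

import (
	"fmt"
	"strings"
)

// sample is a hypothetical record. Both fields are substrings of the
// input, so building one does not copy any bytes.
type sample struct {
	series string // metric name plus optional {labels}
	value  string // raw value text, left unparsed here
}

// parseAll walks the whole input held in one string. Line ends are found
// with strings.IndexByte and fields are produced by slicing.
// Assumption: no timestamps, so the value is the text after the last space.
func parseAll(data string) []sample {
	var out []sample
	for len(data) > 0 {
		var line string
		if i := strings.IndexByte(data, '\n'); i >= 0 {
			line, data = data[:i], data[i+1:]
		} else {
			line, data = data, ""
		}
		if line == "" || line[0] == '#' {
			continue // skip blank lines and HELP/TYPE comments
		}
		i := strings.LastIndexByte(line, ' ')
		if i < 0 {
			continue // malformed line; a real parser would report an error
		}
		out = append(out, sample{series: line[:i], value: line[i+1:]})
	}
	return out
}

func main() {
	data := "# HELP up Target is up.\nup{job=\"api\"} 1\nnode_load1 0.42\n"
	for _, s := range parseAll(data) {
		fmt.Printf("%s => %s\n", s.series, s.value)
	}
}
```

One implication of this design is that every returned substring keeps the whole input string alive, so the caller has to copy whatever it wants to retain beyond the current pass.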

---
The tiny benchmark:
---
goos: darwin
goarch: arm64
pkg: simple-bench
BenchmarkPrometheusTextParserMinimal-8         416382        2798 ns/op
BenchmarkVictoriaMetricsTextParserMinimal-8   3622894       296.1 ns/op
BenchmarkPrometheusTextParserBig-8                  4   287416010 ns/op
BenchmarkVictoriaMetricsTextParserBig-8           142     8374695 ns/op

Ben Kochie

Jun 23, 2023, 4:07:16 AM
to Antoine Pultier, Prometheus Developers
We've done at least one rewrite of the parser in the past. We do substantial changes to our subsystems all the time. For example, the "stringlabels" changes were a recent substantial change to the internals of in-memory label storage.

The only things we want to avoid are breaking existing users and reducing the correctness of the parser.

--
You received this message because you are subscribed to the Google Groups "Prometheus Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to prometheus-devel...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/prometheus-developers/31a41b4f-cbcb-40c7-9df8-f1deddd15a32n%40googlegroups.com.

Julien Pivotto

Jun 23, 2023, 4:42:38 AM
to Ben Kochie, Antoine Pultier, Prometheus Developers
On 23 Jun 10:07, Ben Kochie wrote:
> We've done at least one rewrite of the parser in the past. We do
> substantial changes to our subsystems all the time. For example, the
> "stringlabels" changes were a recent substantial change to the internals of
> in-memory label storage.
>
> The only things we want to avoid is breaking existing users and reducing
> the correctness of the parser.

Yes, that parser is meant for correctness and is not used in the
Prometheus server itself.

--
Julien Pivotto
@roidelapluie

Bryan Boreham

Jun 23, 2023, 2:07:58 PM
to Prometheus Developers
For completeness, the parsers that Prometheus uses for scraping live here:

Labels are cached by the scraper, so parsing is only done once for the lifetime of each target.
This means that Prometheus is not too sensitive to the performance of parsing.
But there is a request to make this cache optional: https://github.com/prometheus/prometheus/issues/12443
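As a rough illustration of why caching takes parsing off the hot path, here is a toy cache keyed by the raw series text. It is only a sketch: labelCache, get and parseLabels are hypothetical names, the label splitting is deliberately naive (no escaping, no commas inside values), and the real scrape cache is more involved.

```go
package main

import (
	"fmt"
	"strings"
)

// labelCache is a hypothetical cache from raw series text to parsed labels.
type labelCache struct {
	entries map[string][]string
	parses  int // how many times the slow path ran
}

func newLabelCache() *labelCache {
	return &labelCache{entries: make(map[string][]string)}
}

// get returns the cached labels for a series, parsing only on a miss.
func (c *labelCache) get(series string) []string {
	if l, ok := c.entries[series]; ok {
		return l
	}
	c.parses++
	l := parseLabels(series)
	c.entries[series] = l
	return l
}

// parseLabels is a stand-in for the expensive step. It returns the raw
// `key="value"` pairs between the braces, split on commas.
func parseLabels(series string) []string {
	open := strings.IndexByte(series, '{')
	end := strings.LastIndexByte(series, '}')
	if open < 0 || end < open {
		return nil
	}
	body := series[open+1 : end]
	if body == "" {
		return nil
	}
	return strings.Split(body, ",")
}

func main() {
	c := newLabelCache()
	// Simulate three scrapes that expose the same series each time.
	for scrape := 0; scrape < 3; scrape++ {
		c.get(`up{job="api",instance="a:9090"}`)
	}
	fmt.Println("label parses:", c.parses)
}
```

With such a cache, repeated scrapes of an unchanged series only pay for a map lookup, which is why parser speed matters less in that setting.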

Bryan

Antoine Pultier

Jun 26, 2023, 3:07:28 AM
to Bryan Boreham, Prometheus Developers

All right, sorry about the noise. I wasn't aware of the existence of two parsers, and I found the slow but correct one first. I quickly edited my stupid benchmark, and it looks like the lex parser is indeed much faster. It's a bit slower than the VictoriaMetrics one, but probably not by enough to justify a change. And lex parsers are neat.

 

BenchmarkPrometheusTextParserMinimal-8            411118              2733 ns/op
BenchmarkVictoriaMetricsTextParserMinimal-8      4048662               291.1 ns/op
BenchmarkPrometheusLexMinimal-8                  2301423               526.0 ns/op

BenchmarkPrometheusTextParserBig-8                  2158            536188 ns/op
BenchmarkVictoriaMetricsTextParserBig-8           122446              9699 ns/op
BenchmarkPrometheusLexBig-8                        33807             29730 ns/op

 

Thanks for your answers.

 

