Hmm, I see your use case, and I agree that restarting the exporter when it runs as the sidecar of a stateful pod is painful. However, everywhere we have introduced live reload (such as in the statsd exporter), it has brought a lot of complexity and potential for bugs that every subsequent change has to take into account.
Can you say more about the use case for turning individual collectors on or off while a pod is running? Under what circumstances do you need to do this? Would it be an on/off for everything at once, or do you need to enable specific collectors for specific pods at specific times?
There's a mechanism to filter collectors at scrape time, i.e. in the Prometheus configuration. I believe you may be able to use that instead of incurring the cost of live reload: configure the exporter with the superset of collectors you might want for a given pod, then use the filter mechanism to request only the metrics that you need.

With this, you can use Prometheus' service discovery (and the prometheus-operator) to change what you actually collect at run time: either change the *Monitor to toggle a collector for all monitored pods, or write relabeling rules that use a pod annotation to decide whether to include or exclude certain collectors. This will probably get very annoying if you are looking for a generic "let me configure every collector differently for every pod" mechanism, but it should be manageable for either "configure the list of collectors for all pods" (change the monitor) or "toggle extra debug collectors for a specific pod" (have a relabeling rule that matches a "debug yes/no" annotation); see the sketches below.
An alternative fallback would be to wrap the exporter in a small binary (or shell script) that reads the list of collectors from a (ConfigMap-mounted) file and restarts the exporter whenever that list changes.
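As a rough sketch of that wrapper (the binary path, ConfigMap mount path, and flag format are all assumptions):

```sh
#!/bin/sh
# Restart the exporter whenever the collector list mounted from a
# ConfigMap changes. Everything here is illustrative.
CONFIG=/etc/exporter/collectors

start_exporter() {
  set --                             # build argv from the file,
  while read -r c; do                # assuming one collector name per line
    set -- "$@" "--collector.$c"
  done < "$CONFIG"
  /usr/bin/exporter "$@" &
  PID=$!
}

start_exporter
last=$(cksum "$CONFIG")

while sleep 10; do
  cur=$(cksum "$CONFIG")
  if [ "$cur" != "$last" ]; then     # ConfigMap updates swap the file atomically
    last=$cur
    kill "$PID" && wait "$PID" 2>/dev/null
    start_exporter
  fi
done
```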
Would either of these solve the use case without incurring the maintenance overhead of live reload capabilities in the exporter?
/MR