Prometheus and rolling update services

Ivan Pohodnya

Apr 4, 2020, 8:53:49 AM
to Prometheus Users


Hello everyone, I have a question about rolling updates of services (Mesos/Marathon orchestration).

When we upgrade a service (or just restart it), two instances of the service can run on the same server simultaneously for a short period. Each instance reports metrics: the new one with reset counters, the old one with its old counter values. Prometheus starts scraping both instances, which results in spikes in the metrics (see attachments).

Is there any solution other than adding a unique ID to each scraped service?
resets_counter.png
request_counter.png

Ivan Pohodnya

Apr 4, 2020, 8:56:48 AM
to Prometheus Users
spike in rates 

On Saturday, April 4, 2020 at 15:53:49 UTC+3, Ivan Pohodnya wrote:
spikes.png

Ben Kochie

Apr 4, 2020, 9:30:41 AM
to Ivan Pohodnya, Prometheus Users
I assume you're hitting the metrics through some kind of load balancer. Prometheus assumes direct access to each instance of an application, rather than through a load balancer.


--
You received this message because you are subscribed to the Google Groups "Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to prometheus-use...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/prometheus-users/f18543da-ffb4-447f-b28b-8f899b3970aa%40googlegroups.com.

Ivan Pohodnya

Apr 4, 2020, 10:03:33 AM
to Prometheus Users
It is marathon_sd_config (https://prometheus.io/docs/prometheus/latest/configuration/configuration/#marathon_sd_config): Prometheus gets the instances from service discovery (Marathon) and calls them directly. I think something is wrong with the Prometheus marathon_sd module.

On Saturday, April 4, 2020 at 16:30:41 UTC+3, Ben Kochie wrote:

Matthias Rampke

Apr 5, 2020, 12:25:49 PM
to Ivan Pohodnya, Prometheus Users
In your scrape configuration, make sure that the instance label distinguishes between the instances. In your case, both seem to have the value "server1". Prometheus needs to see both counters as separate time series to be able to handle the reset.

Keep "server1", or something else that identifies the *group* of instances, in a separate label (or use the job label if suitable) so that you can aggregate over instances at query time.

/MR
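
To illustrate why the two counters need to be separate time series, here is a minimal Python sketch. It is a simplified model of counter-reset handling (any drop in value is treated as a restart from zero), not Prometheus's actual implementation: it ignores timestamps and extrapolation. The function name `increase` and the sample values are made up for illustration.

```python
def increase(samples):
    """Total increase over a list of counter samples.

    Simplified model: any drop in value is treated as a counter reset,
    so the whole current value counts as new increase.
    """
    total = 0.0
    for prev, cur in zip(samples, samples[1:]):
        total += cur - prev if cur >= prev else cur
    return total


# Hypothetical scrapes during a rolling update.
old_instance = [1000, 1010, 1020]  # old process, still running
new_instance = [0, 5, 10]          # new process, counter starts at zero

# Same label set for both processes: samples interleave into one series.
merged = [v for pair in zip(old_instance, new_instance) for v in pair]

# Distinct labels: each series is evaluated on its own, then summed.
separate = increase(old_instance) + increase(new_instance)
merged_increase = increase(merged)

print("separate series:", separate)
print("merged series:  ", merged_increase)
```

In the merged series every alternation between the old and new process looks like a counter reset followed by a huge jump, which is what shows up as spikes. With distinct labels, each series only ever increases and the sum matches the real traffic.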


Ivan Pohodnya

Apr 6, 2020, 6:17:51 PM
to Prometheus Users
Thanks, yes, I think we will use unique labels per app, although it is mostly useless because we have one app instance per host.

On Sunday, April 5, 2020 at 19:25:49 UTC+3, Matthias Rampke wrote: