Is remote read the right approach for cross-cluster or distributed-environment alert rules?


Rajesh Reddy Nachireddi

May 17, 2020, 5:53:55 AM
to Prometheus Users, Julius Volz
Hi,

Basically, we have a large networking setup with 10k devices. We are ingesting 1M metrics every second from just 20% of the devices, so we run 5 Prometheus instances plus one global Prometheus that uses remote read to handle alert rule evaluations, and a Thanos Querier for visualisation in Grafana.

We have segregated the devices by IP range, assigning a specific range to each Prometheus instance.

So we have one aggregator that pulls data from all the individual Prometheus instances via remote read.
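
Roughly, the aggregator's config looks like this (trimmed sketch; hostnames are placeholders for our shard addresses):

# Global/aggregator Prometheus - illustrative sketch only.
remote_read:
  - url: http://prom-shard-1:9090/api/v1/read
    read_recent: true   # serve recent ranges from the shards too, since the global has no local copy
  - url: http://prom-shard-2:9090/api/v1/read
    read_recent: true
  # ... one entry per shard (5 in total)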

1. Will remote read cause an issue with respect to loading large numbers of time series over the wire every 1 minute?
2. Is it CPU- or memory-intensive?

What is the best design strategy to handle this scale and alerting across devices and metrics?

Regards,

Rajesh

Brian Brazil

May 17, 2020, 6:05:28 AM
to Rajesh Reddy Nachireddi, Prometheus Users, Julius Volz
Remote read is unlikely to be the best approach here: it pulls large amounts of raw data over the network on every evaluation, and all of that has to be buffered up in RAM.

What you want to do here is run as much of the alerting and rules on the scraping Prometheus servers as possible. For things you can't do that way (e.g. 10% of devices are down globally), use federation to pass up an aggregate, such as the total number of devices down in each Prometheus, to the global server and alert on that, roughly as sketched below.
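
For example, something along these lines (untested sketch; the job name, metric name, shard addresses, and threshold are placeholders):

# On each scraping Prometheus - a recording rule that pre-aggregates:
groups:
  - name: device_aggregates
    rules:
      - record: job:devices_down:count
        expr: count(up{job="devices"} == 0)   # note: returns no sample when nothing is down

# On the global Prometheus - federate only the aggregated series:
scrape_configs:
  - job_name: 'federate'
    honor_labels: true
    metrics_path: '/federate'
    params:
      'match[]':
        - '{__name__="job:devices_down:count"}'
    static_configs:
      - targets: ['prom-shard-1:9090', 'prom-shard-2:9090']

# Then alert on the federated aggregate in the global Prometheus:
groups:
  - name: global_device_alerts
    rules:
      - alert: ManyDevicesDown
        expr: sum(job:devices_down:count) > 1000
        for: 10m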

--

Rajesh Reddy Nachireddi

May 17, 2020, 6:19:51 AM
to Brian Brazil, Prometheus Users, Julius Volz
Hi Brian,

Yes, I agree with the above implications. But even with federation, the global server still has to handle all the alert rules and load the federated time series into RAM.

The main issue with networking-device rules is that we don't know in advance when we will need to raise an alert across devices (i.e. across Prometheus shards). In this case, we thought remote read would just be extra overhead on top of scraping and RAM (not too much compared to the actual ingestion). Our main problem has been Prometheus crashing without prior warning, such as an OOM caused by large label values, with no other indication beforehand.

If you have any case studies on handling Prometheus crashes or cross-cluster alerting, please let us know.


Regards
Rajesh

Ben Kochie

May 26, 2020, 2:07:05 AM
to Rajesh Reddy Nachireddi, Prometheus Users, Julius Volz
This is probably a case where you would want to look into Thanos or Cortex to provide a larger aggregation layer on top of multiple Prometheus servers.


Rajesh Reddy Nachireddi

May 27, 2020, 12:46:51 PM
to Ben Kochie, Prometheus Users, Julius Volz
Hi Ben,

Does the latest version of Cortex/Thanos support alerting across multiple Prometheus shards?
The Thanos Ruler didn't seem production-ready for evaluating expressions across Prometheus instances. Do we have any document or blog post about this?

Thanks,
Rajesh

Ben Kochie

May 27, 2020, 2:08:28 PM
to Rajesh Reddy Nachireddi, Prometheus Users, Julius Volz
The Thanos rule component has always supported multiple instances. The documentation warns about the downsides, but that doesn't mean it doesn't work.
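
Roughly, you point it at a query layer and a rules directory, something like this (sketch only; flags are from memory and may differ between Thanos versions, hostnames are placeholders):

thanos rule \
  --data-dir=/var/thanos/rule \
  --rule-file=/etc/thanos/rules/*.yaml \
  --query=thanos-query:9090 \
  --alertmanagers.url=http://alertmanager:9093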

Aliaksandr Valialkin

May 27, 2020, 3:09:13 PM
to Prometheus Users
Also take a look at the following projects:

* Promxy - it allows executing alert rules over multiple Prometheus instances. See these docs for details.
* VictoriaMetrics + vmalert - multiple Prometheus instances can write data into a centralized VictoriaMetrics via the remote_write API, and then vmalert can be used for alerting on top of all the metrics collected in VictoriaMetrics. A rough sketch is below.
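
For example (illustrative sketch; addresses and paths are placeholders):

# prometheus.yml on each shard - forward samples to the central VictoriaMetrics:
remote_write:
  - url: http://victoria-metrics:8428/api/v1/write

# vmalert, evaluating alerting rules against VictoriaMetrics:
vmalert \
  -rule=/etc/vmalert/rules/*.yaml \
  -datasource.url=http://victoria-metrics:8428 \
  -notifier.url=http://alertmanager:9093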



--
Best Regards,

Aliaksandr Valialkin, CTO VictoriaMetrics



Rajesh Reddy Nachireddi

May 28, 2020, 12:29:07 AM
to Aliaksandr Valialkin, Ben Kochie, Prometheus Users
Thanks Aliaksandr and Ben.

Do you have any suggestions for autoscaling Prometheus or VictoriaMetrics?

Currently, we have allocated 200GB of RAM to each Prometheus instance, but we are not sure when it will fill up and hit OOM issues due to high cardinality.

@Ben Kochie - Regarding the ruler in Thanos, is it a production-ready component?

@Aliaksandr Valialkin - Is there any document that discusses the pros and cons of remote_read vs remote_write in Prometheus vs VictoriaMetrics?

Regards,
Rajesh

Ben Kochie

May 28, 2020, 5:12:05 AM
to Rajesh Reddy Nachireddi, Prometheus Users
Typically the recommended way to scale is to shard by failure domain, then by function. For example, we have different Prometheus servers for different networks within our production environment. Within each network we also shard by function: one Prometheus for monitoring databases, one for application metrics, and one catch-all. This allows for better isolation between the teams managing different services. We use Thanos as an aggregation layer on top of these distributed Prometheus servers, roughly as sketched below.
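
Sketch only (sidecar addresses are placeholders, and flag names may differ between Thanos versions): a Thanos Query instance fans out over the sidecars of the per-function Prometheus servers.

thanos query \
  --http-address=0.0.0.0:10902 \
  --store=prom-databases-sidecar:10901 \
  --store=prom-applications-sidecar:10901 \
  --store=prom-catchall-sidecar:10901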

Without more information on your network/monitoring architecture, I can't say specifically how to best shard your setup.

Yes, the Rule service works just fine; we are using it in production.

Rajesh Reddy Nachireddi

May 28, 2020, 5:44:07 AM
to Ben Kochie, Prometheus Users
Thanks Ben.

We have 2 Prometheus environments, one for each function, and each function has one querier.

How do we scale the Prometheus shards within a function?

Example: when we had 1000 devices sending data to 1 shard, it was getting killed with OOM. We increased it to 2 shards, then to 3, 4, 5 based on the number of metrics and the frequency of collection. But the issue is that we don't know when the scale-up has to happen, since we only find out once a pod is killed with OOM. Though we looked at the Robust Perception blog post on RAM sizing, it wasn't much help in predicting this.

If you have any insights about which metrics to consider (memory, CPU, label cardinality, size of metrics, or a hybrid approach), please do let us know.

Regards,

Rajesh