Reduce the number of time series

Robson Peixoto

Mar 26, 2017, 6:25:08 PM
to Prometheus Users
Hi!

I opened an issue about an OOM kill (https://github.com/prometheus/prometheus/issues/2525), and Björn Rabenstein showed me that my problem is the number of time series.

Do you have any tips on how to reduce the number of time series?

I'd like to know how to count the number of time series for each metric, to figure out whether there is an instrumentation problem.

I'm using some exporters, like mesos-exporter, just to know whether the process is up or down. Any tips on how to monitor an app and exclude all unused metrics?

Thanks a lot!

Julius Volz

Mar 26, 2017, 6:34:59 PM
to Robson Peixoto, Prometheus Users
Hi,

these two articles will hopefully be useful for you:


Short answer:

  topk(10, count by (__name__) ({__name__=~".+"}))
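
The query above counts series per metric name inside Prometheus. A rough offline counterpart, as a minimal sketch only: the same per-name count can be run against the text output of a single exporter's /metrics endpoint, to see which metric names dominate before they are ingested. The function name and the sample scrape output below are made up for illustration and are not from this thread.

```python
from collections import Counter


def count_series_by_name(exposition_text: str) -> Counter:
    """Count sample lines per metric name in Prometheus text-format output."""
    counts = Counter()
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # The metric name ends at the first "{" (labels) or space (value).
        name = line.split("{", 1)[0].split(" ", 1)[0]
        counts[name] += 1
    return counts


# Hypothetical scrape output, made up for illustration.
SAMPLE = """\
# HELP http_requests_total Total HTTP requests.
# TYPE http_requests_total counter
http_requests_total{method="get",code="200"} 1027
http_requests_total{method="post",code="200"} 3
http_requests_total{method="get",code="500"} 4
# TYPE process_open_fds gauge
process_open_fds 12
"""

if __name__ == "__main__":
    # Print the ten metric names with the most series, like topk(10, ...).
    for name, n in count_series_by_name(SAMPLE).most_common(10):
        print(name, n)
```

Each sample line is one series for that scrape, so histogram `_bucket`, `_sum`, and `_count` lines are counted under their own names, matching how `__name__` is grouped in the query.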

Cheers,
Julius

--
You received this message because you are subscribed to the Google Groups "Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to prometheus-users+unsubscribe@googlegroups.com.
To post to this group, send email to prometheus-users@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/prometheus-users/8e817a0d-0ac5-45ad-b3a2-f335e3936b4b%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Robson Roberto Souza Peixoto

May 19, 2017, 7:37:25 AM
to Prometheus Users
Thanks Julius,

After migrating our apps to another datacenter, I'm back to working on this.
Sorry for the delay in replying.

One more question: how can I configure crashrecovery.go to ignore all metrics that are not indexed?
I'm hitting another OOM kill :(

Thanks.

--
Robson Roberto Souza Peixoto
Robinho
Master in Computer Science, University of Campinas
IRC: robsonpeixoto
Twitter: http://twitter.com/robinhopeixoto
github: https://github.com/robsonpeixoto

Brian Brazil

May 19, 2017, 7:43:44 AM
to Robson Roberto Souza Peixoto, Prometheus Users
Try a newer version of Prometheus; that OOM should have been fixed.

Brian

 
Robson Roberto Souza Peixoto

May 19, 2017, 10:57:38 AM
to Prometheus Users
I hit this problem while upgrading from 1.6.1 to 1.6.3.
Prometheus was working well with:
- sample ingestion of ~15k
- 4 CPUs
- 23 GB of memory
- 358 targets => count(up)
- 424,541 series => sum(count by (__name__)({__name__=~".+"}))
- storage.local.target-heap-size=15032385536 # ~15 GB (14 GiB)

But after the restart, it showed more than 10 million metrics queued for indexing.

I deleted all the data to make it work again.
 
Thanks,
Robinho
