install Prometheus with Prometheus Operator


nina guo

Jun 3, 2021, 3:08:49 AM
to Prometheus Users
Hi,

If using the Prometheus Operator to install Prometheus in a k8s cluster, will the data PV be created automatically or not?

Julius Volz

Jun 3, 2021, 1:40:05 PM
to nina guo, Prometheus Users
Hi Nina,

No, by default, the Prometheus Operator uses an emptyDir for the Prometheus storage, which gets lost when the pod is rescheduled.
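
If you do want the data to survive rescheduling, you can add a storage section with a volumeClaimTemplate to the Prometheus custom resource, and the Operator then creates a PVC per replica via the underlying StatefulSet. A minimal sketch (the storage class name and size are placeholders, adjust them for your cluster):

apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: k8s
spec:
  replicas: 2
  storage:
    volumeClaimTemplate:
      spec:
        storageClassName: standard   # placeholder: use a StorageClass that exists in your cluster
        resources:
          requests:
            storage: 50Gi            # placeholder size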


Regards,
Julius


--
Julius Volz
PromLabs - promlabs.com

nina guo

Jun 4, 2021, 3:24:58 AM
to Prometheus Users
Thank you very much.
If I deploy multiple Prometheus Pods and mount a separate volume to each Pod:
1. If one of the k8s nodes goes down, is there a chance that queries are currently being served by the crashed node, so those queries fail?
2. If multiple Pods are running in the k8s cluster, is there any data inconsistency issue? (They scrape the same targets.)

nina guo

Jun 4, 2021, 5:03:37 AM
to Prometheus Users
I also have a question: does Prometheus have any autoscaling solution?

Julius Volz

Jun 4, 2021, 6:12:02 AM
to nina guo, Prometheus Users
Hi Nina,

If you run multiple HA replicas of Prometheus and one of them becomes unavailable for some reason and you query that broken replica, the queries will indeed fail. You could either load-balance (with dead backend detection) between the replicas to avoid this, or use something like Thanos (https://thanos.io/) to aggregate over multiple HA replicas and merge / deduplicate their data intelligently, even if one of the replicas is dead.

Regarding data consistency: two HA replicas do not talk to each other (in terms of clustering) and just independently scrape the same data, but at slightly different phases, so they will never contain 100% the same data, just conceptually the same. Thus, if you naively load-balance between two HA replicas without any further logic, you will see e.g. your Grafana graphs jump around a tiny bit, depending on which replica you are currently hitting through the load balancer, and when exactly it scraped some target. But other than that, you shouldn't really care; both replicas are "correct", so to speak.
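
As a rough sketch of how that deduplication is usually wired up (the label names here are only examples): each replica gets a distinguishing external label, and the query layer is told to drop that label when merging:

# prometheus.yml on replica "a" (replica "b" would set replica: b)
global:
  external_labels:
    cluster: prod      # example label identifying the HA pair
    replica: a         # differs per replica; the query layer ignores it when deduplicating

# Thanos Query can then merge and deduplicate the two replicas on that label, e.g.:
#   thanos query --query.replica-label=replica \
#     --store=<sidecar-of-replica-a>:10901 --store=<sidecar-of-replica-b>:10901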

For autoscaling on Kubernetes, take a look at the Prometheus Adapter (https://github.com/kubernetes-sigs/prometheus-adapter), which you can use together with the Horizontal Pod Autoscaler to do autoscaling based on Prometheus metrics.
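
For illustration, an HPA driven by a custom metric served through the adapter might look roughly like this (the Deployment name and metric name are made up, and it assumes a matching prometheus-adapter rule exists):

apiVersion: autoscaling/v2beta2        # autoscaling/v2 on newer clusters
kind: HorizontalPodAutoscaler
metadata:
  name: my-app                         # hypothetical workload being scaled (not Prometheus itself)
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Pods
    pods:
      metric:
        name: http_requests_per_second   # hypothetical metric exposed via a prometheus-adapter rule
      target:
        type: AverageValue
        averageValue: "100"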

Regards,
Julius

nina guo

Jun 7, 2021, 4:51:56 AM
to Prometheus Users
Many thanks for your detailed answers Julius.

nina guo

Jun 7, 2021, 5:20:46 AM
to Prometheus Users
I still have a question: is there any conflict between autoscaling and High Availability?
In my understanding, if the solution uses autoscaling, the multiple pods may scrape different metrics, so that situation would not have HA.
But if the solution uses HA, the multiple pods scrape exactly the same metrics, and if autoscaling then kicks in, it will break HA.

nina guo

Jun 7, 2021, 5:33:57 AM
to Prometheus Users
I came up with an idea. The following 2 Pods are scraping the same metrics:
Pod 1 - current load 50m - during the most recent 1m (not sure if that duration makes sense), the load is 55m.
Pod 2 - current load 50m - during the most recent 1m (not sure if that duration makes sense), the load is 55m.

The rule is: if the metric load reaches 50m, autoscaling is triggered.
Pod 1 - current load 50m - recent 1m load is 55m -> a new Pod 1-1 is created to absorb the extra 5m of load
Pod 2 - current load 50m - recent 1m load is 55m -> a new Pod 2-1 is created to absorb the extra 5m of load

So is it feasible for autoscaling to happen on the 2 Pods at the same time?

Stuart Clark

Jun 7, 2021, 5:34:14 AM
to nina guo, Prometheus Users

Autoscaling of what?

Both instances of Prometheus in a pair should be scraping the same
metrics, so there shouldn't be any issues.

--
Stuart Clark

nina guo

Jun 7, 2021, 5:36:40 AM
to Prometheus Users
Autoscaling based on the Prometheus metric load.
For example, if during the most recent minutes the metric is more than 50m, a new Prometheus pod will be started.

Stuart Clark

Jun 7, 2021, 6:05:20 AM
to nina guo, Prometheus Users

So you are wanting to autoscale Prometheus itself? That's not normally needed, as the load should be pretty even across instances (assuming some sort of load balancing or dedup process in front that handles queries), since each instance will be scraping the same targets. If the number of targets/time series increases you might need to increase the memory (so vertical pod autoscaling of the two instances rather than horizontal pod autoscaling). Similarly, for queries it is mostly more memory you would need rather than additional instances - if you are expecting a massive query load, something like Thanos might be a good option.

--
Stuart Clark

Stuart Clark

Jun 7, 2021, 6:08:47 AM
to Prometheus Users
When doing autoscaling (not just of Prometheus, but of anything) you need to ensure that you don't have too many changes happening at once, otherwise you might start rejecting requests (if all instances are restarting at the same time).

This would generally be done via things like pod disruption budgets. For a pair of Prometheus servers I'd not want more than one change at once. For other systems I might go as far as N-1 changes at once.
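
For a Prometheus pair, that could be a PodDisruptionBudget along these lines (the pod label is an assumption, check what your deployment actually sets):

apiVersion: policy/v1            # policy/v1beta1 on clusters older than 1.21
kind: PodDisruptionBudget
metadata:
  name: prometheus-pdb
spec:
  maxUnavailable: 1              # never disrupt both replicas of the pair at once
  selector:
    matchLabels:
      prometheus: k8s            # assumption: a label your Prometheus pods actually carry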

nina guo

Jun 9, 2021, 2:16:53 AM
to Prometheus Users
Thank you very much.

May I ask if there is a way to make multiple Prometheus instances scrape different targets?

Comparing the two solutions, scraping the same targets vs. scraping different targets, which is better?

Stuart Clark

Jun 9, 2021, 2:30:30 AM
to nina guo, Prometheus Users

Yes sharding is a standard solution when wanting to scale Prometheus performance.

The two options are for different use cases and work together. A single Prometheus server can handle a certain number of targets/queries, based on both the number of metrics being scraped and the CPU/memory assigned to that server. Above that level you would look to split your list of targets across multiple servers. It might also make sense to do that splitting for organisational reasons - different servers split by product, service, location, etc., managed by different teams, for example. So you might have a server in location X and two servers in location Y (product A and product B).

You might also have more central servers for global alerting, using federation or a system such as Thanos - not to combine all metrics together (which would be a single point of failure and require massive resources) but to allow for a consolidated view.

Alongside this you would use pairs of Prometheus servers for HA, so that if a single server isn't operating (failure, maintenance, etc.) you don't lose metrics. You might run a system such as promxy or Thanos in front of each pair to handle deduplication. So in the example of 3 groups of Prometheus servers (X, AY & BY) they would actually be HA pairs, so 6 servers in total. If using a system such as Kubernetes, you'd need to ensure that any changes are limited (e.g. via pod disruption budgets) so that the second pod isn't stopped/replaced while the first is out of action.
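
One common way to split the same job across servers, rather than maintaining completely separate target lists, is hashmod relabelling, where each server keeps only its share of the discovered targets. A sketch with two shards (the job and service discovery are just examples):

scrape_configs:
- job_name: node                 # example job
  kubernetes_sd_configs:
  - role: node
  relabel_configs:
  - source_labels: [__address__]
    modulus: 2                   # total number of shards
    target_label: __tmp_hash
    action: hashmod
  - source_labels: [__tmp_hash]
    regex: "0"                   # this server keeps shard 0; the other server uses "1"
    action: keep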

-- 
Stuart Clark

nina guo

Jun 9, 2021, 4:46:49 AM
to Prometheus Users
Thank you very much Stuart : )
For implementing "split your list of targets across multiple servers": currently in our env, multiple jobs share the same ConfigMap. So in order to split the target list, should I create a separate ConfigMap for each instance? I'm not sure if that is the correct way.

Stuart Clark

Jun 9, 2021, 5:20:31 AM
to nina guo, Prometheus Users

Yes, you would need a separate ConfigMap for each set of targets.
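
As a minimal sketch (names and targets are placeholders), each instance would mount its own ConfigMap containing only its share of the scrape configuration:

apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-shard-a         # a second ConfigMap, e.g. prometheus-shard-b, holds the other targets
data:
  prometheus.yml: |
    scrape_configs:
    - job_name: product-a          # example job; only this shard's targets are listed here
      static_configs:
      - targets: ['app-a-1:9100', 'app-a-2:9100']   # placeholder targets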

--
Stuart Clark

nina guo

Jun 10, 2021, 1:36:30 AM
to Prometheus Users
Thank you Stuart.
In my understanding, the usual concept of HA is that there are 2 instances - a master and a slave. The master is running and the slave is on standby; if the master breaks down, the slave takes over.
But for Prometheus, the 2 Prometheus instances run at the same time and scrape the same metrics.
I researched the tool Promxy. It is mainly for aggregating metrics; it will deduplicate and merge them.
I'm a little confused about HA for Prometheus. It is a bit different from the usual concept.

Regarding Alertmanager, I noticed that it has a native HA solution. Does that mean deploying 2 Alertmanagers is OK? If there is an alert, will it be sent twice, once by each of the 2 Alertmanagers?
I'm still not very clear about the concept of HA for Alertmanager.

Could you please help to answer my questions above?

Stuart Clark

Jun 10, 2021, 2:09:59 AM
to nina guo, Prometheus Users

Yes. Prometheus does not run as a cluster or active/passive pair where
they communicate and transfer information between themselves. Instead
you run two totally separate Prometheus servers which maintain their own
(slightly different) view of the world and then query in a way that
prevents duplication.

For Alertmanager the different instances communicate with each other, forming a cluster. Information about alerts is shared, with a full copy of that data held by each instance. Any attached Prometheus server must send its alerts to every member of the Alertmanager cluster. When alerts fire, Alertmanager will ensure a single notification is sent, and it will continue to work even if cluster instances stop operating.
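
A rough sketch of that wiring (addresses are placeholders): the Alertmanager instances peer with each other, and each Prometheus server lists every instance rather than going through a load balancer:

# Each Alertmanager is started pointing at its peers, e.g.:
#   alertmanager --config.file=alertmanager.yml \
#     --cluster.listen-address=0.0.0.0:9094 \
#     --cluster.peer=alertmanager-0:9094 --cluster.peer=alertmanager-1:9094

# prometheus.yml
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      - alertmanager-0:9093        # placeholder addresses
      - alertmanager-1:9093

If you are deploying via the Prometheus Operator, I believe setting replicas on the Alertmanager custom resource wires up this peering for you.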

--
Stuart Clark

nina guo

Jun 10, 2021, 2:53:54 AM
to Prometheus Users

So for Alertmanager, the alerts will be sent to every member of the Alertmanager cluster before firing. Is that correct?
But when the alerts fire, there will only be 1 notification, e.g. to the destination email or Slack?

Stuart Clark

Jun 10, 2021, 2:57:04 AM
to nina guo, Prometheus Users

That is correct.

--
Stuart Clark


nina guo

Jun 10, 2021, 3:22:37 AM
to Prometheus Users
Thank you very much Stuart. : )
We will improve our solution.

nina guo

Jun 10, 2021, 4:51:08 AM
to Prometheus Users
So do we need to do some configuration for Alertmanager, or is HA enabled by default?

Stuart Clark

Jun 10, 2021, 12:38:26 PM
to nina guo, Prometheus Users

Take a look at the documentation: https://github.com/prometheus/alertmanager#high-availability
-- 
Stuart Clark