Configuring Prometheus federation


g...@recongate.com

Mar 25, 2018, 10:43:13 AM
to Prometheus Users
Hi,

I am trying to configure Prometheus federation, where one Prometheus server scrapes several Prometheus servers running on Kubernetes clusters.

I have configured a Prometheus pod that collects metrics from the Kubernetes cluster it is running on. When I try to scrape this Prometheus on k8s from my "main" Prometheus, I don't get any metrics, even though the Targets page shows the target Prometheus as up.

This is the prometheus.yml file on my main Prometheus:

# my global config
global:
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
      - targets: ['localhost:9090']

  - job_name: 'federate'
    scrape_interval: 15s

    honor_labels: true
    metrics_path: '/federate'

    params:
      'match[]':
        - '{job="prometheus"}'
        - '{__name__=~"job:.*"}'
        - '{name=~".+"}'

    static_configs:
      - targets:
        - '<< server_name >>:<< server_port >>'

Any idea why I don't get any metrics back from the target Prometheus?

Thanks


g...@recongate.com

Mar 26, 2018, 3:07:55 AM
to Prometheus Users
I added this matcher to match[]: '{job=~".+"}'

However, I still don't see any metrics from the source server.

What am I doing wrong? Do I need to configure anything on the source server?

Akadi Icho

Mar 27, 2018, 9:50:59 AM
to Prometheus Users
Hi, here is a complete configuration for federation between two Prometheus servers.
Configuration file prometheus.yml for the scraping Prometheus (10.95.4.77:9090 in my example):

----------------------------------------------------------Prometheus 1 or Scraping Prometheus-----------------------------------

global:
  # Attach these labels to any time series or alerts when communicating with
  # external systems (federation, remote storage, Alertmanager).

  external_labels:
    monitor: 'My monitor'


scrape_configs:
  - job_name: Myfederation
    honor_labels: true
    metrics_path: /federate
    params:
      match[]:
        - '{__name__=~"job.*"}'
    static_configs:
      - targets:
        - 10.95.4.36:9090 # Source Prometheus IP address
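
One way to check what the source server would hand over is to request its /federate endpoint directly with the same match[] selectors. The Python below is only a sketch under assumptions: the helper names (federate_url, metric_names, fetch_federated_names) are made up for illustration, and the name parsing is a simplification of the text exposition format, not a full parser.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def federate_url(base_url, matchers):
    """Build a /federate URL with one match[] parameter per selector."""
    query = urlencode([("match[]", m) for m in matchers])
    return f"{base_url.rstrip('/')}/federate?{query}"


def metric_names(exposition_text):
    """Return the sorted, distinct metric names in text-format output."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # The metric name ends at the first '{' (labels) or space (value).
        end = len(line)
        for sep in ("{", " "):
            pos = line.find(sep)
            if pos != -1:
                end = min(end, pos)
        names.add(line[:end])
    return sorted(names)


def fetch_federated_names(base_url, matchers, timeout=10):
    """Fetch /federate from a source server and list the metric names returned."""
    with urlopen(federate_url(base_url, matchers), timeout=timeout) as resp:
        return metric_names(resp.read().decode("utf-8"))
```

For example, fetch_federated_names("http://10.95.4.36:9090", ['{__name__=~"job.*"}']) asks the source server from this example for everything the federation job would select. An empty list would suggest that the selector matches nothing on the source, rather than a problem on the scraping side.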

----------------------------------------------------------------------Prometheus 2 or Scraped Prometheus (10.95.4.36:9090)---------
global:
  scrape_interval:     15s # By default, scrape targets every 15 seconds.

  # Attach these labels to any time series or alerts when communicating with
  # external systems (federation, remote storage, Alertmanager).
  external_labels:
    monitor: '10.95.4.36-monitor'
rule_files:
  - 'prometheus.rules.yml'
# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'

    # Override the global default and scrape targets from this job every 5 seconds.
    scrape_interval: 5s

    static_configs:
      - targets: ['10.95.4.36:9090']

--------------prometheus.rules.yml for the Scraped Prometheus (10.95.4.36:9090)-----------------------

groups:
- name: example1
  rules:
  - record: job_service:http_response_size_bytes_count
    expr: avg(rate(http_response_size_bytes_count[2m])) by (job, service)
- name: example2
  rules:
  - record: job_federate:http_request_duration_microseconds_count
    expr: avg(rate(http_request_duration_microseconds_count{handler="federate"}[2m])) by (job, federate)
-----------------------------------------------------------------------------
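
Both record names above start with job, which is why the '{__name__=~"job.*"}' selector in the federation job picks them up. Prometheus anchors regex matchers at both ends, so a selector such as '{__name__=~"job:.*"}' would not match them. The sketch below imitates that anchoring with Python's re.fullmatch. Python's regex dialect is not identical to the RE2 syntax Prometheus uses, so treat it as an approximation for simple patterns; selects is a hypothetical helper name.

```python
import re


def selects(pattern, metric_name):
    """Approximate a fully anchored __name__=~"pattern" matcher."""
    return re.fullmatch(pattern, metric_name) is not None


recorded = [
    "job_service:http_response_size_bytes_count",
    "job_federate:http_request_duration_microseconds_count",
]

for name in recorded:
    # "job.*" accepts any name starting with "job";
    # "job:.*" requires a colon right after "job".
    print(name, selects("job.*", name), selects("job:.*", name))
```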

If you open the scraped Prometheus (10.95.4.36:9090, or localhost:9090 on that host), the metric dropdown list will show two new metrics:
1. job_service:http_response_size_bytes_count
2. job_federate:http_request_duration_microseconds_count

Try executing them on both servers.
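
If you prefer to script that check instead of using the web UI, something like the following could work. It is a rough sketch that assumes both servers expose the standard HTTP API at /api/v1/query; the function names are invented for this example.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def instant_query_url(base_url, expr):
    """Build an instant-query URL for the Prometheus HTTP API."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': expr})}"


def series_count(response_body):
    """Count series in an instant-query JSON response; 0 means nothing matched."""
    payload = json.loads(response_body)
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    return len(payload["data"]["result"])


def compare(servers, expr, timeout=10):
    """Run the same expression on each server and return {server: series count}."""
    counts = {}
    for base in servers:
        with urlopen(instant_query_url(base, expr), timeout=timeout) as resp:
            counts[base] = series_count(resp.read().decode("utf-8"))
    return counts
```

Calling compare(["http://10.95.4.36:9090", "http://10.95.4.77:9090"], "job_service:http_response_size_bytes_count") returns one count per server. A non-zero count on the scraped server but zero on the scraping server would point at the federation job or its match[] selectors.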

    

