No Targets on Prometheus UI


M McGuinness

Mar 6, 2018, 6:06:42 AM
to Prometheus Users
Hi 

Could you please help with my Prometheus configuration? I am trying to monitor a Kubernetes cluster; I can connect to the cluster OK, but when I use the YAML file below, no targets show up in the Prometheus UI. I also noticed that when I run the following command to get the pods, I get:

./kubectl get pods -n monitoring

NAME                                     READY     STATUS    RESTARTS   AGE
prometheus-deployment-5cfdf8f756-mpctk   1/1       Running   0          4d


Could the name of my pod be causing the issue? Also, I noticed the example config file below does not include the Alertmanager details, but I thought the UI should pick up some of the metrics running on the cluster automatically?


Thanks in advance for any help!!


apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-server-conf
  labels:
    name: prometheus-server-conf
  namespace: monitoring
data:
  prometheus.yml: |-
    global:
      scrape_interval: 5s
      evaluation_interval: 5s

    scrape_configs:
      - job_name: 'kubernetes-apiservers'

        kubernetes_sd_configs:
        - role: endpoints
        scheme: https

        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token

        relabel_configs:
        - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
          action: keep
          regex: default;kubernetes;https

      - job_name: 'kubernetes-nodes'

        scheme: https

        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token

        kubernetes_sd_configs:
        - role: node

        relabel_configs:
        - action: labelmap
          regex: __meta_kubernetes_node_label_(.+)
        - target_label: __address__
          replacement: kubernetes.default.svc:443
        - source_labels: [__meta_kubernetes_node_name]
          regex: (.+)
          target_label: __metrics_path__
          replacement: /api/v1/nodes/${1}/proxy/metrics

      
      - job_name: 'kubernetes-pods'

        kubernetes_sd_configs:
        - role: pod

        relabel_configs:
        - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
          action: keep
          regex: true
        - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
          action: replace
          target_label: __metrics_path__
          regex: (.+)
        - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
          action: replace
          regex: ([^:]+)(?::\d+)?;(\d+)
          replacement: $1:$2
          target_label: __address__
        - action: labelmap
          regex: __meta_kubernetes_pod_label_(.+)
        - source_labels: [__meta_kubernetes_namespace]
          action: replace
          target_label: kubernetes_namespace
        - source_labels: [__meta_kubernetes_pod_name]
          action: replace
          target_label: kubernetes_pod_name

      - job_name: 'kubernetes-cadvisor'

        scheme: https

        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token

        kubernetes_sd_configs:
        - role: node

        relabel_configs:
        - action: labelmap
          regex: __meta_kubernetes_node_label_(.+)
        - target_label: __address__
          replacement: kubernetes.default.svc:443
        - source_labels: [__meta_kubernetes_node_name]
          regex: (.+)
          target_label: __metrics_path__
          replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor
      
      - job_name: 'kubernetes-service-endpoints'

        kubernetes_sd_configs:
        - role: endpoints

        relabel_configs:
        - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
          action: keep
          regex: true
        - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
          action: replace
          target_label: __scheme__
          regex: (https?)
        - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
          action: replace
          target_label: __metrics_path__
          regex: (.+)
        - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
          action: replace
          target_label: __address__
          regex: ([^:]+)(?::\d+)?;(\d+)
          replacement: $1:$2
        - action: labelmap
          regex: __meta_kubernetes_service_label_(.+)
        - source_labels: [__meta_kubernetes_namespace]
          action: replace
          target_label: kubernetes_namespace
        - source_labels: [__meta_kubernetes_service_name]
          action: replace
          target_label: kubernetes_name
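
For reference, the kubernetes-pods and kubernetes-service-endpoints jobs above only keep targets whose pod or service carries the prometheus.io/scrape annotation, so unannotated workloads will never appear as targets. A minimal sketch of such an annotation on a hypothetical pod (the name, image, and port are placeholders, not from this thread):

apiVersion: v1
kind: Pod
metadata:
  name: example-app
  annotations:
    prometheus.io/scrape: "true"   # matched by the "keep" relabel rule above
    prometheus.io/port: "8080"     # rewritten into __address__ by the relabel rules
spec:
  containers:
    - name: example-app
      image: example/app:latest
      ports:
        - containerPort: 8080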


Simon Pasquier

Mar 6, 2018, 8:27:10 AM
to M McGuinness, Prometheus Users
I would start by looking at the logs of the Prometheus pod. If you're running version 2.1.0 (or above), you can also check the Service Discovery UI page.
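
For example, assuming the pod name from your earlier kubectl output and Prometheus listening on its default port 9090, a local port-forward would let you reach the UI:

./kubectl port-forward --namespace=monitoring prometheus-deployment-5cfdf8f756-mpctk 9090:9090

Then browse to http://localhost:9090/service-discovery (Status -> Service Discovery) to see what each scrape job discovered and why targets were dropped.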


M McGuinness

Mar 6, 2018, 9:25:19 AM
to Prometheus Users


Hi Simon

I tried to look at the logs by running:

./kubectl get pods --namespace=monitoring

NAME                                     READY     STATUS    RESTARTS   AGE
prometheus-deployment-5cfdf8f756-mpctk   1/1       Running   0          4d


but when I ran the following it did not find any such pod:


./kubectl logs prometheus-deployment-5cfdf8f756-mpctk

Error from server (NotFound): pods "prometheus-deployment-5cfdf8f756-mpctk" not found


I then ran ./kubectl get pods and it doesn't show my prometheus pod there:


NAME                READY     STATUS    RESTARTS   AGE
cassandra-0         1/1       Running   0          5d
cassandra-1         1/1       Running   0          5d
cassandra-2         1/1       Running   0          5d
metricgen-0         1/1       Running   0          5d
metricrest-0        1/1       Running   0          5d


Have you any idea what might be happening here? I checked the Service Discovery page in the Prometheus UI and there was nothing showing in it.
Could you help any further, please?

 

Simon Pasquier

Mar 6, 2018, 9:28:00 AM
to M McGuinness, Prometheus Users
You're probably missing the --namespace argument:

./kubectl logs --namespace=monitoring
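
For example, with the pod name from your earlier output:

./kubectl logs --namespace=monitoring prometheus-deployment-5cfdf8f756-mpctk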


M McGuinness

Mar 6, 2018, 9:53:34 AM
to Prometheus Users

Hi Simon,

That definitely fixed the logs problem (thanks a lot for that). I seem to be getting a lot of the following whilst tailing the logs now:

level=error ts=2018-03-06T11:00:09.923995347Z caller=main.go:221 component=k8s_client_runtime err="github.com/prometheus/prometheus/discovery/kubernetes/kubernetes.go:296: Failed to list *v1.Pod: pods is forbidden: User \"system:serviceaccount:monitoring:default\" cannot list pods at the cluster scope"
level=error ts=2018-03-06T11:00:09.924030858Z caller=main.go:221 component=k8s_client_runtime err="github.com/prometheus/prometheus/discovery/kubernetes/kubernetes.go:354: Failed to list *v1.Node: nodes is forbidden: User \"system:serviceaccount:monitoring:default\" cannot list nodes at the cluster scope"
level=error ts=2018-03-06T11:00:09.924066031Z caller=main.go:221 component=k8s_client_runtime err="github.com/prometheus/prometheus/discovery/kubernetes/kubernetes.go:268: Failed to list *v1.Endpoints: endpoints is forbidden: User \"system:serviceaccount:monitoring:default\" cannot list endpoints at the cluster scope"
level=error ts=2018-03-06T11:00:09.924076209Z caller=main.go:221 component=k8s_client_runtime err="github.com/prometheus/prometheus/discovery/kubernetes/kubernetes.go:269: Failed to list *v1.Service: services is forbidden: User \"system:serviceaccount:monitoring:default\" cannot list services at the cluster scope"


Do I need to set the service account somewhere? Would that be in the config-map.yaml, prometheus-deployment.yaml, or prometheus-service.yaml file?

Thanks in advance!

Simon Pasquier

Mar 6, 2018, 10:01:52 AM
to M McGuinness, Prometheus Users
The prometheus service account doesn't have enough permissions.
You can use [1] as a starting point (at least adapt the namespace and service account). Or, if you're using the Prometheus Operator, check the operator repository, which contains similar examples.
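
For reference, a minimal sketch of what such an rbac-setup.yml could look like, assuming the 'monitoring' namespace from this thread (adapted from the upstream Prometheus example; the link above remains the authoritative version). It grants exactly the list/get/watch permissions the errors above complain about:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus
rules:
- apiGroups: [""]
  resources: ["nodes", "nodes/proxy", "services", "endpoints", "pods"]
  verbs: ["get", "list", "watch"]
- nonResourceURLs: ["/metrics"]
  verbs: ["get"]
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus
  namespace: monitoring
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
- kind: ServiceAccount
  name: prometheus
  namespace: monitoring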


M McGuinness

Mar 6, 2018, 10:19:23 AM
to Prometheus Users

Thanks Simon. So if I use the rbac-setup.yaml file and change the namespace to the one I am using ('monitoring'), what command should I use to apply the changes for the prometheus service account?

Thanks in advance!

M McGuinness

Mar 6, 2018, 12:37:39 PM
to Prometheus Users

Hi Simon 

I changed the namespace to 'monitoring' in the file and ran the following: kubectl apply -f rbac-setup.yml
It seemed to run OK and created the service account 'prometheus', but when I checked the logs again they are still giving me the same errors as above, so the permissions must still not be set properly. Could you advise any further, please?

./kubectl apply -f rbac-setup.yml
clusterrole "prometheus" created
serviceaccount "prometheus" created
clusterrolebinding "prometheus" created

Mary-Jos-MBP:darwin-amd64 maryjomcguinness$ ./kubectl get serviceaccounts --namespace=monitoring
NAME         SECRETS   AGE
default      1         5d
prometheus   1         24s

Simon Pasquier

Mar 7, 2018, 3:46:32 AM
to M McGuinness, Prometheus Users
The Prometheus pod is probably using the "default" service account, which is why it didn't work.
Edit rbac-setup.yml and change line 38 to "name: default". Then run "kubectl apply -f rbac-setup.yml" again.
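
Alternatively (a sketch, assuming your deployment manifest is the prometheus-deployment.yaml you mentioned), you could leave the binding pointing at "prometheus" and instead set serviceAccountName: prometheus in the pod template, so the pod stops running as "default". Either way, one way to verify the permissions afterwards (assuming kubectl 1.6 or later, which supports impersonation) is:

./kubectl auth can-i list pods --as=system:serviceaccount:monitoring:default
./kubectl auth can-i list nodes --as=system:serviceaccount:monitoring:default

Each command should print "yes" once the ClusterRole is bound to the account the pod actually uses.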
 
 


M McGuinness

Mar 7, 2018, 4:54:42 AM
to Prometheus Users


Hi Simon

I changed line 38 and ran ./kubectl apply -f rbac-setup.yml again and it gave me this:

./kubectl apply -f rbac-setup.yml
clusterrole "prometheus" configured
serviceaccount "prometheus" unchanged
clusterrolebinding "prometheus" configured




I noticed line 25 is using 'prometheus' too; should I change that to 'default' as well?

apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus
  namespace: monitoring

Many thanks for your help!

M McGuinness

Mar 7, 2018, 6:56:49 AM
to Prometheus Users

Hi Simon

I changed line 25 and line 38, ran ./kubectl apply -f rbac-setup.yml again, and it brought the metrics in, so thanks a lot for your help with that!

I was hoping you could advise me on where in the config-map.yaml file to add the Alertmanager configuration and alert rules, and where to add the receiver (a webhook I am using), please?

Many thanks again!

M McGuinness

Mar 7, 2018, 7:03:14 AM
to Prometheus Users
Could I use the following in my config-map.yaml?

## alertmanager ConfigMap entries
##
alertmanagerFiles:
  alertmanager.yml: |-
    global:
      # slack_api_url: ''
      resolve_timeout: 20s

    receivers:
      - name: default-receiver
        # slack_configs:
        #  - channel: '@you'
        #    send_resolved: true
      - name: 'webhook'
        webhook_configs:
          - send_resolved: true
            url: '<webhook>'

    route:
      group_wait: 10s
      group_interval: 5m
      receiver: webhook
      repeat_interval: 3h
 

Many thanks again!

M McGuinness

Mar 7, 2018, 8:18:58 AM
to Prometheus Users


Hi Simon

I tried to edit my existing config-map.yaml by adding the Alertmanager and alerts sections below, but when I ran the create command again it gave me this:


./kubectl create -f ./config-map.yaml -n monitoring

error: error validating "./config-map.yaml": error validating data: [ValidationError(ConfigMap): unknown field "alertmanagerFiles" in io.k8s.api.core.v1.ConfigMap, ValidationError(ConfigMap): unknown field "serverFiles" in io.k8s.api.core.v1.ConfigMap]; if you choose to ignore these errors, turn validation off with --validate=false


Would you mind taking a look at my file below and letting me know if you notice what could be wrong, please?

## alertmanager ConfigMap entries
##
alertmanagerFiles:
  alertmanager.yml: |-
    global:
      # slack_api_url: ''
      resolve_timeout: 20s

    receivers:
      - name: default-receiver
        # slack_configs:
        #  - channel: '@you'
        #    send_resolved: true
      - name: 'webhook'
        webhook_configs:
          - send_resolved: true
            url: '<normalizer webhook>'

    route:
      group_wait: 10s
      group_interval: 5m
      receiver: webhook
      repeat_interval: 3h

## Prometheus server ConfigMap entries
##
serverFiles:
  rules: ""
  alerts: |-
    # host rules
    ALERT high_node_load
      IF node_load1 > 20
      FOR 10s
      LABELS { severity = "critical" }
      ANNOTATIONS {
          # summary defines the status if the condition is met
          summary = "Node usage exceeded threshold",
          # description reports the situation of the event
          description = "Instance {{ $labels.instance }}, Job {{ $labels.job }}, Node load {{ $value }}",
      }

    ALERT high_memory_usage
      IF (( node_memory_MemTotal - node_memory_MemFree ) / node_memory_MemTotal) * 100 > 90
      FOR 10s
      LABELS { severity = "warning" }
      ANNOTATIONS {
          # summary defines the status if the condition is met
          summary = "Memory usage exceeded threshold",
          # description reports the situation of the event
          description = "Instance {{ $labels.instance }}, Job {{ $labels.job }}, Memory usage {{ humanize $value }}%",
      }

    ALERT high_storage_usage
      IF (node_filesystem_size{fstype="ext4"} - node_filesystem_free{fstype="ext4"}) / node_filesystem_size{fstype="ext4"} * 100 > 90
      FOR 10s
      LABELS { severity = "warning" }
      ANNOTATIONS {
          # summary defines the status if the condition is met
          summary = "Storage usage exceeded threshold",
          # description reports the situation of the event
          description = "Instance {{ $labels.instance }}, Job {{ $labels.job }}, Storage usage {{ humanize $value }}%",
      }

  prometheus.yml: |-
    rule_files:
      - /etc/config/rules
      - /etc/config/alerts

    scrape_configs:
      - job_name: prometheus
        static_configs:
          - targets:
            - localhost:9090
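
For what it's worth, alertmanagerFiles and serverFiles are top-level keys from the stable/prometheus Helm chart's values.yaml, not fields of a Kubernetes ConfigMap, which is why kubectl's validation rejects them. In a plain ConfigMap everything must live under data:; a minimal sketch of the same Alertmanager configuration in its own ConfigMap (the name alertmanager-conf is hypothetical, not from this thread):

apiVersion: v1
kind: ConfigMap
metadata:
  name: alertmanager-conf
  namespace: monitoring
data:
  alertmanager.yml: |-
    route:
      receiver: webhook
      group_wait: 10s
      group_interval: 5m
      repeat_interval: 3h
    receivers:
      - name: 'webhook'
        webhook_configs:
          - send_resolved: true
            url: '<webhook>'

The rules and alerts files would likewise go under data: in the existing prometheus-server-conf ConfigMap. Note also that the ALERT ... IF ... syntax above is the Prometheus 1.x rule format; Prometheus 2.x expects rules in YAML rule files, so they would need converting before a 2.x server will load them.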




