tectonic prometheus target "UNKNOWN"

Alex

Oct 19, 2017, 12:23:50 PM
to CoreOS User

Hi,


We're seeing one of the Tectonic Prometheus targets in the "UNKNOWN" state.
We think this may be behind the trouble we're having with the dashboards, where API server stats show no data for periods of time.
We suspect it is caused by the "namespace=default" label (and corresponding config) for the api-server.

Any ideas on how to fix or further trace this would be much appreciated.

Thanks
Alex


/prometheus/targets

Targets


alertmanager-main
Endpoint | State | Labels | Last Scrape | Error
http://10.2.1.64:9093/metrics | UP | endpoint="web" instance="10.2.1.64:9093" namespace="tectonic-system" pod="alertmanager-main-0" service="alertmanager-main" | 850ms ago |
http://10.2.6.54:9093/metrics | UP | endpoint="web" instance="10.2.6.54:9093" namespace="tectonic-system" pod="alertmanager-main-1" service="alertmanager-main" | 1.239s ago |

apiserver
Endpoint | State | Labels | Last Scrape | Error
https://10.0.18.114:443/metrics | UNKNOWN | endpoint="https" instance="10.0.18.114:443" namespace="default" service="kubernetes" | Never |


/ns/tectonic-system/secrets/prometheus-k8s/details

- job_name: tectonic-system/kube-apiserver/0
  honor_labels: false
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - default
  scrape_interval: 30s
  scheme: https

Alex

Oct 19, 2017, 3:37:02 PM
to CoreOS User
Fixed by adding:
        - '--apiserver-count=3'
to the kube-apiserver DaemonSet.

Seems to be an upstream Kubernetes issue.
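
For anyone applying the same change, here is a rough sketch of where the flag might sit in the DaemonSet. The container name and the use of `command` (rather than `args`) are assumptions and may differ in your manifest; only the flag itself is taken from the fix above.

```yaml
# Fragment of the kube-apiserver DaemonSet pod template; other fields are omitted.
# Assumption: the flags live in the container's command list. Check your own manifest.
spec:
  template:
    spec:
      containers:
      - name: kube-apiserver          # assumed container name
        command:
        # ... existing command and flags stay unchanged ...
        - '--apiserver-count=3'       # number of API servers running in the cluster
```

The value should match the number of API servers actually running; 3 is what was used here.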

A.

Rob Szumski

Oct 20, 2017, 3:55:52 PM
to Alex, CoreOS User
This is a known bug in the endpoints that the API server updates as it leader elects. The CoreOS Kubernetes upstream team has gotten a patch into 1.8 that fixes this behavior.

As a random FYI, what happens is that all the endpoints get removed, so Prometheus can’t scrape anything. The server count flag can fix this in some circumstances, but also has its own bugs, which is why we don’t currently set it.
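
To make the failure mode concrete: the apiserver job discovers its targets from the Endpoints object of the "kubernetes" service in the "default" namespace, which can be inspected with `kubectl get endpoints kubernetes -n default -o yaml`. The sketch below shows roughly what a healthy object looks like. It uses only the address from the targets page earlier in the thread, and the port name is inferred from the endpoint="https" label, so treat it as illustrative rather than as output from this cluster.

```yaml
# Illustrative Endpoints object for the "kubernetes" service.
# 10.0.18.114:443 is the target shown on the Prometheus page; with several
# API servers, each one would appear as a further entry under addresses.
apiVersion: v1
kind: Endpoints
metadata:
  name: kubernetes
  namespace: default
subsets:
- addresses:
  - ip: 10.0.18.114
  ports:
  - name: https
    port: 443
    protocol: TCP
```

When the bug hits, the addresses disappear from this object, which leaves the endpoints service discovery with nothing to hand to Prometheus.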

 - Rob
