Prometheus StatefulSet with Thanos stops working after two days

墨生

Nov 25, 2020, 10:48:26 PM
to Prometheus Users
Hey guys, 
I'm a newbie to Prometheus and have recently been working on Prometheus HA with Thanos. Everything looks good at first, but after about two days of running it stops working. Here are some of the logs:
logs for prometheus-3:
level=info ts=2020-11-26T03:45:48.626Z caller=manager.go:934 component="rule manager" msg="Rule manager stopped"
level=info ts=2020-11-26T03:45:48.626Z caller=notifier.go:601 component=notifier msg="Stopping notification manager..."
level=info ts=2020-11-26T03:45:48.626Z caller=main.go:789 msg="Notifier manager stopped"
level=info ts=2020-11-26T03:45:48.626Z caller=main.go:615 msg="Scrape manager stopped"
level=error ts=2020-11-26T03:45:48.626Z caller=main.go:798 err="opening storage failed: found unsequential head chunk files /data/chunks_head/000002 (index: 2) and /data/chunks_head/000005 (index: 5)"

logs for prometheus-2:
level=error ts=2020-11-26T03:15:20.159Z caller=db.go:730 component=tsdb msg="compaction failed" err="persist head block: populate block: add series: out-of-order series added with label set \"{__name__=\\\"envoy_cluster_assignment_stale\\\", app=\\\"istio-ingressgateway\\\", chart=\\\"gateways\\\", cluster_name=\\\"xds-grpc\\\", heritage=\\\"Tiller\\\", instance=\\\"10.129.45.133:15090\\\", istio=\\\"ingressgateway\\\", job=\\\"kubernetes-pods\\\", kubernetes_namespace=\\\"istio-system\\\", kubernetes_pod_name=\\\"istio-ingressgateway-669cfc876b-zdcng\\\", pod_template_hash=\\\"669cfc876b\\\", release=\\\"istio\\\", service_istio_io_canonical_name=\\\"istio-ingressgateway\\\", service_istio_io_canonical_revision=\\\"latest\\\"}\""

My complete configurations can be found here: 

Thanks in advance,
Ray

Matthias Rampke

Nov 26, 2020, 3:42:26 AM
to 墨生, Prometheus Users
You are using a single PersistentVolumeClaim rather than letting Kubernetes create one per instance of the statefulset. Even though it is of mode ReadWriteOnce, I believe it is being bounced around between the different pods and that is causing problems.

Rather than creating the PVC yourself, use the volumeClaimTemplates field of the StatefulSet spec, like in this example.

/MR
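
For readers following along, here is a minimal sketch of what the volumeClaimTemplates approach might look like. It is not the poster's actual configuration: the names `prometheus` and `data`, the image reference, the replica count, and the storage size are all placeholders, and the Thanos sidecar and the Prometheus config mount are left out for brevity. Only the `/data` path is taken from the logs above.

```yaml
# Sketch only: every name and size below is an assumption, not the original config.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: prometheus                # placeholder name
spec:
  serviceName: prometheus         # headless Service, assumed to exist already
  replicas: 4                     # assumed; the logs mention prometheus-2 and prometheus-3
  selector:
    matchLabels:
      app: prometheus
  template:
    metadata:
      labels:
        app: prometheus
    spec:
      containers:
        - name: prometheus
          image: prom/prometheus  # pin to the version you actually run
          args:
            - --config.file=/etc/prometheus/prometheus.yml
            - --storage.tsdb.path=/data   # matches the /data path in the logs
          volumeMounts:
            - name: data          # must match the claim template name below
              mountPath: /data
  # One PersistentVolumeClaim is created per pod from this template,
  # instead of all pods referencing a single hand-made PVC.
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 50Gi         # placeholder size
```

With a template like this, Kubernetes names each claim `<template name>-<StatefulSet name>-<ordinal>` (here `data-prometheus-0`, `data-prometheus-1`, and so on), so every replica keeps its own TSDB directory across restarts and no two pods write to the same chunks_head files.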

墨生

Nov 26, 2020, 4:38:21 AM
to Prometheus Users
Much appreciated, Matt.
I'll definitely try your suggestion (commit 2f5e8fb33829e7525d118acaa4a6907af95d0400) and let you know how it goes (in two days :D).

Thanks,
Ray

On Thursday, November 26, 2020 at 4:42:26 PM UTC+8, matt...@prometheus.io wrote: