I figured out how to do this (i.e., customize the prometheus.yaml config file generated by prometheus-operator so that a specified cluster name is included in every notification sent from Alertmanager to HipChat). Here's more info, for anyone who might be interested:
NOTE:
* Frederic Branczyk said "If you want to use the custom configuration because there is something you can't do with the Prometheus Operator it is likely a missing feature".
* but I still decided to post this, in case it helps anyone who truly needs to modify prometheus.yaml for something that prometheus-operator doesn't support.
-----------------------------
Initial Setup
Before testing this in a K8s cluster on AWS, I wanted to try it out via minikube (on my Mac).
The following is basically borrowed from kube-prometheus/README.md:
$ minikube start --memory 4096 --kubernetes-version v1.8.0 --bootstrapper=kubeadm --extra-config=kubelet.authentication-token-webhook=true --extra-config=kubelet.authorization-mode=Webhook --extra-config=scheduler.address=0.0.0.0 --extra-config=controller-manager.address=0.0.0.0
And then I deployed kube-prometheus:
$ cd prometheus-operator/contrib/kube-prometheus/
$ ./hack/cluster-monitoring/minikube-deploy
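This next bit isn't from the README - it's just the sanity check I like to do - but you can watch the monitoring namespace until everything comes up:
$ kubectl -n monitoring get pods
(you should eventually see pods like prometheus-k8s-0 and alertmanager-main-0 reach Running)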
Kudos to Joe Creager
I found a great blog post that gives an example of specifying your own prometheus.yaml (and your own rule files - though my use case is simpler because I don't need to do that; also, that blog uses prometheus-operator but not kube-prometheus):
Custom Configurations with Prometheus Operator
Let's take a look at how prometheus-operator sets things up out-of-the-box
promcfg.go generates the prometheus.yaml - and here's how to retrieve it:
$ kubectl get secret prometheus-k8s -o json -n monitoring | jq -r '.data["prometheus.yaml"]' | base64 -D >~/raw_prometheus.yaml
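(Portability note: base64 -D is the macOS spelling of the decode flag; on Linux you'd use base64 -d or base64 --decode instead, e.g.:)
$ kubectl get secret prometheus-k8s -o json -n monitoring | jq -r '.data["prometheus.yaml"]' | base64 -d >~/raw_prometheus.yaml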
Here's the other piece of data in this secret:
$ kubectl get secret prometheus-k8s -o json -n monitoring | jq -r '.data["configmaps.json"]' | base64 -D >~/raw_configmaps.json
$ cat ~/raw_configmaps.json
{"items":[{"key":"monitoring/prometheus-k8s-rules","checksum":"f8e6d9b3bc5cf6fdb4185b8d4f3ebc0d501fa54b9f17ba6895413d8f85c40665"}]}
Side note:
* for the life of me I couldn't figure out how to reproduce that checksum. From the make_secrets.sh script on the aforementioned blog post, it looks like the following command should reproduce it (but as you can see this checksum is different):
$ cat contrib/kube-prometheus/manifests/prometheus/prometheus-k8s-rules.yaml | shasum -b -a 256
3ffc6601640fa837fc1cba95f65d282c55713eba4aadab72afd929da10ec1ff7 *-
* but after more Googling I found that Frederic Branczyk said "The checksum is only to signify whether the content of a ConfigMap has changed". So then I decided it didn't matter :)
(I didn't need to modify the configmaps.json / prometheus rules, but in a moment you'll see I needed to retrieve them in order to not blow them away.)
Disable the serviceMonitorSelector
The aforementioned blog post kinda glosses over this, but it's explicitly mentioned in custom-configuration.md ("the serviceMonitorSelector field has to be left empty"). Here's what you need to do: make the following change (and to be thorough, I then ran minikube-teardown followed by minikube-deploy):
~/prometheus-operator/contrib/kube-prometheus/manifests
$ git diff prometheus/prometheus-k8s.yaml
diff --git a/contrib/kube-prometheus/manifests/prometheus/prometheus-k8s.yaml b/contrib/kube-prometheus/manifests/prometheus/prometheus-k8s.yaml
- serviceMonitorSelector:
- - {key: k8s-app, operator: Exists}
+ #serviceMonitorSelector:
+ # - {key: k8s-app, operator: Exists}
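Not from the docs, just my own double-check: after redeploying you can inspect the Prometheus custom resource, and the selector should now be absent (or empty):
$ kubectl -n monitoring get prometheus k8s -o yaml | grep -A 2 serviceMonitorSelector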
Customize alertmanager.yaml (to point to your own HipChat server & room)
I created my own alertmanager config file:
$ edit ./alertmanager.yaml
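I won't paste my real config, but here's a rough sketch of what a HipChat-only alertmanager.yaml can look like - the api_url, auth_token, and room_id below are placeholders, so substitute your own server and room details:
$ cat ./alertmanager.yaml
global:
  resolve_timeout: 5m
route:
  group_by: ['alertname', 'cluster']
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 12h
  receiver: 'hipchat'
receivers:
- name: 'hipchat'
  hipchat_configs:
  - api_url: 'https://hipchat.example.com/'
    auth_token: 'REPLACE_WITH_ROOM_TOKEN'
    room_id: '12345'
    notify: true
    message_format: 'html'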
Then I applied my file to the running alertmanager:
$ kubectl -n monitoring create secret generic alertmanager-main --from-literal=alertmanager.yaml="$(< ./alertmanager.yaml)" --dry-run -oyaml | kubectl -n monitoring replace secret --filename=-
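To confirm the replace took, you can read the secret back the same way as the prometheus-k8s secret above:
$ kubectl get secret alertmanager-main -o json -n monitoring | jq -r '.data["alertmanager.yaml"]' | base64 -D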
Customize prometheus.yaml (to add the cluster name label)
I made my change to customize the config:
$ cp ~/raw_prometheus.yaml ~/raw_prometheus_MODIFIED.yaml
$ edit ~/raw_prometheus_MODIFIED.yaml
$ diff ~/raw_prometheus.yaml ~/raw_prometheus_MODIFIED.yaml
4c4,5
< external_labels: {}
---
> external_labels:
> cluster: josh-minikube
Then I updated the secret with my modified prometheus.yaml (and I retained the same configmaps.json):
$ kubectl -n monitoring create secret generic prometheus-k8s --from-literal=prometheus.yaml="$(< ~/raw_prometheus_MODIFIED.yaml)" --from-literal=configmaps.json="$(< ~/raw_configmaps.json)" --dry-run -oyaml | kubectl -n monitoring replace secret --filename=-
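Same sanity check here - the decoded prometheus.yaml in the secret should now contain the cluster label:
$ kubectl get secret prometheus-k8s -o json -n monitoring | jq -r '.data["prometheus.yaml"]' | base64 -D | grep -A 1 external_labels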
Cause an alert to fire (& see the notification in HipChat)
$ kubectl run crasher --image=rosskukulinski/crashing-app
$ kubectl run fail --image=rosskukulinski/dne:v1.0.0
$ kubectl get pods --all-namespaces | egrep -i "fail|crash"
default crasher-679745dd49-kkh4j 0/1 CrashLoopBackOff 5 5m
default fail-ddd94648b-nqc9c 0/1 ErrImagePull 0 5m
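If you don't want to wait for the HipChat notification, you can also watch the alerts fire in the Prometheus UI via a port-forward (check kubectl -n monitoring get pods if your pod name differs):
$ kubectl -n monitoring port-forward prometheus-k8s-0 9090
and then open http://localhost:9090/alerts in a browser.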
Though, really, the always-firing DeadMansSwitch alert is sufficient to verify that the external label I created shows up (the Alertmanager README.md has more info about amtool):