The goal here is very simple, but I've spent the better part of the last two weeks trying different solutions and options, and none of them seem to pan out for production use yet.
Goals:
- Use the prometheus_rabbitmq_exporter plugin: https://github.com/deadtrickster/prometheus_rabbitmq_exporter - this is working.
- Use a Prometheus ServiceMonitor to gather those metrics - this is working.
- Use a Helm template to define the queue the HPA should scale a pod off of - the syntax is close but probably incorrect.
- Use the queue metrics gathered in Prometheus to scale a pod based on queue depth ( NOT THE RABBIT CLUSTER ) - not working!
Details:
AWS EKS - latest 1.13, I believe ( this changes the HPA syntax available in newer API versions ).
Deployments are done via Helm templates.
RabbitMQ is clustered within the namespace and is on version 3.7.17.
Metrics are listed as such in prometheus:
| Element | Value |
|---|---|
| rabbitmq_core_queue_messages{endpoint="stats",instance="10.3.89.178:15672",job="rabbitmq",namespace="us-east-1-stg",pod="rabbitmq-2",queue="queue_1_staging",service="rabbitmq",vhost="/"} | 555 |
| rabbitmq_core_queue_messages{endpoint="stats",instance="10.3.89.178:15672",job="rabbitmq",namespace="us-east-1-stg",pod="rabbitmq-2",queue="queue_2_staging",service="rabbitmq",vhost="/"} | 123 |
| rabbitmq_core_queue_messages{endpoint="stats",instance="10.3.89.178:15672",job="rabbitmq",namespace="us-east-1-stg",pod="rabbitmq-2",queue="queue_3_staging",service="rabbitmq",vhost="/"} | 10 |
I can do a query like:
rabbitmq_core_queue_messages{queue="queue_1_staging"}
This will return the specific queue I would like to scale a pod off of.
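Since the queue metrics can come from more than one RabbitMQ pod in the cluster (the consolidation issue I mention below), I'm assuming whatever feeds the HPA needs to sum them, something along the lines of:

```
sum by (queue) (rabbitmq_core_queue_messages{queue="queue_1_staging"})
```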
The issues I've found so far - the pieces I'm still missing:
- A custom rule for the Prometheus Operator / custom-metrics adapter to consolidate or group the queue metrics from multiple pods into a usable metric ( see the sketch after this list ).
- A kubectl get --raw request URL that returns the above Prometheus value ( I must be dense, because I've read way too many docs on this ) - also sketched below.
- The Helm / HPA syntax to actually get this to work.
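For the first two bullets, the closest I've pieced together from the prometheus-adapter docs is a rule like the one below, assuming the k8s-prometheus-adapter is deployed next to the operator. The renamed metric queue_1_staging_messages and the hard-coded queue filter are just my guess at a workable shape, not something I've confirmed:

```yaml
# prometheus-adapter config ( the adapter's ConfigMap, or the "rules" section of its Helm values )
rules:
- seriesQuery: 'rabbitmq_core_queue_messages{namespace!="",service!=""}'
  resources:
    overrides:
      namespace: {resource: "namespace"}
      service: {resource: "service"}
  name:
    matches: "rabbitmq_core_queue_messages"
    as: "queue_1_staging_messages"   # made-up name for the per-queue metric
  # sum across the rabbitmq pods and pin the rule to the one queue I care about
  metricsQuery: 'sum(<<.Series>>{<<.LabelMatchers>>,queue="queue_1_staging"}) by (<<.GroupBy>>)'
```

If the adapter picked that up, I'd expect a raw request against the custom metrics API for the rabbitmq Service to return the value, something like:

```
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/us-east-1-stg/services/rabbitmq/queue_1_staging_messages"
```

but I haven't managed to get that far.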
Here is an example of the HPA syntax I've tried to get something like this to work:
#########################################################
# HPA
#########################################################
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: "some-job"
  labels:
    application: "some-job"
  namespace: "us-east-1-stg"
spec:
  scaleTargetRef:
    apiVersion: apps/v1beta2
    kind: Deployment
    name: "some-job"
  minReplicas: 1
  maxReplicas: 20
  metrics:
  - type: Object
    object:
      metricName: rabbitmq_core_queue_messages
      target:
        name: "queue_1_staging"
        namespace: "us-east-1-stg"
      targetValue: 1
I know the syntax is probably way off, but I really haven't found any documentation that can narrow it down.
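For what it's worth, my best guess at a closer shape - going off the autoscaling/v2beta1 API reference, and assuming an adapter rule like the sketch above attaches the metric to the rabbitmq Service ( the metric name and the 500 target are placeholders ) - is:

```yaml
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: "some-job"
  namespace: "us-east-1-stg"
  labels:
    application: "some-job"
spec:
  scaleTargetRef:
    apiVersion: apps/v1beta2
    kind: Deployment
    name: "some-job"
  minReplicas: 1
  maxReplicas: 20
  metrics:
  - type: Object
    object:
      # the Kubernetes object the custom metric is attached to,
      # i.e. the rabbitmq Service the adapter maps the series to
      target:
        apiVersion: v1
        kind: Service
        name: rabbitmq
      metricName: queue_1_staging_messages   # placeholder name from the adapter sketch above
      targetValue: 500                       # scale out when the queue depth goes above this
```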
I also know that other projects like KEDA ( https://github.com/kedacore/keda ) exist and would provide a much simpler way to do this, but it isn't production ready yet.
Any help with this would be greatly appreciated - I've basically hit a brick wall.
Thanks!