mysqld-exporter mysql_heartbeat_lag_seconds and server_id label - HOW?


Stefan Szebinski

Jul 6, 2020, 12:15:13 PM
to Prometheus Users
mysqld_exporter includes some useful example rules for alerting on heartbeat lag:

groups:
- name: example.rules
  rules:
  - record: mysql_heartbeat_lag_seconds
    expr: mysql_heartbeat_now_timestamp_seconds - mysql_heartbeat_stored_timestamp_seconds
  ...
  - alert: MySQLReplicationLag
    expr: (mysql_heartbeat_lag_seconds > 30) and ON(instance) (predict_linear(mysql_heartbeat_lag_seconds[5m], 60 * 2) > 0)


In my case, the master's server_id may change because of the way we operate our MySQL cluster, so we may get the following results for mysql_heartbeat_lag_seconds:

{instance="batchdb001.mo-staging99-nonprod.dus1.cloud",job="prometheus-mysqld-exporter",server_id="2001500"} 0.5187849998474121
{instance="batchdb001.mo-staging99-nonprod.dus1.cloud",job="prometheus-mysqld-exporter",server_id="3212"}    1594051555.519615


As you can see, there are multiple series for one instance, and only one of them is the right one: the one that refers to the current master's server_id. In principle, it is easy to determine the correct one, because there is also a metric, mysql_slave_status_master_server_id, whose value is the correct server_id:

mysql_slave_status_master_server_id{instance="batchdb001.mo-staging99-nonprod.dus1.cloud",job="prometheus-mysqld-exporter",master_host="dbmaster001",master_uuid="005e9c3d-baea-11ea-ab06-027e6d15fde3"} 2001500
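
The selection described above can be sketched in plain Python. This only illustrates the matching logic (keep the lag series whose server_id label equals the value of mysql_slave_status_master_server_id for the same instance); it is not PromQL and not part of mysqld_exporter. The data layout and the function name select_current_master_lag are assumptions made up for this example, and the sample values are copied from the output above.

```python
# Hypothetical in-memory representation: each series is (labels, value).
INSTANCE = "batchdb001.mo-staging99-nonprod.dus1.cloud"

# Samples of mysql_heartbeat_lag_seconds, one per server_id seen in the heartbeat table.
lag_series = [
    ({"instance": INSTANCE, "server_id": "2001500"}, 0.5187849998474121),
    ({"instance": INSTANCE, "server_id": "3212"}, 1594051555.519615),
]

# Sample of mysql_slave_status_master_server_id: the server_id is the VALUE, not a label.
master_id_series = [
    ({"instance": INSTANCE, "master_host": "dbmaster001"}, 2001500.0),
]


def select_current_master_lag(lag_series, master_id_series):
    """Keep only lag series whose server_id label matches the current master's id."""
    # Turn the metric value into the string form used by the server_id label.
    current_master = {
        labels["instance"]: str(int(value)) for labels, value in master_id_series
    }
    return [
        (labels, value)
        for labels, value in lag_series
        if current_master.get(labels["instance"]) == labels["server_id"]
    ]


selected = select_current_master_lag(lag_series, master_id_series)
for labels, value in selected:
    print(labels["instance"], labels["server_id"], value)
```

With the sample data, only the series with server_id="2001500" survives, so the stale series with the huge lag value would no longer trigger the alert.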

So for the alert definition I would have to take the server_id into account:

  - alert: MySQLReplicationLag
    expr: (mysql_heartbeat_lag_seconds{server_id="2001500"} > 30) and ON(instance) ...

But how do I do this in my case, where the server_id label has to be compared with the value of another metric (mysql_slave_status_master_server_id)?