JMX Exporter Timeout while running on Kafka

Azher Khan

Jun 4, 2020, 12:48:21 AM
to Prometheus Users
Hi There,

I have configured JMX Exporter as a Java agent to scrape the Kafka metrics.

JMX Exporter serves all the Kafka metrics for only a few hours, after which Prometheus is unable to scrape them and we get the message below:

context deadline exceeded


Also, when we run curl against the metrics endpoint from the Kafka instance, it takes forever to pull the Kafka metrics.

[user@kafkaX ~]$ curl http://XX.XX.XX.YYY:7070/metrics 
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:--  0:40:01 --:--:--     0


If I restart the Kafka service, the metrics can be pulled again, but only for a few hours, and then we are back to the above issue.


I have set up JMX Exporter as a Java agent, which runs alongside the Kafka service:

cat /etc/systemd/system/kafka.service

[Service]
Environment="KAFKA_OPTS=-Djava.security.auth.login.config=/opt/kafka/config/kafka_server_jaas.conf  -javaagent:/opt/jmx-exporter/jmx_prometheus_javaagent-0.12.0.jar=7070:/opt/jmx-exporter/kafka-2_0_0.yml"

julian ade

Jun 4, 2020, 4:20:28 AM
to Azher Khan, Prometheus Users
Hi,

Try to increase scrape_timeout in your prometheus.yml file.
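
For reference, a minimal sketch of where that setting lives, assuming a static target. The job name, interval, and timeout values here are illustrative placeholders, not taken from the original setup. Prometheus defaults scrape_timeout to 10s, and it may not exceed scrape_interval:

```yaml
# prometheus.yml (fragment); job_name and the numeric values are assumptions
scrape_configs:
  - job_name: kafka-jmx            # hypothetical job name
    scrape_interval: 60s
    scrape_timeout: 30s            # must be <= scrape_interval
    static_configs:
      - targets: ['XX.XX.XX.YYY:7070']   # exporter port from the -javaagent option
```

Note that the curl run in the original post was still waiting after 40 minutes, so a longer timeout on its own is unlikely to be enough in this case.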

Julian Ade Putra
(Mobile) : +6281932700018
Aim for the stars, if you fail, at least you'll get the moon



Cameron Kerr

Jun 4, 2020, 8:46:06 PM
to Prometheus Users
Hi Azher,

You very likely have not configured whitelistObjectNames, and are therefore fetching every MBean object available within Kafka, which is likely far more data than you would actually use.

I would recommend that you connect to your JVM using jconsole or similar and find the ObjectName(s) that contain the Attributes you are wanting to query.

Here's a screenshot from jconsole looking at the MBeans in a Tomcat environment, as an example of where to look:

2020-06-05 12_25_36-Window.png


Here is an example of how I've used this in my whitelistObjectNames (along with other things):

# You really MUST use some whitelisting to select the bits of JMX you actually want.
# You DO NOT want to query the entire MBean tree, which is what you get by
# default. That will likely take about 10 seconds (depending on the application)
# and may have unintended side effects, such as introducing lock contention or
# causing database queries to be run.
#
whitelistObjectNames: [
  'Blackboard.Learn:type=CourseUserActivity',
  'Blackboard.Learn:type=ConnectionPoolAdmin',
  'Blackboard.Learn:type=ConnectionPoolDefault',
  'Blackboard.Learn:type=ConnectionPoolStats',
  'Catalina:type=GlobalRequestProcessor,name="http-nio-8082"',
  'Catalina:type=ThreadPool,name="http-nio-8082"'
]

Why is this significant? Because otherwise I get a scrape time in excess of 10 seconds. With an appropriate whitelist, I get something in the order of ...

# HELP jmx_scrape_duration_seconds Time this JMX scrape took, in seconds.
# TYPE jmx_scrape_duration_seconds gauge
jmx_scrape_duration_seconds 0.283491816

The example configs at https://github.com/prometheus/jmx_exporter/blob/master/example_configs/ don't currently do a good job of showing whitelistObjectNames, so I would expect all of them to run very slowly (and potentially to have production-impacting side effects, such as lock contention, depending on what the getter methods of the various MBeans do).
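
Applied to the Kafka case in the original question, a whitelist might look something like the sketch below. This is only an illustration: the ObjectName patterns are assumptions based on commonly exposed Kafka broker MBean domains, and should be checked against what jconsole actually shows for the broker in question. The key goes at the top level of the exporter config file (here /opt/jmx-exporter/kafka-2_0_0.yml), alongside the existing rules section:

```yaml
# Top of the JMX Exporter config, above the existing "rules:" section.
# The patterns below are assumed examples; verify each one in jconsole.
whitelistObjectNames: [
  'kafka.server:type=BrokerTopicMetrics,*',
  'kafka.server:type=ReplicaManager,*',
  'kafka.controller:type=KafkaController,*',
  'kafka.network:type=RequestMetrics,*'
]
```

After restarting the broker, the jmx_scrape_duration_seconds metric shown above gives a quick check of whether the narrower whitelist has brought the scrape time down.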