jmx_exporter's MBean-level fetching well-intentioned but preventing required telemetry [feature request discussion]


Cameron Kerr

Jun 5, 2020, 6:13:20 AM
to Prometheus Users
Hi all, I've spent a few days coming up to speed on understanding jmx_exporter, and I think I'm in a pretty good place now, understanding how MBeans, JMX, and RMI work.

I've so far deployed jmx_exporter in two ways:

* as a Java Agent on Tomcat 6 (on RHEL6) for a small application that includes SOLR and PostgreSQL
* and as a standalone Java HTTP server on Tomcat 8.5.34 (that comes bundled with a mission-critical application)

I found going with the Java Agent relatively easy, although I think I'll contribute a blog post and pull-request to help on the documentation front.

You might reasonably ask why I'm bothering with the HTTP server. Here's my business logic that drives this:

* we have a mission-critical application that we urgently need to improve our visibility on to diagnose some performance limits we believe we're reaching
* we're reluctant to introduce (and cause an outage to introduce) a java agent --- as far as I'm aware jmx_exporter lacks the ability that jconsole has to dynamically inject an agent.
* as part of a previous monitoring drive, we've already introduced the appropriate Remote JMX configuration (-D....jmxremote... etc.), which means we can introduce some monitoring into our production environment and easily restart the JMX exporter as needed to iterate through configuration changes.

We recognise that running a separate JVM has its disadvantages, namely:
* it will incur a JVM memory overhead
* it will likely need to be run as the same user with the same version/type of JVM (I'm not sure if this is accurate, but it seems safer).
* it creates a potential hole (via RMI) in the security boundary of the application, so we would prefer to house this on the same server (similar to a 'side-car' type of deployment, I suppose)

So most of what I'm about to say is about Remote JMX mode of operation (but still potentially relevant in part to Agent mode).

Here's the business value I need to obtain from jmx_exporter:

1) provide telemetry we're missing to diagnose urgent and important production issues, particularly for database connection pools and thread counts (memory/garbage collection would also be useful in the general case, and application-specific MBeans that would be useful in specific cases, such as applications that use SOLR or particular frameworks that instrument various URL handlers with nice statistics)
2) impart minimal changes to application runtime or risk changing behaviour in mission-critical production application
3) impart minimal changes in performance; we don't want to induce unreasonable load by introducing monitoring.

As I understand it, the current implementation of jmx_exporter queries Attributes at the MBean level, effectively providing a 'batch' sort of API which reduces the number of RMI round-trips, in the expectation that this is faster than what JConsole does by querying each individual Attribute (more round-trips, potentially over a remote connection). This does assume, though, that the time spent on (and the value received from) querying all of the attributes is worthwhile. Let's see where this assumption, well-intentioned as it is, leads us in practice:

I want to get telemetry around ThreadPool usage within Tomcat, so looking at JConsole, I see the following

[Attached screenshot: 2020-06-05 17_38_53-RHEL Server 7 [Running] - Oracle VM VirtualBox.png]


Great, connectionCount, currentThreadCount and currentThreadsBusy look to be things I would definitely be interested in; I'm unlikely to use most of the rest.

Clicking on the 'http-nio-8082', I see the ObjectName being the following, which I put into my whitelistObjectNames

Catalina:type=ThreadPool,name="http-nio-8082"

So now my configuration looks something like the following:

---
hostPort: 127.0.0.1:9090
username:
password:
ssl: false

lowercaseOutputLabelNames: true
lowercaseOutputName: true

# You really MUST use some whitelisting to select the bits of JMX you actually want.
# You DO NOT want to query the entire MBean tree, which is what you
# get by default. This can take around 10 seconds and may have
# unintended side-effects, such as introducing lock contention or
# causing database queries to be run.
#
whitelistObjectNames: [
  'Catalina:type=ThreadPool,name="http-nio-8082"'
  ]

# It's not enough to simply grab the data; we need to do something with it to
# generate it into metrics, otherwise that's potentially a lot of effort wasted
# getting all that raw data (you did use a whitelist, right?)
#
rules:

# Ah, due to a bug that was fixed in Tomcat 8.5.35 (our app bundles 8.5.34), this results in a
# serialization error. Because the socketProperties is not serialisable (it shows as 'Unavailable' in JConsole)
# it faults the entire request for that object and returns an exception over the wire.
#
- pattern: 'Catalina<type=ThreadPool, name="(\w+-\w+)-(\d+)"><>(currentThreadCount|currentThreadsBusy|connectionCount):'
  name: tomcat_threadpool_$3
  labels:
    port: "$2"
    protocol: "$1"
  help: Tomcat threadpool $3
  type: GAUGE


(I've spoiled the story with the comment, but that's okay...)
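To see what the rule in the configuration above would capture, here is a minimal sketch that runs the same regular expression in plain Java. It assumes the input shape documented for jmx_exporter rules (domain<key=value, ...><>attribute: value) and that patterns are matched unanchored; the class name, the sample lines, and the way the result is printed are purely illustrative and are not how jmx_exporter itself is implemented.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RulePatternDemo {
    // The 'pattern' from the rule above, escaped for a Java string literal.
    static final Pattern RULE = Pattern.compile(
        "Catalina<type=ThreadPool, name=\"(\\w+-\\w+)-(\\d+)\"><>"
        + "(currentThreadCount|currentThreadsBusy|connectionCount):");

    // Returns an illustrative "name{labels}" string, or null if the rule does not match.
    static String apply(String beanLine) {
        Matcher m = RULE.matcher(beanLine);
        if (!m.find()) {
            return null;
        }
        // Mirrors lowercaseOutputName: true from the configuration.
        String name = ("tomcat_threadpool_" + m.group(3)).toLowerCase(Locale.ROOT);
        return name + "{port=\"" + m.group(2) + "\",protocol=\"" + m.group(1) + "\"}";
    }

    public static void main(String[] args) {
        // Assumed sample inputs; the values after the colon are made up.
        System.out.println(apply(
            "Catalina<type=ThreadPool, name=\"http-nio-8082\"><>currentThreadCount: 10"));
        System.out.println(apply(
            "Catalina<type=ThreadPool, name=\"http-nio-8082\"><>socketProperties: x"));
    }
}
```

The first line prints tomcat_threadpool_currentthreadcount{port="8082",protocol="http-nio"} and the second prints null, because socketProperties is not one of the three attribute names in the rule. Note that the rule only decides what becomes a metric; it does not limit which attributes are fetched from the MBean in the first place.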

The problem (as other people have bumped into) is that Tomcat < 8.5.35 (and other things will exhibit this behaviour too) ..... hang on, let me back up a bit to add some understanding of how this works:

An MBean is essentially an object (okay, a subclass) that implements an Interface. Anything in Java can create MBeans; common examples are Tomcat, large libraries, and even the Java base environment itself. All these MBeans get registered into JMX (Java Management Extensions), which provides some structure and discoverability for tools like JConsole (or jmx_exporter). MBeans essentially expose various Attributes (methods that essentially 'getSomething'), Operations (other methods that might be used to change runtime state), and Notifications (which we completely ignore, along with Operations, for the purposes of jmx_exporter).
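To make the vocabulary concrete, the following minimal sketch lists the readable Attributes of one MBean that every JVM registers (java.lang:type=Memory), using the in-process platform MBeanServer so that no Remote JMX setup is needed. The class and method names are illustrative only.

```java
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;
import javax.management.MBeanAttributeInfo;
import javax.management.MBeanInfo;
import javax.management.MBeanServer;
import javax.management.ObjectName;

final class ListMBeanAttributes {
    // Collects the names of all readable Attributes the MBean advertises.
    static List<String> readableAttributes(MBeanServer server, ObjectName name) throws Exception {
        MBeanInfo info = server.getMBeanInfo(name);
        List<String> names = new ArrayList<>();
        for (MBeanAttributeInfo attr : info.getAttributes()) {
            if (attr.isReadable()) {
                names.add(attr.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName memory = new ObjectName("java.lang:type=Memory");
        System.out.println(memory + " attributes: " + readableAttributes(server, memory));
        // Operations and Notifications are also described by MBeanInfo.
        System.out.println("operations: " + server.getMBeanInfo(memory).getOperations().length);
    }
}
```

This is roughly the 'tell me what attributes exist' step that a tool such as JConsole performs before it asks for any values.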

JMX Exporter (in its HTTP server, external process form) connects (call it the 'client') to the (Tomcat) JVM ('server') over an RMI connection. This is effectively a form of IPC, where the client can invoke methods (RMI = Remote Method Invocation) on the server. So when you get the value of an Attribute, you are essentially calling some getSomething() method in an MBean. What you get from that is up to whatever implemented it (ie. you get a Plain-Old-Java-Object, or POJO for short). But to get from the 'server' over the RMI connection to the 'client' it needs to be serialised to be sent over the wire, deserialised at the other end, and then evaluated.

Take socketProperties for example. I don't care about it; I care about currentThreadCount etc. But the problem with Tomcat (fixed in Tomcat 8.5.35, if you have the luxury of moving to that; our vendor-supplied application bundles Tomcat 8.5.34) is that its implementation of the 'getter' method for socketProperties returns something that is not serialisable (its class doesn't implement the expected java.io.Serializable interface). This becomes a problem at the point where it needs to be serialised, which is RMI. This results in an exception.

Because jmx_exporter is using a method that says 'give me all the attributes for MBean B', that exception basically junks the whole result, and I lose the result of currentThreadCount etc. with it.

JConsole, on the other hand, uses the slower-but-steadier 'tell me what attributes exist in MBean B' followed by a lot of 'give me Attribute A for MBean B', so it can handle that exception (showing it as a red 'Unavailable').
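The difference between the two fetching styles can be sketched without a real RMI connection. The example below is a simulation under stated assumptions: DemoPool is a hypothetical stand-in for Tomcat's ThreadPool MBean, and the marshal() helper merely serialises the reply with ObjectOutputStream to imitate what RMI has to do before sending it over the wire. It is not jmx_exporter's or JConsole's actual code.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;
import javax.management.Attribute;
import javax.management.AttributeList;
import javax.management.MBeanServer;
import javax.management.ObjectName;

final class BatchVersusSingleFetch {

    // Hypothetical stand-in for Tomcat's ThreadPool MBean (names are illustrative).
    public interface DemoPoolMBean {
        int getCurrentThreadCount();
        Object getSocketProperties();
    }

    public static class DemoPool implements DemoPoolMBean {
        @Override public int getCurrentThreadCount() { return 10; }
        // A bare Object does not implement java.io.Serializable.
        @Override public Object getSocketProperties() { return new Object(); }
    }

    static ObjectName registerDemo(MBeanServer server) throws Exception {
        ObjectName name = new ObjectName("Demo:type=ThreadPool,name=\"http-nio-8082\"");
        if (!server.isRegistered(name)) {
            server.registerMBean(new DemoPool(), name);
        }
        return name;
    }

    // Stand-in for RMI marshalling: the reply must be serialised before it is sent.
    static void marshal(Object reply) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(reply);
        }
    }

    // Batch style: one reply carries every value, so one bad value spoils all of it.
    static List<String> fetchBatch(MBeanServer server, ObjectName name, String[] attrs)
            throws Exception {
        AttributeList reply = server.getAttributes(name, attrs);
        marshal(reply); // throws NotSerializableException if any value is unserialisable
        List<String> names = new ArrayList<>();
        for (Attribute a : reply.asList()) {
            names.add(a.getName());
        }
        return names;
    }

    // JConsole style: one reply per attribute, so a failure stays contained.
    static List<String> fetchOneByOne(MBeanServer server, ObjectName name, String[] attrs)
            throws Exception {
        List<String> available = new ArrayList<>();
        for (String attr : attrs) {
            try {
                marshal(new Attribute(attr, server.getAttribute(name, attr)));
                available.add(attr);
            } catch (IOException e) {
                System.out.println(attr + ": Unavailable (" + e + ")");
            }
        }
        return available;
    }

    public static void main(String[] args) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName name = registerDemo(server);
        String[] attrs = {"CurrentThreadCount", "SocketProperties"};
        try {
            fetchBatch(server, name, attrs);
        } catch (IOException e) {
            System.out.println("batch fetch failed, nothing usable: " + e);
        }
        System.out.println("one-by-one fetch got: " + fetchOneByOne(server, name, attrs));
    }
}
```

In this simulation the batch call loses CurrentThreadCount along with SocketProperties, while the one-by-one loop still returns CurrentThreadCount. The attribute names are capitalised here only because the demo uses a Standard MBean; Tomcat's real attribute names differ.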


Now let's look at another similar case; one where there are no bugs present. In this example I want to get information about database connection pool utilisation because this is valuable information and a common load-related performance issue (this tends to be true of connection-pools in general, such as for LDAP, but you get plenty of third-party libraries in the JDBC space).

For this you'll need to find some suitable MBeans, assuming they are even visible at all; one of my studies had a Tomcat 6 deployment with PostgreSQL and it didn't seem to expose any MBeans that I could see, while my other study had Tomcat 7 and the MBeans lived in a domain specific to the application (in this case, an online learning product called Blackboard).

[Attached screenshot: 2020-06-05 20_40_34-RHEL Server 7 [Running] - Oracle VM VirtualBox.png]


Note that the ClassName is org.apache.tomcat.jdbc.pool.jmx.ConnectionPool .... but it's the application that decides where to put the MBean and what to use as the ObjectName, so if the application is managing its own connection pools (rather than using a connection pool provided by the middleware), be prepared to hunt around for it. The ClassName does come into play though, because that tells us what data is inside the MBean (and helps us find some documentation as to what those attributes might actually mean).

So let's see what attributes this fairly common class exposes for monitoring: There are some obvious things here we would want to measure, either as GAUGES or as COUNTERS, but most of it we wouldn't need or want. In this screenshot, remember that I'm using Remote JMX, and JConsole is also using Remote JMX in this instance. If you hover over the red Unavailable for JdbcInterceptorsAsArray, you see the exception that causes it to be unavailable, and it's the same exception you see in the jmx_exporter (or more accurately, ./jmx_prometheus_httpserver.jar) when you have debug logging enabled.

[Attached screenshot: 2020-06-05 21_29_05-RHEL Server 7 [Running] - Oracle VM VirtualBox.png]


java.rmi.UnmarshalException: error unmarshalling return; nested exception is: 
java.lang.ClassNotFoundException: org.apache.tomcat.jdbc.pool.PoolProperties$InterceptorDefinition (no security manager: RMI class loader disabled)

Let's unpack this a bit to understand what this means: the RMI client (JConsole in Remote JMX mode, or jmx_prometheus_httpserver.jar) has received a serialised version of a class called org.apache.tomcat.jdbc.pool.PoolProperties$InterceptorDefinition, and it needs to deserialise it to extract a value from it (e.g. a string value or floating point value). But to do that it needs to have that class available somewhere. You can see that this class is specific to Tomcat and JDBC, so JConsole (or jmx_prometheus_httpserver.jar) is unlikely to have it available.

Presumably you could hunt around (a lot) and stuff a lot of things into the classpath of the RMI client, but that's painful, needless work (I tried and failed, but I'm not enough of a Tomcat wizard to know how to determine what classpath is present (classloaders, yay) for that webapp etc.).

Alternatively, I could apparently make use of 'RMI class loader' which sends the classes over the wire too to be loaded on the client side --- and also have to navigate a security manager --- that's a learning path I may have to attempt next.

Either way, considering I have no interest in JdbcInterceptorsAsArray anyway, all I want is Active, Idle, Size and a few counters that bear critical importance for my monitoring. But if I can't get a complete result set, I get nothing.
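The client-side half of the problem can also be simulated in a few lines. In this sketch the nested InterceptorDefinition class is a hypothetical stand-in for org.apache.tomcat.jdbc.pool.PoolProperties$InterceptorDefinition, and the overridden resolveClass() pretends that the class is missing from the client's classpath. Real RMI would wrap the failure in java.rmi.UnmarshalException as shown above; here the ClassNotFoundException surfaces directly.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

final class MissingClassOnClient {

    // Hypothetical stand-in for a class that exists only on the server side.
    static class InterceptorDefinition implements Serializable {
        private static final long serialVersionUID = 1L;
    }

    // "Server side": serialise a reply holding useful numbers plus one server-only type.
    static byte[] marshalReply() throws IOException {
        List<Object> reply = new ArrayList<>();
        reply.add(5);                           // e.g. Active
        reply.add(3);                           // e.g. Idle
        reply.add(new InterceptorDefinition()); // e.g. JdbcInterceptorsAsArray
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(wire)) {
            out.writeObject(reply);
        }
        return wire.toByteArray();
    }

    // "Client side": deserialise as if the named class were absent from the classpath.
    static Object unmarshalWithout(byte[] wire, String missingClass)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(wire)) {
            @Override
            protected Class<?> resolveClass(ObjectStreamClass desc)
                    throws IOException, ClassNotFoundException {
                if (desc.getName().equals(missingClass)) {
                    throw new ClassNotFoundException(desc.getName());
                }
                return super.resolveClass(desc);
            }
        }) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] wire = marshalReply();
        try {
            unmarshalWithout(wire, InterceptorDefinition.class.getName());
        } catch (ClassNotFoundException e) {
            System.out.println("whole reply lost, including the plain numbers: " + e);
        }
    }
}
```

The two integers in the reply are perfectly ordinary values, yet they are lost too, because the reply is deserialised as one unit.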


Let's recap and see how this affects the value I'm expecting to achieve:

1) provide telemetry we're missing to diagnose urgent and important production issues, particularly for database connection pools and thread counts (memory/garbage collection would also be useful in the general case, and application-specific MBeans that would be useful in specific cases, such as applications that use SOLR or particular frameworks that instrument various URL handlers with nice statistics)
2) impart minimal changes to application runtime or risk changing behaviour in mission-critical production application
3) impart minimal changes in performance; we don't want to induce unreasonable load by introducing monitoring.

#1 is mostly unattainable, either because something is not serialisable on the RMI server side, or because it cannot be deserialised on the RMI client side. All I can get are the 'nice-to-haves'.
#2 would be met by Remote JMX; if I have to use the Agent then my lead-time for introducing monitoring increases, decreasing my agility and ability to quickly withdraw the functionality in a production environment without an application restart.
#3 with appropriate whitelisting of ObjectNames we can get most of the way there and could reasonably scrape the metrics once a minute without fear, although some MBeans do become very large, particularly if they contain arrays, when you often only need a small handful of attributes. If we can scrape a smaller set however, we could achieve a higher fidelity if desired, which might paint a truer picture if all you have to work with are gauges.


I would like to propose that we introduce one of two things:

EITHER add a new configuration option, whitelistObjectNameAttributes, that could be used for JConsole-style attribute-at-a-time fetching (or similar; can you grab a few named attributes in one go?), which would allow for either the broad-brush or fine-brush approach to collecting the data;

OR allow for using the slower attribute-at-a-time as either an option or as a fallback.

Personally I would prefer the first option because I would much rather pick and choose, since I need to be familiar with what data is available anyway in order to use it effectively.
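On the question of grabbing a few named attributes in one go: the standard JMX API does allow this, since MBeanServerConnection.getAttributes() takes the ObjectName and an array of attribute names. The sketch below shows the call against the in-process platform MBeanServer; the class name and the choice of the Threading MBean are only for illustration, and whitelistObjectNameAttributes remains a proposed option, not something jmx_exporter offers. The assumption is that, over Remote JMX, attributes that are never requested never need to be serialised.

```java
import java.lang.management.ManagementFactory;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.management.Attribute;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;

final class FetchNamedAttributes {
    // One call (one round-trip over Remote JMX) for just the attributes asked for.
    static Map<String, Object> fetch(MBeanServerConnection conn, ObjectName name,
                                     String... wanted) throws Exception {
        Map<String, Object> values = new LinkedHashMap<>();
        for (Attribute a : conn.getAttributes(name, wanted).asList()) {
            values.put(a.getName(), a.getValue());
        }
        return values;
    }

    public static void main(String[] args) throws Exception {
        // The platform MBeanServer is also an MBeanServerConnection.
        MBeanServerConnection conn = ManagementFactory.getPlatformMBeanServer();
        ObjectName threading = new ObjectName("java.lang:type=Threading");
        System.out.println(fetch(conn, threading, "ThreadCount", "PeakThreadCount"));
    }
}
```

This keeps the batch behaviour (a single request per MBean) while narrowing the reply to the named attributes, which is the "fine-brush" approach described above.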

I'm not a Java programmer (at all, but I am a bit of a polyglot and I've been supporting Java workloads for years) but I'd be willing to give a go at implementing this and submitting a pull-request if people would be interested in receiving one.


PS. If anyone would like an Ansible playbook for deploying jmx_prometheus_httpserver.jar I'm willing to share what I have so far.

PPS. If anyone has experience setting up RMI class loader, I'd love some tips.

Thanks for reading this far, and I hope this (long) post helps people to understand and use jmx_exporter more effectively. Once I complete some of this, you can expect some documentation-related PRs.

Cameron

 

Brian Brazil

Jun 5, 2020, 6:31:13 AM
to Cameron Kerr, Prometheus Users
On Fri, 5 Jun 2020 at 11:13, Cameron Kerr <cameron...@gmail.com> wrote:
Hi all, I've spent a few days coming up to speed on understanding jmx_exporter, and I think I'm in a pretty good place now, understanding how MBeans, JMX, and RMI work.

I've so far deployed jmx_exporter in two ways:

* as a Java Agent on Tomcat 6 (on RHEL6) for a small application that includes SOLR and PostgreSQL
* and as a standalone Java HTTP server on Tomcat 8.5.34 (that comes bundled with a mission-critical application)

I found going with the Java Agent relatively easy, although I think I'll contribute a blog post and pull-request to help on the documentation front.

You might reasonably ask why I'm bothering with the HTTP server. Here's my business logic that drives this:

* we have a mission-critical application that we urgently need to improve our visibility on to diagnose some performance limits we believe we're reaching
* we're reluctant to introduce (and cause an outage to introduce) a java agent --- as far as I'm aware jmx_exporter lacks the ability that jconsole has to dynamically inject an agent.

It's a very simple Java agent that doesn't do anything fancy. I believe it's possible if you already know how to do such things, though using it as an agent is almost always better.
 
* as part of a previous monitoring drive, we've already introduced the appropriate Remote JMX configuration (-D....jmxremote... etc.), which means we can introduce some monitoring into our production environment and easily restart the JMX exporter as needed to iterate through configuration changes.

The JMX exporter will pick up configuration changes without restarting.
 

We recognise that running a separate JVM has its disadvantages, namely:
* it will incur a JVM memory overhead
* it will likely need to be run as the same user with the same version/type of JVM (I'm not sure if this is accurate, but it seems safer).
* it creates a potential hole (via RMI) in the security boundary of the application, so we would prefer to house this on the same server (similar to a 'side-car' type of deployment, I suppose)

The really big disadvantage is that it's much slower. You also lose some process and JVM metrics.
The JMX exporter used to go attribute by attribute; switching to batch gave a substantial speedup. That's not a change I'd be looking to undo, as it'd make the JMX exporter unusable for too many users for the sake of a small handful of poor JMX implementations.



I would like to propose that we introduce one of two things:

EITHER add a new configuration option, whitelistObjectNameAttributes, that could be used for JConsole-style attribute-at-a-time fetching (or similar; can you grab a few named attributes in one go?), which would allow for either the broad-brush or fine-brush approach to collecting the data;

I'd be open to considering an attribute name blacklist, if there's a sane way to specify that.
 

OR allow for using the slower attribute-at-a-time as either an option or as a fallback.

I'd rather not, the performance difference is just too much.


On a higher level, I'd suggest looking at ways to get metrics that don't involve JMX. Using client_java directly as far as possible will avoid all the fun and performance issues that JMX brings.

Brian
 


 

--
You received this message because you are subscribed to the Google Groups "Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to prometheus-use...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/prometheus-users/04481e73-465e-4815-a6a9-4697c4e930ceo%40googlegroups.com.


--

Cameron Kerr

Jun 5, 2020, 7:31:44 AM
to Prometheus Users


On Friday, 5 June 2020 22:31:13 UTC+12, Brian Brazil wrote:
On Fri, 5 Jun 2020 at 11:13, Cameron Kerr <camero...@gmail.com> wrote:
Hi all, I've spent a few days coming up to speed on understanding jmx_exporter, and I think I'm in a pretty good place now, understanding how MBeans, JMX, and RMI work.

I've so far deployed jmx_exporter in two ways:

* as a Java Agent on Tomcat 6 (on RHEL6) for a small application that includes SOLR and PostgreSQL
* and as a standalone Java HTTP server on Tomcat 8.5.34 (that comes bundled with a mission-critical application)

I found going with the Java Agent relatively easy, although I think I'll contribute a blog post and pull-request to help on the documentation front.

You might reasonably ask why I'm bothering with the HTTP server. Here's my business logic that drives this:

* we have a mission-critical application that we urgently need to improve our visibility on to diagnose some performance limits we believe we're reaching
* we're reluctant to introduce (and cause an outage to introduce) a java agent --- as far as I'm aware jmx_exporter lacks the ability that jconsole has to dynamically inject an agent.

It's a very simple Java agent that doesn't do anything fancy. I believe it's possible if you already know how to do such things, though using it as an agent is almost always better.

I can appreciate that a lot more now that I understand more about the issues around RMI etc.
 
* as part of a previous monitoring drive, we've already introduced the appropriate Remote JMX configuration (-D....jmxremote... etc.), which means we can introduce some monitoring into our production environment and easily restart the JMX exporter as needed to iterate through configuration changes.

The JMX exporter will pick up configuration changes without restarting

Very happy to hear that. I'll be sure to mention that in an upcoming doc PR, as that behaviour is not listed on the README.md that I could find.
 
[...] You also lose some process and JVM metrics.

I'm not sure that's necessarily the case; at least I can see MBeans for Memory, OperatingSystem, Runtime, Threading, etc. under the 'java.lang' domain, but that's likely to be JVM dependent; I don't think I saw that on Java 7 (or Tomcat 6). But I guess you mean that jmx_exporter is exporting some of its own information, rather than from MBeans, for "process and JVM metrics".

I would like to propose that we introduce one of two things:

EITHER add a new configuration option, whitelistObjectNameAttributes, that could be used for JConsole-style attribute-at-a-time fetching (or similar; can you grab a few named attributes in one go?), which would allow for either the broad-brush or fine-brush approach to collecting the data;

I'd be open to considering an attribute name blacklist, if there's a sane way to specify that.

Looking at the following makes me think it should be reasonably straightforward to create additional whitelists/blacklists for Attributes within an ObjectName; I think the crux of it would be located around this part of the code:

        Map<String, MBeanAttributeInfo> name2AttrInfo = new LinkedHashMap<String, MBeanAttributeInfo>();
        for (int idx = 0; idx < attrInfos.length; ++idx) {
            MBeanAttributeInfo attr = attrInfos[idx];
            if (!attr.isReadable()) {
                logScrape(mbeanName, attr, "not readable");
                continue;
            }
            name2AttrInfo.put(attr.getName(), attr);
        }

Plus there would have to be the config-level stuff and validation thereof, and plumbing the config into the various method calls in places where whitelistObjectNames and blacklistObjectNames appear. And test-cases.
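As a rough idea of what such a filter might look like, here is a minimal, self-contained sketch. Everything in it is hypothetical: the class, its methods, and the idea of keying excluded attribute names by an ObjectName pattern are assumptions for illustration, not existing jmx_exporter code or configuration. It relies only on standard JMX calls (ObjectName.apply() for pattern matching and MBeanAttributeInfo.isReadable()).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import javax.management.MBeanAttributeInfo;
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

final class AttributeFilterSketch {
    // Hypothetical config: ObjectName pattern -> attribute names to skip.
    private final Map<ObjectName, Set<String>> blacklist = new LinkedHashMap<>();

    void exclude(String objectNamePattern, String... attributes)
            throws MalformedObjectNameException {
        blacklist.computeIfAbsent(new ObjectName(objectNamePattern), k -> new HashSet<>())
                 .addAll(Arrays.asList(attributes));
    }

    // An attribute is excluded only if a pattern matches this particular MBean.
    boolean isExcluded(ObjectName mbeanName, String attribute) {
        for (Map.Entry<ObjectName, Set<String>> e : blacklist.entrySet()) {
            if (e.getKey().apply(mbeanName) && e.getValue().contains(attribute)) {
                return true;
            }
        }
        return false;
    }

    // Readable, non-excluded attribute names, suitable for a single batch request.
    String[] attributesToFetch(ObjectName mbeanName, MBeanAttributeInfo[] attrInfos) {
        List<String> names = new ArrayList<>();
        for (MBeanAttributeInfo attr : attrInfos) {
            if (attr.isReadable() && !isExcluded(mbeanName, attr.getName())) {
                names.add(attr.getName());
            }
        }
        return names.toArray(new String[0]);
    }

    public static void main(String[] args) throws Exception {
        AttributeFilterSketch filter = new AttributeFilterSketch();
        filter.exclude("Catalina:type=ThreadPool,*", "socketProperties");
        MBeanAttributeInfo[] infos = {
            new MBeanAttributeInfo("currentThreadCount", "int", "", true, false, false),
            new MBeanAttributeInfo("socketProperties", "java.lang.Object", "", true, false, false)
        };
        ObjectName pool = new ObjectName("Catalina:type=ThreadPool,name=\"http-nio-8082\"");
        System.out.println(Arrays.toString(filter.attributesToFetch(pool, infos)));
    }
}
```

Running main prints [currentThreadCount]. Scoping each exclusion to an ObjectName pattern means the same attribute name can be skipped on one MBean and still collected from another.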

On a higher level, I'd suggest looking at ways to get metrics that don't involve JMX. Using client_java directly as far as possible will avoid all the fun and performance issues that JMX brings.

That would seem to be impractical in all my use-cases, as I don't control the application (certainly not the code-base) to that extent, and JMX would be the right, supported, tool for this job. I think I can probably work in the agent though; it seems to be the normal way for other similar vendors to do the same monitoring thing.


Brian Brazil

Jun 5, 2020, 7:44:51 AM
to Cameron Kerr, Prometheus Users
On Fri, 5 Jun 2020 at 12:31, Cameron Kerr <cameron...@gmail.com> wrote:


On Friday, 5 June 2020 22:31:13 UTC+12, Brian Brazil wrote:
On Fri, 5 Jun 2020 at 11:13, Cameron Kerr <camero...@gmail.com> wrote:
Hi all, I've spent a few days coming up to speed on understanding jmx_exporter, and I think I'm in a pretty good place now, understanding how MBeans, JMX, and RMI work.

I've so far deployed jmx_exporter in two ways:

* as a Java Agent on Tomcat 6 (on RHEL6) for a small application that includes SOLR and PostgreSQL
* and as a standalone Java HTTP server on Tomcat 8.5.34 (that comes bundled with a mission-critical application)

I found going with the Java Agent relatively easy, although I think I'll contribute a blog post and pull-request to help on the documentation front.

You might reasonably ask why I'm bothering with the HTTP server. Here's my business logic that drives this:

* we have a mission-critical application that we urgently need to improve our visibility on to diagnose some performance limits we believe we're reaching
* we're reluctant to introduce (and cause an outage to introduce) a java agent --- as far as I'm aware jmx_exporter lacks the ability that jconsole has to dynamically inject an agent.

It's a very simple Java agent that doesn't do anything fancy. I believe it's possible if you already know how to do such things, though using it as an agent is almost always better.

I can appreciate that a lot more now that I understand more about the issues around RMI etc.
 
* as part of a previous monitoring drive, we've already introduced the appropriate Remote JMX configuration (-D....jmxremote... etc.), which means we can introduce some monitoring into our production environment and easily restart the JMX exporter as needed to iterate through configuration changes.

The JMX exporter will pick up configuration changes without restarting

Very happy to hear that. I'll be sure to mention that in an upcoming doc PR, as that behaviour is not listed on the README.md that I could find.
 
[...] You also lose some process and JVM metrics.

I'm not sure that's necessarily the case; at least I can see MBeans for Memory, OperatingSystem, Runtime, Threading, etc. under the 'java.lang' domain, but that's likely to be JVM dependent; I don't think I saw that on Java 7 (or Tomcat 6). But I guess you mean that jmx_exporter is exporting some of its own information, rather than from MBeans, for "process and JVM metrics".

There's some process stuff it'll pull in on Linux systems that's not available from MBeans. It shouldn't vary by Java version.
 

I would like to propose that we introduce one of two things:

EITHER add a new configuration option, whitelistObjectNameAttributes, that could be used for JConsole-style attribute-at-a-time fetching (or similar; can you grab a few named attributes in one go?), which would allow for either the broad-brush or fine-brush approach to collecting the data;

I'd be open to considering an attribute name blacklist, if there's a sane way to specify that.

Looking at the following makes me think it should be reasonably straightforward to create additional whitelists/blacklists for Attributes within an ObjectName; I think the crux of it would be located around this part of the code:

        Map<String, MBeanAttributeInfo> name2AttrInfo = new LinkedHashMap<String, MBeanAttributeInfo>();
        for (int idx = 0; idx < attrInfos.length; ++idx) {
            MBeanAttributeInfo attr = attrInfos[idx];
            if (!attr.isReadable()) {
                logScrape(mbeanName, attr, "not readable");
                continue;
            }
            name2AttrInfo.put(attr.getName(), attr);
        }

Plus there would have to be the config-level stuff and validation thereof, and plumbing the config into the various method calls in places where whitelistObjectNames and blacklistObjectNames appear. And test-cases.

That's the bit of code; the challenge is that there's more than one object, so you can't work off attribute name alone.

Brian
 
