Hazelcast HealthMonitor prints no log for 30 seconds


Hao Boo

Aug 4, 2014, 8:20:58 PM
to haze...@googlegroups.com
Hey Hazelcast DEV/experts,
I am using 3.1.4. The property "hazelcast.health.monitoring.delay.seconds" is set to 5. But on one node, the Hazelcast HealthMonitor printed no log for over 30 seconds (please compare the timestamps of the two lines below). What happened? I couldn't find anything suspicious in the log. I have copied the HealthMonitor output below for reference. Can you please advise?

2014-08-04 13:07:49,953 INFO  [com.hazelcast.util.HealthMonitor] (hz._hzInstance_1_prod.HealthMonitor:) [node1]:5701 [prod] memory.used=411.1M, memory.free=612.9M, memory.total=1024.0M, memory.max=2.9G, memory.used/total=40.15%, memory.used/max=13.70%, load.process=4.00%, load.system=3.00%, load.systemAverage=4.00%, thread.count=36, thread.peakCount=58, event.q.size=0, executor.q.async.size=0, executor.q.client.size=0, executor.q.operation.size=0, executor.q.query.size=0, executor.q.scheduled.size=0, executor.q.io.size=0, executor.q.system.size=0, executor.q.operation.size=0, executor.q.response.size=0, operations.remote.size=0, operations.running.size=29, proxy.count=1, clientEndpoint.count=0, connection.active.count=17, connection.count=7
2014-08-04 13:08:27,468 INFO  [com.hazelcast.util.HealthMonitor] (hz._hzInstance_1_prod.HealthMonitor:) [node1]:5701 [prod] memory.used=414.1M, memory.free=609.9M, memory.total=1024.0M, memory.max=2.9G, memory.used/total=40.44%, memory.used/max=13.80%, load.process=5.00%, load.system=5.00%, load.systemAverage=468.00%, thread.count=53, thread.peakCount=58, event.q.size=382, executor.q.async.size=0, executor.q.client.size=0, executor.q.operation.size=94, executor.q.query.size=0, executor.q.scheduled.size=0, executor.q.io.size=0, executor.q.system.size=0, executor.q.operation.size=0, executor.q.response.size=0, operations.remote.size=3, operations.running.size=29, proxy.count=1, clientEndpoint.count=0, connection.active.count=22, connection.count=7
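
For reference, the size of the gap can be computed directly from the two timestamps above. This is only a minimal sketch using the JDK's java.time classes; the class name LogGap and the method gapMillis are hypothetical helpers for illustration, not part of Hazelcast.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper (not part of Hazelcast): measures the time between two log lines.
class LogGap {
    // Timestamp layout at the start of each log line, e.g. "2014-08-04 13:07:49,953".
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss,SSS");

    // Returns the elapsed milliseconds between two log timestamps.
    static long gapMillis(String earlier, String later) {
        LocalDateTime a = LocalDateTime.parse(earlier, FORMAT);
        LocalDateTime b = LocalDateTime.parse(later, FORMAT);
        return Duration.between(a, b).toMillis();
    }

    public static void main(String[] args) {
        long gap = gapMillis("2014-08-04 13:07:49,953", "2014-08-04 13:08:27,468");
        // prints: gap = 37515 ms
        System.out.println("gap = " + gap + " ms");
    }
}
```

With a 5 second delay configured, a gap of about 37.5 seconds corresponds to roughly seven missed reports.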

Thanks a lot in advance!


Peter Veentjer

Aug 5, 2014, 2:38:23 AM
to haze...@googlegroups.com
Apart from this gap, do the other health monitor log statements on that JVM, before and after it, run with a 5 second delay?

Ole Tjensvold Johannessen

Aug 5, 2014, 8:32:46 AM
to haze...@googlegroups.com
If the property hazelcast.health.monitoring.level is set to SILENT (the default), then HealthMonitor will not print anything unless the system exceeds the threshold of 70% for memory used, process CPU load, or system CPU load.
It is all in the source of com.hazelcast.util.HealthMonitor.

The 70% is hardcoded and seems a bit arbitrary in my view. Perhaps the property hazelcast.health.monitoring.level should be an Integer (0-100) instead and use that value directly.
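
The behaviour described above can be sketched roughly as follows. This is not the actual com.hazelcast.util.HealthMonitor source, just an illustration of the rule as stated in this post; the class HealthLogDecision, the method shouldLog, and the choice of which memory ratio to pass in are assumptions.

```java
// Illustrative sketch only; names are hypothetical and this is not Hazelcast code.
class HealthLogDecision {
    // Monitoring levels; SILENT and NOISY are named in this thread, OFF is assumed.
    enum Level { OFF, SILENT, NOISY }

    // The hardcoded threshold mentioned above, in percent.
    static final double THRESHOLD_PERCENT = 70.0;

    // Decides whether a health report would be printed for the given readings (in percent).
    static boolean shouldLog(Level level, double memoryUsedPct,
                             double processCpuPct, double systemCpuPct) {
        switch (level) {
            case NOISY:
                return true; // always print
            case SILENT:
                // print only when at least one reading exceeds the threshold
                return memoryUsedPct > THRESHOLD_PERCENT
                        || processCpuPct > THRESHOLD_PERCENT
                        || systemCpuPct > THRESHOLD_PERCENT;
            default:
                return false; // OFF: never print
        }
    }

    public static void main(String[] args) {
        // Readings taken from the first log line in this thread:
        // memory.used/max=13.70%, load.process=4.00%, load.system=3.00%
        System.out.println(shouldLog(Level.SILENT, 13.70, 4.00, 3.00)); // prints: false
        System.out.println(shouldLog(Level.NOISY, 13.70, 4.00, 3.00));  // prints: true
    }
}
```

Under this rule, readings like the ones in the first log line would produce no output at SILENT, but would be printed on every cycle at NOISY.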

Hao Boo

Aug 5, 2014, 10:38:17 AM
to haze...@googlegroups.com
Hi Peter, yes: before and after that gap, the other health monitor log statements on that JVM print every 5 seconds. hazelcast.health.monitoring.level is set to NOISY.

Hao Boo

Aug 5, 2014, 11:13:38 AM
to haze...@googlegroups.com
Also, in the later log statement, the values of some metrics, e.g. load.systemAverage, event.q.size, and executor.q.operation.size, increased a lot. What might have caused this? Is the increase related to the log gap? Our own monitoring shows an increase in the number of requests (mainly GET operations in our case). Please advise. Thanks!

dsukho...@gmail.com

Aug 5, 2014, 11:47:48 PM
to haze...@googlegroups.com
Maybe it was a long GC pause?
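
One way to cross-check this besides reading the GC log is to sample the JVM's own GC counters over a short window. The sketch below uses the standard java.lang.management API; the class GcTimeProbe and the 1 second sample window are assumptions for illustration. Note that the reported collection time is an approximation and, for concurrent collectors, may include work that does not stop application threads.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical probe (not part of Hazelcast): samples accumulated GC time in this JVM.
class GcTimeProbe {
    // Sums the approximate accumulated collection time, in milliseconds, over all collectors.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report a time
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    public static void main(String[] args) throws InterruptedException {
        long before = totalGcMillis();
        Thread.sleep(1000); // sample window; an arbitrary value chosen for this sketch
        long after = totalGcMillis();
        System.out.println("GC time during window: " + (after - before) + " ms");
    }
}
```

If the GC time accumulated across the gap is close to zero, a long collection is unlikely to explain the missing reports.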

Hao Boo

Aug 6, 2014, 2:11:24 AM
to haze...@googlegroups.com, dsukho...@gmail.com
No, the GC log doesn't show a long GC pause. The GC pause time is typically 0.00XXX seconds.