Hi people... I know this is an old thread, but I could use some help from you guys...
I've got an OMD/Nagios/gearmand box that seems to be working fine with its own workers, or so I think... gearman_top shows 5 workers available on the host, eventhandler and service queues, but 0 jobs waiting and 0 jobs running...
It's also showing my other box (worker only), called gearman-01. On this box I'm running mod_gearman_worker with 5 threads, but gearman_top shows me only 1 available worker, and it's also 0 for jobs waiting and jobs running:
Queue Name        | Worker Available | Jobs Waiting | Jobs Running
eventhandler      | 0                | 0            | 0
host              | 0                | 0            | 0
service           | 0                | 0            | 0
worker_gearman-01 | 1                | 0            | 0
--------------------------------------------------------------------
The server box and the gearman-01 box can talk to each other over TCP port 4370 without problems; I even tested it with telnet.
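In case it helps anyone reproduce this beyond a plain telnet connect, here is a rough Python sketch that sends gearmand's plain-text admin command `status` and turns the reply into the same three numbers gearman_top displays. The function names `parse_status` and `fetch_status` are made up for illustration, and the host and port you pass in are whatever your worker config points at; treat it as a sketch, not a replacement for gearman_top.

```python
import socket


def parse_status(text):
    """Parse the reply to gearmand's admin 'status' command.

    Each line is FUNCTION<TAB>TOTAL<TAB>RUNNING<TAB>AVAILABLE_WORKERS,
    and the reply ends with a line holding a single dot.
    """
    queues = {}
    for line in text.splitlines():
        line = line.strip()
        if line == ".":
            break
        parts = line.split("\t")
        if len(parts) != 4:
            continue  # skip anything that is not a status row
        name = parts[0]
        total, running, workers = (int(p) for p in parts[1:])
        queues[name] = {
            "workers": workers,
            "waiting": total - running,  # TOTAL includes running jobs
            "running": running,
        }
    return queues


def fetch_status(host, port, timeout=5.0):
    """Open a TCP connection, send 'status' and return the raw reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"status\n")
        data = b""
        # The reply is finished once a lone "." line has arrived.
        while not (data == b".\n" or data.endswith(b"\n.\n")):
            chunk = sock.recv(4096)
            if not chunk:
                break
            data += chunk
    return data.decode("ascii", errors="replace")
```

Running `parse_status(fetch_status(host, port))` from both boxes against the same gearmand address should give the same queue list; if it doesn't, the two sides are probably not talking to the same daemon.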
On the gearman-01 box, the worker is running like this:
nagios 30618 0.0 0.1 135604 3128 ? S 17:36 0:00 /usr/local/bin/mod_gearman_worker -d --config=/etc/mod_gearman/mod_gearman/mod_gearman_worker.conf --pidfile=/var/mod_gearman/mod_gearman_worker.pid
The worker.conf file points to the server box's IP and port, and the keyfile is the same on both, so I'm losing some hair here because I can't figure out why these two aren't working together...
check_gearman shows me this:
For server:
/usr/bin/check_gearman -H localhost
check_gearman CRITICAL - Queue worker_nagios-omd-a has 1 job without any worker. |'eventhandler_waiting'=0;10;100;0 'eventhandler_running'=0 'eventhandler_worker'=5;25;50;0 'host_waiting'=0;10;100;0 'host_running'=0 'host_worker'=5;25;50;0 'service_waiting'=0;10;100;0 'service_running'=0 'service_worker'=5;25;50;0 'worker_gearman-01_waiting'=0;10;100;0 'worker_gearman-01_running'=0 'worker_gearman-01_worker'=1;25;50;0 'worker_nagios-omd-a_waiting'=1;10;100;0 'worker_nagios-omd-a_running'=0 'worker_nagios-omd-a_worker'=0;25;50;0
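That perfdata line is hard to read by eye, so here is a small Python sketch that groups it by queue and lists the queues that have waiting jobs but no registered worker. The helper names `queues_from_perfdata` and `orphaned_queues` are hypothetical, and the sample string is a shortened copy of the output above; it only assumes the `'QUEUE_waiting'`, `'QUEUE_running'` and `'QUEUE_worker'` label pattern visible in that output.

```python
import re


def queues_from_perfdata(output):
    """Group check_gearman perfdata values by queue name."""
    # Everything after the first "|" is perfdata.
    _, _, perfdata = output.partition("|")
    queues = {}
    # Labels look like 'QUEUE_waiting'=0;10;100;0 and only the first
    # number (the current value) is kept.
    for label, value in re.findall(r"'([^']+)'=(\d+)", perfdata):
        queue, _, metric = label.rpartition("_")
        if metric in ("waiting", "running", "worker"):
            queues.setdefault(queue, {})[metric] = int(value)
    return queues


def orphaned_queues(queues):
    """Return queues that have waiting jobs and zero workers."""
    return sorted(
        name
        for name, m in queues.items()
        if m.get("waiting", 0) > 0 and m.get("worker", 0) == 0
    )


# Shortened sample taken from the server-side output above.
sample = (
    "check_gearman CRITICAL - Queue worker_nagios-omd-a has 1 job without any worker. |"
    "'host_waiting'=0;10;100;0 'host_running'=0 'host_worker'=5;25;50;0 "
    "'worker_gearman-01_waiting'=0;10;100;0 'worker_gearman-01_running'=0 "
    "'worker_gearman-01_worker'=1;25;50;0 "
    "'worker_nagios-omd-a_waiting'=1;10;100;0 'worker_nagios-omd-a_running'=0 "
    "'worker_nagios-omd-a_worker'=0;25;50;0"
)
print(orphaned_queues(queues_from_perfdata(sample)))  # ['worker_nagios-omd-a']
```

On my output the only queue it flags is worker_nagios-omd-a, the one the CRITICAL message complains about.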
For gearman-01:
/opt/omd/versions/1.10/lib/nagios/plugins/check_gearman -H 10.18.0.49 -q worker_`hostname` -t 10 -s check
check_gearman OK - gearman-01 has 5 worker and is working on 0 jobs. Version: 1.4.14|worker=5;;;5;200 jobs=496c
Note that the job counter, 496c, seems to go up only because of each check_gearman run I did...
(Yes... I've already run that 496 times... :/ )
Any ideas? Help?