Mod-Gearman 3.0b1, call for testers

Sven Nierlein

Apr 15, 2016, 10:02:50 AM
to mod_gearman
Hi list,

the next release will be a major release, so a beta package is available to hopefully catch
some last glitches before the actual release. The 3.0.0b1 packages can be downloaded here
http://mod-gearman.org/download/v3.0.0b1/
or installed via the Consol Labs Testing Repository at https://labs.consol.de/repo/testing/

The changelog says it all:

3.x
- support ipv6 in gearman_top and check_gearman
- introduce configure options --enable-naemon-neb-module, --enable-nagios3-neb-module and --enable-nagios4-neb-module
  to determine which NEB module to build. This obsoletes mod-gearman 1.x and 2.x and
  combines both into mod-gearman 3.x.
- moved perfdata=all to separate option perfdata_send_all=yes
- support multiple perfdata queues
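
For anyone carrying an existing configuration over to 3.x, the perfdata change in the changelog above can be sketched as a small rewrite of the config lines. This is a minimal illustration, not part of Mod-Gearman: the helper name `migrate_perfdata` is made up, and it assumes that `perfdata=yes` is the value that replaces `perfdata=all` once `perfdata_send_all=yes` is set.

```python
def migrate_perfdata(lines):
    """Rewrite an old-style `perfdata=all` line into the 3.x pair of options.

    Assumption: `perfdata=yes` plus `perfdata_send_all=yes` is the 3.x
    equivalent of the former `perfdata=all`. All other lines, including
    commented-out ones, are passed through unchanged.
    """
    out = []
    for line in lines:
        key, sep, value = line.partition("=")
        if sep and key.strip() == "perfdata" and value.strip() == "all":
            out.append("perfdata=yes")
            out.append("perfdata_send_all=yes")
        else:
            out.append(line)
    return out


if __name__ == "__main__":
    old_config = ["debug=0", "perfdata=all"]
    for line in migrate_perfdata(old_config):
        print(line)
```

The function only touches an exact `perfdata=all` setting, so running it on a config that has already been migrated leaves it as it is.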

The reason for the major release is to stop shipping separate mod-gearman 1 and mod-gearman 2 packages for the sole reason
of having 2 different NEB modules while the rest of the package is identical. So Mod-Gearman 3 ships the Naemon module, the Nagios 3
module and a new, separate Nagios 4 module.

So please, if you could test the new packages and provide some feedback, that would be great.

Btw, Mod-Gearman 3.0b1 is also already in the OMD-Labs nightly builds if that makes testing easier for you.

Cheers,
Sven

Jean-François Rameau

Apr 25, 2016, 10:24:53 AM
to mod_g...@googlegroups.com
Hi Sven,

my two omd-210 (master) + 2.11.20160414-labs-edition (worker gearman 3) have been working nicely since your announcement.

Medium load around 10 checks/s, CentOS 7.2 x64.

No crash, no memory leak, no weird logs.

Btw, I haven't played with the perfdata queues yet. I'll give it a try this week.

jfr


--
You received this message because you are subscribed to the Google Groups "mod_gearman" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mod_gearman...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Andreas Foerster

May 3, 2016, 11:06:26 AM
to mod_gearman
Hi Sven,

today I installed 3.0b1 on our naemon server (CentOS 7.2, x86_64) using your testing repo.
Installed naemon-releases (consol-labs testing repo):
libnaemon.x86_64                       1.0.4_20160325-1.el7.centos     @labs_consol_testing
naemon.x86_64                          1.0.4_20160325-1.el7.centos     @labs_consol_testing
naemon-core.x86_64                     1.0.4_20160325-1.el7.centos     @labs_consol_testing
naemon-core-dbg.x86_64                 1.0.4_20160325-1.el7.centos     @labs_consol_testing
naemon-devel.x86_64                    1.0.4_20160325-1.el7.centos     @labs_consol_testing
naemon-livestatus.x86_64               1.0.4_20160325-1.el7.centos     @labs_consol_testing
naemon-thruk.x86_64                    1.0.4_20160325-1.el7.centos     @labs_consol_testing
naemon-tools.x86_64                    1.0.4_20160325-1.el7.centos     @labs_consol_testing


When naemon starts up it throws a segfault error:

(/var/log/messages)
...
May  3 16:15:31 de01asr0053 systemd: Started Cluster Controlled naemon.
May  3 16:15:31 de01asr0053 kernel: naemon[12428]: segfault at 18 ip 00007fd68899c094 sp 00007ffd75e06748 error 4 in libnaemon.so.0.0.0[7fd68895f000+9d000]
May  3 16:15:31 de01asr0053 systemd: naemon.service: main process exited, code=dumped, status=11/SEGV
May  3 16:15:31 de01asr0053 systemd: Unit naemon.service entered failed state.
May  3 16:15:31 de01asr0053 systemd: naemon.service failed.
...
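
The kernel line above carries enough information to locate the faulting instruction inside the library: subtracting the mapping base (the number before the `+` in the brackets) from the `ip` value gives the offset into libnaemon.so.0.0.0. A minimal Python sketch of that arithmetic follows. The regular expression and the helper name `fault_offset` are illustrative only and assume the log format shown here.

```python
import re

# Matches the kernel's segfault report as it appears in /var/log/messages above.
KERNEL_SEGV = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) .* in "
    r"(?P<obj>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def fault_offset(line):
    """Return (object name, offset of ip inside the object, faulting address).

    Returns None if the line is not a segfault report or if the instruction
    pointer does not lie inside the reported mapping.
    """
    m = KERNEL_SEGV.search(line)
    if m is None:
        return None
    ip = int(m.group("ip"), 16)
    base = int(m.group("base"), 16)
    size = int(m.group("size"), 16)
    offset = ip - base
    if not 0 <= offset < size:
        return None
    return m.group("obj"), offset, int(m.group("addr"), 16)


if __name__ == "__main__":
    log_line = (
        "kernel: naemon[12428]: segfault at 18 ip 00007fd68899c094 "
        "sp 00007ffd75e06748 error 4 in libnaemon.so.0.0.0[7fd68895f000+9d000]"
    )
    obj, offset, addr = fault_offset(log_line)
    print(obj, hex(offset), hex(addr))
```

For the line above the offset works out to 0x3d094. An offset like this can be handed to a tool such as addr2line together with the library file, provided debug symbols are installed. The faulting address itself (0x18) is very small, which is often a sign of a struct field being read through a NULL pointer.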

After that I tried to build mod_gearman from scratch, same result.

Any idea?

Regards,
Andreas

Andreas Foerster

May 4, 2016, 5:52:47 AM
to mod_gearman
additional info about segfault (gdb backtrace):

...
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7b71094 in is_host_member_of_hostgroup () from /usr/lib64/naemon/libnaemon.so.0
(gdb) bt
#0  0x00007ffff7b71094 in is_host_member_of_hostgroup () from /usr/lib64/naemon/libnaemon.so.0
#1  0x00007ffff66af459 in set_target_queue (hst=hst@entry=0x642bd0, svc=svc@entry=0x0) at neb_module_naemon/../neb_module/mod_gearman.c:1106
#2  0x00007ffff66aff14 in handle_host_check (event_type=<optimized out>, data=0x7fffffffd9b0) at neb_module_naemon/../neb_module/mod_gearman.c:607
#3  0x00007ffff7b6807f in neb_make_callbacks () from /usr/lib64/naemon/libnaemon.so.0
#4  0x00007ffff7b485cb in broker_host_check () from /usr/lib64/naemon/libnaemon.so.0
#5  0x00007ffff7b4ccc6 in ?? () from /usr/lib64/naemon/libnaemon.so.0
#6  0x00007ffff7b4d1b7 in ?? () from /usr/lib64/naemon/libnaemon.so.0
#7  0x00007ffff7b609b7 in ?? () from /usr/lib64/naemon/libnaemon.so.0
#8  0x00007ffff7b60ece in event_poll () from /usr/lib64/naemon/libnaemon.so.0
#9  0x0000000000403345 in main ()



Sven Nierlein

May 4, 2016, 6:34:48 AM
to mod_g...@googlegroups.com
Thanks for the detailed backtrace. That's very helpful. Could you by any chance enable debug logging to
find out which host causes the segfault, and whether there is anything special about that host?

Thanks,
Sven

Andreas Foerster

May 4, 2016, 11:11:53 AM
to mod_gearman
Hi Sven,

I finally found the error by looking at the debug file: for "historic" reasons we had configured some hosts with "check_interval 0" to disable active checking of these hosts. Naemon does not complain about that without mod_gearman. As soon as mod_gearman is configured as a broker, the segfault occurs.

Anyway, if "check_interval=0" should not be used, Naemon should complain about it. If zero is still a valid check_interval, then in my opinion mod_gearman should be able to deal with it.
For now, we will change our config, I'll keep you informed if there are further issues using mod_gearman.
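
For others who want to check whether their own configuration contains such hosts, here is a minimal sketch that scans object configuration text for host definitions with a zero check_interval. It is a rough illustration under stated assumptions: the function names are made up, it only looks at directives written directly inside a `define host` block, and it does not resolve values inherited from templates via `use`.

```python
import re

DEFINE_RE = re.compile(r"^define\s+(\w+)\s*\{")


def _is_zero(value):
    """True if the directive value parses as the number zero."""
    try:
        return value is not None and float(value) == 0
    except ValueError:
        return False


def hosts_with_zero_interval(text):
    """Return names of host definitions that set check_interval to 0."""
    found = []
    current = None  # directives of the host block being read, else None
    for raw in text.splitlines():
        line = raw.split(";", 1)[0].strip()  # drop trailing ';' comments
        if not line or line.startswith("#"):
            continue
        m = DEFINE_RE.match(line)
        if m:
            current = {} if m.group(1) == "host" else None
            continue
        if line.startswith("}"):
            if current is not None and _is_zero(current.get("check_interval")):
                name = current.get("host_name") or current.get("name")
                found.append(name or "<unnamed>")
            current = None
            continue
        if current is not None:
            parts = line.split(None, 1)
            if len(parts) == 2:
                current[parts[0]] = parts[1].strip()
    return found


if __name__ == "__main__":
    sample = """
    define host {
        host_name       web01
        check_interval  0   ; active checks disabled the "historic" way
    }
    define host {
        host_name       db01
        check_interval  5
    }
    """
    print(hosts_with_zero_interval(sample))
```

Host templates that set the interval to zero are reported under their `name`, so hosts inheriting from them still have to be looked up by hand.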

Thanks for your help,
Andreas

Sven Nierlein

May 9, 2016, 4:07:21 AM
to mod_g...@googlegroups.com
Hi Andreas,

hmm, that sounds strange. I could easily skip hosts/services which have check_interval=0 set, but according to your backtrace, naemon dies in 'is_host_member_of_hostgroup()' and I fail to see any connection to the check_interval. Furthermore, this logic should already be done by naemon before it runs the broker callback. Could you raise an issue for naemon-core including your backtrace?

Cheers,
Sven

Andreas Foerster

May 9, 2016, 4:34:40 AM
to mod_gearman
Hi Sven,

naemon-core issue created today:

disabled host_checks and mod_gearman: seg_fault in 'is_host_member_of_hostgroup()' #131


Thanks,

Andreas


Marcos Soto

Aug 9, 2016, 4:48:48 AM
to mod_gearman
Hi Sven!!

mod_gearman-3 works fine with this architecture:
- CentOS Linux release 7.2.1511 (Core)
- naemon-1.0.6_20160803-1.el7.centos.x86_64
- thruk-2.09-20160802.x86_64
- gearmand-0.33-5.x86_64
- mod_gearman-3.0.0b1-1.el7.centos.x86_64

Thanks master you are the best.
See you!!

Sven Nierlein

Aug 22, 2016, 12:32:08 PM
to mod_gearman
Hi list,

I just uploaded beta 2 to
http://mod-gearman.org/download/v3.0.0b2/
also available via the Consol Labs Testing Repository at https://labs.consol.de/repo/testing/

Cheers,
Sven