Mod_Gearman freezing

Jonathan

Jul 8, 2018, 9:12:40 AM
to mod_gearman
I'm constantly getting this error in the gearmand log.

  ERROR 2018-06-08 13:05:32.000000 [4] lost connection to client recv (EPIPE || ECONNRESET || EHOSTDOWN) (Connection reset by peer) -> libgearman-server/io.cc:100
  ERROR 2018-06-08 13:05:32.000000 [4] closing connection due to previous errno error -> libgearman-server/io.cc:109
  ERROR 2018-06-08 13:06:42.000000 [4] lost connection to client recv (EPIPE || ECONNRESET || EHOSTDOWN) (Connection reset by peer) -> libgearman-server/io.cc:100
  ERROR 2018-06-08 13:06:42.000000 [4] closing connection due to previous errno error -> libgearman-server/io.cc:109



After a while gearmand locks up and only returns to normal after a restart.

I'm using v0.33. Any idea what might be causing this?

Tiago Augusto Furlaneto

Feb 21, 2019, 6:28:15 AM
to mod_gearman
Hello.

 I'm having the same problem.
 Our gearman is freezing constantly.

 Did you find any solutions to this?

Thanks.

Patrick Wolfe

Jul 25, 2021, 6:15:50 AM
to mod_gearman
Three years later, and we're having the same issue. v0.33 hangs at least once a week. I tried compiling v1.1.19.1, the latest, from source on our CentOS 7 Naemon+mod_gearman host, and it hangs even more often.

Has no one found a solution yet?

Sven Nierlein

Jul 26, 2021, 4:31:11 AM
to mod_g...@googlegroups.com, Patrick Wolfe
Hi,

With what options do you start your gearmand? By the way, I am in the process of moving the mod-gearman builds to the openSUSE Build Service, so there will soon be packages built against the gearmand from EPEL.

https://build.opensuse.org/project/show/home:naemon:daily

It would be great if you could evaluate whether those work for you.

Cheers,
Sven


Patrick Wolfe (whistl)

Jul 26, 2021, 7:30:50 AM
to Sven Nierlein, mod_g...@googlegroups.com
We’re currently running v0.33 (from EPEL) using:

/usr/sbin/gearmand -d --worker-wakeup=10 -R --log-file=/var/log/gearmand/gearmand.log

I’ve also compiled v1.1.19.1 from source (on CentOS 7) and tried it with:

/usr/local/sbin/gearmand --daemon -l /var/log/gearmand/gearmand.log --worker-wakeup=10 -R --verbose=WARNING

But v1.1.19.1 hangs much sooner than v0.33. v0.33 can last for days before it hangs; v1.1.19.1 hangs within a day. Yesterday it lasted approximately 9 hours before stopping. When it hangs, the daemon stays running, but gearadmin --status and gearman_top can’t connect, and Naemon+mod_gearman complains of no workers for any queue. Killing gearmand and restarting it resolves the issue.
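A hang of this kind (process alive, admin port not answering) can be detected from outside the daemon. What follows is only a minimal Python sketch and not something that was run on this host. It assumes gearmand listens on localhost on the default port 4730, and it uses the plain-text "status" admin command, whose reply ends with a line containing a single ".". The function name gearmand_responds and the 5-second timeout are illustrative choices.

```python
import socket


def gearmand_responds(host="127.0.0.1", port=4730, timeout=5.0):
    """Return True if gearmand answers the text 'status' command in time.

    Assumptions: default port 4730, and a reply terminated by a line
    holding only '.', as in the gearman admin text protocol.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            # The timeout applies to each send/recv, not to the whole exchange.
            sock.settimeout(timeout)
            sock.sendall(b"status\n")
            reply = b""
            while True:
                chunk = sock.recv(4096)
                if not chunk:
                    # Peer closed the connection before finishing the reply.
                    return False
                reply += chunk
                if reply == b".\n" or reply.endswith(b"\n.\n"):
                    return True
    except OSError:
        # Covers refused connections, resets, and socket timeouts.
        return False
```

Run periodically (from cron, for instance), a False result could be used to trigger the same kill-and-restart that is currently done by hand. That only works around the hang; it does not explain it.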

The logfile doesn’t seem to include any clearly useful messages. I tried running with --verbose=DEBUG but the logfile filled up the disk within a few hours. Last night’s hang was preceded by just two error messages:

ERROR 2021-06-26 10:07:18.000000 [ main ] write(Bad file descriptor) -> libgearman-server/gearmand_thread.cc:213

Line 213 of that file isn’t even a write statement, so I’m confused about what went wrong.

Yes, I know it’s currently July and the date in the message says June, but that date is what appeared last night just after 10pm EDT. The system clock is correct.
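Since a DEBUG-level log grows too large to read by hand, one option is to reduce it to a count of ERROR lines per source location. The Python sketch below is a rough illustration only: the regular expression is inferred from the log lines quoted in this thread, not from the gearmand source, and the names LINE_RE and error_sources are hypothetical.

```python
import re
from collections import Counter

# Pattern inferred from lines quoted in this thread, for example:
# ERROR 2021-06-26 10:07:18.000000 [ main ] write(Bad file descriptor) -> libgearman-server/gearmand_thread.cc:213
LINE_RE = re.compile(
    r"^\s*(?P<level>[A-Z]+)\s+"
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\s+"
    r"\[\s*(?P<thread>[^\]]*?)\s*\]\s+"
    r"(?P<msg>.*?)\s+->\s+(?P<src>\S+:\d+)\s*$"
)


def error_sources(lines, level="ERROR"):
    """Count log lines of the given level, keyed by 'file:line' location."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        # Lines that do not fit the assumed format are skipped silently.
        if match and match.group("level") == level:
            counts[match.group("src")] += 1
    return counts
```

Because the function accepts any iterable of lines, an open file object can be passed directly and the log is read one line at a time, e.g. error_sources(open("/var/log/gearmand/gearmand.log")).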