Hello Brent Rolland,
on Thursday, 24 December 2015 at 01:06 you wrote:
> We just completed an upgrade to Bugzilla 4.4.6 (from 3.6.2) on
> CentOS 6.2.
Any reasons for not using the latest stable version 5?
> The OS shows very light loads[...]
Does that apply only to CPU, or to I/O as well?
> We're using sendmail to handle outbound notification e-mail - and having a problem ...
Please describe your configuration in more detail: which mail delivery
method is configured in Bugzilla, and how exactly do you run
sendmail? It sounds like you run it as a processing daemon with its
own queue and not just as a command-line tool that hands messages to,
e.g., postfix, which in my experience would be more common these days.
> With an empty bugmail queue, email will move along reasonably well
> for awhile, then suddenly (no obvious errors or cause) sendmail
> simply stops processing the queue.
> If we restart sendmail, it will pick up 10 - 15 messages and process them, then stop again.
> If we empty the bugmail queue (clear the ts_id table), messages
> will be processed for awhile, and then stop.
This sounds like some bottleneck in sendmail that leads to a
deadlock. In your case I would first enable or increase logging in
jobqueue.pl (-d, look at the file) and in sendmail (LogLevel); if you
have already done that, please share some of the information from
those logs. Also find out the maximum number of workers configured for
sendmail (ForkEachJob, MaxDaemonChildren, ConnectionRateThrottle) and
for jobqueue.pl. I think the latter uses exactly one worker per
running instance of jobqueue.pl, because I didn't see any
configuration for it and JobQueue::subprocess_worker reads that way,
but I may be wrong, or you may be running multiple instances. In
debugging mode the logs should tell you how many processes were
spawned and in which order.
https://www.bugzilla.org/docs/tip/en/html/api/jobqueue.html
http://www.sendmail.org/~ca/email/man/sendmail.html
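To collect the sendmail settings named above in one go, a small script can read them out of the configuration file. This is only a sketch: it assumes options appear as active lines of the form "O Name=value" and that lines starting with "#" are commented out. The function name read_options is made up for illustration.

```python
import re

# Options named in the message above.
OPTIONS_OF_INTEREST = ("LogLevel", "ForkEachJob", "MaxDaemonChildren",
                       "ConnectionRateThrottle")

# Assumption: active options look like "O MaxDaemonChildren=12".
# Lines starting with "#" do not match and are therefore skipped.
_OPTION_LINE = re.compile(r"^O\s+([A-Za-z]+)\s*=\s*(.*?)\s*$")


def read_options(cf_text, names=OPTIONS_OF_INTEREST):
    """Return {option: value or None} for the requested option names."""
    found = {name: None for name in names}
    for line in cf_text.splitlines():
        match = _OPTION_LINE.match(line)
        if match and match.group(1) in found:
            found[match.group(1)] = match.group(2)
    return found
```

Feed it the contents of your sendmail.cf (often /etc/mail/sendmail.cf, but check your installation). A value of None means the option is not set explicitly, so sendmail's built-in default applies.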
Additionally, don't just clear the Bugzilla queue; instead, look at
the table data for error codes, exit status and delays per job when
your mails get stuck. I don't see where the delays are stored, but
Bugzilla clearly applies some in case of errors. There's also a rather
long timeout of 5 minutes until a job is recognized as failed, so with
only one worker it may simply take a long time before you see any
change in your jobs' status.
Bugzilla::Job::Mailer::retry_delay
http://search.cpan.org/~jfearn/TheSchwartz-1.12/lib/TheSchwartz/Job.pm#$job->failed(_$msg,_$exit_status_)
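Looking at the job data instead of deleting it could be done with a query along these lines. Treat it as a sketch under assumptions: the table and column names (ts_job, ts_error, run_after, grabbed_until and so on) follow the TheSchwartz schema with Bugzilla's "ts_" prefix as far as I know it, and failed_jobs is a made-up helper that accepts any Python DB-API connection. Verify the names against your own database first.

```python
# Assumption: TheSchwartz tables with Bugzilla's "ts_" prefix.
# All time columns are assumed to hold Unix timestamps.
FAILED_JOBS_SQL = """
SELECT j.jobid, j.insert_time, j.run_after, j.grabbed_until,
       e.error_time, e.message
FROM ts_job AS j
JOIN ts_error AS e ON e.jobid = j.jobid
ORDER BY e.error_time DESC
"""


def failed_jobs(conn):
    """Return one row per recorded error of a still-queued job, newest first."""
    cursor = conn.cursor()
    cursor.execute(FAILED_JOBS_SQL)
    return cursor.fetchall()
```

If the schema matches, a run_after value in the future would suggest a job is waiting out a retry delay, and a grabbed_until value in the future would suggest a worker still holds it.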
> If you simply attempt to send mail from (your favorite mail client)
> on the same host, that works fine.
But the interesting part is what happens if you feed mails to sendmail
very quickly, in particular faster than it can send them out.
jobqueue.pl is able to do that; a manually driven client surely is
not. It also matters how Bugzilla and your client each hand mails to
sendmail, and maybe even what happens if sendmail is already stuck and
you try to feed it some additional mails using your client.
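A burst like that can be reproduced without Bugzilla. The sketch below submits a number of test messages back to back and records the exit code and duration of each submission. It assumes the sendmail binary lives at /usr/sbin/sendmail and that handing messages to it on the command line resembles what your Bugzilla setup does, which depends on the configured delivery method. All function names are made up for illustration.

```python
import subprocess
import time
from email.message import EmailMessage

SENDMAIL = "/usr/sbin/sendmail"  # assumed path, adjust for your system


def build_message(index, sender, recipient):
    """Create a small, numbered test message."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = recipient
    message["Subject"] = f"burst test {index}"
    message.set_content(f"Test message number {index}.\n")
    return message


def submit_via_sendmail(message):
    """Hand one message to the sendmail binary and return its exit code."""
    # -t takes the recipients from the headers; -oi keeps a line with a
    # single "." from ending the message early.
    result = subprocess.run([SENDMAIL, "-t", "-oi"], input=message.as_bytes())
    return result.returncode


def send_burst(count, sender, recipient, submit=submit_via_sendmail):
    """Submit count messages back to back; return (exit_code, seconds) pairs."""
    results = []
    for index in range(count):
        started = time.monotonic()
        code = submit(build_message(index, sender, recipient))
        results.append((code, time.monotonic() - started))
    return results
```

Use a test recipient you control. If submissions start to take much longer or return non-zero exit codes after the first few dozen messages, that points at sendmail rather than at Bugzilla.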
> It's just the bugzilla queue that's getting stuck.
I don't think so. Instead, my guess is that Bugzilla at first feeds
mails to sendmail very quickly until sendmail gets stuck for some
reason. Then comes the interesting part: is sendmail still able to
queue the jobs from jobqueue.pl and just no longer sends them out, or
is it unable to queue new mails at all, so that jobqueue.pl waits 5
minutes per job before it gets an error? That should be reflected in
the status and exit codes of the jobs in Bugzilla's database, and in
whether the sendmail queue keeps growing when you query it.
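Whether the sendmail queue keeps growing can be checked by sampling its size a few times. The sketch below counts queue control files, assuming a flat queue directory in which each queued message has one file whose name starts with "qf". Setups with queue groups or subdirectories will differ, and the directory (for example /var/spool/mqueue) is something you need to confirm for your installation. Both function names are made up.

```python
import os


def queue_size(queue_dir):
    """Count queued messages by control files whose names start with "qf"."""
    return sum(1 for name in os.listdir(queue_dir) if name.startswith("qf"))


def trend(samples):
    """Classify a list of queue sizes as 'growing', 'draining' or 'steady'."""
    if len(samples) < 2 or samples[-1] == samples[0]:
        return "steady"
    return "growing" if samples[-1] > samples[0] else "draining"
```

Take a sample every minute or so while jobqueue.pl is running. A queue that grows while nothing goes out means sendmail still accepts mail but no longer delivers it; a steady queue combined with failing jobs in Bugzilla means sendmail is not even accepting new mail.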
Kind regards,
Thorsten Schöning
--
Thorsten Schöning E-Mail: Thorsten....@AM-SoFT.de
AM-SoFT IT-Systeme
http://www.AM-SoFT.de/
Phone.............05151- 9468- 55
Fax...............05151- 9468- 88
Mobile.............0178-8 9468- 04
AM-SoFT GmbH IT-Systeme, Brandenburger Str. 7c, 31789 Hameln
AG Hannover HRB 207 694 - Managing Director: Andreas Muchow