Running Thruk v1.76-2 on Ubuntu 12.04.2 LTS
We have a Thruk master with 9 backends configured (1 local, 8 remote). When one of the backends becomes unreachable (the network between us and them disappears), we get back a 500 server error, rather than an error message or Thruk ignoring that backend while still displaying results from the others. Thruk is then completely unusable until either the backend is reachable again or we remove it from the thruk_local.conf file and restart Thruk or Apache. This happens even if other backends are selected and working normally.
From /var/log/thruk/error.log:
[2013/09/03 14:26:22][master][ERROR][Thruk.Controller.error] No Backend available
[2013/09/03 14:26:22][master][ERROR][Thruk.Controller.error] on page: https://master.server.name/thruk/cgi-bin/status.cgi?host=all&_=1378175165120
[2013/09/03 14:26:22][master][ERROR][Thruk.Controller.error] remotehost: ERROR: failed to connect (remotehost:6557)
From /var/log/apache/error.log:
[Tue Sep 03 14:26:03 2013] [warn] [client 1.2.3.4] mod_fcgid: error reading data, FastCGI server closed connection, referer: https://master.server.name/thruk/side.html
[Tue Sep 03 14:26:03 2013] [error] [client 1.2.3.4] Premature end of script headers: fcgid_env.sh, referer: https://master.server.name/thruk/side.html
[Tue Sep 03 14:26:30 2013] [warn] [client 1.2.3.4] mod_fcgid: error reading data, FastCGI server closed connection
[Tue Sep 03 14:26:30 2013] [error] [client 1.2.3.4] Premature end of script headers: fcgid_env.sh
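To confirm that these failures coincide with a backend's Livestatus port being unreachable from the master, a quick TCP check along the following lines can be used. This is only a rough sketch and not part of Thruk: the helper name is_reachable and the backend list are hypothetical placeholders, and port 6557 is simply the one shown in the Thruk error log above.

```python
import socket


def is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds.

    Hypothetical helper for illustration; it only tests that the port accepts
    connections, not that Livestatus answers queries correctly.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


if __name__ == "__main__":
    # Placeholder list; substitute the peers configured in thruk_local.conf.
    backends = [("remotehost", 6557)]
    for host, port in backends:
        state = "reachable" if is_reachable(host, port) else "UNREACHABLE"
        print(f"{host}:{port} {state}")
```

Any backend reported as UNREACHABLE here should be the one that triggers the 500 error, assuming the behaviour described above.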
I hadn't seen this happen before updating Thruk to v1.76-2 this past weekend, so I suspect it could be related to this commit:
https://github.com/sni/Thruk/commit/0e4b54eba1c22391f4a03dffd909c985a627e9da (show error instead of empty result for a single failed instance)
As we monitor the remote hosts directly from our central master server, would setting check_local_states=1 and defining a state_host for each backend be a potential workaround? Or is this caused by Thruk now calling die() for all error conditions rather than just fatal ones?
--
Gavin Grieve