[Shib-Users] shibd stops responding to mod_shib socket communication overnight

Jason Porritt

Mar 5, 2010, 11:05:20 AM
to shibbole...@internet2.edu
Hello,

We've run into a problem with our Shibboleth SP / mod_shib
installation and need some help. Overnight, mod_shib and the
Shibboleth daemon (shibd) stop communicating and we haven't been able
to find a cause. Details follow.

Our setup
======================
OS: Red Hat Enterprise Linux 4
Shibboleth: 2.3.1 from shibboleth-2.3.1-1.2.i386.rpm
Apache: 2.0

We are serving as a Service Provider, dealing with an external IdP.
Apache and Shibboleth are running on the same server, communicating
through a Unix socket (not TCP). Our Apache config and Shibboleth
config work correctly with the IdP to provide SSO when shibd and
mod_shib agree to communicate, which is most of the time.

Symptoms
======================
Shibboleth SSO works great for a while after a restart of the
Shibboleth daemon. I can hit /Shibboleth.sso/Status and get a good
response. Then, at some point during the night, we start seeing
errors stating "Cannot connect to shibd process". This continues
until the Shibboleth daemon is restarted, and then it works again for
some time.

I set up a script to monitor the /Shibboleth.sso/Status location last
night and was able to pinpoint the 5-minute window when the failures
began. I haven't found any obvious causes from the logs, but here are
some log snippets from that time.
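
For anyone wanting to reproduce this kind of monitoring, here is a minimal sketch of a polling check against the Status handler. The host name, the 5-minute interval, and the choice to treat anything other than HTTP 200 as a failure are all assumptions for illustration; this is not the script that was actually used, and inspecting the response body may be a better test than the status code alone.

```python
import time
import urllib.request
from datetime import datetime

# Hypothetical host; substitute the SP's real base URL.
STATUS_URL = "https://sp.example.org/Shibboleth.sso/Status"


def check_status(url=STATUS_URL, timeout=10.0):
    """Return True if the handler answers with HTTP 200, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # urllib's URLError and HTTPError are both subclasses of OSError,
        # so this covers refused connections, timeouts, and HTTP errors.
        return False


def monitor(check=check_status, interval=300.0, rounds=1,
            log=print, sleep=time.sleep):
    """Run `check` `rounds` times, logging a timestamped result each time."""
    results = []
    for i in range(rounds):
        ok = check()
        stamp = datetime.now().isoformat(timespec="seconds")
        log(f"{stamp} status={'OK' if ok else 'FAIL'}")
        results.append(ok)
        if i + 1 < rounds:
            sleep(interval)
    return results
```

Calling `monitor(rounds=288)` would cover roughly 24 hours at the assumed interval; the first FAIL timestamp in the output narrows down when the failures began.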

shibd.log shows the following lines, and then complete silence until restarted.
-----------------
2010-03-05 03:07:20 INFO Shibboleth.Listener [91]: detected socket closure, shutting down worker thread
2010-03-05 03:17:17 INFO Shibboleth.Listener [94]: detected socket closure, shutting down worker thread
2010-03-05 03:28:50 INFO Shibboleth.Listener [93]: detected socket closure, shutting down worker thread

native.log shows the following, with the same failures repeating until restarted.
-----------------
2010-03-05 03:17:17 INFO Shibboleth.Config : shibboleth 2.3.1 library shutting down
2010-03-05 03:17:17 INFO Shibboleth.SessionCache : cleanup thread exiting
2010-03-05 03:17:17 INFO XMLTooling.XMLToolingConfig : xmltooling 1.3.3 library shutdown complete
2010-03-05 03:17:17 INFO Shibboleth.Config : shibboleth 2.3.1 library shutdown complete
2010-03-05 03:17:17 INFO Shibboleth.Config : Shibboleth SP Version 2.3.1
2010-03-05 03:17:17 INFO Shibboleth.Config : Library versions: Xerces-C 3.0.1, XMLTooling-C 1.3.3, Shibboleth 1.3.1
2010-03-05 03:17:17 INFO Shibboleth.Config : building ListenerService of type UnixListener...
2010-03-05 03:17:17 INFO Shibboleth.Config : building SessionCache of type StorageService...
2010-03-05 03:17:17 INFO Shibboleth.Config : building RequestMapper of type Native...
2010-03-05 03:17:17 INFO Shibboleth.SessionCache : cleanup thread started...run every 900 secs; timeout after 900 secs
2010-03-05 03:20:01 ERROR Shibboleth.Listener [25158] shib_handler: socket call resulted in error (13): no message
2010-03-05 03:20:01 WARN Shibboleth.Listener [25158] shib_handler: cannot connect socket (15)...retrying
2010-03-05 03:20:01 INFO Shibboleth.Config : Shibboleth SP Version 2.3.1
2010-03-05 03:20:01 INFO Shibboleth.Config : Library versions: Xerces-C 3.0.1, XMLTooling-C 1.3.3, Shibboleth 1.3.1
2010-03-05 03:20:01 INFO Shibboleth.Config : building ListenerService of type UnixListener...
2010-03-05 03:20:01 INFO Shibboleth.Config : building SessionCache of type StorageService...
2010-03-05 03:20:01 INFO Shibboleth.Config : building RequestMapper of type Native...
2010-03-05 03:20:01 INFO Shibboleth.SessionCache : cleanup thread started...run every 900 secs; timeout after 900 secs
2010-03-05 03:20:03 ERROR Shibboleth.Listener [25158] shib_handler: socket call resulted in error (13): no message
2010-03-05 03:20:03 WARN Shibboleth.Listener [25158] shib_handler: cannot connect socket (15)...retrying
2010-03-05 03:20:07 ERROR Shibboleth.Listener [25158] shib_handler: socket call resulted in error (13): no message
2010-03-05 03:20:07 WARN Shibboleth.Listener [25158] shib_handler: cannot connect socket (15)...
2010-03-05 03:20:07 CRIT Shibboleth.Listener [25158] shib_handler: socket server unavailable, failing

========================

If anyone has experienced this, we'd love to hear how it was resolved.

Thanks,
Jason Porritt

Scott Cantor

Mar 5, 2010, 12:10:29 PM
to shibbole...@internet2.edu
> We've run into a problem with our Shibboleth SP / mod_shib
> installation and need some help. Overnight, mod_shib and the
> Shibboleth daemon (shibd) stop communicating and we haven't been able
> to find a cause. Details follow.

If you're using the UnixListener, my guess is something is sweeping and
destroying the inbound socket file.

> shibd.log shows the following lines, and then complete silence until
> restarted.

That's normal; it's just existing bound sockets cleaning up when Apache
quiets down.

> ERROR Shibboleth.Listener [25158] shib_handler: socket call resulted in
> error (13): no message

I'm not immediately finding the header file to tell me what system error 13
is, but you could check for that also.
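
For reference, the error number can be looked up without hunting for the header file. The sketch below uses Python's standard `errno` and `os` modules; the helper name `describe_errno` is made up for illustration. The mapping is platform-specific, but on Linux error 13 is EACCES (permission denied).

```python
import errno
import os


def describe_errno(code):
    """Return the symbolic name and system message for an errno value."""
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name}: {os.strerror(code)}"


# The value reported in native.log above.
print(describe_errno(13))
```

A permission error on connect points at the mode or ownership of the socket file, or of a directory leading to it, rather than at the file having been deleted.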

-- Scott


Jason Porritt

Mar 9, 2010, 10:29:06 AM
to shibbole...@internet2.edu
You were correct -- there was a script being run that was turning off
world-write on the socket file. The cleanup script now ignores that
file, and Shibboleth has been running without any problems since.
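
In case it helps anyone who lands here with the same symptom, this is a rough sketch of a check that the socket file still has world-write set. The function name is hypothetical, and the right path depends on the installation (take it from the listener settings in your own shibboleth2.xml rather than assuming one).

```python
import os
import stat


def socket_is_world_writable(path):
    """Return True if `path` is a Unix socket with the world-write bit set."""
    mode = os.stat(path).st_mode
    if not stat.S_ISSOCK(mode):
        raise ValueError(f"{path} is not a Unix socket")
    # S_IWOTH is the "other" write bit, which the cleanup script had cleared.
    return bool(mode & stat.S_IWOTH)
```

Running this from cron shortly after any nightly cleanup job would flag the change before users start seeing "Cannot connect to shibd process".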

Thanks for the help,
--Jason
