'Can't contact LDAP server' (0xffffffff)

Matthew Slowe

Oct 1, 2012, 6:03:51 AM
to <simplesamlphp@googlegroups.com>
Morning all,

Last week we started seeing the below [anonymised] error (and suffixed stack trace) for some users during logon:

Oct 01 10:40:43 simplesamlphp ERROR [6447140dca] Library - LDAP search(): Failed search on base 'o=uni' for '(|(uid=LOGIN)(mail=LOGIN))'; cause: 'Can't contact LDAP server' (0xffffffff)

This doesn't happen to everyone, but it often hits the same person repeatedly and then, suddenly, works.

Our IdP implementation is as follows:

* 2 x Solaris 10 running
- non-standard Apache 2.2.21 + PHP 5.3.8 as mod_php
- SimpleSAMLphp 1.9.2
- LDAP authsource configured as LDAPS
- memcached 1.4.10 for session sharing
- Sun LDAP Sun-Java(tm)-System-Directory/6.3.1 B2008.1121.0156 (64-bit)
- Load balanced behind a pair of Linux LVS servers in Direct Routing mode with hash based sticky routing

The LDAP instances run on the same Solaris hosts and aren't exhibiting any other problems (they are also load balanced, and the load balancer service checks are not showing any problems), nor are they logging anything obviously bad.

More details available if it would be helpful.

It started on 25th September and we've not seen it before (grep -c 0xffff simplesamlphp.log.2012-09-2* across both nodes):

simplesamlphp.log.2012-09-20:0 simplesamlphp.log.2012-09-20:0
simplesamlphp.log.2012-09-21:0 simplesamlphp.log.2012-09-21:0
simplesamlphp.log.2012-09-22:0 simplesamlphp.log.2012-09-22:0
simplesamlphp.log.2012-09-23:0 simplesamlphp.log.2012-09-23:0
simplesamlphp.log.2012-09-24:0 simplesamlphp.log.2012-09-24:0
simplesamlphp.log.2012-09-25:219 simplesamlphp.log.2012-09-25:14
simplesamlphp.log.2012-09-26:63 simplesamlphp.log.2012-09-26:143
simplesamlphp.log.2012-09-27:5 simplesamlphp.log.2012-09-27:325
simplesamlphp.log.2012-09-28:149 simplesamlphp.log.2012-09-28:36
simplesamlphp.log.2012-09-29:198 simplesamlphp.log.2012-09-29:0
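
For anyone who wants to reproduce these per-day counts on a single node without the shell, here is a minimal Python sketch of what `grep -c` does above. The function name `count_matching_lines` and the log directory are illustrative assumptions, not part of the setup described in this post.

```python
from pathlib import Path


def count_matching_lines(log_dir, filename_glob, needle):
    """Return {filename: number of lines containing needle}, like `grep -c`."""
    counts = {}
    for path in sorted(Path(log_dir).glob(filename_glob)):
        # errors="replace" keeps the count going if a log has odd bytes in it.
        with path.open(errors="replace") as handle:
            counts[path.name] = sum(1 for line in handle if needle in line)
    return counts


if __name__ == "__main__":
    # Assumption: the rotated logs live in the current directory.
    pattern = "simplesamlphp.log.2012-09-2*"
    for name, hits in count_matching_lines(".", pattern, "0xffff").items():
        print(f"{name}:{hits}")
```

Like `grep -c`, this counts matching lines rather than individual matches, so a line that contains the code twice is counted once.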

The only thing that "changed" on any of these systems is that memcached was restarted one node at a time (to increase available memory from 32M to 256M), with the two restarts more than 8h (the maximum session time) apart. Since then we have also restarted Apache and LDAP on both nodes to rule out "long running oddities".

Any suggestions would be greatly appreciated!



Full example session log:

# grep 6447140dca simplesamlphp.log
Oct 01 10:40:38 simplesamlphp INFO [6447140dca] SAML2.0 - IdP.SSOService: Accessing SAML 2.0 IdP endpoint SSOService
Oct 01 10:40:39 simplesamlphp INFO [6447140dca] SAML2.0 - IdP.SSOService: Incomming Authentication request: 'https://jordan.kent.ac.uk/_sp'
Oct 01 10:40:43 simplesamlphp ERROR [6447140dca] SimpleSAML_Error_Exception: Error 2 - ldap_search() [<a href='function.ldap-search'>function.ldap-search</a>]: Search: Can't contact LDAP server
Oct 01 10:40:43 simplesamlphp ERROR [6447140dca] Backtrace:
Oct 01 10:40:43 simplesamlphp ERROR [6447140dca] 8 /www/lib/simplesamlphp-1.9.2/www/_include.php:70 (SimpleSAML_error_handler)
Oct 01 10:40:43 simplesamlphp ERROR [6447140dca] 7 [builtin] (ldap_search)
Oct 01 10:40:43 simplesamlphp ERROR [6447140dca] 6 /www/lib/simplesamlphp-1.9.2/lib/SimpleSAML/Auth/LDAP.php:203 (SimpleSAML_Auth_LDAP::search)
Oct 01 10:40:43 simplesamlphp ERROR [6447140dca] 5 /www/lib/simplesamlphp-1.9.2/lib/SimpleSAML/Auth/LDAP.php:264 (SimpleSAML_Auth_LDAP::searchfordn)
Oct 01 10:40:43 simplesamlphp ERROR [6447140dca] 4 /www/lib/simplesamlphp-1.9.2/modules/ldap/lib/ConfigHelper.php:187 (sspmod_ldap_ConfigHelper::login)
Oct 01 10:40:43 simplesamlphp ERROR [6447140dca] 3 /www/lib/simplesamlphp-1.9.2/modules/ldap/lib/Auth/Source/LDAP.php:52 (sspmod_ldap_Auth_Source_LDAP::login)
Oct 01 10:40:43 simplesamlphp ERROR [6447140dca] 2 /www/lib/simplesamlphp-1.9.2/modules/core/lib/Auth/UserPassBase.php:176 (sspmod_core_Auth_UserPassBase::handleLogin)
Oct 01 10:40:43 simplesamlphp ERROR [6447140dca] 1 /www/lib/simplesamlphp-1.9.2/modules/core/www/loginuserpass.php:49 (require)
Oct 01 10:40:43 simplesamlphp ERROR [6447140dca] 0 /www/lib/simplesamlphp-1.9.2/www/module.php:135 (N/A)
Oct 01 10:40:43 simplesamlphp ERROR [6447140dca] Library - LDAP search(): Failed search on base 'o=uni' for '(|(uid=LOGIN)(mail=LOGIN))'; cause: 'Can't contact LDAP server' (0xffffffff)
Oct 01 10:40:55 simplesamlphp NOTICE STAT [6447140dca] saml20-idp-SSO-first https://jordan.kent.ac.uk/_sp https://sso.id.kent.ac.uk/idp NA
Oct 01 10:40:55 simplesamlphp NOTICE STAT [6447140dca] saml20-idp-SSO https://jordan.kent.ac.uk/_sp https://sso.id.kent.ac.uk/idp NA
Oct 01 10:40:55 simplesamlphp INFO [6447140dca] Sending SAML 2.0 Response to 'https://jordan.kent.ac.uk/_sp'


--
Matthew Slowe
Server Infrastructure Team e: m.s...@kent.ac.uk
IS, University of Kent t: +44 (0)1227 824265
Canterbury, UK w: www.kent.ac.uk

Matthew Slowe

Oct 3, 2012, 5:11:09 AM
to <simplesamlphp@googlegroups.com>
Bump. Still seeing these, any pointers would be handy.

Ta,
foo

Thijs Kinkhorst

Oct 3, 2012, 6:16:22 AM
to simple...@googlegroups.com
Hi Matthew,

On Wed, 3 Oct 2012 09:11:09 +0000, Matthew Slowe <M.S...@kent.ac.uk>
wrote:
> Bump. Still seeing these, any pointers would be handy.

My pointer would be in the direction of the LDAP server. SSP is clear in
what happens: the LDAP server refused the connection. Perhaps you can
increase the LDAP server log level or use tcpdump to investigate exactly
what happens when the problem occurs.

I did see that you use LDAPS but the LDAP server is on localhost, right?
That seems like an unnecessary overhead.


Cheers,
Thijs
--
Thijs Kinkhorst <th...@uvt.nl> – LIS Unix

Universiteit van Tilburg – Library and IT Services
Bezoekadres > Warandelaan 2 • Tel. 013 466 3035 • G 236

Matthew Slowe

Oct 3, 2012, 6:24:11 AM
to <simplesamlphp@googlegroups.com>

On 3 Oct 2012, at 11:16, Thijs Kinkhorst <th...@uvt.nl> wrote:

> Hi Matthew,
>
> On Wed, 3 Oct 2012 09:11:09 +0000, Matthew Slowe <M.S...@kent.ac.uk>
> wrote:
>> Bump. Still seeing these, any pointers would be handy.
>
> My pointer would be in the direction of the LDAP server. SSP is clear in
> what happens: the LDAP server refused the connection. Perhaps you can
> increase the LDAP server log level or use tcpdump to investigate exactly
> what happens when the problem occurs.

That was my first thought, however there are no other systems (local or otherwise) having issues talking to LDAP and LDAP itself isn't logging anything. My next thought was that PHP has an annoying habit of returning "some other error" when it runs out of things to say so the "Can't contact LDAP server" may just be a PHP_OUT_OF_ERRORS error (yes, I know this is speculation).

> I did see that you use LDAPS but the LDAP server is on localhost, right?
> That seems like an unnecessary overhead.

Yes, overhead -- I was considering turning this off to see if the problem went away.

Thijs Kinkhorst

Oct 3, 2012, 6:35:15 AM
to simple...@googlegroups.com
On Wed, 3 Oct 2012 10:24:11 +0000, Matthew Slowe <M.S...@kent.ac.uk>
wrote:
> On 3 Oct 2012, at 11:16, Thijs Kinkhorst <th...@uvt.nl> wrote:
>
>> Hi Matthew,
>>
>> On Wed, 3 Oct 2012 09:11:09 +0000, Matthew Slowe <M.S...@kent.ac.uk>
>> wrote:
>>> Bump. Still seeing these, any pointers would be handy.
>>
>> My pointer would be in the direction of the LDAP server. SSP is clear in
>> what happens: the LDAP server refused the connection. Perhaps you can
>> increase the LDAP server log level or use tcpdump to investigate exactly
>> what happens when the problem occurs.
>
> That was my first thought, however there are no other systems (local or
> otherwise) having issues talking to LDAP and LDAP itself isn't logging
> anything. My next thought was that PHP has an annoying habit of returning
> "some other error" when it runs out of things to say so the "Can't contact
> LDAP server" may just be a PHP_OUT_OF_ERRORS error (yes, I know this is
> speculation).

This is not PHP: the error is generated by libldap and occurs literally in
its source. It also matches with 0xffffffff which is the hex representation
of libldap's error code '-1' for this error. You may grep the libldap
source for the LDAP_SERVER_DOWN constant to find exactly which code paths
generate the error, but they all seem to boil down to a tcp connection
problem.
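
The match between '-1' and 0xffffffff can be checked with a short Python sketch. It assumes only what is stated above, namely that the error code is -1 and that it is printed as an unsigned 32-bit value. The helper names are illustrative.

```python
import struct

# Value given in this thread for "Can't contact LDAP server".
LDAP_SERVER_DOWN = -1


def as_unsigned_32(value):
    """Reinterpret a signed integer as an unsigned 32-bit value."""
    return value & 0xFFFFFFFF


def as_signed_32(value):
    """Reinterpret an unsigned 32-bit value as a signed 32-bit integer."""
    return struct.unpack("<i", struct.pack("<I", value))[0]


print(hex(as_unsigned_32(LDAP_SERVER_DOWN)))  # prints 0xffffffff
print(as_signed_32(0xFFFFFFFF))               # prints -1
```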

Matthew Slowe

Oct 3, 2012, 6:36:13 AM
to <simplesamlphp@googlegroups.com>
On 3 Oct 2012, at 11:35, Thijs Kinkhorst <th...@uvt.nl>
wrote:
>>
>> That was my first thought, however there are no other systems (local or
>> otherwise) having issues talking to LDAP and LDAP itself isn't logging
>> anything. My next thought was that PHP has an annoying habit of returning
>> "some other error" when it runs out of things to say so the "Can't contact
>> LDAP server" may just be a PHP_OUT_OF_ERRORS error (yes, I know this is
>> speculation).
>
> This is not PHP: the error is generated by libldap and occurs literally in
> its source. It also matches with 0xffffffff which is the hex representation
> of libldap's error code '-1' for this error. You may grep the libldap
> source for the LDAP_SERVER_DOWN constant to find exactly which code paths
> generate the error, but they all seem to boil down to a tcp connection
> problem.

Ok, thanks :(

Olav Morken

Oct 3, 2012, 6:37:30 AM
to simple...@googlegroups.com
On Wed, Oct 03, 2012 at 10:24:11 +0000, Matthew Slowe wrote:
>
> On 3 Oct 2012, at 11:16, Thijs Kinkhorst <th...@uvt.nl> wrote:
>
> > Hi Matthew,
> >
> > On Wed, 3 Oct 2012 09:11:09 +0000, Matthew Slowe <M.S...@kent.ac.uk>
> > wrote:
> >> Bump. Still seeing these, any pointers would be handy.
> >
> > My pointer would be in the direction of the LDAP server. SSP is clear in
> > what happens: the LDAP server refused the connection. Perhaps you can
> > increase the LDAP server log level or use tcpdump to investigate exactly
> > what happens when the problem occurs.
>
> That was my first thought, however there are no other systems (local or otherwise) having issues talking to LDAP and LDAP itself isn't logging anything. My next thought was that PHP has an annoying habit of returning "some other error" when it runs out of things to say so the "Can't contact LDAP server" may just be a PHP_OUT_OF_ERRORS error (yes, I know this is speculation).
>
> > I did see that you use LDAPS but the LDAP server is on localhost, right?
> > That seems like an unnecessary overhead.
>
> Yes, overhead -- I was considering turning this off to see if the problem went away.

One thing to keep in mind is that PHP (actually the OpenLDAP library)
returns this error when there is a problem setting up the SSL
connection to the LDAP server. (E.g. due to certificate problems.)
Trying to turn it off as a debug measure may be useful.
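
Since the same error can come from either the TCP connect or the SSL setup, a small probe that reports the two stages separately may help narrow it down. This is a minimal Python sketch using only the standard library, independent of PHP and libldap. The function name `probe_ldaps`, the default port 636 and the timeout are assumptions for illustration. It verifies the certificate against the system trust store, which may differ from what the LDAP client library is configured to trust.

```python
import socket
import ssl


def probe_ldaps(host, port=636, timeout=5.0):
    """Return 'tcp-failed', 'tls-failed' or 'ok' for an LDAPS endpoint."""
    try:
        raw = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # Refused, unreachable or timed out before any TLS was attempted.
        return "tcp-failed"
    with raw:
        context = ssl.create_default_context()
        try:
            with context.wrap_socket(raw, server_hostname=host):
                return "ok"
        except OSError:
            # ssl.SSLError (including certificate errors) is an OSError subclass.
            return "tls-failed"


if __name__ == "__main__":
    # Assumption: the directory server listens for LDAPS on localhost:636.
    print(probe_ldaps("localhost"))
```

Running it in a loop during a period when logins fail would show whether the failures are at the TCP level or in the SSL handshake.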


Best regards,
Olav Morken
UNINETT / Feide

Matthew Slowe

Oct 3, 2012, 6:46:57 AM
to <simplesamlphp@googlegroups.com>
On 3 Oct 2012, at 11:37, Olav Morken <olav....@uninett.no>
wrote:

>> Yes, overhead -- I was considering turning this off to see if the problem went away.
>
> One thing to keep in mind is that PHP (actually the OpenLDAP library)
> returns this error when there is a problem setting up the SSL
> connection to the LDAP server. (E.g. due to certificate problems.)
> Trying to turn it off as a debug measure may be useful.


That's the sort of PHP_OUT_OF_ERRORS thing I was thinking about.