We're expanding use of LDAP as a data source, but my expertise lies
heavily on the RDBMS side and how pooling behaves there. I did some
initial experimenting with adding a connection pool and am seeing roughly
what I expected: constant LDAP connection resets when the pool validates
idle connections.
I'm just wondering these kinds of things:
- are pools necessary to get reasonable performance on highly loaded IdPs?
- do they handle failed connections reasonably without ever surfacing them
as actual data connector failures?
- are there ways to maintain connections and avoid the timeouts from the
client end?
- is pool validation even needed, or does it just retry on failures and
handle things gracefully?
I use LDAP as a secondary source right now, with no pooling yet, and it's
basically fine, though I don't know the actual load on the AD servers.
Last thing I want is to add pooling and have it cause failures (which
poorly implemented RDBMS pools do cause).
-- Scott
LDAP calls aren't any worse to establish than connections to a
database; one round trip in most cases, more if you're using StartTLS.
So that's not really an issue. You may want to "reserve" server
resources if your LDAP server is under heavy load. So, for your case,
I'd say you should use connection pooling for LDAP for the same reason
you would use it for a database: the number of rapid successive calls.
The LDAP data connector does allow you to customize what exactly
happens when a connection is closed unbeknownst to the IdP. The
default is to just retry, I believe. I think the only time you'll
ever see a connection failure is if your servers are actually
unreachable.
In terms of the timeout, there are some LDAP control extensions that
would allow you to adjust it (i.e., the client can tell the server how
long it would like the timeout to be), but as far as I know AD does not
support that.
--
Chad La Joie
www.itumi.biz
trusted identities, delivered
>The LDAP data connector does allow you to customize what exactly
>happens when a connection is closed unbeknownst to the IdP. The
>default is to just retry, I believe. I think the only time you'll
>ever see a connection failure is if your servers are actually
>unreachable.
Is that option documented? I didn't see anything that jumped out. Not
all of the pool options are exposed in the XML, for example, so some of
the validateOnBorrow-type settings aren't available.
http://code.google.com/p/vt-middleware/wiki/vtldapProperties
--
Chad La Joie
www.itumi.biz
trusted identities, delivered
The only time I occasionally see a failure is on a test IdP that gets
accessed once every few days. It sometimes fails to return any
attributes. The retry then works as normal.
Jim
Ryan
>A setting that is not specified on that page, but is useful to set, is
>"com.sun.jndi.ldap.read.timeout". It will save you when the IdP can
>connect to the directory server host, but the directory server process
>doesn't reply to the bind (because it's dead, etc). Otherwise, the IdP
>waits until the TCP connection times out, which is way too long.
Great, that's the sort of thing I was looking for.
I added a link in the wiki page to the VT-LDAP docs because I didn't
realize that the generic property syntax supported its settings; I
thought it only handled the JDK stuff.
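For example, since the generic property syntax passes the JDK/JNDI
properties through, the read timeout Ryan mentions can be set right on
the connector; the 5000 below (milliseconds, i.e. 5 seconds) is only an
illustrative value:
<!-- Sets the JNDI read timeout so a connected-but-dead directory process
     fails fast instead of waiting for the TCP timeout; the value is in
     milliseconds. -->
<LDAPProperty name="com.sun.jndi.ldap.read.timeout" value="5000" />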
--
Chad La Joie
www.itumi.biz
trusted identities, delivered
On 9/8/11 3:52 PM, "Daniel Fisher" <dfi...@vt.edu> wrote:
>
>- are there ways to maintain connections and avoid the timeouts from the
>client end?
>
>You should be able to configure periodic validation to guarantee your
>connections are always alive.
With AD, I'm seeing the connections close on the order of minutes, if not
sooner. That doesn't seem to fit with that strategy, so perhaps AD is
different.
I'll probably have to just try it under load and see how they behave. Even
if they close fast, constant use would keep them open.
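For reference, the periodic validation Daniel describes lives on the
connector's pooling element; a rough sketch might look like the
following. The attribute names mirror the vt-ldap pool settings
discussed in this thread and are my assumptions, and the values are only
illustrative; check the connector schema for what is actually exposed.
<!-- Hypothetical sketch: keep a small pool and validate idle connections
     every 30 minutes with a cheap search, rather than discovering dead
     connections at use time. Durations are shown as ISO-8601 periods;
     the schema may expect milliseconds instead. -->
<ConnectionPool minPoolSize="3" maxPoolSize="10"
                validatePeriodically="true"
                validateTimerPeriod="PT30M"
                validateFilter="(objectClass=*)" />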
On 9/8/11 3:02 PM, "Jim Fox" <f...@washington.edu> wrote:
>
>We use connection pooling for all our accesses to LDAP. We use TLS, and
>the overhead of starting up a new session on each query seemed excessive
>to me. Our openldap servers keep the sessions open all day.
Are you using any of the validation options in the pooling element? I see
the retry count defaults to 1 inside the vt-ldap code, so I'm sure no
matter what I do, it's just going to drop the failed connection and retry.
In cases like that, the problem is whether a closed connection hangs
(very common with database pools), but these don't seem to.
That at least does not seem to happen. It's not hung when the connections
are stale; it just doesn't seem to make much use of the pool. I am
obviously much more concerned about my IdP than about the AD server, so
if they can handle the load, I don't much care.
> This was with an older version of the IdP (2.1.5, right around when
>vt-ldap was first added).
>Due to an unrelated PeopleSoft LDAP-handling bug, we ended up lowering
>the idle timeout on our IdP to on the order of 2 minutes. This
>effectively made the LB timeout handling a non-issue.
Which idle timeout are you referring to, if you don't mind?
Yeah, that was a typo. I almost replied to my own message to correct but didn't want to clutter the list if no one cared.
The timeout was really the LDAP server's base idle session timeout.
PeopleSoft has (had?) a known bug where it creates lots and lots of orphan LDAP sessions. The suggested workaround was to reduce the idle timeout on LDAP sessions at the LDAP server. So this particular bug really had nothing to do with Shibboleth except that it's the same LDAP server that Shibboleth queries.
The rest was just trying to say that since making the change (and finally upgrading to 2.3.3, yay!) we haven't noticed any performance issues on the IdP.
--- Eric
Eric Goodman
er...@ucsc.edu
Hi Scott,
just my two cents here...
> - are pools necessary to get reasonable performance on highly loaded IdPs?
We have an IdP cluster and define three OpenLDAP servers within the
attribute resolver's DataConnector (for failover reasons, i.e. each
server has the same data). Initially, we did not define a
ConnectionPool. With this setup, we did some load testing and debugging.
When performing just one login at a time, and another one some seconds
later, the attribute resolver performs a costly LDAP BIND operation for
each of these logins (taking about 50 ms), because the LDAP connection
is closed shortly after attribute resolution. This is okay, as the few
users will not notice the 50 ms, and open connections are kept to a
minimum.
Then, if we fire lots of login requests at the IdP (using "The
Grinder"), the IdP seems to reuse connections even if no pool has been
explicitly defined (there is no LDAP BIND operation for each request).
This is just fine, too, as the administrator does not need to worry
about connection pools, and each individual user still gets good
performance.
For comparison, after the LDAP BIND, a single LDAP request just takes
about 3ms here.
A bottleneck, however, appears to be in the LDAP JAAS login module at
login time (see login.config). By default, the SearchDnResolver (which
resolves the user's DN according to the specified userFilter) never does
connection pooling, so the IdP always performs an LDAP BIND here where
it could keep the connection open. We replaced the SearchDnResolver with
a static one for test purposes, and our IdP cluster then handled about
twice as many logins per second. (The IdP is not "blocked" by the LDAP
BINDs, but maybe the number of threads or network connections hits a
limit here?)
This issue has been reported at:
http://code.google.com/p/vt-middleware/issues/detail?id=118
Please remember that the LDAP AUTHN request itself (in contrast to the
SearchDnResolver issue) will always require an LDAP BIND, as it must
bind as the user who is being authenticated.
> - do they handle failed connections reasonably without ever surfacing them
> as actual data connector failures?
This has not been an issue here yet. "The Grinder" reports approximately
20 errors per 10,000 logins, but I have never caught one in the browser
myself.
-Manuel
So to be explicit about how these properties are specified in the configs, would the following configure the vt-ldap connector to retry failed connections three times with a wait of 300 ms between retries?
<resolver:DataConnector
id="myLDAP"
xsi:type="LDAPDirectory" xmlns="urn:mace:shibboleth:2.0:resolver:dc"
ldapURL="ldap://blah.blah.blah"
baseDN="o=blah,c=US"
principal="cn=blah,ou=blabbityblah,o=blahblah,c=US"
principalCredential="blahblah" >
<FilterTemplate>
<![CDATA[
(uid=$requestContext.principalName)
]]>
</FilterTemplate>
<ReturnAttributes>blah1 blah2 blah3</ReturnAttributes>
<LDAPProperty name="edu.vt.middleware.ldap.operationRetry" value="3" />
<LDAPProperty name="edu.vt.middleware.ldap.operationRetryWait" value="300" />
</resolver:DataConnector>
On 9/8/11 4:14 PM, "Daniel Fisher" <dfi...@vt.edu> wrote:
>
>That sounds very aggressive, and would certainly discourage pooling.
>Perhaps they don't want you holding connections open? We configure
>keep-alive on the servers (OpenLDAP, not AD) to encourage it.
There's also a load balancer involved (per your other comment) that could
be affecting it.
One thing that didn't make sense to me was that the expirationTime setting
in the connector is documented as causing the pool to eject stale
connections once they're unused for that length of time. If that's shorter
than the validation interval, I wouldn't expect the background validator
to even try those connections and see that they're closed, since they
should have just expired by then.
-- Scott
>I think that's correct, if I understand what you're saying. A lower
>expirationTime will cause those connections to be removed before the
>validator even runs. So the connections won't be in the pool and
>therefore won't be validated.
I didn't explain it well, but what I'm seeing is that they do get
validated, even though they're all idle, and they all should have expired.
Otherwise there wouldn't be warnings about connection failures when it
attempts to validate them.
-- Scott
On 9/9/11 1:44 PM, "Daniel Fisher" <dfi...@vt.edu> wrote:
>
>Did you set the prune timer period to some value below the expiration
>time? One controls how often the pool is pruned, the other controls
>whether a connection should be pruned.
No, it's higher than the expiration period. The expiration is 5m and the
timer is 15m.
Each time it runs, I see one or two of the connections it's validating log
a Communications Failure.
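To make Daniel's distinction concrete, a sketch of the pooling element
with both timers might look like this. Again, the attribute names are my
assumptions based on the vt-ldap pool settings named in this thread, and
the durations are only illustrative; check the connector schema for what
is actually exposed.
<!-- Hypothetical sketch, not verified against the schema: connections
     idle longer than 5 minutes become eligible for removal, the prune
     task runs every 2 minutes, and the periodic validator (every 15
     minutes) should then only see live, recently used connections.
     Durations are shown as ISO-8601 periods; the schema may expect
     milliseconds instead. -->
<ConnectionPool expirationTime="PT5M"
                pruneTimerPeriod="PT2M"
                validatePeriodically="true"
                validateTimerPeriod="PT15M" />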
Oh, ok, I didn't realize it was separate, I thought that was the same
operation.
Yes, I guess it could be coming from the close operation. It's just noisy,
I didn't think it actually hurt anything.
If what Manuel said is true and it reuses connections intrinsically even
if I don't specify a pool, I'll probably just remove them in any case.
I am spinning off a different thread about the configuration of the
vt-ldap implementation of the LDAP data connector to keep the noise down
on the original thread.
Daniel,
How are the exceptions that are caught by the vt-ldap library specified in the configs? The vt-ldap docs (http://code.google.com/p/vt-middleware/wiki/vtldapProperties#Properties) show them being specified like so:
{CommunicationException, ServiceUnavailableException}
So I am assuming it expects javax.naming.* classes. Can you specify exception classes in other packages? Also, is it simply an instanceof check, i.e., can you specify and match on superclasses or interfaces?
Admittedly, these last two questions are academic, because I just need to retry on javax.naming.TimeLimitExceededException, so I am presuming I just need to specify it like this:
<LDAPProperty name="edu.vt.middleware.ldap.operationRetryExceptions" value="{CommunicationException, ServiceUnavailableException, TimeLimitExceededException}" />
Does that sound right?
Thanks!
yuji
----
On Thu, Sep 8, 2011 at 4:18 PM, Yuji Shinozaki <ys...@virginia.edu> wrote:
> So to be explicit about how these properties are specified in the configs, would the following configure the vt-ldap connector to retry failed connections three times with a wait of 300 ms between retries?
>
> <resolver:DataConnector ... >
> ...
> <LDAPProperty name="edu.vt.middleware.ldap.operationRetry" value="3" />
> <LDAPProperty name="edu.vt.middleware.ldap.operationRetryWait" value="300" />
> </resolver:DataConnector>
On Sep 9, 2011, at 9:34 AM, Daniel Fisher wrote:
> Correct. Note that by default the only exceptions that trigger retries are CommunicationException and ServiceUnavailableException.
>
> --Daniel Fisher
----
No, you didn't, you've just changed the subject line.
-peter
> <LDAPProperty name="edu.vt.middleware.ldap.operationRetryExceptions" value="javax.naming.CommunicationException,javax.naming.ServiceUnavailableException,javax.naming.TimeLimitExceededException" />
>
Thanks. That makes more sense. That syntax is unclear in the documentation.
> Are you sure this is what you want? An operation retry closes and reopens the connection, then presumably you'd get the same TimeLimitExceededException again. Note that LimitExceededException is ignored by the search result handler. So even if it occurs, you'll still get any results that were retrieved before the exception.
>
Ok. This may be a matter of upgrading our Shibboleth implementation (it is getting quite old), as the error results in an attribute resolution error bubbling all the way up. We are getting the time limit exceeded exceptions at irregular intervals, but with increasing frequency lately, so I was hoping a properly spaced retry would work around these brown-outs. Our campus LDAP admins have reassured us that they are throwing new hardware at the general problem, but of course that won't be for the proverbial "few weeks".
Thanks for the info.
Yuji
----
P.S. To Peter: sorry about the threading faux pas: I am dumb but not stupid: I was certain that I cut and pasted my reply as a fresh new message, but apparently I was mistaken. I just wish my mailer had better In-Reply-To controls.
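Pulling the pieces from this thread together, a connector that retries
on communication failures, service unavailability, and time limit
errors, waits between attempts, and fails fast on an unresponsive
directory process might look roughly like this. Hosts, DNs, attribute
names, and timeout values are placeholders, and the read-timeout line
assumes the generic JNDI property pass-through discussed earlier.
<resolver:DataConnector
id="myLDAP"
xsi:type="LDAPDirectory" xmlns="urn:mace:shibboleth:2.0:resolver:dc"
ldapURL="ldap://blah.blah.blah"
baseDN="o=blah,c=US"
principal="cn=blah,ou=blabbityblah,o=blahblah,c=US"
principalCredential="blahblah" >
<FilterTemplate>
<![CDATA[
(uid=$requestContext.principalName)
]]>
</FilterTemplate>
<ReturnAttributes>blah1 blah2 blah3</ReturnAttributes>
<!-- Retry up to 3 times, waiting 300 ms between attempts. -->
<LDAPProperty name="edu.vt.middleware.ldap.operationRetry" value="3" />
<LDAPProperty name="edu.vt.middleware.ldap.operationRetryWait" value="300" />
<!-- Fully qualified exception class names, per Daniel's correction above. -->
<LDAPProperty name="edu.vt.middleware.ldap.operationRetryExceptions" value="javax.naming.CommunicationException,javax.naming.ServiceUnavailableException,javax.naming.TimeLimitExceededException" />
<!-- JNDI read timeout in milliseconds (5 seconds here, purely illustrative). -->
<LDAPProperty name="com.sun.jndi.ldap.read.timeout" value="5000" />
</resolver:DataConnector>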