[Dspace-tech] Database Connection Timeout

George Stanley Kozak

unread,

Aug 25, 2015, 4:44:55 PM8/25/15

to dspac...@lists.sourceforge.net

Hi…

I am having a problem with DSpace 1.6.2. Every few times during the week, people are no longer able to add records and even going to the site causes “Internal Server Error”.

I see java processes that are using up memory and these run even after I shutdown tomcat and I have to manually “kill” them. Also, I have trouble shutting down PostGreSQL and need to “kill” a few processes before PostGreSQL will shutdown.

What I see in the dspace log is this:

2011-04-25 01:21:39,739 WARN org.dspace.app.webui.servlet.DSpaceServlet @ anonymous:no_context:database_error:org.apache.commons.dbcp.SQLNestedException\colon; Cannot get a connection, pool error\colon; Timeout waiting for idle object

org.apache.commons.dbcp.SQLNestedException: Cannot get a connection, pool error: Timeout waiting for idle object

at org.apache.commons.dbcp.PoolingDriver.connect(PoolingDriver.java:184)

at java.sql.DriverManager.getConnection(DriverManager.java:582)

at java.sql.DriverManager.getConnection(DriverManager.java:207)

at org.dspace.storage.rdbms.DatabaseManager.getConnection(DatabaseManager.java:620)

at org.dspace.core.Context.<init>(Context.java:126)

at org.dspace.app.webui.util.UIUtil.obtainContext(UIUtil.java:113)

…27 more

I have seen looking through the dspace-tech archives that people have reported the problem before, but I did not see any mention of a “fix”.

I am allowing 300 max_connections from PostGreSQL and my Java_opts for tomcat is:

JAVA_OPTS="-server -Xms1024m -Xmx2048m -Xmn64m -Dfile.encoding=UTF-8 -XX:+UseParallelGC -verbose:gc -Xloggc:/dspace/dspace/log/gc.log -XX:+HeapDumpOnOutOfMemoryError -XX:PermSize=1024m -XX:MaxPermSize=1024m -XX:-UseGCOverheadLimit"

Any help that I can get here will be GREATLY appreciated.

George Kozak

Digital Library Specialist

Cornell University Library Information Technologies (CUL-IT)

501 Olin Library

Cornell University

Ithaca, NY 14853

607-255-8924

Peter Dietz

unread,

Aug 25, 2015, 5:15:47 PM8/25/15

to George Stanley Kozak, dspac...@lists.sourceforge.net

Hi George,

Have you found any semi-resolution to "Timeout waiting for idle object"?

Our logs have too many lines:

ERROR org.dspace.app.xmlui.aspect.general.AuthenticatedSelector @ Error selecting based on authentication status: Cannot get a connection, pool error Timeout waiting for idle object

I've been watching pg_top for all of the connections that are open to postgres, and it shows dozens of processes that are idle / sleeping, and just two processes that are doing the work. In my head, I'm thinking that something has frozen the idle processes, because when a user does a flurry of activity (submitting an item, managing a workflow, download bitstreams), they'll get this timeout occasionally. They can usually hit refresh to redo what they where trying to do, and it will work fine. What I think the pool of connections should do is convert an idle process to an active process when a request comes in. However, it seems like it will just wait for one of the two busy processes to do the work for the next request.

Disclaimer: That is all just speculation thus far.

Just today, I've made the following changes, and thus far (three hours), I haven't seen the error crop back up. Too soon to claim victory, but better than the roughly 10/hour before.

 # Maximum number of idle connections in pool (-1 = unlimited)

-db.maxidle = -1

+db.maxidle = 10

 # Determine if prepared statement should be cached. (default is true)

-db.statementpool = true

+db.statementpool = false

My thinking for the changes was that there were too many lazy idle connections just taking up space, so I feel I should be limiting them.

For disabling the statement pool, I found a recommendation to touch that from an old dspace-tech message, so I'm trying it out too.

If I get 24 hours, 1 week, 1 month of error-free behavior, then I'll chalk up these changes, until then... just speculation.

An example of a stack trace that I get looks like:

Caused by: org.apache.cocoon.sitemap.PatternException: Cannot get a connection, pool error Timeout waiting for idle object

at org.dspace.app.xmlui.cocoon.ThemeMatcher.match(ThemeMatcher.java:145)

at org.apache.cocoon.sitemap.impl.DefaultExecutor.invokeMatcher(DefaultExecutor.java:68)

at org.apache.cocoon.components.treeprocessor.sitemap.MatchNode.invoke(MatchNode.java:71)

... 54 more

Caused by: org.apache.commons.dbcp.SQLNestedException: Cannot get a connection, pool error Timeout waiting for idle object

at org.apache.commons.dbcp.PoolingDataSource.getConnection(PoolingDataSource.java:114)

at org.dspace.storage.rdbms.DatabaseManager.getConnection(DatabaseManager.java:588)

at org.dspace.core.Context.<init>(Context.java:95)

at org.dspace.app.xmlui.utils.ContextUtil.obtainContext(ContextUtil.java:72)

at org.dspace.app.xmlui.utils.ContextUtil.obtainContext(ContextUtil.java:53)

at org.dspace.app.xmlui.utils.HandleUtil.obtainHandle(HandleUtil.java:86)

at org.dspace.app.xmlui.cocoon.ThemeMatcher.match(ThemeMatcher.java:57)

... 56 more

Caused by: java.util.NoSuchElementException: Timeout waiting for idle object

at org.apache.commons.pool.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:958)

at org.apache.commons.dbcp.PoolingDataSource.getConnection(PoolingDataSource.java:106)

... 62 more

Peter Dietz

------------------------------------------------------------------------------
WhatsUp Gold - Download Free Network Management Software
The most intuitive, comprehensive, and cost-effective network
management toolset available today. Delivers lowest initial
acquisition cost and overall TCO of any competing solution.
http://p.sf.net/sfu/whatsupgold-sd
_______________________________________________
DSpace-tech mailing list
DSpac...@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-tech

Alan Orth

unread,

Nov 29, 2015, 2:18:45 PM11/29/15

to DSpace Technical Support

Hey, Peter, how did you fare with this line of logic?

Two weeks ago we upgraded DSpace from 4.3 to 5.1 and I've been losing sleep due to up/down alerts from our monitoring scripts. For years I used PostgreSQL and DSpace defaults for max connections—100 and 30, respectively—and they have been fine. Since we upgraded I've constantly run into "waiting for idle object" errors, to which I've responded by increasing DSpace's db.maxconnections to 60, then to 90, then reducing db.maxidle to 10, then 8. Every time, however, DSpace rewards me with idle connections to match my new defaults!

Today, with db.maxconnections at 30 (I got superstitious and reverted to this default) and db.maxidle at 8 I had the following idle connections in PostgreSQL:

$ psql -c 'SELECT * from pg_stat_activity;' | grep dspace | grep -c idle

49

I'm starting to wonder if I should start drastically *increasing* the limits, or if I should start blaming DSpace for not closing connections properly...

For everyone's reference, there's a running Jira issue about this that has comments going back to DSpace 1.7: https://jira.duraspace.org/browse/DS-1458

Cheers,

Alan

------------------------------------------------------------------------------
Magic Quadrant for Content-Aware Data Loss Prevention
Research study explores the data loss prevention market. Includes in-depth
analysis on the changes within the DLP market, and the criteria used to
evaluate the strengths and weaknesses of these DLP solutions.
http://www.accelacomm.com/jaw/sfnl/114/51385063/_______________________________________________

Andrea Schweer

unread,

Nov 29, 2015, 2:53:29 PM11/29/15

to Alan Orth, DSpace Technical Support

Hi,

On 30/11/15 08:18, Alan Orth wrote:

[...]

Today, with db.maxconnections at 30 (I got superstitious and reverted to this default) and db.maxidle at 8 I had the following idle connections in PostgreSQL:

$ psql -c 'SELECT * from pg_stat_activity;' | grep dspace | grep -c idle

49

I'm starting to wonder if I should start drastically *increasing* the limits, or if I should start blaming DSpace for not closing connections properly...

For everyone's reference, there's a running Jira issue about this that has comments going back to DSpace 1.7: https://jira.duraspace.org/browse/DS-1458

I'm pretty sure there is something not right with regards to closing connections. I've long suspected that there is a bug in the db connection re-use logic somewhere but I haven't been able to find it.

If you're seeing worse behaviour in DSpace 5.x -- do you have Google Analytics enabled? As mentioned by Andrea Bollini in a different conversation here on dspace-tech the other day, the Google Analytics download recording appears to be holding on to the db connection until it has successfully reported the download: https://jira.duraspace.org/browse/DS-2904

I recently disabled the GA downloads event listener and I see a small drop in db connections in our monitoring, though only by about 2 connections. We have db.maxconnections set to 30 but generally have more than 30 connections. I don't know whether the limit also covers connections opened by cron jobs. I see around 25 idle connections with db.maxidle set to 8 (around 27 idle connections before disabling the GA downloads listener). Though we did make some changes to the memory allocation and applied the patch for https://jira.duraspace.org/browse/DS-2832 at the same time, which appears to have had quite a big impact on garbage collection frequency+duration. So the changes in db connections may not be from disabling the event listener after all.

cheers,
Andrea

-- 
Dr Andrea Schweer
IRR Technical Specialist, ITS Information Systems
The University of Waikato, Hamilton, New Zealand
+64-7-837 9120

George Kozak

unread,

Nov 30, 2015, 11:36:08 AM11/30/15

to Alan Orth, DSpace Technical Support

Alan:

We at Cornell University are running DSpace 5.2 with the following settings:

db.maxconnections = 50

db.maxwait = 5000

db.maxidle = 5

db.statementpool = true

In the Cocoon logs and DSpace , I still do see occasionally:

“Cannot get a connection, pool error Timeout waiting for idle object”

and

“Timeout waiting for idle object”

But it doesn’t happen every day for us and it doesn’t appear to bring down the system. So, I don’t think I have a solution for you outside of playing with the settings.

Best of luck,

George Kozak

Digital Library Specialist

Cornell University Library Information Technologies (CUL-IT)

218 Olin Library

Cornell University

Ithaca, NY 14853

607-255-8924

--
You received this message because you are subscribed to the Google Groups "DSpace Technical Support" group.
To unsubscribe from this group and stop receiving emails from it, send an email to dspace-tech...@googlegroups.com.
To post to this group, send email to dspac...@googlegroups.com.
Visit this group at http://groups.google.com/group/dspace-tech.
For more options, visit https://groups.google.com/d/optout.

--

***************************

George Kozak
Digital Library Specialist

Cornell University Library - IT
218 Olin Library

Cornell University
Ithaca, NY 14853
607-255-8924

gs...@cornell.edu

Reply all

Reply to author

Forward