Fwd: dSpace cracks after some minutes


Camilo Freire

Aug 23, 2019, 3:29:54 PM
to dspace-c...@googlegroups.com



Greetings,

Since yesterday we have been experiencing a serious problem with our DSpace installation (version 5.4). DSpace seems to have trouble performing the database queries that bring bitstreams and logos to the interface: collection logos are not shown, and neither are bitstreams. After some minutes the DSpace interface shows an internal error message. This is the first time we have seen this problem after some years of running DSpace without trouble.

When I list the processes related to DSpace, a set of idle dspace-postgresql processes is shown. The connection pool accepts up to 100 connections open at the same time, but the list of idle dspace-postgres processes is not that long; it piles up to around 10 processes.
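For readers checking a similar process listing, it may help to separate backends that are plainly "idle" from those that are "idle in transaction". As far as I know, only the latter still have an open transaction and can keep table locks. The sketch below counts both kinds from `ps` output. It assumes PostgreSQL's default process-title layout (`postgres: <user> <database> <client> <activity>`) and a database user named `dspace`; the sample lines are invented for illustration and are not taken from this thread.

```python
from collections import Counter


def classify_backends(ps_lines, db_user="dspace"):
    """Count PostgreSQL backend states found in `ps` process titles.

    Assumes the default title layout:
    'postgres: <user> <database> <client> <activity>'.
    """
    counts = Counter()
    for line in ps_lines:
        _, sep, title = line.partition("postgres: ")
        if not sep:
            continue  # not a PostgreSQL process line
        fields = title.split()
        if len(fields) < 4 or fields[0] != db_user:
            continue  # background process or another user's session
        activity = " ".join(fields[3:])
        if activity.startswith("idle in transaction"):
            counts["idle in transaction"] += 1
        elif activity == "idle":
            counts["idle"] += 1
        else:
            counts["active"] += 1
    return dict(counts)


# Invented sample listing (hypothetical PIDs and client ports).
sample_ps = [
    "postgres  2101  0.0  0.3 postgres: dspace dspace 127.0.0.1(40112) idle",
    "postgres  2102  0.0  0.3 postgres: dspace dspace 127.0.0.1(40113) idle in transaction",
    "postgres  2103  0.1  0.4 postgres: dspace dspace 127.0.0.1(40114) SELECT",
    "postgres  2104  0.0  0.3 postgres: dspace dspace 127.0.0.1(40115) idle",
    "postgres  1990  0.0  0.1 postgres: checkpointer process",
]

print(classify_backends(sample_ps))
```

If most of the entries turn out to be "idle in transaction", that would point toward sessions holding a transaction open rather than toward the pool size itself.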

The problem seems to be related to the queries that fetch the bitstreams, because the tables locked by the idle processes are:

bitstream
bundle2bitstream
bundle
fileextension


We tried to restart postgres, tomcat and the whole server but the problem persists.

The log file shows this:

ERROR org.dspace.storage.rdbms.DatabaseManager @ SQL connection Error -
org.apache.commons.dbcp.SQLNestedException: Cannot get a connection, pool error Timeout waiting for idle object
        at org.apache.commons.dbcp.PoolingDataSource.getConnection(PoolingDataSource.java:114)
        at org.dspace.storage.rdbms.DatabaseManager.getConnection(DatabaseManager.java:634)
        at org.dspace.core.Context.init(Context.java:121)
        at org.dspace.core.Context.<init>(Context.java:95)
        at org.dspace.app.webui.util.UIUtil.obtainContext(UIUtil.java:105)
        at org.dspace.app.webui.servlet.DSpaceServlet.processRequest(DSpaceServlet.java:100)
        at org.dspace.app.webui.servlet.DSpaceServlet.doGet(DSpaceServlet.java:67)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:620)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:727)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:303)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
        at org.dspace.utils.servlet.DSpaceWebappServletFilter.doFilter(DSpaceWebappServletFilter.java:78)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:220)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:122)
        at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:501)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:170)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:98)
        at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:950)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:116)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:408)
        at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1041)
        at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:607)
        at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:313)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.NoSuchElementException: Timeout waiting for idle object
        at org.apache.commons.pool.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:958)
        at org.apache.commons.dbcp.PoolingDataSource.getConnection(PoolingDataSource.java:106)
        ... 27 more
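The "Caused by" line in the trace above comes from the pool's `borrowObject` call: every pooled connection was checked out and none was returned before the wait limit expired. The toy model below is only a sketch of that behaviour, not the commons-dbcp implementation; `TinyPool`, `simulate`, and the 0.05 s wait are hypothetical names and values chosen for illustration. It shows how a pool keeps working while connections are returned, and starts failing once they are held and never given back.

```python
import queue


class TinyPool:
    """Toy bounded pool; a stand-in for the idea, not the commons-dbcp code."""

    def __init__(self, size):
        self._idle = queue.Queue()
        for n in range(size):
            self._idle.put(f"conn-{n}")

    def borrow(self, max_wait_s):
        # Block until a connection is idle, or give up after max_wait_s.
        try:
            return self._idle.get(timeout=max_wait_s)
        except queue.Empty:
            raise TimeoutError("Timeout waiting for idle object") from None

    def give_back(self, conn):
        self._idle.put(conn)


def simulate(pool_size, requests, leak):
    """Serve requests one by one; return the index of the first failure, or None."""
    pool = TinyPool(pool_size)
    for i in range(requests):
        try:
            conn = pool.borrow(max_wait_s=0.05)
        except TimeoutError:
            return i
        if not leak:
            pool.give_back(conn)  # a well-behaved request returns its connection
    return None


print(simulate(pool_size=3, requests=10, leak=False))
print(simulate(pool_size=3, requests=10, leak=True))
```

The trace alone does not say whether the real connections were leaked or were simply stuck waiting on database locks; either would exhaust the pool in the same way.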



Camilo Freire
Biblioteca Nacional
Uruguay

Terry Brady

Aug 26, 2019, 10:47:50 AM
to Camilo Freire, DSpace Community
Camilo,

I have not heard of this issue being reported previously.  The steps that you have taken sound like the logical steps to follow to resolve the issue.

I wonder if there is a cron task (such as the filter-media job) in place that is locking those tables. 

After stopping tomcat and postgres, did you confirm that all other dspace processes (running dspace/bin/dspace) were terminated?
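One way to run that check is to scan a `ps -eo pid,args` listing for command-line DSpace jobs. The sketch below is a minimal example under stated assumptions: the install path `/dspace` and the sample listing are hypothetical, and matching on `org.dspace.` assumes the launcher's Java class name appears in the process arguments.

```python
def find_dspace_processes(ps_lines, markers=("org.dspace.", "/bin/dspace")):
    """Return (pid, args) for lines that look like DSpace command-line jobs.

    Expects lines shaped like the output of `ps -eo pid,args`.
    """
    hits = []
    for line in ps_lines:
        pid, _, args = line.strip().partition(" ")
        if pid.isdigit() and any(marker in args for marker in markers):
            hits.append((int(pid), args.strip()))
    return hits


# Invented sample listing; /dspace is a hypothetical install directory.
sample_listing = [
    "  PID COMMAND",
    " 1412 /usr/sbin/cron -f",
    " 2290 /bin/sh /dspace/bin/dspace filter-media",
    " 2295 java -Xmx512m org.dspace.app.launcher.ScriptLauncher filter-media",
    " 3001 postgres: dspace dspace 127.0.0.1(40112) idle",
]

for pid, args in find_dspace_processes(sample_listing):
    print(pid, args)
```

An empty result after stopping Tomcat and PostgreSQL would suggest that no command-line job was left holding connections.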

I will share this thread in the DSpace tech-support Slack channel to see if other folks have actions to recommend.

Terry

--
Terry Brady
Applications Programmer Analyst
Georgetown University Library Information Technology
425-298-5498 (Seattle, WA)

Tim Donohue

Aug 26, 2019, 11:24:47 AM
to Camilo Freire, dspace-c...@googlegroups.com
Hi Camilo,

Could you provide us with more information about your setup?  What version of PostgreSQL are you using?  Are you using the XMLUI or JSPUI? Have you changed anything recently in your setup (e.g. upgraded Postgres, or maybe added a larger number of new Items) that could have affected this behavior?

It's difficult to narrow down the exact cause without a bit more information on your setup and any recent activities that might have affected your site.  However, it is worth noting that the latest version of 5.x is now version 5.10, so 5.4 is a bit "old" for a 5.x release.  There have been some performance fixes between 5.4 and 5.10 -- it's hard to say though if any could be what you are seeing.  A few examples though include:
Additional bugs/performance issues that were fixed between 5.4 and 5.10 can be found in the 5.x Release Notes: 

I'm not sure if this will help, but maybe it'll give you a few more clues on what to look at.  If you can send more information to this mailing list it might help us to narrow down whether what you are seeing is a known bug (perhaps even one that has been fixed in a later 5.x release) or some sort of configuration issue, etc.

Tim


From: dspace-c...@googlegroups.com <dspace-c...@googlegroups.com> on behalf of Camilo Freire <camilo...@gmail.com>
Sent: Friday, August 23, 2019 1:42 PM
To: dspace-c...@googlegroups.com <dspace-c...@googlegroups.com>
Subject: [dspace-community] Fwd: dSpace cracks after some minutes
 

Camilo Freire

Aug 26, 2019, 12:56:19 PM
to dspace-c...@googlegroups.com, Terry...@georgetown.edu, tim.d...@lyrasis.org
Terry & Tim, thank you very much for your help.

As Terry suggests, the crash seems to have happened after a cron job ran. We have a cron job that harvests a remote collection and then creates the thumbnails. We weren't sure that the crash happened after that job and because of it, but the fact that the DSpace instance from which the harvest was done experienced the same problem seems to support this hypothesis. However, the problem persisted even after restarting the PostgreSQL service. It seems very improbable that any other DSpace-related process was running, because the whole server was rebooted.

No upgrade was made before the crash; the server is set up so that upgrades are done by hand.

In the end I "solved" the problem by running the cleanup command (there were many bitstreams to clean) and restarting both PostgreSQL and Tomcat. At the other site that experienced a similar problem at the same time (the harvested DSpace), the solution was reindexing the whole community. After 3 days both installations seem to work as usual.

Many thanks and best regards,
Camilo Freire



 


 