Large transactions fail to commit

mohi...@umd.edu

Jun 30, 2017, 12:54:13 PM
to Fedora Tech
Hello all,

This is Mohamed from the University of Maryland Libraries. We are currently running a batch load against our fcrepo-4.7.0 repository, wrapping about 70 items in a single transaction. The load runs successfully for several hundred transactions and then fails with a stuck-thread warning. It looks like the commit is trying to acquire a lock while saving the session but never gets it. If we put more items in each transaction, the problem appears even sooner (only a few dozen transactions succeed).


Jun 30, 2017 4:25:39 PM org.apache.catalina.valves.StuckThreadDetectionValve notifyStuckThreadDetected
WARNING: Thread "http-nio-9601-exec-4" (id=54) has been active for 33,205 milliseconds (since 6/30/17 4:25 PM) to serve the same request for https://fcrepolocal/fcrepo/rest/tx:0dc4a797-bb7e-4651-a3a3-bba3d9c73aef/fcr:tx/fcr:commit and may be stuck (configured threshold for this StuckThreadDetectionValve is 30 seconds). There is/are 1 thread(s) in total that are monitored by this Valve and may be stuck.
java.lang.Throwable
at sun.misc.Unsafe.park(Native Method)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
at org.modeshape.common.collection.ring.RingBuffer.add(RingBuffer.java:153)
at org.modeshape.jcr.bus.RepositoryChangeBus.notify(RepositoryChangeBus.java:180)
at org.modeshape.jcr.cache.document.WorkspaceCache.changed(WorkspaceCache.java:320)
at org.modeshape.jcr.txn.Transactions.updateCache(Transactions.java:296)
at org.modeshape.jcr.cache.document.WritableSessionCache.save(WritableSessionCache.java:800)
at org.modeshape.jcr.JcrSession.save(JcrSession.java:1162)
at org.fcrepo.kernel.modeshape.TransactionImpl.commit(TransactionImpl.java:132)
at org.fcrepo.kernel.modeshape.services.TransactionServiceImpl.commit(TransactionServiceImpl.java:204)
at org.fcrepo.http.api.FedoraTransactions.finalizeTransaction(FedoraTransactions.java:145)
at org.fcrepo.http.api.FedoraTransactions.commit(FedoraTransactions.java:105)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory$1.invoke(ResourceMethodInvocationHandlerFactory.java:81)
at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher$1.run(AbstractJavaResourceMethodDispatcher.java:144)
at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.invoke(AbstractJavaResourceMethodDispatcher.java:161)
at org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$ResponseOutInvoker.doDispatch(JavaResourceMethodDispatcherProvider.java:160)
at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.dispatch(AbstractJavaResourceMethodDispatcher.java:99)
at org.glassfish.jersey.server.model.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:389)
at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:347)
at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:102)
at org.glassfish.jersey.server.ServerRuntime$2.run(ServerRuntime.java:326)
at org.glassfish.jersey.internal.Errors$1.call(Errors.java:271)
at org.glassfish.jersey.internal.Errors$1.call(Errors.java:267)
at org.glassfish.jersey.internal.Errors.process(Errors.java:315)
at org.glassfish.jersey.internal.Errors.process(Errors.java:297)
at org.glassfish.jersey.internal.Errors.process(Errors.java:267)
at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:317)
at org.glassfish.jersey.server.ServerRuntime.process(ServerRuntime.java:305)
at org.glassfish.jersey.server.ApplicationHandler.handle(ApplicationHandler.java:1154)
at org.glassfish.jersey.servlet.WebComponent.serviceImpl(WebComponent.java:473)
at org.glassfish.jersey.servlet.WebComponent.service(WebComponent.java:427)
at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:388)
at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:341)
at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:228)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:303)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:218)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:122)
at edu.umd.lib.tomcat.valves.OptionalSSLAuthenticator.invoke(OptionalSSLAuthenticator.java:40)
at edu.umd.lib.tomcat.valves.OptionalBasicAuthenticator.invoke(OptionalBasicAuthenticator.java:99)
at edu.umd.lib.tomcat.valves.HeaderToCert.invoke(HeaderToCert.java:112)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:169)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:103)
at org.apache.catalina.valves.StuckThreadDetectionValve.invoke(StuckThreadDetectionValve.java:220)
at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:956)
at org.apache.catalina.authenticator.SingleSignOn.invoke(SingleSignOn.java:270)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:116)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:442)
at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1082)
at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:623)
at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1756)
at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.run(NioEndpoint.java:1715)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
at java.lang.Thread.run(Thread.java:745)


Has anyone come across the stuck thread problem before?

Thanks
Mohamed

mohi...@umd.edu

Jul 1, 2017, 4:30:27 PM
to Fedora Tech
Hello all,

Just an update on the problem. We ran the batch load without transactions and still hit the stuck thread at about the same point. Here is the new stack trace:
Stacks at 2017-06-30 05:08:14 PM. Uptime is 1h 12m 13s 400ms.

http-bio-9601-exec-1 [DAEMON] State: WAITING CPU usage on sample: 0ms
sun.misc.Unsafe.park(boolean, long) Unsafe.java (native)
java.util.concurrent.locks.LockSupport.park(Object) LockSupport.java:175
java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt() AbstractQueuedSynchronizer.java:836
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer$Node, int) AbstractQueuedSynchronizer.java:870
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(int) AbstractQueuedSynchronizer.java:1199
java.util.concurrent.locks.ReentrantLock$NonfairSync.lock() ReentrantLock.java:209
java.util.concurrent.locks.ReentrantLock.lock() ReentrantLock.java:285
org.modeshape.common.collection.ring.RingBuffer.add(Object) RingBuffer.java:153
org.modeshape.jcr.bus.RepositoryChangeBus.notify(ChangeSet) RepositoryChangeBus.java:180
org.modeshape.jcr.cache.document.WorkspaceCache.changed(ChangeSet) WorkspaceCache.java:320
org.modeshape.jcr.txn.Transactions.updateCache(WorkspaceCache, ChangeSet, Transactions$Transaction) Transactions.java:296
org.modeshape.jcr.cache.document.WritableSessionCache.save(SessionCache, SessionCache$PreSave) WritableSessionCache.java:800
org.modeshape.jcr.JcrSession.save() JcrSession.java:1162
org.fcrepo.http.api.FedoraLdp.createObject(ContentDisposition, MediaType, String, InputStream, String, String) FedoraLdp.java:535
sun.reflect.GeneratedMethodAccessor40.invoke(Object, Object[])
sun.reflect.DelegatingMethodAccessorImpl.invoke(Object, Object[]) DelegatingMethodAccessorImpl.java:43
java.lang.reflect.Method.invoke(Object, Object[]) Method.java:497
org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory$1.invoke(Object, Method, Object[]) ResourceMethodInvocationHandlerFactory.java:81
org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher$1.run() AbstractJavaResourceMethodDispatcher.java:144
org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.invoke(ContainerRequest, Object, Object[]) AbstractJavaResourceMethodDispatcher.java:161
org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$ResponseOutInvoker.doDispatch(Object, ContainerRequest) JavaResourceMethodDispatcherProvider.java:160
org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.dispatch(Object, ContainerRequest) AbstractJavaResourceMethodDispatcher.java:99
org.glassfish.jersey.server.model.ResourceMethodInvoker.invoke(RequestProcessingContext, Object) ResourceMethodInvoker.java:389
org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(RequestProcessingContext) ResourceMethodInvoker.java:347
org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(Object) ResourceMethodInvoker.java:102
org.glassfish.jersey.server.ServerRuntime$2.run() ServerRuntime.java:326
org.glassfish.jersey.internal.Errors$1.call() Errors.java:271
org.glassfish.jersey.internal.Errors$1.call() Errors.java:267
org.glassfish.jersey.internal.Errors.process(Callable, boolean) Errors.java:315
org.glassfish.jersey.internal.Errors.process(Producer, boolean) Errors.java:297
org.glassfish.jersey.internal.Errors.process(Runnable) Errors.java:267
org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope$Instance, Runnable) RequestScope.java:317
org.glassfish.jersey.server.ServerRuntime.process(ContainerRequest) ServerRuntime.java:305
org.glassfish.jersey.server.ApplicationHandler.handle(ContainerRequest) ApplicationHandler.java:1154
org.glassfish.jersey.servlet.WebComponent.serviceImpl(URI, URI, HttpServletRequest, HttpServletResponse) WebComponent.java:473
org.glassfish.jersey.servlet.WebComponent.service(URI, URI, HttpServletRequest, HttpServletResponse) WebComponent.java:427
org.glassfish.jersey.servlet.ServletContainer.service(URI, URI, HttpServletRequest, HttpServletResponse) ServletContainer.java:388
org.glassfish.jersey.servlet.ServletContainer.service(HttpServletRequest, HttpServletResponse) ServletContainer.java:341
org.glassfish.jersey.servlet.ServletContainer.service(ServletRequest, ServletResponse) ServletContainer.java:228
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ServletRequest, ServletResponse) ApplicationFilterChain.java:303
org.apache.catalina.core.ApplicationFilterChain.doFilter(ServletRequest, ServletResponse) ApplicationFilterChain.java:208
org.apache.tomcat.websocket.server.WsFilter.doFilter(ServletRequest, ServletResponse, FilterChain) WsFilter.java:52
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ServletRequest, ServletResponse) ApplicationFilterChain.java:241
org.apache.catalina.core.ApplicationFilterChain.doFilter(ServletRequest, ServletResponse) ApplicationFilterChain.java:208
org.apache.catalina.core.StandardWrapperValve.invoke(Request, Response) StandardWrapperValve.java:218
org.apache.catalina.core.StandardContextValve.invoke(Request, Response) StandardContextValve.java:122
edu.umd.lib.tomcat.valves.OptionalSSLAuthenticator.invoke(Request, Response) OptionalSSLAuthenticator.java:40
edu.umd.lib.tomcat.valves.OptionalBasicAuthenticator.invoke(Request, Response) OptionalBasicAuthenticator.java:99
edu.umd.lib.tomcat.valves.HeaderToCert.invoke(Request, Response) HeaderToCert.java:112
org.apache.catalina.core.StandardHostValve.invoke(Request, Response) StandardHostValve.java:169
org.apache.catalina.valves.ErrorReportValve.invoke(Request, Response) ErrorReportValve.java:103
org.apache.catalina.valves.StuckThreadDetectionValve.invoke(Request, Response) StuckThreadDetectionValve.java:220
org.apache.catalina.valves.AccessLogValve.invoke(Request, Response) AccessLogValve.java:956
org.apache.catalina.authenticator.SingleSignOn.invoke(Request, Response) SingleSignOn.java:270
org.apache.catalina.core.StandardEngineValve.invoke(Request, Response) StandardEngineValve.java:116
org.apache.catalina.connector.CoyoteAdapter.service(Request, Response) CoyoteAdapter.java:442
org.apache.coyote.http11.AbstractHttp11Processor.process(SocketWrapper) AbstractHttp11Processor.java:1082
org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(SocketWrapper, SocketStatus) AbstractProtocol.java:623
org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run() JIoEndpoint.java:318
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor$Worker) ThreadPoolExecutor.java:1142
java.util.concurrent.ThreadPoolExecutor$Worker.run() ThreadPoolExecutor.java:617
org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run() TaskThread.java:61
java.lang.Thread.run() Thread.java:745


Thanks
Mohamed

Esmé Cowles

Jul 3, 2017, 7:27:27 AM
to Fedora Tech
Mohamed-

That looks like the hook that fires an event is failing. I don't know what messaging features you're using, but is it possible the events are piling up and the queue is running out of memory?
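
To illustrate the kind of pile-up I mean, here is a minimal sketch. It is not ModeShape's RingBuffer and says nothing about how that class is implemented; it uses a plain java.util.concurrent.ArrayBlockingQueue as a stand-in for any bounded event buffer, and the class and method names are made up for the example. With nothing draining the buffer, the producer gets exactly `capacity` events in and then cannot add any more:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only: a bounded buffer with no consumer.
public class BackpressureSketch {

    /**
     * Offers up to `attempts` events to a bounded queue that nobody drains
     * and returns how many were accepted before the producer would block.
     */
    public static int acceptedBeforeBlocking(int capacity, int attempts) {
        BlockingQueue<String> buffer = new ArrayBlockingQueue<>(capacity);
        int accepted = 0;
        try {
            for (int i = 0; i < attempts; i++) {
                // offer() with a timeout stands in for a blocking add():
                // a truly blocking call would simply wait here until a
                // consumer made room.
                boolean ok = buffer.offer("event-" + i, 50, TimeUnit.MILLISECONDS);
                if (!ok) {
                    break; // buffer is full and nothing freed a slot
                }
                accepted++;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
        }
        return accepted;
    }

    public static void main(String[] args) {
        int accepted = acceptedBeforeBlocking(4, 10);
        System.out.println("accepted " + accepted + " of 10 events before blocking");
    }
}
```

If something downstream of the repository's event bus has stopped consuming, a request thread waiting to publish its change event would look much like the thread in your dump.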

-Esmé

Peter Eichman

Jul 3, 2017, 9:55:47 AM
to Fedora Tech
Esmé,

We are using a standalone ActiveMQ instance, configured with 1 GB of memory space, 2 GB of disk storage space, and 100 MB of temp space. Config snippet from activemq.xml:

  <systemUsage>
    <systemUsage>
      <memoryUsage>
        <memoryUsage limit="1024 mb" />
      </memoryUsage>
      <storeUsage>
        <storeUsage limit="2 gb"/>
      </storeUsage>
      <tempUsage>
        <tempUsage limit="100 mb"/>
      </tempUsage>
    </systemUsage>
  </systemUsage>

We have noticed that the queue feeding Solr accumulates around 150,000 messages by the time we start seeing the stuck threads and hanging transactions.

-Peter

Peter Eichman

Jul 6, 2017, 10:39:21 AM
to Fedora Tech
Hello all,

To follow up, we think we have resolved our problem by increasing the ActiveMQ storeUsage from 2 GB to 8 GB. The queue still fills up, because Solr indexing cannot keep up with the rate of changes to objects in the repository, but 8 GB gives us 12 hours or so of load time before we expect it to fill up.
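
For reference, the change amounts to something like the following in activemq.xml. This is the same systemUsage block from my earlier message with only the storeUsage limit raised; the other values are carried over unchanged and are not meant as recommendations.

```xml
  <systemUsage>
    <systemUsage>
      <memoryUsage>
        <memoryUsage limit="1024 mb" />
      </memoryUsage>
      <storeUsage>
        <!-- raised from 2 gb -->
        <storeUsage limit="8 gb"/>
      </storeUsage>
      <tempUsage>
        <tempUsage limit="100 mb"/>
      </tempUsage>
    </systemUsage>
  </systemUsage>
```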

-Peter

Ralf Claussnitzer

Jul 7, 2017, 2:05:41 AM
to fedor...@googlegroups.com

Hello Peter,

It sounds like tuning Solr would solve the problem. Just out of interest, do you have any performance data that would help identify the bottleneck exactly? Would clustering help?

-Ralf

Peter Eichman

Jul 7, 2017, 10:20:00 AM
to Fedora Tech, ralf.cla...@slub-dresden.de
Hi Ralf,

Yeah, we were planning on looking at Solr and the whole Solr indexing process. We suspect some of the slowness has to do with the Solr indexer having to make the extra request to the LDPath REST service to get the JSON representation of the resource. We haven't had a chance to gather any concrete performance data yet, though.

-Peter