Dataverse application stops responding; the only error in the logs is Java heap space

Xavier Tomaszynski

Sep 22, 2015, 10:06:18 AM
to Dataverse Users Community
Hello Everyone,

I recently installed Dataverse on a virtual machine running Red Hat Enterprise Linux Server release 7.1 (Maipo). Because this is my first installation of DVN, I followed the instructions in the installation guide on the Dataverse.org website. All in all this went rather smoothly, and the website is up and running, as you can see here: "http://sohda.cumulus.vub.ac.be/".

The problem is that the site stays up for a while, normally 6 to 8 hours, and then stops responding to URL requests.

I have looked into the logs and found this as the only indication:

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
[2015-09-21T12:41:17.577+0200] [glassfish 4.1] [WARN] [] [org.jboss.weld.Conversation] [tid: _ThreadID=26 _ThreadName=http-listener-1(4)] [timeMillis: 1442832077577] [levelValue: 900] [[
  WELD-000315: Failed to acquire conversation lock in 1,000 ms for Transient conversation]]

[2015-09-21T12:50:37.275+0200] [glassfish 4.1] [WARN] [] [org.jboss.weld.Conversation] [tid: _ThreadID=23 _ThreadName=http-listener-1(1)] [timeMillis: 1442832637275] [levelValue: 900] [[
  WELD-000315: Failed to acquire conversation lock in 1,000 ms for Transient conversation]]

[2015-09-21T17:33:26.458+0200] [glassfish 4.1] [SEVERE] [AS-WEB-CORE-00114] [javax.enterprise.web.core] [tid: _ThreadID=98 _ThreadName=ContainerBackgroundProcessor[StandardEngine[glassfish-web].StandardHost[server].StandardContext[]]] [timeMillis: 1442849606458] [levelValue: 1000] [[
  Exception invoking periodic operation:
java.lang.OutOfMemoryError: Java heap space
        at org.apache.catalina.session.ManagerBase.findSessions(ManagerBase.java:918)
        at org.apache.catalina.session.StandardManager.processExpires(StandardManager.java:1050)
        at org.apache.catalina.core.StandardContext.backgroundProcess(StandardContext.java:6340)
        at org.apache.catalina.core.ContainerBase$ContainerBackgroundProcessor.processChildren(ContainerBase.java:1823)
        at org.apache.catalina.core.ContainerBase$ContainerBackgroundProcessor.run(ContainerBase.java:1812)
        at java.lang.Thread.run(Thread.java:745)
]]
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

When this happens, the only way I can bring the website back is to stop the domain (/usr/local/glassfish4/bin/asadmin stop-domain) and then start it again. This keeps it going for another couple of hours. Sometimes I really have to force-kill the JVM process with a -9 signal in order to shut the domain down. Still, I always get the site up and running again.
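The stop, force-kill, and start sequence described above could be wrapped in a small helper. This is only a sketch: `restart_domain` is a hypothetical name, the domain name `domain1` is taken from the startup log further down, and it assumes the `--kill` option that the `stop-domain` subcommand documents in GlassFish 3.1 and later, as well as `stop-domain` reporting failure when the graceful stop does not succeed.

```shell
#!/bin/sh
# Path taken from the post above; override ASADMIN if yours differs.
ASADMIN="${ASADMIN:-/usr/local/glassfish4/bin/asadmin}"

# Try a graceful stop first; if that reports failure, ask asadmin to
# terminate the domain process at the OS level (assumed --kill option),
# then start the domain again.
restart_domain() {
  domain="${1:-domain1}"
  "$ASADMIN" stop-domain "$domain" \
    || "$ASADMIN" stop-domain --kill=true "$domain"
  "$ASADMIN" start-domain "$domain"
}
```

Calling `restart_domain` with no argument acts on `domain1`. This only automates the workaround; it does nothing about whatever is exhausting the heap.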

Initially I allocated only 2 GB of memory to -Xmx, and I thought that increasing this to 5 GB would solve the issue. It did not; I still get the same behavior.

I am searching in the dark here, and I would really appreciate any help or clues you can give me. Maybe someone else has experienced this same issue?

Thanks in advance for any suggestion regarding my issue.
Xavier

PS: To give you an idea of how this site is configured, I am pasting here the startup sequence from the same log file as above.

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
[2015-09-22T08:05:21.386+0200] [] [INFO] [NCLS-GFLAUNCHER-00005] [javax.enterprise.launcher] [tid: _ThreadID=1 _ThreadName=main] [timeMillis: 1442901921386] [levelValue: 800] [[
  JVM invocation command line:
/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.85-2.6.1.2.el7_1.x86_64/bin/java
-cp
/usr/local/glassfish4/glassfish/modules/glassfish.jar
-XX:+UnlockDiagnosticVMOptions
-XX:PermSize=256m
-XX:MaxPermSize=512m
-XX:NewRatio=2
-Xmx5864m
-client
-javaagent:/usr/local/glassfish4/glassfish/lib/monitor/flashlight-agent.jar
-Ddataverse.files.directory=/usr/local/glassfish4/glassfish/domains/domain1/files
-Dfelix.fileinstall.disableConfigSave=false
-Djavax.net.ssl.keyStore=/usr/local/glassfish4/glassfish/domains/domain1/config/keystore.jks
-Ddoi.password=apitest
-Djava.awt.headless=true
-Dcom.ctc.wstx.returnNullForDefaultNamespace=true
-Dfelix.fileinstall.poll=5000
-Djava.endorsed.dirs=/usr/local/glassfish4/glassfish/modules/endorsed:/usr/local/glassfish4/glassfish/lib/endorsed
-Ddoi.username=apitest
-Dfelix.fileinstall.bundles.startTransient=true
-Djavax.net.ssl.trustStore=/usr/local/glassfish4/glassfish/domains/domain1/config/cacerts.jks
-Djavax.xml.accessExternalSchema=all
-Ddoi.baseurlstring=https://ezid.cdlib.org
-Dcom.sun.enterprise.security.httpsOutboundKeyAlias=s1as
-Djavax.xml.parsers.SAXParserFactory=com.sun.org.apache.xerces.internal.jaxp.SAXParserFactoryImpl
-Djava.security.auth.login.config=/usr/local/glassfish4/glassfish/domains/domain1/config/login.conf
-DANTLR_USE_DIRECT_CLASS_LOADING=true
-Dgosh.args=--nointeractive
-Ddataverse.rserve.port=6311
-Ddataverse.fqdn=sohda.cumulus.vub.ac.be
-Dosgi.shell.telnet.maxconn=1
-Djdbc.drivers=org.apache.derby.jdbc.ClientDriver
-Dfelix.fileinstall.dir=/usr/local/glassfish4/glassfish/modules/autostart/
-Dosgi.shell.telnet.port=6666
-Djava.security.policy=/usr/local/glassfish4/glassfish/domains/domain1/config/server.policy
-Dfelix.fileinstall.log.level=2
-Dcom.sun.aas.instanceRoot=/usr/local/glassfish4/glassfish/domains/domain1
-Ddataverse.rserve.user=rserve
-Ddataverse.rserve.host=localhost
-Dcom.sun.enterprise.config.config_environment_factory_class=com.sun.enterprise.config.serverbeans.AppserverConfigEnvironmentFactory
-Dosgi.shell.telnet.ip=127.0.0.1
-Dcom.sun.aas.installRoot=/usr/local/glassfish4/glassfish
-Djava.ext.dirs=/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.85-2.6.1.2.el7_1.x86_64/lib/ext:/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.85-2.6.1.2.el7_1.x86_64/jre/lib/ext:/usr/local/glassfish4/glassfish/domains/domain1/lib/ext
-Ddataverse.auth.password-reset-timeout-in-minutes=60
-Dfelix.fileinstall.bundles.new.start=true
-Dorg.glassfish.additionalOSGiBundlesToStart=org.apache.felix.shell,org.apache.felix.gogo.runtime,org.apache.felix.gogo.shell,org.apache.felix.gogo.command,org.apache.felix.shell.remote,org.apache.felix.fileinstall
-Ddataverse.rserve.password=<LEFT BLANK ;)>
-Djdk.corba.allowOutputStreamSubclass=true
-Djava.library.path=/usr/local/glassfish4/glassfish/lib:/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
com.sun.enterprise.glassfish.bootstrap.ASMain
-domainname
domain1
-asadmin-args
--host,,,localhost,,,--port,,,4848,,,--secure=false,,,--terse=false,,,--echo=false,,,--interactive=true,,,start-domain,,,--verbose=false,,,--watchdog=false,,,--debug=false,,,--domaindir,,,/usr/local/glassfish4/glassfish/domains,,,domain1
-instancename
server
-verbose
false
-debug
false
-asadmin-classpath
/usr/local/glassfish4/glassfish/lib/client/appserver-cli.jar
-asadmin-classname
com.sun.enterprise.admin.cli.AdminMain
-upgrade
false
-type
DAS
-domaindir
/usr/local/glassfish4/glassfish/domains/domain1
-read-stdin
true]]

[2015-09-22T08:05:28.153+0200] [glassfish 4.1] [INFO] [NCLS-LOGGING-00009] [javax.enterprise.logging] [tid: _ThreadID=15 _ThreadName=RunLevelControllerThread-1442901927958] [timeMillis: 1442901928153] [levelValue: 800] [[
  Running GlassFish Version: GlassFish Server Open Source Edition  4.1  (build 13)]]

[2015-09-22T08:05:28.157+0200] [glassfish 4.1] [INFO] [NCLS-LOGGING-00010] [javax.enterprise.logging] [tid: _ThreadID=15 _ThreadName=RunLevelControllerThread-1442901927958] [timeMillis: 1442901928157] [levelValue: 800] [[
  Server log file is using Formatter class: com.sun.enterprise.server.logging.ODLLogFormatter]]

[2015-09-22T08:05:28.882+0200] [glassfish 4.1] [INFO] [NCLS-SECURITY-01115] [javax.enterprise.system.core.security] [tid: _ThreadID=14 _ThreadName=RunLevelControllerThread-1442901927955] [timeMillis: 1442901928882] [levelValue: 800] [[
  Realm [admin-realm] of classtype [com.sun.enterprise.security.auth.realm.file.FileRealm] successfully created.]]

[2015-09-22T08:05:28.928+0200] [glassfish 4.1] [INFO] [] [org.glassfish.ha.store.spi.BackingStoreFactoryRegistry] [tid: _ThreadID=15 _ThreadName=RunLevelControllerThread-1442901927958] [timeMillis: 1442901928928] [levelValue: 800] [[
  Registered org.glassfish.ha.store.adapter.cache.ShoalBackingStoreProxy for persistence-type = replicated in BackingStoreFactoryRegistry]]

[2015-09-22T08:05:28.998+0200] [glassfish 4.1] [INFO] [NCLS-SECURITY-01115] [javax.enterprise.system.core.security] [tid: _ThreadID=14 _ThreadName=RunLevelControllerThread-1442901927955] [timeMillis: 1442901928998] [levelValue: 800] [[

[2015-09-22T08:05:29.036+0200] [glassfish 4.1] [INFO] [NCLS-SECURITY-01115] [javax.enterprise.system.core.security] [tid: _ThreadID=14 _ThreadName=RunLevelControllerThread-1442901927955] [timeMillis: 1442901929036] [levelValue: 800] [[
  Realm [certificate] of classtype [com.sun.enterprise.security.auth.realm.certificate.CertificateRealm] successfully created.]]

[2015-09-22T08:05:29.267+0200] [glassfish 4.1] [INFO] [SEC-SVCS-00100] [javax.enterprise.security.services] [tid: _ThreadID=14 _ThreadName=RunLevelControllerThread-1442901927955] [timeMillis: 1442901929267] [levelValue: 800] [[
  Authorization Service has successfully initialized.]]

[2015-09-22T08:05:29.279+0200] [glassfish 4.1] [INFO] [NCLS-CORE-00087] [javax.enterprise.system.core] [tid: _ThreadID=17 _ThreadName=RunLevelControllerThread-1442901927967] [timeMillis: 1442901929279] [levelValue: 800] [[
  Grizzly Framework 2.3.15 started in: 167ms - bound to [/0.0.0.0:8080]]]

[2015-09-22T08:05:29.430+0200] [glassfish 4.1] [INFO] [NCLS-CORE-00087] [javax.enterprise.system.core] [tid: _ThreadID=17 _ThreadName=RunLevelControllerThread-1442901927967] [timeMillis: 1442901929430] [levelValue: 800] [[
  Grizzly Framework 2.3.15 started in: 2ms - bound to [/0.0.0.0:8181]]]

[2015-09-22T08:05:29.457+0200] [glassfish 4.1] [INFO] [NCLS-CORE-00087] [javax.enterprise.system.core] [tid: _ThreadID=17 _ThreadName=RunLevelControllerThread-1442901927967] [timeMillis: 1442901929457] [levelValue: 800] [[
  Grizzly Framework 2.3.15 started in: 3ms - bound to [/0.0.0.0:4848]]]

[2015-09-22T08:05:29.588+0200] [glassfish 4.1] [INFO] [NCLS-CORE-00087] [javax.enterprise.system.core] [tid: _ThreadID=14 _ThreadName=RunLevelControllerThread-1442901927955] [timeMillis: 1442901929588] [levelValue: 800] [[
  Grizzly Framework 2.3.15 started in: 1ms - bound to [/0.0.0.0:3700]]]

[2015-09-22T08:05:30.065+0200] [glassfish 4.1] [INFO] [AS-WEB-GLUE-00198] [javax.enterprise.web] [tid: _ThreadID=15 _ThreadName=RunLevelControllerThread-1442901927958] [timeMillis: 1442901930065] [levelValue: 800] [[
  Created HTTP listener http-listener-1 on host/port 0.0.0.0:8080]]

[2015-09-22T08:05:30.074+0200] [glassfish 4.1] [INFO] [AS-WEB-GLUE-00198] [javax.enterprise.web] [tid: _ThreadID=15 _ThreadName=RunLevelControllerThread-1442901927958] [timeMillis: 1442901930074] [levelValue: 800] [[
  Created HTTP listener http-listener-2 on host/port 0.0.0.0:8181]]

[2015-09-22T08:05:30.079+0200] [glassfish 4.1] [INFO] [AS-WEB-GLUE-00198] [javax.enterprise.web] [tid: _ThreadID=15 _ThreadName=RunLevelControllerThread-1442901927958] [timeMillis: 1442901930079] [levelValue: 800] [[
  Created HTTP listener admin-listener on host/port 0.0.0.0:4848]]

[2015-09-22T08:05:30.082+0200] [glassfish 4.1] [INFO] [AS-WEB-GLUE-00198] [javax.enterprise.web] [tid: _ThreadID=15 _ThreadName=RunLevelControllerThread-1442901927958] [timeMillis: 1442901930082] [levelValue: 800] [[
  Created HTTP listener jk-connector on host/port 0.0.0.0:8009]]

[2015-09-22T08:05:30.106+0200] [glassfish 4.1] [INFO] [AS-WEB-GLUE-00200] [javax.enterprise.web] [tid: _ThreadID=15 _ThreadName=RunLevelControllerThread-1442901927958] [timeMillis: 1442901930106] [levelValue: 800] [[
  Created virtual server server]]

[2015-09-22T08:05:30.109+0200] [glassfish 4.1] [INFO] [AS-WEB-GLUE-00200] [javax.enterprise.web] [tid: _ThreadID=15 _ThreadName=RunLevelControllerThread-1442901927958] [timeMillis: 1442901930109] [levelValue: 800] [[
  Created virtual server __asadmin]]

[2015-09-22T08:05:30.449+0200] [glassfish 4.1] [INFO] [AS-WEB-CORE-00306] [javax.enterprise.web.core] [tid: _ThreadID=15 _ThreadName=RunLevelControllerThread-1442901927958] [timeMillis: 1442901930449] [levelValue: 800] [[
  Setting JAAS app name glassfish-web]]

[2015-09-22T08:05:30.449+0200] [glassfish 4.1] [INFO] [AS-WEB-GLUE-00201] [javax.enterprise.web] [tid: _ThreadID=15 _ThreadName=RunLevelControllerThread-1442901927958] [timeMillis: 1442901930449] [levelValue: 800] [[
  Virtual server server loaded default web module ]]

[2015-09-22T08:05:30.818+0200] [glassfish 4.1] [INFO] [AS-CORE-JAVAEE-0002] [javax.enterprise.system.core.ee] [tid: _ThreadID=15 _ThreadName=RunLevelControllerThread-1442901927958] [timeMillis: 1442901930818] [levelValue: 800] [[
  Done with starting web container.]]

[2015-09-22T08:05:32.722+0200] [glassfish 4.1] [INFO] [] [javax.enterprise.system.tools.deployment.common] [tid: _ThreadID=16 _ThreadName=RunLevelControllerThread-1442901927967] [timeMillis: 1442901932722] [levelValue: 800] [[
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------


Philip Durbin

Sep 22, 2015, 10:35:31 AM
to dataverse...@googlegroups.com
Hi Xavier, thanks for installing Dataverse! :)

As you indicated it sounds like we need to tune your JVM, especially with regard to memory usage. I'm curious if you used the "install" script mentioned at http://guides.dataverse.org/en/4.1/installation/installer-script.html because it's supposed to figure out for you how much of a heap to give Glassfish: https://github.com/IQSS/dataverse/blob/v4.1/scripts/installer/install#L804 . I'd be curious to know what this line prints out during your installation:

print "\nSetting the heap limit for Glassfish to " . $gf_heap_default . "MB. \n";

I also noticed "-client" as a JVM option. You might want to change that to "-server". We recently noticed we haven't documented this in 4.x: https://github.com/IQSS/dataverse/issues/2521
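One way to make the -client to -server change without editing domain.xml by hand is through asadmin's `delete-jvm-options` and `create-jvm-options` subcommands. The sketch below assumes the asadmin path from the original post; `replace_jvm_option` is a hypothetical helper name, not something shipped with Dataverse or GlassFish.

```shell
#!/bin/sh
# Path taken from the original post; override ASADMIN if yours differs.
ASADMIN="${ASADMIN:-/usr/local/glassfish4/bin/asadmin}"

# Remove one JVM option and add its replacement. The new option is only
# created if the delete succeeded. asadmin uses ":" to separate multiple
# options, so an option that contains a colon (for example the
# -XX:MaxPermSize setting) needs that colon escaped with a backslash.
replace_jvm_option() {
  old="$1"
  new="$2"
  "$ASADMIN" delete-jvm-options "$old" \
    && "$ASADMIN" create-jvm-options "$new"
}
```

For the change suggested here that would be `replace_jvm_option -client -server`, followed by a domain restart so the new option takes effect. The same helper would cover an -Xmx change, given the exact old and new values.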

That issue also links to what seems like some good advice at http://blog.c2b2.co.uk/2013/07/glassfish-4-performance-tuning.html

I hope this helps!

Phil

--
You received this message because you are subscribed to the Google Groups "Dataverse Users Community" group.
To unsubscribe from this group and stop receiving emails from it, send an email to dataverse-commu...@googlegroups.com.
To post to this group, send email to dataverse...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/dataverse-community/d8ca1128-5a0e-4dc9-9472-af60e762dbc8%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



Xavier Tomaszynski

Sep 23, 2015, 1:58:19 AM
to Dataverse Users Community, philip...@harvard.edu
Hi Phil,

Yes, I am looking forward to working with and maintaining this application for our clients, especially now that I see that the support is so great and that we can count on you ;)

I have looked at your suggestions and recommendations; I will try them out and come back to you with feedback.

Thanks Phil,

Kind regards
-Xavier

On Tuesday, September 22, 2015 at 16:35:31 UTC+2, Philip Durbin wrote:

Xavier Tomaszynski

Sep 24, 2015, 5:59:55 AM
to Dataverse Users Community
Hello Phil,

Some feedback from what I have noticed since our last conversation.

I changed the setting from -client to -server but that didn't fix our problem. I tried to see if I could find what the output was during the installation of the line:

print "\nSetting the heap limit for Glassfish to " . $gf_heap_default . "MB. \n";

But I can't seem to find it. I think I no longer have the installation logs. So, hoping for the best, I added 2 GB of RAM to -Xmx, setting the value to 7168m.

The site still stops responding, but the "Java heap space" error no longer appears in my logs.

Do you think re-running the install.sh script is a bad idea?

Also, I noticed that the ports are listening on tcp6 and not on tcp. I suspect this is not much of an issue, but I wanted to mention it here.

tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN     
tcp        0      0 0.0.0.0:5432            0.0.0.0:*               LISTEN     
tcp        0      0 127.0.0.1:25            0.0.0.0:*               LISTEN     
tcp        0      0 0.0.0.0:5666            0.0.0.0:*               LISTEN     
tcp6       0      0 :::8009                 :::*                    LISTEN     
tcp6       0      0 :::3820                 :::*                    LISTEN     
tcp6       0      0 :::35918                :::*                    LISTEN     
tcp6       0      0 :::8686                 :::*                    LISTEN     
tcp6       0      0 :::57615                :::*                    LISTEN     
tcp6       0      0 :::3920                 :::*                    LISTEN     
tcp6       0      0 :::4848                 :::*                    LISTEN     
tcp6       0      0 :::8080                 :::*                    LISTEN     
tcp6       0      0 :::3700                 :::*                    LISTEN     
tcp6       0      0 :::8181                 :::*                    LISTEN     
tcp6       0      0 :::22                   :::*                    LISTEN     
tcp6       0      0 :::52599                :::*                    LISTEN     
tcp6       0      0 :::8983                 :::*                    LISTEN     
tcp6       0      0 :::5432                 :::*                    LISTEN     
tcp6       0      0 ::1:25                  :::*                    LISTEN     
tcp6       0      0 :::55388                :::*                    LISTEN     
tcp6       0      0 :::7676                 :::*                    LISTEN     
tcp6       0      0 :::5666                 :::*                    LISTEN     
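On Linux, a Java server that binds to the wildcard address usually opens dual-stack IPv6 sockets. netstat reports these as tcp6 even though they also accept IPv4 connections, as long as net.ipv6.bindv6only is 0, so the listing above is probably unrelated to the hang. To see which ports are affected, a small filter can list the ports that appear only as tcp6. This is a sketch; `tcp6_only_ports` is a hypothetical helper that expects netstat-style lines like the ones above on standard input.

```shell
#!/bin/sh
# Print ports that have a tcp6 listener but no plain tcp listener.
# The port is the last ":"-separated piece of the local address
# (column 4), which works for 0.0.0.0:22, :::8080 and ::1:25 alike.
tcp6_only_ports() {
  awk '
    $1 == "tcp"  { n = split($4, a, ":"); v4[a[n]] = 1 }
    $1 == "tcp6" { n = split($4, a, ":"); v6[++count] = a[n] }
    END {
      for (i = 1; i <= count; i++)
        if (!(v6[i] in v4)) print v6[i]
    }
  '
}
```

It could be fed with something like `netstat -tln | tcp6_only_ports`. The current kernel setting can be read from /proc/sys/net/ipv6/bindv6only.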

Kind regards.
Xavier

On Tuesday, September 22, 2015 at 16:06:18 UTC+2, Xavier Tomaszynski wrote:

Philip Durbin

Sep 24, 2015, 9:50:35 AM
to dataverse...@googlegroups.com
I don't really know the answer to "Do you think re-running the install.sh script is a bad idea?" I feel like some developers do this but I have my own lighter developer-oriented script that I use (see https://github.com/IQSS/dataverse/issues/2443 for details). Can you please write to sup...@dataverse.org for an answer to this?

You *could* play around with `vagrant up` if you want to see what heap limit is set. You could even change the amount of memory the Vagrant VM has before you start it: http://guides.dataverse.org/en/latest/developers/tools.html#vagrant . This (again) is more of a developer thing, though.

I don't see any mention of heap in the Installation Guide for Dataverse 4 but if you search for "heap" in the DVN 3.6 guides at http://guides.dataverse.org/en/3.6.2/dataverse-installer-main.html you'll see it mentioned.

The script I mentioned says "setting the default heap size limit to 3/8 of the available amount of memory" https://github.com/IQSS/dataverse/blob/v4.1/scripts/installer/install#L833 . This makes sense since https://apitest.dataverse.org has 16 GB of RAM and "-Xmx" was set to roughly 6 GB (16 * 3 / 8). You'll also notice that G1GC is in the domain.xml for that apitest server, but I honestly don't know a lot about it:

        <jvm-options>-XX:MaxPermSize=512m</jvm-options>
        <jvm-options>-XX:PermSize=256m</jvm-options>
        <jvm-options>-XX:+UseG1GC</jvm-options>
        <jvm-options>-Xmx5839m</jvm-options>
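The 3/8 rule quoted above is easy to reproduce by hand if the installer output is no longer available. The sketch below is not the installer's own code (that is a Perl script); `heap_limit_mb` and `total_memory_mb` are hypothetical names, and reading MemTotal from /proc/meminfo is an assumption about how total memory is determined on Linux.

```shell
#!/bin/sh
# 3/8 of a memory size given in MB, using integer arithmetic.
heap_limit_mb() {
  echo $(( $1 * 3 / 8 ))
}

# Total physical memory in MB as reported by the kernel (Linux only).
# MemTotal is given in kB, so divide by 1024 and truncate.
total_memory_mb() {
  awk '/^MemTotal:/ { print int($2 / 1024) }' /proc/meminfo
}
```

Running `heap_limit_mb "$(total_memory_mb)"` on the server gives a candidate value for -Xmx. As a check on the arithmetic, `heap_limit_mb 16384` prints 6144 and `heap_limit_mb 15571` prints 5839, the same figure as the -Xmx5839m entry above, which would be consistent with a 16 GB machine reporting somewhat less than 16384 MB.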

I'm pretty surprised your installation stops responding so quickly. Are you hitting it with a lot of traffic and activity? How much memory do you have?

I'm sorry I can't be of more assistance. Again, if you email sup...@dataverse.org you can get help. Later maybe you can help us summarize the solution for the mailing list.

Phil

p.s. There's also http://chat.dataverse.org where people who install Dataverse hang out.
