GC overhead limit exceeded when generating OAI index

Panyarak Ngamsritragul

Mar 22, 2021, 10:08:09 AM
to DSpace Community

Hi,

I am using DSpace 6.3, Apache Tomcat 8.0.37, and javac 11.0.10.

The instance I manage (kb.psu.ac.th) has 11,428 records.

I tried to build the OAI index with this command:

/dspace/bin/dspace oai import -c

It ran until about 8,900 items and then crashed with this error:

8600 items imported so far...
8700 items imported so far...
8800 items imported so far...
8900 items imported so far...
java.lang.OutOfMemoryError: GC overhead limit exceeded
    at java.util.Arrays.copyOfRange(Arrays.java:3664)
    at java.lang.String.<init>(String.java:207)
    at java.lang.StringBuilder.toString(StringBuilder.java:407)
    at org.hibernate.persister.entity.AbstractEntityPersister.selectFragment(AbstractEntityPersister.java:1422)
    at org.hibernate.persister.entity.AbstractEntityPersister.selectFragment(AbstractEntityPersister.java:4434)
    at org.hibernate.loader.JoinWalker.selectString(JoinWalker.java:1099)
    at org.hibernate.loader.AbstractEntityJoinWalker.initStatementString(AbstractEntityJoinWalker.java:123)
    at org.hibernate.loader.AbstractEntityJoinWalker.initStatementString(AbstractEntityJoinWalker.java:108)
    at org.hibernate.loader.AbstractEntityJoinWalker.initAll(AbstractEntityJoinWalker.java:90)
    at org.hibernate.loader.AbstractEntityJoinWalker.initAll(AbstractEntityJoinWalker.java:77)
    at org.hibernate.loader.criteria.CriteriaJoinWalker.<init>(CriteriaJoinWalker.java:123)
    at org.hibernate.loader.criteria.CriteriaJoinWalker.<init>(CriteriaJoinWalker.java:92)
    at org.hibernate.loader.criteria.CriteriaLoader.<init>(CriteriaLoader.java:95)
    at org.hibernate.internal.SessionImpl.list(SessionImpl.java:1604)
    at org.hibernate.internal.CriteriaImpl.list(CriteriaImpl.java:374)
    at org.dspace.core.AbstractHibernateDAO.list(AbstractHibernateDAO.java:158)
    at org.dspace.authorize.dao.impl.ResourcePolicyDAOImpl.findByDSoAndAction(ResourcePolicyDAOImpl.java:74)
    at org.dspace.authorize.ResourcePolicyServiceImpl.find(ResourcePolicyServiceImpl.java:103)
    at org.dspace.authorize.AuthorizeServiceImpl.getPoliciesActionFilter(AuthorizeServiceImpl.java:575)
    at org.dspace.authorize.AuthorizeServiceImpl.authorize(AuthorizeServiceImpl.java:301)
    at org.dspace.authorize.AuthorizeServiceImpl.authorizeAction(AuthorizeServiceImpl.java:129)
    at org.dspace.authorize.AuthorizeServiceImpl.authorizeAction(AuthorizeServiceImpl.java:95)
    at org.dspace.authorize.AuthorizeServiceImpl.authorizeActionBoolean(AuthorizeServiceImpl.java:181)
    at org.dspace.authorize.AuthorizeServiceImpl.authorizeActionBoolean(AuthorizeServiceImpl.java:166)
    at org.dspace.xoai.app.XOAI.isPublic(XOAI.java:458)
    at org.dspace.xoai.app.XOAI.index(XOAI.java:343)
    at org.dspace.xoai.app.XOAI.index(XOAI.java:280)
    at org.dspace.xoai.app.XOAI.indexAll(XOAI.java:227)
    at org.dspace.xoai.app.XOAI.index(XOAI.java:134)
    at org.dspace.xoai.app.XOAI.main(XOAI.java:560)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)

Could you please help me solve this problem?

Thanks and best regards,
Panyarak Ngamsritragul

FILIPPOS KOLOVOS

Mar 22, 2021, 12:16:59 PM
to pany...@gmail.com, DSpace Community
Dear Sir,

This error comes from the JVM's garbage collection (GC) mechanism and means that the JVM does not have enough memory to complete the task. More specifically, the garbage collector is spending almost all of its time (by default more than 98%) trying to clear unused objects from the heap, yet each collection frees less than 2% of the heap. That is why the JVM gives up and reports that the GC overhead limit has been exceeded.

One thing you can do, if you do not have much RAM on your server, is to shut down Tomcat first. This causes some downtime, but it frees up valuable RAM, and you can restart Tomcat after the OAI import completes. Then open the /dspace/bin/dspace file of your running DSpace instance in a text editor and find the line:

JAVA_OPTS="-Xmx256m -Dfile.encoding=UTF-8"

and change it to:

 JAVA_OPTS="-Xmx2048m -Xms1024m -Dfile.encoding=UTF-8"

DO NOT copy and paste the line, but edit it in place, because you do not want to mess up this file.

This tells the dspace tool (the script you run for the OAI import) to start with 1024 MB of heap and use up to 2048 MB, instead of just 256 MB. However, you must have more than 2048 MB of RAM available on the server (e.g. 3 GB), or the script will fail again and might hang your server. If the server has less than 3 GB, adjust these values accordingly (e.g. -Xmx1024m -Xms500m).

The bottom line is that the tool does not have enough memory to complete the task. Even if your server has 32 GB of RAM, the dspace tool will only use up to the amount specified in this file.
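
As a rough illustration of the edit described above, here is a minimal shell sketch that rewrites the default heap line and keeps a backup first. The function name set_dspace_heap and the .bak suffix are hypothetical, and the sketch assumes the line in your script matches the default shown above exactly. Editing the file by hand, as described, works just as well.

```shell
#!/bin/sh
# Hypothetical helper (not part of DSpace): change the default heap settings
# in a launcher script.
# $1 = path to the script, $2 = max heap in MB (-Xmx), $3 = initial heap in MB (-Xms)
set_dspace_heap() {
    script="$1"; xmx="$2"; xms="$3"

    # Keep a backup next to the original before touching it.
    cp "$script" "$script.bak" || return 1

    # Rewrite only the line that sets the default 256 MB heap, preserving its
    # indentation. Writing back with '>' keeps the file's existing permissions.
    sed "s|^\([[:space:]]*\)JAVA_OPTS=\"-Xmx256m -Dfile.encoding=UTF-8\"|\1JAVA_OPTS=\"-Xmx${xmx}m -Xms${xms}m -Dfile.encoding=UTF-8\"|" \
        "$script.bak" > "$script"

    # Report when the expected default line was not present.
    if ! grep -q "Xmx${xmx}m" "$script"; then
        echo "default JAVA_OPTS line not found; nothing changed" >&2
        return 1
    fi
}

# Example call (run it with Tomcat stopped, as described above):
# set_dspace_heap /dspace/bin/dspace 2048 1024
```

If the result is not what you expect, copying the .bak file back restores the original script.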

I hope that I have helped!

Best Regards,

-Fk

Panyarak Ngamsritragul

Mar 23, 2021, 5:56:14 AM
to FILIPPOS KOLOVOS, DSpace Community

Thanks a lot, Filippos.  That really helped solve my problem!

However, I got another error after the items were imported successfully:
11400 items imported so far...
11500 items imported so far...
Total: 11505 items
java.lang.NullPointerException
    at org.dspace.xoai.app.XOAI.willChangeStatus(XOAI.java:438)
    at org.dspace.xoai.app.XOAI.index(XOAI.java:368)
    at org.dspace.xoai.app.XOAI.index(XOAI.java:280)
    at org.dspace.xoai.app.XOAI.indexAll(XOAI.java:227)
    at org.dspace.xoai.app.XOAI.index(XOAI.java:134)
    at org.dspace.xoai.app.XOAI.main(XOAI.java:560)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.dspace.app.launcher.ScriptLauncher.runOneCommand(ScriptLauncher.java:229)
    at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:81)

I searched the web and found that this will be fixed in 6.4:
https://alanorth.github.io/cgspace-notes/2020-06/

Thanks a lot.
Panyarak

FILIPPOS KOLOVOS

Mar 23, 2021, 11:57:43 AM
to Panyarak Ngamsritragul, DSpace Community
Dear Sir,

I am glad I could help.

This XOAI import error also happened on our installation a few years ago. It sometimes occurs because the import tries to index in OAI an item from the Solr index that has no handle. That means an item without a handle has been indexed in Solr, which should not happen.

This is unlikely for an ordinary item, but it can occur for an Item Template that is defined for a collection and has been withdrawn (which should NOT happen).
You should check all withdrawn items in DSpace. If you find any that are Item Templates rather than ordinary items, delete them and recreate them for the collection(s) they belong to. Make a note of the pre-filled data first, so that you can recreate them.

One query you can run in PostgreSQL to see which items have no handle is:
SELECT * FROM item WHERE NOT EXISTS (SELECT resource_id FROM handle WHERE handle.resource_id = item.uuid AND handle.resource_type_id = 2);

The query returns the items that have no handle. For most of them this is normal, since they may be incomplete submissions or submissions that have not completed the workflow. You will also see item templates in that list; they do not have handles either, and that is normal.
However, check the "withdrawn" column for every one of these items. Where it is "t" (true), check whether the item is an item template. You can do that by pasting the uuid into the "internal ID" field of the DSpace administrative interface under "Content->Items". If it is, you must delete it and recreate it for the collection it belongs to, this time without withdrawing it. After recreating the templates, and before you run the OAI index script, you have to rebuild the Solr index with
/dspace/bin/dspace index-discovery -b -f so that the erroneous item is excluded from the index (-b deletes the existing index and -f forces every item to be reindexed, so make sure your reindexing process works well, or keep a backup of the index before attempting this necessary step; otherwise you might end up with no search index if it crashes).

If that does not solve the problem, then some other item without a handle, ordinary or Item Template, may be the culprit, having been indexed in Solr by mistake.

If you would like to dive into Solr, there is a query you can run against Solr directly from the console of your Linux server, which returns all indexed documents that have no handle. The returned file is JSON, but I think it will provide enough information to spot the erroneous items. The query is (keep the single and double quotes, or the shell might not run the command):
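
The command itself is missing from this message. A possible reconstruction is sketched here, not the original command: the q, wt and indent values are taken from the Solr response header quoted in the next reply, while the host, port and core in the URL (localhost:8080, core "search") and the function names are assumptions that you must adapt to your installation.

```shell
#!/bin/sh
# Assumed Solr location; adjust host, port and core name to your installation.
SOLR_SELECT_URL="${SOLR_SELECT_URL:-http://localhost:8080/solr/search/select}"

# Build the query URL for "all documents without a handle field".
# The query  *:* AND -handle:{* TO *}  is URL-encoded here
# (':' -> %3A, space -> '+', '{' -> %7B, '}' -> %7D).
handleless_query_url() {
    printf '%s?q=%s&wt=json&indent=true\n' \
        "$SOLR_SELECT_URL" '*%3A*+AND+-handle%3A%7B*+TO+*%7D'
}

# Fetch the JSON result to standard output. The double quotes around the URL
# keep the shell from interpreting '*', '&' and '?'.
fetch_handleless_docs() {
    wget -q -O - "$(handleless_query_url)"
}

# Example call (needs a running Solr):
# fetch_handleless_docs
```

A "numFound" of 0 in the response means that no indexed document lacks a handle.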


I hope this also helps you

Best Regards,

-Fk

Panyarak Ngamsritragul

Mar 24, 2021, 5:49:35 AM
to FILIPPOS KOLOVOS, DSpace Community
Dear FILIPPOS,

Thanks a lot for your advice.  I have tried both of your suggested approaches:
1. I ran the query and found no record with "withdrawn" marked as "t".  All are "f".
2. Running the wget command returns 0 items without a handle:
{
  "responseHeader":{
    "status":0,
    "QTime":1,
    "params":{
      "q":"*:* AND -handle:{* TO *}",
      "indent":"true",
      "wt":"json"}},
  "response":{"numFound":0,"start":0,"docs":[]
  }}


I also rebuilt the Solr index with /dspace/bin/dspace index-discovery -b -f and then ran /dspace/bin/dspace oai import -c, and I still got the same error.

Anyway, your kind advice is much appreciated.

Panyarak

FILIPPOS KOLOVOS

Mar 24, 2021, 2:48:50 PM
to Panyarak Ngamsritragul, DSpace Community
Dear Sir,

I took a second look at your previous reply, where you link to this bug, which is fixed in DSpace 6.4. I should have looked earlier, because the page analyzes the faulty code.

I also run two DSpace installations, one of which is version 6.3. I frequently work on the source code to apply customizations and/or bug fixes. So I applied the XOAI patch to my 6.3 installation, and I am attaching the compiled XOAI.class file for you to try.

My installation uses Tomcat 9.0.24 and Java 8u191.
  1. First, shut down Tomcat.
  2. Then replace the file [running-dspace-directory]/webapps/oai/WEB-INF/classes/org/dspace/xoai/app/XOAI.class with the attached one, and it should work fine ([running-dspace-directory] is the root directory of your running DSpace instance). In case the Java versions do not match (which I do not expect), keep a backup of the old XOAI.class file.
  3. Then check whether this class is in Tomcat's work cache and, if so, delete it so that the new file takes effect. The cache is normally at /usr/local/apache-tomcat-version/work/Catalina/localhost/oai/XOAI.class, where apache-tomcat-version is the directory of your Tomcat version.
  4. Restart Tomcat.
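
Step 2 above (back up, then replace) could be scripted roughly as follows. This is only a sketch: the function name install_patched_class and the .orig suffix are hypothetical, and stopping and restarting Tomcat is left to whatever method your server uses.

```shell
#!/bin/sh
# Hypothetical helper: back up the deployed XOAI.class, then copy the patched
# file over it.
# $1 = patched XOAI.class, $2 = root of the oai webapp (e.g. /dspace/webapps/oai)
install_patched_class() {
    patched="$1"
    target="$2/WEB-INF/classes/org/dspace/xoai/app/XOAI.class"

    # Stop early if the webapp root is wrong.
    [ -f "$target" ] || { echo "not found: $target" >&2; return 1; }

    # Keep the original (with its timestamps) so the change can be rolled back.
    cp -p "$target" "$target.orig" || return 1

    cp "$patched" "$target"
}

# Example call, with Tomcat shut down first:
# install_patched_class ./XOAI.class /dspace/webapps/oai
```

To roll back, copy XOAI.class.orig over XOAI.class again.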

This time I hope that you will solve your problem and finally reindex OAI.

Best Regards,

-Fk

XOAI.class

Panyarak Ngamsritragul

Mar 25, 2021, 3:00:21 AM
to FILIPPOS KOLOVOS, DSpace Community
Dear FILIPPOS,

Thanks again for your advice.
I tried your suggestion right away, but still no luck.  The same error persists.

BTW, your suggestions are helping me learn a lot.   Thank you.

Panyarak

FILIPPOS KOLOVOS

Mar 25, 2021, 8:38:53 AM
to Panyarak Ngamsritragul, DSpace Community
Hm, that's odd. Are you sure the old XOAI.class is not still being served from Tomcat's cache? Did you check for it, delete it, and do a shutdown-start cycle when you updated XOAI? The new XOAI.class file should be 24,328 bytes and the old one 24,312.

The error is thrown exactly at line 438, in the method willChangeStatus(), and it is a NullPointerException. At exactly that line (and at line 331, but that is not where the error is thrown in your case) I inserted the patched code, which explicitly checks whether the policy's group is null before doing anything else. It should not throw the error that I include below.

The only other cause I can think of is that you have item policies with valid (non-null) groups whose names are null, so it crashes at the next step, where it checks policy.getGroup().getName(). I "enhanced" the XOAI.class patch a bit by adding a check that the group's name is not null before comparing it (policy.getGroup().getName().equals("Anonymous")), because that is the only other place that could throw an exception like this.

I recompiled it and am attaching it again for you to try if you wish. The new XOAI.class file should be 24,350 bytes. For such a small patch, it would be a pity not to solve this error.
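
A quick way to confirm which build is on disk is to compare the file size with the byte counts mentioned above (24,312 for the original, 24,328 for the first patch, 24,350 for the second). A small sketch, with hypothetical function names:

```shell
#!/bin/sh
# Print the size of a file in bytes. 'tr' strips the padding that some
# versions of wc add around the number.
file_size() {
    wc -c < "$1" | tr -d '[:space:]'
}

# Succeed only when the file ($1) has the expected size in bytes ($2).
check_class_size() {
    actual=$(file_size "$1")
    if [ "$actual" -eq "$2" ]; then
        echo "OK: $1 is $actual bytes"
    else
        echo "MISMATCH: $1 is $actual bytes, expected $2" >&2
        return 1
    fi
}

# Example call for the second patched build:
# check_class_size /dspace/webapps/oai/WEB-INF/classes/org/dspace/xoai/app/XOAI.class 24350
```

A matching size only shows that the file on disk is the expected build. It does not show which copy of the class the running process actually loads.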

Error thrown:

Total: 11505 items
java.lang.NullPointerException
    at org.dspace.xoai.app.XOAI.willChangeStatus(XOAI.java:438)
    at org.dspace.xoai.app.XOAI.index(XOAI.java:368)
    at org.dspace.xoai.app.XOAI.index(XOAI.java:280)
    at org.dspace.xoai.app.XOAI.indexAll(XOAI.java:227)
    at org.dspace.xoai.app.XOAI.index(XOAI.java:134)
    at org.dspace.xoai.app.XOAI.main(XOAI.java:560)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.dspace.app.launcher.ScriptLauncher.runOneCommand(ScriptLauncher.java:229)
    at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:81)

XOAI.class

Panyarak Ngamsritragul

Mar 26, 2021, 4:48:58 AM
to FILIPPOS KOLOVOS, DSpace Community
Dear FILIPPOS

I have tried your new file, but I still get the same error.
I suspect that I may not have installed it correctly.

I installed Tomcat manually, so the XOAI.class file should go in /opt/tomcat/webapps/oai/WEB-INF/classes/org/dspace/xoai/app, and I found no cached XOAI.class in /opt/tomcat/work/Catalina/localhost/oai.

I wonder whether Tomcat places its cache files somewhere else...

This is what I get when I list the contents of the directories:
panya@kb:/opt/tomcat/work/Catalina/localhost/oai$ ls -la
total 8
drwxr-x---  2 tomcat tomcat 4096 Jun 24  2020 .
drwxr-x--- 16 tomcat tomcat 4096 Nov  5 08:27 ..
panya@kb:/opt/tomcat/work/Catalina/localhost/oai$

panya@kb1:/opt/tomcat/webapps/oai/WEB-INF/classes/org/dspace/xoai/app$ ls -la
total 44
drwxr-xr-x  2 tomcat tomcat  4096 Oct  1 11:39 .
drwxr-xr-x 10 tomcat tomcat  4096 Oct  1 11:39 ..
-rw-r--r--  1 tomcat tomcat  5364 Oct  1 11:39 BasicConfiguration.class
-rw-r--r--  1 tomcat tomcat  2881 Oct  1 11:39 DSpaceWebappConfiguration.class
-rw-r--r--  1 tomcat tomcat 24350 Mar 26 15:12 XOAI.class
panya@kb1:/opt/tomcat/webapps/oai/WEB-INF/classes/org/dspace/xoai/app$

Maybe you can modify the first output line of XOAI.class so that I can check which version of XOAI.class Tomcat is calling...
OAI 2.0 manager action started  <--- Can you modify this?
Clearing index
Index cleared
Using full import.
Full import
100 items imported so far...
200 items imported so far...

Thanks a  lot.
Panyarak


FILIPPOS KOLOVOS

Mar 26, 2021, 5:51:14 AM
to Panyarak Ngamsritragul, DSpace Community
No problem,

I am reattaching the file with a small string added at the beginning. It should now print: OAI 2.0 manager action started - New Version.

And yes, the paths you are using are correct. I am not aware of another folder where Tomcat might keep its cache.

I wonder whether the DSpace installation uses WAR files to deploy the application. In that case, every time Tomcat is restarted it redeploys the old WAR file, overwriting the new XOAI.class with the one in the WAR.
One way to make it work in that case would be to put the new XOAI.class into the WAR file with the jar tool and then start Tomcat. However, manipulating WAR files like that is a bit risky, since a mistake might lead to severe problems with the site. These WAR files (if used) are produced when DSpace is compiled/installed, and they contain the class files as of that time. This was an older method of deploying DSpace, and as far as I know it is no longer used in newer versions of DSpace (>3.0, I think).

I prefer to use symbolic links to the real application paths rather than WAR files, since it is more straightforward.

Best Regards,

-Fk
XOAI.class

FILIPPOS KOLOVOS

Mar 26, 2021, 6:50:11 AM
to Panyarak Ngamsritragul, DSpace Community
Also, check the beginning of the dspace/bin/dspace tool file, where it sets the classpath that DSpace uses specifically for OAI.

It reads something like this (taken from the default installation). Is the path correct? The dspace tool's configuration sometimes differs from Tomcat's:

...........
...........
BINDIR=`dirname $0`
DSPACEDIR=`cd "$BINDIR/.." ; pwd`

# Add the directory with all oai classes (original and overlay) needed by the 'dspace oai' launcher.xml
OAICLASSES=$DSPACEDIR/webapps/oai/WEB-INF/classes/

-Fk

Panyarak Ngamsritragul

Mar 28, 2021, 11:10:55 PM
to FILIPPOS KOLOVOS, DSpace Community
Good morning FILIPPOS!

Your last email shed light on this!  I did not quite understand how applications run under Tomcat containers.

When DSpace is compiled and deployed, we are instructed to copy the contents of /dspace/webapps to Tomcat's webapps directory, and I did that too...

I think I was still confused about how these modules are called.  The /dspace/bin/dspace script appears to use /dspace/webapps/oai/...  This morning I copied the XOAI.class to that location and ran the oai import again.  This time it worked just fine!  Not a single error.

However, when OAI is accessed via the browser, I think (in my case) /opt/tomcat/webapps/oai is used.  So it is better to create symbolic links than to copy all the contents.
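
The mismatch described here, where the command-line tool and Tomcat each load their own copy of the class, can be checked with a short comparison. This is a sketch with a hypothetical function name, and the two directories in the example call are simply the ones mentioned in this thread.

```shell
#!/bin/sh
# Report whether two oai webapp directories contain the same XOAI.class.
# $1 = webapp root used by the dspace script, $2 = webapp root used by Tomcat
compare_oai_class() {
    rel="WEB-INF/classes/org/dspace/xoai/app/XOAI.class"
    # cmp -s is silent; it fails when the files differ or one is missing.
    if cmp -s "$1/$rel" "$2/$rel"; then
        echo "identical"
    else
        echo "different"
    fi
}

# Example call:
# compare_oai_class /dspace/webapps/oai /opt/tomcat/webapps/oai
```

If the result is "different", a patch applied to only one of the two locations will affect either the command-line import or the web interface, but not both.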

Thanks a lot for your help.

Panyarak

FILIPPOS KOLOVOS

Mar 29, 2021, 3:32:13 AM
to Panyarak Ngamsritragul, DSpace Community
Great! Glad to be of help, and now you can proceed with a proper OAI index!

Well, there are various ways to run webapps under Tomcat.
  • The most common is to copy the webapps directly into Tomcat's "webapps" directory.
  • Another way is to create symbolic links in Tomcat's webapps directory that point to the real webapp, wherever it is on disk. In this case, every time DSpace is updated, Tomcat will still find the current files (via the symbolic link).
  • A third way is neither to copy nor to create symbolic links, but to create a configuration file for each webapp in Tomcat's configuration directory. These files tell Tomcat where to find each webapp and are read every time Tomcat is restarted. They normally go in [tomcat running directory]/conf/Catalina/localhost and are XML files, one per webapp. For example, I have two such files, for the oai and solr webapps, in the /usr/local/apache-tomcat-9.0.24/conf/Catalina/localhost directory, plus a symbolic link for the jspui webapp in Tomcat's webapps directory. So I have a somewhat mixed configuration.
For both of these webapps, oai and solr, I neither copy the webapp contents nor create symbolic links. The configuration file for oai is oai.xml and its contents are:

<?xml version='1.0'?>
<Context
    docBase="/dspace/webapps/oai"
    debug="0"
    reloadable="true"
    cachingAllowed="false"
    allowLinking="true"
/>

Similarly for solr.
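
The symbolic-link option from the list above could look roughly like this. It is a sketch only: the function name link_webapp is hypothetical, the paths in the example call are the ones mentioned in this thread, and Tomcat should be stopped while the link is created.

```shell
#!/bin/sh
# Hypothetical helper: make a name in Tomcat's webapps directory point at the
# webapp that DSpace builds.
# $1 = real webapp directory, $2 = link to create in Tomcat's webapps directory
link_webapp() {
    src="$1"
    dest="$2"

    # Refuse to replace a real directory (for example an old copied webapp);
    # move it aside by hand first.
    if [ -e "$dest" ] && [ ! -L "$dest" ]; then
        echo "refusing to replace existing directory: $dest" >&2
        return 1
    fi

    # -s creates a symbolic link, -f replaces an existing link, and -n keeps
    # ln from descending into the directory an existing link points to.
    ln -sfn "$src" "$dest"
}

# Example call:
# link_webapp /dspace/webapps/oai /opt/tomcat/webapps/oai
```

With the link in place, the command-line tool and Tomcat read the same files, so a replaced class takes effect for both.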

Best Regards,

-Fk

Panyarak Ngamsritragul

Mar 29, 2021, 3:36:40 AM
to FILIPPOS KOLOVOS, DSpace Community
Dear FILIPPOS,

Thanks.  That helps me learn a lot more!

All the best,
Panyarak