Problem with the solr reindex command in upgrade step 13 to 5.3


Arthur Sady Cordeiro Rossetti

Aug 31, 2015, 7:22:55 AM
to dspac...@googlegroups.com
Hi,

I upgraded from DSpace 4.x to 5.3 and started having problems with my geo statistics. After reading about it in:

https://jira.duraspace.org/browse/DS-2486

and in the upgrade notes for 5.3 regarding Solr, where step 13 says that if the installation is being upgraded from a version before 5.x, it is necessary to reindex the Solr statistics:

https://wiki.duraspace.org/display/DSDOC5x/Upgrading+DSpace

Using the command suggested in the tutorial:

[dspace]/bin/dspace solr-reindex-statistics

This fix was implemented in DSpace 5.2, if I'm not mistaken, and from what I read in the related tickets it should solve my problem with the geo statistics. But when I run the command, it starts and never finishes. I'm running it on a testing machine, and the stats data are a little under 7 GB. I let it run for over 72 hours to no avail. When I check the processor with htop, it shows the command isn't using any CPU. I don't know how to reindex my statistics; is there another way to do this without the command?

When I stop the command halfway through, it completely messes up the statistics, and to get them back to normal I have to rebuild DSpace.

If anyone could help me solve this problem I would be very grateful.

Thanks in advance for your attention.

--
Arthur Sady C. Rossetti



Hilton Gibson

Aug 31, 2015, 7:40:08 AM
to Arthur Sady Cordeiro Rossetti, dspac...@googlegroups.com
Hi Arthur,

I have the same problem.
I tried the re-index on my staging server and it sort of worked; I had to be creative with the switches.
However on my production system it just hangs.
I would say this is a major upgrade blocker.

Cheers

hg

Hilton Gibson
Ubuntu Linux Systems Administrator
Stellenbosch University Library


--
You received this message because you are subscribed to the Google Groups "DSpace Technical Support" group.
To unsubscribe from this group and stop receiving emails from it, send an email to dspace-tech...@googlegroups.com.
To post to this group, send email to dspac...@googlegroups.com.
Visit this group at http://groups.google.com/group/dspace-tech.
For more options, visit https://groups.google.com/d/optout.

Andrea Schweer

Sep 1, 2015, 12:51:35 AM
to Hilton Gibson, Arthur Sady Cordeiro Rossetti, dspac...@googlegroups.com
Hi,

That's not good; I tried to put a lot of error handling into the reindex! It worked fine for me and for everyone else who tested it for DSpace 5.2, but I know that doesn't mean it will work in all situations. Could you both give us some more information, please? Hopefully that'll help us troubleshoot.

When it hangs, is there anything at all relevant looking in the logs or on the command line? Does it look like the server is actually doing something (eg system load is higher than usual)?

Which user owns the solr data directory and which user did you run the reindex script as?

What's the size of your solr data directory?

Hilton, what combination of flags made this work on your staging server in the end?

Have you tried upgrading to 5.2, running the reindex there, then upgrading to 5.3 (you don't need another reindex when upgrading 5.2->5.3)?

cheers,
Andrea
-- 
Dr Andrea Schweer
IRR Technical Specialist, ITS Information Systems
The University of Waikato, Hamilton, New Zealand
+64-7-837 9120

Hilton Gibson

Sep 1, 2015, 4:11:09 AM
to Andrea Schweer, Arthur Sady Cordeiro Rossetti, dspac...@googlegroups.com
Hi Andrea,

This is from the history.
>>>>
dspace@repository:~$ history | grep solr-reindex-statistics 
 1987  sudo $HOME/bin/dspace solr-reindex-statistics -k
 1997  sudo $HOME/bin/dspace solr-reindex-statistics -k
>>>>

Cheers

hg



Hilton Gibson
Ubuntu Linux Systems Administrator
Stellenbosch University Library


Arthur Sady Cordeiro Rossetti

Sep 1, 2015, 1:13:06 PM
to Hilton Gibson, Andrea Schweer, dspac...@googlegroups.com
Well, I tried the command first on DSpace 5.2, and as it didn't work I then tried 5.3, but the result was the same. I ran the command as the root user; I checked, and the access permissions should be normal. The size, as I mentioned before, is close to 7 GB. As for the log, I'm not sure if it's related, but I tailed catalina.out and found this. I will try increasing the heap space and see what happens, as soon as I work out how to do it.


Exception in thread "http-bio-8080-exec-10" java.lang.OutOfMemoryError: Java heap space
        at org.apache.lucene.util.PagedBytes.copyUsingLengthPrefix(PagedBytes.java:265)
        at org.apache.lucene.search.FieldCacheImpl$SortedDocValuesCache.createValue(FieldCacheImpl.java:1305)
        at org.apache.lucene.search.FieldCacheImpl$Cache.get(FieldCacheImpl.java:213)
        at org.apache.lucene.search.FieldCacheImpl.getTermsIndex(FieldCacheImpl.java:1232)
        at org.apache.lucene.search.FieldCacheImpl.getTermsIndex(FieldCacheImpl.java:1212)
        at org.apache.lucene.queries.function.docvalues.DocTermsIndexDocValues.<init>(DocTermsIndexDocValues.java:48)
        at org.apache.solr.schema.DateFieldSource$1.<init>(DateField.java:489)
        at org.apache.solr.schema.DateFieldSource.getValues(DateField.java:489)
        at org.apache.solr.handler.component.AbstractStatsValues.setNextReader(StatsValuesFactory.java:220)
        at org.apache.solr.handler.component.SimpleStats.getFieldCacheStats(StatsComponent.java:368)
        at org.apache.solr.handler.component.SimpleStats.getStatsFields(StatsComponent.java:326)
        at org.apache.solr.handler.component.SimpleStats.getStatsCounts(StatsComponent.java:290)
        at org.apache.solr.handler.component.StatsComponent.process(StatsComponent.java:79)
        at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:218)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1967)
        at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:777)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:418)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:207)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
        at org.dspace.solr.filters.LocalHostRestrictionFilter.doFilter(LocalHostRestrictionFilter.java:50)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:224)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:169)
        at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:472)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:168)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:98)
        at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:927)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:407)

Andrea Schweer

Sep 1, 2015, 5:23:14 PM
to Arthur Sady Cordeiro Rossetti, Hilton Gibson, dspac...@googlegroups.com, Pottinger, Hardy J.
Hi,

On 02/09/15 05:12, Arthur Sady Cordeiro Rossetti wrote:
> Well, I have tried to use the command first at dspace 5.2 and as it
> didn't work I then tried 5.3 but the result was the same. I used the
> comand as the root user, I checked and the acces should be normal. The
> size as I mentioned before is close to 7Gb. As for the the log, I'm
> not sure if its related but I tried to see tail the catalina.out and
> found this, I will try increasing the heap space and see what happens,
> as soon as I discover how to do it.
>
>
> Exception in thread "http-bio-8080-exec-10"
> java.lang.OutOfMemoryError: Java heap space

Well spotted Arthur. I thought we had cut down on the memory
requirements, but unfortunately not enough it seems. This might also
explain what Hilton saw; the flags he used shouldn't have made a
difference, so perhaps success or not was determined by how busy Tomcat
was at the time the command was run.

Where to increase the heap space for Tomcat depends on your installation
of Tomcat; on my RHEL6 machines it's in /etc/tomcat6/tomcat6.conf, add a
line like

JAVA_OPTS="${JAVA_OPTS} -Xmx2048m -Xms2048m -XX:MaxPermSize=512m"

just swapping in how much memory you think you might need. The biggest
index I've run the reindex on is ~4GB and I believe the settings were as
above. I believe Hardy ran this on a bigger index during testing; Hardy,
do you recall what your heap space settings were?
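For Tomcat installs that don't use a distribution-specific config file like the RHEL one above, the usual place is a `setenv.sh` next to `catalina.sh` (a sketch only; the path and the sizes below are assumptions to adapt to your setup and available RAM):

```shell
# [hypothetical path] $CATALINA_BASE/bin/setenv.sh -- sourced automatically
# by catalina.sh on startup if it exists. Create it if it doesn't.
# Heap sizes are examples; tune them to your server.
CATALINA_OPTS="$CATALINA_OPTS -Xms2048m -Xmx2048m -XX:MaxPermSize=512m"
export CATALINA_OPTS
```

Restart Tomcat afterwards so the new options take effect; `ps aux | grep java` should then show the `-Xmx` value in the running process's arguments.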

cheers,
Andrea

Hilton Gibson

Sep 1, 2015, 5:33:00 PM
to Andrea Schweer, Arthur Sady Cordeiro Rossetti, dspac...@googlegroups.com, Pottinger, Hardy J.
Hi Andrea,

Would it be possible to run the script in "batch mode"?
For example, process year by year or month by month.
To do a full re-index of everything will take a very long time.
It would be nice to see the script working in an incremental mode with verbose output.

Cheers

hg

Hilton Gibson
Ubuntu Linux Systems Administrator
Stellenbosch University Library


Brian Freels-Stendel

Sep 1, 2015, 5:43:56 PM
to Andrea Schweer, Arthur Sady Cordeiro Rossetti, Hilton Gibson, dspac...@googlegroups.com, Pottinger, Hardy J.
Hello,

I believe the Xmx option is set to 256M in the dspace script ([dspace]/bin/dspace). Won't that override the tomcat conf?

B--

Andrea Schweer

Sep 1, 2015, 5:58:01 PM
to Brian Freels-Stendel, Arthur Sady Cordeiro Rossetti, Hilton Gibson, dspac...@googlegroups.com, Pottinger, Hardy J.
Hi Brian,

On 02/09/15 09:43, Brian Freels-Stendel wrote:
> I believe the Xmx option is set to 256M in the dspace script ([dspace]/bin/dspace). Won't that override the tomcat conf?

No, the dspace script only sets the memory for the command-line tools.
Based on the error message Arthur posted, it looked like *Tomcat* was
running out of memory, not the command-line script (also backed up by
the fact that the error message showed up in the Tomcat log file).

Andrea Schweer

Sep 1, 2015, 6:04:35 PM
to Hilton Gibson, Arthur Sady Cordeiro Rossetti, dspac...@googlegroups.com, Pottinger, Hardy J.
Hi Hilton,


On 02/09/15 09:32, Hilton Gibson wrote:
Would it be possible to run the script in "batch mode"?
For example, process year by year or month by month.
To do a full re-index of everything will take a very long time.
It would be nice to see the script working in an incremental mode with verbose output.

The reindex actually creates a brand-new Solr core and then essentially copies all records across from the existing stats core to the new one, fixing up the fields in the process. Then it does some magic swapping of cores so that everything still points to the right place. So yes, it's possible to take my code apart and change it so it doesn't do all the actions behind one single command. That might help with the really big cores, but it would also mean that the person running the reindex needs to understand more about what's going on and which steps need to happen in which order.

Perhaps open a Jira issue for changing the reindex script; hopefully someone will volunteer?
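The export and import halves of the process described above are also exposed as separate launcher commands in DSpace 5.x, so in principle they can be run by hand (a hedged sketch; verify the command names with `[dspace]/bin/dspace help` on your version before relying on them):

```shell
# Export the existing statistics core to CSV files. Point -d at a
# directory with more free space than the core itself.
[dspace]/bin/dspace solr-export-statistics -d /path/with/space

# ...and, after the core has been cleared/recreated, import them back:
[dspace]/bin/dspace solr-import-statistics -d /path/with/space
```

This still isn't an incremental reindex, but it does split the long-running work into two observable phases.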

Andrea Schweer

Sep 1, 2015, 6:24:58 PM
to Pottinger, Hardy J., Arthur Sady Cordeiro Rossetti, Hilton Gibson, dspac...@googlegroups.com
Hi Hardy,

On 02/09/15 10:18, Pottinger, Hardy J. wrote:
> Hi, I didn't have to re-set the heap space for anything for our upgrade, so I can just tell you what we have set now and hope that helps.
>
> Here's our production setenv.sh script:
>
> https://gist.github.com/hardyoyo/8664b2171d26adcf7b7e
>
> by my reckoning, that's a heap space of about 1.5GB or so.

Thanks for that. Do you recall what size your solr statistics index was
on disk when you re-indexed it? From memory you tested the reindex with
some pretty big indexes, but I may remember wrong.

Hardy Pottinger

Sep 2, 2015, 9:41:55 AM
to DSpace Technical Support
Hi, I notice that both Hilton and Arthur have indicated they ran the reindex script as root. I think that will create permission problems for you both down the road. You need to run *all* DSpace scripts as the user that runs/owns Tomcat (or whatever servlet container you happen to use). In the documentation we call this user the "dspace user", but it can be whatever user the servlet container uses. I can't say for sure whether this is the cause of your current problem, but it is suspicious enough that I'd first try changing the ownership of your Solr cores and then re-running the re-index.

I have developed a habit of doing the following things when running a dspace script from the command line (usually in a tmux or screen session, just in case things take a while):

$ sudo su - dspace
$ unset TMOUT
$ export JAVA_OPTS='-Xmx512M -Xms512M -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode -XX:+CMSIncrementalPacing -Dfile.encoding=UTF-8'
$ cd /dspace

Not all of it may be necessary, but it keeps me out of trouble. :-)

The PR for the reindex code contains a long history of all the testing we put this through during development: https://github.com/DSpace/DSpace/pull/905 FYI, I successfully ran this script over a stats core with more than 17 million documents, both during testing and during our recent upgrade to 5.3.
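If a root-owned run has already left root-owned files inside the Solr cores, one way to repair ownership before re-running (user name, service name, and paths here are assumptions; substitute whatever user owns your servlet container and wherever your cores live) is:

```shell
# Stop Tomcat first so Solr isn't holding the index files open
sudo service tomcat7 stop

# Hand the statistics core back to the servlet-container user
# ("dspace:dspace" is an example; match your own Tomcat user)
sudo chown -R dspace:dspace /dspace/solr/statistics

sudo service tomcat7 start
```

Then re-run the reindex as that same user, not as root.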

Hilton Gibson

Sep 2, 2015, 9:49:26 AM
to Hardy Pottinger, DSpace Technical Support
Hi, interesting comment.

"Even without working safeguards, I'm +1 this script, it gets the job done, you just have to keep your wits about you. I'm OK with merging it as-is and fixing it later."

Cheers

hg

Hilton Gibson
Ubuntu Linux Systems Administrator
Stellenbosch University Library


Pottinger, Hardy J.

Sep 2, 2015, 10:29:18 AM
to Hilton Gibson, DSpace Technical Support
Hi, I respectfully suggest that you read a bit above my vote in the comment thread for DSPR#905:

https://github.com/DSpace/DSpace/pull/905#issuecomment-96036040

The "safeguards" to which I was referring are Andrea's attempts to ensure the reindex script has enough free disk space to export all the Solr stats. There are enough command-line options for you to ensure you have enough disk space for the export (just use the -d option and point it to a folder bigger than your existing stats core). The script makes an attempt to check and warn if there won't be enough space, but that check wasn't working. The script *does* work if it has enough space, and it is important to actually upgrading, so I saw no need to withhold my vote to approve the code. In an ideal world the code would be perfect, but good enough was (and still is) OK by me.

Since you bring this up in this conversation, do you believe that your reindex script is running out of disk space?

--Hardy


From: Hilton Gibson [hilton...@gmail.com]
Sent: Wednesday, September 02, 2015 8:49 AM
To: Pottinger, Hardy J.
Cc: DSpace Technical Support
Subject: Re: [dspace-tech] Re: Problem with solr reindex command in the 13 upgrading step to 5.3

Hilton Gibson

Sep 2, 2015, 10:55:56 AM
to Pottinger, Hardy J., DSpace Technical Support
Hi Hardy,

I do not have the time to experiment on a production server.
If this could be done incrementally as I asked before, then perhaps.
When there is no verbose output and no log details, it is very difficult to debug.
I am not going to put a production server into debug mode.

So we will live with faulty geo stats for now.
Remember not all institutions have expert java programmers/system personnel at their disposal to fix these things.
In the global south we have to make do.

Cheers

hg


Hilton Gibson
Ubuntu Linux Systems Administrator
Stellenbosch University Library


Andrea Schweer

Sep 2, 2015, 5:46:12 PM
to Hilton Gibson, Pottinger, Hardy J., DSpace Technical Support
Hi Hilton,

I'm sure you're aware that even those of us not in the global south have limited resources. I put in a lot of work to fix a problem that was introduced by people other than myself. I then took the time to share my solution with the community, and kept improving my code based on the feedback of several testers so that the reindex would work for people without Solr knowledge, at a time when I had already upgraded all my own Solr cores and could just as easily have declared this to be someone else's problem. The reindex script I wrote may not work in all situations, but had I not volunteered my time (and had my employer not let me do so), there might not be a fix at all. Perhaps there are cultural differences at play here, but your e-mail below reads to me as quite aggressive, as if you'd rather have no code at all than code that worked fine for everyone who helped me test it during development and for all "my" DSpace instances. I'm sure that's not what you actually mean.

A more constructive reaction than the e-mail I'm quoting might be to try out the suggestions made in this thread (increasing the heap space does not require putting the server into debug mode) and/or to share more details of what you tried and what happened when it failed, so that the volunteers (!) on this list can try to figure out what's going on.

I've given you the technical reasons why an incremental reindex is not really doable. You say verbose log output would be nice -- what types of things would you like to see logged? The reindex does log to dspace.log at INFO level; it will tell you every time it writes an export file and every time it reads an import file. Export happens before import -- even without programming skills, the code comments in the reindex method might let you determine this much (https://github.com/DSpace/DSpace/blob/master/dspace-api/src/main/java/org/dspace/util/SolrImportExport.java#L306). So you should be able to use the log output to determine during which part of the process this fails. All access to Solr is also logged in the solr.log file, so again there you will be able to see whether solr is busy doing exports (/select) or imports (/update).
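Following those pointers, a quick way to watch which phase the reindex is in while it runs (log locations are typical defaults under the DSpace installation directory and may differ on your install):

```shell
# DSpace side: the reindex logs each export file written and each
# import file read at INFO level
tail -f /dspace/log/dspace.log | grep -i solr

# Solr side: requests to /select indicate the export phase,
# requests to /update indicate the import phase
tail -f /dspace/log/solr.log | grep -E '/(select|update)'
```

If neither log is advancing, the process is genuinely stuck rather than just slow.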

cheers,
Andrea



Hilton Gibson

Sep 2, 2015, 6:20:07 PM
to Andrea Schweer, Pottinger, Hardy J., DSpace Technical Support
Hi Andrea,

There are many who contribute to this community project. I do not wish to single out any persons for fear of offending others.
I think I do my bit with a wiki that I update as often as I can.
I am upset; we had some violence on campus today, and perhaps it spilled over into my reply.
I apologise.

However, introducing new features to DSpace that are not "battle tested" seems to be a common theme lately.
I am trying my best, with my limited programming skills and extensive production-system experience, to maintain a production-ready repository and to help others in developing countries do the same. (This is what I meant by the global south: most developing countries are in the south.)
So the issue with re-indexing solr came at a very bad time.

If you have time please read and consider the following:

Cheers

hg


Hilton Gibson
Ubuntu Linux Systems Administrator
Stellenbosch University Library


Tim Donohue

Sep 8, 2015, 4:02:13 PM
to Hilton Gibson, Andrea Schweer, Pottinger, Hardy J., DSpace Technical Support
Hi Hilton,

With regards to your release testing brainstorms on your own wiki, we honestly would appreciate institutions stepping forward and offering resources (technical infrastructure, staff, etc) during our yearly Test-a-thons. As an established open source project, we have a broad community of users, but our developer core is still very volunteer oriented. There literally is no one who works full time on DSpace (not even myself). We are reliant on the kindness of individuals (and oftentimes their bosses!) to help us to build, support, improve and test DSpace.  It's amazing what we have been able to get done entirely by volunteer work (with a little bit of coordination).

We do hold yearly Testathons where we encourage the broad community to take part, bang on the software and help us to make the next release as "battle tested" as we possibly can. Also, the DSpace Community Advisory Team (DCAT) has begun an initiative (just this week) to help our community to develop a more extensive "Test Plan" (which we can use to ensure each piece of the system has received testing attention from our volunteers).

Their work has begun here, and I'm sure they'd love to have additional contributors to the work overall:
https://wiki.duraspace.org/display/cmtygp/DSpace+6+Testathon+Testplan+Working+group

If you or your institution is willing to help out in any way, we'd appreciate the support. (This is the same for anyone else reading this thread!) We'd honestly love to have institutions with larger production environments help us test early versions of DSpace. But, as of yet, we've never been able to find those volunteers (or a large corpus of test data). So, we tend to rely on a more "crowd sourced" testing model (where we put up a couple of test instances and ask folks to help us bang on them). This crowd-sourced model tends to find most software stability bugs, but it admittedly may not always catch all of the scalability bugs.

In all honesty, we want and need more testers to get involved during Testathons and just before major releases. If anyone else is interested and willing to help, get in touch, or join up with the DCAT initiative! We'd love to have your help.

- Tim

-- 
Tim Donohue
Technical Lead for DSpace & DSpaceDirect
DuraSpace.org | DSpace.org | DSpaceDirect.org

Arthur Sady Cordeiro Rossetti

Sep 11, 2015, 8:19:49 AM
to Andrea Schweer, Pottinger, Hardy J., Hilton Gibson, dspac...@googlegroups.com
Well guys, after a few more attempts, and after increasing the heap space, the command is apparently working: it finishes processing and the log has just two warnings. After trying:

curl --globoff 'http://localhost:8080/solr/statistics/select?q=-uid:[*+TO+*]&rows=0&indent=true' | grep numFound


The answer is this:

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   349    0   349    0     0     14      0 --:--:--  0:00:23 --:--:--    96

<result name="response" numFound="0" start="0">


But I don't know if it worked 100%, for two reasons. I ran the command with the -d option as follows:

./dspace solr-reindex-statistics -d /home/dspace/teste/

If I understood correctly, after the command ends it should delete the temporary files, leaving nothing in the "teste" directory; but I went to check, and there are a lot of "statistics_export..." CSV files there.
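To gauge how far the export actually got, the leftover files can be counted and sized (a sketch; the directory is the one passed to -d above, and the filename prefix matches what is left behind here):

```shell
# How many export chunks were written, and how much disk they occupy
ls /home/dspace/teste/statistics_export* | wc -l
du -sh /home/dspace/teste
```

If their total size is roughly comparable to the original core, the export phase at least completed.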

Moreover, I thought this might solve my geo statistics problem, but it stayed the same as in the screenshot:

[inline image: screenshot showing the geo statistics panel still blank]

Note that the geo statistics are still blank. Before I tried to open the statistics, the log contained only two warnings, but after I tried to access them in XMLUI the log changed:

Before:

[WARN] deprecation - The 'component-configurations' section in the sitemap is deprecated. Please check for alternatives.
[WARN] deprecation - The 'component-configurations' section in the sitemap is deprecated. Please check for alternatives.

After:

IO Exception
IO Exception
IO Exception
IO Exception
IO Exception
IO Exception
IO Exception
IO Exception
Error seeking country while seeking 2527261373
IO Exception while seting up segments

I personally think the reindex worked, since the curl returned 0, but I'm still at a loss as to why it didn't solve my geo statistics.


Thank you for your attention.