Fwd: Understanding index-discovery -o

Terry Brady

Jan 12, 2016, 2:15:10 PM
to DSpace Technical Support
When we run index-discovery -o in our cron, the process generates an exception.  This does not occur when running "index-discovery" without the optimize flag.

This has happened in DSpace 5.x and 4.x, and it may also have happened for us in DSpace 3.x.

What is the normal, expected output from this command?

Do I need to provide additional resources to this command to make it successful?

DSpace 5.4

SOLR Search Optimize -- Process Started:1451713208640
Exception: Expected mime type application/octet-stream but got text/html. <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>503 Service Temporarily Unavailable</title>
</head><body>
<h1>Service Temporarily Unavailable</h1>
<p>The server is temporarily unable to service your
request due to maintenance downtime or capacity
problems. Please try again later.</p>
</body></html>

org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: Expected mime type application/octet-stream but got text/html. <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>503 Service Temporarily Unavailable</title>
</head><body>
<h1>Service Temporarily Unavailable</h1>
<p>The server is temporarily unable to service your
request due to maintenance downtime or capacity
problems. Please try again later.</p>
</body></html>

        at org.apache.solr.client.solrj.impl.HttpSolrServer.executeMethod(HttpSolrServer.java:512)
        at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:210)
        at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:206)
        at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:124)
        at org.apache.solr.client.solrj.SolrServer.optimize(SolrServer.java:204)
        at org.apache.solr.client.solrj.SolrServer.optimize(SolrServer.java:158)
        at org.dspace.discovery.SolrServiceImpl.optimize(SolrServiceImpl.java:532)
        at org.dspace.discovery.IndexClient.main(IndexClient.java:121)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.dspace.app.launcher.ScriptLauncher.runOneCommand(ScriptLauncher.java:226)
        at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:78)

DSpace 4.3

SOLR Search Optimize -- Process Started:1448516406357
Exception: Server at http://localhost/solr/search returned non ok status:503, message:Service Temporarily Unavailable
org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: Server at http://localhost/solr/search returned non ok status:503, message:Service Temporarily Unavailable
        at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:385)
        at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:180)
        at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:117)
        at org.apache.solr.client.solrj.SolrServer.optimize(SolrServer.java:204)
        at org.apache.solr.client.solrj.SolrServer.optimize(SolrServer.java:158)
        at org.dspace.discovery.SolrServiceImpl.optimize(SolrServiceImpl.java:506)
        at org.dspace.discovery.IndexClient.main(IndexClient.java:121)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.dspace.app.launcher.ScriptLauncher.runOneCommand(ScriptLauncher.java:225)
        at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:77)

--
Terry Brady
Applications Programmer Analyst
Georgetown University Library Information Technology
425-298-5498 (Seattle, WA)

Bill T

Jan 12, 2016, 2:25:52 PM
to DSpace Technical Support
I think the expected output is something akin to:

<status>0</status> or something similar...

Does this happen *every* time you run optimize?
Bill

Terry Brady

Jan 12, 2016, 2:30:33 PM
to Bill T, DSpace Technical Support
Looking at the output from a few periods of time, it does appear to happen every time.



--
You received this message because you are subscribed to the Google Groups "DSpace Technical Support" group.
To unsubscribe from this group and stop receiving emails from it, send an email to dspace-tech...@googlegroups.com.
To post to this group, send email to dspac...@googlegroups.com.
Visit this group at https://groups.google.com/group/dspace-tech.
For more options, visit https://groups.google.com/d/optout.

helix84

Jan 13, 2016, 3:52:09 AM
to Terry Brady, Bill T, DSpace Technical Support
Hi Terry,

1. The 503 error you see is from the Solr client side (DSpace). To see the actual error you need to look into Solr log (catalina.out or solr.log depending on DSpace version).

2. Running the forceMerge operation (-o) on a regular basis is not really necessary. It can help in special circumstances, but it's not such a big deal in normal operation, especially in a read-heavy use case such as DSpace. You can read more about that here: [1]



Regards,
~~helix84

Compulsory reading: DSpace Mailing List Etiquette
https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette

Lulamile Mangali

Mar 17, 2016, 7:44:58 AM
to DSpace Technical Support, Terry...@georgetown.edu, wil...@gmail.com, hel...@centrum.sk
Hi Terry,

Can you assist us here? We have installed a new DSpace 5.3 instance and are now getting an error that says "503 Service Temporarily Unavailable". Could you please give us an indication of what the problem might be?

Thank you in advance.

Terry Brady

Mar 17, 2016, 2:38:25 PM
to Lulamile Mangali, DSpace Technical Support, Bill Tantzen, Ivan Masár
See the prior note from helix84 that references the log files of interest.  Take a look at those files to diagnose the problem.

> The 503 error you see is from the Solr client side (DSpace). To see the actual error you need to look into Solr log (catalina.out or solr.log depending on DSpace version).

If the errors are unclear, please post the error messages to this thread.
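
To find the matching entries in those logs, it may help to convert the "Process Started" value printed by index-discovery -o into a readable time. This is a minimal sketch, assuming that value is a Unix timestamp in milliseconds (the thread does not confirm this); the function name is hypothetical.

```python
from datetime import datetime, timezone


def process_started_utc(millis: int) -> str:
    """Convert an epoch-milliseconds value to an ISO 8601 UTC string.

    Assumption: the number after "Process Started:" is milliseconds
    since the Unix epoch.
    """
    # Integer division drops the millisecond part and avoids float rounding.
    moment = datetime.fromtimestamp(millis // 1000, tz=timezone.utc)
    return moment.isoformat()


# Values copied from the DSpace 5.4 and 4.3 output earlier in this thread.
for started in (1451713208640, 1448516406357):
    print(started, "->", process_started_utc(started))
```

Both values convert to roughly 05:40 UTC (on 2016-01-02 and 2015-11-26), so the Solr log entries around that time, adjusted to the server's time zone, are the ones to read.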

Terry 

Alan Orth

Aug 2, 2017, 3:24:25 AM
to Terry Brady, Lulamile Mangali, DSpace Technical Support, Bill Tantzen, Ivan Masár
Hi,

What is the current best practices recommendation for `index-discovery -o` in DSpace 4, 5, and 6? In this thread helix84 pointed out a quite old issue on the Solr bug tracker that discussed the usefulness of the "optimize" (forceMerge) operation in Solr. I've had this disabled on our DSpace instances for almost a year now but notice it is still recommended on the various scheduled tasks pages on the wiki[0].

What are people's thoughts on this? Should we remove it from the official list of scheduled tasks (and make the `index-discovery -o` subcommand a noop)?


Regards,

helix84

Aug 2, 2017, 3:40:30 AM
to Alan Orth, Terry Brady, Lulamile Mangali, DSpace Technical Support, Bill Tantzen
No reason to make it a no-op in DSpace (if you need it, you have it
available without needing to fire a curl query to Solr), but running
the scheduled task daily is not necessary. That said, it doesn't cost
much, either. The excessive usage Solr devs referred to was running it
after every batch update or even after every added document.
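
For reference, the direct Solr request mentioned above could look something like the sketch below. It assumes the core URL http://localhost/solr/search from the DSpace 4.3 log earlier in the thread and Solr's update-handler parameters `optimize` and `maxSegments`. The function names are made up for illustration, and nothing here is taken from the DSpace code.

```python
from typing import Optional
from urllib.parse import urlencode
from urllib.request import urlopen


def build_optimize_url(core_url: str, max_segments: Optional[int] = None) -> str:
    """Build an update-handler URL that asks Solr to optimize (forceMerge) a core."""
    params = {"optimize": "true"}
    if max_segments is not None:
        # Ask Solr to merge down to at most this many segments.
        params["maxSegments"] = str(max_segments)
    return f"{core_url.rstrip('/')}/update?{urlencode(params)}"


def run_optimize(core_url: str, timeout_seconds: float = 600.0) -> int:
    """Send the optimize request and return the HTTP status code.

    urlopen raises urllib.error.HTTPError for error responses such as
    the 503 reported in this thread.
    """
    with urlopen(build_optimize_url(core_url), timeout=timeout_seconds) as response:
        return response.status


# Only prints the URL; call run_optimize() against a real core to execute it.
print(build_optimize_url("http://localhost/solr/search"))
```

Requesting the printed URL by hand (with curl or a browser) is a quick way to check whether the 503 also appears outside of DSpace.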

We might also change the help output of index-discovery -h to
something more descriptive like "if excessive, reduce the number of
segments in Solr search core".

Current output:

stats-util -h
-o,--optimize Run maintenance on the SOLR index

index-discovery -h
-o optimize search core


Regards,
~~helix84



Alan Orth

Aug 3, 2017, 3:36:55 PM
to hel...@centrum.sk, Terry Brady, Lulamile Mangali, DSpace Technical Support, Bill Tantzen
Hello,

Yes, perhaps better guidance in the help text, or even changing that line in the example crontab on the wiki so that it has a better description of when it would be appropriate to run that job. Of course people always see "optimize" and they want to run it, but have no idea what it does. Hell, I don't even know what "if excessive, reduce the number of segments in Solr search core" means in the context of a digital repository like DSpace, and I'm a very technical user with a background in computer science!

Cheers,