gSearch not inserting into solr index?

John

May 9, 2013, 1:39:02 PM
to isla...@googlegroups.com
We are having some unexpected behaviour with gsearch.  When we add a record, I can see in the logs that updateIndex is run and that the index has been updated.  Here is a snippet from the logs.

 <updateIndex xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:foxml="info:fedora/fedora-system:def/foxml#" xmlns:zs="http://www.loc.gov/zing/srw/" warnCount="0" docCount="3" deleteTotal="0" updateTotal="0" insertTotal="1" indexName="FgsIndex"/>

The problem is that the record is not available in solr until I go to the fedoragsearch/rest interface and press 'updateIndex optimize'.

Looking at solr, I can see that a few new files are created when I ingest a new item and that they disappear when I optimize.  Should solr be looking at these?  Or should gsearch be running optimize after ingest?  Did I miss something during configuration?

Any ideas would be greatly appreciated.

John 

Aaron Coburn

May 9, 2013, 1:52:24 PM
to <islandora@googlegroups.com>
When documents are added to Solr, they are not visible to new search requests until a "commit" operation has been executed. [1]

When you ask gsearch to run an "optimize" operation, Solr performs a "hard commit" as part of it, which is why the new items then become available to search requests.

There are several ways to address this, depending on your needs. You can run a 'commit' or 'optimize' command manually after bulk ingests.
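As a rough sketch of the manual route: Solr's XML update handler accepts a bare commit or optimize message. This assumes the messages are POSTed to your core's update handler (typically the /update path; the exact host, port, and core path depend on your installation and are not given in this thread).

```xml
<!-- Send ONE of these per request; they are separate update messages. -->

<!-- Hard commit: flushes pending documents and opens a new searcher,
     so newly added documents become visible to queries. -->
<commit/>

<!-- Optimize: implies a commit and also merges index segments.
     This can be expensive on a large index. -->
<optimize/>
```

A plain commit is usually enough to make new records searchable; optimize additionally rewrites the index, so it is better reserved for quiet periods.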

Or, you can add a "commitWithin" attribute to the <add> element of the Solr DocumentXML:

<add commitWithin="15000">
  <doc>
    ...
  </doc>
</add>

Or, you can update the solrconfig.xml file inside Solr. For that, you will want to configure an <autoCommit> or <autoSoftCommit> clause. For example (in Solr 4.2):

<autoCommit>
    <maxTime>15000</maxTime>
    <openSearcher>true</openSearcher>
</autoCommit>
(commits every 15 seconds)

Or:

<autoSoftCommit> 
    <maxTime>1000</maxTime> 
</autoSoftCommit>
(soft-commits every 1 second, which makes new documents searchable without a full flush to disk)
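
The two clauses can also be used together. A common pattern in Solr 4.x, sketched here with illustrative intervals (the values are assumptions; tune them for your ingest load), is to let the hard commit handle durability without opening a searcher and let the soft commit handle visibility:

```xml
<!-- Goes inside the <updateHandler> element of solrconfig.xml -->

<!-- Hard commit: writes pending changes to disk at most 60 seconds
     after an update, but does not open a new searcher, so it stays
     relatively cheap during bulk ingests. -->
<autoCommit>
    <maxTime>60000</maxTime>
    <openSearcher>false</openSearcher>
</autoCommit>

<!-- Soft commit: makes new documents visible to searches within
     about 5 seconds of an update. -->
<autoSoftCommit>
    <maxTime>5000</maxTime>
</autoSoftCommit>
```

With this split, search visibility is governed by the soft-commit interval alone, while the hard-commit interval only affects how much uncommitted work is at risk if Solr stops unexpectedly.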

Aaron



jy

May 9, 2013, 3:32:54 PM
to isla...@googlegroups.com
Aaron thanks for the information. 

It's weird that islandora would not issue the commit for single record updates.  I'm not sure about adding it to solr, because then when we do bulk additions or a re-index, solr is going to get hit pretty hard.

Once I read a bit I'll sort something out.

John

Brad Spry

May 22, 2015, 2:06:51 PM
to isla...@googlegroups.com, aco...@amherst.edu
Aaron,

It's obvious you placed an OR between your explanations of the options.  Are the three options mutually exclusive? 

I've tweaked all three options and I'm still not satisfied...   I'm wondering if there is a point of diminishing returns when all three are set?

I want all the lag out of the system, there's no excuse for it from an infrastructure perspective.  It's gotta be software where the lag lies, not hardware.


Brad