Reindexing in Solr


Carl Hall

Aug 17, 2011, 6:40:36 PM8/17/11
to Nakamura List
During a data load yesterday, we hit an issue where the Solr index file was locked by something that wouldn't let go of it (example message below). We haven't put any time into figuring out why the lock wouldn't release, but we are interested in being able to index the content and groups that didn't get indexed because of it. I started looking into this a bit but seem to be missing some key starting points: how to get a list of all users, groups, and content that exist in sparse. Is there a way to get these? With content, it would be enough to get the root elements and walk down from there. Now that v1 is getting close to release, it would be good to start working on some reindex functionality.

17.08.2011 00:00:00.342 *INFO* [IndexerQueueDispatch] org.apache.solr.core.SolrCore [nakamura] webapp=null path=/update params={} status=500 QTime=1004
17.08.2011 00:00:01.363 *INFO* [IndexerQueueDispatch] org.apache.solr.update.processor.UpdateRequestProcessor {} 0 1002
17.08.2011 00:00:01.364 *ERROR* [IndexerQueueDispatch] org.apache.solr.core.SolrCore org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: NativeFSLock@/home/atlas/sakai3/sling/solr/nakamura/index/lucene-eae4305949e0adf474e258ed0705abfc-write.lock
        at org.apache.lucene.store.Lock.obtain(Lock.java:84)
        at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:730)
        at org.apache.solr.update.SolrIndexWriter.<init>(SolrIndexWriter.java:83)
        at org.apache.solr.update.UpdateHandler.createMainIndexWriter(UpdateHandler.java:99)
        at org.apache.solr.update.DirectUpdateHandler2.openWriter(DirectUpdateHandler2.java:173)
        at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:220)
        at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:61)
        at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:139)
        at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:69)
        at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1359)
        at org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:151)
        at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
        at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:50)
        at org.sakaiproject.nakamura.solr.ContentEventListener.batchedEventRun(ContentEventListener.java:377)
        at org.sakaiproject.nakamura.solr.ContentEventListener.run(ContentEventListener.java:284)
        at java.lang.Thread.run(Thread.java:662)

Ian Boston

Aug 18, 2011, 3:50:32 AM8/18/11
to sakai-...@googlegroups.com
Carl,
This is relatively easy to do. Once I have done KERN-1957 I was
intending to look at migration and rebuilding, which would give you
the feeds you need.

However, I don't think it's appropriate to start adding new features
to version 1 at this late stage. V1 is already in RCs. There are a
number of critical things affecting other institutions that have
already been pushed out by the managed project, so adding a large
feature should come after fixing those issues.

If you are talking about post-V1 (I hope you are), then I don't see
any mention of reindexing in the Roadmap outlined by the URG, so in
theory a new feature of this scale is already out of scope for the
next version (at least for the managed project).

If you are looking for a local solution, then investigate in more
detail why the lock would not release. What process/thread had hold of
the lock? When was it created? The Solr indexer is single-threaded, so
unless you have some code that writes to Solr independently or creates
threads, the most likely cause is that you reconfigured a bundle and
caused the indexer to die, leaving the lock behind. The same problem
happens if you cause Jackrabbit to restart: its Lucene lock hangs
around and never gets released.

The normal way out of this is to verify which process held the lock
and, if that process is not running (you shut the JVM down, so it's
not, right?), remove the lock file and start the process up again.
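The recovery described above can be sketched in a couple of shell commands. The lock path is copied from the log message earlier in the thread (adjust it for your install), and the availability of `fuser` is an assumption (`lsof` works too):

```shell
# Stale-lock recovery sketch; path taken from the log message above.
LOCK=/home/atlas/sakai3/sling/solr/nakamura/index/lucene-eae4305949e0adf474e258ed0705abfc-write.lock
# See whether any process still holds the lock file open.
fuser -v "$LOCK" 2>/dev/null || echo "no process holds the lock"
# Only once the owning JVM is confirmed down, remove the stale lock
# and restart the server so Solr can reacquire it cleanly.
rm -f "$LOCK"
```

Removing the lock while a JVM still holds it risks index corruption, which is why the check comes first.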

Ian


Clay Fenlason

Aug 18, 2011, 8:05:51 AM8/18/11
to sakai-...@googlegroups.com
On this one point:

On Thu, Aug 18, 2011 at 3:50 AM, Ian Boston <i...@tfd.co.uk> wrote:
> ... I dont see any
> mention of reindexing in the Roadmap outlined by the URG, so in theory
> a new feature of this scale is already out of scope for the next
> version. (at least by the managed project)

At the top of that same document [1] you'll see a couple notes to the
effect that URG discussions have not yet included technical
priorities, nor even detailed technical study of its items, and that
what's there should not be confused with a project plan at this stage.
We're particularly mindful of the fact that server team priorities
have not yet been factored in.

I don't know anything about the issue Carl raises, but I did want to
say that ruling anything out of scope because it doesn't show up on
that Confluence page is premature.

~Clay

[1] https://confluence.sakaiproject.org/x/pZOCB


Carl Hall

Aug 18, 2011, 2:14:09 PM8/18/11
to sakai-...@googlegroups.com
No matter the cause or timing (definitely post-v1), NYU now has a staging environment with an incomplete index and no path for recovery without reloading the data. The server had been up for hours, no changes had been applied and the only activity was data loading.

I would prefer not to reload things or have to resort to a solution that calls directly to the storage mechanism, so I'm looking for a way through sparse to get a list of items to retrigger indexing. Is it possible to get a list of users, groups, and root-level content to start the reindexing? I don't see anything currently in AuthorizableManager or ContentManager, and I will have to work on this soon.



Ian Boston

Aug 18, 2011, 5:27:09 PM8/18/11
to sakai-...@googlegroups.com
There are triggerRefresh and triggerRefreshAll for Authorizables and
Content; see [1], done in the last hour.
It has unit test coverage, but I haven't validated absolutely that it works.

You need to write a bundle that does:

Session session = repository.loginAdministrative();
session.getAuthorizableManager().triggerRefreshAll();
session.getContentManager().triggerRefreshAll();


It will add events to the Solr queue (at up to 60K events/s), and
those will then be indexed in batches of about 200, but that's going
to take a bit longer.
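One hedged way to watch the re-index drain through is to poll the core's document count. The host, port, and core URL below are assumptions, not taken from the thread; nakamura embeds Solr, so this HTTP handler may not be exposed at all in your deployment:

```shell
# Hypothetical progress check: ask Solr how many documents the core
# currently holds (numFound in the response). Adjust the URL for your
# install; it may not be reachable if Solr is embedded-only.
SOLR_SELECT="http://localhost:8983/solr/nakamura/select"
curl -s "$SOLR_SELECT?q=*:*&rows=0&wt=json" || echo "Solr select handler not reachable"
```

Running this before and periodically during the refresh gives a rough sense of how far the batches have progressed.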

Although the patch adds just under 400 lines and touches most files
in core, there are no code changes to Nakamura, and no changes to any
code that can be accessed from Nakamura, so I felt this was OK to add.

If you need to re-index ACLs, just ask.
Please test on Oracle; I have tested on all the other DBs.
Cassandra and HBase don't have this feature at the moment.

HTH
Ian

1. https://github.com/ieb/sparsemapcontent/commit/de5c3a905ba4c719d3958d0f0b420b30c104fb55
