Searches block under heavy indexing load


Matt Wheeler

Aug 6, 2013, 3:08:48 PM
to zo...@googlegroups.com
Hello,

In our application (which uses SenseiDB), we index some large documents, and our analyzers are CPU-intensive. Under heavy indexing load, searches block, and the problem appears to be in Zoie. Our searcher threads are blocked in the getIndexReaders call, waiting on a SearchIndexManager monitor held by the ConsumerThread, for a very long time (seemingly indefinitely when the indexing load is very high). During this time, the ConsumerThread is indexing normally.

Here are some example stack traces:

"parallel-searcher-2-thread-146" - Thread t@404
   java.lang.Thread.State: BLOCKED
        at proj.zoie.impl.indexing.internal.SearchIndexManager.getIndexReaders(SearchIndexManager.java:227)
        - waiting to lock <3fc89896> (a proj.zoie.impl.indexing.internal.SearchIndexManager) owned by "ConsumerThread" t@57
        at proj.zoie.impl.indexing.SimpleReaderCache.getIndexReaders(SimpleReaderCache.java:27)

(There are many more threads blocked on this consumer thread, and other consumer threads blocking searcher threads.)

"ConsumerThread" - Thread t@57
   java.lang.Thread.State: RUNNABLE
        at org.apache.lucene.analysis.PorterStemmer.r(PorterStemmer.java:235)
        at org.apache.lucene.analysis.PorterStemmer.step3(PorterStemmer.java:315)
        at org.apache.lucene.analysis.PorterStemmer.stem(PorterStemmer.java:485)
        at org.apache.lucene.analysis.PorterStemmer.stem(PorterStemmer.java:460)
        at org.apache.lucene.analysis.PorterStemFilter.incrementToken(PorterStemFilter.java:63)
        at org.apache.lucene.index.DocInverterPerField.processFields(DocInverterPerField.java:185)
        at org.apache.lucene.index.DocFieldProcessorPerThread.processDocument(DocFieldProcessorPerThread.java:278)
        at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:766)
        at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:2066)
        at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:2040)
        at proj.zoie.impl.indexing.internal.BaseSearchIndex.updateIndex(BaseSearchIndex.java:115)
        at proj.zoie.impl.indexing.internal.LuceneIndexDataLoader.consume(LuceneIndexDataLoader.java:215)
        at proj.zoie.impl.indexing.internal.RealtimeIndexDataLoader.consume(RealtimeIndexDataLoader.java:106)
        - locked <3fc89896> (a proj.zoie.impl.indexing.internal.SearchIndexManager)
        - locked <7784fdfc> (a proj.zoie.impl.indexing.internal.RealtimeIndexDataLoader)
        at proj.zoie.impl.indexing.AsyncDataConsumer.flushBuffer(AsyncDataConsumer.java:298)

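Looking at the source code for version 3.3.0, it appears that RealtimeIndexDataLoader.consume() synchronizes on the SearchIndexManager around line 105, and SearchIndexManager.getIndexReaders() is synchronized on itself around line 225. That matches the traces above: a search cannot obtain readers while a consume() call holds the SearchIndexManager monitor.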

I am wondering if anyone has any ideas on solving this problem. For now I am going to work on writing a test that replicates this issue, and then work on a solution. Ideally, reading should never block except to increment a reference count or something similar.
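
To make the "reading should never block" idea concrete, here is a minimal sketch, assuming the indexing thread can build its new reader list off to the side and then publish it in a single step. ReaderSnapshotHolder and its methods are hypothetical names invented for illustration; this is not Zoie's SearchIndexManager and not a proposed patch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for a component that hands out index readers.
// Not Zoie code; the class and method names are assumptions.
final class ReaderSnapshotHolder<R> {
    // Searchers read this reference without taking any monitor.
    private final AtomicReference<List<R>> current;

    ReaderSnapshotHolder(List<R> initial) {
        current = new AtomicReference<List<R>>(immutableCopy(initial));
    }

    // Read path: one volatile read, so it cannot be held up by a long
    // indexing pass the way a synchronized method can.
    List<R> getReaders() {
        return current.get();
    }

    // Write path: the indexing thread does its slow work (analysis,
    // addDocument) without holding a lock that searchers need, then
    // swaps in the finished list.
    void publish(List<R> fresh) {
        current.set(immutableCopy(fresh));
    }

    // Defensive copy so a published snapshot never changes afterwards.
    private static <T> List<T> immutableCopy(List<T> src) {
        return Collections.unmodifiableList(new ArrayList<T>(src));
    }
}
```

The sketch leaves out the reference counting mentioned above. Real Lucene readers would still need something like IndexReader.incRef() and decRef() around each search so that a reader is not closed while a search is using an older snapshot.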

Thanks,
Matt

Matt Wheeler

Aug 7, 2013, 4:31:35 PM
to zo...@googlegroups.com
After further review of our code, it looks like we are setting freshness to 0 and are therefore using SimpleReaderCache, which would clearly result in this behavior. Oops!
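
For anyone hitting the same thing, here is a rough sketch of why a freshness of 0 matters. It assumes that a nonzero freshness lets searches reuse a previously fetched reader list for up to that many milliseconds. FreshnessBoundedCache is a hypothetical class written for illustration, not Zoie's reader cache implementation, and the loader stands in for the potentially blocking getIndexReaders call.

```java
import java.util.List;
import java.util.function.Supplier;

// Hypothetical illustration only; not Zoie's reader cache.
final class FreshnessBoundedCache<R> {
    // Immutable pair so searchers always see a consistent (readers, time).
    private static final class Entry<T> {
        final List<T> readers;
        final long loadedAtMillis;
        Entry(List<T> readers, long loadedAtMillis) {
            this.readers = readers;
            this.loadedAtMillis = loadedAtMillis;
        }
    }

    private final Supplier<List<R>> loader; // may block on the index manager
    private final long freshnessMillis;
    private volatile Entry<R> entry;

    FreshnessBoundedCache(Supplier<List<R>> loader, long freshnessMillis) {
        this.loader = loader;
        this.freshnessMillis = freshnessMillis;
    }

    // The caller passes the current time, which keeps the sketch testable.
    List<R> get(long nowMillis) {
        Entry<R> e = entry;
        if (freshnessMillis > 0 && e != null
                && nowMillis - e.loadedAtMillis < freshnessMillis) {
            return e.readers; // fast path: no call into the loader
        }
        // With freshness 0 every search ends up here, so every search
        // contends with the indexing thread for the loader's lock.
        List<R> fresh = loader.get();
        entry = new Entry<R>(fresh, nowMillis);
        return fresh;
    }
}
```

In this sketch a freshness of 0 sends every call to the loader, which is the pattern visible in the first stack trace, while a freshness of, say, 1000 lets most searches return the cached list without touching the lock.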

- Matt