Searches block under heavy indexing load

Matt Wheeler

Aug 6, 2013, 3:08:48 PM

to zo...@googlegroups.com

Hello,

In our application (which uses SenseiDB), we index some large documents and our analyzers are CPU-intensive. Under heavy indexing load, searches are blocked, and the problem appears to be in Zoie. Our searcher threads are blocked in the getIndexReaders call by the ConsumerThread within SearchIndexManager for a very long time (indefinitely, it seems, when the indexing load is very high). During this time, the ConsumerThread is indexing normally.

Here are some example stack traces:

"parallel-searcher-2-thread-146" - Thread t@404

java.lang.Thread.State: BLOCKED

at proj.zoie.impl.indexing.internal.SearchIndexManager.getIndexReaders(SearchIndexManager.java:227)

- waiting to lock <3fc89896> (a proj.zoie.impl.indexing.internal.SearchIndexManager) owned by "ConsumerThread" t@57

at proj.zoie.impl.indexing.SimpleReaderCache.getIndexReaders(SimpleReaderCache.java:27)

(There are many more threads blocked on this consumer thread, and other consumer threads blocking searcher threads.)

"ConsumerThread" - Thread t@57

java.lang.Thread.State: RUNNABLE

at org.apache.lucene.analysis.PorterStemmer.r(PorterStemmer.java:235)

at org.apache.lucene.analysis.PorterStemmer.step3(PorterStemmer.java:315)

at org.apache.lucene.analysis.PorterStemmer.stem(PorterStemmer.java:485)

at org.apache.lucene.analysis.PorterStemmer.stem(PorterStemmer.java:460)

at org.apache.lucene.analysis.PorterStemFilter.incrementToken(PorterStemFilter.java:63)

at org.apache.lucene.index.DocInverterPerField.processFields(DocInverterPerField.java:185)

at org.apache.lucene.index.DocFieldProcessorPerThread.processDocument(DocFieldProcessorPerThread.java:278)

at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:766)

at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:2066)

at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:2040)

at proj.zoie.impl.indexing.internal.BaseSearchIndex.updateIndex(BaseSearchIndex.java:115)

at proj.zoie.impl.indexing.internal.LuceneIndexDataLoader.consume(LuceneIndexDataLoader.java:215)

at proj.zoie.impl.indexing.internal.RealtimeIndexDataLoader.consume(RealtimeIndexDataLoader.java:106)

- locked <3fc89896> (a proj.zoie.impl.indexing.internal.SearchIndexManager)

- locked <7784fdfc> (a proj.zoie.impl.indexing.internal.RealtimeIndexDataLoader)

at proj.zoie.impl.indexing.AsyncDataConsumer.flushBuffer(AsyncDataConsumer.java:298)

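Looking at the source code for version 3.3.0, it appears that RealtimeIndexDataLoader.consume() synchronizes on the SearchIndexManager around line 105, and that SearchIndexManager.getIndexReaders() is synchronized on itself around line 225. If so, both paths contend for the same monitor, and a searcher cannot obtain readers until the consumer has finished indexing its batch.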
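The contention suggested by these stack traces can be reproduced in isolation. The following is a minimal sketch, not Zoie code: Manager is a hypothetical stand-in for SearchIndexManager, and the "consumer" simply holds the manager's monitor (as consume() appears to do while indexing) until told to release it. The class and method names are assumptions made for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

class MonitorContentionDemo {
    // Hypothetical stand-in for SearchIndexManager: the read path is a synchronized method.
    static class Manager {
        synchronized String getIndexReaders() {
            return "readers";
        }
    }

    // Returns the searcher thread's state as observed while the consumer holds the monitor.
    static Thread.State observeSearcherState() {
        final Manager mgr = new Manager();
        final CountDownLatch lockHeld = new CountDownLatch(1);
        final CountDownLatch release = new CountDownLatch(1);

        // Simulates a consumer that indexes a batch while synchronized on the manager.
        Thread consumer = new Thread(() -> {
            synchronized (mgr) {
                lockHeld.countDown();
                try {
                    release.await(); // stands in for long, CPU-heavy indexing work
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }, "ConsumerThread");
        Thread searcher = new Thread(mgr::getIndexReaders, "searcher");

        Thread.State state;
        try {
            consumer.start();
            lockHeld.await();      // the consumer now owns the monitor
            searcher.start();      // the searcher tries to enter the synchronized method
            state = searcher.getState();
            long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
            while (state != Thread.State.BLOCKED && System.nanoTime() < deadline) {
                Thread.sleep(1);
                state = searcher.getState();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            release.countDown();   // let the consumer finish so both threads can exit
        }
        try {
            consumer.join();
            searcher.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return state;
    }
}
```

Calling MonitorContentionDemo.observeSearcherState() should report the searcher as BLOCKED for as long as the consumer holds the monitor, which matches the shape of the thread dump above.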

I am wondering if anyone has any ideas on solving this problem. For now I am going to work on writing a test that replicates this issue, and then work on a solution. Ideally, reading should never block other than to increment a reference count or something similar.
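One way to get reads that never wait on the indexer is to publish an immutable snapshot of the reader list through an atomic reference. The sketch below is only an illustration of that idea under stated assumptions: SnapshotPublisher and its methods are hypothetical names, not part of Zoie, and it leaves out the reference counting a real implementation would need so that a reader is not closed while a search is still using it.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical helper: the writer publishes a new snapshot, readers take the current one lock-free.
class SnapshotPublisher<R> {
    private final AtomicReference<List<R>> current =
            new AtomicReference<>(Collections.<R>emptyList());

    // Read path: a single volatile read, never blocked by the indexing thread.
    List<R> getIndexReaders() {
        return current.get();
    }

    // Write path: called by the indexing thread after it has finished a batch.
    // The list is copied so later changes by the caller do not leak into the snapshot.
    void publish(List<R> freshReaders) {
        current.set(Collections.unmodifiableList(new ArrayList<>(freshReaders)));
    }
}
```

With this shape, a searcher that grabbed a snapshot before a publish keeps seeing the old list, and the next call sees the new one. The trade-off is that searches may run against slightly stale readers.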

Thanks,

Matt

Matt Wheeler

Aug 7, 2013, 4:31:35 PM

to zo...@googlegroups.com

After further review of our code it looks like we are setting freshness to 0 and therefore using SimpleReaderCache, which would clearly result in this behavior. Oops!
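To illustrate why a freshness of 0 matters, here is a generic sketch of a freshness-bounded cache. It is not Zoie's implementation (which may, for example, refresh in the background); FreshnessCache, its constructor parameters, and the injected clock are all hypothetical. The point is only that with a positive freshness most reads are served from the cached value, while with freshness 0 every read goes through to the underlying, potentially blocking call.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical cache: serves a cached value until it is older than freshnessMillis.
class FreshnessCache<T> {
    private static final class Entry<T> {
        final T value;
        final long loadedAt;

        Entry(T value, long loadedAt) {
            this.value = value;
            this.loadedAt = loadedAt;
        }
    }

    private final Supplier<T> source;   // the expensive or blocking call being cached
    private final long freshnessMillis; // 0 disables caching entirely
    private final LongSupplier clock;   // injected so the behavior can be tested deterministically
    private final AtomicReference<Entry<T>> entry = new AtomicReference<>();

    FreshnessCache(Supplier<T> source, long freshnessMillis, LongSupplier clock) {
        this.source = source;
        this.freshnessMillis = freshnessMillis;
        this.clock = clock;
    }

    T get() {
        Entry<T> e = entry.get();
        long now = clock.getAsLong();
        if (freshnessMillis > 0 && e != null && now - e.loadedAt < freshnessMillis) {
            return e.value; // fresh enough: the source is not touched
        }
        // Stale, empty, or caching disabled: this call goes to the source and may block there.
        T value = source.get();
        entry.set(new Entry<>(value, now));
        return value;
    }
}
```

In this simplified design the thread that finds the entry stale still pays for the refresh itself; moving the refresh to a background thread would remove that cost from the search path as well.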