OutOfMemory and StackOverflow after upgrading from 2.11 to 2.12.2

Saša Živkov

Apr 26, 2016, 5:46:08 AM
to repo-d...@googlegroups.com
After upgrading one of our Gerrit servers from 2.11 to 2.12.2 we see a big slowdown
and are getting OutOfMemory every 2-3 hours and then must restart Gerrit.
Increasing the heap prolonged the time between two restarts.

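A few suspicious things that we see in the thread dumps are: [1], where the highlighted part
(the FilterDirectoryReader.doClose / IndexReader.decRef / IndexReader.close frames) repeats
hundreds of times until a StackOverflowError is thrown, and [2], where the highlighted part
(the FieldCacheImpl.getNumerics / UninvertingReader.getNumericDocValues frames) also repeats
hundreds of times. Both are related to the GetRelated REST call.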
Do [1] and [2] look like a Lucene bug?

We have the impression that memory consumption increases quickly after *some* GetRelated calls.
The issue also looks related to issue 3727 (https://code.google.com/p/gerrit/issues/detail?id=3727).

NOTE:
The background reindexing is not yet done and will never finish because it starts from
scratch after every restart and it doesn't have enough time to finish between two restarts.

Our plan is to reindex changes using REST API calls, because this way we can continue where we stopped
before a restart. We are also considering removing GetRelated temporarily, at least until the reindexing
is finished. We are continuing with the heap dump analysis, which should hopefully give us some clue as to
whether and where we have a memory leak.
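
A minimal sketch of the resumable reindexing idea described above, not the script actually used. It assumes the server exposes Gerrit's "Index Change" REST endpoint (POST /a/changes/{change-id}/index) and that the caller supplies a urllib opener with authentication already configured. The checkpoint file, the function names and the way the list of change numbers is obtained are all hypothetical.

```python
import os
import urllib.request


def load_done(checkpoint_path):
    """Return the set of change numbers already reindexed (empty if no checkpoint yet)."""
    if not os.path.exists(checkpoint_path):
        return set()
    with open(checkpoint_path) as f:
        return {int(line) for line in f if line.strip()}


def reindex_changes(change_numbers, index_one, checkpoint_path):
    """Reindex each change once, recording progress so a rerun skips finished changes."""
    done = load_done(checkpoint_path)
    processed = []
    with open(checkpoint_path, "a") as checkpoint:
        for number in change_numbers:
            if number in done:
                continue
            # index_one raises on failure, so a failed change is retried on the next run.
            index_one(number)
            checkpoint.write(f"{number}\n")
            checkpoint.flush()
            processed.append(number)
    return processed


def make_rest_indexer(base_url, opener):
    """Build an index_one callable that POSTs to the assumed Index Change endpoint.

    `opener` is a urllib OpenerDirector that the caller has set up with credentials.
    """
    def index_one(number):
        request = urllib.request.Request(
            f"{base_url}/a/changes/{number}/index", data=b"", method="POST")
        with opener.open(request) as response:
            response.read()
    return index_one
```

After a Gerrit restart, running reindex_changes again with the same checkpoint file continues with the first change that was not yet recorded, which is the property the background reindexer lacks here.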

Any additional suggestion?

[1]
...
        at org.apache.lucene.index.FilterDirectoryReader.doClose()V(FilterDirectoryReader.java:134)
        at org.apache.lucene.index.IndexReader.decRef()V(IndexReader.java:253)
        at org.apache.lucene.index.IndexReader.close()V(IndexReader.java:403)
        - locked <0x00007eff9848e968> (a org.apache.lucene.uninverting.UninvertingReader$UninvertingDirectoryReader)
        at org.apache.lucene.index.FilterDirectoryReader.doClose()V(FilterDirectoryReader.java:134)
        at org.apache.lucene.index.IndexReader.decRef()V(IndexReader.java:253)
        at org.apache.lucene.index.IndexReader.close()V(IndexReader.java:403)
        - locked <0x00007eff984a4d30> (a org.apache.lucene.uninverting.UninvertingReader$UninvertingDirectoryReader)
        at org.apache.lucene.index.FilterDirectoryReader.doClose()V(FilterDirectoryReader.java:134)
        at org.apache.lucene.index.IndexReader.decRef()V(IndexReader.java:253)
        at org.apache.lucene.index.IndexReader.close()V(IndexReader.java:403)
        - locked <0x00007eff984bb0f8> (a org.apache.lucene.uninverting.UninvertingReader$UninvertingDirectoryReader)
        at org.apache.lucene.index.FilterDirectoryReader.doClose()V(FilterDirectoryReader.java:134)
        at org.apache.lucene.index.IndexReader.decRef()V(IndexReader.java:253)
        at org.apache.lucene.index.IndexReader.close()V(IndexReader.java:403)
        - locked <0x00007eff984d14c0> (a org.apache.lucene.uninverting.UninvertingReader$UninvertingDirectoryReader)
        at org.apache.lucene.index.FilterDirectoryReader.doClose()V(FilterDirectoryReader.java:134)
        at org.apache.lucene.index.IndexReader.decRef()V(IndexReader.java:253)
        at com.google.gerrit.lucene.WrappableSearcherManager.decRef(Lorg/apache/lucene/search/IndexSearcher;)V(WrappableSearcherManager.java:140)
        at com.google.gerrit.lucene.WrappableSearcherManager.decRef(Ljava/lang/Object;)V(WrappableSearcherManager.java:68)
        at org.apache.lucene.search.ReferenceManager.release(Ljava/lang/Object;)V(ReferenceManager.java:274)
        at com.google.gerrit.lucene.SubIndex.release(Lorg/apache/lucene/search/IndexSearcher;)V(SubIndex.java:203)
        at com.google.gerrit.lucene.LuceneChangeIndex$QuerySource.read()Lcom/google/gwtorm/server/ResultSet;(LuceneChangeIndex.java:470)
        at com.google.gerrit.server.index.IndexedChangeQuery.read()Lcom/google/gwtorm/server/ResultSet;(IndexedChangeQuery.java:106)
        at com.google.gerrit.server.index.IndexedChangeQuery.restart(I)Lcom/google/gwtorm/server/ResultSet;(IndexedChangeQuery.java:152)
        at com.google.gerrit.server.query.change.AndSource.readImpl()Lcom/google/gwtorm/server/ResultSet;(AndSource.java:133)
        at com.google.gerrit.server.query.change.AndSource.read()Lcom/google/gwtorm/server/ResultSet;(AndSource.java:99)
        at com.google.gerrit.server.query.change.QueryProcessor.queryChanges(Ljava/util/List;Ljava/util/List;)Ljava/util/List;(QueryProcessor.java:162)
        at com.google.gerrit.server.query.change.QueryProcessor.queryChanges(Ljava/util/List;)Ljava/util/List;(QueryProcessor.java:103)
        at com.google.gerrit.server.query.change.QueryProcessor.queryChanges(Lcom/google/gerrit/server/query/Predicate;)Lcom/google/gerrit/server/query/change/QueryResult;(QueryProcessor.java:87)
        at com.google.gerrit.server.query.change.InternalChangeQuery.query(Lcom/google/gerrit/server/query/Predicate;)Ljava/util/List;(InternalChangeQuery.java:245)
        at com.google.gerrit.server.query.change.InternalChangeQuery.byProjectGroups(Lcom/google/gerrit/reviewdb/client/Project$NameKey;Ljava/util/Collection;)Ljava/util/List;(InternalChangeQuery.java:240)
        at com.google.gerrit.server.change.GetRelated.getRelated(Lcom/google/gerrit/server/change/RevisionResource;)Ljava/util/List;(GetRelated.java:74)
        at com.google.gerrit.server.change.GetRelated.apply(Lcom/google/gerrit/server/change/RevisionResource;)Lcom/google/gerrit/server/change/GetRelated$RelatedInfo;(GetRelated.java:63)
        at com.google.gerrit.server.change.GetRelated.apply(Lcom/google/gerrit/extensions/restapi/RestResource;)Ljava/lang/Object;(GetRelated.java:44)

[2]
        at org.apache.lucene.uninverting.FieldCacheImpl.getNumerics(Lorg/apache/lucene/index/LeafReader;Ljava/lang/String;Lorg/apache/lucene/uninverting/FieldCache$Parser;Z)Lorg/apache/lucene/index/NumericDocValues;(FieldCacheImpl.java:456)
        at org.apache.lucene.uninverting.UninvertingReader.getNumericDocValues(Ljava/lang/String;)Lorg/apache/lucene/index/NumericDocValues;(UninvertingReader.java:234)
        at org.apache.lucene.uninverting.FieldCacheImpl.getNumerics(Lorg/apache/lucene/index/LeafReader;Ljava/lang/String;Lorg/apache/lucene/uninverting/FieldCache$Parser;Z)Lorg/apache/lucene/index/NumericDocValues;(FieldCacheImpl.java:456)
        at org.apache.lucene.uninverting.UninvertingReader.getNumericDocValues(Ljava/lang/String;)Lorg/apache/lucene/index/NumericDocValues;(UninvertingReader.java:234)
        at org.apache.lucene.uninverting.FieldCacheImpl.getNumerics(Lorg/apache/lucene/index/LeafReader;Ljava/lang/String;Lorg/apache/lucene/uninverting/FieldCache$Parser;Z)Lorg/apache/lucene/index/NumericDocValues;(FieldCacheImpl.java:456)
        at org.apache.lucene.uninverting.UninvertingReader.getNumericDocValues(Ljava/lang/String;)Lorg/apache/lucene/index/NumericDocValues;(UninvertingReader.java:234)
        at org.apache.lucene.uninverting.FieldCacheImpl.getNumerics(Lorg/apache/lucene/index/LeafReader;Ljava/lang/String;Lorg/apache/lucene/uninverting/FieldCache$Parser;Z)Lorg/apache/lucene/index/NumericDocValues;(FieldCacheImpl.java:456)
        at org.apache.lucene.uninverting.UninvertingReader.getNumericDocValues(Ljava/lang/String;)Lorg/apache/lucene/index/NumericDocValues;(UninvertingReader.java:234)
        at org.apache.lucene.index.DocValues.getNumeric(Lorg/apache/lucene/index/LeafReader;Ljava/lang/String;)Lorg/apache/lucene/index/NumericDocValues;(DocValues.java:225)
        at org.apache.lucene.search.FieldComparator$NumericComparator.getNumericDocValues(Lorg/apache/lucene/index/LeafReaderContext;Ljava/lang/String;)Lorg/apache/lucene/index/NumericDocValues;(FieldComparator.java:167)
        at org.apache.lucene.search.FieldComparator$NumericComparator.doSetNextReader(Lorg/apache/lucene/index/LeafReaderContext;)V(FieldComparator.java:153)
        at org.apache.lucene.search.SimpleFieldComparator.getLeafComparator(Lorg/apache/lucene/index/LeafReaderContext;)Lorg/apache/lucene/search/LeafFieldComparator;(SimpleFieldComparator.java:36)
        at org.apache.lucene.search.FieldValueHitQueue.getComparators(Lorg/apache/lucene/index/LeafReaderContext;)[Lorg/apache/lucene/search/LeafFieldComparator;(FieldValueHitQueue.java:183)
        at org.apache.lucene.search.TopFieldCollector$NonScoringCollector.getLeafCollector(Lorg/apache/lucene/index/LeafReaderContext;)Lorg/apache/lucene/search/LeafCollector;(TopFieldCollector.java:141)
        at org.apache.lucene.search.IndexSearcher.search(Ljava/util/List;Lorg/apache/lucene/search/Weight;Lorg/apache/lucene/search/Collector;)V(IndexSearcher.java:763)
        at org.apache.lucene.search.IndexSearcher.search(Lorg/apache/lucene/search/Query;Lorg/apache/lucene/search/Collector;)V(IndexSearcher.java:486)
        at org.apache.lucene.search.IndexSearcher.search(Lorg/apache/lucene/search/Query;Lorg/apache/lucene/search/CollectorManager;)Ljava/lang/Object;(IndexSearcher.java:695)
        at org.apache.lucene.search.IndexSearcher.searchAfter(Lorg/apache/lucene/search/FieldDoc;Lorg/apache/lucene/search/Query;ILorg/apache/lucene/search/Sort;ZZ)Lorg/apache/lucene/search/TopFieldDocs;(IndexSearcher.java:680)
        at org.apache.lucene.search.IndexSearcher.searchAfter(Lorg/apache/lucene/search/ScoreDoc;Lorg/apache/lucene/search/Query;Lorg/apache/lucene/search/Filter;ILorg/apache/lucene/search/Sort;ZZ)Lorg/apache/lucene/search/TopFieldDocs;(IndexSearcher.java:622)
        at org.apache.lucene.search.IndexSearcher.search(Lorg/apache/lucene/search/Query;Lorg/apache/lucene/search/Filter;ILorg/apache/lucene/search/Sort;ZZ)Lorg/apache/lucene/search/TopFieldDocs;(IndexSearcher.java:528)
        at org.apache.lucene.search.IndexSearcher.search(Lorg/apache/lucene/search/Query;ILorg/apache/lucene/search/Sort;)Lorg/apache/lucene/search/TopFieldDocs;(IndexSearcher.java:578)
        at com.google.gerrit.lucene.LuceneChangeIndex$QuerySource.read()Lcom/google/gwtorm/server/ResultSet;(LuceneChangeIndex.java:435)
        at com.google.gerrit.server.index.IndexedChangeQuery.read()Lcom/google/gwtorm/server/ResultSet;(IndexedChangeQuery.java:106)
        at com.google.gerrit.server.index.IndexedChangeQuery.restart(I)Lcom/google/gwtorm/server/ResultSet;(IndexedChangeQuery.java:152)
        at com.google.gerrit.server.query.change.AndSource.readImpl()Lcom/google/gwtorm/server/ResultSet;(AndSource.java:133)
        at com.google.gerrit.server.query.change.AndSource.read()Lcom/google/gwtorm/server/ResultSet;(AndSource.java:99)
        at com.google.gerrit.server.query.change.QueryProcessor.queryChanges(Ljava/util/List;Ljava/util/List;)Ljava/util/List;(QueryProcessor.java:162)
        at com.google.gerrit.server.query.change.QueryProcessor.queryChanges(Ljava/util/List;)Ljava/util/List;(QueryProcessor.java:103)
        at com.google.gerrit.server.query.change.QueryProcessor.queryChanges(Lcom/google/gerrit/server/query/Predicate;)Lcom/google/gerrit/server/query/change/QueryResult;(QueryProcessor.java:87)
        at com.google.gerrit.server.query.change.InternalChangeQuery.query(Lcom/google/gerrit/server/query/Predicate;)Ljava/util/List;(InternalChangeQuery.java:245)
        at com.google.gerrit.server.query.change.InternalChangeQuery.byProjectGroups(Lcom/google/gerrit/reviewdb/client/Project$NameKey;Ljava/util/Collection;)Ljava/util/List;(InternalChangeQuery.java:240)
        at com.google.gerrit.server.change.GetRelated.getRelated(Lcom/google/gerrit/server/change/RevisionResource;)Ljava/util/List;(GetRelated.java:74)
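
One possible reading of trace [1], offered only as an assumption: each close() locks a different UninvertingDirectoryReader instance, which is what a chain of readers wrapped around one another would produce. The toy model below is plain Python, not Lucene or Gerrit code, and all class and function names in it are hypothetical. It only shows how a delegating close() on a deeply nested wrapper chain exhausts the stack, and how wrapping the innermost reader keeps the chain one level deep.

```python
import sys


class BaseReader:
    """Stand-in for the real underlying index reader."""
    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


class WrappingReader:
    """Stand-in for a filter-style reader that delegates close() to the reader it wraps."""
    def __init__(self, inner):
        self.inner = inner

    def close(self):
        self.inner.close()  # one extra stack frame per wrapping level


def wrap_repeatedly(reader, times):
    """Wrap an already wrapped reader again and again (the suspected pattern)."""
    for _ in range(times):
        reader = WrappingReader(reader)
    return reader


def unwrap(reader):
    """Return the innermost, unwrapped reader."""
    while isinstance(reader, WrappingReader):
        reader = reader.inner
    return reader


def wrap_once(reader):
    """Wrap the innermost reader so the delegation chain never exceeds one level."""
    return WrappingReader(unwrap(reader))


def chain_depth(reader):
    """Count how many wrappers sit on top of the base reader."""
    depth = 0
    while isinstance(reader, WrappingReader):
        depth += 1
        reader = reader.inner
    return depth
```

Calling close() on wrap_repeatedly(BaseReader(), n) with n above sys.getrecursionlimit() raises RecursionError, Python's counterpart of a StackOverflowError, whereas repeated wrap_once calls keep chain_depth at 1. Whether Gerrit 2.12 really builds such a chain is not established by the dumps alone.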

Saša Živkov

Apr 26, 2016, 8:33:55 AM
to repo-d...@googlegroups.com
Strangely, when we disabled GetRelated (so that it always returns an empty list), the OOMs started occurring much more often.

Sven Selberg

Apr 27, 2016, 4:40:37 AM
to Repo and Gerrit Discussion
Can you see anything special about the changes it tries to GetRelated from?
It smells like some sort of cyclic graph, although, without having looked at the code, it seems unlikely that we look up more than one level of "relation".

/Sven

Saša Živkov

Apr 28, 2016, 10:19:29 AM
to Sven Selberg, Repo and Gerrit Discussion
On Wed, Apr 27, 2016 at 10:40 AM, Sven Selberg <sven.s...@axis.com> wrote:
Can you see anything special about the changes it tries to GetRelated from?
It smells like some sort of cyclic graph, although, without having looked at the code, it seems unlikely that we look up more than one level of "relation".
 
This doesn't seem to be the case.

In the meantime we looked further into the Gerrit and Lucene code and found that this issue
is likely to affect Gerrit 2.12 "only" during online reindexing, i.e. when Gerrit 2.12 (Lucene 5)
is using an old index (Lucene 4) and the load is high enough.

More details in the comments in:


