Hi,
We have an issue with Gerrit 2.9: all pushes are failing because they hit the timeout (2 minutes).
After investigating, we found that all the push operations are stuck at indexing because the
InteractiveMergeabilityChecks thread went into an infinite loop (see the stack trace below).
This infinite loop is in JGit and is a known problem: there is already a workaround in the
intraline difference code, which adds a timeout to get out of such an infinite loop. See the original issue
for more details:
https://code.google.com/p/gerrit/issues/detail?id=487
I know that fixing the infinite loop is the best solution, but as a short-term measure, what can we do in
the Gerrit code to protect against such a situation? Maybe we could use the same approach as for the intraline
difference (a timeout) and fail the indexing?
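To make the timeout idea concrete, here is a minimal sketch in plain Java. It is not the actual Gerrit or JGit code: BoundedTask and callWithTimeout are hypothetical names, and the task passed in only stands in for the diff computation. It assumes the work can be handed to a separate pool and that the caller can continue with a fallback value.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper for illustration; not part of Gerrit or JGit. */
final class BoundedTask {
  private BoundedTask() {}

  /**
   * Runs the task on the given pool and waits at most timeoutMillis for it.
   * Returns the fallback value if the task has not finished by then.
   */
  static <T> T callWithTimeout(
      ExecutorService pool, Callable<T> task, long timeoutMillis, T fallback)
      throws ExecutionException, InterruptedException {
    Future<T> future = pool.submit(task);
    try {
      return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
    } catch (TimeoutException e) {
      // cancel(true) only interrupts the worker thread. Code that never
      // checks the interrupt flag keeps running and keeps its thread busy.
      future.cancel(true);
      return fallback;
    }
  }
}
```

The caller gets unblocked after the timeout, but a computation that ignores interruption would still occupy a worker thread, so the pool running it would need to be sized or recycled with that in mind.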
This investigation exposed another problem: no matter how many threads we allocate to process receive commands,
they can all bottleneck at indexing unless we allocate the same number of threads for indexing. Is it really
necessary to index changes asynchronously (from what I understand, this is done so that emails can be sent while indexing)?
I think we should do all interactive indexing in the same thread that triggered the indexing. WDYT?
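As a rough sketch of what "same thread" would mean, assuming the indexing call goes through a java.util.concurrent.Executor: swapping in an executor that runs tasks inline makes the calling thread do the work itself. SameThreadIndexing and indexVia are hypothetical names used only for illustration, not Gerrit classes.

```java
import java.util.concurrent.Executor;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical illustration; not part of Gerrit. */
final class SameThreadIndexing {
  private SameThreadIndexing() {}

  /** Executor that runs each task synchronously in the calling thread. */
  static final Executor SAME_THREAD = Runnable::run;

  /**
   * Submits indexTask to the executor and returns the thread that ran it.
   * With an asynchronous executor the result may be null, because the
   * task may not have started when this method returns.
   */
  static Thread indexVia(Executor executor, Runnable indexTask) {
    AtomicReference<Thread> ranOn = new AtomicReference<>();
    executor.execute(() -> {
      ranOn.set(Thread.currentThread());
      indexTask.run();
    });
    return ranOn.get();
  }
}
```

With SAME_THREAD, a stuck indexing call would block only the receive thread that triggered it, instead of every push queueing behind a smaller shared indexing pool.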
Hugo
"InteractiveMergeabilityChecks-1" prio=10 tid=0x00007fbb300d7000 nid=0xe392 runnable [0x00007fbc469e7000]
java.lang.Thread.State: RUNNABLE
at org.eclipse.jgit.diff.MyersDiff$MiddleEdit$BackwardEditPaths.snake(MyersDiff.java:501)
at org.eclipse.jgit.diff.MyersDiff$MiddleEdit$EditPaths.calculate(MyersDiff.java:414)
at org.eclipse.jgit.diff.MyersDiff$MiddleEdit.calculate(MyersDiff.java:260)
at org.eclipse.jgit.diff.MyersDiff.calculateEdits(MyersDiff.java:188)
at org.eclipse.jgit.diff.MyersDiff.calculateEdits(MyersDiff.java:165)
at org.eclipse.jgit.diff.MyersDiff.<init>(MyersDiff.java:148)
at org.eclipse.jgit.diff.MyersDiff.<init>(MyersDiff.java:112)
at org.eclipse.jgit.diff.MyersDiff$1.diffNonCommon(MyersDiff.java:119)
at org.eclipse.jgit.diff.HistogramDiff$State.diffReplace(HistogramDiff.java:172)
at org.eclipse.jgit.diff.HistogramDiff.diffNonCommon(HistogramDiff.java:133)
at org.eclipse.jgit.diff.LowLevelDiffAlgorithm.diffNonCommon(LowLevelDiffAlgorithm.java:59)
at org.eclipse.jgit.diff.DiffAlgorithm.diff(DiffAlgorithm.java:123)
at org.eclipse.jgit.diff.DiffFormatter.diff(DiffFormatter.java:958)
at org.eclipse.jgit.diff.DiffFormatter.createFormatResult(DiffFormatter.java:936)
at org.eclipse.jgit.diff.DiffFormatter.toFileHeader(DiffFormatter.java:889)
at com.google.gerrit.server.patch.PatchListLoader.readPatchList(PatchListLoader.java:161)
at com.google.gerrit.server.patch.PatchListLoader.load(PatchListLoader.java:87)
at com.google.gerrit.server.patch.PatchListLoader.load(PatchListLoader.java:70)
at com.google.gerrit.server.cache.h2.H2CacheImpl$Loader.load(H2CacheImpl.java:249)
at com.google.gerrit.server.cache.h2.H2CacheImpl$Loader.load(H2CacheImpl.java:229)
at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3524)
at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2317)
at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2280)
- locked <0x00007fc39784c4c8> (a com.google.common.cache.LocalCache$StrongAccessEntry)
at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2195)
at com.google.common.cache.LocalCache.get(LocalCache.java:3934)
at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3938)
at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4821)
at com.google.gerrit.server.cache.h2.H2CacheImpl.get(H2CacheImpl.java:111)
at com.google.gerrit.server.patch.PatchListCacheImpl.get(PatchListCacheImpl.java:81)
at com.google.gerrit.server.patch.PatchListCacheImpl.get(PatchListCacheImpl.java:99)
at com.google.gerrit.server.query.change.ChangeData.currentFilePaths(ChangeData.java:278)
at com.google.gerrit.server.index.ChangeField.getFileParts(ChangeField.java:195)
at com.google.gerrit.server.index.ChangeField$12.get(ChangeField.java:210)
at com.google.gerrit.server.index.ChangeField$12.get(ChangeField.java:206)
at com.google.gerrit.server.index.Schema$1.apply(Schema.java:103)
at com.google.gerrit.server.index.Schema$1.apply(Schema.java:98)
at com.google.common.collect.TransformedIterator.next(TransformedIterator.java:48)
at com.google.common.collect.Iterators$7.computeNext(Iterators.java:646)
at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
at com.google.gerrit.lucene.LuceneChangeIndex.toDocument(LuceneChangeIndex.java:483)
at com.google.gerrit.lucene.LuceneChangeIndex.replace(LuceneChangeIndex.java:296)
at com.google.gerrit.server.index.ChangeIndexer.index(ChangeIndexer.java:131)
at com.google.gerrit.server.index.ChangeIndexer.index(ChangeIndexer.java:142)
at com.google.gerrit.server.change.Mergeable.refresh(Mergeable.java:196)
at com.google.gerrit.server.change.Mergeable.apply(Mergeable.java:113)
at com.google.gerrit.server.change.MergeabilityChecker$Task.call(MergeabilityChecker.java:341)
at com.google.gerrit.server.change.MergeabilityChecker$Task.call(MergeabilityChecker.java:294)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
at com.google.gerrit.server.git.WorkQueue$Task.run(WorkQueue.java:364)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:724)