jgit infinite loop during indexing


Hugo Arès

Sep 19, 2014, 1:56:04 PM
to repo-d...@googlegroups.com
Hi,

We have an issue with Gerrit 2.9: all pushes are failing because they reach the timeout (2 minutes).
After investigating, we found that all the push operations are stuck at indexing because the
InteractiveMergeabilityChecks thread went into an infinite loop; see the stack trace below.

This infinite loop is in JGit and is a known problem: there is already a workaround in the
intraline difference code, which adds a timeout to get out of such an infinite loop. See the original issue
for more details: https://code.google.com/p/gerrit/issues/detail?id=487

I know that fixing the infinite loop is the best solution, but as a short-term solution, what can we do in
the Gerrit code to protect against such a situation? Maybe we can use the same approach as for intraline
difference (a timeout) and fail the indexing?

This investigation exposed another problem: no matter how many threads we allocate to process receive commands,
they can all bottleneck at indexing if we do not allocate the same number of indexing threads. Is it really
necessary to index changes asynchronously (from what I understand, this is done to send emails while indexing)?
I think we should do all interactive indexing in the same thread that caused the indexing, WDYT?

Hugo


"InteractiveMergeabilityChecks-1" prio=10 tid=0x00007fbb300d7000 nid=0xe392 runnable [0x00007fbc469e7000]
   java.lang.Thread.State: RUNNABLE
    at org.eclipse.jgit.diff.MyersDiff$MiddleEdit$BackwardEditPaths.snake(MyersDiff.java:501)
    at org.eclipse.jgit.diff.MyersDiff$MiddleEdit$EditPaths.calculate(MyersDiff.java:414)
    at org.eclipse.jgit.diff.MyersDiff$MiddleEdit.calculate(MyersDiff.java:260)
    at org.eclipse.jgit.diff.MyersDiff.calculateEdits(MyersDiff.java:188)
    at org.eclipse.jgit.diff.MyersDiff.calculateEdits(MyersDiff.java:165)
    at org.eclipse.jgit.diff.MyersDiff.<init>(MyersDiff.java:148)
    at org.eclipse.jgit.diff.MyersDiff.<init>(MyersDiff.java:112)
    at org.eclipse.jgit.diff.MyersDiff$1.diffNonCommon(MyersDiff.java:119)
    at org.eclipse.jgit.diff.HistogramDiff$State.diffReplace(HistogramDiff.java:172)
    at org.eclipse.jgit.diff.HistogramDiff.diffNonCommon(HistogramDiff.java:133)
    at org.eclipse.jgit.diff.LowLevelDiffAlgorithm.diffNonCommon(LowLevelDiffAlgorithm.java:59)
    at org.eclipse.jgit.diff.DiffAlgorithm.diff(DiffAlgorithm.java:123)
    at org.eclipse.jgit.diff.DiffFormatter.diff(DiffFormatter.java:958)
    at org.eclipse.jgit.diff.DiffFormatter.createFormatResult(DiffFormatter.java:936)
    at org.eclipse.jgit.diff.DiffFormatter.toFileHeader(DiffFormatter.java:889)
    at com.google.gerrit.server.patch.PatchListLoader.readPatchList(PatchListLoader.java:161)
    at com.google.gerrit.server.patch.PatchListLoader.load(PatchListLoader.java:87)
    at com.google.gerrit.server.patch.PatchListLoader.load(PatchListLoader.java:70)
    at com.google.gerrit.server.cache.h2.H2CacheImpl$Loader.load(H2CacheImpl.java:249)
    at com.google.gerrit.server.cache.h2.H2CacheImpl$Loader.load(H2CacheImpl.java:229)
    at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3524)
    at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2317)
    at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2280)
    - locked <0x00007fc39784c4c8> (a com.google.common.cache.LocalCache$StrongAccessEntry)
    at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2195)
    at com.google.common.cache.LocalCache.get(LocalCache.java:3934)
    at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3938)
    at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4821)
    at com.google.gerrit.server.cache.h2.H2CacheImpl.get(H2CacheImpl.java:111)
    at com.google.gerrit.server.patch.PatchListCacheImpl.get(PatchListCacheImpl.java:81)
    at com.google.gerrit.server.patch.PatchListCacheImpl.get(PatchListCacheImpl.java:99)
    at com.google.gerrit.server.query.change.ChangeData.currentFilePaths(ChangeData.java:278)
    at com.google.gerrit.server.index.ChangeField.getFileParts(ChangeField.java:195)
    at com.google.gerrit.server.index.ChangeField$12.get(ChangeField.java:210)
    at com.google.gerrit.server.index.ChangeField$12.get(ChangeField.java:206)
    at com.google.gerrit.server.index.Schema$1.apply(Schema.java:103)
    at com.google.gerrit.server.index.Schema$1.apply(Schema.java:98)
    at com.google.common.collect.TransformedIterator.next(TransformedIterator.java:48)
    at com.google.common.collect.Iterators$7.computeNext(Iterators.java:646)
    at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
    at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
    at com.google.gerrit.lucene.LuceneChangeIndex.toDocument(LuceneChangeIndex.java:483)
    at com.google.gerrit.lucene.LuceneChangeIndex.replace(LuceneChangeIndex.java:296)
    at com.google.gerrit.server.index.ChangeIndexer.index(ChangeIndexer.java:131)
    at com.google.gerrit.server.index.ChangeIndexer.index(ChangeIndexer.java:142)
    at com.google.gerrit.server.change.Mergeable.refresh(Mergeable.java:196)
    at com.google.gerrit.server.change.Mergeable.apply(Mergeable.java:113)
    at com.google.gerrit.server.change.MergeabilityChecker$Task.call(MergeabilityChecker.java:341)
    at com.google.gerrit.server.change.MergeabilityChecker$Task.call(MergeabilityChecker.java:294)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
    at com.google.gerrit.server.git.WorkQueue$Task.run(WorkQueue.java:364)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:724)

Dave Borowitz

Sep 19, 2014, 2:57:02 PM
to Hugo Arès, repo-discuss
On Fri, Sep 19, 2014 at 10:56 AM, Hugo Arès <hugo...@ericsson.com> wrote:
> Hi,
>
> We have an issue with Gerrit 2.9: all pushes are failing because they reach the timeout (2 minutes).
> After investigating, we found that all the push operations are stuck at indexing because the
> InteractiveMergeabilityChecks thread went into an infinite loop; see the stack trace below.
>
> This infinite loop is in JGit and is a known problem: there is already a workaround in the
> intraline difference code, which adds a timeout to get out of such an infinite loop. See the original issue
> for more details: https://code.google.com/p/gerrit/issues/detail?id=487
>
> I know that fixing the infinite loop is the best solution, but as a short-term solution, what can we do in
> the Gerrit code to protect against such a situation? Maybe we can use the same approach as for intraline
> difference (a timeout) and fail the indexing?

Yes, you would need a similar timeout. Don't fail indexing the change, though; just treat it as if it modified no files. Reduced search recall is preferable to returning stale results.
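
A minimal sketch of that idea, assuming the file-path computation can be wrapped in a Callable. The class BoundedFileLister, the method pathsOrEmpty, and the "bounded-diff" thread name are hypothetical and not part of Gerrit or JGit; this is not how the intraline workaround is actually implemented.

```java
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper for illustration; not a Gerrit or JGit class. */
final class BoundedFileLister {
  // Daemon threads, so a diff that never returns cannot keep the JVM alive.
  private static final ExecutorService DIFF_POOL =
      Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r, "bounded-diff");
        t.setDaemon(true);
        return t;
      });

  /**
   * Runs computePaths with a deadline. On timeout or failure the change is
   * treated as touching no files, so the rest of it can still be indexed.
   */
  static List<String> pathsOrEmpty(
      Callable<List<String>> computePaths, long timeout, TimeUnit unit) {
    Future<List<String>> result = DIFF_POOL.submit(computePaths);
    try {
      return result.get(timeout, unit);
    } catch (TimeoutException e) {
      // Best effort only: interruption helps only if the diff code checks for it.
      result.cancel(true);
      return Collections.emptyList();
    } catch (ExecutionException e) {
      // The computation itself failed; fall back to "no files".
      return Collections.emptyList();
    } catch (InterruptedException e) {
      // Preserve the caller's interrupt status and give up on the diff.
      Thread.currentThread().interrupt();
      result.cancel(true);
      return Collections.emptyList();
    }
  }
}
```

The caller gets its thread back after the deadline, but a loop that never checks for interruption keeps spinning in the background thread and keeps consuming a CPU core, so this only contains the damage until the underlying diff bug is fixed.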
 
> This investigation exposed another problem: no matter how many threads we allocate to process receive commands,
> they can all bottleneck at indexing if we do not allocate the same number of indexing threads. Is it really
> necessary to index changes asynchronously (from what I understand, this is done to send emails while indexing)?
> I think we should do all interactive indexing in the same thread that caused the indexing, WDYT?

Email is the most common thing we do in the background, but I don't think it is the only one.

We mostly did background indexing to be conservative about the performance impact of adding the index step. It might have been more trouble than it's worth.

Another alternative is to teach WorkQueue how to use a directExecutor (i.e. execute in the current thread) if its threadpool is full. I have no idea how feasible this is.
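
For illustration, the plain JDK ThreadPoolExecutor already supports this behaviour through CallerRunsPolicy. The sketch below is not WorkQueue code; the class name CallerRunsIndexExecutor and its create method are hypothetical.

```java
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical illustration; not part of Gerrit's WorkQueue. */
final class CallerRunsIndexExecutor {
  /**
   * Creates a fixed-size pool with no task queue. When every worker is busy,
   * the submitted task is rejected and CallerRunsPolicy runs it in the
   * submitting thread instead of making it wait.
   */
  static ThreadPoolExecutor create(int threads) {
    return new ThreadPoolExecutor(
        threads,
        threads,
        0L,
        TimeUnit.MILLISECONDS,
        // A SynchronousQueue holds no tasks: a hand-off succeeds only if a worker is idle.
        new SynchronousQueue<Runnable>(),
        new ThreadPoolExecutor.CallerRunsPolicy());
  }
}
```

The stack trace above shows WorkQueue tasks running on a ScheduledThreadPoolExecutor, which uses an unbounded queue and therefore never rejects a task because its threads are busy. This policy would not apply to it directly, so some other saturation check would be needed there.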
 
