AW - Gerrit/JGit problem: missing unknown during WindowCursor.open(WindowCursor.java:125)


Luca Milanesio

May 15, 2012, 1:53:19 PM
to Repo and Gerrit Discussion

Hi all,
we are experiencing a strange, intermittent error on our Gerrit 2.2.2 production server:

net.sf.ehcache.CacheException: Could not fetch object for cache entry with key "lmit/jenkinsmobi-ios".
        at net.sf.ehcache.constructs.blocking.SelfPopulatingCache.get(SelfPopulatingCache.java:88)
        at com.google.gerrit.server.cache.PopulatingCache.get(PopulatingCache.java:85)
        at com.google.gerrit.server.project.ProjectCacheImpl.get(ProjectCacheImpl.java:133)
        at com.google.gerrit.sshd.commands.ListProjects.display(ListProjects.java:122)
        at com.google.gerrit.sshd.commands.ListProjects.access$200(ListProjects.java:43)
        at com.google.gerrit.sshd.commands.ListProjects$1.run(ListProjects.java:108)
        at com.google.gerrit.sshd.BaseCommand$TaskThunk.run(BaseCommand.java:409)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:98)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:207)
        at com.google.gerrit.server.git.WorkQueue$Task.run(WorkQueue.java:324)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:619)
Caused by: org.eclipse.jgit.errors.MissingObjectException: Missing unknown f2fb20e075f3449251b25e5c41e1af0a460af7d0
        at org.eclipse.jgit.storage.file.WindowCursor.open(WindowCursor.java:125)
        at org.eclipse.jgit.lib.ObjectReader.open(ObjectReader.java:228)
        at org.eclipse.jgit.revwalk.RevWalk.parseAny(RevWalk.java:811)
        at org.eclipse.jgit.revwalk.RevWalk.parseCommit(RevWalk.java:724)
        at com.google.gerrit.server.git.VersionedMetaData.load(VersionedMetaData.java:120)
        at com.google.gerrit.server.git.VersionedMetaData.load(VersionedMetaData.java:96)
        at com.google.gerrit.server.project.ProjectCacheImpl$Loader.createEntry(ProjectCacheImpl.java:252)
        at com.google.gerrit.server.project.ProjectCacheImpl$Loader.createEntry(ProjectCacheImpl.java:233)
        at com.google.gerrit.server.cache.PopulatingCache$1.createEntry(PopulatingCache.java:55)
        at net.sf.ehcache.constructs.blocking.SelfPopulatingCache.get(SelfPopulatingCache.java:72)
        ... 15 more

I have seen a similar post on StackOverflow (http://stackoverflow.com/questions/9437652/jgit-clone-on-android) but it has no valid answers :-(

Has anybody experienced the same problem?

Thank you in advance.

Luca.
---
Luca Milanesio
Lu...@Milanesio.org
Mobile: +44-(0)7928-617383
Skype: lucamilanesio



Saša Živkov

May 16, 2012, 8:02:05 AM
to repo-d...@googlegroups.com


---------- Forwarded message ----------
From: Saša Živkov <ziv...@gmail.com>
Date: Wed, May 16, 2012 at 2:01 PM
Subject: Re: AW - Gerrit/JGit problem: missing unknown during WindowCursor.open(WindowCursor.java:125)
To: Luca Milanesio <Luca.Mi...@gmail.com>




Have you tried "git fsck" on this git repository?
This is loading project meta data from the "refs/meta/config" branch.
Did you check the existence and the content of this branch?

 





Luca Milanesio

May 16, 2012, 9:07:51 AM
to Saša Živkov, repo-d...@googlegroups.com
Hi Sasa,

actually we have investigated further and found that:
a) There is retry logic in JGit in that part of the code
b) The retry is attempted only once

Was the retry perhaps put there to handle race conditions?
(I don't know the answer ... one of the JGit guys might know more :-) Mathias or Shawn)

We have put a quick & dirty workaround in place (it is ugly but it works): an additional outer loop that retries further.
(from our tests, one extra retry cycle typically solves the problem)

See the patch below:


--- a/gerrit-server/src/main/java/com/google/gerrit/server/git/VersionedMetaData.java
+++ b/gerrit-server/src/main/java/com/google/gerrit/server/git/VersionedMetaData.java
@@ -40,8 +40,11 @@ import org.eclipse.jgit.revwalk.RevTree;
 import org.eclipse.jgit.revwalk.RevWalk;
 import org.eclipse.jgit.treewalk.TreeWalk;
 import org.eclipse.jgit.util.RawParseUtils;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
 
 import java.io.IOException;
+import java.util.Random;
 
 /**
  * Support for metadata stored within a version controlled branch.
@@ -117,7 +120,8 @@ public abstract class VersionedMetaData {
     if (id != null) {
       reader = db.newObjectReader();
       try {
-        revision = new RevWalk(reader).parseCommit(id);
+        RevWalk revWalk = new RevWalk(reader);
+        revision = parseCommitWithRetry(id, revWalk);
         onLoad();
       } finally {
         reader.release();
@@ -130,6 +134,41 @@ public abstract class VersionedMetaData {
     }
   }
 
+  private Logger log = LoggerFactory.getLogger(VersionedMetaData.class);
+  private RevCommit parseCommitWithRetry(ObjectId id, RevWalk revWalk)
+      throws MissingObjectException, IncorrectObjectTypeException, IOException {
+    boolean retry = true;
+    RevCommit commit = null;
+    int retryCount = 0;
+    while (commit == null && retry) {
+      try {
+        commit = revWalk.parseCommit(id);
+        return commit;
+      } catch (MissingObjectException e) {
+        String message = e.getMessage();
+        if (message.toLowerCase().indexOf("missing unknown") >= 0 && retryCount < 10) {
+          log.warn("Cannot parse Git Commit: " + id.getName(), e);
+          randomSleep();
+          retry = true;
+          retryCount++;
+        } else
+          throw e;
+      }
+    }
+
+    return commit;
+  }
+
+  private Random rnd = new Random(System.currentTimeMillis());
+  private void randomSleep() {
+    try {
+      long randomSleep = rnd.nextInt(500);
+      log.warn("Sleeping " + randomSleep + " msec");
+      Thread.sleep(randomSleep);
+    } catch (InterruptedException e) {
+    }
+  }
+

The filesystem is OK and so is the Git repo, so this is probably just a race condition.

What do you think?
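
For anyone who wants to try the idea outside Gerrit, here is a minimal standalone sketch of the same pattern as the patch above: a bounded retry with a random sleep between attempts. It uses only the JDK and makes no JGit calls. The names `RetrySketch`, `TransientReadException` and `callWithRetry` are hypothetical and exist only for illustration; `TransientReadException` stands in for the "missing unknown" MissingObjectException. Unlike the patch, the sketch restores the thread's interrupt flag and gives up if the sleep is interrupted.

```java
import java.util.Random;
import java.util.function.Supplier;

final class RetrySketch {
  /** Hypothetical stand-in for a transient "missing unknown" read failure. */
  static final class TransientReadException extends RuntimeException {
    TransientReadException(String message) {
      super(message);
    }
  }

  private static final Random RND = new Random();

  /**
   * Calls the action, retrying up to maxRetries times when it throws
   * TransientReadException. Between attempts it sleeps a random
   * 0..maxSleepMillis-1 ms. maxSleepMillis must be greater than zero.
   */
  static <T> T callWithRetry(Supplier<T> action, int maxRetries, int maxSleepMillis) {
    for (int attempt = 0; ; attempt++) {
      try {
        return action.get();
      } catch (TransientReadException e) {
        if (attempt >= maxRetries) {
          throw e; // retries exhausted: surface the original failure
        }
        try {
          Thread.sleep(RND.nextInt(maxSleepMillis));
        } catch (InterruptedException interrupted) {
          // Keep the interrupt visible to callers and stop retrying.
          Thread.currentThread().interrupt();
          throw e;
        }
      }
    }
  }
}
```

Calling it with maxRetries = 10 and maxSleepMillis = 500 mirrors the limits used in the patch. Any other exception type is not retried and propagates immediately, which matches the patch's behaviour for errors other than "missing unknown".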

Luca.

Luca Milanesio

Jun 5, 2012, 5:18:30 AM
to duft....@gmail.com, Repo and Gerrit Discussion
This seems to be different: when you see "missing tree" or "missing object" it really means that your Git repo is corrupted, as the object chain is broken.
In my case (missing unknown) it was really a concurrency issue on the input stream (workaround: a retry cycle in Gerrit).

Luca.

duft.markus

Jun 7, 2012, 11:11:10 AM
to Luca Milanesio, Repo and Gerrit Discussion

Hm, really? The repo is definitely OK. Restarting Gerrit (or Eclipse - I have the same problem with EGit) always fixes the problem. Also, there are absolutely no problems detectable using C Git...

maybe it's a different problem, but for sure not with the repo.

Regards, Markus
