- How large is the Java heap?
- How large is core.packedGitLimit [3]?
- How many refs / branches / tags does your repository have?
- How is the cache "git_tags" configured?
- What's the output of git count-objects -v?
How many ssh threads do you have configured?
Not stuck, it is running, doing work. Does a second stack trace show a
similar location?
How much RAM on the server?
We notice major slowdowns around 200-300 packs, I would not be surprised if
this is affecting you. You may need to repack more often.
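As a rough way to keep an eye on that pack count, something like the sketch below could be run from a scheduled job. The class name, the threshold constant and the command-line argument are all made up for illustration; the 200 figure is just the lower end of the range mentioned above, not a tuned value.

```java
import java.io.File;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: counts pack files so a scheduled job can decide when to repack.
class PackCount {
    // Assumed threshold, taken from the "200-300 packs" remark; tune per server.
    static final int REPACK_THRESHOLD = 200;

    // Counts names ending in ".pack"; the matching ".idx" files are ignored.
    static int countPacks(List<String> fileNames) {
        int n = 0;
        for (String name : fileNames) {
            if (name.endsWith(".pack")) {
                n++;
            }
        }
        return n;
    }

    static boolean needsRepack(List<String> fileNames) {
        return countPacks(fileNames) >= REPACK_THRESHOLD;
    }

    public static void main(String[] args) {
        // First argument: path to the bare repository (defaults to the current directory).
        String repo = args.length > 0 ? args[0] : ".";
        String[] names = new File(repo, "objects/pack").list(); // null if unreadable
        List<String> files = names == null
                ? Collections.<String>emptyList()
                : Arrays.asList(names);
        System.out.println(countPacks(files) + " packs, repack needed: " + needsRepack(files));
    }
}
```

This only looks at file names in objects/pack; it says nothing about loose objects, which git count-objects -v reports separately.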
That's a lot of threads for a large repo and not a very big heap. If we pretend
for a moment that a clone of your repo uses 20G, and you get 64 clones at a
time, you don't have anywhere near enough heap for that. Are you having Java
GC issues?
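To put a number on that thought experiment: the sketch below just multiplies the two hypothetical figures from the paragraph above (64 concurrent clones, 20G each). The class and method names are invented, and real per-clone memory use will vary, so treat it as a back-of-the-envelope check only.

```java
// Back-of-the-envelope check using the hypothetical numbers from the reply above.
class HeapEstimate {
    // Worst case: every thread serves a full clone at the same time.
    static long worstCaseGiB(int concurrentClones, long perCloneGiB) {
        return concurrentClones * perCloneGiB;
    }

    public static void main(String[] args) {
        long needed = worstCaseGiB(64, 20);
        System.out.println("Worst-case clone memory: " + needed + " GiB");
    }
}
```

With those inputs the worst case comes out at 1280 GiB, which is why the thread count looks high for the heap being discussed.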
Do those dumps also show the RevWalk.next line?
> How much RAM on the server?
96G
> We notice major slowdowns around 200-300 packs, I would not be surprised if
> this is affecting you. You may need to repack more often.
This question has been brought up countless times in our org - how *should* we be garbage collecting/repacking/etc.? I've gone through countless threads and have come up with no clear answers. Should we be ignoring gerrit gc altogether and instead doing our own implementation of git gc/git repack -abdf/git prune? Does Gerrit's use of bitmaps collide with the bitmaps generated by git repack?
I *thought* jgit's gc did a repack at the end for you, but git's and jgit's implementations seem to be very different at certain things. I would love to hear how others are running garbage collection, whether it's a combination of gerrit gc and git repacks, or no gerrit gc at all, or other things I'm missing.
On Thursday, February 7, 2019 at 2:05:07 AM UTC+1, luke....@hpe.com wrote:
> This question has been brought up countless times in our org - how *should* we be garbage collecting/repacking/etc.? [...] I would love to hear how others are running garbage collection, whether it's a combination of gerrit gc and git repacks, or no gerrit gc at all, or other things I'm missing.
We run git ["repack -a -d -b", "pack-refs --all", "prune"] on all repositories weekly.
--
--
To unsubscribe, email repo-discuss...@googlegroups.com
More info at http://groups.google.com/group/repo-discuss?hl=en
---
You received this message because you are subscribed to the Google Groups "Repo and Gerrit Discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to repo-discuss...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
error: refs/changes/XX/XXXXX/meta does not point to a valid object!
What commands and options did you run?
On 28 Feb 2019, at 03:02, luke....@hpe.com wrote:
I hate to bring this back up, but we just had a critical failure today. We've been running git pack-refs --all and git repack -Abd every 3 hours on our repos for the last month or so, with no problems. We aren't exactly sure what happened, but during the repack at 3 PM today, a whole bunch of pack files got corrupted (at least jgit seemed to think so) which caused jgit to remove a lot of packs from the pack list - resulting in losing history for a ~2 hour period, but only for changes/meta it seems. This means we lost all votes/comments that happened within that time period. The only thing I can think that would cause this is the repack.
That said, what can we do to prevent this? Do we need to implement similar custom switches like you have above to preserve pack files for another round of gc? Is there something else we can/need to be doing? I'd also love to hear if you have any idea why this problem occurred in the first place.
Thank you,
Luke
On Monday, February 11, 2019 at 9:00:26 AM UTC-8, MartinFick wrote:
On Friday, February 8, 2019 10:47:22 AM MST luke....@hpe.com wrote:
> > What commands and options did you run?
>
> git repack -abdf, followed by git prune and git pack-refs --all
I believe using -a instead of -A is risky on a running server (-A has its own
risk, loose object explosions). From my understanding (the man pages are very
confusing), -a will not include unreferenced objects (potentially your new
objects) in packs, and it will not "loosen" them either. I believe this
effectively deletes any unreferenced objects that are in packs.
> The errors then showed up when running git pack-refs --all. So you're
> probably right that pruning triggered the errors, but you're suggesting
> our repack options were the true cause?
Pruning could also be the problem, if your unreferenced objects were loose and
not packed. So I think it depends on whether the unreferenced objects were
loose or packed,
-Martin
--
The Qualcomm Innovation Center, Inc. is a member of Code
Aurora Forum, hosted by The Linux Foundation
On 28 Feb 2019, at 17:54, luke....@hpe.com wrote:
Do you have any suggestions for temporary workarounds? We don't have an HA/multimaster setup, and this problem is deleting history, with no way for us to get it back.
Would it help to create a wrapper around git repack that saves the packfiles somewhere so we can retrieve them if jgit decides to delete them?
Or will that not help because as soon as the "old" packfiles are pulled in, jgit will mark them as invalid again?
On 1 Mar 2019, at 07:54, Luca Milanesio <luca.mi...@gmail.com> wrote:
On 28 Feb 2019, at 17:54, luke....@hpe.com wrote:
> Do you have any suggestions for temporary workarounds? We don't have an HA/multimaster setup, and this problem is deleting history, with no way for us to get it back.
Not sure that is correct: the fact that the in-memory JGit packlist is invalid means that the repository is misbehaving, true. In my case at least, the situation on the filesystem is fine, there is no history deletion.
> would it help to create a wrapper around git repack that saves the packfiles somewhere so we can retrieve them if jgit decides to delete them?
JGit's gc.prunepackexpire already does that, but that is not the problem. Even if you have plenty of valid packfiles on disk (we keep 1 day of packfile history, using gc.prunepackexpire = 1.day.ago), if JGit's in-memory list of packfiles is wrong, you still have problems.
> Or will that not help because as soon as the "old" packfiles are pulled in, jgit will mark them as invalid again?
No, I believe there is a genuine bug in JGit in the packlist synchronisation: instead of being centralised in a single well-defined place, it is scattered around the code across all the possible "failure scenarios".
In a nutshell, instead of saying "oh, one packfile isn't good, let's refresh the list", what JGit does is remove the packfile from the in-memory list and then wait for stuff to fail, which is bad.
Then, all around the code, there are conditions that reload the pack list in case of failure. I am guessing (speculation for now, I need to do further analysis) that in some places that does not happen.
One more thing: the behaviour of the trustfolderstat flag could have an impact on the story here.
If you set "core.trustfolderstat" to true, JGit won't list the files in the underlying directory unless the folder stats tell it to do so.
However, imagine that the list of pack files hasn't changed and, still, JGit has removed an "assumed invalid" packfile. Even if JGit calls the packlist refresh, it won't be executed, simply because the underlying directory hasn't changed.
In fact, we do see that the in-memory packlist self-recovers at the following JGit GC phase, where the folder stats are updated.
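The scenario described above can be played through with a toy model. To be clear, this is not JGit code: every class, field and method name below is invented for illustration, and the directory "mtime" is a plain counter. It only mimics the behaviour as described in this message (a pack dropped from the cached list, and a refresh that is skipped while the folder stats look unchanged).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: none of these names are JGit classes or methods.
class StalePackListDemo {
    final List<String> onDisk = new ArrayList<>();   // pack files in objects/pack
    final List<String> inMemory = new ArrayList<>(); // cached pack list
    long dirModified = 1;  // stand-in for the directory's mtime
    long lastScanned = 1;  // mtime seen at the last scan
    final boolean trustFolderStat;

    StalePackListDemo(boolean trustFolderStat, String... packs) {
        this.trustFolderStat = trustFolderStat;
        onDisk.addAll(Arrays.asList(packs));
        inMemory.addAll(onDisk);
    }

    // A reader decided the pack is "corrupt" and dropped it from the cache only.
    void dropFromCache(String pack) {
        inMemory.remove(pack);
    }

    // Rescan the directory, unless folder stats are trusted and look unchanged.
    boolean refresh() {
        if (trustFolderStat && dirModified == lastScanned) {
            return false; // directory "unchanged": the stale list is kept
        }
        inMemory.clear();
        inMemory.addAll(onDisk);
        lastScanned = dirModified;
        return true;
    }

    boolean canRead(String pack) {
        return inMemory.contains(pack) || (refresh() && inMemory.contains(pack));
    }

    // A later gc writes a new pack, which changes the directory's mtime.
    void gcWritesPack(String pack) {
        onDisk.add(pack);
        dirModified++;
    }

    public static void main(String[] args) {
        StalePackListDemo repo = new StalePackListDemo(true, "pack-a");
        repo.dropFromCache("pack-a");
        System.out.println("readable before gc: " + repo.canRead("pack-a")); // false
        repo.gcWritesPack("pack-b");
        System.out.println("readable after gc:  " + repo.canRead("pack-a")); // true
    }
}
```

In the model, the pack is still on disk the whole time; it only becomes readable again once something touches the directory, which matches the self-recovery at the next GC described above. Constructing the model with trustFolderStat set to false makes the first lookup succeed, because the rescan is never skipped.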
On 1 Mar 2019, at 15:37, Martin Fick <mf...@codeaurora.org> wrote:
On Wednesday, February 27, 2019 7:02:30 PM MST luke....@hpe.com wrote:
> I hate to bring this back up, but we just had a critical failure today.
> We've been running git pack-refs --all and git repack -Abd every 3 hours on
> our repos for the last month or so, with no problems. We aren't exactly
> sure what happened, but during the repack at 3 PM today, a whole bunch of
> pack files got corrupted (at least jgit seemed to think so)
Were they really corrupted? Does git think so?
ObjectLoader openPackedObject(WindowCursor curs, AnyObjectId objectId) {
    PackList pList;
    do {
        SEARCH: for (;;) {
            pList = packList.get();
            for (PackFile p : pList.packs) {
                try {
                    // (0) Throws PackMismatchException because the packfile
                    //     checksum differs from the index checksum
                    ObjectLoader ldr = p.get(curs, objectId);
                    p.resetTransientErrorCount();
                    if (ldr != null)
                        return ldr;
                } catch (PackMismatchException e) {
                    // (1) Reached when the pack checksum differs from the idx checksum
                    // Pack was modified; refresh the entire pack list.
                    // (2) It would refresh the pack list, but there are no
                    //     changes on the FS, so this returns false
                    if (searchPacksAgain(pList))
                        continue SEARCH; // (3) Not reached, so the search is interrupted
                } catch (IOException e) {
                    handlePackError(e, p);
                }
            }
            break SEARCH; // (4) Search is interrupted
        }
    } while (searchPacksAgain(pList)); // (5) Same as above, this returns false
    return null;
}
> which caused jgit to remove a lot of packs from the pack list
Just to clarify, how do you know this, messages in the logs? Or do you mean it
actually deleted them?
> - resulting in losing
> history for a ~2 hour period, but only for changes/meta it seems. This
> means we lost all votes/comments that happened within that time period. The
> only thing I can think that would cause this is the repack.
It sounds like you are saying they were removed from the internal list of
cached jgit files. If that is the case, it would be good to know more about
your specific setup.
What filesystem are you using, what version of Gerrit/
jgit, and what does the stacktrace look like from the logs when it removed a
packfile from the list. The last part is essential if jgit is to get a fix
(assuming it needs one) for such a situation,
-Martin
> Were they really corrupted? Does git think so?
> Just to clarify, how do you know this, messages in the logs? Or do you mean it
> actually deleted them?
> What filesystem are you using, what version of Gerrit/
> jgit, and what does the stacktrace look like from the logs when it removed a
> packfile from the list.
[2019-02-27 15:12:35,143] [ReplicateTo-XXX-3] WARN org.eclipse.jgit.internal.storage.file.ObjectDirectory : Pack file /home/gerrit/review_site/git/XXX.git/objects/pack/pack-f5736cf43e69c023fde03607d38dda83ad5c66f1.pack is corrupt, removing it from pack list
[2019-02-27 15:12:35,194] [ReplicateTo-XXX-2] ERROR org.eclipse.jgit.internal.storage.file.ObjectDirectory : ERROR: Exception caught while accessing pack file /home/gerrit/review_site/git/XXX.git/objects/pack/pack-7ec6213fc057e36599f41c739d30fc16e9934cd2.pack, the pack file might be corrupt, {1}. Caught {2} consecutive errors while trying to read this pack.
java.io.IOException: Unreadable pack index: /home/gerrit/review_site/git/XXX.git/objects/pack/pack-7ec6213fc057e36599f41c739d30fc16e9934cd2.idx
    at org.eclipse.jgit.internal.storage.file.PackIndex.open(PackIndex.java:102)
    at org.eclipse.jgit.internal.storage.file.PackFile.idx(PackFile.java:183)
    at org.eclipse.jgit.internal.storage.file.PackFile.get(PackFile.java:284)
    at org.eclipse.jgit.internal.storage.file.ObjectDirectory.openPackedObject(ObjectDirectory.java:486)
    at org.eclipse.jgit.internal.storage.file.ObjectDirectory.openPackedFromSelfOrAlternate(ObjectDirectory.java:444)
    at org.eclipse.jgit.internal.storage.file.ObjectDirectory.openObject(ObjectDirectory.java:435)
    at org.eclipse.jgit.internal.storage.file.WindowCursor.open(WindowCursor.java:165)
    at org.eclipse.jgit.revwalk.RevWalk.getCachedBytes(RevWalk.java:933)
    at org.eclipse.jgit.revwalk.RevCommit.parseHeaders(RevCommit.java:159)
    at org.eclipse.jgit.revwalk.PendingGenerator.next(PendingGenerator.java:147)
    at org.eclipse.jgit.revwalk.RevWalk.next(RevWalk.java:444)
    at org.eclipse.jgit.revwalk.ObjectWalk.next(ObjectWalk.java:307)
    at org.eclipse.jgit.revwalk.BitmapWalker.findObjectsWalk(BitmapWalker.java:213)
    at org.eclipse.jgit.revwalk.BitmapWalker.findObjects(BitmapWalker.java:137)
    at org.eclipse.jgit.internal.storage.pack.PackWriter.findObjectsToPackUsingBitmaps(PackWriter.java:2000)
    at org.eclipse.jgit.internal.storage.pack.PackWriter.findObjectsToPack(PackWriter.java:1795)
    at org.eclipse.jgit.internal.storage.pack.PackWriter.preparePack(PackWriter.java:914)
    at org.eclipse.jgit.internal.storage.pack.PackWriter.preparePack(PackWriter.java:864)
    at org.eclipse.jgit.internal.storage.pack.PackWriter.preparePack(PackWriter.java:786)
    at org.eclipse.jgit.transport.BasePackPushConnection.writePack(BasePackPushConnection.java:356)
    at org.eclipse.jgit.transport.BasePackPushConnection.doPush(BasePackPushConnection.java:219)
    at org.eclipse.jgit.transport.BasePackPushConnection.push(BasePackPushConnection.java:170)
    at org.eclipse.jgit.transport.PushProcess.execute(PushProcess.java:172)
    at org.eclipse.jgit.transport.Transport.push(Transport.java:1346)
    at org.eclipse.jgit.transport.Transport.push(Transport.java:1392)
    at com.googlesource.gerrit.plugins.replication.PushOne.pushVia(PushOne.java:452)
    at com.googlesource.gerrit.plugins.replication.PushOne.runImpl(PushOne.java:431)
    at com.googlesource.gerrit.plugins.replication.PushOne.runPushOperation(PushOne.java:316)
    at com.googlesource.gerrit.plugins.replication.PushOne.access$000(PushOne.java:82)
    at com.googlesource.gerrit.plugins.replication.PushOne$1.call(PushOne.java:281)
    at com.googlesource.gerrit.plugins.replication.PushOne$1.call(PushOne.java:278)
    at com.google.gerrit.server.util.RequestScopePropagator.lambda$cleanup$1(RequestScopePropagator.java:212)
    at com.google.gerrit.server.util.RequestScopePropagator.lambda$context$0(RequestScopePropagator.java:191)
    at com.google.gerrit.server.git.PerThreadRequestScope$Propagator.lambda$scope$0(PerThreadRequestScope.java:73)
    at com.googlesource.gerrit.plugins.replication.PushOne.run(PushOne.java:285)
    at com.google.gerrit.server.logging.LoggingContextAwareRunnable.run(LoggingContextAwareRunnable.java:72)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at com.google.gerrit.server.git.WorkQueue$Task.run(WorkQueue.java:646)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.FileNotFoundException: /home/gerrit/review_site/git/XXX.git/objects/pack/pack-7ec6213fc057e36599f41c739d30fc16e9934cd2.idx (No such file or directory)
    at java.io.FileInputStream.open0(Native Method)
    at java.io.FileInputStream.open(FileInputStream.java:195)
    at java.io.FileInputStream.<init>(FileInputStream.java:138)
    at org.eclipse.jgit.util.io.SilentFileInputStream.<init>(SilentFileInputStream.java:64)
    at org.eclipse.jgit.internal.storage.file.PackIndex.open(PackIndex.java:98)
    ... 43 more
I would use git to see if those files really are corrupt
On 1 Mar 2019, at 21:03, luke....@hpe.com wrote:
Update on this: I checked logs from previous days - we've been getting those errors every single repack run. Every 3 hours we've been getting those errors, but we only lost history the one time. We stopped repacking as soon as we lost history, and just recently ran a gerrit gc with no errors.
On 1 Mar 2019, at 21:33, luke....@hpe.com wrote:
Can I change the 'gerrit gc' options by modifying the [gc] section of the gerrit.config? The documentation suggests you can only modify aggressive, start time, and interval. Can I also change prunepackexpire and threads in the [gc] section?
HTH
Luca.
On 2 Mar 2019, at 22:30, thomasmulhall410 via Repo and Gerrit Discussion <repo-d...@googlegroups.com> wrote:
Does this bug only affect 2.16? Or 2.15 too?
Delaying gc for even a day has large performance impacts for us. We attempted running gc offline, but that wasn't viable since the "enumerating objects" phase of the repack was taking forever (almost 2 hours, by my calculations) because we have so many changes come in within a 24 hour period. Shutting the server down for >10 minutes is not something we can do since we're a global company and there's no time where everyone is offline.
You can alternatively make a local build on your dev environment, test and deploy it.
How did you guys manage until now? If your adoption is so widespread, that means that somehow it was working before, or am I mistaken?
On 6 Mar 2019, at 19:27, luke....@hpe.com wrote:
> You can alternatively make a local build on your dev environment, test and deploy it.
Is there documentation/instructions on how to do this? Specifically how to incorporate the local jgit build into gerrit?
> How did you guys managed until now? If your adoption is so widespread, that means that somehow it was working before, or am I mistaken?
So, we have been getting these error messages in the logs since 02/09/2019. Two important things happened on that date:
1. Gerrit was upgraded from 2.15.7 to 2.16.4.
2. We switched from running 'gerrit gc' once daily to running git plumbing commands directly - every 6 hours (then later switched to every 3 hours).
We did attempt the git plumbing commands on 2/6 (which is why this thread started in the first place), and we didn't see any errors in the logs for that day. That tells me the gerrit upgrade (not using git instead of jgit) is the reason the error messages are now showing up.
From what I can see, it happens at the end of the repack. When the repack consolidates smaller packfiles into larger ones, jgit seems to throw errors specifically for the packfiles that were consolidated, and then more errors are thrown for unreadable .idx files corresponding to those packfiles, which sounds "fine" to me, since those files were indeed removed by gc. Once gc is finished, everything is fine (most of the time).
There was the one instance where running a repack seemingly deleted history in the refs/changes/meta branches, and since then we've switched from running direct git commands every 3 hours to running gerrit gc once per day. We have not seen any issues this week aside from the errors in the logs. That's where our problem lies here. We absolutely *have* to repack, but we're seemingly rolling the dice when we do it live, and we can't afford to take down the servers because the repack takes ~1-2 hours. Our trade-off is - we run it during low traffic times.
We will stop running gc for now and monitor performance over the next couple days. If we notice significant slow downs we're stuck at that point to either: roll the dice and run gc, or try pointing everyone to the cache servers as best we can to minimize direct fetches from Gerrit (not viable for jenkins builds though, unfortunately).
The 'gerrit gc' has full support for bitmaps generation, whilst I'm not sure with the 'git plumbing', it depends on the version.
In my experience, the upgrade to v2.16 (compared to v2.15) has shown a huge improvement in the performance, not a regression.
Did you perform a ReviewDb to NoteDb migration contextual to the upgrade? Or were you on v2.15 / NoteDb already?
You need to preserve existing packfiles during GC (gc.prunepackexpire settings), otherwise you generate "hiccups" to the JGit cache.
> The 'gerrit gc' has full support for bitmaps generation, whilst I'm not sure with the 'git plumbing', it depends on the version.
We're running git 2.20.1, and it does have support for bitmap generation, which we made sure to include in our repack options.
> In my experience, the upgrade to v2.16 (compared to v2.15) has shown a huge improvement in the performance, not a regression.
> Did you perform a ReviewDb to NoteDb migration contextual to the upgrade? Or were you on v2.15 / NoteDb already?
A huge improvement in the performance of what, exactly? Because I will say that since upgrading we have seen much faster fetches when the repo gets "messy" (large number of loose objects/packfiles). However we never received these "pack file is corrupt, removing it from pack list" errors until the 2.16 upgrade. I know you said the bug has been there since 2.15, but I began seeing those errors immediately after the upgrade, and never before it.
And yes, we were on NoteDb/2.15.7 prior to the 2.16 upgrade. I believe we did the NoteDb migration/2.15 upgrade in August of 2018.
> You need to preserve existing packfiles during GC (gc.prunepackexpire settings), otherwise you generate "hiccups" to the JGit cache.
Looking up documentation on gitconfig files, I see no references to gc.prunepackexpire, so I assume this is a JGit setting. I went ahead and added the following to our gerrit user's .gitconfig anyways:
[gc]
prunepackexpire = 1.day.ago
On 7 Mar 2019, at 00:29, luke....@hpe.com wrote:
I went ahead and added the following to our gerrit user's .gitconfig anyways:
[gc]
prunepackexpire = 1.day.ago
as you suggested, and we are still seeing errors.
I know JGit is respecting the options in this file, because it's using the threads and memory that we have told it to, but it doesn't seem like JGit is respecting that specific variable, because it doesn't show up in the gc logs as a setting that JGit is using for gc. Am I doing something wrong here?
I also would like to say THANK YOU so much for the help, to everyone here. Many of you have helped me gain very valuable knowledge both for me and for my organization, and (in this case specifically) helped me have some level of understanding in an area with a whole lot of question marks. I *sincerely* appreciate the time you have taken to help solve this issue, and to help guide me in the right direction. THANK YOU!!
Have you restarted Gerrit? (if you are performing Gerrit's GC)
Turns out we still get the below errors with prunepackexpire = 1.day.ago (running gerrit gc twice daily):
The fix hasn’t been merged in Gerrit yet. Have you only done the prunepackexpire change, or have you deployed a patched JGit in your Gerrit.war?
On 24 Mar 2019, at 18:55, luke....@hpe.com wrote:
We are still getting the errors after upgrading Gerrit to 2.16.7 :( I'd like to think the errors we're seeing are unimportant, since the pack files that Gerrit is complaining about are "old" and should already be consolidated to a new pack file, but I'm afraid of losing history again, and therefore I'm afraid of going back to a normal gc schedule of multiple times per day (which we should do, given our daily activity). Any ideas?
On 25 Mar 2019, at 00:56, luke....@hpe.com wrote:
We actually get errors *before* seeing the "pack is corrupt, removing it from pack list" message. The errors we're seeing are:
...Exception caught while accessing pack file <pack file>, the pack file might be corrupt. Caught 1 consecutive errors while trying to read this pack.
java.io.IOException: Unreadable pack index <corresponding idx file>
We get a whole bunch of those errors, as well as the "pack is corrupt, removing it from pack list" messages. However, it seems "fine" that tasks find these idx files are unreadable, because they literally don't exist anymore, GC deleted them. I *do* have prunepackexpire set to 2 days ago, and I've been running 'gerrit gc' once per day.
But each time gc runs and deletes all the 2+ day old pack files, various Gerrit tasks seem to be trying to read from them concurrently, and we get the above errors. I don't know if this is a problem or not.
I'd like to think it isn't, now that we set prunepackexpire=2.days.ago. I'd *like* to think that jgit gc will take all the pack files, and consolidate them into one single pack, but just keep the older pack files around, in which case, removing or corrupting the 2 day old packs wouldn't possibly cause any problems since we still have the data in the one large pack file.
On 25 Mar 2019, at 02:18, Luca Milanesio <luca.mi...@gmail.com> wrote:
On 25 Mar 2019, at 00:56, luke....@hpe.com wrote:
> We actually get errors *before* seeing the "pack is corrupt, removing it from pack list" message. The errors we're seeing are:
> ...Exception caught while accessing pack file <pack file>, the pack file might be corrupt. Caught 1 consecutive errors while trying to read this pack.
> java.io.IOException: Unreadable pack index <corresponding idx file>
That's just logging saying that JGit can't read that pack, but it isn't a problem until you see a user-related error and a user-related stack trace associated with it.
> We get a whole bunch of those errors, as well as the "pack is corrupt, removing it from pack list" messages. However, it seems "fine" that tasks find these idx files are unreadable, because they literally don't exist anymore, GC deleted them. I *do* have prunepackexpire set to 2 days ago, and I've been running 'gerrit gc' once per day.
Yes, whenever an expired pack is removed, you'll see errors like the one above. However, because there is a newer pack available, the objects are still found.
> But each time gc runs and deletes all the 2+ day old pack files, various Gerrit tasks seem to be trying to read from them concurrently, and we get the above errors. I don't know if this is a problem or not.
Until you see a user-related stack-trace, it isn't a problem.
> I'd like to think it isn't, now that we set prunepackexpire=2.days.ago. I'd *like* to think that jgit gc will take all the pack files, and consolidate them into one single pack, but just keep the older pack files around, in which case, removing or corrupting the 2 day old packs wouldn't possibly cause any problems since we still have the data in the one large pack file.
Yes, that's the case.
> However, I don't know if it actually works like that :)
> -Luke
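For what it's worth, the grace-period idea discussed in this exchange can be sketched as a simple age check. This is a toy model under my own assumptions, not JGit's implementation: the class, the method and the pack names are invented, and the 2-day value simply mirrors the prunepackexpire=2.days.ago setting mentioned above.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model of a pack-expiry grace period; not JGit's implementation.
class PackExpiryDemo {
    // Returns the superseded packs whose age exceeds the grace period.
    static List<String> prunable(Map<String, Instant> supersededPacks,
                                 Instant now, Duration grace) {
        Instant cutoff = now.minus(grace);
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Instant> e : supersededPacks.entrySet()) {
            if (e.getValue().isBefore(cutoff)) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2019-03-25T00:00:00Z");
        Map<String, Instant> old = new LinkedHashMap<>();
        old.put("pack-old", now.minus(Duration.ofDays(3)));
        old.put("pack-recent", now.minus(Duration.ofHours(6)));
        // Only the pack older than the 2-day grace period is selected.
        System.out.println(prunable(old, now, Duration.ofDays(2))); // [pack-old]
    }
}
```

The point of the model is only that a pack superseded a few hours ago stays on disk for readers that still hold it in their cached list, while older ones become eligible for deletion.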
On Sunday, March 24, 2019 at 2:27:43 PM UTC-7, lucamilanesio wrote:
On 24 Mar 2019, at 18:55, luke....@hpe.com wrote:
> We are still getting the errors after upgrading Gerrit to 2.16.7 :( I'd like to think the errors we're seeing are unimportant, since the pack files that Gerrit is complaining about are "old" and should already be consolidated to a new pack file, but I'm afraid of losing history again, and therefore I'm afraid of going back to a normal gc schedule of multiple times per day (which we should do, given our daily activity. Any ideas?
Just to clarify, the message "pack is corrupt, removing it from pack list" will continue to exist even after the patch: the "corrupt" wording is misleading, it really means that the in-memory cache is stale.
So, don't worry, that is only a false alarm.
The problem was really what happened *after* that message, a series of "missing unknown" for me ... and that problem after the fix is 100% gone.
What do you see instead as error messages *after* the message?
You received this message because you are subscribed to a topic in the Google Groups "Repo and Gerrit Discussion" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/repo-discuss/v-2YGmmeKE4/unsubscribe.
To unsubscribe from this group and all its topics, send an email to repo-discuss...@googlegroups.com.
On 25 Mar 2019, at 02:18, Luca Milanesio <luca.m...@gmail.com> wrote:
Luca.
On 3 May 2019, at 18:52, luke....@hpe.com wrote:

Hi Luca,

Was this issue ever fully resolved? Since we began running jgit gc outside of Gerrit (i.e. not using Gerrit gc) we've been seeing this same issue with some of our smaller, less active repos. We are currently on 2.16.7; is the issue fully resolved in 2.16.8?
Thanks,
Luca.
On 21 May 2019, at 16:24, Luke Engle <luke....@hpe.com> wrote:

Any update on this? It looks like the review has gone stale. I need to upgrade to 2.16.8, but don't want to include the 2.5s slowdown, and I'd also rather not fork jgit. Any way this can be prioritized?
On 22 May 2019, at 06:53, Matthias Sohn <matthi...@gmail.com> wrote:

I can move this patch series to stable-5.1 if this helps for 2.16.
Could someone review it?
Is this considered fixed now?
Hi Doug,

The issue I was seeing with repo corruption was due to running git gc (separate from Gerrit) with nothing in place to preserve pack files before gc went and removed them. This caused issues with in-flight changes in Gerrit while GC was running. This was due to our lack of understanding of the proper way of garbage collecting repos on a live Gerrit server, not due to any bug in jgit.

AFAIK you should not see repo corruption due to the jgit bug discussed in this thread, and you will not experience repo corruption if you have a proper gc process.
Thank you, Luke and Matthias!
Still working on the series, this is more complex than expected.
But I think we are coming closer to a solution.