Migrating data to schema 154 ... =====> 18:09
Collecting accounts: 93681 =====> 19:02
Counting objects: 3867448
Finding sources: 100% (3867448/3867448)
Getting sizes: 100% (1873509/1873509)
Compressing objects: 100% (2038704/2038704)
Writing objects: 100% (3867448/3867448)
Prune loose objects also found in pack files: 100% (259/259)
Prune loose, unreferenced objects: 100% (259/259) =====> 23:55
> Done (39499.524 s)
All-Users repo is about 8gb in size.
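For a sense of scale, the reported duration of 39499.524 s can be converted to hours and minutes. This is a minimal shell sketch; the function name `fmt_duration` is made up for illustration and is not part of Gerrit.

```shell
# Convert a duration in seconds to hours/minutes/seconds.
# fmt_duration is a hypothetical helper, not a Gerrit tool.
fmt_duration() {
  secs=${1%.*}   # drop any fractional part, e.g. 39499.524 -> 39499
  printf '%dh %02dm %02ds\n' $((secs / 3600)) $((secs % 3600 / 60)) $((secs % 60))
}

fmt_duration 39499.524   # prints "10h 58m 19s"
```

In other words, the schema 154 step alone ran for roughly 11 hours.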
On Feb 4, 2022, at 10:54 AM, Nguyen Tuan Khang Phan <phan....@gmail.com> wrote:
On Friday, February 4, 2022 at 11:57:55 AM UTC-5 Matthias Sohn wrote:
On Fri, Feb 4, 2022 at 5:07 PM Nguyen Tuan Khang Phan <phan....@gmail.com> wrote:
Hi,
We just did a test upgrade from 2.14 to 2.16 as a step toward 3.4. During the 2.16 upgrade, the step "Migrating data to schema 154 ..." was dramatically slower than the others: it took 39499.524 s, whereas the other steps usually take less than a second. We had not even started indexing yet.
Migrating data to schema 154 ... =====> 18:09
Collecting accounts: 93681 =====> 19:02
Counting objects: 3867448
Finding sources: 100% (3867448/3867448)
Getting sizes: 100% (1873509/1873509)
Compressing objects: 100% (2038704/2038704)
Writing objects: 100% (3867448/3867448)
Prune loose objects also found in pack files: 100% (259/259)
Prune loose, unreferenced objects: 100% (259/259) =====> 23:55
> Done (39499.524 s)
All-Users repo is about 8gb in size.
> Did you use the latest release 2.16.28 for the upgrade?
> -Matthias
We were using the tip of the current 2.16. Our repos are hosted on NFS; GC on All-Users usually takes around 4 hours.
--
--
To unsubscribe, email repo-discuss...@googlegroups.com
More info at http://groups.google.com/group/repo-discuss?hl=en
---
You received this message because you are subscribed to the Google Groups "Repo and Gerrit Discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to repo-discuss...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/repo-discuss/f29d46a5-eec4-43a2-b6e2-bbd699d990fdn%40googlegroups.com.
repos/All-Users.git|loose_ref_dirs: 363
repos/All-Users.git|all_refs: 233218
warning: garbage found: ./objects/pack/preserved
repos/All-Users.git|count: 2510466
repos/All-Users.git|size: 9448008
repos/All-Users.git|in-pack: 2060672
repos/All-Users.git|packs: 1
repos/All-Users.git|size-pack: 505063
repos/All-Users.git|prune-packable: 0
repos/All-Users.git|garbage: 1
repos/All-Users.git|size-garbage: 4
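Output in the `repo|key: value` form shown above can be scanned for repositories that look overdue for GC. The sketch below assumes that `count` and `packs` carry the same meaning as in `git count-objects -v` (loose objects and pack files), and it uses thresholds of 400 loose objects and 40 packs, the gc-conductor values mentioned in this thread. The function name is hypothetical.

```shell
# Read "repo|key: value" lines on stdin and list repositories whose
# loose-object or pack count exceeds the given thresholds.
# flag_repos_needing_gc is a hypothetical helper for illustration.
flag_repos_needing_gc() {
  # $1 = max loose objects (default 400), $2 = max packs (default 40)
  awk -F'[|]' -v max_loose="${1:-400}" -v max_packs="${2:-40}" '
    NF == 2 {                      # skip lines without "repo|..." (e.g. warnings)
      split($2, kv, ": ")
      if (kv[1] == "count") loose[$1] = kv[2] + 0
      if (kv[1] == "packs") packs[$1] = kv[2] + 0
      seen[$1] = 1
    }
    END {
      for (repo in seen)
        if (loose[repo] > max_loose || packs[repo] > max_packs)
          printf "%s loose=%d packs=%d\n", repo, loose[repo], packs[repo]
    }' | sort
}
```

Fed the All-Users numbers above (2510466 loose objects, 1 pack), it would flag the repository on the loose-object count alone.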
[core]
repositoryformatversion = 0
filemode = true
bare = true
trustfolderstat = false
trustfilestat = true
logAllRefUpdates = true
[remote "origin"]
url = .../All-Users.git
fetch = +refs/*:refs/*
mirror = true
[gc]
prunePackExpire = 10.minutes.ago
pruneExpire = 1.week.ago
autoDetach = false
[receive]
autogc = false
[repack]
packKeptObjects = true
On Feb 14, 2022, at 9:26 AM, Nguyen Tuan Khang Phan <phan....@gmail.com> wrote:
We made another attempt this weekend, and it took around 45 hours to reach stable-2.16. The upgrade log is below. It also includes some timestamps showing that the run continued into a second day.
All-Users config:
[core]
repositoryformatversion = 0
filemode = true
bare = true
trustfolderstat = false
trustfilestat = true
logAllRefUpdates = true
[remote "origin"]
url = .../All-Users.git
fetch = +refs/*:refs/*
mirror = true
[gc]
prunePackExpire = 10.minutes.ago
pruneExpire = 1.week.ago
autoDetach = false
On Feb 15, 2022, at 12:30 PM, Kaushik Lingarkar <kaus...@codeaurora.org> wrote:
We have seen performance degradation with trustFolderStat set to false. Can you try a run with it set to true during the upgrade process? You should probably do this for *all* your repositories, as it also affects NoteDb migration performance.
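Applying that setting to every repository could look roughly like the sketch below. It assumes all bare repositories live under one root directory and have names ending in `.git`; the function name and the example path are hypothetical. It relies only on `git config --file`, which edits a config file in place.

```shell
# Set core.trustFolderStat = true in the config of every bare repository
# found under a root directory. Assumes repositories are directories
# named *.git; set_trust_folder_stat is a hypothetical helper.
set_trust_folder_stat() {
  # $1 = root directory that holds the bare repositories
  find "$1" -type d -name '*.git' -prune -print | while IFS= read -r repo; do
    git config --file "$repo/config" core.trustFolderStat true
  done
}

# Example (path is an assumption):
#   set_trust_folder_stat /path/to/gerrit/git
```

Git config variable names are case-insensitive, so an existing `trustfolderstat = false` entry is overwritten rather than duplicated.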
On Mar 2, 2022, at 12:02 PM, Nguyen Tuan Khang Phan <phan....@gmail.com> wrote:
The upgrade from stable-2.14 to stable-2.16 took approximately 5 hours, plus 2 hours of GC on All-Users prior to the upgrade.
The NoteDb migration took 28 hours. The first attempt failed with heap memory exhaustion.
The command used:
java -jar gerrit.war migrate-to-note-db --shuffle-project-slices --reindex=false
To cap memory usage:
java -Xmx256g -jar gerrit.war migrate-to-note-db --shuffle-project-slices --reindex=false
The next step for us is to upgrade to 3.1 and reindex groups, accounts and changes:
java -Xmx256g -jar gerrit.war reindex --threads 56 --index <what_to_index>
Is it correct so far? Is there a way to speed up the migration?
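The reindex step above, repeated for each of the three indexes named, could be scripted along these lines. This sketch only prints the commands so they can be reviewed first. The `-d` site path is not in the original command and is an assumption about how the site is located; the path `/srv/gerrit` and the function name are hypothetical.

```shell
# Print one reindex command per index (groups, accounts, changes).
# Dry run only: nothing is executed. print_reindex_commands is a
# hypothetical helper; the site path is an assumption.
print_reindex_commands() {
  # $1 = Gerrit site path, $2 = thread count, $3 = JVM heap cap (e.g. 256g)
  for index in groups accounts changes; do
    printf 'java -Xmx%s -jar gerrit.war reindex -d %s --threads %s --index %s\n' \
      "$3" "$1" "$2" "$index"
  done
}

print_reindex_commands /srv/gerrit 56 256g
```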
On Mar 2, 2022, at 2:29 PM, Nguyen Tuan Khang Phan <phan....@gmail.com> wrote:
> That’s a great improvement! Did you apply the trustFolderStat = true update to all repositories or only All-Users? It will greatly impact performance for all repositories, especially when you run migrate-to-note-db.
We applied trustFolderStat = true to all repositories.
> If you’re using 56 threads for reindex, maybe you want to use --threads 56 for migrate-to-note-db too?
I see that ISSUE_8022_THREAD_LIMIT was used. Are you suggesting that we use more than 4 threads? What number did you try?
> We found that using --shuffle-project-slices degraded performance for us. Maybe try without that?
We will try that once we reach 3.1.
> How many changes and projects total do you have for this instance? If you aren’t sure of the gc status of all the repos, it might be worth running that git-stats.sh script on all the repos to see if you need to GC everything before starting the upgrade (which can happen while the server is still online).
This instance is a copy of production. So, a lot :)
On Mar 3, 2022, at 9:33 AM, Nguyen Tuan Khang Phan <phan....@gmail.com> wrote:
> We have around 60K projects and 1 million changes
I dropped a 0; it's 10 million.
On Mar 3, 2022, at 11:10 AM, Nguyen Tuan Khang Phan <phan....@gmail.com> wrote:
During the last community meeting, the topic of using a populated disk cache to speed up reindexing was brought up. How much disk cache is needed for a 1 TB Gerrit instance? Is it 2 TB?
We want an approximation so that we can order hardware for the upcoming upgrade.
On Mar 3, 2022, at 11:55 AM, Nguyen Tuan Khang Phan <phan....@gmail.com> wrote:
> What are your gc-conductor packed/loose config values set to? I also noticed gc-conductor doesn’t have an evaluation value for loose refs nor a way of only packing refs as it always does a full aggressive gc. If you run git-stats on your repos and see many loose refs for repos with few loose/packed objects, adding that could be beneficial as it would be much less expensive.
I believe it's set to 400. Do you suggest running gc on all of the projects prior to migration?
> Another overall strategy I failed to mention earlier but that we found incredibly useful (and shared in our summit talk) was to narrow down performance problems using subsets of production data so that we could have faster iterations. For example, you could pick a set of repos that are about 5% of your total changes, then in your test area, remove data for anything else (including from the database tables). It makes it much less costly to do testing where you only modify one variable at a time and then you get much higher confidence in which modifications are important. Once you see improvement with your modification that you can project would meet your target goal for all data, increase the size of your subset. We started with a subset that only had 2 repos (plus All-Projects and All-Users) and ~150k changes, then 8 repos and ~400k changes, then 50 repos and ~850k changes. Each time we wanted to try a new idea, we evaluated what was the smallest subset where we thought we could see an improvement and used that for initial testing.
We are currently experimenting with deleting very old changes and abandoned changes.
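The subset strategy quoted above (pick repositories that together hold about 5% of all changes) could be sketched as a simple greedy selection. This is only an illustration under stated assumptions: the input format of one `<repo> <change_count>` pair per line is hypothetical, how those counts are obtained is not shown, and the largest-first rule is one possible choice, not the method used by the poster.

```shell
# Greedy sketch: walk repositories from most to fewest changes and keep
# each one that still fits within the target share of total changes.
# pick_subset and its input format are hypothetical.
pick_subset() {
  # stdin: "<repo> <change_count>" per line; $1 = target percent (default 5)
  sort -k2,2nr | awk -v pct="${1:-5}" '
    { repo[NR] = $1; n[NR] = $2; total += $2 }
    END {
      target = total * pct / 100
      for (i = 1; i <= NR; i++)
        if (picked + n[i] <= target) {   # skip repos that would overshoot
          picked += n[i]
          print repo[i], n[i]
        }
    }'
}
```

Raising the percentage argument between test rounds mirrors the idea of growing the subset once a modification shows an improvement.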
On Mar 3, 2022, at 12:38 PM, Przemyslaw Waliszewski <pwalis...@gmail.com> wrote:
Hi, we are using these GC settings:
packed = 40
loose = 400
We are running the upgrade on a staging environment with a snapshot from last week. There are no new changes before the upgrade, so the gc queue is empty.
You wrote:
"best offline reindex performance, you want to be using at least Gerrit v3.2.14, and ideally v3.3.9, because of all the improvements made in these changes."
Can we safely cherry-pick those changes to 3.1 and use them:
a) only during the first reindex after the upgrade?
b) during the upgrade and in production?
c) or is that too risky because there are too many changes between 3.1 and 3.2.14?
Regarding running gc in aggressive mode: we are introducing a change to gc-conductor that will run gc in non-aggressive mode. Does that mean we should also modify the evaluation part? More details here: https://gerrit-review.googlesource.com/c/plugins/gc-conductor/+/329363
On Thursday, March 3, 2022 at 8:03:03 PM UTC+1 nas...@codeaurora.org wrote:
On Mar 3, 2022, at 11:55 AM, Nguyen Tuan Khang Phan <phan....@gmail.com> wrote:
> I believe its set to 400. Do you suggest running gc on all of the projects prior to migration?
I think the defaults of 40 packs and 400 loose objects are fine. If you mean running gc as part of migration downtime, no, I think that would add a lot to your overall downtime. I would check that your gc conductor queue is being emptied. If gc isn’t keeping up, it doesn’t matter what your config is set to.
> We are currently experimenting with deleting very old changes and abandoned changes.
Note, I was only talking about for test purposes. The only changes we’ve removed from production are ones that the migration testing showed to be broken.
What specifically is problematic?
a) Small number of refs per repository (<<100k) but a lot of repositories
b) Large number of refs per repository (>>100k to Millions)
Small update on our side regarding the upgrade. We were able to init to 2.16 in 2 hours this time; we forgot the --no-reindex option last time. However, the NoteDb migration still took a long time: 22 hours. We had more threads available as well (100 threads), so the improvement was around 6 hours, down from 28. The reindex on 3.4 with disk cache took 5 hours. Is it possible to speed up the NoteDb migration even more?
We are using PostgreSQL as well.
We have around 24 GB of RAM; however, the cache size for diffs is around 41 GB.
Did you host your DB on SSD as well?
Is your test instance using NFS for storage as well?
Can you provide a full stack trace of these NPEs?
Did they start occurring after updating to 3.4, or already after one of the earlier steps?