On 2022-02-06 01:46, David Ostrovsky wrote:
> MartinFick schrieb am Mittwoch, 18. August 2021 um 22:59:27 UTC+2:
>
>> ------------- PERFORMANCE OBJECTIVE -------------
>> As mentioned in the previous thread, our objective is to perform the
>> entire
>> 2.7-3.4+ upgrade in under 4 hours. While many improvements are still
>> needed
>> to get there, this is a huge leap towards that goal, and we believe we
>> still
>> have many things we can potentially improve. The current full
>> upgrade timeline
>> for the following 3 phases, for us, looks like:
>>
>> * PHASE 1 - Upgrade from 2.7 to 2.16 (schema migrations)
>> -> ~3.5 hours
>>
>> * PHASE 2 - NoteDB migration
>> -> ~2.5 hours + 1 hour to repack All-Users (we hope to inline this in
>> the
>> migration soon)
>>
>> * PHASE 3 - Upgrade from 2.16 to 3.4+ (schema migrations + indexing)
>>
>> -> ~22hrs
>
> Can you break down the numbers in PHASE 3, and also clarify how often
> you do reindexing? If it's something: 1 hour schema migrations + 21
> hrs
> reindexing at 3.4+ only,
These numbers are actually quite out of date now; we did get the
opportunity to work on both the schema migrations and the indexing.
The indexing is indeed the bulk of the upgrade from 2.16 to 3.5
(our latest target). The schema migrations now take only a few
minutes, fortunately not even close to an hour! We gave a
presentation at the last user summit with more up-to-date numbers,
and they were much better: indexing is quite fast now, around 2
hours! Since the numbers above are very much out of date, and not
everyone has watched our presentation (which is itself a bit out of
date now), I will try to get another email out soon summarizing our
latest results, but we are still improving them!
The indexing could use some fixes, though, because 3.5 can't seem
to handle the old format of the auto-merge refs properly; Kaushik
is working on fixes for that. We are also exploring the ES approach
now, since our IT would like to use it, and our analysis suggests
that ES provides read-after-write consistency, which the current
Lucene approach doesn't seem to. We are working on a fix for
indexing that disables that consistency during offline reindexing,
since it isn't needed then; this seems to bring the ES
implementation up to speed with the Lucene implementation, to the
point that neither is now the bottleneck for indexing. Reading the
git data from the repos currently seems to be the bottleneck.
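For anyone following along, the offline reindex step we're tuning looks
roughly like this (a sketch only; the site path, ES endpoint, and index
name are placeholders, and the refresh_interval setting is the ES knob we
believe corresponds to the read-after-write consistency mentioned above):

```shell
# Sketch of the offline reindex step. SITE, ES, and the index name
# below are hypothetical; adjust for your installation.
SITE=/path/to/gerrit-site      # hypothetical site path
ES=http://localhost:9200       # hypothetical ES endpoint

# Turn off near-real-time refresh while bulk indexing; refresh only
# matters for read-after-write visibility, which an offline reindex
# does not need:
curl -s -XPUT "$ES/gerrit_changes/_settings" \
  -H 'Content-Type: application/json' \
  -d '{"index": {"refresh_interval": "-1"}}'

# Run the standard Gerrit offline reindex:
java -jar gerrit-3.5.war reindex -d "$SITE"

# Restore the default refresh behavior afterwards:
curl -s -XPUT "$ES/gerrit_changes/_settings" \
  -H 'Content-Type: application/json' \
  -d '{"index": {"refresh_interval": null}}'
```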
> have you considered delta reindexing
> approach:
> backup the prod data, perform full migration on staging machine. Copy
> index directory to production site, skip offline reindex step and
> perform
> online reindexing of changes that changed during migration process.
Thank you for the suggestion, David. We really want to avoid this
approach, as it is more complicated and potentially error-prone.
We will, however, likely use a similar approach to at least
pre-populate Gerrit's persistent caches, since that is a very
simple thing to do, hard to get wrong, and thus unlikely to pick
up out-of-date info accidentally. We really want to make this fast
and easy for everyone!
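As a sketch of what that cache pre-population could look like (host and
paths below are hypothetical; Gerrit's persistent caches are the H2 files
under the site's cache/ directory):

```shell
# Sketch: pre-populate the production site's persistent caches from a
# staging site that already ran the upgrade. STAGING and SITE are
# hypothetical placeholders.
STAGING=staging-host:/path/to/staging-site   # hypothetical
SITE=/path/to/prod-site                      # hypothetical

# Copy only while Gerrit is stopped on both sides, so the H2 cache
# files are in a consistent state:
rsync -av "$STAGING/cache/" "$SITE/cache/"
```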