Hi,
Tentative short story:
Could repo be made smart enough not to download the same (and big) git bundle multiple times in a row?
Long story:
I performed various repo --trace sync experiments in order to locate "hotspots" in the initial, full repo checkout and see if it could be made faster. Think a fully scripted build from scratch.
First I naively tried repo sync --current-branch --no-tags. This made me realize that, for the very first sync, repo sync uses git bundles in many places.
So next I tried repo sync --no-clone-bundle --current-branch --no-tags. It saved a bit of time, but not much. It looks like the git bundle optimizations (whatever they are) are pretty good at offsetting the cost of including all tags and branches.
Bundles or not, and unless you're building the browser (who does that in os-dev? :-), the third_party/kernel/vX.Y checkouts take the lion's share. Not a surprise.
Now what came as a surprise is repo downloading the same 1.3G
https://chromium.googlesource.com/chromiumos/third_party/kernel/clone.bundle FIVE TIMES. Once per kernel. More disappointing: repo does not even download the five copies in parallel but *consecutively*. I guess it's because they all eventually share the same .repo/project-objects/chromiumos/third_party/kernel.git backend and repo knows that from the manifest. The shared backend is a great optimization for disk space and incremental repo sync, but it seems to hurt the initial repo sync fairly badly.
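To make the idea concrete, here is a rough sketch in Python of what I mean. It is not repo's actual code: the function name and the (name, path) tuples are made up for illustration, and I am only assuming that the bundle URL is the project name plus /clone.bundle, as in the URL above. It groups the manifest projects by name so that each bundle would be fetched once, however many checkouts share it.

```python
from collections import defaultdict


def plan_bundle_downloads(base_url, projects):
    """Map each clone.bundle URL to the checkout paths that need it.

    `projects` is an iterable of (name, path) pairs, as they would
    appear in the manifest. Hypothetical helper, not part of repo.
    """
    plan = defaultdict(list)
    for name, path in projects:
        # Assumption: bundle URL = server + project name + /clone.bundle
        url = "%s/%s/clone.bundle" % (base_url.rstrip("/"), name)
        plan[url].append(path)
    return dict(plan)
```

With five kernel checkouts that all have the name chromiumos/third_party/kernel, the plan would hold a single URL for them, so the 1.3G file would only need to be downloaded once.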
One workaround could be to hack the manifest before the repo sync and remove all but the desired kernel. That would be temporary and not very pretty.
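A minimal sketch of that manifest hack, assuming the kernels show up as ordinary <project> elements with name="chromiumos/third_party/kernel" and differ only in their path attribute (the function name and the path passed in are hypothetical, and I have not tried this against a real manifest):

```python
import xml.etree.ElementTree as ET

# Project name shared by all kernel checkouts (taken from the bundle URL).
KERNEL_NAME = "chromiumos/third_party/kernel"


def keep_one_kernel(manifest_xml, keep_path):
    """Return the manifest text with every kernel project removed
    except the one whose path attribute equals keep_path."""
    root = ET.fromstring(manifest_xml)
    # Copy the list first: removing children while iterating is unsafe.
    for project in list(root.findall("project")):
        is_kernel = project.get("name") == KERNEL_NAME
        if is_kernel and project.get("path") != keep_path:
            root.remove(project)
    return ET.tostring(root, encoding="unicode")
```

It only looks at <project> elements directly under <manifest>, and it would not preserve comments or the original formatting of the file, which is part of why I call it not very pretty.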
Comments, ideas, thoughts, encouragements, pointers to source, documentation,...? Thanks!
Marc