What is "Resolving deltas" actually doing during a fetch?

Zachary Turner

Nov 13, 2013, 7:57:16 PM
to Chromium-dev
When I run fetch chromium, I get lots of lines of output that look like this:

Resolving deltas:  92% (1819822/1959453)
Resolving deltas:  93% (1822353/1959453)
Resolving deltas:  93% (1823617/1959453)

What is actually happening here?  A quick look at Task Manager shows it's running the following commands:

git index-pack --stdin -v --fix-thin "--keep=fetch-pack 13400 on zturner-win81" --pack_header=2,2458340

git fetch-pack --stateless-rpc --stdin --lock-pack --thin https://chromium.googlesource.com/chromium/blink.git

The problem is that this appears to be an n^2 operation, and past about the 1.2 millionth delta it gets extremely slow.  Is this just an inherent limitation of git, or is there anything we can do to remove the apparent n^2 behavior?

Yuta Kitamura

Nov 13, 2013, 8:12:53 PM
to ztu...@chromium.org, Chromium-dev
AFAIK this is because:
- git-on-Windows is slow (partly because I/O on Windows is slow?)
- the repository is so huge (especially Blink)
- delta sizes are not uniformly distributed (some deltas are very large, some not)

I don't think git has O(n^2)-time behavior, because if it did, delta resolution would take a prohibitively long time.

Thanks,
Yuta


--
Chromium Developers mailing list: chromi...@chromium.org
View archives, change email options, or unsubscribe:
http://groups.google.com/a/chromium.org/group/chromium-dev

Dirk Pranke

Nov 13, 2013, 8:52:54 PM
to Yuta Kitamura, ztu...@chromium.org, Chromium-dev
You would see the same "resolving deltas" messages on mac/linux. Windows is just slower :( . I believe the infra team is working on things that will hopefully help here. Talk to iannucci@ for more info.

-- Dirk

Daniel Bratell

Nov 14, 2013, 2:41:21 AM
to Chromium-dev, Zachary Turner
On Thu, 14 Nov 2013 01:57:16 +0100, Zachary Turner <ztu...@chromium.org>
wrote:
I think it gets slow at the end because that is when it works on the
big deltas it could not complete quickly earlier. We can call them the
tail of monster deltas.

/Daniel

Daniel Bratell

Nov 14, 2013, 2:43:05 AM
to Yuta Kitamura, Dirk Pranke, ztu...@chromium.org, Chromium-dev
On Thu, 14 Nov 2013 02:52:54 +0100, Dirk Pranke <dpr...@chromium.org>
wrote:

> You would see the same "resolving deltas" messages on mac/linux. Windows
> is just slower :( . I believe the infra team is working on things that
> will hopefully help here. Talk to iannucci@ for more info.

That is nice news. The Windows git people (assuming it's more than 1
person nowadays) have long requested help to make git on Windows better.

/Daniel

Primiano Tucci

Nov 14, 2013, 4:43:12 AM
to bra...@opera.com, Yuta Kitamura, Dirk Pranke, ztu...@chromium.org, Chromium-dev
Silly question: did you change your deltaBaseCacheLimit?
(e.g. git config --global core.deltaBaseCacheLimit 2G)
That might make a substantial difference when repacking.


Torne (Richard Coles)

Nov 14, 2013, 6:07:20 AM
to prim...@chromium.org, Daniel Bratell, Yuta Kitamura, Dirk Pranke, ztu...@chromium.org, Chromium-dev
As well as setting deltaBaseCacheLimit, you should make sure you're using a current version of git; there was a period where git verified downloaded objects in an extremely paranoid and expensive way that took many times longer than normal delta resolution. I forget exactly which versions had that issue, but 1.8.4.x is current and is definitely okay :)

It is kinda a case of O(n^2); the problem is that near the end of the process, some of the objects being resolved are delta compressed against earlier objects which are delta compressed against earlier objects which are delta compressed against earlier objects... repeat for many levels. The core.deltaBaseCacheLimit setting helps with this by making git more willing to cache old objects in memory while resolving deltas, but the full uncompressed size of every object in the repository is pretty massive and so even a large cache can't hold everything. Whenever the cache misses it has to go hunt down and decompress every object in the chain from the start. It gets slower because the later in the pack you are, the longer chains can be.
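
To make the cache-miss effect concrete, here is a toy sketch in Python. It is not git's actual index-pack algorithm: the chain layout, the level-by-level resolve order, and a cache that counts objects (the real core.deltaBaseCacheLimit is a byte limit) are all assumptions made for illustration, and resolve_cost is a hypothetical helper. It only counts how many deltas have to be re-applied when a needed base is or is not still in the cache.

```python
from collections import OrderedDict


def resolve_cost(num_chains, depth, cache_size):
    """Count delta applications needed to resolve every object.

    Toy model (an assumption, not git's real behavior): there are
    `num_chains` independent chains, each a full base object followed by
    `depth` deltas, and each delta is stored against the previous object
    in its chain. Objects are resolved level by level across chains, so
    consecutive lookups rarely share a base.
    """
    cache = OrderedDict()  # LRU cache of reconstructed objects
    applied = 0

    def remember(key):
        cache[key] = True
        cache.move_to_end(key)
        while len(cache) > cache_size:
            cache.popitem(last=False)  # evict the least recently used entry

    for level in range(1, depth + 1):
        for chain in range(num_chains):
            # Walk back until a cached ancestor is found or the full base
            # object (level 0) is reached.
            start = level - 1
            while start > 0 and (chain, start) not in cache:
                start -= 1
            if start > 0:
                cache.move_to_end((chain, start))  # mark as recently used
            # Re-apply every delta between that ancestor and the target.
            for step in range(start + 1, level + 1):
                applied += 1
                remember((chain, step))
    return applied


# 4 chains of 5 deltas each, i.e. 20 delta objects in total.
print(resolve_cost(4, 5, cache_size=1))    # 60: every lookup re-walks its chain
print(resolve_cost(4, 5, cache_size=100))  # 20: every base is still cached
```

In this model a cache that always hits costs one delta application per object, while a cache that always misses costs 1 + 2 + ... + depth per chain, which grows quadratically with chain depth. That is the "kinda O(n^2)" shape, and it is why the slowdown shows up late, when the chains are longest.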

Blink's horrendous deltas are basically caused by LayoutTests. Large numbers of big, non-static binary files are git's nemesis. :/


--
Torne (Richard Coles)
to...@google.com

Zachary Turner

Nov 14, 2013, 12:29:44 PM
to Daniel Bratell, Yuta Kitamura, Dirk Pranke, Chromium-dev, Scott Graham
fwiw, scottmg@ has a patch that makes git status significantly faster on Windows, but he says it's too difficult to upstream :(