Our repository at work is similar, with well over 100,000 files, but around 3 GB in size. We use msysgit and Git Extensions, and the commit dialog takes about 5 seconds to load (apart from the first time, when it takes about 15-20 seconds). It's 100 times faster than our old SVN repo. Yes, git looks at everything in the repository, not just the current directory - that's what makes git fast. This whole-repository view is central to how git works - I'd recommend the ProGit book - http://progit.org/book
So, here are some thoughts:
- it doesn't matter where your server is; commit is a local action
- with so many files, you're relying heavily on OS caching - if you've got less than 4 GB of RAM or are tight on memory, the OS may need to throw the cache away.
- make sure your .gitignore is ignoring everything you're not interested in. Entries with a slash at the end ignore the whole directory; ignoring bin/ and obj/, for example, will help.
- having lots of changed files will slow it down, especially if the changed files are binary.
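To make the trailing-slash point concrete, here is a rough Python sketch of how entries like bin/ and obj/ end up excluding whole directory trees. It is a deliberately simplified model, not git's real matcher: it only handles patterns of the form "name/" with no other slash, and the function name `ignored_by_dir_patterns` is just something I made up for illustration.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def ignored_by_dir_patterns(path, patterns):
    """Return True if a parent directory of `path` matches a "name/" pattern.

    Simplified model (an assumption, not git's full rules): only patterns
    that end in "/" and contain no other slash are considered. Such a
    pattern matches a directory of that name at any depth, and everything
    below that directory counts as ignored.
    """
    # Every component except the last one is a directory above the file.
    dir_names = PurePosixPath(path).parts[:-1]
    for pattern in patterns:
        if pattern.startswith(("!", "#")):
            continue  # negations and comments are outside this sketch
        if not pattern.endswith("/") or "/" in pattern[:-1]:
            continue  # anchored or non-directory patterns are outside this sketch
        name = pattern[:-1]
        if any(fnmatchcase(d, name) for d in dir_names):
            return True
    return False


if __name__ == "__main__":
    patterns = ["bin/", "obj/"]
    for p in ["src/app/bin/Debug/app.dll", "src/app/Program.cs", "tools/bin"]:
        print(p, ignored_by_dir_patterns(p, patterns))
```

For a real answer about a specific path, `git check-ignore -v <path>` will tell you which .gitignore entry (if any) applies.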
"Thousands of individual files that have changes" - as you probably don't actually work on thousands of files for a single commit, this sounds like it's the problem. Keeping your "tree clean" is a really good idea. Work with branches, make sure you've got everything committed somewhere. In Git, there is no concept of "I can't commit that yet".
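If you want to see how dirty the tree really is before opening the commit dialog, a small sketch like the following can summarise the output of `git status --porcelain`. It assumes the version 1 porcelain format, where each line starts with a two-character status code followed by a space and the path; the helper name `summarize_porcelain` is hypothetical.

```python
from collections import Counter


def summarize_porcelain(output):
    """Count entries per two-character status code.

    `output` is the text printed by `git status --porcelain` (v1 format).
    The status code is kept verbatim, including spaces, so " M" (modified
    in the working tree) and "M " (modified in the index) stay distinct.
    """
    counts = Counter()
    for line in output.splitlines():
        if len(line) < 4:
            continue  # too short to hold "XY <path>"
        counts[line[:2]] += 1
    return counts


if __name__ == "__main__":
    sample = " M src/a.cs\n M src/b.cs\n?? notes.txt\nA  new.cs\n"
    for status, count in summarize_porcelain(sample).items():
        print(repr(status), count)
```

In practice you would feed it something like `subprocess.run(["git", "status", "--porcelain"], capture_output=True, text=True).stdout`. If the totals run into the thousands, that is a strong hint the working tree needs cleaning up or more ignore entries.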
When you open the commit dialog, git scans all the files for a timestamp that differs from the one it has recorded (plus various other characteristics of changed files). When it finds one, it builds a hash of the file and compares it to the one in the repository to see if the file has really changed. If it has to build a hash for thousands of files, that will slow it down. If they're large binary files, that's even worse. Git isn't great at binary files; this is probably the only point where SVN often scores better than git. If you've got a heavily binary repository, you may need to rethink (or possibly ask why you have so many binaries checked in). If the binaries don't change often, it's not usually a problem (we have many, many images and many DLLs/libs from external sources checked in, but they very rarely change).
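The "cheap check first, hash only if needed" idea can be sketched in Python roughly as below. This is an illustration, not git's actual code: the recorded values are passed in as plain parameters (hypothetical names), only modification time and size are compared, line-ending conversion and filters are ignored, and the repository is assumed to use the SHA-1 object format, where a blob's id is the SHA-1 of "blob <size>\0" followed by the file contents.

```python
import hashlib
import os


def git_blob_sha1(data: bytes) -> str:
    """Hash file contents the way a SHA-1 git repository hashes a blob."""
    header = b"blob %d\0" % len(data)
    return hashlib.sha1(header + data).hexdigest()


def has_really_changed(path, recorded_mtime_ns, recorded_size, recorded_sha1):
    """Return True only if the file's contents differ from what was recorded.

    Cheap path: if the modification time and size match the recorded
    values, assume the file is unchanged and skip hashing entirely.
    Expensive path: read the whole file and compare blob hashes.
    """
    st = os.stat(path)
    if st.st_mtime_ns == recorded_mtime_ns and st.st_size == recorded_size:
        return False
    with open(path, "rb") as f:
        return git_blob_sha1(f.read()) != recorded_sha1
```

The expensive path reads the entire file, which is why thousands of touched files, or a handful of very large binaries, make the dialog crawl.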
Hope that helps. It's worth persevering to get it to go fast; git is a fantastic tool once you've rid yourself of SVN tendencies!
Dave.
You really ought to consider moving tons of static content like that out of the repository and storing it elsewhere. I use blob storage on Windows Azure for that stuff.