Soooo sloooow...


Robert Phillips

Nov 20, 2012, 4:04:41 PM
to gitext...@googlegroups.com
Hi,

I'm relatively new to Git, and have just installed GitExtensions. Having previously worked with SVN, I initially installed TortoiseGit for the familiar interface. But it was too slow to be usable.

I'm trying GitExtensions now, but I'm still facing a similar problem with the speed of operations. It's too slow!

I think I'm misunderstanding how Git works, and want to improve the way I'm using it.

My setup:
  • GitExtensions v2.43 on Windows 7 x64.
  • Our Git repository is hosted with gitolite, running on an OS X server that I'm connected to over a gigabit Ethernet connection.
  • I have a Git project that is made up of 100,000 files in 18,000 folders, totalling 4.08GB.
Now, obviously this is a pretty big project, so I try to ensure I don't commit the *whole* project every time - just the specific directory I'm making changes in. Here are my steps to committing files:
  1. I right-click on a specific subfolder in the project and select 'Commit'.
  2. GitExtensions scans the *entire* project folder anyway. This takes ages (around 10-15 mins).
  3. GitExtensions then lists *all* the thousands of individual files that have changes, most of which aren't even in the folder I selected to commit.
  4. I wade through the small window to find the files I want to commit and select them.
  5. I click Stage, Commit and then Push. The whole operation takes another 10 mins or so, even if there are just a few dozen files.
In short, Git is unusable. I had the same problems with TortoiseGit.

SVN would have no problems with a project like this, and I would be able to commit in seconds.

For all the harping on about "how much better" Git is than SVN, I've yet to see any evidence of it. Git is supposed to be great for large distributed projects, right? Well, this is a large project, and Git seems unable to handle it.

Maybe I'm just not using it correctly...?

Can anyone advise on how I can improve our project's performance?

Thanks!

Adrian Codrington

Nov 20, 2012, 6:47:55 PM
to gitext...@googlegroups.com
Hi,

The Explorer context menu options are mostly shortcuts to Git Extensions features - in this case, the "Commit" option simply opens the normal commit window, regardless of which subfolder you right-clicked on. As Git is designed to look at the change history of the whole repository, not individual files or folders, the commit window shows the changes made across the entire repository. Clearly, for a project of your size and structure, this is not ideal.

You could try using the command prompt a bit more. If you only want to stage the files in a particular folder, the command is "git add path/to/changes". Then to commit, it's simply "git commit", which will prompt you for a commit message.
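For example, that workflow from a Git Bash or cmd prompt might look like this (the subfolder path and message are illustrative):

    # Stage only the changes under one subfolder (the path is made up)
    git add templates/widgets/

    # Check exactly what is staged before committing
    git status

    # Commit the staged changes; -m supplies the message inline
    git commit -m "Update widget templates"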

I personally use both the command prompt and the Git Extensions GUI, and mix and match as needed. Git Extensions provides some great UI features on top of git, but git is designed as a command-line tool, so the commands are very powerful - use each where appropriate.

As far as general speed goes, git is highly optimized for text/source files, so it is not ideal if a large proportion of your 100,000 files are binaries/images/etc.

I have also heard that msysgit (the Windows git port) suffers from some performance issues - this may or may not be true (I haven't experienced issues), but if it is, it may have an effect on a project of your size.

In the end, the problems you're having are more Git-related than Git Extensions-related - as you said, you had the same problem with TortoiseGit. It may be that your project and workflow do not suit Git, but perhaps some minor changes to the way you do things will help performance.

Good luck!

Dave Brotherstone

Nov 21, 2012, 1:04:18 AM
to gitext...@googlegroups.com

Our repository at work is similar, with well over 100,000 files, but around 3GB in size. We use msysgit and Git Extensions, and the commit dialog takes about 5 seconds to load (apart from the first time, when it takes about 15-20). It's 100 times faster than our old SVN repo. Yes, git looks at everything in the repository, not just the current directory - that's what makes git fast, and it's a central point of its design. I'd recommend the Pro Git book - http://progit.org/book

So, here are some thoughts:
- it doesn't matter where your server is - commit is a local action
- with so many files, you're relying heavily on OS caching - if you've got less than 4GB RAM or are tight on memory, the OS may need to throw the cache away
- make sure your .gitignore is ignoring everything you're not interested in; entries with a slash at the end ignore the whole directory, so ignoring bin/ and obj/, for example, will help (see the sketch after this list)
- having lots of changed files will slow it down, especially if the changed files are binary
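A minimal .gitignore along those lines might look like this - the entries are illustrative, so adjust them to whatever your project actually generates:

    # Build output - the trailing slash matches the whole directory
    bin/
    obj/

    # Generated files that change constantly and don't belong in history
    *.log
    *.tmp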

"Thousands of individual files that have changes" - as you probably don't actually work on thousands of files for a single commit, this sounds like it's the problem.  Keeping your "tree clean" is a really good idea.  Work with branches, make sure you've got everything committed somewhere.  In Git, there is no concept of "I can't commit that yet".  

When you open the 


Dave Brotherstone

Nov 21, 2012, 1:13:31 AM
to gitext...@googlegroups.com

Apologies - I sent a half-written version of this earlier - hit Ctrl-Enter too soon!

When you open the commit dialog, what git does is scan all the files for a timestamp (plus various other stat characteristics) that differs from what's recorded in the index. When it finds one, it builds a hash of the file and compares it to the one in the repository to see whether the file has really changed. If it has to hash thousands of files, that will slow it down, and if they're large binary files, that's even worse. Git isn't great with binary files - this is probably the only point where SVN often scores better than git - so if you've got a heavily binary repository, you may need to rethink (or possibly ask why you have so many binaries checked in). If the binaries don't change often, it's not usually a problem (we have many, many images and many DLLs/libs from external sources checked in, but they very rarely change).
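If files are being re-hashed only because their timestamps were touched (build tools are a common culprit), two things are worth trying - this is just a sketch, and whether the second setting is honoured depends on your git version:

    # Re-stat every tracked file so content-unchanged files stop
    # looking modified to the next status/commit
    git update-index --refresh

    # Ask git to read the index with multiple threads, where supported
    git config core.preloadindex true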


Hope that helps. It's worth persevering to get it to go fast - git is a fantastic tool once you've rid yourself of SVN tendencies!

Dave.

Robert Phillips

Nov 25, 2012, 9:46:47 PM
to gitext...@googlegroups.com, dav...@pobox.com
Hmm...well, I'm trying to Pull and Push every day to ease the load on Git, but it's still hard work.

To answer the question about binaries - yes, this project has around 1,400 images and PDFs, totalling 330MB. These haven't changed in over a month, so from what Dave Brotherstone says, it should be really quick now.

However, I have just done a Pull, and...well - it wasn't a pleasant experience. I stared at the progress dialog for nearly 10 mins before it eventually closed.

The .PSD file referred to in that dialog was around 10MB. The rest of the files that came down from the Git server were text files.

I've tried adding directories containing binaries to the ignore list, but none of it makes any difference.

Git is becoming a risk for us, because I can't take that much time out of my day to push/pull.

Adrian Codrington - I will use the command line, but only if absolutely necessary. Sorry if I sound like a technophobe, but I really don't see why using a command line should be standard practice these days. For a modern program, it should be a last resort only.

Dave Brotherstone - Scanning the entire directory doesn't sound very efficient to me. If rebuilding the entire directory tree index takes that much time, then it doesn't seem like I've made any progress.

Hmmm...perhaps you are right. SVN would probably be a better solution for me. I'd heard a lot of good things about Git, and I really wish I could see the benefits of it. It's a shame it only seems to work well on Unix-based machines.

Rob.

Alex Ford

Nov 25, 2012, 9:48:44 PM
to gitext...@googlegroups.com

You really ought to consider moving tons of static content like that out of the repository and storing it elsewhere. I use blob storage on Windows Azure for that stuff.

Dave Brotherstone

Nov 26, 2012, 3:18:26 AM
to Robert Phillips, gitext...@googlegroups.com
I don't think SVN would be better in your case. Yes, 330MB is a lot of binary data to be lugging around in your repository; however, if it's not changing, it shouldn't cause an issue (other than making the initial clone/checkout slower).

As I said before, it works much, much quicker for us on a similar repository (probably not quite so many binaries, but not too far off), so something isn't right. The claim that it only works well on Unix is just not true. For your pull, you were caught by a GC - git's automatic garbage collection and repacking - which will happen now and again and can take a few minutes on a large repository. Otherwise, push should be a really fast operation, as should fetch.
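If the automatic gc is what keeps ambushing you mid-pull, one option is to run it yourself at a quiet moment, or raise the trigger threshold - a sketch; the threshold value below is purely illustrative:

    # Run garbage collection manually, at a time of your choosing
    git gc

    # Or raise the loose-object count that triggers an automatic gc
    # (git's default is 6700)
    git config gc.auto 20000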

I notice you're pulling - I'd check whether a fetch on its own is fast. That should scale with the size of the changes - if it's just a few text files, it should be more or less instant, and a 10MB PSD will take a bit longer, but no longer than it would under SVN. Then you can see whether it's the fetch step or the merge step that's taking the time.
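A sketch of splitting a pull into its two halves so each can be timed separately - the remote and branch names below are the usual defaults and may differ in your setup:

    # Step 1: download new objects from the server (network-bound)
    git fetch origin

    # Step 2: merge the fetched branch into your local branch (purely local)
    git merge origin/master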

You could also try setting the environment variable GIT_TRACE to 1, then running the pull, merge or status, and checking whether any particular step is eating the time.
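From Git Bash (which ships with msysgit), that might look like this:

    # Enable tracing for this shell session
    export GIT_TRACE=1

    # git now prints each internal command as it runs,
    # so you can see exactly where it stalls
    git status
    git pull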

Overall, just to emphasize the point, what you're seeing is unusual, and is completely the opposite of our experience on a similar repository.

Dave.

Robert Phillips

Nov 26, 2012, 5:33:09 PM
to gitext...@googlegroups.com
OK, fair enough. I'm not sure the proprietary CMS we're using supports Azure yet, though.

Also, as Dave Brotherstone said - if the binary files haven't changed, it shouldn't affect performance. The binary files haven't changed in over a month. 

Robert Phillips

Nov 26, 2012, 10:54:09 PM
to gitext...@googlegroups.com
Oh God! I just tried to view the file history of a single file.

10 mins just to view its history.

Sorry guys. I could spend days or weeks optimizing the Git database and the 'ignore' file, writing command-line scripts to speed up the process, taking files out, putting others in... etc.

But in SVN I wouldn't have to do any of that.

:-/

Alex Ford

Nov 27, 2012, 3:15:53 AM
to gitext...@googlegroups.com
Nothing to apologize for. If Git isn't for you then it isn't for you.
--
Alex Ford

Twitter: CodeTunnel
