Problems with larger files "Out of memory"

Sven

Feb 1, 2011, 7:25:26 AM
to msysGit
Hi,

I'm fairly new to Git and I want to use it together with SVN as the
base repository. So I played around with some test files and found a
good way to work with them:

1. Clone
2. Work and Commit to GIT
3. Work and Commit to GIT
4. DCommit back to the SVN Repos

But after working with real files I found out that larger files
(SQLite databases of around 200-300 MB) won't work with dcommit. What
can I do to get it working?

Here is the error message I got from Git.

D:\xxxx>git svn dcommit
Committing to https://xxxx/trunk ...
M xxx/xxx.s3db
Out of memory during "large" request for 268439552 bytes, total sbrk()
is 141076480 bytes at /usr/lib/perl5/site_perl/Git.pm line 898, <GEN1>
line 1.

I use:
- PortableGit-1.7.3.1-preview20101002
- TortoiseGit 1.6.3.0 (with git version 1.7.3.1.msysgit.0)

But both TortoiseGit and PortableGit itself show me the error message
quoted above.

Hopefully someone could help me.

Kind regards,

Sven

Ciaran

Feb 1, 2011, 8:11:44 AM
to Sven, msysGit
On Tue, Feb 1, 2011 at 12:25 PM, Sven <or...@inpad.de> wrote:
> Hi,
>
> I'm fairly new to Git and I want to use it together with SVN as the
> base repository. So I played around with some test files and found a
> good way to work with them:
>
> 1. Clone
> 2. Work and Commit to GIT
> 3. Work and Commit to GIT
> 4. DCommit back to the SVN Repos
>
> But after working with real files I found out that larger files
> (SQLite databases of around 200-300 MB) won't work with dcommit. What
> can I do to get it working?

Yeah, I think there are a few 'issues' (that is, known behaviours)
with Git itself in this area (not just msysgit). I came across them
when I discovered another problem with large files: they work in
64-bit, but you can't use repositories created in 64-bit on a 32-bit
version of git :( That is due to the Storable module not saving its
data in a platform-independent manner or something (I never figured
out how to fix that ... and I think it only affects git-svn).

I did actually patch a local copy of msysgit so 32-bit wouldn't run out
of memory, but I've lost that patch now (machine failure, and I lost
interest as I could stop adding the large file). IIRC there was some
sort of while loop where the contents of a blob were being held in
memory rather than just being sent out to a passed-in handle (cat
springs to mind). I wish I still had the patch as it may have helped
you, sorry, but perhaps this will give you a good starting point for
the kind of places to look.

- Cj.

Sven

Feb 1, 2011, 8:41:00 AM
to msysGit


On Feb 1, 2:11 pm, Ciaran <ciar...@gmail.com> wrote:
As far as I understand it, you want me to fix the error in git-svn
itself? I'm sorry, but I don't have the ability to do this. By the way,
I use msysgit on a 64-bit Windows 7 machine. Is there a 64-bit version
of Git, so that it would work? Or did I get this completely wrong?

Sven

Ciaran

Feb 1, 2011, 8:46:58 AM
to Sven, msysGit

Err, me? I don't want you to do anything. I'm merely agreeing that
yes, I've seen that too, and I gave a hint about the fix I once applied
(and sadly lost). I was using git-svn on a 64-bit Mac, and when I
*transferred* to a 32-bit Windows machine it had those problems.
I think there was a 64-bit branch of msysGit, but I'm not sure if it
works.
- Cj

Sven

Feb 1, 2011, 8:50:20 AM
to msysGit


On Feb 1, 2:46 pm, Ciaran <ciar...@gmail.com> wrote:
Pardon me, I misunderstood. So I'll wait until someone else has an
idea how to fix this. Feel free to ask for more details about my
system/software.

Regards,

Sven

Erik Faye-Lund

Feb 1, 2011, 8:55:40 AM
to Sven, msysGit
On Tue, Feb 1, 2011 at 2:41 PM, Sven <or...@inpad.de> wrote:
> On Feb 1, 2:11 pm, Ciaran <ciar...@gmail.com> wrote:
>> I did actually patch a local copy of msysgit so 32bit wouldn't run out
>> of memory, but I've lost that patch now (machine failure, lost
>> interest as I could stop adding the large file).  IIRC there was some
>> sort of while loop where the contents of a blob was being held in
>> memory rather than just being sent out to a passed in handle (cat
>> springs to mind), I wish I still had the patch as it may have helped
>> you, sorry, but perhaps this will give you a good starting point to
>> the kinda places to look.
>>
> As far as I understand it you want me to fix the error in git-svn
> itself? But I'm sorry, I don't have the ability to do this.

"Please note that there are not enough contributors to the msysGit
project to offer commercial-grade support; if you do not have the
means to fix your problems (possibly with valuable advice from the
msysGit mailing list), or to entice people who can fix them, it is
unlikely that your problem gets solved."

This text is written on the front-page of the msysGit site. Don't
expect people to fix your problem for you. But be happy if someone
does.

> By the way,
> I use msysgit on a 64-bit Windows 7 machine. Is there a 64-bit version
> of Git, so that it would work? Or did I get this completely wrong?
>

Unfortunately, no. There is an experimental 64-bit branch, but it
isn't fully functional IIRC. I guess most Git developers don't need
support for large files. After all, files that large are relatively
uncommon in a source code manager. At least for me, the only important
thing was to get to the point where git-cheetah was running on my
64-bit windows, and this currently works for me.

This is a bit of a tangent, but I guess I should try to get that
64-bit git-cheetah into the installer or something...

Sven

Feb 1, 2011, 9:32:08 AM
to msysGit


On Feb 1, 2:55 pm, Erik Faye-Lund <kusmab...@gmail.com> wrote:
Pardon me, I didn't realize this.

> I guess most Git developers don't need
> support for large files. After all, files that large are relatively
> uncommon in a source code manager.

That might be, but for me it is important, because I use sources
together with big test- and project-database-files. They are essential
and can't be left behind.

Is Cheetah another GUI for Git? To me it seems that the error
comes from git-svn itself, or am I wrong?

Regards,

Sven

Erik Faye-Lund

Feb 1, 2011, 9:54:50 AM
to Sven, msysGit
On Tue, Feb 1, 2011 at 3:32 PM, Sven <or...@inpad.de> wrote:
>> I guess most Git developers don't need
>> support for large files. After all, files that large are relatively
>> uncommon in a source code manager.
>
> That might be, but for me it is important, because I use sources
> together with big test- and project-database-files. They are essential
> and can't be left behind.
>

Unfortunately for you, that's your problem. Git is open source
software, and there's no commercial support. So if you depend on this
feature, you will either have to do it yourself or find a way to
motivate someone to do it for you.

One effective way to motivate someone to fix a problem for you is to
put a bounty on it. Another is to at least give it a try yourself, and
seek help here whenever you're stuck.

> Is Cheetah another GUI for Git? To me it seems that the error
> comes from git-svn itself, or am I wrong?

Indeed, Cheetah isn't really relevant to your problem.

Git-SVN is written in Perl, so I think it's a 64-bit perl you would
need. Strawberry Perl supplies a 64-bit version of Perl for Windows.
Perhaps you could look into integrating that in Git for Windows?

http://strawberryperl.com/releases.html

IIRC, there have already been some efforts in this direction (but not
specifically for 64-bit that I can remember), perhaps you can search
the mailing list for mentions of "strawberry" to find the ones
involved?

Johannes Schindelin

Feb 1, 2011, 11:02:32 AM
to Erik Faye-Lund, Sven, msysGit
Hi,

On Tue, 1 Feb 2011, Erik Faye-Lund wrote:

> Git-SVN is written in Perl, so I think it's a 64-bit perl you would
> need.

Probably not, because all the real data crunching is performed by the Git
binaries. At least IIRC.

Ciao,
Dscho

Erik Faye-Lund

Feb 1, 2011, 11:07:51 AM
to Johannes Schindelin, Sven, msysGit

The out-of-memory was reported by a perl-script:

Out of memory during "large" request for 268439552 bytes, total sbrk()
is 141076480 bytes at /usr/lib/perl5/site_perl/Git.pm line 898, <GEN1>
line 1.

Unfortunately, the line-number doesn't match my copy of Git.pm, so I'm
unable to dig further.

Ciaran

Feb 1, 2011, 11:26:49 AM
to kusm...@gmail.com, Johannes Schindelin, Sven, msysGit

Briefly putting aside my utter failure at actually solving this
previously, I believe I had made progress by changing the function
declared at
http://repo.or.cz/w/git/mingw/4msysgit.git/blame/HEAD:/perl/Git.pm
line 866 (as of this moment).

Line 898 attempts to fill a buffer '$blob' with the entire contents of
the blob being cat'd. IIRC I was able to change the processing so that
the passed file handles could be used more effectively as streams,
avoiding the oversized buffer ... but as I said, I'm not sure how
effective this was (it did stop running out of memory, but I think it
introduced some other downstream hash problems...)
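
To make the buffering difference concrete, here is a minimal Perl sketch. It uses only the built-in read(); the in-memory filehandles, the 10-byte payload, the 4-byte chunk size and the helper names read_accumulating and read_per_chunk are illustrative assumptions, not code taken from Git.pm. With an OFFSET argument, read() places each chunk after the previous one in the same scalar, so the scalar ends up holding the whole input; without it, each call overwrites the scalar, so it never holds more than one chunk.

```perl
use strict;
use warnings;

# Read $size bytes in pieces of at most $chunk bytes, passing an OFFSET to
# read() so every piece lands after the previous one in the same scalar.
# Returns the largest length that scalar reached.
sub read_accumulating {
    my ($in, $size, $chunk) = @_;
    my ($buf, $done, $peak) = ('', 0, 0);
    while ($done < $size) {
        my $left = $size - $done;
        my $want = $left < $chunk ? $left : $chunk;
        my $got  = read($in, $buf, $want, $done);
        die "read failed: $!" unless defined $got;
        last if $got == 0;    # unexpected end of input
        $done += $got;
        $peak = length($buf) if length($buf) > $peak;
    }
    return $peak;
}

# Same loop, but with a fresh scalar per iteration and no OFFSET, so only
# one piece is held at a time. Returns the largest piece length seen.
sub read_per_chunk {
    my ($in, $size, $chunk) = @_;
    my ($done, $peak) = (0, 0);
    while ($done < $size) {
        my $left = $size - $done;
        my $want = $left < $chunk ? $left : $chunk;
        my $buf;
        my $got = read($in, $buf, $want);
        die "read failed: $!" unless defined $got;
        last if $got == 0;    # unexpected end of input
        $done += $got;
        $peak = length($buf) if length($buf) > $peak;
    }
    return $peak;
}

my $data = '0123456789';
open(my $in1, '<', \$data) or die "cannot open in-memory handle: $!";
open(my $in2, '<', \$data) or die "cannot open in-memory handle: $!";

my $peak_accumulating = read_accumulating($in1, length($data), 4);
my $peak_per_chunk    = read_per_chunk($in2, length($data), 4);

print "with OFFSET: $peak_accumulating bytes held, ",
      "without: $peak_per_chunk bytes held\n";
```

Scaled up to a 256 MB blob, the first variant needs a buffer the size of the whole file, which would be consistent with the failed "large" request in the error message; the second needs only one chunk.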

-Cj.

Johannes Schindelin

Feb 1, 2011, 11:57:33 AM
to Erik Faye-Lund, Sven, msysGit
Hi,

On Tue, 1 Feb 2011, Erik Faye-Lund wrote:

> On Tue, Feb 1, 2011 at 5:02 PM, Johannes Schindelin
> <Johannes....@gmx.de> wrote:
>
> > On Tue, 1 Feb 2011, Erik Faye-Lund wrote:
> >
> >> Git-SVN is written in Perl, so I think it's a 64-bit perl you would
> >> need.
> >
> > Probably not, because all the real data crunching is performed by the
> > Git binaries. At least IIRC.
>
> The out-of-memory was reported by a perl-script:
>
> Out of memory during "large" request for 268439552 bytes, total sbrk()
> is 141076480 bytes at /usr/lib/perl5/site_perl/Git.pm line 898, <GEN1>
> line 1.
>
> Unfortunately, the line-number doesn't match my copy of Git.pm, so I'm
> unable to dig further.

Ah yes, the infamous native Perl module. My mistake.

Ciao,
Dscho

Sven

Feb 2, 2011, 7:02:04 AM
to msysGit


On Feb 1, 5:26 pm, Ciaran <ciar...@gmail.com> wrote:
Hi,

if you changed something in this and want me to try it, please tell
me.

Kind regards,

Sven

PS Thank you all for noticing my problems.

Gregor Uhlenheuer

Feb 10, 2011, 4:45:06 PM
to msysGit
Hi there,

a few days ago I wrote a small patch for the aforementioned Git.pm
module. I just tried to accomplish something along the lines of the
hint Ciaran gave. As far as I can test my changes, they definitely
helped with committing large files.
But I am by no means a Perl expert and don't have the means to test
the patch more extensively. So any criticism or suggestions are very
welcome.

Since I don't see any way to send an attachment via the googlegroups
interface I'll try to attach the patch inline.

Cheers,
Gregor

--- Git.pm 2011-02-03 22:02:22.000000000 +0100
+++ Git.pm.new 2011-02-03 22:02:59.000000000 +0100
@@ -886,22 +886,26 @@
 	}

 	my $size = $1;
-
-	my $blob;
 	my $bytesRead = 0;

 	while (1) {
+		my $blob;
 		my $bytesLeft = $size - $bytesRead;
 		last unless $bytesLeft;

 		my $bytesToRead = $bytesLeft < 1024 ? $bytesLeft : 1024;
-		my $read = read($in, $blob, $bytesToRead, $bytesRead);
+		my $read = read($in, $blob, $bytesToRead);
 		unless (defined($read)) {
 			$self->_close_cat_blob();
 			throw Error::Simple("in pipe went bad");
 		}

 		$bytesRead += $read;
+
+		unless (print $fh $blob) {
+			$self->_close_cat_blob();
+			throw Error::Simple("couldn't write to passed in filehandle");
+		}
 	}

 	# Skip past the trailing newline.
@@ -916,11 +920,6 @@
 		throw Error::Simple("didn't find newline after blob");
 	}

-	unless (print $fh $blob) {
-		$self->_close_cat_blob();
-		throw Error::Simple("couldn't write to passed in filehandle");
-	}
-
 	return $size;
 }
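
The read-a-piece, write-a-piece loop from this patch can be tried outside Git.pm with a small self-contained sketch. The function name copy_blob, the 1024-byte default chunk size and the in-memory filehandles are illustrative assumptions; the real cat_blob reads from a pipe and throws Error::Simple, which is replaced by a plain die here.

```perl
use strict;
use warnings;

# Copy exactly $size bytes from $in to $out in pieces of at most $chunk
# bytes, so that no more than one piece is held in memory at a time.
# Returns the number of bytes copied.
sub copy_blob {
    my ($in, $out, $size, $chunk) = @_;
    $chunk ||= 1024;
    my $copied = 0;
    while ($copied < $size) {
        my $left = $size - $copied;
        my $want = $left < $chunk ? $left : $chunk;
        my $piece;
        my $got = read($in, $piece, $want);
        die "input went bad: $!" unless defined $got;
        die "input ended after $copied of $size bytes" if $got == 0;
        print {$out} $piece or die "couldn't write to output handle: $!";
        $copied += $got;
    }
    return $copied;
}

# In-memory handles stand in for the pipe and the target file.
my $source = join '', map { chr($_ % 256) } 0 .. 2999;
my $target = '';
open(my $in,  '<', \$source) or die "cannot open input: $!";
open(my $out, '>', \$target) or die "cannot open output: $!";
binmode($_) for $in, $out;

my $copied = copy_blob($in, $out, length($source));
close($out) or die "close failed: $!";
print "copied $copied bytes\n";
```

One deliberate difference from the patch: the sketch dies when read() returns 0 before $size bytes have arrived, whereas a loop that only checks defined($read) would keep spinning on a prematurely closed input.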

Johannes Schindelin

Feb 17, 2011, 7:08:31 AM
to Gregor Uhlenheuer, msysGit
Hi Gregor,

Looks good, except I had to deduce from the patch what you are doing. Can
you provide a proper commit message? Then I will apply it. Alternatively,
you're welcome to push your commits to the 'mob' branch:

git push mob HEAD:mob

Ciao,
Dscho

Ciaran

Feb 17, 2011, 7:18:05 AM
to Johannes Schindelin, Gregor Uhlenheuer, msysGit
Yeah that was pretty much what I had too iirc, shame I lost it really
;) .. maybe I should've used some sorta version control...no wait <g>
- Cj.

kongo2002

Feb 18, 2011, 4:55:15 AM
to Johannes Schindelin, msysGit
2011/2/17 Johannes Schindelin <Johannes....@gmx.de>
>
> Hi Gregor,

>
> Looks good, except I had to deduce from the patch what you are doing. Can
> you provide a proper commit message? Then I will apply it. Alternatively,
> you're welcome to push your commits to the 'mob' branch:
>
>        git push mob HEAD:mob
>
> Ciao,
> Dscho
>

Hi Johannes,

since I don't really know the exact git workflow for msysgit (are
there some guidelines lying around somewhere?) I attach the patch done
with git format-patch. I hope that's alright with you, too.

Cheers,
Gregor

0001-Git.pm-Use-stream-like-writing-in-cat_blob.patch

Pat Thoyts

Feb 18, 2011, 5:14:06 AM
to kong...@gmail.com, kongo2002, Johannes Schindelin, msysGit

You should add a sign-off (git commit -s) to show you are happy with
the patch and that it conforms to the contribution guidelines (see
git\Documentation\SubmittingPatches.txt)

It's preferred to inline patches rather than attach them as then
people can put review comments in the right places - however, gmail
can make it harder to work like that so attaching can sometimes be
safer.

Your commit comment probably should mention where this problem was
identified -- only in this thread history is it mentioned that the
patch is actually to solve an issue with git-svn and large sqlite
files.

Otherwise - looks fine. Did you run the test suite?

Cheers,
Pat Thoyts

Johannes Schindelin

Feb 18, 2011, 5:42:12 AM
to kong...@gmail.com, msysGit
From: Gregor Uhlenheuer <kong...@googlemail.com>

This commit fixes the issue with the handling of large files causing an
'Out of memory' perl exception. Instead of reading and writing the whole
blob at once now the blob is written in small pieces.

The problem was raised and discussed in this mail to the msysGit mailing
list: http://thread.gmane.org/gmane.comp.version-control.msysgit/12080

Signed-off-by: Gregor Uhlenheuer <kong...@googlemail.com>
---
Hi,

On Fri, 18 Feb 2011, kongo2002 wrote:

> since I don't really know the exact git workflow for msysgit
> (are there some guidelines lying around somewhere?) I attach the
> patch done with git format-patch. I hope that's alright with
> you, too.

This is already very good! As Pat mentioned, we have these
guidelines, and I transformed your patch to conform with them.

BTW I could imagine that there was an intention of atomicity
behind that "write the whole blob in one go", but that's better
done by locking, and that does not need the whole blob, either.

Are you okay with this form of the patch?

Thanks!
Dscho

perl/Git.pm | 15 +++++++--------
1 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/perl/Git.pm b/perl/Git.pm
index a86ab70..0b53566 100644
--- a/perl/Git.pm
+++ b/perl/Git.pm
@@ -896,22 +896,26 @@ sub cat_blob {
 	}

 	my $size = $1;
-
-	my $blob;
 	my $bytesRead = 0;

 	while (1) {
+		my $blob;
 		my $bytesLeft = $size - $bytesRead;
 		last unless $bytesLeft;

 		my $bytesToRead = $bytesLeft < 1024 ? $bytesLeft : 1024;
-		my $read = read($in, $blob, $bytesToRead, $bytesRead);
+		my $read = read($in, $blob, $bytesToRead);
 		unless (defined($read)) {
 			$self->_close_cat_blob();
 			throw Error::Simple("in pipe went bad");
 		}

 		$bytesRead += $read;
+
+		unless (print $fh $blob) {
+			$self->_close_cat_blob();
+			throw Error::Simple("couldn't write to passed in filehandle");
+		}
 	}

 	# Skip past the trailing newline.
@@ -926,11 +930,6 @@ sub cat_blob {
 		throw Error::Simple("didn't find newline after blob");
 	}

-	unless (print $fh $blob) {
-		$self->_close_cat_blob();
-		throw Error::Simple("couldn't write to passed in filehandle");
-	}
-
 	return $size;
 }

--
1.7.4

kongo2002

Feb 18, 2011, 7:05:08 AM
to Johannes Schindelin, msysGit
2011/2/18 Johannes Schindelin <Johannes....@gmx.de>:

Hi again,

thanks for your changes to my patch. I will definitely read up on those
instructions on patches and committing. I would be glad if that one
was applied.

Cheers,
Gregor

Erik Faye-Lund

Feb 18, 2011, 7:49:03 AM
to Johannes Schindelin, kong...@gmail.com, msysGit
On Fri, Feb 18, 2011 at 11:42 AM, Johannes Schindelin
<Johannes....@gmx.de> wrote:
> From: Gregor Uhlenheuer <kong...@googlemail.com>
>
> This commit fixes the issue with the handling of large files causing an
> 'Out of memory' perl exception. Instead of reading and writing the whole
> blob at once now the blob is written in small pieces.
>
> The problem was raised and discussed in this mail to the msysGit mailing
> list: http://thread.gmane.org/gmane.comp.version-control.msysgit/12080
>
> Signed-off-by: Gregor Uhlenheuer <kong...@googlemail.com>

Isn't this patch also suitable for mainline Git? Wouldn't Unices etc
benefit from this change?

Ciaran

Feb 18, 2011, 7:54:32 AM
to kusm...@gmail.com, Johannes Schindelin, kong...@gmail.com, msysGit
Fwiw, when I originally found this problem and did a similar fix, it
was when I moved a repo from my unix box to my windows box; the unix
box had been fine, but the windows one failed.

The reason was that on Unix I could use 64-bit git (and was using it
by default), so it didn't bust the memory, but msysgit on windows was
(obviously) 32-bit and so exhibited the problem.

I would expect the problem to still occur on a 64-bit system, however.
When I looked into this, the prevailing view in Git-land was that 'big
files are a known problem' and 'we don't care' ;) .... I even looked at
this project: http://caca.zoy.org/wiki/git-bigfiles to see what options
there were.

I *am* interested in storing big files, as I'm considering using git as
an artefact store in my build pipeline; my solutions so far have
involved making the files smaller <g>
- Cj.

Erik Faye-Lund

Feb 18, 2011, 8:08:29 AM
to Ciaran, Johannes Schindelin, kong...@gmail.com, msysGit
On Fri, Feb 18, 2011 at 1:54 PM, Ciaran <cia...@gmail.com> wrote:
> On Fri, Feb 18, 2011 at 12:49 PM, Erik Faye-Lund <kusm...@gmail.com> wrote:
>> On Fri, Feb 18, 2011 at 11:42 AM, Johannes Schindelin
>> <Johannes....@gmx.de> wrote:
>>> From: Gregor Uhlenheuer <kong...@googlemail.com>
>>>
>>> This commit fixes the issue with the handling of large files causing an
>>> 'Out of memory' perl exception. Instead of reading and writing the whole
>>> blob at once now the blob is written in small pieces.
>>>
>>> The problem was raised and discussed in this mail to the msysGit mailing
>>> list: http://thread.gmane.org/gmane.comp.version-control.msysgit/12080
>>>
>>> Signed-off-by: Gregor Uhlenheuer <kong...@googlemail.com>
>>
>> Isn't this patch also suitable for mainline Git? Wouldn't Unices etc
>> benefit from this change?
> Fwiw, when I originally found this problem and did a similar fix, it
> was when I moved a repo from my unix box to my windows box; the unix
> box had been fine, but the windows one failed.
>
> The reason was that on Unix I could use 64-bit git (and was using it
> by default), so it didn't bust the memory, but msysgit on windows was
> (obviously) 32-bit and so exhibited the problem.
>

Not all Unices (or Unix-ishes) are 64-bit, though. For instance, Linux
on the first-generation of Intel Atom processors. And NVIDIA's Project
Denver makes ARM based workstations more likely, and there's no 64-bit
ARM instruction set announced yet.

Ciaran

Feb 18, 2011, 8:18:00 AM
to kusm...@gmail.com, Johannes Schindelin, kong...@gmail.com, msysGit
>
> Not all Unices (or Unix-ishes) are 64-bit, though. For instance, Linux
> on the first-generation of Intel Atom processors. And NVIDIA's Project
> Denver makes ARM based workstations more likely, and there's no 64-bit
> ARM instruction set announced yet.
>
Oh yes, sorry I wasn't clear. I actually agree; I just decided against
pushing things upstream (regretfully), as the current mainline git
landscape appeared, as I read it, hostile to the idea of supporting
larger files (frankly the git mailing list scares me a little :) )
-cj

Johannes Schindelin

Feb 18, 2011, 11:05:18 AM
to Erik Faye-Lund, kong...@gmail.com, msysGit
Hi,

Probably. But we can test-drive it in the (by definition) more
experimental setting of msysGit...

Ciao,
Dscho

Johannes Schindelin

Feb 18, 2011, 11:10:15 AM
to kongo2002, msysGit
Hi,

On Fri, 18 Feb 2011, kongo2002 wrote:

> thanks for your changes to my patch. I will definitely read up on those
> instructions on patches and committing. I would be glad if that one was
> applied.

Thanks for your contribution!
Dscho
