
Re: Problems with larger files "Out of memory"


Gregor Uhlenheuer Feb 10, 2011 1:45 PM
Posted in group: Git for Windows
On Feb 2, 1:02 pm, Sven <o...@inpad.de> wrote:
> On Feb 1, 5:26 pm, Ciaran <ciar...@gmail.com> wrote:
>
>
>
> > On Tue, Feb 1, 2011 at 4:07 PM, Erik Faye-Lund <kusmab...@gmail.com> wrote:
> > > On Tue, Feb 1, 2011 at 5:02 PM, Johannes Schindelin
> > > <Johannes.Schinde...@gmx.de> wrote:
> > >> Hi,
>
> > >> On Tue, 1 Feb 2011, Erik Faye-Lund wrote:
>
> > >>> Git-SVN is written in Perl, so I think it's a 64-bit perl you would
> > >>> need.
>
> > >> Probably not, because all the real data crunching is performed by the Git
> > >> binaries. At least IIRC.
>
> > > The out-of-memory was reported by a perl-script:
>
> > > Out of memory during "large" request for 268439552 bytes, total sbrk()
> > > is 141076480 bytes at /usr/lib/perl5/site_perl/Git.pm line 898, <GEN1>
> > > line 1.
>
> > > Unfortunately, the line-number doesn't match my copy of Git.pm, so I'm
> > > unable to dig further.
>
> > Briefly putting aside my utter failure at actually solving this
> > previously, I believe I had made progress by changing the function
> > declared at http://repo.or.cz/w/git/mingw/4msysgit.git/blame/HEAD:/perl/Git.pm
> > @ Line 866 (as of this moment).
>
> > Line 898 attempts to fill a buffer '$read' with the entire contents of
> > the blob being CAT'd.  IIRC I was able to change the processing so the
> > passed file handles could be used more effectively as streams to avoid
> > the overly sized buffer ... but as I said I'm not sure how effective
> > this was (it did stop running out of memory, but I think it introduced
> > some other downstream hash problems... )
>
> > -Cj.
>
> Hi,
>
> if you changed something in this and want me to try it, please tell
> me.
>
> Kind regards,
>
> Sven
>
> PS Thank you all for noticing my problems.

Hi there,

a few days ago I wrote a small patch for the aforementioned Git.pm
module, trying to accomplish something along the lines of the hint
Ciaran gave. As far as I can test my changes, they definitely help
with committing large files.
But I am by no means a Perl expert and don't have the means to test
the patch more extensively, so any criticism or comments are very
welcome.

Since I don't see any way to send an attachment via the Google Groups
interface, I'll attach the patch inline.

Cheers,
Gregor

--- Git.pm        2011-02-03 22:02:22.000000000 +0100
+++ Git.pm.new        2011-02-03 22:02:59.000000000 +0100
@@ -886,22 +886,26 @@
         }

         my $size = $1;
-
-        my $blob;
         my $bytesRead = 0;

         while (1) {
+                my $blob;
                 my $bytesLeft = $size - $bytesRead;
                 last unless $bytesLeft;

                 my $bytesToRead = $bytesLeft < 1024 ? $bytesLeft : 1024;
-                my $read = read($in, $blob, $bytesToRead, $bytesRead);
+                my $read = read($in, $blob, $bytesToRead);
                 unless (defined($read)) {
                         $self->_close_cat_blob();
                         throw Error::Simple("in pipe went bad");
                 }

                 $bytesRead += $read;
+
+                unless (print $fh $blob) {
+                        $self->_close_cat_blob();
+                        throw Error::Simple("couldn't write to passed in filehandle");
+                }
         }

         # Skip past the trailing newline.
@@ -916,11 +920,6 @@
                 throw Error::Simple("didn't find newline after blob");
         }

-        unless (print $fh $blob) {
-                $self->_close_cat_blob();
-                throw Error::Simple("couldn't write to passed in filehandle");
-        }
-
         return $size;
 }
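For anyone who wants the gist without reading Perl: the patch replaces "accumulate the entire blob in one buffer, then print it once at the end" with a bounded chunked-copy loop that writes each chunk to the output handle as soon as it is read, so memory use stays constant regardless of blob size. A minimal sketch of that pattern, in Python for illustration only (the function name and chunk size are hypothetical, not the Git.pm code):

```python
import io

def copy_blob(src, dst, size, chunk=1024):
    """Stream `size` bytes from `src` to `dst` in fixed-size chunks.

    Peak memory is one chunk (1 KiB here), not the whole blob --
    the same idea the Git.pm patch applies to cat_blob's read loop.
    """
    bytes_read = 0
    while bytes_read < size:
        to_read = min(chunk, size - bytes_read)
        data = src.read(to_read)
        if not data:
            # Mirrors the patch's "in pipe went bad" error path.
            raise IOError("in pipe went bad")
        # Write immediately instead of appending to a growing buffer.
        dst.write(data)
        bytes_read += len(data)
    return bytes_read

# Example: copy 5000 bytes through 1 KiB chunks.
src = io.BytesIO(b"x" * 5000)
dst = io.BytesIO()
print(copy_blob(src, dst, 5000))  # -> 5000
```

Note this also explains why the patch drops the fourth (offset) argument to Perl's read(): the old code used the running byte count as an offset to append into one ever-growing buffer, while the streaming version reads into a fresh buffer each iteration.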