New issue 409 by gc...@arcor.de: checkout of large files to network drive
fails on XP
http://code.google.com/p/msysgit/issues/detail?id=409
What steps will reproduce the problem?
On a network drive:
1. create a repository with a large file (in my case the file was > 105MB)
2. checkout this large file
What is the expected output?
The file that I committed before.
What do you see instead?
An error message. The file has the correct size, but its content is all zero bytes (0x00 0x00 .... 0x00).
What version of the product are you using?
Git-1.6.5.1-preview20091022
On what operating system?
XP
Please provide any additional information below.
The failure occurs because the write operation on XP fails (errno == 22) when the size exceeds some limit (64MB?), the file is on a network drive, and the file was opened with O_BINARY.
I have a repo with two files over 105MB (and some other big files). The .git directory, though, is only 82MB. The origin is on a mapped drive; the clone is on my local drive. Changes are rare, but clones happen sometimes. The OS is XP SP3, same git version.
Do I understand correctly that your repo is located on the network drive? If so, can you do the following: create the repo on your local drive; clone a bare repo to the network drive; clone it back to your local drive. Does the problem exist? Does it always happen? (Network file operations are notoriously error-prone in Windows; we had intermittent zero'd files just by copying files in Explorer.)
Hello,
I have no problem on local drives; I have trouble when a large file is to be written to a network drive. This error is reproducible, and it is always the same behavior.
Error message:
git checkout-index: unable to write file ...
The root cause, in my opinion, is the write function that is called in xwrite() (wrapper.c).
I wrote a little test program -- something like this:
...
fd = open("f1", O_CREAT | O_WRONLY | O_BINARY, 0666);
written = write(fd, buff, 106000000);
...
This program works as expected on a local drive. On a network drive it shows the same bug (errno == 22; the data in the file is not the data of buff, but 0x00 ... instead). It works as expected when executed on Vista instead of XP, and it works (without O_BINARY) on Linux.
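For completeness, a fleshed-out version of that test program might look like the following sketch (reconstructed from memory, not the exact program I ran; on Linux you would drop O_BINARY and <io.h> and include <unistd.h> instead):

#include <errno.h>
#include <fcntl.h>
#include <io.h>		/* open/write/close in the Microsoft CRT */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define TEST_SIZE 106000000	/* larger than the suspected limit */

int main(void)
{
	char *buff = malloc(TEST_SIZE);
	int fd, written;

	if (!buff)
		return 1;
	/* fill with a non-zero pattern so a zeroed result file is obvious */
	memset(buff, 'A', TEST_SIZE);

	fd = open("f1", O_CREAT | O_WRONLY | O_BINARY, 0666);
	if (fd < 0) {
		perror("open");
		return 1;
	}
	written = write(fd, buff, TEST_SIZE);
	if (written < 0)
		fprintf(stderr, "write failed, errno == %d\n", errno);
	else
		printf("wrote %d bytes\n", written);
	close(fd);
	free(buff);
	return 0;
}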
==> I think it is a bug in the XP network stack, but I do not expect that it will be fixed soon by MS.
I think it is worth implementing a workaround so that git can live with this bug. Perhaps a mingw_write() which calls write() several times with smaller pieces?
Now that you mention it, I remember having suffered from this XP bug before, in another project. We indeed needed to implement a wrapper around fwrite that decreased the block size until no error occurred.
@chris.gcode: Would you mind trying whether this Git executable solves the problem for you?
http://threekings.tk/tmp/git.exe
Note: It's totally untested, so do not use it on sensitive data. Also, it currently only implements a wrapper for write(); we probably should do the same for read(), too.
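Such a read() wrapper, living next to the write() wrapper in compat/mingw.c, might look roughly like this (only a sketch, and not what the executable above contains; it assumes large reads can fail the same way as large writes and simply caps the chunk size at an arbitrary illustrative value):

#undef read	/* in case a macro wrapper is already in place */
ssize_t mingw_read(int fd, void *buf, size_t len)
{
	/* cap the size of a single read; a short read is fine because
	   callers retry until they have everything they asked for */
	const size_t max_chunk = 8 * 1024 * 1024;	/* illustrative */
	return read(fd, buf, len < max_chunk ? len : max_chunk);
}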
Thank you for your feedback.
I have no access to the XP machine at the moment, so I cannot test your modification right now.
I performed some tests on my own (see the patch below). The tests related to the bug were successful. "make test" was not successful, so I didn't mail the patch to the mailing list.
I think you implemented a modification similar to my patch, so I expect that your change will work, too.
The value you chose for MINGW_WRITE_BLOCK_SIZE_MAX is interesting. I tried to determine the value in a test program, but the test program worked with bigger values. Furthermore, multitasking feels sluggish while the write is executing.
==> Reduce MINGW_WRITE_BLOCK_SIZE_MAX much further so that the system stays responsive?
From fae5cb0956ea4e024feda052705daf59761e2c20 Mon Sep 17 00:00:00 2001
From: Christoph <chris.gc...>
Date: Tue, 16 Feb 2010 22:43:34 +0100
Subject: [PATCH] Workaround for XP-Bug: Enable writing of large files on network
Implement the function mingw_write as a replacement for write.
If size is smaller than a defined limit, mingw_write just calls
write. Otherwise mingw_write calls write several times until the
complete block is written.
This is a workaround for an XP bug: when, on XP, write is called with
a file descriptor pointing to a file on a network drive AND the file
was opened with O_BINARY AND the size exceeds some limit (I don't
know the exact limit), then the write fails and errno is set to 22.
Signed-off-by: Christoph <chris.gc...>
---
compat/mingw.c | 32 ++++++++++++++++++++++++++++++++
compat/mingw.h | 4 ++++
wrapper.c | 1 +
3 files changed, 37 insertions(+), 0 deletions(-)
diff --git a/compat/mingw.c b/compat/mingw.c
index ab65f77..b361053 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -140,6 +140,38 @@ int mingw_open (const char *filename, int oflags, ...)
 	return fd;
 }
 
+
+/*
+ * Writing large blocks over network fails on XP.
+ * Workaround: Divide block into smaller pieces
+ */
+#define MINGW_WRITE_BLOCK_SIZE_MAX (0x3FF7000)
+
+#undef write
+ssize_t mingw_write(int fd, const void *buf, size_t len)
+{
+	ssize_t written;
+	ssize_t written_total;
+
+	written_total = 0;
+	do {
+		if(len > MINGW_WRITE_BLOCK_SIZE_MAX)
+			written = write(fd, buf, MINGW_WRITE_BLOCK_SIZE_MAX);
+		else
+			written = write(fd, buf, len);
+		if(written >= 0) {
+			buf = ((char*)buf) + written;
+			len -= written;
+			written_total += written;
+		} else
+			return written;
+	} while(len);
+
+	return written_total;
+}
+
+
+
 /*
  * The unit of FILETIME is 100-nanoseconds since January 1, 1601, UTC.
  * Returns the 100-nanoseconds ("hekto nanoseconds") since the epoch.
diff --git a/compat/mingw.h b/compat/mingw.h
index e254fb4..04f44f6 100644
--- a/compat/mingw.h
+++ b/compat/mingw.h
@@ -170,6 +170,10 @@ int link(const char *oldpath, const char *newpath);
 int mingw_open (const char *filename, int oflags, ...);
 #define open mingw_open
 
+ssize_t mingw_write(int fd, const void *buf, size_t len);
+#define write mingw_write
+
+
 char *mingw_getcwd(char *pointer, int len);
 #define getcwd mingw_getcwd
 
diff --git a/wrapper.c b/wrapper.c
index 0e3e20a..d5b35ac 100644
--- a/wrapper.c
+++ b/wrapper.c
@@ -1,6 +1,7 @@
 /*
  * Various trivial helper wrappers around standard functions
  */
+#include "git-compat-util.h"
 #include "cache.h"
 
 char *xstrdup(const char *str)
--
1.5.6.5
It is a pity that you did not send the patch to the mailing list; it is much harder to comment on it here in the issue tracker, as quoting is a hassle.
Do not mistake my complaining for negative critique: if I did not find your work interesting, I would not even bother to write this here comment.
But there are issues (which I would have had an easier time with if this were an email, as stated; so next time, you might want to send this to msy...@googlegroups.com):
Why does wrapper.c need a new include? Why does cache.h not suffice? If the new include is needed, that has to be explained in the commit message.
Further, what makes you think that 0x3FF7000 is a good constant? There is no clue whatsoever in the commit message.
You _hint_ _before_ the commit message that you determined it heuristically, but would it not be much, much better to just follow the advice and retry with smaller block sizes (say, (len + 1) / 2 and len / 2) until things work out? The maximal block size could even be remembered for the next run.
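To illustrate the idea (a minimal sketch only, untested; it assumes the failing write reports errno == EINVAL, i.e. 22, and that callers such as write_in_full() already loop over short writes):

#include <errno.h>

/* shrinks once we learn the limit; persisting it across runs is
   left out of this sketch */
static size_t block_size_max = (size_t)-1;

#undef write	/* call the real CRT write() below */
ssize_t mingw_write(int fd, const void *buf, size_t len)
{
	size_t chunk = len < block_size_max ? len : block_size_max;
	ssize_t result;

	/* halve the block size for as long as the XP bug bites */
	while ((result = write(fd, buf, chunk)) < 0 &&
			errno == EINVAL && chunk > 1)
		chunk = (chunk + 1) / 2;

	if (result > 0 && chunk < block_size_max)
		block_size_max = chunk;	/* remember the size that worked */

	return result;	/* possibly a short write; the callers loop */
}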
Oops, you are right: wrapper.c does not need a new include, since cache.h includes git-compat-util.h already.
> What makes you think that 0x3FF7000 is a good constant?
Only the tests I performed (see my statement above). I assume that in the XP network stack a kind of maximum block size or similar exists.
==> I expect that the constant can be defined at compile time and that it is not necessary to determine the value at runtime.
But perhaps I am wrong and the size is limited by system resources/load.
==> Your suggestion to determine the block size at runtime could be a solution in that situation.
But I think that solution will be very complex:
- How do you recover from error 22? I think this is more than just a write wrapper.
- When the block size is defined by system resources/load, does it make sense to remember the value? The next time, the load will be different, ...
That solution also does not solve the problem that the system is nearly locked up during the write operation.
==> I would prefer the simple wrapper with a defined maximum block size. In my opinion the block size 0x3FF7000 is too big because of the "locked system" problem; 0x3FF7000 is just below the maximum value that the write function accepted on my system in my test.
A disadvantage of the "simple wrapper" solution: reduced write performance? What is the dependency between block size and write performance? Is there a limit beyond which an increased block size does not improve write performance meaningfully?
Sorry, I wanted to help and now I am only raising questions...
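For what it is worth, the block size/performance dependency could be measured with a small test program, roughly like this (untested sketch; clock() is coarse, so the numbers would only be indicative):

#include <fcntl.h>
#include <io.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* write 'total' bytes in pieces of at most 'chunk' bytes, return seconds */
static double time_chunked_write(const char *path, const char *buf,
		size_t total, size_t chunk)
{
	int fd = open(path, O_CREAT | O_WRONLY | O_TRUNC | O_BINARY, 0666);
	size_t left = total;
	clock_t start;

	if (fd < 0)
		return -1.0;
	start = clock();
	while (left > 0) {
		size_t n = left < chunk ? left : chunk;
		if (write(fd, buf, (unsigned int)n) < 0)
			break;
		left -= n;
	}
	close(fd);
	return (double)(clock() - start) / CLOCKS_PER_SEC;
}

int main(void)
{
	static const size_t chunks[] =
		{ 64 * 1024, 1024 * 1024, 16 * 1024 * 1024 };
	size_t total = 106000000, i;
	char *buf = malloc(total);

	if (!buf)
		return 1;
	memset(buf, 'A', total);
	for (i = 0; i < sizeof(chunks) / sizeof(chunks[0]); i++)
		printf("chunk %8u bytes: %.2f s\n", (unsigned int)chunks[i],
			time_chunked_write("bench.tmp", buf, total, chunks[i]));
	free(buf);
	return 0;
}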
I don't think using a fixed block size like 0x3FF7000 is a good or elegant solution. What I did in my wrapper is to halve the block size while error 22 occurs, until either no more error occurs or the block size reaches zero.
@chris.gcode: I'd still be interested in whether my executable works for you.
Moreover, I don't think it makes sense that we both work on a solution; that's just a waste of time. However, I absolutely do not insist on taking my solution, and I'll gladly review any updated patch from your side if you send it to the msysGit mailing list.
On Tue, 23 Feb 2010, msy...@googlecode.com wrote:
> Comment #8 on issue 409 by sschuberth: checkout of large files to network
> drive fails on XP
> http://code.google.com/p/msysgit/issues/detail?id=409
>
> @chris.gcode: I'd still be interested in whether my executable works for
> you. Moreover, I don't think it makes sense that we both work on a
> solution, that's just a waste of time. However, I absolutely do not
> insist on taking my solution, and I'll gladly review any updated patch
> from your side if you send it to the msysGit mailing list.
Let's take your work. And let's stop trying to do reviews in that utterly
unsuited (for that purpose) web application.
Thanks,
Dscho
@sschuberth: I've tested your implementation (10-02-27 http://threekings.tk/tmp/git.exe). It does not fix the problem; the behavior is the same as before.
@johannes.schindelin, sschuberth: I thought that after error 22 the file pointer would be at the end of the 0x00 block, and that recovering from the error would therefore be a mess. That seems not to be the case.
==> Your algorithm has advantages in the case of an unknown (unexpectedly small) maximum block size.
Comment #10 on issue 409 by johannes.schindelin: checkout of large files to
network drive fails on XP
http://code.google.com/p/msysgit/issues/detail?id=409
If you make a new patch with the suggested changes and post it to the
msysGit mailing
list, I will accept it.
I am still experiencing issues even with the patch. I am working with a repository via its UNC path, and it is failing on large files. I first tried lowering the limit to several different values, such as 4MB, and still had problems. I eventually got it to work with the following:
return write(fd, buf, min(count, 1024 * 27));
I didn't notice any real delays from having to call write that many more times. However, I really don't know how to go about fixing this issue properly or validating that this really fixes the problem.
I'm still seeing this issue on a Windows 7 virtual machine writing to a mapped drive on the Linux host. The mapped drive is effectively a network drive.
$ git --version
git version 1.7.8.msysgit.0
I would be really grateful if anyone could shed any light on this.
Please note that this issue tracker is closed. So I respectfully ask you to
redirect your discussion to msy...@googlegroups.com
Looking at the various comments and the source code, I have experimented by altering the mingw_write function from:
return write(fd, buf, min(count, 31 * 1024 * 1024));
to:
return write(fd, buf, min(count, 15 * 1024 * 1024));
I recompiled the project and it appears to fix the issue for me. I hope this might be useful for someone else in the near future.
Please note that this issue tracker is closed. So I ask you to redirect
your discussion to msy...@googlegroups.com
I did, but it didn't appear on the mailing list. I just wanted to make it
public in case someone else benefited.
One question I have is: how can I produce the Windows installer file from the source code? I found that I had to manually copy the executable from my c:\msysgit\git directory to c:\Program Files\git\bin.
Regards,
Adrian Smith
BT Group, UK

That is the script to make the standard installer. You are missing one of the submodules, so either you did not set up using the netinstaller or the full installer, or possibly they are broken in some manner. The script in /share/msysGit/net/setup-msysgit.sh is what gets used to set up the full repository with the netinstaller, so if you work through it you can fix up your local repo. Or you might run the netinstaller to get a working local repository tree and then re-apply your fixes to the new tree; that is likely the simplest. Running the netinstaller configures the tree and performs a local build, so it is quite comprehensive.
I've just checked this for 1.7.9: the net installer does not create the git-cheetah submodule for you. You can resolve that by running git submodule init && git submodule update at the top level of the msysGit tree.
Probably we should have this occur as part of the setup-msysGit.sh script. Adrian, could you make and test those changes to setup-msysGit.sh and contribute them back?
Thank you,
Johannes