[RFC] make open/unlink failures user friendly on windows using retry/abort

25 views
Skip to first unread message

Heiko Voigt

unread,
Jan 18, 2010, 12:38:53 PM1/18/10
to msysGit Mailinglist
Hi,

on windows an unlink/open call can fail in case another application has
the file opened for writing. If this happens it is currently very
annoying for the user as git stops there and he ends up in a half
checked out/merged/... state.

I do not know whether I recall correctly that there has been some work
related repeating such an operation in case of an access denied failure
because I can not find anything related in the source.

As this still happens to my users in irregular intervals it seems to be
such a common case for windows user in their day to day work that I want
to do something about it.

My proposal in case we get into such a situation:

* In case we are in interactive mode (e.g. stdout is a tty) present a
question to the user (cancel/retry) and wait so he can do something
about the situation.

* Additionally wait for an increasing amount of time and then
automatically restart. On every retry the delay time is doubled.

* In case we are not in interactive mode just fail by default. Add a
config option to enable the above cases in this mode.

What do you think about this?

My final goal is to extent this to the gui tools. I would be glad for
any pointers to existing code that may help me to parse the
output/define a nice format.

cheers Heiko

Johannes Sixt

unread,
Jan 18, 2010, 4:48:26 PM1/18/10
to msy...@googlegroups.com, Heiko Voigt
On Montag, 18. Januar 2010, Heiko Voigt wrote:
> on windows an unlink/open call can fail in case another application has
> the file opened for writing. If this happens it is currently very
> annoying for the user as git stops there and he ends up in a half
> checked out/merged/... state.
>
> I do not know whether I recall correctly that there has been some work
> related repeating such an operation in case of an access denied failure
> because I can not find anything related in the source.

Look at mingw_rename in compat/mingw.c.

Do your users have files open for writing for such a long time that the
timeouts in mingw_rename are too short?

BTW, in my private repository:

git://repo.or.cz/git/mingw/j6t.git master

contains an extension of mingw_rename that writes to stderr *if* it retried to
rename a file. You might be interested to see what it says with your users.

-- Hannes

Heiko Voigt

unread,
Jan 18, 2010, 5:22:03 PM1/18/10
to Johannes Sixt, msy...@googlegroups.com
On Mon, Jan 18, 2010 at 10:48:26PM +0100, Johannes Sixt wrote:
> On Montag, 18. Januar 2010, Heiko Voigt wrote:
> > on windows an unlink/open call can fail in case another application has
> > the file opened for writing. If this happens it is currently very
> > annoying for the user as git stops there and he ends up in a half
> > checked out/merged/... state.
> >
> > I do not know whether I recall correctly that there has been some work
> > related repeating such an operation in case of an access denied failure
> > because I can not find anything related in the source.
>
> Look at mingw_rename in compat/mingw.c.

Ah thanks. That was what I was searching ...


>
> Do your users have files open for writing for such a long time that the
> timeouts in mingw_rename are too short?

As far as I can tell. Its mostly IDE indexing and Virus Scanner
activitiy that causing this. Sometimes it seems to be triggered by
switching branches to quickly because the IDE start indexing the changed
files.

Is the retrying for rename only complete? I would think that unlink need
something like that as well?

> BTW, in my private repository:
>
> git://repo.or.cz/git/mingw/j6t.git master
>
> contains an extension of mingw_rename that writes to stderr *if* it retried to
> rename a file. You might be interested to see what it says with your users.

It could be interesting. Will probably extend it to write a log file and check
what it says once it has happened again. Its a very racy behavior.

cheers Heiko

Johannes Sixt

unread,
Jan 19, 2010, 1:23:21 PM1/19/10
to Heiko Voigt, msy...@googlegroups.com
On Montag, 18. Januar 2010, Heiko Voigt wrote:
> Is the retrying for rename only complete?

Only rename() was adjusted.

> I would think that unlink need something like that as well?

I never encountered the problems with open() or unlink() (or at least I never
attributed other rare random failures to them). But extending the mechanism
to these two functions may be worth a test.

-- Hannes

Johannes Schindelin

unread,
Jan 19, 2010, 4:46:43 PM1/19/10
to Heiko Voigt, msysGit Mailinglist
Hi,

On Mon, 18 Jan 2010, Heiko Voigt wrote:

> on windows an unlink/open call can fail in case another application has
> the file opened for writing. If this happens it is currently very
> annoying for the user as git stops there and he ends up in a half
> checked out/merged/... state.
>
> I do not know whether I recall correctly that there has been some work
> related repeating such an operation in case of an access denied failure
> because I can not find anything related in the source.
>
> As this still happens to my users in irregular intervals it seems to be
> such a common case for windows user in their day to day work that I want
> to do something about it.

Indeed.

As Hannes already mentioned, we do something like that for renames. I
think it might be a good thing to generalize this, so that you can --
eventually -- tell the user what is going on, and let them Retry in that
wonderful DOS fashion (how often have you seen the message "Retry, Abort
or Fail?"?)

However, I would try to go about it the same way as Hannes: automatic
first, and only then bolt on something that could even be handled by
something like a hook, or a config setting specifying the helper
application.

Ciao,
Dscho

Heiko Voigt

unread,
Jan 20, 2010, 2:17:05 PM1/20/10
to Johannes Schindelin, msysGit Mailinglist, Johannes Sixt, Junio C Hamano
On Tue, Jan 19, 2010 at 10:46:43PM +0100, Johannes Schindelin wrote:
> On Mon, 18 Jan 2010, Heiko Voigt wrote:
>
> > on windows an unlink/open call can fail in case another application has
> > the file opened for writing. If this happens it is currently very
> > annoying for the user as git stops there and he ends up in a half
> > checked out/merged/... state.
[...]

>
> Indeed.
>
> As Hannes already mentioned, we do something like that for renames. I
> think it might be a good thing to generalize this, so that you can --
> eventually -- tell the user what is going on, and let them Retry in that
> wonderful DOS fashion (how often have you seen the message "Retry, Abort
> or Fail?"?)
>
> However, I would try to go about it the same way as Hannes: automatic
> first, and only then bolt on something that could even be handled by
> something like a hook, or a config setting specifying the helper
> application.

Yes I agree on that. First automatic and then user interaction. I think
we should provide a default user interaction (via CLI) and make it a
config option to use an external helper. What do you think?

cheers Heiko

Johannes Schindelin

unread,
Jan 20, 2010, 6:00:44 PM1/20/10
to Heiko Voigt, msysGit Mailinglist, Johannes Sixt, Junio C Hamano
Hi,

How about taking care of the automatic handling first?

Ciao,
Dscho

Heiko Voigt

unread,
Jan 24, 2010, 8:09:54 AM1/24/10
to Johannes Schindelin, Heiko Voigt, msysGit Mailinglist, Johannes Sixt, Junio C Hamano

So here is a short test script showing the race condition. For me it did
fail for 7 out of 10 runs. Should we add it to the testsuite? If we do
how long should it run and how could we scale the time based on the
machine ? What are your experiences?

cheers Heiko

---8<--------
#!/bin/sh

touch /tmp/$0_test_running

create_repo() {
test ! -d repo &&
rm -f repo &&
mkdir repo &&
(cd repo &&
git init &&
echo "Point A" > file &&
git add file &&
git commit -m "Point A" &&
git checkout -b pointA &&
git checkout -b pointB &&
echo "Point B" > file &&
git add file &&
git commit -m "Point B" )
}

read_n_write_files() {
(cd repo &&
while test -f /tmp/$0_test_running
do
# give git a chance
sleep 0
echo "Lalala" > file
done)
}

switch_branches() {
(cd repo &&
branch=pointA
while test -f /tmp/$0_test_running
do
git checkout -f $branch ||
rm /tmp/$0_test_running
if [ "$branch" = "pointA" ]
then
branch=pointB
else
branch=pointA
fi
done )
}

create_repo

read_n_write_files &
switch_branches &

echo "Press enter to end the test"
read i
rm /tmp/$0_test_running

Johannes Schindelin

unread,
Jan 25, 2010, 9:44:54 AM1/25/10
to Heiko Voigt, msysGit Mailinglist, Johannes Sixt, Junio C Hamano
Hi,

On Sun, 24 Jan 2010, Heiko Voigt wrote:

> So here is a short test script showing the race condition. For me it did
> fail for 7 out of 10 runs. Should we add it to the testsuite? If we do
> how long should it run and how could we scale the time based on the
> machine ?

As this is only reproducible on Windows, I'd rather have it as a part of
/share/msysGit/run-tests.sh...

Ciao,
Dscho

Heiko Voigt

unread,
Jan 26, 2010, 4:05:11 PM1/26/10
to Johannes Schindelin, msysGit Mailinglist, Johannes Sixt, Junio C Hamano

But as the implementation will be part of the compat layer for Windows
it might make sense to include upstream and only enabled for Windows.

I found out that my first script does not reproduce the correct failure
as it also fails on other platforms and is about a race condition where
both processes write. Attached is another test script that should
trigger the right behavior. We need to use windows open function
(CreateFile()) because the mingw open() does not seem to be exclusive
against writing if the file is opened for reading.

The patches that implement automatic retries for unlink follow.

cheers Heiko

---8<------
#!/bin/sh

touch /tmp/$0_test_running

create_repo() {
test ! -d repo &&
rm -f repo &&
mkdir repo &&
(cd repo &&
git init &&
echo "Point A" > file &&
git add file &&
git commit -m "Point A" &&
git checkout -b pointA &&
git checkout -b pointB &&
echo "Point B" > file &&
git add file &&
git commit -m "Point B" )
}

gcc -o block_read -x c - <<BLOCK_READ
#include <stdio.h>
#include <windows.h>

int main(int argc, const char *argv[])
{
HANDLE file;
DWORD dwNumRead;
BOOL ret;
char buffer[256];

if (argc != 3) {
printf("Usage: %s <block-time> <filename>", argv[0]);
exit(1);
}

file = CreateFile(argv[2], GENERIC_READ, FILE_SHARE_READ,
NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (file == INVALID_HANDLE_VALUE) {
fprintf(stderr, "failed to open: '%s'\n", argv[2]);
exit(1);
}

Sleep(atoi(argv[1]));

if (!CloseHandle(file)) {
fprintf(stderr, "failed to close: '%s'\n", argv[2]);
exit(1);
}

Sleep(atoi(argv[1]));

return 0;
}
BLOCK_READ

read_n_write_files() {
(cd repo &&
while test -f /tmp/$0_test_running
do
# give git a chance

./../block_read 50 file
done)
}

switch_branches() {
(cd repo &&
branch=pointA
while test -f /tmp/$0_test_running
do
git checkout -f $branch ||
rm /tmp/$0_test_running
if [ "$branch" = "pointA" ]
then
branch=pointB
else
branch=pointA
fi
done )
}

create_repo

read_n_write_files &
switch_branches

echo "Press enter to end the test"
read i
rm /tmp/$0_test_running

Heiko Voigt

unread,
Jan 26, 2010, 4:09:08 PM1/26/10
to Johannes Schindelin, msysGit Mailinglist, Johannes Sixt, Junio C Hamano
The next patch implements a workaround in case unlink fails on Windows.

Signed-off-by: Heiko Voigt <hvo...@hvoigt.net>
---
compat/mingw.c | 8 ++++++++
compat/mingw.h | 11 +++--------
2 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/compat/mingw.c b/compat/mingw.c
index 10d6796..e5b2fae 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -147,6 +147,14 @@ int mingw_mkdir(const char *path, int mode)
return ret;
}

+#undef unlink
+int mingw_unlink(const char *pathname)
+{
+ /* read-only files cannot be removed */
+ chmod(pathname, 0666);
+ return unlink(pathname);
+}
+
#undef open
int mingw_open (const char *filename, int oflags, ...)
{
diff --git a/compat/mingw.h b/compat/mingw.h
index 1b528da..349d878 100644
--- a/compat/mingw.h
+++ b/compat/mingw.h
@@ -98,14 +98,6 @@ static inline int fcntl(int fd, int cmd, long arg)
* simple adaptors
*/

-static inline int mingw_unlink(const char *pathname)
-{
- /* read-only files cannot be removed */
- chmod(pathname, 0666);
- return unlink(pathname);
-}
-#define unlink mingw_unlink
-
static inline int waitpid(pid_t pid, int *status, unsigned options)
{
if (options == 0)
@@ -140,6 +132,9 @@ int readlink(const char *path, char *buf, size_t bufsiz);
int mingw_mkdir(const char *path, int mode);
#define mkdir mingw_mkdir

+int mingw_unlink(const char *pathname);
+#define unlink mingw_unlink
+
int mingw_open (const char *filename, int oflags, ...);
#define open mingw_open

--
1.6.6.rc1.35.g13067

Heiko Voigt

unread,
Jan 26, 2010, 4:10:16 PM1/26/10
to Johannes Schindelin, msysGit Mailinglist, Johannes Sixt, Junio C Hamano
If a file is opened by another process (e.g. indexing of an IDE) for
reading it is not allowed to be deleted. So in case unlink fails retry
after waiting for some time. This extends the workaround from 6ac6f878.

Signed-off-by: Heiko Voigt <hvo...@hvoigt.net>
---

compat/mingw.c | 17 +++++++++++++++--
1 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/compat/mingw.c b/compat/mingw.c
index e5b2fae..0df3b21 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -7,6 +7,7 @@

extern int hide_dotfiles;
unsigned int _CRT_fmode = _O_BINARY;
+static const int delay[] = { 0, 1, 10, 20, 40 };

static int err_win_to_posix(DWORD winerr)
{
@@ -150,9 +151,22 @@ int mingw_mkdir(const char *path, int mode)
#undef unlink
int mingw_unlink(const char *pathname)
{
+ int ret, tries = 0;


+
/* read-only files cannot be removed */

chmod(pathname, 0666);
- return unlink(pathname);

+ while ((ret = unlink(pathname)) == -1 && tries < ARRAY_SIZE(delay)) {
+ /*
+ * We assume that some other process had the source or
+ * destination file open at the wrong moment and retry.
+ * In order to give the other process a higher chance to
+ * complete its operation, we give up our time slice now.
+ * If we have to retry again, we do sleep a bit.
+ */
+ Sleep(delay[tries]);
+ tries++;
+ }
+ return ret;
}

#undef open
@@ -1042,7 +1056,6 @@ int mingw_rename(const char *pold, const char *pnew)
{
DWORD attrs, gle;
int tries = 0;
- static const int delay[] = { 0, 1, 10, 20, 40 };

/*
* Try native rename() first to get errno right.
--
1.6.6.rc1.35.g13067

Heiko Voigt

unread,
Jan 26, 2010, 4:45:56 PM1/26/10
to Johannes Schindelin, msysGit Mailinglist, Johannes Sixt, Junio C Hamano
On Tue, Jan 26, 2010 at 10:10:16PM +0100, Heiko Voigt wrote:
> If a file is opened by another process (e.g. indexing of an IDE) for
> reading it is not allowed to be deleted. So in case unlink fails retry
> after waiting for some time. This extends the workaround from 6ac6f878.

Just saw that the patches do not cleanly apply on Junios master.
They are based msysgit/master. A rebased version for testing can be
found on[1].

cheers Heiko

[1] http://github.com/hvoigt/git/tree/hv/automatic_retry

Johannes Sixt

unread,
Jan 26, 2010, 5:21:44 PM1/26/10
to Heiko Voigt, Johannes Schindelin, msysGit Mailinglist, Junio C Hamano
On Dienstag, 26. Januar 2010, Heiko Voigt wrote:
> @@ -150,9 +151,22 @@ int mingw_mkdir(const char *path, int mode)
> #undef unlink
> int mingw_unlink(const char *pathname)
> {
> + int ret, tries = 0;
> +
> /* read-only files cannot be removed */
> chmod(pathname, 0666);
> - return unlink(pathname);
> + while ((ret = unlink(pathname)) == -1 && tries < ARRAY_SIZE(delay)) {
> + /*
> + * We assume that some other process had the source or
> + * destination file open at the wrong moment and retry.
> + * In order to give the other process a higher chance to
> + * complete its operation, we give up our time slice now.
> + * If we have to retry again, we do sleep a bit.
> + */
> + Sleep(delay[tries]);
> + tries++;
> + }
> + return ret;
> }

Careful! You cannot loop on every failure, disregarding the failure reason.
For example, if the failuere is due to ENOENT, you do not have to retry.

-- Hannes

Johannes Sixt

unread,
Jan 26, 2010, 5:35:25 PM1/26/10
to Heiko Voigt, Johannes Schindelin, msysGit Mailinglist, Junio C Hamano
On Dienstag, 26. Januar 2010, Heiko Voigt wrote:
> On Mon, Jan 25, 2010 at 03:44:54PM +0100, Johannes Schindelin wrote:
> > Hi,
> >
> > On Sun, 24 Jan 2010, Heiko Voigt wrote:
> > > So here is a short test script showing the race condition. For me it
> > > did fail for 7 out of 10 runs. Should we add it to the testsuite? If we
> > > do how long should it run and how could we scale the time based on the
> > > machine ?
> >
> > As this is only reproducible on Windows, I'd rather have it as a part of
> > /share/msysGit/run-tests.sh...
>
> But as the implementation will be part of the compat layer for Windows
> it might make sense to include upstream and only enabled for Windows.

No; this is not about a test of git, but of the C runtime. Git can expect that
this fundamental infrastructure works. If msysgit has to bend over backwards
to meet this expectation, then it is a problem of msysgit.

> read_n_write_files() {
> (cd repo &&
> while test -f /tmp/$0_test_running
> do
> # give git a chance
> ./../block_read 50 file

You should be able to do this with

(
exec < file
perl -e 'select(undef,undef,undef,0.05)'
)

without another external program.

-- Hannes

Johannes Schindelin

unread,
Jan 27, 2010, 3:37:11 AM1/27/10
to Heiko Voigt, msysGit Mailinglist, Johannes Sixt, Junio C Hamano
Hi,

On Tue, 26 Jan 2010, Heiko Voigt wrote:

> On Tue, Jan 26, 2010 at 10:10:16PM +0100, Heiko Voigt wrote:
> > If a file is opened by another process (e.g. indexing of an IDE) for
> > reading it is not allowed to be deleted. So in case unlink fails retry
> > after waiting for some time. This extends the workaround from 6ac6f878.
>
> Just saw that the patches do not cleanly apply on Junios master.
> They are based msysgit/master. A rebased version for testing can be
> found on[1].

I think that is okay, as we have to test it in msysGit (and maybe to carry
in 4msysgit.git forever).

Care to push to a branch in 4msysgit.git?

Ciao,
Dscho

Heiko Voigt

unread,
Jan 27, 2010, 2:18:47 PM1/27/10
to Johannes Sixt, Johannes Schindelin, msysGit Mailinglist, Junio C Hamano

I considered this to be such an uncommon case that waiting for some ms
would not be to bad. But since you mentioned it and to be clean I will
incorporate a check for that.

cheers Heiko

Heiko Voigt

unread,
Jan 27, 2010, 4:26:36 PM1/27/10
to Johannes Sixt, Johannes Schindelin, msysGit Mailinglist, Junio C Hamano
On Tue, Jan 26, 2010 at 11:35:25PM +0100, Johannes Sixt wrote:
> On Dienstag, 26. Januar 2010, Heiko Voigt wrote:
> > On Mon, Jan 25, 2010 at 03:44:54PM +0100, Johannes Schindelin wrote:
> > > Hi,
> > >
> > > On Sun, 24 Jan 2010, Heiko Voigt wrote:
> > > > So here is a short test script showing the race condition. For me it
> > > > did fail for 7 out of 10 runs. Should we add it to the testsuite? If we
> > > > do how long should it run and how could we scale the time based on the
> > > > machine ?
> > >
> > > As this is only reproducible on Windows, I'd rather have it as a part of
> > > /share/msysGit/run-tests.sh...
> >
> > But as the implementation will be part of the compat layer for Windows
> > it might make sense to include upstream and only enabled for Windows.
>
> No; this is not about a test of git, but of the C runtime. Git can expect that
> this fundamental infrastructure works. If msysgit has to bend over backwards
> to meet this expectation, then it is a problem of msysgit.

Not sure here. According to [1]:

~~~~snip~~~~~
ERRORS

The unlink() function shall fail and shall not unlink the file if:
[...]

[EBUSY]
The file named by the path argument cannot be unlinked because it is
being used by the system or another process and the implementation
considers this an error.
~~~~snap~~~~~

So according to POSIX unlink may fail if the system thinks so. Even
though most POSIX conformant systems do not do that in practise this is
an argument that the infrastructure meets the expectation if this call fails
for an "file in use" case.

But I do not feel very strongly about this, we can leave the test in
msysgit if everyone else thinks it belongs there.

> > read_n_write_files() {
> > (cd repo &&
> > while test -f /tmp/$0_test_running
> > do
> > # give git a chance
> > ./../block_read 50 file
>
> You should be able to do this with
>
> (
> exec < file
> perl -e 'select(undef,undef,undef,0.05)'
> )
>
> without another external program.

Unfortunately this seems to trigger a different failure case. My patch
does not fix this (even if extended to mingw_open). Not sure why.

In my C implementation the pause is during the locked-for-reading-phase.
Whereas your implementation would read as fast as possible and then
pause.

cheers Heiko

[1] http://www.opengroup.org/onlinepubs/000095399/functions/unlink.html

Heiko Voigt

unread,
Jan 27, 2010, 4:36:55 PM1/27/10
to Johannes Schindelin, msysGit Mailinglist, Johannes Sixt, Junio C Hamano

Here you go. An updated version of the changes can be found on
hv/automatic_retry. I have based it on the devel branch.

Some feedback about your experiences with this change would be nice.

cheers Heiko

Johannes Sixt

unread,
Jan 27, 2010, 4:40:36 PM1/27/10
to Heiko Voigt, Johannes Schindelin, msysGit Mailinglist, Junio C Hamano

Do you intend to check just for ENOENT? Then I will not agree.

Look how we do it for rename(): We do not retry for every random error
condition, but only for that one error condition that happens when another
process has the file open -- ERROR_ACCESS_DENIED.

-- Hannes

Heiko Voigt

unread,
Jan 27, 2010, 4:47:13 PM1/27/10
to Johannes Sixt, Johannes Schindelin, msysGit Mailinglist, Junio C Hamano
On Wed, Jan 27, 2010 at 10:40:36PM +0100, Johannes Sixt wrote:
> On Mittwoch, 27. Januar 2010, Heiko Voigt wrote:
> > On Tue, Jan 26, 2010 at 11:21:44PM +0100, Johannes Sixt wrote:
> > > Careful! You cannot loop on every failure, disregarding the failure
> > > reason. For example, if the failuere is due to ENOENT, you do not have to
> > > retry.
> >
> > I considered this to be such an uncommon case that waiting for some ms
> > would not be to bad. But since you mentioned it and to be clean I will
> > incorporate a check for that.
>
> Do you intend to check just for ENOENT? Then I will not agree.

No of course not! ;) Have a look here:

http://repo.or.cz/w/git/mingw/4msysgit.git/commitdiff/a655bb

Just as you hopefully expected it.

> Look how we do it for rename(): We do not retry for every random error
> condition, but only for that one error condition that happens when another
> process has the file open -- ERROR_ACCESS_DENIED.

See above.

cheers Heiko

Johannes Sixt

unread,
Jan 27, 2010, 4:51:54 PM1/27/10
to Heiko Voigt, Johannes Schindelin, msysGit Mailinglist, Junio C Hamano

No. 'exec <file' opens the file and keeps it open until the shell (or
sub-shell) exits. So, my example *does* keep the file open while it sleeps.

The difference might be that the fork of the sub-shell takes considerable
time, so you see more delays between the reads. perl startup may also be
slower and add to the delay.

-- Hannes

Johannes Sixt

unread,
Jan 27, 2010, 5:00:44 PM1/27/10
to Heiko Voigt, Johannes Schindelin, msysGit Mailinglist, Junio C Hamano
On Mittwoch, 27. Januar 2010, Heiko Voigt wrote:
> On Wed, Jan 27, 2010 at 10:40:36PM +0100, Johannes Sixt wrote:
> > On Mittwoch, 27. Januar 2010, Heiko Voigt wrote:
> > > On Tue, Jan 26, 2010 at 11:21:44PM +0100, Johannes Sixt wrote:
> > > > Careful! You cannot loop on every failure, disregarding the failure
> > > > reason. For example, if the failuere is due to ENOENT, you do not
> > > > have to retry.
> > >
> > > I considered this to be such an uncommon case that waiting for some ms
> > > would not be to bad. But since you mentioned it and to be clean I will
> > > incorporate a check for that.
> >
> > Do you intend to check just for ENOENT? Then I will not agree.
>
> No of course not! ;) Have a look here:

Good! You have this code:

+ while ((ret = unlink(pathname)) == -1 && tries < ARRAY_SIZE(delay)) {

+ if (err_win_to_posix(GetLastError()) != EACCES)
+ break;

1. You call GetLastError(). Are you sure that you get the error from the
correct system call?

2. When you look at an errno value. Why do you not inspect errno?

3. When you call GetLastError() anyway, why not check for the real Windows
error code that applies to the case of interest? (ERROR_ACCESS_DENIED again,
most likely). Checking the result of err_win_to_posix for EACCES means that
you check for many different Windows error codes that do not indicate that a
program had the file open.

-- Hannes

Heiko Voigt

unread,
Jan 29, 2010, 10:34:48 AM1/29/10
to Johannes Sixt, Johannes Schindelin, msysGit Mailinglist, Junio C Hamano
On Wed, Jan 27, 2010 at 11:00:44PM +0100, Johannes Sixt wrote:
> On Mittwoch, 27. Januar 2010, Heiko Voigt wrote:
> > On Wed, Jan 27, 2010 at 10:40:36PM +0100, Johannes Sixt wrote:
> > > On Mittwoch, 27. Januar 2010, Heiko Voigt wrote:
> > > > On Tue, Jan 26, 2010 at 11:21:44PM +0100, Johannes Sixt wrote:
> > > > > Careful! You cannot loop on every failure, disregarding the failure
> > > > > reason. For example, if the failuere is due to ENOENT, you do not
> > > > > have to retry.
> > > >
> > > > I considered this to be such an uncommon case that waiting for some ms
> > > > would not be to bad. But since you mentioned it and to be clean I will
> > > > incorporate a check for that.
> > >
> > > Do you intend to check just for ENOENT? Then I will not agree.
> >
> > No of course not! ;) Have a look here:
>
> Good! You have this code:
>
> + while ((ret = unlink(pathname)) == -1 && tries < ARRAY_SIZE(delay)) {
> + if (err_win_to_posix(GetLastError()) != EACCES)
> + break;
>
> 1. You call GetLastError(). Are you sure that you get the error from the
> correct system call?

Probably yes. But you are right, we use unlink so we should be
consistent and use errno. I will send another patch series with that
corrected. The branch is already updated.

> 2. When you look at an errno value. Why do you not inspect errno?

See above.

> 3. When you call GetLastError() anyway, why not check for the real Windows
> error code that applies to the case of interest? (ERROR_ACCESS_DENIED again,
> most likely). Checking the result of err_win_to_posix for EACCES means that
> you check for many different Windows error codes that do not indicate that a
> program had the file open.

See above.


cheers Heiko

Heiko Voigt

unread,
Jan 29, 2010, 10:37:17 AM1/29/10
to Johannes Sixt, Johannes Schindelin, msysGit Mailinglist, Junio C Hamano
The next patch implements a workaround in case unlink fails on Windows.

Signed-off-by: Heiko Voigt <hvo...@hvoigt.net>
---


compat/mingw.c | 8 ++++++++
compat/mingw.h | 11 +++--------

2 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/compat/mingw.c b/compat/mingw.c
index 33b6b80..3661553 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -157,6 +157,14 @@ int mingw_mkdir(const char *path, int mode)


return ret;
}

+#undef unlink
+int mingw_unlink(const char *pathname)
+{

+ /* read-only files cannot be removed */

+ chmod(pathname, 0666);
+ return unlink(pathname);
+}
+
#undef open
int mingw_open (const char *filename, int oflags, ...)
{
diff --git a/compat/mingw.h b/compat/mingw.h

index 364cc72..740aa59 100644
--- a/compat/mingw.h
+++ b/compat/mingw.h
@@ -111,14 +111,6 @@ static inline int fcntl(int fd, int cmd, long arg)


* simple adaptors
*/

-static inline int mingw_unlink(const char *pathname)
-{

- /* read-only files cannot be removed */
- chmod(pathname, 0666);
- return unlink(pathname);


-}
-#define unlink mingw_unlink
-

static inline pid_t waitpid(pid_t pid, int *status, unsigned options)
{
if (options == 0)
@@ -153,6 +145,9 @@ int readlink(const char *path, char *buf, size_t bufsiz);


int mingw_mkdir(const char *path, int mode);
#define mkdir mingw_mkdir

+int mingw_unlink(const char *pathname);
+#define unlink mingw_unlink
+
int mingw_open (const char *filename, int oflags, ...);
#define open mingw_open

--

1.7.0.rc0.27.ge8d3a

Heiko Voigt

unread,
Jan 29, 2010, 10:47:24 AM1/29/10
to Johannes Sixt, Johannes Schindelin, msysGit Mailinglist, Junio C Hamano
If a file is opened by another process (e.g. indexing of an IDE) for
reading it is not allowed to be deleted. So in case unlink fails retry
after waiting for some time. This extends the workaround from 6ac6f878.

Signed-off-by: Heiko Voigt <hvo...@hvoigt.net>
---
Junio: How do you feel about including this upstream? We already have
the same logic for mingw_rename?

Johannes(both ;)): Do you think this would be ready in this state?

I have ran my following final testscript with the stated command 5 times
with and without the change. Without it it failed evertime. With it all
runs passed.

rm -rf repo block_read.exe && ./test_it.sh && echo Test OK

----8<--------- test_it.sh
#!/bin/sh

touch /tmp/$0_test_running

Sleep(atoi(argv[1]));

return 0;
}
BLOCK_READ

read_n_write_files() {
(cd repo &&
while test -f /tmp/$0_test_running
do
# give git a chance

./../block_read 50 file 2> /dev/null
done)
}

switch_branches() {
(cd repo &&
branch=pointA

while test -f /tmp/$0_test_running
do

git checkout -f $branch 2> /dev/null ||


rm /tmp/$0_test_running
if [ "$branch" = "pointA" ]
then
branch=pointB
else
branch=pointA
fi
done )
}

create_repo

read_n_write_files &
switch_branches &

echo "This test will run for 10 seconds"
sleep 10 &&
rm /tmp/$0_test_running
----8<-------------

compat/mingw.c | 19 +++++++++++++++++--
1 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/compat/mingw.c b/compat/mingw.c
index 3661553..7a833de 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -7,6 +7,7 @@
#include "../cache.h"



unsigned int _CRT_fmode = _O_BINARY;
+static const int delay[] = { 0, 1, 10, 20, 40 };

static int err_win_to_posix(DWORD winerr)
{

@@ -160,9 +161,24 @@ int mingw_mkdir(const char *path, int mode)


#undef unlink
int mingw_unlink(const char *pathname)
{
+ int ret, tries = 0;

+
/* read-only files cannot be removed */

chmod(pathname, 0666);
- return unlink(pathname);

+ while ((ret = unlink(pathname)) == -1 && tries < ARRAY_SIZE(delay)) {

+ if (errno != EACCES)
+ break;


+ /*
+ * We assume that some other process had the source or
+ * destination file open at the wrong moment and retry.
+ * In order to give the other process a higher chance to
+ * complete its operation, we give up our time slice now.
+ * If we have to retry again, we do sleep a bit.
+ */
+ Sleep(delay[tries]);
+ tries++;
+ }
+ return ret;
}

#undef open
@@ -1054,7 +1070,6 @@ int mingw_rename(const char *pold, const char *pnew)


{
DWORD attrs, gle;
int tries = 0;
- static const int delay[] = { 0, 1, 10, 20, 40 };

/*
* Try native rename() first to get errno right.
--

1.7.0.rc0.27.ge8d3a

Reply all
Reply to author
Forward
0 new messages