[PATCH v15 00/26] Issue 80: Unicode support on Windows

1,295 views
Skip to first unread message

karste...@dcon.de

unread,
Feb 5, 2012, 1:39:18 PM2/5/12
to msy...@googlegroups.com, bl...@dcon.de, Johannes....@gmx.de, kusm...@gmail.com, patt...@gmail.com, sschu...@gmail.com, ki...@mns.spb.ru, mic...@wheelycreek.net, robin.r...@gmail.com, at...@chejz.com, pa...@paulbetts.org, vital...@gmail.com, flat...@users.sourceforge.net
Hi there,

here is v15 of the unicode patch series:

git: https://github.com/kblees/git/tree/kb/unicode-v15
less-444 (unchanged):
https://github.com/kblees/msysgit/commits/kb/msys/less
Unicode msys.dll (unchanged):
https://github.com/kblees/msysgit/commits/kb/msys/unicode
Recodetree script (unchanged):
https://github.com/kblees/msysgit/commits/kb/recodetree

When merging, beware that the less build script and msys.dll patch go to
the msys branch of the msysgit repo. And Atsushi Nakagawa's vim-patch
(an/vim-utf-8) should be merged, too.

An updated Unicode Howto in MediaWiki format is included below (also
available here: https://github.com/kblees/git/wiki).

Again, I'll only include the interdiff (pretty small this time), if you'd
like the 26 patches via email, please speak up.


A git installer that bundles the Unicode msys.dll, less-444, recodetree
script and git v1.7.9 with unicode-v15 patches can be found here:
https://docs.google.com/uc?id=0BxXUoUg2r8ftMzJiNDRlNWEtYjMxZi00YWMzLThiNjktMjcyZTQ1MzlhZDFj&export=download

Feel free to upload to the msysgit downloads area.

And TortoiseGit V1.7.5.0 built with CP_ACP replaced by CP_UTF8
(unchanged):
32 bit:
https://docs.google.com/uc?id=0BxXUoUg2r8ftYjFmMzg0MmItYjZhMy00MjM4LWFkYjktN2RiOTUxNDdiMzdk&export=download

64 bit:
https://docs.google.com/uc?id=0BxXUoUg2r8ftNDQxMWEyMTgtM2FjMy00YzA5LTgxNGEtNTc2MjhkYWZjNDIy&export=download


Bye,
Karsten


Changes since kb/unicode-v14:
* Unicode file name support (gitk and git-gui)
- fix git-gui startup if absolute worktree path contains non-ascii chars
- fix encoding in git-gui options dialog (e.g. user name)


Commit logs ('+': new/rewritten in v15, '-': removed, '!': modified)
---
[01/26] git-gui: fix encoding in git-gui file browser
[02/26] gitk: fix file name encoding in diff hunk headers
[03/26] Revert "Disable test on MinGW that challenges its bash quoting"
[04/26] Revert "Windows: teach getenv to do a case-sensitive search"
[05/26] Revert "mingw.c: move definition of mingw_getenv down"
[06/26] Win32: Thread-safe windows console output
[07/26] Win32: add Unicode conversion functions
[08/26] Win32: Unicode file name support (except dirent)
[09/26] Win32: Unicode file name support (dirent)
! [10/26] Unicode file name support (gitk and git-gui)
[11/26] Win32: Unicode arguments (outgoing)
[12/26] Win32: Unicode arguments (incoming)
[13/26] Win32: sync Unicode console output and file system
[14/26] Win32: Unicode environment (outgoing)
[15/26] Win32: Unicode environment (incoming)
[16/26] MinGW: disable legacy encoding tests
[17/26] Win32: fix environment memory leaks
[18/26] Win32: unify environment case-sensitivity
[19/26] Win32: simplify internal mingw_spawn* APIs
[20/26] Win32: move environment functions
[21/26] Win32: unify environment function names
[22/26] Win32: factor out environment block creation
[23/26] Win32: don't copy the environment twice when spawning child
processes
[24/26] Win32: reduce environment array reallocations
[25/26] Win32: keep the environment sorted
[26/26] Win32: patch Windows environment on startup

Makefile | 2 -
compat/mingw.c | 680
+++++++++++++++++++++++++++++++----------------
compat/mingw.h | 134 +++++++++-
compat/win32/dirent.c | 30 +--
compat/win32/dirent.h | 2 +-
compat/winansi.c | 397 ++++++++++++++++++----------
git-gui/git-gui.sh | 11 +-
git-gui/lib/browser.tcl | 2 +-
git-gui/lib/index.tcl | 6 +-
gitk-git/gitk | 15 +-
run-command.c | 10 +-
t/t3901-i18n-patch.sh | 19 +-
t/t4201-shortlog.sh | 6 +-
t/t5505-remote.sh | 5 +-
t/t8005-blame-i18n.sh | 8 +-
15 files changed, 883 insertions(+), 444 deletions(-)
---


Diff between v14 and v15...
---
$ git diff -p --stat kb/unicode-v14-4efc677a..kb/unicode-v15
git-gui/git-gui.sh | 7 +++++--
1 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/git-gui/git-gui.sh b/git-gui/git-gui.sh
index 9988746..e5038dd 100755
--- a/git-gui/git-gui.sh
+++ b/git-gui/git-gui.sh
@@ -533,6 +533,9 @@ proc git {args} {

_trace_exec [concat $opt $cmdp $args]
set result [eval exec $opt $cmdp $args]
+ if {[encoding system] != "utf-8"} {
+ set result [encoding convertfrom utf-8 [encoding convertto
$result]]
+ }
if {$::_trace} {
puts stderr "< $result"
}
@@ -1087,7 +1090,7 @@ git-version proc _parse_config {arr_name args} {
[list git_read config] \
$args \
[list --null --list]]
- fconfigure $fd_rc -translation binary
+ fconfigure $fd_rc -translation binary -encoding
utf-8
set buf [read $fd_rc]
close $fd_rc
}
@@ -1273,7 +1276,7 @@ apply_config

# v1.7.0 introduced --show-toplevel to return the canonical work-tree
if {[package vsatisfies $_git_version 1.7.0]} {
- set _gitworktree [encoding convertfrom utf-8 [git rev-parse
--show-toplevel]]
+ set _gitworktree [git rev-parse --show-toplevel]
} else {
# try to set work tree from environment, core.worktree or use
# cdup to obtain a relative path to the top of the worktree. If
---


Unicode Howto
---
= Git for Windows Unicode Support =

As of V1.7.8, a Unicode-aware Version of Git for Windows is available.
Here are a few things to consider when using the Unicode version.

== Editor ==

If you want to use a custom text editor to enter commit messages or to
edit config files (instead of vim/gvim that are installed with Git for
Windows), find one that supports Unix line breaks (LF only) and can save
UTF-8 without BOM (i.e. Windows notepad.exe is a bad choice).

== Settings ==

=== Windows settings ===

==== Console font (per user) ====

The default console font does not support Unicode. Change the console font
to a TrueType font such as Lucida Console or Consolas. The setup program
can do this automatically, but only for the installing user.

=== Git settings ===

These can be set per user (with the --global option) or per repository,
the repository settings take precedence.

==== Disable quoted file names ====

By default, git will print non-ASCII file names in quoted octal notation,
i.e. "\nnn\nnn...". This can be disabled with

git config [--global] core.quotepath off

==== Disable commit message transcoding ====

Previous Git for Windows required to set the
<tt>i18n.logoutputencoding</tt> to your Windows system's default OEM
encoding for proper console output of non-ASCII commit messages. This is
no longer necessary. Remove this or set it to 'utf-8':

git config [--global] --unset i18n.logoutputencoding

The i18n.commitencoding setting should also be removed or set to 'utf-8'
to support commit messages on the command line (<tt>git commit -m
"..."</tt> from cmd.exe, MSYS bash won't let you enter non-ASCII
characters):

git config [--global] --unset i18n.commitencoding

== Migrating old Git for Windows repositories ==

This is only relevant if you used non-ASCII file names with non-Unicode
Git for Windows versions.

Previous Git for Windows versions stored file names in the default
encoding of the originating Windows system, making these repositories
incompatible with other Windows language-versions and other Git versions
(including Cygwin-git and JGit / EGit on Windows).

The Unicode-enabled Git for Windows stores file names UTF-8 encoded.

=== Checking if a repository contains non-ASCII file names ===

The <tt>recodetree check</tt> command scans the entire history of a git
repository and prints all non-ASCII file names. If the output is empty, no
migration is necessary.

=== Migration with previous Git for Windows version available ===

The simplest way to convert old repositories is by keeping an old Git for
Windows version around:

# With the old Git for Windows version: check out a completely clean state
of the working copy (so <tt>git status</tt> reports nothing, not even
untracked files). This can be achieved by manually deleting everything
except the .git directory, and then doing a <tt>git reset --hard</tt>.
# With the new Git for Windows version: <tt>git status</tt> with the new
version should now report non-ASCII file names as deleted (with mangled
file name) as well as untracked (with correct file name). Add the changes
to the repository with <tt>git add --all & git commit -m "UTF-8
conversion"</tt>.
# Repeat for every branch of interest

=== Migration without previous Git for Windows available ===

==== Manually ====

This requires renaming all non-ASCII file names manually.

# <tt>git status</tt> will report non-ASCII file names as deleted (with
mangled names) as well as untracked (also with mangled names).
# Fix the file names in the working copy manually.
# Add the changes to the repository, e.g. with <tt>git add --all & git
commit -m "UTF-8 conversion"</tt>.
# Repeat for every branch of interest

==== Using the recodetree script (experimental) ====

This requires iconv.exe on the path.

# <tt>git status</tt> will report non-ASCII file names as deleted (with
mangled names) as well as untracked (also with mangled names).
# Use <tt>recodetree head</tt> to convert non-ASCII file names to UTF-8
and stage them for commit. Check the results with <tt>git status</tt>.
# Commit the changes to the repository with <tt>git commit -m "UTF-8
conversion"</tt>.
# Clean up the working copy (remove untracked files with mangled names,
e.g. delete everything except .git, then <tt>git reset --hard</tt>).
# Repeat for every branch of interest

=== Convert config files ===

Git config files with non-ASCII content need to be converted to UTF-8, for
example your name in %HOME%/.gitconfig, or non-ASCII file names in
.gitattributes / .gitignore / .gitmodules files.

=== Migrating the entire history (experimental) ===

The <tt>recodetree history</tt> command can be used to convert the entire
history of the repository (requires iconv.exe). Beware that rewriting
history changes all the object hashes in the repository, which has quite
severe implications on other users if the repository is published (see
"RECOVERING FROM UPSTREAM REBASE" in git help rebase). The <tt>recodetree
history</tt> script currently does not convert config files such as
.gitattributes / .gitignore / .gitmodules.

---

Sebastian Schuberth

unread,
Feb 6, 2012, 4:29:24 AM2/6/12
to karste...@dcon.de, msy...@googlegroups.com, bl...@dcon.de, Johannes....@gmx.de, kusm...@gmail.com, patt...@gmail.com, ki...@mns.spb.ru, mic...@wheelycreek.net, robin.r...@gmail.com, at...@chejz.com, pa...@paulbetts.org, vital...@gmail.com, flat...@users.sourceforge.net
On Sun, Feb 5, 2012 at 19:39, <karste...@dcon.de> wrote:

> less-444 (unchanged):
> https://github.com/kblees/msysgit/commits/kb/msys/less

In an effort to reduce the diff to Karsten's branches I was thinking
about merging the above. Will this just work fine with the current
non-Unicode code, or do you expect any problems due to the "Update
less settings for UTF-8" patch, Karsten?

--
Sebastian Schuberth

karste...@dcon.de

unread,
Feb 6, 2012, 6:53:28 AM2/6/12
to Sebastian Schuberth, at...@chejz.com, bl...@dcon.de, flat...@users.sourceforge.net, Johannes....@gmx.de, ki...@mns.spb.ru, kusm...@gmail.com, mic...@wheelycreek.net, msy...@googlegroups.com, patt...@gmail.com, pa...@paulbetts.org, robin.r...@gmail.com, vital...@gmail.com

Erik Faye-Lund

unread,
Feb 6, 2012, 11:14:34 AM2/6/12
to karste...@dcon.de, msy...@googlegroups.com, bl...@dcon.de, Johannes....@gmx.de, patt...@gmail.com, sschu...@gmail.com, ki...@mns.spb.ru, mic...@wheelycreek.net, robin.r...@gmail.com, at...@chejz.com, pa...@paulbetts.org, vital...@gmail.com, flat...@users.sourceforge.net
On Sun, Feb 5, 2012 at 7:39 PM, <karste...@dcon.de> wrote:
> Hi there,
>
> here is v15 of the unicode patch series:
>
> git: https://github.com/kblees/git/tree/kb/unicode-v15

Awesome work! I've reviewed the patches again, and everything looks good to me!

> less-444 (unchanged):
> https://github.com/kblees/msysgit/commits/kb/msys/less
> Unicode msys.dll (unchanged):
> https://github.com/kblees/msysgit/commits/kb/msys/unicode
> Recodetree script (unchanged):
> https://github.com/kblees/msysgit/commits/kb/recodetree
>
> When merging, beware that the less build script and msys.dll patch go to
> the msys branch of the msysgit repo. And Atsushi Nakagawa's vim-patch
> (an/vim-utf-8) should be merged, too.
>

I would have preferred to get that as "for-msys" and "for-devel"
branches including an/vim-utf-8, but thanks to your instructions +
some common sense I think I figured out how to merge it all :)

Merged, and pushed! If I messed up something, I'm sure we can fix that later...

karste...@dcon.de

unread,
Feb 6, 2012, 12:46:17 PM2/6/12
to kusm...@gmail.com, at...@chejz.com, bl...@dcon.de, flat...@users.sourceforge.net, Johannes....@gmx.de, ki...@mns.spb.ru, mic...@wheelycreek.net, msy...@googlegroups.com, patt...@gmail.com, pa...@paulbetts.org, robin.r...@gmail.com, sschu...@gmail.com, vital...@gmail.com

This is SO COOL, thanks :-)

To be honest, it wasn't quite clear to me how I should publish a
'multi-branch' patch. As I understand it, msys binaries are built in the
msys branch, then cherry-picked to devel [1]. So, could you please
cherry-pick these three (binaries and settings), the rest looks good to
me:

c82c0ff Updated msys-1.0.dll to MSYS-g7ebac74
e1fba54 Updated less.exe to less-444
50f2c54 Update less settings for UTF-8

[1] http://groups.google.com/group/msysgit/msg/f32c9c13a528b156

PS: I hope that links are OK now (I double-checked the "Plain text only"
setting), otherwise I'll have to mangle URLs...

Johannes Schindelin

unread,
Feb 6, 2012, 1:17:52 PM2/6/12
to Erik Faye-Lund, karste...@dcon.de, msy...@googlegroups.com, bl...@dcon.de, patt...@gmail.com, sschu...@gmail.com, ki...@mns.spb.ru, mic...@wheelycreek.net, robin.r...@gmail.com, at...@chejz.com, pa...@paulbetts.org, vital...@gmail.com, flat...@users.sourceforge.net
Hi Karsten & Kusma,

On Mon, 6 Feb 2012, Erik Faye-Lund wrote:

> On Sun, Feb 5, 2012 at 7:39 PM, <karste...@dcon.de> wrote:
>
> > here is v15 of the unicode patch series:
> >
> > git: https://github.com/kblees/git/tree/kb/unicode-v15
>
> Awesome work! I've reviewed the patches again, and everything looks good
> to me!
>

> [...]
>
> Merged, and pushed!

Awesome! Thank you both for your persistence and hard work.

Ciao,
Dscho

Erik Faye-Lund

unread,
Feb 6, 2012, 2:34:00 PM2/6/12
to karste...@dcon.de, at...@chejz.com, bl...@dcon.de, flat...@users.sourceforge.net, Johannes....@gmx.de, ki...@mns.spb.ru, mic...@wheelycreek.net, msy...@googlegroups.com, patt...@gmail.com, pa...@paulbetts.org, robin.r...@gmail.com, sschu...@gmail.com, vital...@gmail.com
On Mon, Feb 6, 2012 at 6:46 PM, <karste...@dcon.de> wrote:
> Erik Faye-Lund <kusm...@gmail.com> wrote on 06.02.2012 17:14:34:
>
>> On Sun, Feb 5, 2012 at 7:39 PM,  <karste...@dcon.de> wrote:
>> > Hi there,
>> >
>> > here is v15 of the unicode patch series:
>> >
>> > git: https://github.com/kblees/git/tree/kb/unicode-v15
>>
>> Awesome work! I've reviewed the patches again, and everything looks
>> good to me!
>>
>> > less-444 (unchanged):
>> > https://github.com/kblees/msysgit/commits/kb/msys/less
>> > Unicode msys.dll (unchanged):
>> > https://github.com/kblees/msysgit/commits/kb/msys/unicode
>> > Recodetree script (unchanged):
>> > https://github.com/kblees/msysgit/commits/kb/recodetree
>> >
>> > When merging, beware that the less build script and msys.dll patch go
> to
>> > the msys branch of the msysgit repo. And Atsushi Nakagawa's vim-patch
>> > (an/vim-utf-8) should be merged, too.
>> >
>>
>> I would have preferred to get that as "for-msys" and "for-devel"
>> branches including an/vim-utf-8, but thanks to your instructions +
>> some common sense I think I figured out how to merge it all :)
>>
>> Merged, and pushed! If I messed up something, I'm sure we can fix
>> that later...
>
> This is SO COOL, thanks :-)
>

Hey, you were the one doing the heavy lifting, thank YOU! :)

> To be honest, it wasn't quite clear to me how I should publish a
> 'multi-branch' patch.

Neither was I, it would just have been a bit easier to just have two
branches ready to pull. But this works fine as well :)

> As I understand it, msys binaries are built in the
> msys branch, then cherry-picked to devel [1]. So, could you please
> cherry-pick these three (binaries and settings), the rest looks good to
> me:
>
> c82c0ff Updated msys-1.0.dll to MSYS-g7ebac74
> e1fba54 Updated less.exe to less-444
> 50f2c54 Update less settings for UTF-8
>

Done, pushed. Thanks again!

Erik Faye-Lund

unread,
Feb 6, 2012, 4:59:04 PM2/6/12
to karste...@dcon.de, at...@chejz.com, bl...@dcon.de, flat...@users.sourceforge.net, Johannes....@gmx.de, ki...@mns.spb.ru, mic...@wheelycreek.net, msy...@googlegroups.com, patt...@gmail.com, pa...@paulbetts.org, robin.r...@gmail.com, sschu...@gmail.com, vital...@gmail.com

Hmm, I'm seeing an annoying little issue; when there's a lot of data
waiting for the pager, it doesn't quit right away on "ESC" any more.
It seems Git rather spends whatever time it takes to complete the
action it was doing. A simple "git log" in git.git shows what I mean
very clearly...

Kirill Smelkov

unread,
Feb 7, 2012, 2:45:55 AM2/7/12
to karste...@dcon.de, Erik Faye-Lund, msy...@googlegroups.com, bl...@dcon.de, Johannes....@gmx.de, patt...@gmail.com, sschu...@gmail.com, mic...@wheelycreek.net, robin.r...@gmail.com, at...@chejz.com, pa...@paulbetts.org, vital...@gmail.com, flat...@users.sourceforge.net

Karsten, Erik, thanks for finally merging this. It's a great step
forward!

Thanks a lot!
Kirill

Erik Faye-Lund

unread,
Feb 7, 2012, 4:36:07 AM2/7/12
to karste...@dcon.de, at...@chejz.com, bl...@dcon.de, flat...@users.sourceforge.net, Johannes....@gmx.de, ki...@mns.spb.ru, mic...@wheelycreek.net, msy...@googlegroups.com, patt...@gmail.com, pa...@paulbetts.org, robin.r...@gmail.com, sschu...@gmail.com, vital...@gmail.com

Never mind, this is not a problem your series introduced, it was in
the 'devel'-branch before I applied your patches as well, it seems.

Atsushi Nakagawa

unread,
Feb 7, 2012, 7:52:16 PM2/7/12
to karste...@dcon.de, kusm...@gmail.com, msy...@googlegroups.com, bl...@dcon.de
On Mon, 6 Feb 2012 17:14:34 +0100
Erik Faye-Lund <kusm...@gmail.com> wrote:
> On Sun, Feb 5, 2012 at 7:39 PM, <karste...@dcon.de> wrote:
> > Hi there,
> >
> > here is v15 of the unicode patch series:
> >
> [...]

>
> Merged, and pushed! If I messed up something, I'm sure we can fix that later...

A big fat congratulations! to Karsten and Eric on finally getting this
merged! It couldn't have been done without Karsten's persistence to see
it through and Eric's vigilant and invaluable efforts!

Regards,


--
Atsushi Nakagawa
<at...@chejz.com>
Changes are made when there is inconvenience.

Atsushi Nakagawa

unread,
Feb 7, 2012, 8:28:36 PM2/7/12
to kusm...@gmail.com, karste...@dcon.de, at...@chejz.com, bl...@dcon.de, flat...@users.sourceforge.net, Johannes....@gmx.de, ki...@mns.spb.ru, mic...@wheelycreek.net, msy...@googlegroups.com, patt...@gmail.com, pa...@paulbetts.org, robin.r...@gmail.com, sschu...@gmail.com, vital...@gmail.com
On Date: Mon, 6 Feb 2012 20:34:00 +0100

Erik Faye-Lund <kusm...@gmail.com> wrote:
> On Mon, Feb 6, 2012 at 6:46 PM, <karste...@dcon.de> wrote:
> > Erik Faye-Lund <kusm...@gmail.com> wrote on 06.02.2012 17:14:34:
> >
> >> On Sun, Feb 5, 2012 at 7:39 PM,  <karste...@dcon.de> wrote:
> >> > When merging, beware that the less build script and msys.dll patch go
> > to
> >> > the msys branch of the msysgit repo. And Atsushi Nakagawa's vim-patch
> >> > (an/vim-utf-8) should be merged, too.
> >> >
> >>
> >> I would have preferred to get that as "for-msys" and "for-devel"
> >> branches including an/vim-utf-8, but thanks to your instructions +
> >> some common sense I think I figured out how to merge it all :)
> >>
> >> Merged, and pushed! If I messed up something, I'm sure we can fix
> >> that later...
>
> > To be honest, it wasn't quite clear to me how I should publish a
> > 'multi-branch' patch.
>
> Neither was I, it would just have been a bit easier to just have two
> branches ready to pull. But this works fine as well :)
>
> > As I understand it, msys binaries are built in the
> > msys branch, then cherry-picked to devel [1]. So, could you please
> > cherry-pick these three (binaries and settings), the rest looks good to
> > me:
> >
> > c82c0ff Updated msys-1.0.dll to MSYS-g7ebac74
> > e1fba54 Updated less.exe to less-444
> > 50f2c54 Update less settings for UTF-8

Oops, I just noticed that the version of "an/vim-utf-8" that was merged
is an older version in msysgit/msysgit[1] which I couldn't work out how
to push over.

The last version that I pushed was to atnak/msysgit[2] and this list[3].
The only delta (aside from the commit message) is an addition of
"git-rebase-todo" to the file name pattern (for `rebase -i').

Is there any way you could get this one merged instead (or evil merged
to the current HEAD)?

Regards,

[1] https://github.com/msysgit/msysgit/tree/an/vim-utf-8
[2] https://github.com/atnak/msysgit/commits/an/vim-utf-8
[3] http://groups.google.com/group/msysgit/browse_thread/thread/ab5ad659831b0fe0

Johannes Schindelin

unread,
Feb 8, 2012, 1:35:16 AM2/8/12
to Atsushi Nakagawa, karste...@dcon.de, kusm...@gmail.com, msy...@googlegroups.com, bl...@dcon.de
Hi,

On Wed, 8 Feb 2012, Atsushi Nakagawa wrote:

> On Mon, 6 Feb 2012 17:14:34 +0100
> Erik Faye-Lund <kusm...@gmail.com> wrote:
> > On Sun, Feb 5, 2012 at 7:39 PM, <karste...@dcon.de> wrote:
> > > Hi there,
> > >
> > > here is v15 of the unicode patch series:
> > >
> > [...]
> >
> > Merged, and pushed! If I messed up something, I'm sure we can fix that later...
>
> A big fat congratulations! to Karsten and Eric on finally getting this
> merged! It couldn't have been done without Karsten's persistence to see
> it through and Eric's vigilant and invaluable efforts!

Yep, there cannot be enough thanks for that. It was a major undertaking, a
lot of convincing needed to be done (yours truly was one of the people who
needed convincing).

Karsten, Eric, if I ever come close to any of you: I buy you dinner.

Ciao,
Dscho

Johannes Schindelin

unread,
Feb 8, 2012, 1:36:35 AM2/8/12
to Atsushi Nakagawa, kusm...@gmail.com, karste...@dcon.de, bl...@dcon.de, flat...@users.sourceforge.net, ki...@mns.spb.ru, mic...@wheelycreek.net, msy...@googlegroups.com, patt...@gmail.com, pa...@paulbetts.org, robin.r...@gmail.com, sschu...@gmail.com, vital...@gmail.com
Hi,

On Wed, 8 Feb 2012, Atsushi Nakagawa wrote:

Could you do a "git merge -s ours <wrongly-merged-commit>" with exactly
what you wrote here for the record? I'd gladly merge this.

Thanks,
Dscho

Atsushi Nakagawa

unread,
Feb 8, 2012, 5:22:48 AM2/8/12
to Johannes Schindelin, Atsushi Nakagawa, kusm...@gmail.com, karste...@dcon.de, bl...@dcon.de, sschu...@gmail.com, msy...@googlegroups.com
Wed, 8 Feb 2012 07:36:35 +0100 (CET)

I wasn't familiar with this workflow so it took me a while to work it
out but I think I've got it and have updated atnak/msysgit/an/vim-utf-8
by merging a nullified msysgit/msysgit/an/vim-utf-8 into it.

I've also sent a "Pull Request" to msysgit/msysgit for the sake of
trying it out.

Regards,

Sebastian Schuberth

unread,
Feb 8, 2012, 8:37:48 AM2/8/12
to Atsushi Nakagawa, Johannes Schindelin, kusm...@gmail.com, karste...@dcon.de, msy...@googlegroups.com, Pat Thoyts
On Wed, Feb 8, 2012 at 11:22, Atsushi Nakagawa <at...@chejz.com> wrote:

> I've also sent a "Pull Request" to msysgit/msysgit for the sake of
> trying it out.

Hmm, although I received a personal notification about this merge
requests, it seems my forwarding filter to the mailing list did not
work. Or did anyone see the merge requests on the mailing list?

--
Sebastian Schuberth

John Chen

unread,
Feb 10, 2012, 6:52:48 AM2/10/12
to msysGit
Hello,

On Feb 6, 2:39 am, karsten.bl...@dcon.de wrote:
> A git installer that bundles the Unicode msys.dll, less-444, recodetree
> script and git v1.7.9 with unicode-v15 patches can be found here:https://docs.google.com/uc?id=0BxXUoUg2r8ftMzJiNDRlNWEtYjMxZi00YWMzLT...

I think I've found a bug in the version Karsten uploaded a few days
ago. The bug can be reproduced by creating an empty repository and
commit something into it, then running git gc, as shown below:

-------------------------------------------------------------
Welcome to Git (version 1.7.9-unicode-20120204)


Run 'git help git' to display the help index.
Run 'git help <command>' to display help for specific commands.

x@SMTW-41 /c/tmp/git/1
$ git version
git version 1.7.9.msysgit.1

x@SMTW-41 /c/tmp/git/1
$ git init
Initialized empty Git repository in C:/tmp/git/1/.git/

x@SMTW-41 /c/tmp/git/1 (master)
$ echo "aoeu" > testing.txt

x@SMTW-41 /c/tmp/git/1 (master)
$ git add testing.txt

x@SMTW-41 /c/tmp/git/1 (master)
$ git commit -m "Just testing"
[master (root-commit) a31615a] Just testing
1 files changed, 1 insertions(+), 0 deletions(-)
create mode 100644 testing.txt

x@SMTW-41 /c/tmp/git/1 (master)
$ git gc
Counting objects: 3, done.
Writing objects: 100% (3/3), done.
Total 3 (delta 0), reused 0 (delta 0)
Deletion of directory '.git/objects/8c/' failed. Should I try again?
(y/n) y
Deletion of directory '.git/objects/8c/' failed. Should I try again?
(y/n) y
Deletion of directory '.git/objects/8c/' failed. Should I try again?
(y/n) y
Deletion of directory '.git/objects/8c/' failed. Should I try again?
(y/n)
-------------------------------------------------------------

After some debugging with OllyDbg, I think that
this bug seemed to be related to the function
int mingw_rmdir(const char *pathname) in compat/mingw.c.

Within the debugger, rmdir() is mapped to _wrmdir() in msvcrt.dll,
which takes const wchar_t * as parameter, but debugging shows that
an ASCII string is passed to it instead. Other functions such as
unlink(), which is mapped to _wunlink(), might have similiar problem,
but I'm not sure as I haven't seen it in action within the debugger.

As I'm new to git development, I'm unable to create a patch for this,
therefore I reported it here.

Thank you.

Chen John L.

Erik Faye-Lund

unread,
Feb 10, 2012, 7:33:53 AM2/10/12
to John Chen, msysGit, Karsten Blees

A similar issue have already been reported, thanks for looking into it.

However, as with the other report, I cannot reproduce it. I'm not
using Karsten's installer, though. Instead, I'm using his v15-series,
as is what is currently applied to 'devel'. This is what will probably
go into the next release.

> After some debugging with OllyDbg, I think that
> this bug seemed to be related to the function
> int mingw_rmdir(const char *pathname) in compat/mingw.c.
>
> Within the debugger, rmdir() is mapped to _wrmdir() in msvcrt.dll,
> which takes const wchar_t * as parameter, but debugging shows that
> an ASCII string is passed to it instead.

Interesting. I can't see how the code I have in my repo could get into
that situation. The code does
---8<---
wchar_t wpathname[MAX_PATH];
if (xutftowcs_path(wpathname, pathname) < 0)
return -1;
---8<---
and pass wpathname to _wrmdir.

Karsten, did you fix a bug matching these symptoms?

Erik Faye-Lund

unread,
Feb 10, 2012, 12:02:36 PM2/10/12
to John Chen, msysGit, Karsten Blees
On Fri, Feb 10, 2012 at 5:36 PM, John Chen <john...@gmail.com> wrote:
> In the non-unicode build, _rmdir() fails with ERROR_SHARING_VIOLATION as
> well. However, the call to:
> is_dir_empty(pathname)
> actually erroneously returns 0, the reason why it returns 0 when the
> directory is in fact actually empty will be detailed later. Also,
> is_dir_empty() modified the last error, so the call to GetLastError()
> returns ERROR_SUCCESS. Therefore, in the next while loop:
> is_file_in_use_error(GetLastError())
> will return 0 because GetLastError() returns ERROR_SUCCESS, and the prompt
> does not appear, then the function fails silently, leaving the directory to
> be removed dangling there.
>
> On the issue with is_dir_empty(), the problem also lies with GetLastError().
> -----------------------------------------------
> if (!FindNextFile(handle, &findbuf)) {
> // Here, Last error is ERROR_NO_MORE_FILES
> strbuf_release(&buf); // calls free(), which changes last error to
> ERROR_SUCCESS
> return GetLastError() == ERROR_NO_MORE_FILES; // return 0, because
> GetLastError() returns ERROR_SUCCESS
> }
> -----------------------------------------------
>
> So the problem isn't introduced by the unicode patch, but this problem
> already exists, but is hidden by the is_dir_empty() bug.
>

Aha, this makes sense, and it's easy to fix. Thanks for spotting it!

> However, I haven't found the reason why this is_dir_empty() bug didn't hide
> the rmdir() failure in unicode build. In the unicode build, is_dir_empty()
> is inlined into mingw_rmdir(), and I haven't traced it properly yet, I'll
> report again when I find out the reason why it didn't hide rmdir()'s
> failure.
>
> Meanwhile, I've a suggestion on why rmdir() fails
> with ERROR_SHARING_VIOLATION. mingw_rmdir() is called from prune_dir() in
> builtin/prune-packed.c, which is called by prune_packed_objects().
>
> In prune_packed_objects():
> -------------------------------------------------
> d = opendir(pathname);
> if (!d)
> continue;
> prune_dir(i, d, pathname, len + 3, opts);
> closedir(d);
> -------------------------------------------------
> Directory is opened, and prune_dir() is called, that means when execution
> reaches rmdir(), the directory to remove is actually opened, and that's why
> the deletion fails. To solve the problem, simply move closedir() into
> prune_dir(), right before rmdir():
> -------------------------------------------------
> pathname[len] = 0;
> closedir(dir);
> rmdir(pathname);
> ...
> if (!d)
> continue;
> prune_dir(i, d, pathname, len + 3, opts);
> }
> -------------------------------------------------
> However, I'm not sure if this builtin/prune-packed.c belongs to the original
> git, that we cannot touch with it.
>

Sure, we can change it. Every file "belongs" to the "original git" ;)

But we can also send a patch upstream, because IMO it's an improvement
nevertheless.

> I'm sorry for not being able to submit a patch, as I'm new to git
> development, but I'll try to learn the basics.
>

No problem, your debugging has been very useful so far, thanks a lot!

Erik Faye-Lund

unread,
Feb 10, 2012, 12:05:36 PM2/10/12
to John Chen, msysGit, Karsten Blees
On Fri, Feb 10, 2012 at 5:36 PM, John Chen <john...@gmail.com> wrote:
> However, I haven't found the reason why this is_dir_empty() bug didn't hide
> the rmdir() failure in unicode build. In the unicode build, is_dir_empty()
> is inlined into mingw_rmdir(), and I haven't traced it properly yet, I'll
> report again when I find out the reason why it didn't hide rmdir()'s
> failure.

This is because Karsten patched the function to use Unicode paths, and
removed the strbuf-usage. So the patch really only just revealed the
already existing bug, which is good :)

John Chen

unread,
Feb 10, 2012, 11:36:12 AM2/10/12
to kusm...@gmail.com, msysGit, Karsten Blees
Hello,

Sorry, it seemed that I got it wrong in my previous bug report, Karsten did implement the code. The call to _wrmdir() actually gets UTF16 input, which is correct. The actual reason for _wrmdir()'s failure is  ERROR_SHARING_VIOLATION, something seemed to be open and therefore the directory cannot be removed.

In order to find out the cause of the problem, I traced the execution of mingw_rmdir() in both the non-unicode build ( Git-1.7.9-preview20120201.exe in download section ), and Karsten's Unicode Build. What is surprising is that the problem exists in the non-unicode build as well, but it's covered up by another bug. After running git gc with the non-unicode build, the directory that _wrmdir() was trying to remove ( .git/objects/xx ) was still there, and it's empty.

In the non-unicode build, _rmdir() fails with ERROR_SHARING_VIOLATION as well. However, the call to:
is_dir_empty(pathname)
actually erroneously returns 0, the reason why it returns 0 when the directory is in fact actually empty will be detailed later. Also, is_dir_empty() modified the last error, so the call to GetLastError() returns ERROR_SUCCESS. Therefore, in the next while loop:
is_file_in_use_error(GetLastError())
will return 0 because GetLastError() returns ERROR_SUCCESS, and the prompt does not appear, then the function fails silently, leaving the directory to be removed dangling there.

On the issue with is_dir_empty(), the problem also lies with GetLastError().
-----------------------------------------------
if (!FindNextFile(handle, &findbuf)) {
// Here, Last error is ERROR_NO_MORE_FILES
strbuf_release(&buf); // calls free(), which changes last error to ERROR_SUCCESS
return GetLastError() == ERROR_NO_MORE_FILES; // return 0, because GetLastError() returns ERROR_SUCCESS
}
-----------------------------------------------

So the problem isn't introduced by the unicode patch, but this problem already exists, but is hidden by the is_dir_empty() bug.

However, I haven't found the reason why this is_dir_empty() bug didn't hide the rmdir() failure in unicode build. In the unicode build, is_dir_empty() is inlined into mingw_rmdir(), and I haven't traced it properly yet, I'll report again when I find out the reason why it didn't hide rmdir()'s failure.

Meanwhile, I've a suggestion on why rmdir() fails with ERROR_SHARING_VIOLATION. mingw_rmdir() is called from prune_dir() in builtin/prune-packed.c, which is called by prune_packed_objects().

In prune_packed_objects():
-------------------------------------------------
d = opendir(pathname);
if (!d)
continue;
prune_dir(i, d, pathname, len + 3, opts);
closedir(d);
-------------------------------------------------
Directory is opened, and prune_dir() is called, that means when execution reaches rmdir(), the directory to remove is actually opened, and that's why the deletion fails. To solve the problem, simply move closedir() into prune_dir(), right before rmdir():
-------------------------------------------------
pathname[len] = 0;
closedir(dir);
rmdir(pathname);
...
if (!d)
continue;
prune_dir(i, d, pathname, len + 3, opts);
}
-------------------------------------------------
However, I'm not sure if this builtin/prune-packed.c belongs to the original git, that we cannot touch with it.

I'm sorry for not being able to submit a patch, as I'm new to git development, but I'll try to learn the basics.

Johannes Schindelin

unread,
Feb 10, 2012, 12:57:46 PM2/10/12
to John Chen, kusm...@gmail.com, msysGit, Karsten Blees
Hi John,

On Sat, 11 Feb 2012, John Chen wrote:

> I'm not sure if this builtin/prune-packed.c belongs to the original git,
> that we cannot touch with it.

If you installed the development environment (which I think you mentioned
in another mail that you did), you can actually compare the version we
have versus the upstream version by calling

git fetch junio
git diff junio/next... -- builtin/prune-packed.c

> I'm sorry for not being able to submit a patch, as I'm new to git
> development, but I'll try to learn the basics.

You are really close: just check your changes with

git diff

(e.g. to remove debug printf()s or other things that do not belong to the
fix you are about to commit)

Then stage your changes for commit; I recommend doing

git gui

for that and clicking on the blue file symbols in the upper file list.

Then you can commit the change directly in the GUI by writing a very short
summary of the fix as first line, then an empty line, and then a bit more
explanation. Then click on the Sign-off button. Then on the Commit button.

If you need inspiration how commit messages typically look, just fire up

gitk

and try to imitate the style.

Once you have done that, you can export the patch with

git format-patch -1

Or you can publish your change by forking on GitHub
(https://github.com/msysgit/git/fork_select) and pushing your branch.

Ciao,
Johannes

karste...@dcon.de

unread,
Feb 10, 2012, 6:22:17 PM2/10/12
to John Chen, kusm...@gmail.com, Karsten Blees, msysGit
On Windows XP (not Win7), directories cannot be deleted while a find
handle
is open, causing "Deletion of directory '...' failed. Should I try again?"
prompts in git-repack and git-gc.

Prior to 19d1e75d "Win32: Unicode file name support (except dirent)",
these failures were silently ignored due to strbuf_free in is_dir_empty
resetting GetLastError to ERROR_SUCCESS.

Close find handles properly before trying to delete directories.

Reported-by: John Chen <john...@gmail.com>
Signed-off-by: Karsten Blees <bl...@dcon.de>
---

Erik Faye-Lund <kusm...@gmail.com> wrote on 10.02.2012 18:02:36:
> On Fri, Feb 10, 2012 at 5:36 PM, John Chen <john...@gmail.com> wrote:
> > In the non-unicode build, _rmdir() fails with ERROR_SHARING_VIOLATION
as
> > well. However, the call to:
> > is_dir_empty(pathname)
> > actually erroneously returns 0, the reason why it returns 0 when the

[...]


> > Directory is opened, and prune_dir() is called, that means when
execution
> > reaches rmdir(), the directory to remove is actually opened, and
that's why
> > the deletion fails. To solve the problem, simply move closedir() into
> > prune_dir(), right before rmdir():

[...]


>
> No problem, your debugging has been very useful so far, thanks a lot!

Indeed, thanks for this very detailed analysis.

I couldn't reproduce the problem until I tried the unicode version on an
old XP box...seems that on XP, FindFirstFile locks the directory, while on
Win7 it does not.

Incidentally, is_dir_empty didn't close its find handle either, so there
were actually three bugs, one fixed by accident, exposing the other two
:-)

This patch fixes it (tested on XP and 7).


builtin/prune-packed.c | 2 +-
compat/mingw.c | 4 +++-
2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/builtin/prune-packed.c b/builtin/prune-packed.c
index f9463de..a834417 100644
--- a/builtin/prune-packed.c
+++ b/builtin/prune-packed.c
@@ -36,7 +36,6 @@ static void prune_dir(int i, DIR *dir, char *pathname,
int len, int opts)
display_progress(progress, i + 1);
}
pathname[len] = 0;
- rmdir(pathname);
}

void prune_packed_objects(int opts)
@@ -65,6 +64,7 @@ void prune_packed_objects(int opts)


continue;
prune_dir(i, d, pathname, len + 3, opts);
closedir(d);

+ rmdir(pathname);
}
stop_progress(&progress);
}
diff --git a/compat/mingw.c b/compat/mingw.c
index 183a2a4..6daffa7 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -243,8 +243,10 @@ static int is_dir_empty(const wchar_t *wpath)

while (!wcscmp(findbuf.cFileName, L".") ||
!wcscmp(findbuf.cFileName, L".."))
- if (!FindNextFileW(handle, &findbuf))
+ if (!FindNextFileW(handle, &findbuf)) {
+ FindClose(handle);
return GetLastError() == ERROR_NO_MORE_FILES;
+ }
FindClose(handle);
return 0;
}
--
1.7.9.msysgit.1.1.g50c36

Erik Faye-Lund

unread,
Feb 10, 2012, 6:46:54 PM2/10/12
to karste...@dcon.de, John Chen, Karsten Blees, msysGit
On Sat, Feb 11, 2012 at 12:22 AM, <karste...@dcon.de> wrote:
> I couldn't reproduce the problem until I tried the unicode version on an
> old XP box...seems that on XP, FindFirstFile locks the directory, while on
> Win7 it does not.
>
> Incidentally, is_dir_empty didn't close its find handle either, so there
> were actually three bugs, one fixed by accident, exposing the other two
> :-)
>
> This patch fixes it (tested on XP and 7).
>

Unfortunately, it seems to be white-space corrupted (all tabs turned
to spaces before it reached my inbox), so I'll have to either re-roll
it or have you push it out yourself.

> diff --git a/compat/mingw.c b/compat/mingw.c
> index 183a2a4..6daffa7 100644
> --- a/compat/mingw.c
> +++ b/compat/mingw.c
> @@ -243,8 +243,10 @@ static int is_dir_empty(const wchar_t *wpath)
>
>        while (!wcscmp(findbuf.cFileName, L".") ||
>                        !wcscmp(findbuf.cFileName, L".."))
> -               if (!FindNextFileW(handle, &findbuf))
> +               if (!FindNextFileW(handle, &findbuf)) {
> +                       FindClose(handle);
>                        return GetLastError() == ERROR_NO_MORE_FILES;

Don't we risk that FindClose changes the return code for GetLastError
here, like free (probably backed by VirtualFree) did on WinXP? Perhaps
we should do this instead?

diff --git a/compat/mingw.c b/compat/mingw.c
index 183a2a4..d10734b 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -243,8 +243,11 @@ static int is_dir_empty(const wchar_t *wpath)

while (!wcscmp(findbuf.cFileName, L".") ||
!wcscmp(findbuf.cFileName, L".."))
- if (!FindNextFileW(handle, &findbuf))

- return GetLastError() == ERROR_NO_MORE_FILES;
+ if (!FindNextFileW(handle, &findbuf)) {
+ DWORD err = GetLastError();
+ FindClose(handle);
+ return err == ERROR_NO_MORE_FILES;
+ }
FindClose(handle);
return 0;
}

karste...@dcon.de

unread,
Feb 10, 2012, 7:08:33 PM2/10/12
to kusm...@gmail.com, Karsten Blees, John Chen, msysGit
Right...I guess it's too late for plain C :-( ...fixed and pushed to: https://github.com/kblees/git/tree/kb/fix-rmdir-in-repack

g'night
Karsten

Erik Faye-Lund

unread,
Feb 10, 2012, 7:11:30 PM2/10/12
to karste...@dcon.de, Karsten Blees, John Chen, msysGit

Looks good to me, pushed!

Erik Faye-Lund

unread,
Feb 13, 2012, 7:13:55 AM2/13/12
to karste...@dcon.de, Karsten Blees, John Chen, msysGit

Whoops, seems my push failed. Pushed for real this time!

Stefan Naewe

unread,
Feb 15, 2012, 5:50:50 AM2/15/12
to msy...@googlegroups.com
Erik Faye-Lund <kusmabite <at> gmail.com> writes:

>
> On Sat, Feb 11, 2012 at 1:11 AM, Erik Faye-Lund <kusmabite <at> gmail.com>
wrote:
> > On Sat, Feb 11, 2012 at 1:08 AM,  <karsten.blees <at> dcon.de> wrote:
> >>
> >> Right...I guess it's too late for plain C ...fixed and pushed to:


> >> https://github.com/kblees/git/tree/kb/fix-rmdir-in-repack
> >>
> >> g'night
> >> Karsten
> >
> > Looks good to me, pushed!
>
> Whoops, seems my push failed. Pushed for real this time!

I built an installer from the latest devel branch of '/git' (01fcafb8) and
now I'm getting the 'Deletion of directory...' prompts from 'git gc' when it
runs 'git prune --expire 2.weeks.ago'.

Stefan

Stefan Naewe

unread,
Feb 20, 2012, 6:11:44 AM2/20/12
to msysGit, kusm...@gmail.com, karste...@dcon.de
(I'm resending this because the first time wasn't a 'reply-all')

Stefan
--
----------------------------------------------------------------
python -c "print '73746566616e2e6e6165776540676d61696c2e636f6d'.decode('hex')"

karste...@dcon.de

unread,
Mar 2, 2012, 7:43:48 AM3/2/12
to Stefan Naewe, kusm...@gmail.com, msysGit, Johannes....@gmx.depatthoyts
This is a follow-up to "Win32: fix deletion of .git/objects
sub-directories
in git-repack". There is another rmdir-before-closedir in builtin/prune.c
(i.e. producing "Deletion of directory '...' failed..." prompts in
git-prune on Windows XP).

Fix it by swapping closedir / rmdir.

The other closedir calls look fine at first glance.

Reported-by: Stefan Naewe <stefan...@gmail.com>


Signed-off-by: Karsten Blees <bl...@dcon.de>
---

https://github.com/kblees/git/commit/9db94146

Can I just push this or do we want to squash it with the previous patch
(b4cb824d) on next rebase?


builtin/prune.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/builtin/prune.c b/builtin/prune.c
index 58d7cb8..b99b635 100644
--- a/builtin/prune.c
+++ b/builtin/prune.c
@@ -85,9 +85,9 @@ static int prune_dir(int i, char *path)
}
fprintf(stderr, "bad sha1 file: %s/%s\n", path,
de->d_name);
}
+ closedir(dir);
if (!show_only)
rmdir(path);
- closedir(dir);
return 0;
}

--
1.7.9.msysgit.1.364.g3e2096

Erik Faye-Lund

unread,
Mar 2, 2012, 7:57:31 AM3/2/12
to karste...@dcon.de, Stefan Naewe, msysGit, Johannes Schindelin, Pat Thoyts
(Corrected the CC-list, it somehow got corrupted)

On Fri, Mar 2, 2012 at 1:43 PM, <karste...@dcon.de> wrote:
> This is a follow-up to "Win32: fix deletion of .git/objects
> sub-directories
> in git-repack". There is another rmdir-before-closedir in builtin/prune.c
> (i.e. producing "Deletion of directory '...' failed..." prompts in
> git-prune on Windows XP).
>
> Fix it by swapping closedir / rmdir.
>
> The other closedir calls look fine at first glance.
>
> Reported-by: Stefan Naewe <stefan...@gmail.com>
> Signed-off-by: Karsten Blees <bl...@dcon.de>
> ---
>
> https://github.com/kblees/git/commit/9db94146
>
> Can I just push this or do we want to squash it with the previous patch
> (b4cb824d) on next rebase?
>

I think these should be separate commits. I also suspect that the
rmdir-moving from b4cb824d should be split out into a separate patch
(perhaps squashed with this one, I don't know) and sent upstream, as
it might be an improvement for other systems as well.

>
>  builtin/prune.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/builtin/prune.c b/builtin/prune.c
> index 58d7cb8..b99b635 100644
> --- a/builtin/prune.c
> +++ b/builtin/prune.c
> @@ -85,9 +85,9 @@ static int prune_dir(int i, char *path)
>                }
>                fprintf(stderr, "bad sha1 file: %s/%s\n", path,
> de->d_name);
>        }
> +       closedir(dir);
>        if (!show_only)
>                rmdir(path);
> -       closedir(dir);
>        return 0;
>  }
>

Looks obviously correct to me, nice :)

karste...@dcon.de

unread,
Mar 5, 2012, 4:57:25 AM3/5/12
to kusm...@gmail.com, Johannes Schindelin, msysGit, Pat Thoyts, Stefan Naewe

Erik Faye-Lund <kusm...@gmail.com> wrote on 02.03.2012 13:57:31:
> (Corrected the CC-list, it somehow got corrupted)


Sorry about that.

>

> On Fri, Mar 2, 2012 at 1:43 PM,  <karste...@dcon.de> wrote:
> > This is a follow-up to "Win32: fix deletion of .git/objects
> > sub-directories
> > in git-repack". There is another rmdir-before-closedir in builtin/prune.c
> > (i.e. producing "Deletion of directory '...' failed..." prompts in
> > git-prune on Windows XP).
> >
> > Fix it by swapping closedir / rmdir.
> >
> > The other closedir calls look fine at first glance.
> >
> > Reported-by: Stefan Naewe <stefan...@gmail.com>
> > Signed-off-by: Karsten Blees <bl...@dcon.de>
> > ---
> >
> >
https://github.com/kblees/git/commit/9db94146
> >
> > Can I just push this or do we want to squash it with the previous patch
> > (b4cb824d) on next rebase?
> >
>
> I think these should be separate commits. I also suspect that the
> rmdir-moving from b4cb824d should be split out into a separate patch
> (perhaps squashed with this one, I don't know) and sent upstream, as
> it might be an improvement for other systems as well.
>


I thought the delete / rename problems were a Windows illness (at least the retry-ask-user-yes-no logic is exclusive to mingw.c)? And as all mingw stuff is eventually sent upstream (I hope :-)), I don't see a particular reason to rush this one.

Erik Faye-Lund

unread,
Mar 5, 2012, 5:26:08 AM3/5/12
to karste...@dcon.de, Johannes Schindelin, msysGit, Pat Thoyts, Stefan Naewe
On Mon, Mar 5, 2012 at 10:57 AM, <karste...@dcon.de> wrote:
>> > Can I just push this or do we want to squash it with the previous patch
>> > (b4cb824d) on next rebase?
>> >
>>
>> I think these should be separate commits. I also suspect that the
>> rmdir-moving from b4cb824d should be split out into a separate patch
>> (perhaps squashed with this one, I don't know) and sent upstream, as
>> it might be an improvement for other systems as well.
>>
>
> I thought the delete / rename problems were a Windows illness (at least the
> retry-ask-user-yes-no logic is exclusive to mingw.c)?

POSIX allows for this behavior:

http://pubs.opengroup.org/onlinepubs/009695399/functions/rmdir.html

(EBUSY: "The directory to be removed is currently in use by the system
or some process and the implementation considers this to be an
error.")

I've found traces of similar issues on exotic unices, so I don't think
it's completely hypothetical either...

> And as all mingw stuff
> is eventually sent upstream (I hope :-)), I don't see a particular reason to
> rush this one.
>

There's no automatic system for this. I tend to send all my patches
that cleanly fix problems present in git.git directly upstream, so
people on Windows who doesn't use our branch can benefit from them as
well.

karste...@dcon.de

unread,
Mar 6, 2012, 4:18:41 AM3/6/12
to kusm...@gmail.com, g...@vger.kernel.org, Johannes Schindelin, msysGit, Pat Thoyts, Stefan Naewe
On some systems (e.g. Windows XP), directories cannot be deleted while
they're open. Both git-prune and git-repack (and thus, git-gc) try to
rmdir while holding a DIR* handle on the directory, leaving dangling
empty directories in the .git/objects store.

Fix it by swapping the rmdir / closedir calls.

Reported-by: John Chen <john...@gmail.com>


Reported-by: Stefan Naewe <stefan...@gmail.com>
Signed-off-by: Karsten Blees <bl...@dcon.de>
---

Erik Faye-Lund <kusm...@gmail.com> wrote on 05.03.2012 11:26:08:
> On Mon, Mar 5, 2012 at 10:57 AM, <karste...@dcon.de> wrote:

[...]


> > I thought the delete / rename problems were a Windows illness (at
> > least the retry-ask-user-yes-no logic is exclusive to mingw.c)?
>
> POSIX allows for this behavior:
>
> http://pubs.opengroup.org/onlinepubs/009695399/functions/rmdir.html
>
> (EBUSY: "The directory to be removed is currently in use by the system
> or some process and the implementation considers this to be an
> error.")
>
> I've found traces of similar issues on exotic unices, so I don't think
> it's completely hypothetical either...
>
> > And as all mingw stuff
> > is eventually sent upstream (I hope :-)), I don't see a particular
> > reason to rush this one.
> >
>
> There's no automatic system for this. I tend to send all my patches
> that cleanly fix problems present in git.git directly upstream, so
> people on Windows who doesn't use our branch can benefit from them as
> well.

So what about this one? It applies cleanly to git master, and when
merged / cherry-picked to msysgit devel, the prune-packed.c fix we
already have in b4cb824d is silently ignored.

Note: the is_dir_empty() function in mingw.c is broken in upstream git
as well, but I'm reluctant to backport this as it will clash with the
Unicode patches.

The upstream patch can be pulled from here:
https://github.com/kblees/git/commit/56e1bc62

And cherry-picked to msysgit devel from here:
https://github.com/kblees/git/commit/b07bfd51

Regards,
Karsten


builtin/prune-packed.c | 2 +-
builtin/prune.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/builtin/prune-packed.c b/builtin/prune-packed.c
index f9463de..a834417 100644
--- a/builtin/prune-packed.c
+++ b/builtin/prune-packed.c
@@ -36,7 +36,6 @@ static void prune_dir(int i, DIR *dir, char *pathname,
int len, int opts)
display_progress(progress, i + 1);
}
pathname[len] = 0;
- rmdir(pathname);
}

void prune_packed_objects(int opts)
@@ -65,6 +64,7 @@ void prune_packed_objects(int opts)
continue;
prune_dir(i, d, pathname, len + 3, opts);
closedir(d);
+ rmdir(pathname);
}
stop_progress(&progress);
}

diff --git a/builtin/prune.c b/builtin/prune.c
index 58d7cb8..b99b635 100644
--- a/builtin/prune.c
+++ b/builtin/prune.c
@@ -85,9 +85,9 @@ static int prune_dir(int i, char *path)
}
fprintf(stderr, "bad sha1 file: %s/%s\n", path,
de->d_name);
}
+ closedir(dir);
if (!show_only)
rmdir(path);
- closedir(dir);
return 0;
}

--
1.7.9.msysgit.1.364.g3e2096

Johannes Sixt

unread,
Mar 6, 2012, 1:32:41 PM3/6/12
to karste...@dcon.de, kusm...@gmail.com, g...@vger.kernel.org, Johannes Schindelin, msysGit, Pat Thoyts, Stefan Naewe
Am 06.03.2012 10:18, schrieb karste...@dcon.de:
> On some systems (e.g. Windows XP), directories cannot be deleted while
> they're open. Both git-prune and git-repack (and thus, git-gc) try to
> rmdir while holding a DIR* handle on the directory, leaving dangling
> empty directories in the .git/objects store.
>
> Fix it by swapping the rmdir / closedir calls.

The reasoning makes a lot of sense. I wonder why object directories are
pruned nevertheless when I run git gc --prune (I run git master plus a
few topics from pu).

> diff --git a/builtin/prune-packed.c b/builtin/prune-packed.c
> index f9463de..a834417 100644
> --- a/builtin/prune-packed.c
> +++ b/builtin/prune-packed.c
> @@ -36,7 +36,6 @@ static void prune_dir(int i, DIR *dir, char *pathname,
> int len, int opts)
> display_progress(progress, i + 1);
> }
> pathname[len] = 0;
> - rmdir(pathname);

After moving the rmdir() away from prune_dir(), the truncation of the
pathname does not logically belong here anymore. It should be moved with
the rmdir(). Looks good otherwise.

Junio C Hamano

unread,
Mar 6, 2012, 3:19:06 PM3/6/12
to Johannes Sixt, karste...@dcon.de, kusm...@gmail.com, g...@vger.kernel.org, Johannes Schindelin, msysGit, Pat Thoyts, Stefan Naewe
Johannes Sixt <j...@kdbg.org> writes:

>> diff --git a/builtin/prune-packed.c b/builtin/prune-packed.c
>> index f9463de..a834417 100644
>> --- a/builtin/prune-packed.c
>> +++ b/builtin/prune-packed.c
>> @@ -36,7 +36,6 @@ static void prune_dir(int i, DIR *dir, char *pathname,
>> int len, int opts)
>> display_progress(progress, i + 1);
>> }
>> pathname[len] = 0;
>> - rmdir(pathname);
>
> After moving the rmdir() away from prune_dir(), the truncation of the
> pathname does not logically belong here anymore. It should be moved with
> the rmdir(). Looks good otherwise.

I agree that it is better to have the NUL termination close to where
it actually matters.

Do you want me to take it after locally fixing it up, or do you
prefer to feed this as part of msysgit related updates to me later?

Johannes Sixt

unread,
Mar 6, 2012, 4:19:10 PM3/6/12
to Junio C Hamano, karste...@dcon.de, kusm...@gmail.com, g...@vger.kernel.org, Johannes Schindelin, msysGit, Pat Thoyts, Stefan Naewe

This patch is fairly independent from other topics that are queued in
msysgit, IIRC. Please take it with the mentioned fixup.

Thanks,
-- Hannes

karste...@dcon.de

unread,
Mar 6, 2012, 4:30:40 PM3/6/12
to Junio C Hamano, g...@vger.kernel.org, Johannes Sixt, Johannes Schindelin, kusm...@gmail.com, msysGit, Pat Thoyts, Stefan Naewe
Junio C Hamano <git...@pobox.com> wrote on 06.03.2012 21:19:06:
> Johannes Sixt <j...@kdbg.org> writes:
>
> >> diff --git a/builtin/prune-packed.c b/builtin/prune-packed.c
> >> index f9463de..a834417 100644
> >> --- a/builtin/prune-packed.c
> >> +++ b/builtin/prune-packed.c
> >> @@ -36,7 +36,6 @@ static void prune_dir(int i, DIR *dir, char
*pathname,
> >> int len, int opts)
> >> display_progress(progress, i + 1);
> >> }
> >> pathname[len] = 0;
> >> - rmdir(pathname);
> >
> > After moving the rmdir() away from prune_dir(), the truncation of the
> > pathname does not logically belong here anymore. It should be moved
with
> > the rmdir(). Looks good otherwise.
>
> I agree that it is better to have the NUL termination close to where
> it actually matters.
>

The pathname is extended in prune_dir, so I think it should be reset there
as well; moving it to prune_packed_objects would be quite obscure:

d = opendir(pathname);
prune_dir(d, pathname);
pathname[len] = 0;
rmdir(pathname);


OT: While looking at the code I just stumbled across this immediately
above the patch (prune-packed.c line 32ff):

memcpy(pathname + len, de->d_name, 38);
if (opts & DRY_RUN)
printf("rm -f %s\n", pathname);
else
unlink_or_warn(pathname);

Shouldn't this be memcpy(..., 39) (i.e. including '\0')?

> Do you want me to take it after locally fixing it up, or do you
> prefer to feed this as part of msysgit related updates to me later?

The msysgit queue is quite long, and I think it makes sense to fast-track
this one.

karste...@dcon.de

unread,
Mar 6, 2012, 4:47:19 PM3/6/12
to Johannes Sixt, g...@vger.kernel.org, Johannes Schindelin, kusm...@gmail.com, msysGit, Pat Thoyts, Stefan Naewe
Johannes Sixt <j...@kdbg.org> wrote on 06.03.2012 19:32:41:
> Am 06.03.2012 10:18, schrieb karste...@dcon.de:
> > On some systems (e.g. Windows XP), directories cannot be deleted while
> > they're open. Both git-prune and git-repack (and thus, git-gc) try to
> > rmdir while holding a DIR* handle on the directory, leaving dangling
> > empty directories in the .git/objects store.
> >
> > Fix it by swapping the rmdir / closedir calls.
>
> The reasoning makes a lot of sense. I wonder why object directories are
> pruned nevertheless when I run git gc --prune (I run git master plus a
> few topics from pu).
>

You're right...in upstream git, closedir() is essentially a NOOP. The OS
handle is closed in readdir() on reading the last entry. The dirent.c
'improvements' that move FindFirstFile() to opendir() and FindClose() to
closedir() are in the msysgit fork only [1] (and entirely my fault, I
guess :-)).

So this patch actually isn't currently required for Windows XP (probably
other platforms, I don't know).

[1] https://github.com/msysgit/git/commit/e2b3f70b

Junio C Hamano

unread,
Mar 6, 2012, 4:49:57 PM3/6/12
to karste...@dcon.de, Junio C Hamano, g...@vger.kernel.org, Johannes Sixt, Johannes Schindelin, kusm...@gmail.com, msysGit, Pat Thoyts, Stefan Naewe
karste...@dcon.de writes:

> Junio C Hamano <git...@pobox.com> wrote on 06.03.2012 21:19:06:
>> Johannes Sixt <j...@kdbg.org> writes:
>>
>> >> diff --git a/builtin/prune-packed.c b/builtin/prune-packed.c
>> >> index f9463de..a834417 100644
>> >> --- a/builtin/prune-packed.c
>> >> +++ b/builtin/prune-packed.c
>> >> @@ -36,7 +36,6 @@ static void prune_dir(int i, DIR *dir, char
> *pathname,
>> >> int len, int opts)
>> >> display_progress(progress, i + 1);
>> >> }
>> >> pathname[len] = 0;
>> >> - rmdir(pathname);
>> >
>> > After moving the rmdir() away from prune_dir(), the truncation of the
>> > pathname does not logically belong here anymore. It should be moved
> with
>> > the rmdir(). Looks good otherwise.
>>
>> I agree that it is better to have the NUL termination close to where
>> it actually matters.
>>
>
> The pathname is extended in prune_dir, so I think it should be reset there
> as well; moving it to prune_packed_objects would be quite obscure:

This depends entirely on how you look at it.

You can certainly stare at the original code and declare that the
contract between the caller and the callee was that the caller gives
pathname[] and len (len+3 for the caller) to the callee, and allows
the callee to play with the rest of the pathname[] array but expects
that pathname[] to be properly NUL-terminated when the callee comes
back. From that point of view, "pathname[len] = 0" can belong to
the callee.

But while you are staring the original code, notice that "expects
that pathname[] to be NUL-terminated when the callee comes back" is
not something the caller even depends on. That expectation starts
to matter _only_ if you move rmdir(pathname) to the caller.

That is why I said "where it actually matters."

In other words, "The caller allows the callee to play with the rest
of the pathname[]; as long as the callee does not touch earlier
parts of the array, it can do anything before it returns", without
requiring the callee to NUL-terminate the pathname[] to restore to
its original state, is equally a valid contract between the caller
and the callee in the original code.

And that also holds true for the updated code that has rmdir() in
the caller.

> OT: While looking at the code I just stumbled across this immediately
> above the patch (prune-packed.c line 32ff):
>
> memcpy(pathname + len, de->d_name, 38);
> if (opts & DRY_RUN)
> printf("rm -f %s\n", pathname);
> else
> unlink_or_warn(pathname);
>
> Shouldn't this be memcpy(..., 39) (i.e. including '\0')?

That is not necessary, I think, as get_sha1_hex() does not look at
that NUL in the first place.

Junio C Hamano

unread,
Mar 6, 2012, 4:57:43 PM3/6/12
to karste...@dcon.de, g...@vger.kernel.org, Johannes Sixt, Johannes Schindelin, kusm...@gmail.com, msysGit, Pat Thoyts, Stefan Naewe
karste...@dcon.de writes:

> OT: While looking at the code I just stumbled across this immediately
> above the patch (prune-packed.c line 32ff):
>
> memcpy(pathname + len, de->d_name, 38);
> if (opts & DRY_RUN)
> printf("rm -f %s\n", pathname);
> else
> unlink_or_warn(pathname);
>
> Shouldn't this be memcpy(..., 39) (i.e. including '\0')?

I think the only thing that is guaranteeing that pathname[len+38] is
NUL is that we do not hop around repositories, so once we fill the
static char pathname[PATH_MAX] in prune_packed_objects(), nobody
writes to that location, because the length of the leading part
(i.e. "len" given to prune_dir()) will stay constant during the
lifetime of the process.

So it is not currently a problem, but it would be better to clear
that byte here.

Good eyes.

karste...@dcon.de

unread,
Mar 7, 2012, 5:50:34 AM3/7/12
to Junio C Hamano, g...@vger.kernel.org, Junio C Hamano, Johannes Sixt, Johannes Schindelin, kusm...@gmail.com, msysGit, Pat Thoyts, Stefan Naewe

Well, I just don't like that a function designed to prune a directory
modifies its input parameters as a side effect (it is bad enough that
prune_dir uses the caller's buffer at all). You know, trying to limit
side effects as a general programming principle.

In my opinion, it doesn't matter at all if the caller actually depends on
unmodified parameters or not, it just makes robust APIs that encourage
reuse.

Just my 2 cents, though, prune_packed_objects and prune_dir are so tightly
coupled that it probably doesn't make any difference...(on the other hand,
we have near identical code in prune.c, so thinking about reuse is not so
far fetched)

Reply all
Reply to author
Forward
0 new messages