I think if someone really needs to work with Unicode file names, they should
use cygwin 1.7 instead. cygwin 1.7 supports a UTF-8 locale like *nix does. I
don't think we should hack mingw-runtime to do the charset conversion like
cygwin does. Since ...
1. There is too much work to do.
2. It might reduce performance.
3. We would be creating another cygwin! noooo ...
Another approach is to rewrite Git using the native Windows API and do the
charset conversion in Git itself. I think this is the ideal solution, but who
will do this ...
KJackie: your comment would have been welcome on the mailing list, but not on
the issue tracker, where we only want comments that add substantial
information regarding the issue at hand.
Issue 508 has been merged into this issue.
I've tried to fix this problem, and now it seems to work well.
I sent the patch to the mob branch. The commit id is
83d90724607aec6b479e5e5574d39ef68ed2285d
Thank you so much Takuya! I really hope the patch gets accepted.
Oh wow, we're using the issue tracker as a replacement for the mailing list
now.
And nobody told me.
What about takuya.murakami's fix?
git checkout 83d90724607aec6b479e5e5574d39ef68ed2285d works, so the commit
exists, but it is not in the mob branch.
As posted on the mailing list 23 Sep 2010:
"I have reset the git mob branch in 4msysgit.git to devel.
There were 2 pieces of outstanding work on that branch and I've moved
them to branches as follows:
work/bm/file_copy_content  Bastian Moldenhauer: copy content of config
                           file into old - 1 patch
work/tm/utf8               Takuya Murakami: utf8 support (2 patches and a merge)
The tm/utf8 one doesn't apply to 'devel' but I didn't look very hard.
There seem to be a few work/*/utf8 branches around."
Guys, I think this is one of the biggest issues, so please change its
priority to High!
That requires someone to prioritize actually WORKING on it, not just
talking about it. Are you volunteering?
Did anyone see this patch:
http://github.com/tmurakam/4msysgit-utf8-filepath/commit/64f15332c154a067911e7730b5e5529a37b41cf3.
I tried Git-1.7.0.2-utf8-20100725.exe from http://tmurakam.org/git, it did
fix this problem.
It seems there are quite a few people who are keen to see this issue
resolved. It also seems that a patch is available, and there are some
isolated reports of it working for certain people.
Kusmabite - can you (or anyone else) elaborate on what further work needs
to be done? How can people who want to see this issue fixed help move it
towards the official release?
The patch(es) should be submitted to the msysGit mailing list for
discussion. You can forward patches from other people if you're willing to
polish them until they are ready for inclusion (as long as you have a
sign-off from the original author).
Read Documentation/SubmittingPatches for details on the submission process.
Such a series should probably be sent to the msysGit mailing list rather
than the Git mailing list, though.
The last time a re-unified patch series on this topic was submitted to the
mailing list was almost a month ago, by Karsten Blees:
http://groups.google.com/group/msysgit/browse_thread/thread/e7887444ec8f4cf5
It got no reply at all!
> It seems there are quite a few people who are keen to see this
> issue resolved.
> [...]
> Last time, re-unified patch series on this topic was submitted [...]
> without any reply at all!
Something does not quite compute: If it is really true that quite a few
people are keen to see this issue resolved, how come they do not bother to
review the patch series, let alone comment on it? If there are really
interested parties, I would expect them to take care.
If people are not investing time and effort in seeing an issue getting
resolved, they are clearly not interested enough in that particular issue
(it is no excuse if those people are not able to review patches; they could
make the issue interesting to those who can; if they don't, they are really
not interested enough in seeing the issue getting resolved).
I'm taking care - will report soon about testing it under Wine and real
Windows.
> Something does not quite compute: If it is really true that quite a few
> people are keen to see this issue resolved, how come they do not bother to
> review the patch series, let alone comment on it? If there are really
> interested parties, I would expect them to take care.
Johannes, perhaps it's unrealistic to expect that people must contribute
directly to this project to prove their interest or commitment. I'm sure
the average software person uses dozens of software tools, and it is
unrealistic to expect them to climb the learning curve of every one of them
in order to contribute. They may well give back to the OSS community, but on
the other projects they are focusing their effort on.
Craig, I would like to keep personal attacks out of this issue tracker.
Please understand that. If you think that my comment was personal, consider
this: when I talked about the people not caring enough, I included myself,
and every other developer who is on the mailing list and did not review the
patch series.
So let's keep things professional. In that spirit, I will delete both your
comment and mine.
I would like to help where I can, but my C skills are pretty weak and I
don't think I am able to review patches beyond just applying them and
seeing if they work for me.
Johannes, I'm not sure I understand you. What do you mean when you
say "make the issue interesting to those who can"?
If you want to test my patch series, please note that Google Groups has
split it into two threads. The second thread
(http://groups.google.com/group/msysgit/browse_thread/thread/d4414235850ce181)
also contains a version rebased to the current v1.7.3.2 (as a git bundle).
My suggestion:
Add a git repository configuration variable that specifies the filename
encoding used in the repo. Future clients that obey this setting could then
convert during checkin/checkout/log, while old clients would still work
as-is. That way existing repos won't break, instead of assuming everyone is
already using UTF-8.
core.pathencoding was suggested as the configuration variable name in a
patch to EGit.
Needless to say, we have the same problem, and we assume UTF-8 as the default.
However, if you are on Windows, you'll get into trouble because C Git
doesn't recognize a UTF-8 encoded filename as being UTF-8.
The premise behind the UTF-8 strategy in JGit/EGit is that, if it smells
like UTF-8, we treat it like UTF-8, else fall back to the locale encoding (or
ISO-8859-1 if that doesn't work either). I believe we are not done with that,
because Java on Mac thinks the 8-bit encoding is MacRoman.
Even if heuristics could work for most people, some would need the
configuration variable anyway.
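To make the "smells like UTF-8" heuristic concrete, here is a minimal sketch
in Python. It is not the JGit/EGit code; the function name decode_path and
its fallback parameter are made up for illustration, and the only thing taken
from the comment above is the order of attempts: UTF-8 first, then the locale
encoding, then ISO-8859-1.

```python
import locale
from typing import Optional


def decode_path(raw: bytes, fallback: Optional[str] = None) -> str:
    """Decode a stored path name (hypothetical helper, not a real Git API)."""
    # Step 1: if the bytes are valid UTF-8, treat them as UTF-8.
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        pass
    # Step 2: otherwise try the locale encoding (or an explicit override).
    encoding = fallback or locale.getpreferredencoding(False)
    try:
        return raw.decode(encoding)
    except (UnicodeDecodeError, LookupError):
        # Step 3: ISO-8859-1 maps every byte to a character, so it cannot fail.
        return raw.decode("iso-8859-1")


utf8_name = decode_path("é.txt".encode("utf-8"))
latin1_name = decode_path(b"\xe9.txt", fallback="iso-8859-1")
# Bytes that were written as ISO-8859-1 but happen to be valid UTF-8:
misread = decode_path("Ã©".encode("iso-8859-1"))
```

The last call shows why a heuristic alone cannot cover everyone: a name
written as ISO-8859-1 that happens to be valid UTF-8 is silently read as
UTF-8.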
core.pathencoding requires an incompatible change of the repository format,
as the pathencoding would have to be stored along with every single file
name in the repo (as i18n.commitencoding does for every single commit
message by means of the encoding header).
Consider userA with core.pathencoding=A adds fileA, and userB with
core.pathencoding=B adds fileB; now you have two filenames with different
encodings in the same tree object. You'll never get that sorted out with a
configuration variable alone.
So, let's just stick with UTF-8 and fix the few platforms that don't
support it yet.
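A small sketch of the userA/userB scenario above, in Python. The two
encodings (ISO-8859-1 for A, UTF-8 for B) and the list standing in for a tree
object are assumptions chosen for illustration; nothing here is real Git
code.

```python
# The same visible file name, stored as raw bytes by two users whose
# (hypothetical) core.pathencoding settings differ.
name_a = "é.txt".encode("iso-8859-1")  # userA
name_b = "é.txt".encode("utf-8")       # userB

# A tree object only holds the bytes; it does not record which encoding
# each entry was written in.
tree = [name_a, name_b]


def decode_tree(names, encoding):
    """Decode every entry with one repo-wide setting, as a single config
    variable would force a client to do."""
    return [n.decode(encoding, errors="replace") for n in names]


as_latin1 = decode_tree(tree, "iso-8859-1")
as_utf8 = decode_tree(tree, "utf-8")
```

With ISO-8859-1 as the setting, userB's entry comes out as mojibake; with
UTF-8, userA's entry contains a replacement character. Neither setting reads
both names correctly.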
The problem I have is not with files inside the repository
but with the path which the repository is inside,
like this:
c:\users\[my username in unicode]\documents\myrepository <- this does not
work
Just a little note to all those who missed it: a patch series is already
being discussed on the mailing list, where your review and your help are
needed.
http://groups.google.com/group/msysgit/browse_thread/thread/d4414235850ce181/95bfcc1718fd3f1e?lnk=gst&q=blees#95bfcc1718fd3f1e
Thanks Johannes. I pasted that link into an issue for JGit relating to
this, i.e. https://bugs.eclipse.org/bugs/process_bug.cgi
What happened to that thread? Did it go somewhere else?
Some work has been done in
http://repo.or.cz/w/git/mingw/4msysgit.git/shortlog/refs/heads/kb/unicode
I found that pagination with less has a problem, because less doesn't
process a utf-16 input stream correctly.
Also, GUIs like tortoisegit must be updated to work with utf-8 std streams
and wide char command line parameters instead of ascii.
When can this problem be resolved?
Karsten Blees periodically makes available the results of ongoing efforts
to solve this bug. The latest (to this date) patch series and a ready-made
installer are available at
http://groups.google.com/group/msysgit/browse_thread/thread/40112decdc564117
Do your testing and report any problems you discover to the mailing list.
Watch the mailing list for mails having "Issue 80" in their subject and
test what's being offered.
P.S.
Note that this bug tracker is officially closed and you're supposed to
discuss any issues on the mailing list instead.
The URL is
https://groups.google.com/group/msysgit/browse_thread/thread/40112decdc564117
Comment #81 on issue 80 by kusmab...@gmail.com: git-clone fails when repo
contains UTF-8 filepath
http://code.google.com/p/msysgit/issues/detail?id=80
This issue has been fixed in the source repository, and the fix will be
included in the next release of Git for Windows.