> Adrian Buehlmann wrote (2010-08-28 21:11):
>> On 28.08.2010 13:32, Toshi MARUYAMA wrote:
>>> I have added the Mercurial-ja Google group as a recipient,
>>> because Japanese Mercurial users have been discussing the 0x5c and UTF-8
>>> issues in Mercurial and TortoiseHg for a long time.
>>> Adrian Buehlmann wrote (2010-08-28 18:02):
>>>> On 28.08.2010 10:08, Toshi MARUYAMA wrote:
>>>>> On Aug 28, 4:13 PM, Adrian Buehlmann<adr...@cadifra.com> wrote:
>>>>>> On 28.08.2010 04:11, Steve Borho wrote:
>>>>>>> On Fri, Aug 27, 2010 at 6:38 PM, Toshi MARUYAMA<marutosi...@gmail.com> wrote:
>>>>>>>> On Aug 24, 4:21, Toshi MARUYAMA<marutosi...@gmail.com> wrote:
>>>>>>>>> On Aug 24, 8:50 AM JST, Steve Borho<st...@borho.org> wrote:
>>>>>>>>>> The tip shellext code on thg#default already looks for thg.exe before
>>>>>>>>>> hgtk.exe, but the problems Adrian is talking about are much deeper in
>>>>>>>>>> the installer.
>>>>>>>>> I applied this logic to my win32 shellext. It tries, in order:
>>>>>>>>> 1. thg.exe with --listfileutf8 option.
>>>>>>>>> 2. hgtk.exe with --listfile option.
>>>>>>>>> 3. thg.cmd with --listfileutf8 option.
>>>>>>>>> I pushed normal changesets and MQ.
>>>>>>>>> Normal changeset: http://bitbucket.org/marutosi/tortoisehg/changeset/131f3d5caac3
>>>>>>>>> MQ: http://bitbucket.org/marutosi/tortoisehg-shellext-mq/changeset/6cd5ad...
>>>>>>>> I uploaded Windows shellext dlls (ThgShellx86.dll and
>>>>>>>> ThgShellx64.dll).
>>>>>>>> http://bitbucket.org/marutosi/tortoisehg/downloads
>>>>>>>> http://bitbucket.org/marutosi/tortoisehg/downloads/ThgShellx86.dll
>>>>>>>> http://bitbucket.org/marutosi/tortoisehg/downloads/ThgShellx64.dll
>>>>>>>> I don't have 64-bit Windows now, so I can't confirm that the 64-bit dll runs.
>>>>>>>> You can replace the existing dll with the new dll by following the
>>>>>>>> instructions at this link:
>>>>>>>> http://bitbucket.org/tortoisehg/stable/src/f0433be8925a/win32/shellex...
>>>>>>> Thanks, it will be good to get some exposure for them.
>>>>>>> Can you make a Wiki page for your repo that explains the differences
>>>>>>> that people should look for when they use your DLL?
>>>>>> Am I right in assuming that for this to have any noticeable advantage,
>>>>>> this would be for users affected by repositories containing filenames
>>>>>> that are utf-8 encoded and these users would have to enable the fixutf8
>>>>>> extension
>>>>>> http://mercurial.selenic.com/wiki/FixUtf8Extension
>>>>>> to use mercurial on such repositories, which contains things like
>>>>>> http://bitbucket.org/stefanrusek/hg-fixutf8/src/tip/cpmap.py
>>>>>> which, AFAICS is a 1.9 MB python file?
>>>>>> If yes, then:
>>>>>> Is the FixUtf8 extension going to be supported (or included) by TortoiseHg?
>>>>>> BTW, is that extension maintained?
>>>>> This shellext addresses not only fixutf8 support
>>>>> (issue #672, shell extension unicode support),
>>>>> but also the 0x5c (backslash) DAME-MOJI problem (issue #1241,
>>>>> Windows icon overlay of files which contain 0x5C (backslash)
>>>>> in the file name).
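
To illustrate the 0x5c problem mentioned above: in Shift-JIS (CP932), some double-byte characters have 0x5C as their second byte, so code that searches the raw bytes for a backslash cuts such a character in half. The following is a minimal Python sketch, not the shellext code; the path and the helper names are hypothetical.

```python
# Hypothetical illustration of the 0x5c (DAME-MOJI) problem; not the shellext code.
# In CP932 (Shift-JIS), U+8868 "表" is encoded as the two bytes 0x95 0x5C.

def split_bytes_naively(raw_path):
    """Split on the backslash byte without regard to multi-byte characters."""
    return raw_path.split(b"\\")

def split_after_decoding(raw_path, encoding="cp932"):
    """Decode to Unicode first, then split on the backslash character."""
    return raw_path.decode(encoding).split("\\")

raw = "dir\\表.txt".encode("cp932")      # b'dir\\\x95\\.txt'
naive_parts = split_bytes_naively(raw)   # the trail byte 0x5C is mistaken for a separator
safe_parts = split_after_decoding(raw)   # only the real separator splits the path
print(naive_parts)
print(safe_parts)
```

The byte-level split yields three fragments, one of which is half a character; the split on decoded text yields the two real path components.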
>>>> This doesn't really answer my question.
>>>> My question is: Do these users need to enable FixUtf8?
>>> No.
>>> If fixutf8 is not enabled, repository encoding is CP_ACP.
>>> In Japan, CP_ACP is CP932(shift-jis).
>>>> After all, you propose to insert code about utf-8 decoding into the
>>>> shell extension.
>>> My logic for reading .hg/dirstate and .hg/thgstatus:
>>> 1. First, try to decode the data as UTF-8.
>>> 2. If it is valid UTF-8, the repository encoding is UTF-8.
>>> 3. If it is not valid UTF-8, the repository encoding is CP_ACP (CP932 = shift-jis).
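
The three steps above can be sketched in Python as follows. This is only an illustration of the heuristic, not the shellext implementation (which is not Python); the function name is hypothetical, and `cp932` stands in for CP_ACP on the assumption of a Japanese Windows system.

```python
# Hypothetical sketch of the "UTF-8 first, then CP_ACP" heuristic; not the shellext code.
# "cp932" stands in for CP_ACP on a Japanese Windows system (an assumption).

def guess_decode(raw_name, fallback="cp932"):
    """Return (text, encoding_used) for a filename read as raw bytes."""
    try:
        # Steps 1 and 2: if the bytes are valid UTF-8, treat the repository as UTF-8.
        return raw_name.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # Step 3: otherwise assume the ANSI code page.
        return raw_name.decode(fallback), fallback

utf8_result = guess_decode("日本語.txt".encode("utf-8"))
sjis_result = guess_decode("日本語.txt".encode("cp932"))
print(utf8_result)
print(sjis_result)
```

The check can only distinguish "valid UTF-8" from "not valid UTF-8"; it carries no information about which non-UTF-8 encoding actually produced the bytes.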
>> I've seen that, so your method is just "trial and error".
>> You *assume* the repository's filenames are encoded in UTF-8 and then
>> just try to decode them as UTF-8 and recode them into UTF-16 for
>> internal use ("unicode").
>> If the decode function reports an error, you *assume* it must be
>> shift-jis instead.
>> What I don't understand is: why is this correct? Don't we need some
>> external info in what encoding the filenames are, if we read
>> .hg/dirstate and .hg/thgstatus?
>> IIRC when Stefan Rusek last time tried to improve the shell extension
>> for non-ASCII, he took the encoding info from somewhere (I think he
>> tried to write the name of the encoding into .hg/thgstatus and then have
>> the shell extension decode according to that name).
> I tried Stefan's work at
> http://bitbucket.org/tortoisehg/stable/issue/672/shell-extension-unic... .
> But it is not a solution.
> Fixutf8 sets encoding.encoding = 'utf8' at
> http://bitbucket.org/stefanrusek/hg-fixutf8/src/baf283ab9f92/fixutf8.... .
> This affects all repositories, regardless of whether fixutf8
> is activated for them.
> It is the same problem that Steve described in his mail "managing extensions":
> http://groups.google.com/group/thg-dev/browse_frm/thread/a5119a56a9c2...
>> Is "trial and error" really correct?
>> What happens if the filenames are in some other encoding? e.g. 'latin1'
>> with a 'ü'?
> As described in my previous post, this is incorrect.
>> And what is your logic to write .hg/thgstatus? In what encoding are the
>> filenames in .hg/thgstatus written?
> I don't touch the OverlayServer python code that writes .hg/thgstatus.
> It simply reads .hg/dirstate and writes .hg/thgstatus.
> Because the path separator in .hg/dirstate is '/', there is no 0x5c problem there.
> If fixutf8 is activated, the encoding of .hg/dirstate and .hg/thgstatus is utf-8.
> If fixutf8 is not activated, the encoding of .hg/dirstate and .hg/thgstatus is CP_ACP (shift-jis).
>> If win32mbcs is activated, in what encoding is .hg/thgstatus written?
>> shift-jis?
> It is shift-jis.
> Win32mbcs does not change the repository encoding.
> It simply hooks Mercurial functions.
> Regardless of whether win32mbcs is activated,
> the repository encoding is shift-jis in Japan and big5 in China (Taiwan).
> I am giving up on fixutf8 support in the shellext,
> and I have removed the assumption that filenames are utf-8.
> My current shellext only converts from CP_ACP to Unicode (wide char).
> This is compatible with the mainstream shellext,
> and it resolves the 0x5c problem.
> It is a big improvement for Japanese and Chinese (Taiwanese) thg users.
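
The CP_ACP to wide-char conversion described above corresponds roughly to decoding the bytes with the ANSI code page and keeping the result as UTF-16. The sketch below is a hypothetical Python illustration, not the shellext code; it assumes the ANSI code page is CP932, whereas on Windows itself Python's `mbcs` codec would follow the current ANSI code page.

```python
# Hypothetical sketch of a CP_ACP -> wide-char conversion; not the shellext code.
# Assumption: the ANSI code page is CP932, as on Japanese Windows.

def ansi_to_wide(raw_name, ansi_codepage="cp932"):
    """Decode ANSI-code-page bytes and re-encode as UTF-16LE (Windows wide chars)."""
    return raw_name.decode(ansi_codepage).encode("utf-16-le")

wide = ansi_to_wide(b"\x95\x5c.txt")   # CP932 bytes for "表.txt"
print(wide.decode("utf-16-le"))
```

In the wide-char form, the character whose CP932 encoding ended in 0x5C becomes a single UTF-16 code unit, so a later search for the backslash character no longer matches inside it.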
> I finished merging and resolving conflicts, and I pushed to my bitbucket repositories.
> Normal changeset:
> http://bitbucket.org/marutosi/tortoisehg/changeset/18a682ecdfb1
> MQ:
> http://bitbucket.org/marutosi/tortoisehg-shellext-mq/changeset/98513b...
> I uploaded Windows shellext dlls (ThgShellx86.dll and ThgShellx64.dll).
> http://bitbucket.org/marutosi/tortoisehg/downloads
> http://bitbucket.org/marutosi/tortoisehg/downloads/ThgShellx86.201009...
> http://bitbucket.org/marutosi/tortoisehg/downloads/ThgShellx64.201009...
> I don't have 64-bit Windows now, so I can't confirm that the 64-bit dll runs.
> You can replace the existing dll with the new dll by following the instructions at this link:
> http://bitbucket.org/tortoisehg/stable/src/386a21068b48/win32/shellex...
> I hope my shellext can be pulled into the thg mainstream.
I hope this is *not* pushed.