alpha quality installers


Steve Borho

Aug 22, 2010, 10:56:21 PM
to thg...@googlegroups.com
I've uploaded the very first PyQt4 based TortoiseHg installers to:

http://bitbucket.org/tortoisehg/thg/downloads

* It installs into PF\TortoiseHg2, so it does not conflict, folder- or
component-wise, with hgtk
* However, it registers the same shell extension DLL, which seems
to be mostly harmless
* kdiff3 is still the old statically linked Qt3 version; does anyone have a
dynamically linked version?
* There is a lot of junk in the base install folder. It is my intent
to _not_ add this folder to the system PATH anymore. Instead, I'll
make a bin/ folder and place the minimum files we need in there
(thg.cmd, hg.cmd, TortoisePlink.exe, Pageant.exe)
* All the dialogs seem to work, including Workbench, but SVGs are (1)
not packaged and (2) not used, even if you copy them into place. I haven't
figured out why yet.

This is kind of a milestone: we finally have a functioning thg.exe
installer. But these installers are not ready for general use.

--
Steve Borho

Steve Borho

Aug 23, 2010, 12:15:04 AM
to thg...@googlegroups.com
On Sun, Aug 22, 2010 at 9:56 PM, Steve Borho <st...@borho.org> wrote:
> I've uploaded the very first PyQt4 based TortoiseHg installers to:
>
> http://bitbucket.org/tortoisehg/thg/downloads
>
> * It installs into PF\TortoiseHg2, so it does not conflict, folder or
> component-wise with hgtk
> * However it is registering the same shell extension DLL, which seems
> to be mostly harmless
> * kdiff3 is still the old statically linked Qt3 version; anyone have a
> dynamically linked version?
> * there is a lot of junk in the base install folder.  It is my intent
> to _not_ add this folder into the system PATH anymore.  Instead, I'll
> make a bin/ folder and place the minimum files we need to be in there
> (thg.cmd, hg.cmd, TortoisePlink.exe, Pageant.exe)

This is now done.

> * All the dialogs seem to work, including Workbench, but svgs are 1 -
> not packaged 2- not used, even if you copy them into place.  I haven't
> figured out why yet.

There were two reasons for this; I'm working through them now.

I'll post a couple new installers when I'm done for the evening.

--
Steve Borho

Adrian Buehlmann

Aug 23, 2010, 2:25:12 AM
to thg...@googlegroups.com

Changing the ProductUpgradeCode (win32/wix/guids.wxi) is a very bad idea.

Yuya Nishihara

Aug 23, 2010, 10:27:24 AM
to thg...@googlegroups.com
Steve Borho wrote:
> I've uploaded the very first PyQt4 based TortoiseHg installers to:
>
> http://bitbucket.org/tortoisehg/thg/downloads

Wow, thanks!

> * It installs into PF\TortoiseHg2, so it does not conflict, folder or
> component-wise with hgtk
> * However it is registering the same shell extension DLL, which seems
> to be mostly harmless
> * kdiff3 is still the old statically linked Qt3 version; anyone have a
> dynamically linked version?

I uploaded dynamically-linked kdiff3 to bitbucket:
http://bitbucket.org/yuja/kdiff3/downloads?highlight=10639

It needs the codec plugins (codecs/qXXcodecs4.dll) to support CJK encodings.

Build procedure:
1. Install qt-win-opensource-4.6.2-mingw.exe
2. Start "Qt by Nokia 4.6.2 (OpenSource)\Qt 4.6.2 Command Prompt"
3. cd to kdiff3\src-QT4
4. qmake
5. make release

Yuya,

Steve Borho

Aug 23, 2010, 11:06:29 AM
to thg...@googlegroups.com

I thought this was mandatory, for a 2.0 product, to allow the two
packages to both be installed. Why is this wrong?

--
Steve Borho

Adrian Buehlmann

Aug 23, 2010, 12:01:18 PM
to thg...@googlegroups.com

If you need to have both installed, then yes, this is mandatory. But
then you need to follow the Windows Installer rules and not use the
same component key path as any other component from any other product.

As I understand Windows Installer matters, you can't share the shell
extension like you do with this now (without violating fundamental
Windows Installer rules).

For example, the 64 bit shell extension ThgShellx64.dll is installed in
C:\Program Files\Common Files\TortoiseHg by two different products now,
which is a conflict of key paths of two different components from two
different products (one is thgshellx64dll.guid with component GUID
{59FD2A49-BA62-40CC-B155-D11DB11EE611} and the other with
{D5D1E532-CDAD-4FFD-9695-757B8A29B4BA}).

And I bet the registry keys (as windows installer key paths) are
conflicting as well (in installer terms).

You need to solve that problem first (shell extension sharing), if you
want to allow parallel installs (1.x and 2.0).

But this isn't solved at all now, so all you can do with the current
code base is treat 2.0 as the same product (i.e. keep the same
ProductUpgradeCode) so that 1.x is properly uninstalled when
installing 2.0.

With the current alpha installer, you in fact damage the current 1.x
installs by leaving a mess lurking in the Windows Installer database on
users' PCs.

As a side note, I don't see an urgent need for parallel installs. It
might be nice to have, but there is no free (Windows Installer) lunch.

As a variant, you might consider not installing the shell extension for
your alpha 2.0 installer now. But I'm not sure if that's very
attractive. And it most certainly wouldn't be an option for the final
2.0 release anyway.

Steve Borho

Aug 23, 2010, 12:06:23 PM
to thg...@googlegroups.com

I was/am considering this option. I may do this before I announce any
2.0 installers to the public.

--
Steve Borho

Adrian Buehlmann

Aug 23, 2010, 12:17:38 PM
to thg...@googlegroups.com

Please remove the current installer from the downloads; it is damaging
current 1.x installs, as I laid out above.

Adrian Buehlmann

Aug 23, 2010, 4:28:13 PM
to thg...@googlegroups.com

Just thought about this a bit more.

If we really wanted to allow a 1.x and a 2.0 install in parallel, I
think we would have to fix the shell extension contention problem in the
1.x series first (switching to a sharable shellext install
subcomponent) and release that in a 1.y (with y being some future
version).

Then this 1.y release would be "parallel installable" with the
prospective 2.0.

I don't think it will be possible to create a 2.0 that will cleanly
install in parallel with any already released 1.x (provided that 2.0
contains a shell extension).

Creating such a shell extension subcomponent would be an art in itself.
I guess this would mean someone having to enter the new land of creating
a Windows Installer merge module.

Yikes.

(Sorry folks for raining on your parade)

Toshi MARUYAMA

Aug 23, 2010, 7:31:34 PM
to TortoiseHg Developers
I defined "thg" as a macro:
http://bitbucket.org/marutosi/tortoisehg/src/d5cbba337b29/win32/shellext/CShellExtCMenu.cpp#cl-20

I plan to switch between hgtk and thg with an environment variable
or a registry key.

Steve Borho

Aug 23, 2010, 7:50:39 PM
to thg...@googlegroups.com

The tip shellext code on thg#default already looks for thg.exe before
hgtk.exe, but the problems Adrian is talking about are much deeper in
the installer.

Thinking this through, I'm going to drop side-by-side installer
support. There's no clean way, that I can think of, to finesse this
interim period. So I'm going to switch the default branch back to the
same product code and install folder as the stable branch so it will
simply replace an hgtk based install.

We don't have the resources to thread this needle, so I'm not going to try.

This reminds me. Yuya; if you still have that patch that makes
thg.exe [noargs] open the workbench, please post it again. Now is a
good time to change that.

--
Steve Borho

Steve Borho

Aug 23, 2010, 10:27:57 PM
to thg...@googlegroups.com

This was surprisingly painless. I'm uploading new installers that
upgrade hgtk based installers. The shell extensions work and launch
thg.exe as expected.

Visual Diffs will be broken because kdiff3.exe is no longer in the
system PATH. Once we have a newer Qt4 based kdiff3.exe in our
installer, I plan to add a registry key for its install location so
our MergeTools.rc file can find it without it being in the PATH.

If anyone finds any problems, please start a new email thread.

--
Steve Borho

Toshi MARUYAMA

Aug 24, 2010, 3:21:18 AM
to TortoiseHg Developers

On Aug 24, 8:50 AM JST, Steve Borho <st...@borho.org> wrote:
>
> The tip shellext code on thg#default already looks for thg.exe before
> hgtk.exe, but the problems Adrian is talking about are much deeper in
> the installer.
>

I reflected this logic in my win32 shellext, which tries the following in order:

1. thg.exe with --listfileutf8 option.
2. hgtk.exe with --listfile option.
3. thg.cmd with --listfileutf8 option.
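
The lookup order above could be sketched roughly as follows. This is only a Python illustration (the real shell extension is C++); `pick_launcher` and its `install_dir` argument are hypothetical names, and only the three program/option pairs come from the list above.

```python
import os
import tempfile

# (program, option) pairs in the order listed above.
CANDIDATES = [
    ("thg.exe", "--listfileutf8"),
    ("hgtk.exe", "--listfile"),
    ("thg.cmd", "--listfileutf8"),
]


def pick_launcher(install_dir):
    """Return (full_path, option) for the first candidate that exists, else None."""
    for program, option in CANDIDATES:
        path = os.path.join(install_dir, program)
        if os.path.isfile(path):
            return path, option
    return None


# Usage: a fake install folder that only contains hgtk.exe and thg.cmd.
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ("hgtk.exe", "thg.cmd"):
        open(os.path.join(demo_dir, name), "w").close()
    chosen = pick_launcher(demo_dir)
    chosen_name = os.path.basename(chosen[0])
    chosen_option = chosen[1]
```

With only hgtk.exe and thg.cmd present, the sketch picks hgtk.exe with --listfile, because it comes before thg.cmd in the list.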

I pushed both normal changesets and an MQ patch queue.

Normal changeset
http://bitbucket.org/marutosi/tortoisehg/changeset/131f3d5caac3

MQ
http://bitbucket.org/marutosi/tortoisehg-shellext-mq/changeset/6cd5adcccf94

Yuya Nishihara

Aug 24, 2010, 10:55:54 AM
to thg...@googlegroups.com
# HG changeset patch
# User Yuya Nishihara <yu...@tcha.org>
# Date 1282661448 -32400
# Node ID 07c39542b9f6cc7f02302e4ad51276478126784b
# Parent bee27d5b8d1896382d73224c6b290789f93c010c
run: run workbench if no command specified

diff --git a/tortoisehg/hgqt/run.py b/tortoisehg/hgqt/run.py
--- a/tortoisehg/hgqt/run.py
+++ b/tortoisehg/hgqt/run.py
@@ -182,17 +182,15 @@ def _parse(ui, args):

     if args:
         alias, args = args[0], args[1:]
-        aliases, i = cmdutil.findcmd(alias, table, ui.config("ui", "strict"))
-        for a in aliases:
-            if a.startswith(alias):
-                alias = a
-                break
-        cmd = aliases[0]
-        c = list(i[1])
     else:
-        alias = None
-        cmd = None
-        c = []
+        alias, args = 'workbench', []
+    aliases, i = cmdutil.findcmd(alias, table, ui.config("ui", "strict"))
+    for a in aliases:
+        if a.startswith(alias):
+            alias = a
+            break
+    cmd = aliases[0]
+    c = list(i[1])

     # combine global options into local
     for o in globalopts:
@@ -251,8 +249,6 @@ def runcommand(ui, args):

     if options['help']:
         return help_(ui, cmd)
-    elif not cmd:
-        return help_(ui, 'shortlist')

     path = options['repository']
     if path:
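
The idea of the patch (fall back to a default command when none is given, then resolve the name as usual) can be illustrated with a small standalone sketch. It does not use Mercurial's `cmdutil.findcmd`; `resolve_command`, the `table` dictionary, and the prefix-matching rule are simplified assumptions for illustration only.

```python
def resolve_command(args, table, default="workbench"):
    """Return (command, remaining_args), using `default` when args is empty."""
    if args:
        alias, rest = args[0], list(args[1:])
    else:
        alias, rest = default, []
    if alias in table:
        return alias, rest
    # Accept an unambiguous prefix such as "com" for "commit".
    matches = [name for name in table if name.startswith(alias)]
    if len(matches) == 1:
        return matches[0], rest
    raise ValueError("unknown or ambiguous command: %r" % alias)


# Hypothetical command table; the values would normally be the handlers.
table = {"workbench": None, "commit": None, "clone": None}
no_args = resolve_command([], table)
with_args = resolve_command(["com", "-v"], table)
```

Because the default is substituted before the lookup, the empty-argument case no longer needs a separate help branch later on.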

Toshi MARUYAMA

Aug 27, 2010, 7:38:39 PM
to TortoiseHg Developers
> MQ http://bitbucket.org/marutosi/tortoisehg-shellext-mq/changeset/6cd5ad...

I uploaded Windows shellext dlls (ThgShellx86.dll and
ThgShellx64.dll).

http://bitbucket.org/marutosi/tortoisehg/downloads
http://bitbucket.org/marutosi/tortoisehg/downloads/ThgShellx86.dll
http://bitbucket.org/marutosi/tortoisehg/downloads/ThgShellx64.dll

I don't have 64-bit Windows at the moment, so I can't confirm that the
64-bit DLL runs. You can replace the existing DLL with the new one by
following the instructions at this link:
http://bitbucket.org/tortoisehg/stable/src/f0433be8925a/win32/shellext/README.txt#cl-172

Steve Borho

Aug 27, 2010, 10:11:24 PM
to thg...@googlegroups.com

Thanks, it will be good to get some exposure for them.

Can you make a Wiki page for your repo that explains the differences
that people should look for when they use your DLL?

--
Steve Borho

Adrian Buehlmann

Aug 28, 2010, 3:13:26 AM
to thg...@googlegroups.com

Am I right in assuming that for this to have any noticeable advantage,
this would be for users affected by repositories containing filenames
that are utf-8 encoded and these users would have to enable the fixutf8
extension

http://mercurial.selenic.com/wiki/FixUtf8Extension

to use mercurial on such repositories, which contains things like

http://bitbucket.org/stefanrusek/hg-fixutf8/src/tip/cpmap.py

which, AFAICS is a 1.9 MB python file?


If yes, then:

Is the FixUtf8 extension going to be supported (or included) by TortoiseHg?

BTW, is that extension maintained?

Toshi MARUYAMA

Aug 28, 2010, 4:08:06 AM
to TortoiseHg Developers

On Aug 28, 4:13 PM, Adrian Buehlmann <adr...@cadifra.com> wrote:
> On 28.08.2010 04:11, Steve Borho wrote:
>
> > On Fri, Aug 27, 2010 at 6:38 PM, Toshi MARUYAMA <marutosi...@gmail.com> wrote:
>
> >> On Aug 24, 4:21, Toshi MARUYAMA <marutosi...@gmail.com> wrote:
> >>> On Aug 24, 8:50 AM JST, Steve Borho <st...@borho.org> wrote:
>
> >>>> The tip shellext code on thg#default already looks for thg.exe before
> >>>> hgtk.exe, but the problems Adrian is talking about are much deeper in
> >>>> the installer.
>
> >>> I reflected this logic to my win32 shellext.
>
> >>> 1. thg.exe  with --listfileutf8 option.
> >>> 2. hgtk.exe with --listfile     option.
> >>> 3. thg.cmd  with --listfileutf8 option.
>
> >>> I pushed normal changesets and MQ.
>
> >>> Normal changeset http://bitbucket.org/marutosi/tortoisehg/changeset/131f3d5caac3
>
> >>> MQ http://bitbucket.org/marutosi/tortoisehg-shellext-mq/changeset/6cd5ad...
>
> >> I uploaded Windows shellext dlls (ThgShellx86.dll and
> >> ThgShellx64.dll).
>
> >> http://bitbucket.org/marutosi/tortoisehg/downloads
> >> http://bitbucket.org/marutosi/tortoisehg/downloads/ThgShellx86.dll
> >> http://bitbucket.org/marutosi/tortoisehg/downloads/ThgShellx64.dll
>
> >> I don't have 64bit Windows now, so I can't confirm to run 64bit dll.
> >> You can replace existing dll to new dll by the way of the following
> >> link.
> >> http://bitbucket.org/tortoisehg/stable/src/f0433be8925a/win32/shellex...
>
> > Thanks, it will be good to get some exposure for them.
>
> > Can you make a Wiki page for your repo that explains the differences
> > that people should look for when they use your DLL?
>
> Am I right in assuming that for this to have any noticeable advantage,
> this would be for users affected by repositories containing filenames
> that are utf-8 encoded and these users would have to enable the fixutf8
> extension
>
>    http://mercurial.selenic.com/wiki/FixUtf8Extension
>
> to use mercurial on such repositories, which contains things like
>
> http://bitbucket.org/stefanrusek/hg-fixutf8/src/tip/cpmap.py
>
> which, AFAICS is a 1.9 MB python file?
>
> If yes, then:
>
> Is the FixUtf8 extension going to be supported (or included) by TortoiseHg?
>
> BTW, is that extension maintained?

This shellext addresses not only fixutf8 support
(issue #672, shell extension unicode support),
but also the 0x5c (backslash) DAME-MOJI problem (issue #1241,
Windows icon overlay of files which contain 0x5C (backslash)
in the file name).
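
For readers unfamiliar with the 0x5c problem, here is a minimal Python sketch of how it arises. It is not the shell extension code; the file name is just an example, chosen because "表" encodes in CP932 (Shift-JIS) to a byte pair whose second byte is 0x5C, the same value as an ASCII backslash.

```python
# "表" is a commonly cited example: in CP932 (Shift-JIS) it encodes to
# the two bytes 0x95 0x5C, and 0x5C is also the ASCII backslash.
name = "表.txt"
raw = name.encode("cp932")

# Byte-wise handling mistakes the trail byte for a path separator.
naive_parts = raw.split(b"\\")

# Decoding to Unicode first keeps the character intact.
safe_parts = raw.decode("cp932").split("\\")
```

Here `naive_parts` has two pieces even though the name contains no real path separator, while `safe_parts` keeps the single file name.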

In Japan (shift-jis) and China (big5, used mainly in Taiwan),
the 0x5c problem is a serious problem.
Because of this problem, Mercurial has lost many users,
who have moved to Bazaar.

Adrian Buehlmann

Aug 28, 2010, 5:02:29 AM
to thg...@googlegroups.com

This doesn't really answer my question.

My question is: Do these users need to enable FixUtf8?

After all, you propose to insert code about utf-8 decoding into the
shell extension.

> In Japan(shift-jis) and China(big5 used in Taiwan mainly),
> 0x5c problem is serious problem.

Certainly.

But lots of users use Mercurial to track their program sources, and they
can live just fine with using ASCII *filenames* for their sources.
English filenames for program sources is pretty much standard around the
world.

In fact, I believe that majority of users wouldn't be pleased to be
negatively affected if we introduce bugs or problems here. That's why I
even care to reply at all.

> Because of this problem, Mercurial losts many users
> and they move to Bazaar.

I doubt that. But there's no need to go on a "religious" tangent.

Mercurial has a lot of users on Windows (see the download numbers)
that's why the TortoiseHg project should care about maintenance and
stability. Which is the whole point of my recent replies to your proposals.

Personally, I doubt Bazaar is a better DVCS. But everyone is entitled to
his own opinion. And it is a bit far fetched to implicitly blame the
shell extension for such a claim.

BTW, try a 'bzr check' of the bzr repo itself and see how horribly slow
that is. Checking a repo for integrity is one of the basics of every
DVCS. If it is slow as hell no one will use it.

In my book, the only serious co-contender to hg is git (and maybe
Veracity in the future, but that depends).

Toshi MARUYAMA

Aug 28, 2010, 7:32:12 AM
to thg...@googlegroups.com, mercur...@googlegroups.com
I have added the Mercurial-ja Google group as a recipient,
because Japanese Mercurial users have been discussing the 0x5c and utf-8
issues of Mercurial and TortoiseHg for a long time.

No.
If fixutf8 is not enabled, repository encoding is CP_ACP.
In Japan, CP_ACP is CP932(shift-jis).

>
> After all, you propose to insert code about utf-8 decoding into the
> shell extension.
>

My logic to read .hg/dirstate and .hg/thgstatus
1. Firstly read by UTF-8.
2. If UTF-8 is valid, repository encoding is UTF-8.
3. If UTF-8 is invalid, repository encoding is CP_ACP(CP932=shitf-jis).
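
The three steps above can be sketched in Python as follows. This is an illustration only, not the shell extension code; `decode_name` is a hypothetical helper, and the `fallback` parameter stands in for CP_ACP (assumed to be CP932 here).

```python
def decode_name(raw, fallback="cp932"):
    """Try UTF-8 first; if the bytes are not valid UTF-8, use `fallback`."""
    try:
        return raw.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        return raw.decode(fallback), fallback


utf8_result = decode_name("表.txt".encode("utf-8"))
cp932_result = decode_name("表.txt".encode("cp932"))
```

Both calls recover the same name, but the second one only does so because the CP932 bytes happen to be invalid UTF-8, so the approach is a heuristic and not a guarantee.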


>> In Japan(shift-jis) and China(big5 used in Taiwan mainly),
>> 0x5c problem is serious problem.
>
> Certainly.
>
> But lots of users use Mercurial to track their program sources, and they
> can live just fine with using ASCII *filenames* for their sources.
> English filenames for program sources is pretty much standard around the
> world.
>

Yes.
As long as they use ASCII *filenames* for their sources,
git and Mercurial have no problem.

> In fact, I believe that majority of users wouldn't be pleased to be
> negatively affected if we introduce bugs or problems here. That's why I
> even care to reply at all.
>
>> Because of this problem, Mercurial losts many users
>> and they move to Bazaar.
>
> I doubt that. But there's no need to go on a "religious" tangent.
>
> Mercurial has a lot of users on Windows (see the download numbers)
> that's why the TortoiseHg project should care about maintenance and
> stability. Which is the whole point of my recent replies to your proposals.
>
> Personally, I doubt Bazaar is a better DVCS. But everyone is entitled to
> his own opinion. And it is a bit far fetched to implicitly blame the
> shell extension for such a claim.
>
> BTW, try a 'bzr check' of the bzr repo itself and see how horribly slow
> that is. Checking a repo for integrity is one of the basics of every
> DVCS. If it is slow as hell no one will use it.
>
> In my book, the only serious co-contender to hg is git (and maybe
> Veracity in the future, but that depends).

In Japan, Subversion and TortoiseSVN are widely used in enterprises.
In enterprises, SVN is used not only for program sources but also for documents.
Because SVN treats filenames as Unicode,
SVN and TortoiseSVN have no 0x5c problem.
Mercurial resolves the 0x5c problem with the win32mbcs extension,
but the TortoiseHg Windows shellext still has the 0x5c problem.
So Japanese Mercurial and TortoiseHg users cannot recommend
switching from TortoiseSVN to TortoiseHg,
and they move to Bazaar and TortoiseBzr instead.

Adrian Buehlmann

Aug 28, 2010, 8:11:05 AM
to thg...@googlegroups.com, Toshi MARUYAMA, mercur...@googlegroups.com

I've seen that, so your method is just "try and error".

You *assume* the repository's filenames are encoded in UTF-8 and then
just try to decode them as UTF-8 and recode them into UTF-16 for
internal use ("unicode").

If the decode function reports an error, you *assume* it must be
shitf-jis instead.

What I don't understand is: why is this correct? Don't we need some
external info in what encoding the filenames are, if we read
.hg/dirstate and .hg/thgstatus?

IIRC when Stefan Rusek last time tried to improve the shell extension
for non-ASCII, he took the encoding info from somewhere (I think he
tried to write the name of the encoding into .hg/thgstatus and then have
the shell extension decode according to that name).

Is "try and error" really correct?

What happens if the filenames are in some other encoding? e.g. 'latin1'
with a 'ü'?

And what is your logic to write .hg/thgstatus? In what encoding are the
filenames in .hg/thgstatus written?

If win32mbcs is activated, in what encoding is .hg/thgstatus written?
shitf-jis?

Adrian Buehlmann

Aug 28, 2010, 8:34:58 AM
to thg...@googlegroups.com, Toshi MARUYAMA
On 28.08.2010 13:32, Toshi MARUYAMA wrote:
> I add Mercurial-ja google group for post target,

That doesn't seem to work. I left your added cc in my previous reply and
got the message I pasted at the bottom back (subject: Delivery Status
Notification (Failure)).

I don't know what that means (I don't speak and can't read Japanese).

I guess I can't post to Mercurial-ja because I'm not subscribed (and I
have no intention to do so).

> because Japanese Mercurial users discuss 0x5c and utf-8
> of Mercurial and TortoiseHg for a long time.
>

I'm not subscribed to Mercurial-ja so I can't comment on what's
discussed there. I didn't even know that group existed.

I'd recommend that people who want to improve Mercurial or TortoiseHg
discuss things on Merc...@selenic.com or
Tortoiseh...@lists.sourceforge.net (or the respective developer
lists) instead.

On 28.08.2010 14:11, Mail Delivery Subsystem wrote:
> adr...@cadifra.com
>
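> The group you tried to contact (mercurial-ja) may not exist, or you may
> not have permission to post messages to it. Here are some possible
> reasons why your message could not be posted:
>
> * The group name is misspelled or incorrectly formatted.
> * The group was deleted by its owner.
> * You need to join the group to get permission to post.
> * The group does not accept posts.
>
> If you have questions about this group or other Google Groups, please
> visit the Help Center at http://groups.google.com/support/?hl=ja_US
>
> Thank you,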
>
> Google Groups

Adrian Buehlmann

Aug 28, 2010, 1:25:29 PM
to thg...@googlegroups.com, Toshi MARUYAMA

^^^^^^^^^
Uh, unlucky typo. I copy/pasted this from above without noticing
(should be shift-jis, sorry)

> What I don't understand is: why is this correct? Don't we need some
> external info in what encoding the filenames are, if we read
> .hg/dirstate and .hg/thgstatus?
>
> IIRC when Stefan Rusek last time tried to improve the shell extension
> for non-ASCII, he took the encoding info from somewhere (I think he
> tried to write the name of the encoding into .hg/thgstatus and then have
> the shell extension decode according to that name).
>
> Is "try and error" really correct?
>
> What happens if the filenames are in some other encoding? e.g. 'latin1'
> with a 'ü'?
>
> And what is your logic to write .hg/thgstatus? In what encoding are the
> filenames in .hg/thgstatus written?
>
> If win32mbcs is activated, in what encoding is .hg/thgstatus written?
> shitf-jis?

^^^^^^^^^
(same here, sorry)

Toshi MARUYAMA

Aug 28, 2010, 8:08:39 PM
to TortoiseHg Developers, mercur...@googlegroups.com

On Aug 28, 8:38 AM, Toshi MARUYAMA <marutosi...@gmail.com> wrote:
> On Aug 24, 4:21, Toshi MARUYAMA <marutosi...@gmail.com> wrote:
> > On Aug 24, 8:50 AM JST, Steve Borho <st...@borho.org> wrote:
>
> > > The tip shellext code on thg#default already looks for thg.exe before
> > > hgtk.exe, but the problems Adrian is talking about are much deeper in
> > > the installer.
>
> > I reflected this logic to my win32 shellext.
>
> > 1. thg.exe  with --listfileutf8 option.
> > 2. hgtk.exe with --listfile     option.
> > 3. thg.cmd  with --listfileutf8 option.
>
> > I pushed normal changesets and MQ.
>
> > Normal changeset http://bitbucket.org/marutosi/tortoisehg/changeset/131f3d5caac3
>
> > MQ http://bitbucket.org/marutosi/tortoisehg-shellext-mq/changeset/6cd5ad...
>
> I uploaded Windows shellext dlls (ThgShellx86.dll and
> ThgShellx64.dll).
>
> http://bitbucket.org/marutosi/tortoisehg/downloads
> http://bitbucket.org/marutosi/tortoisehg/downloads/ThgShellx86.dll
> http://bitbucket.org/marutosi/tortoisehg/downloads/ThgShellx64.dll
>

Sorry, I mixed up the x86 and x64 filenames.
THgShellx86.dll (32-bit) is 303.5 KB and THgShellx64.dll (64-bit) is 439.0 KB.
I deleted the old files and uploaded fixed ones.

http://bitbucket.org/marutosi/tortoisehg/downloads
32bit:
http://bitbucket.org/marutosi/tortoisehg/downloads/ThgShellx86.dll
64bit:
http://bitbucket.org/marutosi/tortoisehg/downloads/ThgShellx64.dll

> I don't have 64bit Windows now, so I can't confirm to run 64bit dll.
> You can replace existing dll to new dll by the way of the following
> link. http://bitbucket.org/tortoisehg/stable/src/f0433be8925a/win32/shellex...

Toshi MARUYAMA

Aug 28, 2010, 8:23:16 PM
to thg...@googlegroups.com, mercur...@googlegroups.com
Sorry, the Google Groups web interface broke the URLs in the quoted text.
I am resending from Thunderbird.

Toshi MARUYAMA wrote (2010/08/28 8:38):
> On Aug 24, 4:21, Toshi MARUYAMA<marutosi...@gmail.com> wrote:
>> On Aug 24, 8:50 AM JST, Steve Borho<st...@borho.org> wrote:
>>
>>> The tip shellext code on thg#default already looks for thg.exe before
>>> hgtk.exe, but the problems Adrian is talking about are much deeper in
>>> the installer.
>>
>> I reflected this logic to my win32 shellext.
>>
>> 1. thg.exe with --listfileutf8 option.
>> 2. hgtk.exe with --listfile option.
>> 3. thg.cmd with --listfileutf8 option.
>>
>> I pushed normal changesets and MQ.
>>
>> Normal changeset http://bitbucket.org/marutosi/tortoisehg/changeset/131f3d5caac3
>>
>> MQ http://bitbucket.org/marutosi/tortoisehg-shellext-mq/changeset/6cd5ad...
>
> I uploaded Windows shellext dlls (ThgShellx86.dll and
> ThgShellx64.dll).
>
> http://bitbucket.org/marutosi/tortoisehg/downloads
> http://bitbucket.org/marutosi/tortoisehg/downloads/ThgShellx86.dll
> http://bitbucket.org/marutosi/tortoisehg/downloads/ThgShellx64.dll
>

Sorry, I mixed up the x86 and x64 filenames.

THgShellx86.dll (32-bit) is 303.5 KB and THgShellx64.dll (64-bit) is 439.0 KB.
I deleted the old files and uploaded fixed ones.

http://bitbucket.org/marutosi/tortoisehg/downloads
32bit: 303.5 KB
http://bitbucket.org/marutosi/tortoisehg/downloads/ThgShellx86.dll
64bit: 439.0 KB

Toshi MARUYAMA

Aug 30, 2010, 12:19:12 PM
to TortoiseHg Developers

I'm sorry for typo 'shitf-jis'.

Adrian Buehlmann wrote (2010-08-28 21:11):

>> My logic to read .hg/dirstate and .hg/thgstatus
>> 1. Firstly read by UTF-8.
>> 2. If UTF-8 is valid, repository encoding is UTF-8.
>> 3. If UTF-8 is invalid, repository encoding is CP_ACP(CP932=shitf-jis).
>
> I've seen that, so your method is just "try and error".
>
> You *assume* the repositories filename's are encoded in UTF-8 and then
> just try to decode them as UTF-8 and recode them into UTF-16 for
> internal use ("unicode")
>
> If the decode function reports an error, you *assume* it must be
> shitf-jis instead.
>
> What I don't understand is: why is this correct? Don't we need some
> external info in what encoding the filenames are, if we read
> .hg/dirstate and .hg/thgstatus?
>
> IIRC when Stefan Rusek last time tried to improve the shell extension
> for non-ASCII, he took the encoding info from somewhere (I think he
> tried to write the name of the encoding into .hg/thgstatus and then have
> the shell extension decode according to that name).
>
> Is "try and error" really correct?
>

I confirmed that this logic fails for "\xC2\x80" in
latin1 ('iso-8859-1').
The same logic is used in hglib.tounicode(str).

*********************************

diff --git a/tests/hglib_encoding_test.py b/tests/hglib_encoding_test.py
--- a/tests/hglib_encoding_test.py
+++ b/tests/hglib_encoding_test.py
@@ -93,3 +93,16 @@
 def test_toutf_fallback():
     assert_equals(JAPANESE_KANA_I.encode('utf-8'),
                   hglib.toutf(JAPANESE_KANA_I.encode('euc-jp')))
+
+@with_encoding('iso-8859-1')
+def test_latin1_1():
+    str = "\41\x42"
+    assert_equals(str,
+                  hglib.fromunicode(hglib.tounicode(str)))
+
+@with_encoding('iso-8859-1')
+def test_latin1_2():
+    str = "\xC2\x80"
+    assert_equals(str,
+                  hglib.fromunicode(hglib.tounicode(str)))
+

*********************************

$ /r/Python26/Scripts/nosetests.exe tests/hglib_encoding_test.py
..............F
======================================================================
FAIL: hglib_encoding_test.test_latin1_2
----------------------------------------------------------------------
Traceback (most recent call last):
  File "r:\Python26\lib\site-packages\nose-0.11.4-py2.6.egg\nose\case.py", line 186, in runTest
    self.test(*self.arg)
  File "C:\WEB-DOWN\tortoisehg\tests\hglib_encoding_test.py", line 108, in test_latin1_2
    hglib.fromunicode(hglib.tounicode(str)))
AssertionError: '\xc2\x80' != '\x80'

----------------------------------------------------------------------
Ran 15 tests in 0.000s

FAILED (failures=1)

Toshi MARUYAMA

Aug 30, 2010, 12:41:05 PM
to TortoiseHg Developers

On Aug 31, 1:19, Toshi MARUYAMA <marutosi...@yahoo.co.jp> wrote:
> I'm sorry for typo 'shitf-jis'.
>
> Adrian Buehlmann wrote (2010-08-28 21:11):
> >> My logic to read .hg/dirstate and .hg/thgstatus
> >> 1. Firstly read by UTF-8.
> >> 2. If UTF-8 is valid, repository encoding is UTF-8.
> >> 3. If UTF-8 is invalid, repository encoding is CP_ACP(CP932=shitf-jis).
>
> > I've seen that, so your method is just "try and error".
>
> > You *assume* the repositories filename's are encoded in UTF-8 and then
> > just try to decode them as UTF-8 and recode them into UTF-16 for
> > internal use ("unicode")
>
> > If the decode function reports an error, you *assume* it must be
> > shitf-jis instead.
>
> > What I don't understand is: why is this correct? Don't we need some
> > external info in what encoding the filenames are, if we read
> > .hg/dirstate and .hg/thgstatus?
>
> > IIRC when Stefan Rusek last time tried to improve the shell extension
> > for non-ASCII, he took the encoding info from somewhere (I think he
> > tried to write the name of the encoding into .hg/thgstatus and then have
> > the shell extension decode according to that name).
>
> > Is "try and error" really correct?
> >
> > What happens if the filenames are in some other encoding? e.g. 'latin1'
> > > with a 'ü'?

This logic fails for the following byte combinations.

first char : 0xc2 - 0xdf (30chars)
second char : 0x80 - 0xbf (64chars)

Total: 1920 (=30*64)
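
The count above can be cross-checked with a short Python sketch. It assumes Python 3's strict UTF-8 decoder and simply tries every pair of non-ASCII bytes; it is independent of the shellext and hglib code.

```python
# Collect every pair of non-ASCII bytes that also decodes as valid UTF-8.
# Such a pair is two latin-1 characters, but a UTF-8-first guess would
# read it as a single character.
ambiguous = []
for first in range(0x80, 0x100):
    for second in range(0x80, 0x100):
        raw = bytes([first, second])
        try:
            raw.decode("utf-8")
        except UnicodeDecodeError:
            continue  # not valid UTF-8, so the fallback path would be taken
        ambiguous.append(raw)

lead_bytes = sorted({pair[0] for pair in ambiguous})
trail_bytes = sorted({pair[1] for pair in ambiguous})
```

The lead bytes found this way span 0xc2 to 0xdf and the trail bytes span 0x80 to 0xbf, matching the 30 * 64 = 1920 combinations listed above.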

Yuya Nishihara

Aug 30, 2010, 12:42:41 PM
to thg...@googlegroups.com

It looks like "\xC2\x80" is handled as 'utf-8' with no error.
I'm not sure why 'utf-8' takes precedence over the locale encoding in tounicode().

IMHO, try-and-error is reasonable for *showing* file contents, names,
commit messages, etc., but dangerous for identifying something.

Yuya,

Toshi MARUYAMA

Aug 30, 2010, 12:54:30 PM
to TortoiseHg Developers
"\xC2\x80" is two chars in latin-1.
"\xC2\x80" is valid utf-8. "\xC2\x79" is invalid utf-8.

Yuya Nishihara

Aug 31, 2010, 10:47:37 AM
to thg...@googlegroups.com

Indeed.
I think tounicode() should respect the locale encoding before trying 'utf-8'.
It's used for conversion from Mercurial string to Qt unicode string.
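
A minimal sketch of that ordering, assuming Python 3 bytes/str semantics. This is not the actual hglib.tounicode() implementation; `tounicode_locale_first` and its `locale_encoding` parameter are hypothetical names used only for illustration.

```python
def tounicode_locale_first(raw, locale_encoding):
    """Decode with the locale encoding first, then UTF-8, then with replacement."""
    for encoding in (locale_encoding, "utf-8"):
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            pass
    return raw.decode(locale_encoding, "replace")


# With a latin-1 locale the two bytes stay two characters and round-trip.
raw = b"\xc2\x80"
text = tounicode_locale_first(raw, "iso-8859-1")
round_trip = text.encode("iso-8859-1")
```

With this ordering the "\xC2\x80" case from the failing test round-trips, because the bytes are never reinterpreted as UTF-8 when the locale encoding can decode them.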

Yuya,


Toshi MARUYAMA

Sep 1, 2010, 7:20:51 AM
to thg...@googlegroups.com
Adrian Buehlmann wrote (2010-08-28 21:11):
>> 3. If UTF-8 is invalid, repository encoding is CP_ACP(CP932=shift-jis).

>
> I've seen that, so your method is just "try and error".
>
> You *assume* the repositories filename's are encoded in UTF-8 and then
> just try to decode them as UTF-8 and recode them into UTF-16 for
> internal use ("unicode")
>
> If the decode function reports an error, you *assume* it must be
> shitf-jis instead.
>
> What I don't understand is: why is this correct? Don't we need some
> external info in what encoding the filenames are, if we read
> .hg/dirstate and .hg/thgstatus?
>
> IIRC when Stefan Rusek last time tried to improve the shell extension
> for non-ASCII, he took the encoding info from somewhere (I think he
> tried to write the name of the encoding into .hg/thgstatus and then have
> the shell extension decode according to that name).

I tried Stefan's approach at
http://bitbucket.org/tortoisehg/stable/issue/672/shell-extension-unicode-support#comment-73482 .
But this is not a solution.
Fixutf8 sets encoding.encoding = 'utf8' at
http://bitbucket.org/stefanrusek/hg-fixutf8/src/baf283ab9f92/fixutf8.py#cl-129 .
It affects all repositories, regardless of whether fixutf8
is activated or not.
This is the same problem Steve described in his "managing extensions" mail:
http://groups.google.com/group/thg-dev/browse_frm/thread/a5119a56a9c278c0

>
> Is "try and error" really correct?
>
> What happens if the filenames are in some other encoding? e.g. 'latin1'

> with a 'ü'?
>

As described in my previous post, this is incorrect.

> And what is your logic to write .hg/thgstatus? In what encoding are the
> filenames in .hg/thgstatus written?
>

I didn't touch the OverlayServer Python code which writes .hg/thgstatus.
It simply reads .hg/dirstate and writes .hg/thgstatus.
Because the .hg/dirstate path separator is '/', there is no 0x5c problem.

If fixutf8 is activated, the encoding of .hg/dirstate and .hg/thgstatus is utf-8.
If fixutf8 is not activated, the encoding of .hg/dirstate and .hg/thgstatus is CP_ACP (shift-jis).

> If win32mbcs is activated, in what encoding is .hg/thgstatus written?

> shift-jis?

It is shift-jis.
Win32mbcs does not change the repository encoding.
It simply hooks Mercurial functions.
Regardless of whether win32mbcs is activated or not,
the repository encoding is shift-jis in Japan and big5 in China (Taiwan).


I have given up on fixutf8 support in the shellext,
and I removed the assumption that filenames are utf-8.
My current shellext only converts from CP_ACP to Unicode (wide char).
This is compatible with the mainstream shellext,
and it resolves the 0x5c problem.
It is a big improvement for Japanese and Chinese (Taiwanese) thg users.

I finished merging and resolving conflicts, and pushed to my bitbucket repositories.

Normal changeset:
http://bitbucket.org/marutosi/tortoisehg/changeset/18a682ecdfb1

MQ:
http://bitbucket.org/marutosi/tortoisehg-shellext-mq/changeset/98513b888020

I uploaded Windows shellext dlls (ThgShellx86.dll and ThgShellx64.dll).

http://bitbucket.org/marutosi/tortoisehg/downloads
http://bitbucket.org/marutosi/tortoisehg/downloads/ThgShellx86.20100901.dll
http://bitbucket.org/marutosi/tortoisehg/downloads/ThgShellx64.20100901.dll

I don't have 64-bit Windows at the moment, so I can't confirm that the 64-bit DLL runs.

You can replace the existing DLL with the new one by following the instructions at this link:

http://bitbucket.org/tortoisehg/stable/src/386a21068b48/win32/shellext/README.txt#cl-172

I hope my shellext can be pulled into the thg mainstream.


Adrian Buehlmann

Sep 1, 2010, 7:47:09 AM
to thg...@googlegroups.com, Toshi MARUYAMA

I hope this is *not* pushed.

Doesn't meet minimal quality standards and I give up trying to dissect
these things.

For a start, http://mercurial.selenic.com/wiki/ContributingChanges might
be helpful.
