client hook script temporary files encoding bug?


merenyics

May 31, 2019, 3:38:51 PM
to TortoiseSVN
Hi!

I have some client-side hook scripts registered. When SVN operations run, the affected paths are stored in temporary files (the PATH and RESULTPATH files, as the TortoiseSVN help calls them).

I found that file names containing the characters 'ő' and 'ű' don't appear correctly in the temp files on my machine; they are replaced by 'o' and 'u', respectively.

On other machines the generated temp files contain the correct characters.

The only difference I am aware of between my machine and the ones that work correctly is that my Windows 10 installation is English, while the others are Hungarian. However, the system locale is Hungarian on all of them, so this may or may not be related to the problem.

My question is: what encoding is supposed to be used in these temp files? What affects the encoding used, if it is not consistent across all systems?

Since this causes data loss, it could be considered a bug.

In any case, can anyone suggest a way to enforce a given encoding in the temp files?

thanks,
Csaba

Stefan

May 31, 2019, 3:46:50 PM
to TortoiseSVN
The encoding of those files is the system locale.
So you have to make sure that the filenames can be encoded with that locale, otherwise you'll get the problem you've just described.
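
To illustrate the point about encodability, here is a minimal Python sketch (not TortoiseSVN code) that lists the characters of a file name that a given code page cannot represent. The helper name `unencodable_chars` and the sample file name are made up for illustration; cp1250 and cp1252 stand in for a Central European and a Western European ANSI code page.

```python
def unencodable_chars(name: str, codepage: str) -> list:
    """Return the characters of `name` that `codepage` cannot represent."""
    bad = []
    for ch in name:
        try:
            ch.encode(codepage)  # strict mode: raises if there is no mapping
        except UnicodeEncodeError:
            bad.append(ch)
    return bad


# Hypothetical file name containing both problem characters.
sample = "erdő_fűrész.txt"
missing_in_1250 = unencodable_chars(sample, "cp1250")  # nothing is missing
missing_in_1252 = unencodable_chars(sample, "cp1252")  # 'ő' and 'ű' are missing
```

If the list is non-empty for the code page actually in effect, those characters may be dropped or substituted when the name is written out, which would be consistent with the 'o' and 'u' symptom described above.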

Csaba Merényi

May 31, 2019, 4:26:49 PM
to TortoiseSVN
Thanks for the reply!

When you say system locale, what EXACTLY do you mean? To the best of my knowledge, my system locale IS Hungarian, which allows these characters to be encoded (cp1250). See below:
PS C:\Users\csaba> Get-WinSystemLocale | Select-Object Name, DisplayName,
>>                         @{ n='OEMCP'; e={ $_.TextInfo.OemCodePage } },
>>                         @{ n='ACP';   e={ $_.TextInfo.AnsiCodePage } }

Name  DisplayName         OEMCP  ACP
----  -----------         -----  ---
hu-HU Hungarian (Hungary)   852 1250

Could the SVN process be running under a different locale? Sorry, I'm not very familiar with the inner workings of Windows locales.

Csaba Merényi

May 31, 2019, 5:36:29 PM
to TortoiseSVN
I did some further research, and I think I have figured this out.

It seems to me that the output encoding isn't really determined by the system locale, but by the user locale (or culture), i.e. the setting that governs the format of currency, dates and times. I'm not sure this is a good choice, but at least I now know how to get the results I need.

I guess you don't want to change this for backwards-compatibility reasons, but using the real system locale would be more intuitive, and UTF-8 encoded temp files would be ideal. (The commit message file is already UTF-8 encoded, so at first I assumed all the others were as well.)

Can you comment on this?

thanks,
Csaba

Stefan

Jun 1, 2019, 2:09:27 AM
to TortoiseSVN
I've changed the encoding to UTF-8 in r28579.
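
For hook scripts that have to cope with both behaviors (locale-encoded path files before this change, UTF-8 after it), one possible approach is to try UTF-8 first and fall back to a legacy code page. The following Python sketch is an assumption about how a script might do this, not part of TortoiseSVN; `read_path_file` and the cp1250 default for `fallback` are hypothetical choices, and the heuristic can misfire on bytes that happen to be valid in both encodings.

```python
def read_path_file(filename: str, fallback: str = "cp1250") -> list:
    """Read one path per line, preferring UTF-8 over a legacy code page."""
    with open(filename, "rb") as f:
        raw = f.read()
    try:
        # utf-8-sig also strips a leading byte order mark, if present.
        text = raw.decode("utf-8-sig")
    except UnicodeDecodeError:
        # Not valid UTF-8: assume the legacy code page instead.
        text = raw.decode(fallback)
    # splitlines() handles both \n and \r\n; skip empty lines.
    return [line for line in text.splitlines() if line]
```

A hook script could call `read_path_file` on the PATH file it receives and then work with the returned list of Unicode strings, regardless of which TortoiseSVN build wrote the file.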

Csaba Merényi

Jun 1, 2019, 8:10:50 AM
to TortoiseSVN
That is really good news, thanks!