svn prop*-commands not filtering encoding


Torax Malu

Dec 3, 2024, 2:09:39 AM
to TortoiseSVN-dev
Hi!

TL;DR:
  • Properties are handled as byte sequences and are not converted to the command line encoding.
  • Bug or feature?
Yesterday, while scripting, I stumbled upon a strange phenomenon with the CLI tool svn.exe. File names and log entries are properly converted between the encoding of the command line and that of the repository (code page 437 / 850, ANSI, UTF-8 <=> UTF-8). However, the input and output of the svn prop* commands do not seem to be converted. If I pass a string encoded in code page 850 or ANSI, it appears with exactly those bytes in e.g. svn:externals.

This behaviour is disastrous for file names containing umlauts or other characters beyond code point 127 (0x7F). Then an "Überraschung" turns into a surprise…
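To illustrate why unconverted bytes are a problem, here is a minimal Python sketch (not taken from svn itself) that encodes the same word in the code pages mentioned above and then misreads the code page 850 bytes as ANSI. It assumes "ANSI" means Windows-1252, which is only one of several possible ANSI code pages.

```python
# Encode the same word in the encodings discussed in this thread.
# Assumption: "ANSI" is taken to be Windows-1252 (cp1252).
text = "Überraschung"

encoded = {name: text.encode(name) for name in ("cp437", "cp850", "cp1252", "utf-8")}

for name, raw in encoded.items():
    # Only the first character differs; the rest of the word is plain ASCII.
    print(f"{name:7} {raw.hex()}")

# What a consumer sees if it interprets the cp850 bytes as cp1252.
misread = encoded["cp850"].decode("cp1252")
print(misread)

# The same cp850 bytes are not even valid UTF-8.
try:
    encoded["cp850"].decode("utf-8")
    cp850_is_valid_utf8 = True
except UnicodeDecodeError:
    cp850_is_valid_utf8 = False
print(cp850_is_valid_utf8)
```

The byte for "Ü" is different in every one of these encodings, so a value stored without conversion only reads back correctly if the reader happens to assume the same code page as the writer.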

This behaviour seems to affect only svn.exe. Loading these strings into the TortoiseSVN property dialogue converts every encoding, and the issue doesn't come up.

Is this simply an intended behaviour of the svn prop* commands (and thus poorly documented), or is it a bug?

Merci for a short reply.

Cheers
ToraxMalu


Daniel Sahlberg

Dec 3, 2024, 2:36:31 AM
to TortoiseSVN-dev
Hi,

I tried but I can't really figure out where the problem is. I think a short reproduction script would help a lot to explain the issue.

Several of the Subversion command line tools have an --encoding option described like this in the Subversion book:
[[[
--encoding ENC
Tells Subversion that your commit message is composed using the character encoding provided. The default character encoding is derived from your operating system's native locale; use this option if your commit message is composed using any other encoding.
]]]

Maybe this helps?

I'd also like to mention that issues related to the command line client are better answered on us...@subversion.apache.org, where you will find many of the Subversion developers. However, at the moment I'm not sure whether the issue is in the command line client or in TortoiseSVN.

Kind regards,
Daniel

Stefan

Dec 3, 2024, 12:54:33 PM
to TortoiseSVN-dev
Subversion properties are byte streams. You could store JPEGs in a property if you like.
That's why properties don't have an encoding specified, unless you specify one yourself (for custom properties).
However, the properties with names that start with 'svn:' are defined as UTF-8 strings. And the TSVN properties that start with 'tsvn:' are also defined as UTF-8 strings.
But that's only the definition: whether you actually pass UTF-8 encoded strings to the functions is up to you.
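Since the property value is just bytes, a script can check for itself whether a value it read back is valid UTF-8. The following is a minimal Python sketch under stated assumptions: `check_property_bytes` is a hypothetical helper, `raw` stands for the bytes of a property value however you obtained them, and the fallback code page (850 here) is a guess that has to match whatever the value was originally written with.

```python
def check_property_bytes(raw: bytes, legacy_codec: str = "cp850"):
    """Return (is_utf8, text) for a raw property value.

    Hypothetical helper: tries UTF-8 first, as 'svn:' and 'tsvn:'
    properties are defined to be, and falls back to an assumed
    legacy code page if the bytes are not valid UTF-8.
    """
    try:
        return True, raw.decode("utf-8")
    except UnicodeDecodeError:
        # Assumption: the value was written in legacy_codec.
        return False, raw.decode(legacy_codec)


good = check_property_bytes("Überraschung".encode("utf-8"))
bad = check_property_bytes("Überraschung".encode("cp850"))
print(good)
print(bad)
```

A value flagged as not UTF-8 can then be re-encoded with `text.encode("utf-8")` before it is written back. Note that pure ASCII values pass the check in any of these encodings, so the test only catches values with characters beyond 0x7F.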

Stefan 

toraxmalu1

Dec 3, 2024, 1:02:33 PM
to Stefan via TortoiseSVN-dev
OK, with this definition it is merely poorly documented, and we have to
take care of the encoding ourselves.

Thanks for clarifying.

Cheers

Torax
