
Bug#1031552: translate: iconv issue


Markus Reschke

Feb 18, 2023, 8:30:05 AM
Package: translate
Version: 0.6.0~debian0

In a shell with ISO-8859-1/latin1 encoding translate triggers an iconv
issue for some words, resulting in the output ending with:
iconv: illegal input sequence at position 560
or some other position.

This can be fixed by adding a '-c' to the translate script:
echo $OPT "$1" | iconv -c -f UTF-8 -t $CHARSET
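
A minimal sketch of the behaviour described above, outside the translate script. The sample string is an assumption chosen for illustration: it contains U+2019 (right single quotation mark), which has no ISO-8859-1 equivalent. The target charset is hard-coded here instead of taken from $CHARSET.

```shell
#!/bin/sh
# Hypothetical sample text; the octal escapes \342\200\231 are the
# UTF-8 encoding of U+2019, so the script itself stays pure ASCII.
sample='granny\342\200\231s bonnet\n'

# Without -c: iconv stops at the first character it cannot convert
# and reports an error on stderr.
printf "$sample" | iconv -f UTF-8 -t ISO-8859-1 ||
    echo "iconv failed with status $?"

# With -c: unconvertible characters are dropped and the rest of the
# input is still converted. Some iconv versions return a non-zero
# status even then, hence the "|| true".
printf "$sample" | iconv -c -f UTF-8 -t ISO-8859-1 || true
```

The trade-off is that -c discards characters silently, so the output can differ from the dictionary text without any warning.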

Cheers,
Markus

Axel Beckert

Feb 18, 2023, 12:50:05 PM
Control: tag -1 + confirmed pending

Hi Markus,

thanks for the bug report.

Markus Reschke wrote:
> In a shell with ISO-8859-1/latin1 encoding

Please note that with the upcoming Debian 12 release, non-UTF-8
locales are deprecated and are no longer offered for configuration via
debconf on new installations or upon reconfiguration. (Already
configured non-UTF-8 locales stay untouched, though.) Additionally,
all non-UTF-8 users are encouraged to switch to a UTF-8 locale.

By editing /etc/locale.gen and then regenerating my locales, I was
nevertheless still able to create some non-UTF-8 locales on my system
to try to reproduce this.

I nevertheless would like to fix this issue in translate for Debian 12.

> translate triggers an iconv issue
> for some words, resulting in the output ending with:
> iconv: illegal input sequence at position 560
> or some other position.

Hmmm, I'd have preferred a more explicit example. But in the end I
found one:

* Start an ISO-Latin based xterm with

env `locale | sed -e s/C\.UTF-8/de-DE@euro/` lxterm

* In that xterm call:

env `locale | sed -e s/C\.UTF-8/de-DE@euro/` translate -i common

Then the last output line is:

Agrarpolitik {f} | Gemeinsame Agrarpolitik /GAP/ :: agricultural policy | Common Agricultural Policy /CAP/

And the first missing line is:

Akeleien {pl} (Aquilegia) (botanische Gattung) [bot.] <Akelei> | Gemeine Akelei {f}; Gewöhnliche Akelei {f}; Waldakelei {f} (Aquilegia vulgaris) :: columbines; granny’s bonnets (botanical genus) <columbine> <aquilege> | common columbine; European columbine; European crowfoot; granny’s bonnet

From there on, indeed no other line is sent to STDOUT.

Additionally, it emits the following error message rather early:

iconv: illegal input sequence at position 1702

> This can be fixed by adding a '-c' to the translate script:
> echo $OPT "$1" | iconv -c -f UTF-8 -t $CHARSET

Adding -c indeed fixes this issue. And for some reason the "ö" in
"Gewöhnliche" is still shown, although the iconv man page suggested
otherwise (namely that the character would not be displayed at all,
which scared me a bit).
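
One possible explanation, offered as an assumption and not something confirmed in this thread: "ö" (U+00F6) exists in the Latin charsets, so iconv converts it normally, while the typographic apostrophe "’" (U+2019) in "granny’s" on the same line does not, and that would be the character -c drops. The sketch below assumes the target charset is ISO-8859-15 (what de_DE@euro normally maps to) and dumps the converted bytes as hex so the result does not depend on the terminal encoding.

```shell
#!/bin/sh
# "ö" is \303\266 in UTF-8, "’" is \342\200\231.
# After conversion, "ö" should appear as the single byte f6, and the
# three bytes of "’" should be gone.
printf 'Gew\303\266hnliche granny\342\200\231s\n' |
    iconv -c -f UTF-8 -t ISO-8859-15 |
    od -An -tx1
```

If that assumption holds, the only visible change in the output line would be "grannys" in place of "granny’s".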

So I'll add that to both iconv calls in translate.

Thanks again, also for the hint on "-c".

Regards, Axel
--
,''`. | Axel Beckert <a...@debian.org>, https://people.debian.org/~abe/
: :' : | Debian Developer, ftp.ch.debian.org Admin
`. `' | 4096R: 2517 B724 C5F6 CA99 5329 6E61 2FF9 CD59 6126 16B5
`- | 1024D: F067 EA27 26B9 C3FC 1486 202E C09E 1D89 9593 0EDE