Re: [perl #39930] AutoReply: [BUG] concat unicode+iso-8859-1 doesn't work w/o ICU

Patrick R. Michaud

Jul 24, 2006, 7:55:49 PM

to Parrot via RT

On Mon, Jul 24, 2006 at 03:57:25PM -0700, Pm wrote:
> Found this bug while doing stuff --without-icu today...
>
> Concatenation of a unicode string with an ASCII string
> works even if ICU isn't available.
>
> Concatenation of a unicode string with a Unicode string
> works even if ICU isn't available.
>
> Concatenation of a unicode string with an iso-8859-1 string
> fails with "no ICU lib loaded" if ICU isn't available.

On a possibly related note: for systems that *do* have ICU,
concatenating a unicode: string with an ascii: or unicode:
string appears to result in a different encoding than concatenating
with iso-8859-1. Thus:

$S0 = unicode:"A"
$S1 = ascii:"B"
$S2 = concat $S0, $S1
print $S2 # outputs "AB"

$S0 = unicode:"A"
$S1 = unicode:"B"
$S2 = concat $S0, $S1
print $S2 # outputs "AB"

$S0 = unicode:"A"
$S1 = iso-8859-1:"B"
$S2 = concat $S0, $S1
print $S2 # outputs "A\x00B\x00"

This particular behavior isn't necessarily a bug, but it is
at least somewhat unexpected.
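
One possible reading of the third result, offered as an assumption rather than something confirmed against the Parrot source: the concatenated string may end up stored as UTF-16 (little-endian) and be printed byte-for-byte. A minimal Python sketch shows that encoding "AB" that way yields exactly the bytes seen above, while a single-byte encoding does not:

```python
# Assumption (not verified against Parrot): the third concat result is
# held as UTF-16LE and its raw bytes are written to output unchanged.
text = "AB"

utf16_le = text.encode("utf-16-le")   # two bytes per character, low byte first
latin1 = text.encode("iso-8859-1")    # one byte per character

print(repr(utf16_le))  # b'A\x00B\x00'
print(repr(latin1))    # b'AB'
```

If that reading is right, the first two cases keep a one-byte-per-character representation for this input and only the iso-8859-1 case is transcoded to a two-byte form.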

Patrick R. Michaud

Jul 24, 2006, 6:57:24 PM

to bugs-bi...@rt.perl.org

# New Ticket Created by Patrick R. Michaud
# Please include the string: [perl #39930]
# in the subject line of all future correspondence about this issue.
# <URL: http://rt.perl.org/rt3/Ticket/Display.html?id=39930 >

Found this bug while doing stuff --without-icu today...

Concatenation of a unicode string with an ASCII string
works even if ICU isn't available.

Concatenation of a unicode string with a Unicode string
works even if ICU isn't available.

Concatenation of a unicode string with an iso-8859-1 string
fails with "no ICU lib loaded" if ICU isn't available.

Sample program:

$ cat x.pir
.sub main
# works

$S0 = unicode:"A"
$S1 = ascii:"B"
$S2 = concat $S0, $S1
print $S2

print "\n"

# works

$S0 = unicode:"A"
$S1 = unicode:"B"
$S2 = concat $S0, $S1
print $S2

print "\n"

# fails

$S0 = unicode:"A"
$S1 = iso-8859-1:"B"
$S2 = concat $S0, $S1
print $S2

print "\n"
.end
$ ./parrot x.pir
AB
AB
no ICU lib loaded
current instr.: 'main' pc 34 (x.pir:16)
$

I'll add this as a test when I have the RT#.
