Re: Unicode characters

Andreas J. Koenig

May 24, 2009, 12:44:04 AM
to Saravanan Balaji, perl-u...@perl.org
>>>>> On Fri, 22 May 2009 20:49:24 +0530, Saravanan Balaji <Saravana...@MorganStanley.com> said:

> Could you please help to know what i am missing or doing wrong.
> I'll greatly appreciate the help.

I think all you're missing is (1) that a script written in utf8 needs
to declare that fact with a

use utf8;

and (2) any filehandle you're using that has utf8 semantics needs to
be switched to utf8 as well, so something like

binmode $_, ":utf8" for *STDOUT, *TEMP_OUT;
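
A minimal sketch that puts both points together. The in-memory file behind TEMP_OUT and the sample string are assumptions made so the example runs on its own; the original poster's script is not shown in this thread.

```perl
use strict;
use warnings;
use utf8;    # (1) this source file is saved as UTF-8, so literals are decoded

# Hypothetical stand-in for the poster's TEMP_OUT: an in-memory file,
# so the sketch runs without touching the disk.
my $buffer = '';
open TEMP_OUT, '>', \$buffer or die "cannot open in-memory file: $!";

# (2) switch every handle that carries text to utf8 semantics
binmode $_, ":utf8" for *STDOUT, *TEMP_OUT;

my $text = "Grüße, 世界";
print TEMP_OUT $text;
close TEMP_OUT or die "close failed: $!";

# Characters in the string versus bytes that reached the handle
printf "%d characters, %d bytes written\n", length($text), length($buffer);
```

Without the `use utf8;` line, Perl would treat the literal as a sequence of bytes, and the layer on the handle would then encode it a second time.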

Hope that helps,
--
andreas

Juerd Waalboer

May 24, 2009, 4:09:25 AM
to perl-u...@perl.org
Andreas J. Koenig skribis 2009-05-24 6:44 (+0200):

> binmode $_, ":utf8" for *STDOUT, *TEMP_OUT;

Although it's safe on output, it's better to get used to using
:encoding(utf8) instead of :utf8. Using :utf8 on input can cause
stability and security issues.
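
A short sketch of checked decoding on input. The byte strings and in-memory file are made up for illustration, and the strict encoding name `UTF-8` is a choice made here (the message above writes `utf8`). The second half calls Encode directly with `FB_CROAK` only to show that a checking decoder rejects malformed bytes; it is not a claim about how the PerlIO layer reports errors.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical input: "caf\xC3\xA9" is valid UTF-8 for "café";
# "caf\xE9" is Latin-1 and therefore malformed as UTF-8.
my $good_bytes = "caf\xC3\xA9\n";
my $bad_bytes  = "caf\xE9\n";

# Reading through :encoding(UTF-8) decodes the bytes into characters.
open my $in, '<:encoding(UTF-8)', \$good_bytes
    or die "cannot open in-memory file: $!";
my $line = <$in>;
close $in;
chomp $line;
print "read ", length($line), " characters\n";

# A checking decoder, called directly, refuses the malformed bytes.
my $decoded = eval {
    decode('UTF-8', $bad_bytes, Encode::FB_CROAK | Encode::LEAVE_SRC);
};
print defined $decoded ? "accepted\n" : "rejected malformed input\n";
```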
--
Met vriendelijke groet, Kind regards, Korajn salutojn,

Juerd Waalboer: Perl hacker <#####@juerd.nl> <http://juerd.nl/sig>
Convolution: ICT solutions and consultancy <sa...@convolution.nl>
1;

Juerd Waalboer

May 25, 2009, 5:04:34 AM
to Andreas J. Koenig, perl-u...@perl.org
Andreas J. Koenig skribis 2009-05-25 8:30 (+0200):

> >>>>> On Sun, 24 May 2009 10:09:25 +0200, Juerd Waalboer <ju...@convolution.nl> said:
> > Although it's safe on output, it's better to get used to using
> > :encoding(utf8) instead of :utf8. Using :utf8 on input can cause
> > stability and security issues.
> That's new to me. Do you have a link that backs this up?

http://www.perlmonks.org/?node_id=644786
http://www.perlfoundation.org/perl5/index.cgi?the_utf8_perlio_layer
http://perldoc.perl.org/perlunicode.html#Security-Implications-of-Unicode
(perlunicode doesn't refer to :utf8 but does explain how malformed utf8
can cause trouble.)


Perl change #32461 updated the documentation to reflect the preference
for :encoding:
http://perl5.git.perl.org/perl.git/commit/740d4bb23b722729f87a23733be98429529fd900

Andreas J. Koenig

May 25, 2009, 2:30:29 AM
to Juerd Waalboer, perl-u...@perl.org
>>>>> On Sun, 24 May 2009 10:09:25 +0200, Juerd Waalboer <ju...@convolution.nl> said:

> Although it's safe on output, it's better to get used to using
> :encoding(utf8) instead of :utf8. Using :utf8 on input can cause
> stability and security issues.

That's new to me. Do you have a link that backs this up?

--
andreas
