utf8 and Starman (wide character in syswrite)


jbjbjb

Oct 14, 2010, 8:11:08 PM
to psgi-plack
I got a "wide character in syswrite" error from Starman's Server.pm (and a
totally blank page) when some UTF-8 encoded data found its way into my
dynamic HTML (it could have come from a form or a utf8 MySQL text field,
I suppose, but it actually came from some XML).

My possibly naive fix was to add

binmode $conn, ":utf8";
after
my $conn = $self->{server}->{client};

and then the utf8 wide character appeared in my browser ok.

Is this a PSGI/Starman bug, or some config issue with my Perl? Is this
the right place to ask?

thanks.

Tatsuhiko Miyagawa

Oct 14, 2010, 8:22:20 PM
to psgi-...@googlegroups.com

No, it is a bug in your code. The PSGI spec requires you to encode the content body as a byte string.

jbjbjb

Oct 14, 2010, 9:27:27 PM
to psgi-plack
Fair enough.
So, for example, a hello world with a UTF-8 string fails with the
Server.pm error and a status 500:

my $app = sub {
    my $str = "Hello \x{263A}!";
    return [ 200, [ 'Content-Type' => 'text/plain' ], [ $str ] ];
};

What exactly is the right way to make this reach the browser without
bombing out?

Tatsuhiko Miyagawa

Oct 14, 2010, 9:30:35 PM
to psgi-...@googlegroups.com
On Fri, Oct 15, 2010 at 10:27 AM, jbjbjb <justi...@gmail.com> wrote:
> Fair enough,
> so for example, hello world with a utf8 string fails with the
> Server.pm error and status 500

Yes, because your code violates the PSGI specification. Use
Plack::Middleware::Lint to catch such errors.

> my $app = sub {
>    my $str = "Hello \x{263A}!";
>    return [ 200, [ 'Content-Type' =>'text/plain' ], [ $str ] ];
> };
>
> what exactly is the right way to make this reach the browser without
> bombing out?

use Encode;
my $str = encode_utf8("Hello \x{263A}");
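For context, here is a minimal sketch of how that one-line fix might sit in the complete hello-world app. The charset parameter and the Content-Length header are additions for illustration, not something the PSGI requirement above depends on.

```perl
use strict;
use warnings;
use Encode qw(encode_utf8);

my $app = sub {
    # Character string: U+263A is a code point above 255, so printing it
    # to a raw socket unencoded triggers the "wide character" error.
    my $str = "Hello \x{263A}!";

    # Turn the characters into UTF-8 bytes before returning the body.
    my $body = encode_utf8($str);

    return [
        200,
        [
            # Assumption: declaring the charset so the browser decodes it as UTF-8.
            'Content-Type'   => 'text/plain; charset=utf-8',
            # length() on a byte string counts bytes, which is what HTTP wants.
            'Content-Length' => length $body,
        ],
        [ $body ],
    ];
};

# In a .psgi file, make $app the last expression so the server can load it.
```

The point is that encoding happens once, at the boundary where the response leaves the application. Everything before that can keep working with character strings.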

> On Oct 15, 11:22 am, Tatsuhiko Miyagawa <miyag...@gmail.com> wrote:
>> No, it is a bug in your code. PSGI spec requires you to encode the content
>> body in bytestream.
>>

--
Tatsuhiko Miyagawa

jbjbjb

Oct 15, 2010, 9:49:36 PM
to psgi-plack
thanks!
my confusion stemmed from trying to fix it with decode ..
as in, since the page has utf8 in it, I should "decode" it.

However I now understand that encode_utf8 means "decode from possibly
utf8, to a bytestream".


Tatsuhiko Miyagawa

Oct 16, 2010, 1:49:14 AM
to psgi-...@googlegroups.com
Your terminology around encode/decode/utf8/bytestream seems broken.
I suggest you read perldoc perlunitut to fix it.
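
As a rough sketch of the distinction perlunitut draws: decode turns incoming bytes into characters, and encode turns characters back into bytes for output. The variable names below are only illustrative.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Bytes as they might arrive from a socket, file, or database:
# the UTF-8 encoding of "Hello " followed by U+263A.
my $bytes_in = "Hello \xE2\x98\xBA";

# decode: bytes -> characters (do this on input).
my $chars = decode('UTF-8', $bytes_in);

# encode: characters -> bytes (do this on output).
my $bytes_out = encode('UTF-8', $chars);

printf "input bytes:  %d\n", length $bytes_in;    # 9
printf "characters:   %d\n", length $chars;       # 7
printf "output bytes: %d\n", length $bytes_out;   # 9
```

So a PSGI response body needs encode, not decode: the app holds characters and the server wants bytes.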

On Sat, Oct 16, 2010 at 10:49 AM, jbjbjb <justi...@gmail.com> wrote:
> thanks!
> my confusion stemmed from trying to fix it with decode ..
> as in, since the page has utf8 in it, I should "decode" it.
>
> However I now understand that encode_utf8 means "decode from possibly
> utf8, to a bytestream".
>

--
Tatsuhiko Miyagawa
