
Is there any way to decode a string using an unknown codec?


howmuch...@gmail.com

Jun 27, 2012, 9:14:56 PM
Hi,
I'm Korean, and when I use modules like sys, os, etc.,
sometimes the interpreter shows me broken strings like
'\x13\xb3\x12\xc8'.
It must be Korean text, but I can't decode it the right way.
I tried to decode it using codecs like cp949, mbcs, and utf-8,
but it failed.
The only way I found is eval('\x13\xb3\x12\xc8').
It raises an error that shows the right Korean.
Is there any way to handle such strings so they don't come out broken?

Benjamin Kaplan

Jun 27, 2012, 10:20:28 PM
to pytho...@python.org

It's not broken. You're just using the wrong encoding. Try utf-16le.

MRAB

Jun 28, 2012, 7:28:06 AM
to pytho...@python.org
It might be UTF-16:

>>> b'\x13\xb3\x12\xc8'.decode("utf16")
'댓젒'

I don't know Korean, but that looks reasonable!
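
To make the suggestion above concrete, here is a minimal sketch that tries a handful of candidate codecs on the bytes from the original post and keeps the ones that decode without error. The helper name `try_decodings` and the candidate list are illustrative assumptions, not something from the thread.

```python
def try_decodings(data, candidates=("utf-8", "cp949", "utf-16-le", "utf-16-be")):
    """Return {codec_name: decoded_text} for every candidate that decodes cleanly.

    Hypothetical helper: the candidate list is a guess and should be
    adjusted to the encodings that are plausible on your system.
    """
    results = {}
    for name in candidates:
        try:
            results[name] = data.decode(name)
        except (UnicodeDecodeError, LookupError):
            # UnicodeDecodeError: the bytes are not valid in this codec.
            # LookupError: the codec is not available on this platform.
            continue
    return results


raw = b'\x13\xb3\x12\xc8'  # the bytes from the original post
for codec, text in try_decodings(raw).items():
    print(codec, repr(text))
# The utf-16-le entry is '댓젒', matching the result shown above.
```

Note that utf-16-be also decodes these bytes without error, but to non-Korean characters. A successful decode only narrows the choices, so the output still has to be checked by someone who can read it. Spelling out utf-16-le also avoids depending on the byte order that the plain utf16 codec assumes when there is no byte order mark.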

Dieter Maurer

Jun 28, 2012, 1:18:12 PM
to pytho...@python.org
howmuch...@gmail.com writes:

> I'm Korean, and when I use modules like sys, os, etc.,
> sometimes the interpreter shows me broken strings like
> '\x13\xb3\x12\xc8'.
> It must be Korean text, but I can't decode it the right way.
> I tried to decode it using codecs like cp949, mbcs, and utf-8,
> but it failed.
> The only way I found is eval('\x13\xb3\x12\xc8').

This looks as if "sys.stdout/sys.stderr" knew the correct encoding.
Check it like this:

import sys
sys.stdout.encoding
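
As a slightly fuller version of that check, the sketch below collects the encodings Python reports for the standard streams, the filesystem, and the locale. The function name `report_encodings` is a hypothetical helper, and which of these values is relevant to a given broken string is an assumption you would need to verify on your own machine.

```python
import locale
import sys


def report_encodings():
    """Collect the encodings this Python process reports (illustrative helper)."""
    return {
        # Stream encodings can be None if a stream has been replaced.
        "stdout": getattr(sys.stdout, "encoding", None),
        "stderr": getattr(sys.stderr, "encoding", None),
        # Encoding used to convert file names to and from bytes.
        "filesystem": sys.getfilesystemencoding(),
        # Encoding the locale settings suggest for text data.
        "locale": locale.getpreferredencoding(False),
    }


for name, value in report_encodings().items():
    print(f"{name}: {value}")
```

If one of the reported names differs from the codec you have been passing to decode(), that name is a reasonable next candidate to try.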

howmuch...@gmail.com

Jun 28, 2012, 5:27:32 PM
to comp.lan...@googlegroups.com, pytho...@python.org

On Thursday, June 28, 2012 at 11:20:28 AM UTC+9, Benjamin Kaplan wrote:
Thank you guys. The problem is solved!
