Unicode in creoleparser

m...@wilfred.me.uk

Aug 3, 2013, 7:41:07 AM
to creole...@googlegroups.com
I've noticed that creoleparser converts unicode objects into UTF-8 encoded bytestrings:

In [3]: from creoleparser import text2html

In [5]: text2html(u"esapañol")
Out[5]: '<p>esapa\xc3\xb1ol</p>\n'

This isn't a problem for me, but it's not documented as far as I can see. Could it be added to the docs? Personally, I think always outputting unicode (maybe even forcing unicode input too) would be a good thing, but I understand if you disagree or are concerned about backwards compatibility.
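
For anyone reading along: a bytestring like the one above is just the UTF-8 encoding of the rendered HTML, so it can be decoded back to unicode by hand. A minimal sketch using only the standard library (the variable names `raw` and `text` are illustrative, and the sample bytes are typed in rather than produced by creoleparser):

```python
# A UTF-8 encoded bytestring of the kind shown in the session above.
raw = b'<p>espa\xc3\xb1ol</p>\n'

# Decoding with the same codec recovers the text as a unicode string.
text = raw.decode('utf-8')

# The n-with-tilde takes two bytes in UTF-8 but is a single code point
# once decoded, so the two lengths differ by one.
print("%d bytes, %d characters" % (len(raw), len(text)))
```

This works as a stopgap, but it means every caller has to remember which encoding was used.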

Thanks.

Stephen Day

Aug 5, 2013, 9:31:24 PM
to creole...@googlegroups.com
Hi,

Use "encoding=None" when creating your Parser object. This will cause the parser to output Unicode objects.

http://creoleparser.googlecode.com/svn/docs/modules/core.html#creoleparser.core.Parser
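
A rough sketch of what that could look like. It assumes `Parser`, `create_dialect` and `creole11_base` are importable from `creoleparser.core` and `creoleparser.dialects`, and that `method='html'` matches what `text2html` uses; check this against the linked docs for your version. `make_unicode_parser` is a made-up helper name.

```python
def make_unicode_parser():
    """Build a parser that returns unicode instead of UTF-8 bytes."""
    # Imported inside the function so this snippet can be loaded even
    # where creoleparser is not installed.
    from creoleparser.core import Parser
    from creoleparser.dialects import create_dialect, creole11_base

    return Parser(
        dialect=create_dialect(creole11_base),  # assumed Creole 1.1 dialect
        method='html',
        encoding=None,  # None: skip the encoding step, output unicode
    )
```

The resulting object should then be callable like `text2html`, for example `make_unicode_parser()(u"espa\xf1ol")`.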

Thanks,

Steve

Stephen Day

Aug 5, 2013, 9:58:09 PM
to creole...@googlegroups.com
You can also pass arguments to the text2html() convenience function. You are correct that I need to document this better. It's been a while since I made an update...
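
A minimal sketch of that shortcut, assuming `encoding` is one of the arguments `text2html()` accepts per call (this is inferred from the thread, not verified against a particular release). `to_unicode_html` and its `render` parameter are hypothetical; `render` is only there so the helper can be tried with a stand-in function.

```python
def to_unicode_html(markup, render=None):
    """Render Creole markup and ask for unicode output."""
    if render is None:
        # Default to the library's convenience function; imported lazily
        # so this snippet loads even without creoleparser installed.
        from creoleparser import text2html as render
    # encoding=None asks the renderer not to encode its output.
    return render(markup, encoding=None)
```

With creoleparser installed, `to_unicode_html(u"espa\xf1ol")` should then give back a unicode object rather than a bytestring.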

Thanks,

Steve