unicode


Juan Fiol

Nov 24, 2009, 8:46:01 AM
to tah...@googlegroups.com

Hi, I have some HTML pages that contain non-ASCII UTF-8 characters, and
processing them fails with the error below. I am using python-2.6 and have
tested with tahchee-0.9.8 and with the latest git sources from the repository,
and also with cheetah-2.2.2 and cheetah-2.4.0.
My guess is that I have to declare somewhere in a template that I am using
UTF-8, but I really do not know where (or how) to do it.

Thanks for any help, Juan
Error message follows:
++++++++++++++++++++++++++++++++++++++++
[ ] Generating file 'bib_en.html'
Traceback (most recent call last):
  File "build.py", line 28, in <module>
    SiteBuilder(site).build(filter(lambda x:x not in ('local','remote'),sys.argv[1:]))
  File "/usr/lib/python2.6/site-packages/tahchee/main.py", line 577, in build
    self.applyTemplates(paths)
  File "/usr/lib/python2.6/site-packages/tahchee/main.py", line 635, in applyTemplates
    self.processFile( input_path, output_path, force )
  File "/usr/lib/python2.6/site-packages/tahchee/main.py", line 659, in processFile
    self.applyTemplate(ifile, force)
  File "/usr/lib/python2.6/site-packages/tahchee/main.py", line 710, in applyTemplate
    template = Template(file=template, searchList=[localdict])
  File "/usr/lib/python2.6/site-packages/Cheetah/Template.py", line 1244, in __init__
    self._compile(source, file, compilerSettings=compilerSettings)
  File "/usr/lib/python2.6/site-packages/Cheetah/Template.py", line 1538, in _compile
    keepRefToGeneratedCode=True)
  File "/usr/lib/python2.6/site-packages/Cheetah/Template.py", line 742, in compile
    settings=(compilerSettings or {}))
  File "/usr/lib/python2.6/site-packages/Cheetah/Compiler.py", line 1588, in __init__
    source = unicode(source)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 28193: ordinal not in range(128)
make: *** [local] Error 1

++++++++++++++++++++++++++++++++++++++++
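
For anyone hitting the same message: in Python 2, calling unicode() on a byte string without an explicit encoding decodes it with the default ASCII codec, so any byte above 0x7f raises this error. The sketch below is a Python 3 analogue of that failure, not Cheetah's code. The sample bytes and variable names are made up for illustration; 0xe2 is used because it starts the UTF-8 sequence for an en dash (U+2013).

```python
# Bytes as they might be read from a UTF-8 encoded template (illustrative
# sample, not taken from the original pages). 0xe2 0x80 0x93 is an en dash.
raw = b"pages 10\xe2\x80\x9320"

# Decoding as ASCII fails on the first non-ASCII byte, which is what
# Python 2's unicode(source) does implicitly for a byte string.
error = None
try:
    raw.decode("ascii")
except UnicodeDecodeError as exc:
    error = exc

# Decoding with the encoding the file was actually written in succeeds.
text = raw.decode("utf-8")

print(error)
print(text)
```

The position reported in the exception is the offset of the first offending byte, so position 28193 in the traceback above points at the first non-ASCII byte in the template source.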

Sébastien Pierre

Nov 24, 2009, 9:22:56 AM
to tah...@googlegroups.com, juan...@gmail.com
Hello Juan,

Could this be the same bug as the one reported here?

http://github.com/sebastien/tahchee/issues#issue/2

Also, did you try adding encoding info to your template?

#encoding UTF-8

(see
<http://www.cheetahtemplate.org/docs/users_guide_html_multipage/moduleFormatting.encoding.html>)

If this doesn't work, I would appreciate a test case so I can fix it.

Cheers,

-- Sébastien



Juan Fiol

Nov 24, 2009, 9:34:48 PM
to Sébastien Pierre, tah...@googlegroups.com

Hi, after the reply from Sébastien, I can confirm that the problem is in
Cheetah. Version 2.0.1 had no problem for me, but 2.2.2 and 2.4.0 do not work.
I tried adding the encoding line:
#encoding UTF-8
but it did not work.

I looked at the Cheetah Python files. Cheetah does not seem to recognize the
encoding line because it uses re.match rather than re.search to look for it.
I would have thought that re.match should work, but in fact it does not. A patch
to Compiler.py follows, to be applied in the Cheetah Python directory (in my
case: /usr/lib/python2.6/site-packages/Cheetah).
I hope it helps others seeing the same behavior. I have not contacted the
Cheetah developers yet; I'll file a bug directly with them.

Note that the behavior of Cheetah has changed since version 2.0.1. Previously I
did not have any encoding defined and it just worked. Now I have to include a
line with the encoding in each file to make it work.
Cheers, Juan

Patch follows:
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
--- /root/Compiler.py 2009-11-24 23:21:24.000000000 -0300
+++ Compiler.py 2009-11-24 23:20:03.000000000 -0300
@@ -1569,7 +1569,7 @@

 else:
     unicodeMatch = unicodeDirectiveRE.search(source)
-    encodingMatch = encodingDirectiveRE.match(source)
+    encodingMatch = encodingDirectiveRE.search(source)
     if unicodeMatch:
         if encodingMatch:
             raise ParseError(
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
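
One likely reason the one-word change matters: in Python's re module, match() only tries the pattern at the very start of the string (even with re.MULTILINE), while search() scans the whole string. The sketch below shows the difference on a template whose #encoding line is not the first thing in the source. The pattern is a simplified, hypothetical stand-in and not the actual encodingDirectiveRE from Compiler.py; the sample template is also made up.

```python
import re

# Hypothetical, simplified stand-in for Cheetah's encodingDirectiveRE;
# the real pattern in Compiler.py may differ.
encoding_directive_re = re.compile(
    r"^[ \t]*#encoding[ \t]+([-\w.]+)", re.MULTILINE
)

# Illustrative template: the directive sits on the second line.
source = "## a comment line\n#encoding UTF-8\n<html>caf\u00e9</html>\n"

# match() is anchored at position 0, so it does not see the directive.
match_result = encoding_directive_re.match(source)

# search() scans forward and finds the directive on line 2.
search_result = encoding_directive_re.search(source)

print(match_result)            # None
print(search_result.group(1))  # UTF-8
```

If the directive is the very first text in the file, both calls would find it with this simplified pattern, so whether match() fails in a given template depends on what precedes the #encoding line and on the real pattern.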


Sébastien Pierre

Nov 25, 2009, 8:47:17 AM
to Juan Fiol, juan...@gmail.com, tah...@googlegroups.com
OK, thanks a lot for looking into that!

-- Sébastien
