More unicode in JS issues

zaf

Feb 16, 2010, 11:09:42 AM

to TurboGears

Hi,
A while ago I had difficulties with non-ASCII characters in strings in
JS files.

Basically, every time I used the tg-admin i18n collect command on
a JavaScript file containing non-ASCII characters, it would give me a
Unicode error.

This was apparently supposed to be fixed here: http://trac.turbogears.org/changeset/6645.

Now I've been trying to collect my strings using
tg-admin collect --js-encoding utf-8, but I'm getting the following error:
Traceback (most recent call last):
  File "/usr/bin/tg-admin", line 8, in <module>
    load_entry_point('TurboGears==1.1', 'console_scripts', 'tg-admin')()
  File "/usr/lib/python2.5/site-packages/TurboGears-1.1-py2.5.egg/turbogears/command/base.py", line 416, in main
    command.run()
  File "/usr/lib/python2.5/site-packages/TurboGears-1.1-py2.5.egg/turbogears/command/i18n.py", line 150, in run
    self.scan_source_files()
  File "/usr/lib/python2.5/site-packages/TurboGears-1.1-py2.5.egg/turbogears/command/i18n.py", line 336, in scan_source_files
    self.scan_js_files(tmp_potfile, js_files)
  File "/usr/lib/python2.5/site-packages/TurboGears-1.1-py2.5.egg/turbogears/command/i18n.py", line 481, in scan_js_files
    self._write_potfile_entries(potfile, messages)
  File "/usr/lib/python2.5/site-packages/TurboGears-1.1-py2.5.egg/turbogears/command/i18n.py", line 489, in _write_potfile_entries
    text = catalog.normalize(text.encode('utf-8'))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 13: ordinal not in range(128)

So if there is someone around here who knows where the problem is and
how I can fix it, it would be very cool. Removing the non-ascii
characters from my js strings is not really an option as there are a
LOT of them.

Thanks for the help.

zaf

Feb 16, 2010, 11:10:28 AM

to TurboGears

Sorry I forgot to mention this is in TG1.1.
My initial problem was in TG 1.0 however.
Tom

Diez B. Roggisch

Feb 16, 2010, 11:35:06 AM

to turbo...@googlegroups.com, zaf

Can you show at least one example of such a file?

Diez

zaf

Feb 16, 2010, 11:53:28 AM

to TurboGears

Okay, well, here's a paste of JS code that fails:
http://pastebin.com/m1e7207c6

What's very weird is that, after some debugging, I realized line 10
failed but not the others, and only one string in line 10 actually
fails.

I did this to figure out the problem (this is in turbogears/command/i18n.py, line 483):

def _write_potfile_entries(self, potfile, messages):
    if messages:
        fd = open(potfile, 'at+')
        for linenumber, fname, text in messages:
            if text == '':
                continue
            print 'Original, ',type(text), text, fname, linenumber

            text = catalog.normalize(text.encode('utf-8'))

            fd.write('#: %s:%s\n' % (fname, linenumber))
            fd.write('msgid %s\n' % text)
            fd.write('msgstr ""\n\n')
        fd.close()

and this is the output I get for line 10:

Original, <type 'unicode'> clé(s) envoyée(s). cinego/static/javascript/film.js 222

Original, <type 'str'> Afficher le détail cinego/static/javascript/film.js 222

I don't understand why one string would be typed as string and another
as unicode.
I'm guessing it's got to do with my typing of the strings but I really
don't see what's wrong.
Tom

zaf

Feb 16, 2010, 12:38:16 PM

to TurboGears

Also I'm getting this warning when I collect strings on my js files.

/usr/lib/python2.5/site-packages/TurboGears-1.1-py2.5.egg/turbogears/command/i18n.py:469: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal

Don't know if it's related...
Tom

zaf

Feb 16, 2010, 1:40:42 PM

to TurboGears

Okay, well, it seems that moving the strings around solved the problem.
I just put them in a JS variable and used that in the JS code. The
UnicodeWarning also disappeared, which leads me to think the two were
connected.
This is very strange indeed...

Diez B. Roggisch

Feb 16, 2010, 5:05:45 PM

to turbo...@googlegroups.com

On 16.02.10 17:53, zaf wrote:

> Okay, well, here's a paste of JS code that fails:
> http://pastebin.com/m1e7207c6
>
> What's very weird is that, after some debugging, I realized line 10
> failed but not the others, and only one string in line 10 actually
> fails.
>
> I did this to figure out the problem (this is in turbogears/command/i18n.py, line 483):
>
> def _write_potfile_entries(self, potfile, messages):
>     if messages:
>         fd = open(potfile, 'at+')
>         for linenumber, fname, text in messages:
>             if text == '':
>                 continue
>             print 'Original, ',type(text), text, fname, linenumber
>             text = catalog.normalize(text.encode('utf-8'))
>             fd.write('#: %s:%s\n' % (fname, linenumber))
>             fd.write('msgid %s\n' % text)
>             fd.write('msgstr ""\n\n')
>         fd.close()
>
> and this is the output I get for line 10:
>
> Original, <type 'unicode'> clé(s) envoyée(s). cinego/static/javascript/film.js 222
>
> Original, <type 'str'> Afficher le détail cinego/static/javascript/film.js 222
>
> I don't understand why one string would be typed as string and another
> as unicode.
> I'm guessing it's got to do with my typing of the strings but I really
> don't see what's wrong.

Well, that's pretty simple: you saved them in some encoding other than
UTF-8, which the command requires in order to work (there might be
a parameter to set the encoding, I don't remember).

And shuffling the strings around might have convinced your editor to store
them as UTF-8 by accident. Whatever editor you use, you should try to
figure out how to make it save in UTF-8 by default.
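
As for the traceback itself: in Python 2, calling .encode('utf-8') on a
value that is already a byte string (str) makes Python decode it with the
ASCII codec first, which is why an encode call can end in a
UnicodeDecodeError on byte 0xc3. Below is a minimal sketch of handling both
types explicitly. It is written for Python 3, where the two types are bytes
and str; to_utf8_bytes and its source_encoding parameter are my own
illustration, not something that exists in TurboGears.

```python
def to_utf8_bytes(text, source_encoding="utf-8"):
    """Return `text` as UTF-8 bytes, whether it arrives as bytes or as str.

    Hypothetical helper for illustration only; not a TurboGears function.
    """
    if isinstance(text, bytes):
        # Decode explicitly with the declared encoding instead of relying
        # on an implicit ASCII decode, which fails on any byte >= 0x80.
        text = text.decode(source_encoding)
    return text.encode("utf-8")


# The two kinds of input seen in the debug output end up as the same bytes.
as_text = "Afficher le d\u00e9tail"
as_bytes = as_text.encode("utf-8")
assert to_utf8_bytes(as_text) == to_utf8_bytes(as_bytes)
```

If the bytes really are in another encoding, source_encoding has to name
it; guessing wrong either raises an error or silently produces the wrong
characters.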

Diez

zaf

Feb 17, 2010, 11:18:57 AM

to TurboGears

I use Vim, and I copied the strings to a different place in the file
and now it works. It could be that my Vim was configured differently
before, but I don't remember changing that. Or maybe I used another
editor for some of it. I don't know.
However, I did notice that files minified with the Yahoo YUI
Compressor generate the same error.
Also, it was only a select set of strings in each file. If the editor
was the problem, wouldn't it save everything in the same encoding?
Anyway, I found a way to solve it, so I'm good for now I guess.
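
In case it helps anyone else: a file can contain a mix of encodings when
strings are pasted in or produced by different tools, so one way to locate
the offending strings is to check each line separately. This is a minimal
Python 3 sketch; find_non_utf8_lines is a made-up helper, not part of
TurboGears, and the sample bytes are invented for illustration.

```python
def find_non_utf8_lines(data):
    """Return (line_number, raw_line) pairs for lines that are not valid UTF-8.

    `data` is the raw content of a file as bytes.
    """
    bad = []
    for number, line in enumerate(data.split(b"\n"), start=1):
        try:
            line.decode("utf-8")
        except UnicodeDecodeError:
            # This line contains bytes that cannot be UTF-8, so it was
            # probably saved or pasted in another encoding.
            bad.append((number, line))
    return bad


# Invented sample: line 1 is UTF-8, line 2 uses a Latin-1 byte for the accent.
sample = b"var a = 'cl\xc3\xa9';\nvar b = 'd\xe9tail';\n"
print(find_non_utf8_lines(sample))
```

To run it on a real file, read the file in binary mode, e.g.
open(path, 'rb').read(), where path is whichever JS file fails.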
Tom
