Markdown Encoding Problem

bfrederi

Jun 1, 2009, 2:26:48 PM
to Django users
I am having problems using the
django.contrib.markup.templatetags.markup.markdown function with
special characters (diacritics and such).

I am using markdown in my model and creating a function that returns
markdown from a model field. I even went as far as to override the
save method for my model, and encode the field data as "utf-8" prior
to the field data being saved, like so:

    def save(self):
        """Overrides the Model's save method"""
        self.collection_description_short = self.collection_description_short.encode("utf-8")
        self.collection_description_long = self.collection_description_long.encode("utf-8")
        super(MyModel, self).save()

When I try to retrieve it and return the field data in a function, I
do this:

    from django.contrib.markup.templatetags.markup import markdown

    def collection_description_short_markdown(self):
        """ Generate markdown out of the short description text """
        if self.collection_description_short:
            try:
                markdown_html = markdown(self.collection_description_short)
            except:
                return "Fail"
            return markdown_html
        else:
            return None

And the markdown function always fails if there is a special character
in the field data.

Any suggestions?

Karen Tracey

Jun 1, 2009, 2:49:12 PM
to django...@googlegroups.com

Get rid of the try/except/return "Fail" so that you get feedback on what, exactly, the problem is.  What you've done there is hide whatever specific error/exception message markdown may have provided and replaced it with a generic "Fail" that doesn't convey any information as to what might be wrong.  I'd hope markdown is a bit more specific about what it is having trouble with, and that should give a clue how to fix it.
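
To illustrate the point, here is a minimal sketch written for a current Python 3 interpreter rather than the Python 2.5 setup in this thread. The `render` function is a hypothetical stand-in for the markdown call, not the real library; it only exists so the example runs on its own. The first wrapper hides the error the way the original code does, the second logs the traceback and re-raises so the cause stays visible.

```python
import logging

logger = logging.getLogger(__name__)


def render(text):
    """Hypothetical stand-in for a markdown renderer that rejects bytes."""
    if not isinstance(text, str):
        raise TypeError("render() expects text, got %s" % type(text).__name__)
    return "<p>%s</p>" % text


def description_html_hidden(value):
    """Pattern from the original post: every failure collapses to "Fail"."""
    try:
        return render(value)
    except Exception:
        return "Fail"


def description_html(value):
    """Log the full traceback, then re-raise so the real error is not lost."""
    if not value:
        return None
    try:
        return render(value)
    except Exception:
        logger.exception("markdown rendering failed for %r", value)
        raise
```

With the second version a bad input still fails, but the log and the exception both say why.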

Karen

bfrederi

Jun 1, 2009, 4:02:08 PM
to Django users
I'm getting this:

Traceback (most recent call last):
  File "/home/django-code/aubrey_explore/tests.py", line 18, in testSpeaking
    self.assert_(markdown(self.bla.collection_description_short))
  File "/usr/lib/python2.5/site-packages/django/contrib/markup/templatetags/markup.py", line 72, in markdown
    return mark_safe(force_unicode(markdown.markdown(smart_str(value), extensions, safe_mode=safe_mode)))
  File "/var/lib/python-support/python2.5/markdown.py", line 1722, in markdown
    return md.convert(text)
  File "/var/lib/python-support/python2.5/markdown.py", line 1614, in convert
    self.source = removeBOM(self.source, self.encoding)
  File "/var/lib/python-support/python2.5/markdown.py", line 74, in removeBOM
    if text.startswith(bom):
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 292: ordinal not in range(128)

bfrederi

Jun 1, 2009, 4:21:32 PM
to Django users
Sorry, didn't mean to post so many replies. A combination of an
annoying KVM switch and user error.

Karen Tracey

Jun 1, 2009, 7:11:13 PM
to django...@googlegroups.com
On Mon, Jun 1, 2009 at 4:02 PM, bfrederi <brfred...@gmail.com> wrote:

> I'm getting this:
>
> Traceback (most recent call last):
>   File "/home/django-code/aubrey_explore/tests.py", line 18, in testSpeaking
>     self.assert_(markdown(self.bla.collection_description_short))
>   File "/usr/lib/python2.5/site-packages/django/contrib/markup/templatetags/markup.py", line 72, in markdown
>     return mark_safe(force_unicode(markdown.markdown(smart_str(value), extensions, safe_mode=safe_mode)))
>   File "/var/lib/python-support/python2.5/markdown.py", line 1722, in markdown
>     return md.convert(text)
>   File "/var/lib/python-support/python2.5/markdown.py", line 1614, in convert
>     self.source = removeBOM(self.source, self.encoding)
>   File "/var/lib/python-support/python2.5/markdown.py", line 74, in removeBOM
>     if text.startswith(bom):
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 292: ordinal not in range(128)

Much more useful.  If you Google markdown and removeBOM the top hit will be this ticket:

http://code.djangoproject.com/ticket/5663#comment:13

That comment, specifically, includes the same exception and traceback as you are showing. I read subsequent discussion in the ticket to be saying that the problem here is the markdown version, it's some pre-Unicode support level that you likely don't want to be using if you need proper Unicode support.  Apparently some bits may work (that comment shows passing Unicode working whereas passing a utf-8 encoded bytestring of the same content fails), but the comments from a markdown core dev indicate any semblance of "working" here is likely accidental.  Sounds like the easiest fix for you may be to upgrade your markdown to at least 1.7. 
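
For readers unfamiliar with the difference between Unicode text and a utf-8 encoded bytestring, the sketch below reproduces the same kind of error on a current Python 3 interpreter. It is only an illustration: in the Python 2.5 traceback the ASCII decode most likely happens implicitly when a bytestring meets a unicode value inside `startswith`, whereas here it is requested explicitly. The sample word and the helper name `decode_as_ascii` are made up for the example.

```python
text = "caf\u00e9"               # Unicode text: c, a, f, e-acute
encoded = text.encode("utf-8")  # bytestring: the e-acute becomes 0xc3 0xa9


def decode_as_ascii(data):
    """Return the error message from an ASCII decode, or None on success."""
    try:
        data.decode("ascii")
    except UnicodeDecodeError as exc:
        return str(exc)
    return None


message = decode_as_ascii(encoded)
# The message names byte 0xc3, the first byte of the encoded e-acute.
print(message)
```

Byte 0xc3 is the lead byte utf-8 uses for many accented Latin letters, which is why it shows up so often in reports like this one.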

(I'd also get rid of that save() override that is changing the fields to be utf-8 encoded bytestrings.  It's possibly mostly harmless but introduces a difference in type for those fields depending on whether you've called save() on the instance or pulled it from the database, and that could cause some confusion down the road.)
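
The type difference can be sketched without Django at all. `FakeRecord` below is a hypothetical stand-in for a model instance, and its class-level dictionary plays the role of the database, on the assumption that the database layer hands back text on load. It runs on Python 3, where the mismatch is even more visible than on Python 2.

```python
class FakeRecord:
    """Hypothetical stand-in for a model instance; this is not Django code."""

    _storage = {}  # pretend database table with a single row

    def __init__(self, description):
        self.description = description

    def save(self):
        # Mirrors the override in the original post: encode before saving.
        self.description = self.description.encode("utf-8")
        # Assumption: the database layer stores and returns text.
        FakeRecord._storage["row"] = self.description.decode("utf-8")

    @classmethod
    def load(cls):
        return cls(cls._storage["row"])


saved = FakeRecord("caf\u00e9")
saved.save()
loaded = FakeRecord.load()

# Same logical value, two different types depending on the code path.
print(type(saved.description).__name__, type(loaded.description).__name__)
```

Any code that receives the instance now has to handle both types, which is the confusion being warned about.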

Karen

Waylan Limberg

Jun 2, 2009, 9:08:23 AM
to Django users


On Jun 1, 7:11 pm, Karen Tracey <kmtra...@gmail.com> wrote:
>
> That comment, specifically, includes the same exception and traceback as you
> are showing. I read subsequent discussion in the ticket to be saying that
> the problem here is the markdown version, it's some pre-Unicode support
> level that you likely don't want to be using if you need proper Unicode
> support.  Apparently some bits may work (that comment shows passing Unicode
> working whereas passing a utf-8 encoded bytestring of the same content
> fails), but the comments from a markdown core dev indicate any semblance of
> "working" here is likely accidental.  Sounds like the easiest fix for you
> may be to upgrade your markdown to at least 1.7.

Karen nailed it. If you are using anything prior to Markdown 1.7,
upgrade immediately (the 1.6 series was horribly buggy). Actually,
Markdown is currently at version 2.0.1 [1]. With 2.0 we've made a
number of improvements. However, one thing we are dedicated to
keeping the same is that, since 1.7, Markdown will only ever accept
unicode text as input - nothing else. That has been a tremendous
help in eliminating these kinds of problems.

[1]: http://pypi.python.org/pypi/Markdown
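
One way to live with a text-only interface is to decode at the boundary, before the value reaches the library. The helper below is a rough sketch for Python 3; the name `ensure_text` and the utf-8 default are assumptions made for illustration, not part of Markdown or Django.

```python
def ensure_text(value, encoding="utf-8"):
    """Return value as text, decoding bytes with the given encoding."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    if isinstance(value, str):
        return value
    raise TypeError("expected text or bytes, got %s" % type(value).__name__)


# Both inputs end up as the same text value.
from_bytes = ensure_text(b"caf\xc3\xa9")
from_text = ensure_text("caf\u00e9")
print(from_bytes == from_text)
```

Assuming the Markdown package is installed, the result can then be passed on with `markdown.markdown(ensure_text(value))`.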

Waylan Limberg

bfrederi

Jun 2, 2009, 11:05:04 AM
to Django users
Yes. I completely missed that ticket, but I had switched to importing
markdown normally instead of through Django, and it solved my
problems.

But I noticed on Ubuntu that the repository version of python-markdown
is still 1.6. So I will switch to a newer version of Markdown. Thank
you both for your help.
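
Since distribution packages can lag behind, a quick check of the installed version may save some debugging. The sketch below assumes a current Python 3 (3.8 or later, for `importlib.metadata`) and that the package is published under the distribution name "Markdown"; the helper names and the simple version parser are made up for this example and ignore pre-release suffixes.

```python
import re
from importlib import metadata

MINIMUM = (1, 7)  # first release described in this thread as unicode-only


def parse_version(version_string):
    """Turn '2.0.1' into (2, 0, 1); stops at the first non-numeric part."""
    parts = []
    for piece in version_string.split("."):
        match = re.match(r"\d+", piece)
        if not match:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def is_new_enough(version_string, minimum=MINIMUM):
    """True if the given version is at least the minimum."""
    return parse_version(version_string) >= minimum


def installed_markdown_version():
    """Return the installed Markdown version string, or None if absent."""
    try:
        return metadata.version("Markdown")
    except metadata.PackageNotFoundError:
        return None


found = installed_markdown_version()
if found is None:
    print("Markdown is not installed")
else:
    print(found, "ok" if is_new_enough(found) else "too old, upgrade")
```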