non-ASCII characters in the Sage library

Pat LeSmithe

Jan 21, 2010, 12:22:39 AM
to sage-...@googlegroups.com
Should we put

# -*- coding: utf-8 -*-

at the top of all .py and .pyx(?) files in the Sage library?

I think this will allow us to use Unicode literal strings in Sage code,
doctests, documentation --- without raising coding errors.

Thoughts?

Some links:

http://wiki.sagemath.org/devel/nonASCII
http://docs.python.org/howto/unicode.html#unicode-literals-in-python-source-code
http://trac.sagemath.org/sage_trac/ticket/6682

Gonzalo Tornaria

Jan 21, 2010, 9:09:07 AM
to sage-...@googlegroups.com
On Thu, Jan 21, 2010 at 3:22 AM, Pat LeSmithe <qed...@gmail.com> wrote:
> Should we put
>
> # -*- coding: utf-8 -*-
>
> at the top of all .py and .pyx(?) files in the Sage library?
>
> I think this will allow us to use Unicode literal strings in Sage code,
> doctests, documentation --- without raising coding errors.

I had to patch sagenb b/c doctests don't display in the notebook when
they have utf-8 (this was rc0,

However, I just discovered that sagenb (in
local/lib/python2.6/site-packages/sagenb-0.5-py2.6.egg/sagenb) is not
under hg, so I will need to dig up what I actually changed...

The relevant ticket is http://trac.sagemath.org/sage_trac/ticket/6682

I'll post more about this later.

BTW a few questions:

a. is it necessary to put the utf-8 stanza on all files, or only on
those which include non-ascii characters?

b. the way the line is written, I think it will be recognized by
emacs, but not by vim. Do we care about that?

c. is there a way to do a "sanity check" to the source to make sure we
don't get incorrect encodings?

d. should doctests with non-ascii characters be created as unicode
strings, or as regular strings with utf-8 encoding?
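
On (b) and (c), here is a minimal sketch of such a sanity check. It assumes Python 3, where tokenize.detect_encoding applies the PEP 263 rule for coding lines (that function is not available in Python 2.6). The helper name check_source and the sample byte strings are made up for illustration and are not part of Sage.

```python
import io
import tokenize


def check_source(data):
    """Return (encoding, problem) for the raw bytes of one source file.

    problem is None when the bytes decode cleanly under the encoding
    that Python detects from the BOM or the coding line.
    """
    try:
        # Reads at most the first two lines, as PEP 263 specifies.
        encoding, _ = tokenize.detect_encoding(io.BytesIO(data).readline)
    except SyntaxError as exc:
        # Raised for an unknown or inconsistent encoding declaration.
        return None, str(exc)
    try:
        data.decode(encoding)
    except UnicodeDecodeError as exc:
        return encoding, str(exc)
    return encoding, None


# Hypothetical file contents, given as raw bytes.
EMACS_STYLE = b"# -*- coding: utf-8 -*-\nname = 'Tornar\xc3\xada'\n"
VIM_STYLE = b"# vim: set fileencoding=utf-8 :\nx = 1\n"
# Declares UTF-8 but actually contains a Latin-1 byte (0xED).
MISLABELLED = b"# -*- coding: utf-8 -*-\nname = 'Tornar\xeda'\n"

for sample in (EMACS_STYLE, VIM_STYLE, MISLABELLED):
    print(check_source(sample))
```

Python itself accepts both the emacs-style and the vim-style line, since it only looks for "coding:" or "coding=" in a comment on the first two lines. The third sample is the kind of incorrect encoding such a check is meant to catch.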

Gonzalo

kcrisman

Jan 21, 2010, 9:16:26 AM
to sage-devel
Not everyone can easily use a text editor which recognizes all
non-ASCII characters properly, so I think we should be careful about
this.

- kcrisman

Gonzalo Tornaria

Jan 21, 2010, 9:53:51 AM
to sage-...@googlegroups.com
On Thu, Jan 21, 2010 at 12:16 PM, kcrisman <kcri...@gmail.com> wrote:
> Not everyone can easily use a text editor which recognizes all
> non-ASCII characters properly, so I think we should be careful about
> this.

I don't think that's true anymore. It may have been true ten years
ago, but nowadays Unicode and UTF-8 are pretty much standard.

For the Sage source code itself, it probably amounts only to being able
to spell most names correctly. But it will help us fully support
Unicode, which is necessary for translations. Even with the English
version, students want to write comments and text in their own
language, so the support is quite important.

Gonzalo

William Stein

Jan 21, 2010, 10:01:57 AM
to sage-devel, sage-notebook
On Thu, Jan 21, 2010 at 6:09 AM, Gonzalo Tornaria
<torn...@math.utexas.edu> wrote:
> On Thu, Jan 21, 2010 at 3:22 AM, Pat LeSmithe <qed...@gmail.com> wrote:
>> Should we put
>>
>> # -*- coding: utf-8 -*-
>>
>> at the top of all .py and .pyx(?) files in the Sage library?
>>
>> I think this will allow us to use Unicode literal strings in Sage code,
>> doctests, documentation --- without raising coding errors.
>
> I had to patch sagenb b/c doctests don't display in the notebook when
> they have utf-8 (this was rc0,
>
> However, I just discovered that sagenb (in
> local/lib/python2.6/site-packages/sagenb-0.5-py2.6.egg/sagenb) is not
> under hg, so I will need to dig up what I actually changed...

You should post to

http://groups.google.com/group/sage-notebook

about the Sage notebook; otherwise, most of the people working on the
notebook might not even see your posts. I've cc'd this message
there.

--
William Stein
Associate Professor of Mathematics
University of Washington
http://wstein.org

Gonzalo Tornaria

Jan 21, 2010, 10:14:33 PM
to sage-...@googlegroups.com
On Fri, Jan 22, 2010 at 12:46 AM, Minh Nguyen <nguye...@gmail.com> wrote:
> With or without the above Unicode preamble, a non-ASCII character in a
> docstring can cause the PDF version of a document to fail to build.
> See ticket #8036 [1] for an example of a case where a source file
> contains the above preamble, but the PDF version of the reference
> manual fails to build due to non-ASCII characters in the docstring of
> a method.

That needs to be fixed in the LaTeX preamble, i.e. by letting LaTeX know
that the file is encoded as UTF-8.

See my comments on the ticket (and the attached LaTeX example file).
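
For illustration only, here is roughly what that could look like if the PDF is generated through Sphinx. This is a sketch of a conf.py fragment using the standard latex_elements setting, not the actual fix from the ticket; the code point U+2212 is an arbitrary example, and whether any given character needs an explicit declaration depends on the LaTeX installation.

```python
# Hypothetical fragment of a Sphinx conf.py; not taken from the Sage tree.
latex_elements = {
    # Tell LaTeX that the generated .tex files are UTF-8 encoded.
    "inputenc": r"\usepackage[utf8]{inputenc}",
    # Extra preamble line: map one Unicode code point (U+2212, the
    # minus sign, chosen only as an example) to a LaTeX replacement.
    "preamble": r"\DeclareUnicodeCharacter{2212}{\ensuremath{-}}",
}
```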

> [1] http://trac.sagemath.org/sage_trac/ticket/8036

Gonzalo
