only use ASCII characters in patches

Minh Nguyen

Aug 4, 2009, 5:10:39 AM
to sage-...@googlegroups.com
Hi folks,

Michael Abshoff has complained before about using non-ASCII characters
in patches. Today, I experienced first-hand why he complained. The
thing is, if a Sage library file contains non-ASCII characters, this
can result in errors or warnings when loading Sage. I recently merged
#5793, which has a patch that contains non-ASCII characters. As a
follow-up to that ticket, I created #6674

http://trac.sagemath.org/sage_trac/ticket/6674

in order to fix that issue. The non-ASCII character issue should be
fixed before releasing Sage 4.1.1. Who would like to volunteer to
review #6674?

--
Regards
Minh Van Nguyen

David Kirkby

Aug 4, 2009, 6:32:31 AM
to sage-...@googlegroups.com
2009/8/4 Minh Nguyen <nguye...@gmail.com>:

I agree non-ASCII characters cause problems. I was intending to review
it for you, as it is clearly a very trivial fix. However, you have
replaced the first letter of the name with the letter 'O'. At least
when I view it, it looks more like an 'A' and an 'O'. Clearly it can't
be represented accurately, but is an 'A' more appropriate than an 'O'?

Dave.

Dag Sverre Seljebotn

Aug 4, 2009, 6:48:58 AM
to sage-...@googlegroups.com
Minh Nguyen wrote:
> Hi David,
> It looks like an "O" to me. I referred to this page
>
> http://users.tkk.fi/pat/cliquer.html
>
> which is the homepage of the software in question.

An O with double dots is definitely not an A. It might be represented as
"Oe" (though I'm open to correction about the usual practice among
Swedish/Finnish people).

I don't know whether Sage has made a decision to be ASCII-only; in that
case please disregard the below, but:

When it comes to the warnings, it is quite trivial to add a header at the
top of the py file:

# encoding: utf-8

(or whatever encoding one is using -- standardizing on utf-8 is probably a
good idea). I don't see why one should restrict oneself to ASCII in this
day and age, when the names of many contributors will NOT be representable
as ASCII.
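
As a rough illustration of the header mentioned above, the sketch below asks Python's standard tokenize module which encoding a source file declares. The helper name declared_encoding is made up for this example, and the sketch runs on Python 3, which assumes UTF-8 when no declaration is present; the Python 2 that Sage used at the time assumed ASCII instead.

```python
import io
import tokenize


def declared_encoding(source_bytes):
    """Return the encoding Python would use to read this source (hypothetical helper)."""
    encoding, _first_lines = tokenize.detect_encoding(io.BytesIO(source_bytes).readline)
    return encoding


# The header form suggested in this message is recognized.
print(declared_encoding(b"# encoding: utf-8\nx = 1\n"))
# The Emacs-style form works too; the name is normalized.
print(declared_encoding(b"# -*- coding: latin-1 -*-\nx = 1\n"))
```

The declaration only counts if it sits on the first or second line of the file.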

Dag Sverre

Minh Nguyen

Aug 4, 2009, 6:55:23 AM
to sage-...@googlegroups.com
Hi Dag,

On Tue, Aug 4, 2009 at 8:48 PM, Dag Sverre
Seljebotn<da...@student.matnat.uio.no> wrote:

<SNIP>

> An O with double dots is definitely not an A. It might be represented as
> "Oe" (though I'm open to correction about the usual practice among
> Swedish/Finnish people).
>
> I don't know whether Sage has made a decision to be ASCII-only; in that
> case please disregard the below, but:
>
> When it comes to the warnings, it is quite trivial to add a header at the
> top of the py file:
>
> # encoding: utf-8
>
> (or whatever encoding one is using -- standardizing on utf-8 is probably a
> good idea). I don't see why one should restrict oneself to ASCII in this
> day and age, when the names of many contributors will NOT be representable
> as ASCII.

Just an aside: with UTF-8 in source files, one can't read the
non-ASCII characters in the patch when viewed using trac or a browser.
Just my 2 cents.

Dag Sverre Seljebotn

Aug 4, 2009, 7:05:13 AM
to sage-...@googlegroups.com

Actually, you are probably talking about different characters! Neither ö
nor å is in ASCII; the latter one is definitely closer to A :-) (in
Norwegian we'd type it as "aa" in ASCII; it wouldn't surprise me if it's
the same with Swedish).

Dag Sverre

Minh Nguyen

Aug 4, 2009, 7:10:02 AM
to sage-...@googlegroups.com
Hi Dag,

On Tue, Aug 4, 2009 at 9:05 PM, Dag Sverre
Seljebotn<da...@student.matnat.uio.no> wrote:

<SNIP>

>> An O with double dots is definitely not an A. It might be represented as
>> "Oe" (though I'm open to correction about the usual practice among
>> Swedish/Finnish people).
>
> Actually, you are probably talking about different characters! Neither ö
> nor å is in ASCII; the latter one is definitely closer to A :-) (in
> Norwegian we'd type it as "aa" in ASCII; it wouldn't surprise me if it's
> the same with Swedish).

So instead of

Ostergard

the patch at #6674 should read

Oestergaard?
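
The two spellings above correspond to two different ways of reducing a name to ASCII. The sketch below shows both, assuming the original spelling is "Östergård". The DIGRAPHS table and both function names are made up for illustration, and the digraph conventions are exactly what is being debated in this thread, so treat the mapping as an assumption rather than a rule.

```python
import unicodedata

# Hypothetical mapping for illustration only; actual conventions vary by language.
DIGRAPHS = {"Ö": "Oe", "ö": "oe", "Å": "Aa", "å": "aa", "Ä": "Ae", "ä": "ae"}


def strip_diacritics(text):
    """Decompose accented letters and drop the combining marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


def use_digraphs(text):
    """Replace each mapped letter with its two-letter ASCII form."""
    return "".join(DIGRAPHS.get(ch, ch) for ch in text)


# Written with escapes so that this example is itself ASCII-only.
name = "\u00d6sterg\u00e5rd"
print(strip_diacritics(name))  # Ostergard
print(use_digraphs(name))      # Oestergaard
```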

Carlo Hamalainen

Aug 4, 2009, 8:05:11 AM
to sage-...@googlegroups.com
On Tue, Aug 4, 2009 at 12:55 PM, Minh Nguyen<nguye...@gmail.com> wrote:
> Just an aside: with UTF-8 in source files, one can't read the
> non-ASCII characters in the patch when viewed using trac or a browser.
> Just my 2 cents.

Why does trac have an issue with non-ASCII characters? It doesn't do utf-8?

--
Carlo Hämäläinen
http://carlo-hamalainen.net

Minh Nguyen

Aug 4, 2009, 8:10:34 AM
to sage-...@googlegroups.com
Hi Carlo,

On Tue, Aug 4, 2009 at 10:05 PM, Carlo
Hamalainen<carlo.ha...@gmail.com> wrote:
>
> On Tue, Aug 4, 2009 at 12:55 PM, Minh Nguyen<nguye...@gmail.com> wrote:
>> Just an aside: with UTF-8 in source files, one can't read the
>> non-ASCII characters in the patch when viewed using trac or a browser.
>> Just my 2 cents.
>
> Why does trac have an issue with non-ASCII characters? It doesn't do utf-8?

Have a look at this patch viewed in a browser:

http://trac.sagemath.org/sage_trac/attachment/ticket/6674/trac_6674-use-ascii.patch

Carlo Hamalainen

Aug 4, 2009, 8:51:52 AM
to sage-...@googlegroups.com
On Tue, Aug 4, 2009 at 2:10 PM, Minh Nguyen<nguye...@gmail.com> wrote:
> Have a look at this patch viewed in a browser:
>
> http://trac.sagemath.org/sage_trac/attachment/ticket/6674/trac_6674-use-ascii.patch

Right, so what happens if we set default_charset to utf-8 in trac.ini?

http://trac.edgewall.org/wiki/TracIni#trac-section

Just curious.
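
For context on why the charset setting could matter, here is a small sketch of one common way such garbling arises: UTF-8 bytes displayed by a viewer that assumes a single-byte charset. Whether this is what happens on the Sage trac is an assumption, not something established in this thread, and view_as is a made-up helper name.

```python
def view_as(text, assumed_charset):
    """Encode text as UTF-8, then decode it the way a viewer assuming another charset would."""
    return text.encode("utf-8").decode(assumed_charset)


# Written with escapes so that this example is itself ASCII-only.
original = "H\u00e4m\u00e4l\u00e4inen"
print(view_as(original, "utf-8"))       # round-trips unchanged
print(view_as(original, "iso-8859-1"))  # each non-ASCII letter becomes two characters
```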

--
Carlo Hamalainen
http://carlo-hamalainen.net

Minh Nguyen

Aug 4, 2009, 8:55:38 AM
to sage-...@googlegroups.com
Hi Carlo,

On Tue, Aug 4, 2009 at 10:51 PM, Carlo
Hamalainen<carlo.ha...@gmail.com> wrote:
>
> On Tue, Aug 4, 2009 at 2:10 PM, Minh Nguyen<nguye...@gmail.com> wrote:
>> Have a look at this patch viewed in a browser:
>>
>> http://trac.sagemath.org/sage_trac/attachment/ticket/6674/trac_6674-use-ascii.patch
>
> Right, so what happens if we set default_charset to utf-8 in trac.ini?
>
> http://trac.edgewall.org/wiki/TracIni#trac-section
>
> Just curious.

I think it wouldn't matter that much. See this

http://sage.math.washington.edu/home/mvngu/patch/trac.txt

David Kirkby

Aug 4, 2009, 10:49:40 AM
to sage-...@googlegroups.com
2009/8/4 Minh Nguyen <nguye...@gmail.com>:


One other possibility is making the reference "Sampo Niskanen et al."

I don't claim to know the most appropriate letter - it was just what it
looked like when I saw it. But I would agree non-ASCII characters
should be removed - they cause many issues.

Dave

peter...@optushome.com.au

Aug 5, 2009, 3:19:15 PM
to sage-...@googlegroups.com
On 2009-Aug-04 15:49:40 +0100, David Kirkby <david....@onetel.net> wrote:
>I don't claim to know the most appropriate letter - it was just what it
>looked like when I saw it. But I would agree non-ASCII characters
>should be removed - they cause many issues.

I would see restricting Sage to ASCII-only as fairly limiting -
English is one of about two languages that can be expressed using only
ASCII. If Sage is going to be used by (and accept contributions from)
people outside the English-speaking world then non-ASCII support is
fairly important.

The proposed patch might be the best solution for the short term but
in the longer term, I believe Sage needs to look at how non-ASCII
characters can be supported - UTF-8 is probably the best solution here
since it is a proper superset of ASCII.

--
Peter Jeremy

Gonzalo Tornaria

Aug 5, 2009, 10:21:17 PM
to sage-...@googlegroups.com

+1

My students had very frustrating issues with loading code with
non-ASCII characters or CRLF line endings, just because they used
non-ASCII letters in comments or an editor with DOS line endings.

This arises in the following setup: they write a .py or a .sage file,
which they upload to a sage worksheet. When they try to load the .py,
it gives useless errors if the file contains CRLF or non-ascii
letters. I think it is a bit better when loading .sage files.
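
A file can be checked for both problems before it is uploaded. The sketch below is one minimal way to do that, not something Sage provides; find_load_hazards is a made-up name, and it only looks at raw bytes without trying to interpret the encoding.

```python
def find_load_hazards(data):
    """Return (line_number, problem) pairs for CRLF endings and non-ASCII bytes."""
    problems = []
    for number, line in enumerate(data.split(b"\n"), start=1):
        if line.endswith(b"\r"):
            problems.append((number, "CRLF line ending"))
        if any(byte > 127 for byte in line):
            problems.append((number, "non-ASCII byte"))
    return problems


# Line 1 ends in CRLF; line 2 has an accented letter in a comment.
sample = "x = 1\r\n# caf\u00e9\ny = 2\n".encode("utf-8")
for number, problem in find_load_hazards(sample):
    print(number, problem)
```

For a real file, read it in binary mode (open(path, "rb").read()) and pass the bytes in.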

It's especially frustrating since UTF-8 in the notebook works
reasonably well --- can use non-ascii in comments, html cells, even
python strings, etc. (some quirks may remain, but it's pretty good ---
much better than back in March when my course started).

Also, it'd be nice to be able to spell names properly in sage source
code (this one is way less important to me).

Best, Gonzalo

Minh Nguyen

Aug 6, 2009, 8:15:49 PM
to sage-...@googlegroups.com
Hi Gonzalo,

I agree that restricting to ASCII only is a short-term option. The
issue that started this thread is related to non-ASCII characters in a
patch that modifies

sage/graphs/graph.py

That patch caused the reference manual to fail to build with Sage
4.1.1.rc1 and resulted in many doctest failures. The issue of non-ASCII
characters should be revisited after the release of Sage 4.1.1. Until
then, only ASCII characters should be used in any patches that are to
be merged in the 4.1.1 release cycle.

Robert Bradshaw

Aug 7, 2009, 3:04:38 AM
to sage-...@googlegroups.com
On Aug 6, 2009, at 5:15 PM, Minh Nguyen wrote:

>
> I agree that restricting to ASCII only is a short-term option. The
> issue that started this thread is related to non-ASCII characters in a
> patch that modifies
>
> sage/graphs/graph.py
>
> That patch caused the reference manual to fail to build with Sage
> 4.1.1.rc1 and resulted in many doctest failures. The issue of non-ASCII
> characters should be revisited after the release of Sage 4.1.1. Until
> then, only ASCII characters should be used in any patches that are to
> be merged in the 4.1.1 release cycle.


I'm +1 to removing this restriction if possible, though I do think
non-ascii characters should be used sparingly (e.g. no fancy dashes,
curly quotes, funky variable names).

I've made http://trac.sagemath.org/sage_trac/ticket/6682

- Robert
