Semantics of %


Robert Bradshaw

Mar 12, 2009, 6:14:09 AM
to sage-...@googlegroups.com
Here's a quick poll.

In Python, if I write "-1 % 5", I get 4. This is how we do it in Sage
as well (and I think it's the right way--that's not what I'm trying
to ask). However, in C if I write "-1 % 5" I get -1. The question is,
what should I get in Cython if I write (a % b) where a and b are cdef
ints? Should I

[ ] Get 4, because it should behave just like in Python, even though
it will require extra logic and be a bit slower

[ ] Get -1, because they're C ints, and besides we wouldn't be using
Cython if we didn't care about performance

[ ] Let the programmer decide (e.g. using
http://wiki.cython.org/enhancements/compilerdirectives ), recognizing
that % will mean different things in different contexts.
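
To make the first two options concrete, here is a minimal Python sketch. `c_style_mod` is a hypothetical helper, not part of Python or Cython; it only imitates the result C99 gives for `a % b` on signed ints, while Python's built-in `%` is used unchanged.

```python
def c_style_mod(a, b):
    """Remainder that takes the sign of the dividend, imitating C99's a % b."""
    r = abs(a) % abs(b)          # magnitude of the remainder
    return -r if a < 0 else r    # copy the sign of the dividend

python_result = -1 % 5           # Python semantics: result has the sign of the divisor
c_result = c_style_mod(-1, 5)    # C99 semantics: result has the sign of the dividend

print(python_result, c_result)
```

The two results differ only when the operands have opposite signs and the remainder is nonzero, which is why the difference is easy to miss in testing.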

- Robert

David Joyner

Mar 12, 2009, 7:16:30 AM
to sage-...@googlegroups.com
I have an ignorant question: what are the canonical reps of
ZZ/nZZ in C? (-n/2,n/2]?
Is the issue to decide between the interval [0,n-1] as reps of ZZ/nZZ (Python)
vs (-n/2,n/2] (C)?

The only C book I have in my office doesn't have this and my
browser seems to have some problems ("ASSERT: *** Search: _installLocation:
engine has no file!") so I can't search for the answer.

If the answer is yes then I also wonder if you can possibly leave % the
Python way and write a new operator to do it the C way?

Ryan Hinton

Mar 12, 2009, 11:18:32 AM
to sage-...@googlegroups.com
Robert,

Since I hit the problem I'm motivated to chime in. I also followed the
email trail on the cython list.

Quick summary:

[X] Let the programmer decide, with
[X'] Get 4 as the default

There are obviously some cases where speed is paramount and others where
Python compatibility is paramount, so it would be nice to choose.
Echoing the cython list, I also prefer in-code directives over compiler
options (though the two need not be mutually exclusive -- just a matter
of development time).

But given the ability to choose, you still need a "default": which
semantics are operative without any compiler options or pragmas. Even
though I "grew up" writing C, I came to Cython gradually from Python
code, making small changes to see what speed-ups I could get. So I
assumed Python semantics were operative. Besides my use case and
personal preference, I think it's more pleasant to add a pragma from a
wiki page of optimizations than to get really frustrated debugging
because you made one little change and now your code segfaults (my
experience).

Thanks for working on this!

- Ryan

Carl Witty

Mar 12, 2009, 11:51:20 AM
to sage-...@googlegroups.com
On Thu, Mar 12, 2009 at 4:16 AM, David Joyner <wdjo...@gmail.com> wrote:
>
> I have an ignorant question: what are the canonical reps of
> ZZ/nZZ in C? (-n/2,n/2]?
> Is the issue to decide between the interval [0,n-1] as reps of ZZ/nZZ (Python)
> vs (-n/2,n/2] (C)?

In C, ZZ/nZZ does not have canonical representatives. For example,
the equivalent of [n%4 for n in [-7 .. 7]] would give:

-3, -2, -1, 0, -3, -2, -1, 0, 1, 2, 3, 0, 1, 2, 3

This is annoying to a mathematician. The big reason in favor of this
is to align with division: both C and Python give a==(a/b)*b+a%b, but
Python uses floor division and C uses truncating division.
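
The identity and the list above can be checked with a short Python sketch. `c_div` and `c_mod` are hypothetical helpers that imitate C99's truncating `/` and `%`; Python's own `//` and `%` supply the flooring pair.

```python
def c_div(a, b):
    """Integer quotient rounded toward zero, as C99 specifies for a / b."""
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q

def c_mod(a, b):
    """Remainder paired with c_div, so a == c_div(a, b) * b + c_mod(a, b)."""
    return a - c_div(a, b) * b

values = range(-7, 8)
c_remainders = [c_mod(n, 4) for n in values]    # C-style remainders
py_remainders = [n % 4 for n in values]         # Python remainders

print(c_remainders)
print(py_remainders)

# Both conventions satisfy the same identity with their own division.
for n in values:
    assert n == c_div(n, 4) * 4 + c_mod(n, 4)   # truncating pair
    assert n == (n // 4) * 4 + n % 4            # flooring pair
```

Only the flooring pair yields a single representative per residue class; the truncating pair yields two for every nonzero class.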

So why does C use truncating division? I'm not sure what the choice
point was, but at this point two very important reasons would be:
because that's what the signed integer division instruction gives you
on basically all processors, and because that's what the C standard
has specified for years, so people have written code depending on that
behavior. Given these facts, it's IMHO unlikely that C will ever
change.

Carl

Craig Citro

Mar 12, 2009, 12:31:13 PM
to sage-...@googlegroups.com
> [ ] Get 4, because it should behave just like in Python, even though
> it will require extra logic and be a bit slower
>
> [X] Get -1, because they're C ints, and besides we wouldn't be using
> Cython if we didn't care about performance
>
> [ ] Let the programmer decide (e.g. using
> http://wiki.cython.org/enhancements/compilerdirectives ), recognizing
> that % will mean different things in different contexts.
>

I've also been reading the Cython thread ... I agree that there's a
good argument for Python semantics, but when it comes down to it, I
think of Cython as "being able to move my inner loops down to C" -- if
I type "cdef int x", I'm generally expecting x to act like a C int,
and be *as fast as humanly possible*. Plus, when we move things from
Python down to Cython, we already have changes to make -- for
instance, x**2 has to change, because C doesn't support
exponentiation, so why would it be any different for %?

That said, it would be really nice if there were an easy way to get
Python semantics for % on C ints. I just don't think it should be the
default.
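
For a sense of what the "extra logic" for Python semantics costs, here is a minimal Python sketch. It is an illustration under stated assumptions, not necessarily what Cython generates: `c_mod` is a hypothetical stand-in for C99's `%`, and `python_mod_from_c` shows the one comparison and conditional add needed on top of it.

```python
def c_mod(a, b):
    """Stand-in for C99's a % b: the result takes the sign of a."""
    r = abs(a) % abs(b)
    return -r if a < 0 else r

def python_mod_from_c(a, b):
    """Recover Python's a % b from the C-style remainder."""
    r = c_mod(a, b)
    # A nonzero remainder whose sign differs from the divisor's
    # is shifted by one multiple of the divisor.
    if r != 0 and (r < 0) != (b < 0):
        r += b
    return r

print(python_mod_from_c(-1, 5), python_mod_from_c(1, -5))
```

The adjustment only fires when the operands have opposite signs, so code that works with non-negative values pays for the branch but never takes it.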

-cc

Nick Alexander

Mar 12, 2009, 2:08:51 PM
to sage-...@googlegroups.com
>> [X] Get -1, because they're C ints, and besides we wouldn't be using
>> Cython if we didn't care about performance

I support this because I would like Cython to remain primarily a way
to interface to C code rather than become the "default language of
sage".

Nick

Jason Grout

Mar 12, 2009, 5:17:47 PM
to sage-...@googlegroups.com
Craig Citro wrote:
> and be *as fast as humanly possible*. Plus, when we move things from
> Python down to Cython, we already have changes to make -- for
> instance, x**2 has to change, because C doesn't support
> exponentiation, so why would it be any different for %?


Cython doesn't automatically translate x**2 into x*x or pow(x,2)? If
not, why not?


Thanks,

Jason

Robert Bradshaw

Mar 12, 2009, 6:02:59 PM
to sage-...@googlegroups.com

There's actually a thread to re-implement this. Originally, Pyrex
translated a**b to pow(a,b) for C ints, which was strange given the
result was a floating point number, so we disabled it. Of course,
only a narrow range of values avoids overflow.
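
As an illustration of an integer-only alternative to pow(a,b), here is a minimal Python sketch of exponentiation by squaring. The names `int_pow` and `fits_int32` are hypothetical, and the 32-bit signed range is an assumption about the width of a C int; this is not the implementation discussed in the other thread.

```python
INT32_MIN = -2**31       # assumed range of a 32-bit signed C int
INT32_MAX = 2**31 - 1

def int_pow(base, exponent):
    """Compute base**exponent using only integer multiplications."""
    if exponent < 0:
        raise ValueError("negative exponents have no integer result in general")
    result = 1
    while exponent:
        if exponent & 1:         # lowest bit set: fold the current square in
            result *= base
        base *= base             # square for the next bit
        exponent >>= 1
    return result

def fits_int32(value):
    """Report whether value would fit in the assumed 32-bit signed int."""
    return INT32_MIN <= value <= INT32_MAX

print(fits_int32(int_pow(3, 19)), fits_int32(int_pow(3, 20)))
```

Python integers never overflow, so the range check has to be explicit here; in C the same loop would silently overflow once the result leaves the int range.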

- Robert


Georg S. Weber

Mar 13, 2009, 8:30:46 AM
to sage-devel


On Mar 12, 16:51, Carl Witty <carl.wi...@gmail.com> wrote:
> On Thu, Mar 12, 2009 at 4:16 AM, David Joyner <wdjoy...@gmail.com> wrote:
>
> > I have an ignorant question: what are the canonical reps of
> > ZZ/nZZ in C? (-n/2,n/2]?
> > Is the issue to decide between the interval [0,n-1] as reps of ZZ/nZZ (Python)
> > vs (-n/2,n/2] (C)?
>
> In C, ZZ/nZZ does not have canonical representatives.  For example,
> the equivalent of [n%4 for n in [-7 .. 7]] would give:
>
> -3, -2, -1, 0, -3, -2, -1, 0, 1, 2, 3, 0, 1, 2, 3

Hi Carl,

the situation might be worse:
AFAIK, that choice is *not* defined by the ANSI C standard, but left
compiler-dependent. In other words: you might get different answers on
the same system using different C compilers, or even the same C
compiler with different options. C is fun, isn't it?

So the original question turns into:
Do we want to (potentially) sacrifice speed in favor of portability?
If so, we should make the Cython behaviour predictable, i.e. "fixed"
--- preferably identical to the Python behaviour.
And in the Cython build scripts we should check whether we have e.g.
GCC/architecture xyz and set the respective options, so as not to lose
speed in the cases where the extra logic is unnecessary.

Cheers,
gsw

Carl Witty

Mar 13, 2009, 10:37:30 AM
to sage-...@googlegroups.com
On Fri, Mar 13, 2009 at 5:30 AM, Georg S. Weber
<Georg...@googlemail.com> wrote:
>
>
>
> On Mar 12, 16:51, Carl Witty <carl.wi...@gmail.com> wrote:
>> On Thu, Mar 12, 2009 at 4:16 AM, David Joyner <wdjoy...@gmail.com> wrote:
>>
>> > I have an ignorant question: what are the canonical reps of
>> > ZZ/nZZ in C? (-n/2,n/2]?
>> > Is the issue to decide between the interval [0,n-1] as reps of ZZ/nZZ (Python)
>> > vs (-n/2,n/2] (C)?
>>
>> In C, ZZ/nZZ does not have canonical representatives.  For example,
>> the equivalent of [n%4 for n in [-7 .. 7]] would give:
>>
>> -3, -2, -1, 0, -3, -2, -1, 0, 1, 2, 3, 0, 1, 2, 3
>
> Hi Carl,
>
> the situation might be worse:
> AFAIK, that choice is *not* defined by the ANSI C standard, but left
> compiler-dependent. In other words: you might get different answers on
> the same system using different C compilers, or even the same C
> compiler with different options. C is fun, isn't it?

For negative operands, division/modulo was implementation-defined in
the original ANSI C (C89), but this was changed in C99, which specifies
truncating division.

Carl

Ryan Hinton

Mar 13, 2009, 12:47:22 PM
to sage-...@googlegroups.com
One more quick thought. The first sentence of the Cython tutorial
(chapter two of the manual):

The fundamental nature of Cython can be summed up as follows: Cython is
Python with C data types.

I understand there are thorny corner-cases and even common cases where
it's best to break with Python semantics. This may be one of them. But
I remembered this sentence in the shower this morning as I was trying to
figure out why I so strongly expected Python semantics on operators. I
suppose this is more personal reflection, though, since I'm fine if it
remains C semantics -- as long as it is clearly documented. Perhaps a
Wiki page, FAQ entry, and/or manual entry of differences between Cython
and Python.

Thanks!

- Ryan

Jason Grout

Mar 13, 2009, 1:09:31 PM
to sage-...@googlegroups.com
Ryan Hinton wrote:
> One more quick thought. The first sentence of the Cython tutorial
> (chapter two of the manual):
>
> The fundamental nature of Cython can be summed up as follows: Cython is
> Python with C data types.
>
> I understand there are thorny corner-cases and even common cases where
> it's best to break with Python semantics. This may be one of them. But
> I remembered this sentence in the shower this morning as I was trying to
> figure out why I so strongly expected Python semantics on operators. I
> suppose this is more personal reflection, though, since I'm fine if it
> remains C semantics -- as long as it is clearly documented. Perhaps a
> Wiki page, FAQ entry, and/or manual entry of differences between Cython
> and Python.
>


+1 on Python semantics for the above reasons. I'm a more casual Cython
user. At one point, I thought someone (Robert B?) said that Python code
should run unmodified as Cython code, and if it didn't, that was a bug
(or at least, wasn't implemented yet). That shaped my opinion that
Cython was more Pythonic than C.

*plants tongue in cheek* Besides, Cython has *five* letters in common
with Python, and only *one* letter in common with C. Clearly, that
helps us see where the priorities should be :).

That said, I'm sure I'll adjust to whatever is decided. I can see
remembering to change things to C semantics being frustrating, though.

Thanks,

Jason

Robert Miller

Mar 13, 2009, 1:16:24 PM
to sage-devel
> [X] Get -1, because they're C ints, and besides we wouldn't be using  
> Cython if we didn't care about performance

When I sit down to program something in Cython, I expect the ease of
programming in Python, with the speed benefits of programming in C.
When I'm implementing an algorithm, I'm thinking in terms of the code
doing C-like things, because after all, that's why I'm using Cython in
the first place.

Nick Alexander

Mar 13, 2009, 1:39:15 PM
to sage-...@googlegroups.com
> *plants tongue in cheek* Besides, Cython has *five* letters in common
> with Python, and only *one* letter in common with C. Clearly, that
> helps us see where the priorities should be :).

However, all of the letters in C are the same, whereas only 5/6 of the
letters in Python are the same.

Nick

Robert Bradshaw

Mar 13, 2009, 1:52:27 PM
to sage-...@googlegroups.com
On Mar 13, 2009, at 5:30 AM, Georg S. Weber wrote:

> On Mar 12, 16:51, Carl Witty <carl.wi...@gmail.com> wrote:
>> On Thu, Mar 12, 2009 at 4:16 AM, David Joyner <wdjoy...@gmail.com> wrote:
>>
>>> I have an ignorant question: what are the canonical reps of
>>> ZZ/nZZ in C? (-n/2,n/2]?
>>> Is the issue to decide between the interval [0,n-1] as reps of ZZ/nZZ (Python)
>>> vs (-n/2,n/2] (C)?
>>
>> In C, ZZ/nZZ does not have canonical representatives. For example,
>> the equivalent of [n%4 for n in [-7 .. 7]] would give:
>>
>> -3, -2, -1, 0, -3, -2, -1, 0, 1, 2, 3, 0, 1, 2, 3
>
> Hi Carl,
>
> the situation might be worse:
> AFAIK, that choice is *not* defined by the ANSI C standard, but left
> compiler-dependent. In other words: you might get different answers on
> the same system using different C compilers, or even the same C
> compiler with different options. C is fun, isn't it?

Interestingly enough, the Python sources are written to work no
matter what C's behavior.

> So the original question turns into:
> Do we want to (potentially) sacrifice speed in favor of portability?

Yes, I at least want to have that option. Also, on a pragmatic note,
Sage has already been programmed with this assumption and I would hate
to slow it down due to language changes.

> If so, we should make the Cython behaviour predictable, i.e. "fixed"
> --- preferably identical to the Python behaviour.
> And in the Cython build scripts we should check whether we have e.g.
> GCC/architecture xyz and set the respective options, so as not to lose
> speed in the cases where the extra logic is unnecessary.
>
> Cheers,
> gsw
>
>>
>> This is annoying to a mathematician. The big reason in favor of this
>> is to align with division: both C and Python give a==(a/b)*b+a%b, but
>> Python uses floor division and C uses truncating division.
>>
>> So why does C use truncating division? I'm not sure what the choice
>> point was, but at this point two very important reasons would be:
>> because that's what the signed integer division instruction gives you
>> on basically all processors, and because that's what the C standard
>> has specified for years, so people have written code depending on
>> that
>> behavior. Given these facts, it's IMHO unlikely that C will ever
>> change.

I would guess C does truncating division because it's easier--one
just performs the unsigned division and copies the sign over.
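
The "unsigned division plus sign copy" idea can be sketched in a few lines of Python. `trunc_divmod` is a hypothetical helper written for illustration; it is a guess at the reasoning above, not a description of how any particular processor implements signed division.

```python
def trunc_divmod(a, b):
    """Truncating quotient and remainder built from a division of magnitudes."""
    q_mag, r_mag = divmod(abs(a), abs(b))          # "unsigned" division
    q = -q_mag if (a < 0) != (b < 0) else q_mag    # quotient is negative when signs differ
    r = -r_mag if a < 0 else r_mag                 # remainder copies the dividend's sign
    return q, r

for a, b in [(7, 2), (-7, 2), (7, -2), (-7, -2)]:
    q, r = trunc_divmod(a, b)
    print(a, b, q, r)
```

No adjustment step depends on the value of the remainder, which is what makes this cheaper than flooring division.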

- Robert


Jason Grout

Mar 13, 2009, 1:58:20 PM
to sage-...@googlegroups.com


I just knew someone would say that. I also thought someone would say
that the "C" comes first in the word...

Jason

Georg S. Weber

Mar 13, 2009, 4:56:40 PM
to sage-devel
Hi all,

is there already an operator named %% (double-percent)?
Somewhere in Python or its relatives?

If not, we could have the best of both worlds. Just let % in Cython
act as the corresponding C operator, i.e. -1 % 5 == -1 (for maximal
speed), and let %% in Cython act as the Python % operator, i.e. -1 %%
5 == 4 (for maximal convenience).

In Python, we also have the // (double-slash) operator, which does not
exist in C, so an uninitiated programmer coming across %% for the
first time has a good chance of guessing what that means. And as Craig
pointed out above, "descending from Python to Cython" we already have
to make certain (but very few and easily remembered) changes.

I then would allow %% to be used in Sage, too, just to have this
unambiguous notion available.

Thoughts?

Cheers,
gsw


P.S.:
As William pointed out somewhere, the meaning of 1/2 differs between
different Python versions (right now I don't remember the exact
details).

P.P.S.:
I'd vote against making this dependent on compile-time options. This
leads to non-portable code (think of having to merge two code
fragments relying on different choices of these options --- what a
nightmare).

dagss

Mar 13, 2009, 6:15:44 PM
to sage-devel
On Mar 13, 9:56 pm, "Georg S. Weber" <GeorgSWe...@googlemail.com>
wrote:
> Hi all,
>
> is there already an operator named %% (double-percent)?
> Somewhere in Python or its relatives?
>
> If not, we could have the best of both worlds. Just let % in Cython
> act as the corresponding C operator, i.e. -1 % 5 == -1 (for maximal
> speed), and let %% in Cython act as the Python % operator, i.e. -1 %%
> 5 == 4 (for maximal convenience).

It's interesting that this proposal has arisen three times
independently; on the Cython, NumPy and Sage lists :-)

One problem though is that you'd also need a corresponding operator
for truncating division (i.e. -7 // 6 == -2 in Python and -1 in C)...
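
A small Python sketch shows why the two operators have to come in matched pairs. `trunc_div` is a hypothetical helper imitating C's `/` on ints; mixing it with Python's flooring `%` breaks the identity a == (a/b)*b + a%b.

```python
def trunc_div(a, b):
    """Integer quotient rounded toward zero, imitating C's a / b."""
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q

a, b = -7, 6

floor_q, floor_r = divmod(a, b)        # Python's flooring pair
trunc_q = trunc_div(a, b)              # C-style quotient
trunc_r = a - trunc_q * b              # C-style remainder

floor_ok = floor_q * b + floor_r == a  # matched flooring pair
trunc_ok = trunc_q * b + trunc_r == a  # matched truncating pair
mixed_ok = trunc_q * b + floor_r == a  # C quotient with Python remainder

print(floor_ok, trunc_ok, mixed_ok)
```

So a `%%` with Python semantics alongside a `/` with C semantics would leave code that relies on the identity quietly wrong for negative operands.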

Dag Sverre