Scary things in Sage's Digraphs

Nathann Cohen

unread,

Jul 25, 2010, 3:27:15 AM7/25/10

to Sage devel, Robert Miller

Hello everybody !!!

I was trying to write the following doctest, and noticed something scary :

We build a directed circulant graph on n vertices by linking the i th
vertex to i+1, i+2, ... , i+k, thus ensuring our graph is k-connected.
Then, by Edmond's theorem, we know this graph has `k` edge-disjoint
spanning arborescences

sage: n = 20
sage: k = 3
sage: g = DiGraph()
sage: g.add_edges( (i,Mod(i+j,n)) for i in range(n) for j in range(1,k+1) )
sage: k == g.edge_connectivity()
False

This should be k, but it is not. Not *that* bad. But it gets worse :

sage: g.strongly_connected_components()
[[0], [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]]

Two different "0" ? Then I thought it was because of the Mod, and
maybe some zeroes where regular ones, while other were zeroes of
Z/20Z.... But then :

sage: g
Digraph on 21 vertices
sage: len(g.vertices())
20
sage: g.vertices()
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]

That's were I got really scared :-D

Robert ? Could you please tell me "oh yeah, I know where it comes
from, that's just a typo" ? :-D

Thankssssssssssss !!!

Nathann

Nathann Cohen

unread,

Jul 25, 2010, 3:29:22 AM7/25/10

to Sage devel, Robert Miller

/*
this disappears when adding a Integer() cast

sage: n = 20
sage: k = 3
sage: g = DiGraph()

sage: g.add_edges( (Integer(i),Integer(Mod(i+j,n))) for i in range(n)

for j in range(1,k+1) )
sage: k == g.edge_connectivity()

True
*/

Nathann

Robert Miller

unread,

Jul 25, 2010, 7:05:34 AM7/25/10

to sage-...@googlegroups.com

Nathann,

Using the following instead fixes the problem:

g.add_edges( (Mod(i,n),Mod(i+j,n)) for i in range(n) for j in range(1,k+1) )

This is more consistent, since we are actually using the same vertex
objects. However, that should just work, right? Why doesn't it?

This is coming from the code around line 1000 of c_graph, which goes
from vertex labels to ints and back. The IntegerMod case was not in
mind when this code was written. The real problem is that when the
Python int 0 gets passed to get_vertex, it does not add the entry to
the translation dictionary. The idea being that it would be more
efficient for the majority of graphs, which only use [0, 1, ..., n-1]
as labels, going through dictionaries a bunch would be a waste.
However, when the IntegerMod 0 gets passed in, it does not find an
equal object in the dict, and since it is not one of {int, long,
Integer}, it assigns a new int in the translation dictionary for it.

So that's what's happening, what do people think about what we should
do about this? Technically the input is a bit fuzzy, but this raises
the question of how many other objects are there which will pass the
test int(0) == IntegerMod(0, 20)... I am very reluctant to support
adding all the labels to the dicts, unless someone can show that there
really isn't any overhead there...

--
Robert L. Miller
http://www.rlmiller.org/

Nathann Cohen

unread,

Jul 25, 2010, 8:33:44 AM7/25/10

to sage-...@googlegroups.com

Hello !!!

> This is coming from the code around line 1000 of c_graph, which goes
> from vertex labels to ints and back. The IntegerMod case was not in
> mind when this code was written. The real problem is that when the
> Python int 0 gets passed to get_vertex, it does not add the entry to
> the translation dictionary. The idea being that it would be more
> efficient for the majority of graphs, which only use [0, 1, ..., n-1]
> as labels, going through dictionaries a bunch would be a waste.
> However, when the IntegerMod 0 gets passed in, it does not find an
> equal object in the dict, and since it is not one of {int, long,
> Integer}, it assigns a new int in the translation dictionary for it.
>
> So that's what's happening, what do people think about what we should
> do about this? Technically the input is a bit fuzzy, but this raises
> the question of how many other objects are there which will pass the
> test int(0) == IntegerMod(0, 20)... I am very reluctant to support
> adding all the labels to the dicts, unless someone can show that there
> really isn't any overhead there...

Hmmmm.. I would not want to see any loss of performance because of
such matters either, I think we quite agree on this point..

I understood from your explanation why Mod(1,n) is considered
different from 0, and it is to me the correct behaviour... But what
about this

g has 21 vertices
len(g.vertices) == 20 ?

Sorry if you answered already ! :-)

Nathann

Robert Miller

unread,

Jul 25, 2010, 8:49:58 AM7/25/10

to sage-...@googlegroups.com

Nathann,

> I understood from your explanation why Mod(1,n) is considered
> different from 0, and it is to me the correct behaviour... But what
> about this
>
> g has 21 vertices
> len(g.vertices) == 20 ?
>
> Sorry if you answered already ! :-)

I think the information was there, but I was not very clear. It is
better if you just look at the code:

{{{
cdef int get_vertex(object u, dict vertex_ints, dict vertex_labels,
CGraph G) except ? -2:
if u in vertex_ints:
return vertex_ints[u]
if (not isinstance(u, (int, long, Integer)) or
u < 0 or u >= G.active_vertices.size or
u in vertex_labels):
return -1
return u
}}}

When int(0) is input, it returns 0. When Mod(0, 20) is input, however,
it returns -1, meaning that the object is not yet in the translation
dictionaries (since the second block only tests for three types of
integer). So when int(0) is input to check_vertex(), nothing gets
added to the dictionaries. However, when Mod(0, 20) is input, it does
add an entry to the dicts, and since 0 is already taken up (as defined
in a bitset of which ones are used), it gets a different int. This is
how we get two zeros in the vertex dicts.

It is not clear to me how to solve this problem. One could add
IntegerMod to the list of types to check, but that's a slippery
slope... One very silly option is to simply make the behavior of
adding Python ints and IntegerMods at the same time undefined, but
there *must* be a better fix...

Nathann Cohen

unread,

Jul 25, 2010, 9:01:05 AM7/25/10

to sage-...@googlegroups.com

Nonononoooo, I understood why there are two copies of what appears to
be a "zero", and I think it's fine like that !
My question was about the number of vertices as remembered by the graph :

in one case, it says 21, but g.vertices() is only long of 20 elements.
Why aren't there two zeroes in g.vertices() ? I would expect to see
one 0, which is the integer, and one zero which is actually the member
of Z/20Z !

Sorry for this misunderstanding :-)

Nathann

> --
> To post to this group, send an email to sage-...@googlegroups.com
> To unsubscribe from this group, send an email to sage-devel+...@googlegroups.com
> For more options, visit this group at http://groups.google.com/group/sage-devel
> URL: http://www.sagemath.org
>

Robert Miller

unread,

Jul 25, 2010, 1:27:43 PM7/25/10

to sage-...@googlegroups.com

On Sun, Jul 25, 2010 at 2:01 PM, Nathann Cohen <nathan...@gmail.com> wrote:
> Nonononoooo, I understood why there are two copies of what appears to
> be a "zero", and I think it's fine like that !

This is definitely *not* fine, since we have

sage: int(0) == Mod(0, 20)
True

As input, the underlying C graph ends up corrupted (see below for
details). Since they are equal, Sage's graphs should treat them as
equal and they do not. Bug.

To belabor the point for clarity:

When you call g.vertices(), it calls iterator_verts in c_graph.pyx:

{{{

cdef int i
cdef object v
if verts is None:
S = set(self.vertex_ints.iterkeys())
for i from 0 <= i < (<CGraph>self._cg).active_vertices.size:
if (i not in self.vertex_labels and
bitset_in((<CGraph>self._cg).active_vertices, i)):
S.add(i)
return iter(S)

}}}

At this point we have:

{{{

active_vertices == 1111111111111111111110000000000000000000

vertex_ints == {0: 20, 1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 6, 7: 7, 8: 8,
9: 9, 10: 10, 11: 11, 12: 12, 13: 13, 14: 14, 15: 15, 16: 16, 17: 17,
18: 18, 19: 19}

vertex_labels == {1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 6, 7: 7, 8: 8, 9:
9, 10: 10, 11: 11, 12: 12, 13: 13, 14: 14, 15: 15, 16: 16, 17: 17, 18:
18, 19: 19, 20: 0}

}}}

Note that there are 21 bits in active_vertices, so C int 0 and C int
20 are both vertices according to that. That first zero in vertex_ints
is an IntegerMod, and the 20 it points to is a C int. Vice versa for
the 20: 0 entry in vertex_labels.

At first, the line {{{ S = set(self.vertex_ints.iterkeys()) }}}
creates a set of size twenty, containing twenty IntegerMod's, the set
{0, ..., 19}.

When i == 0 in the loop {{{ for i from 0 <= i <
(<CGraph>self._cg).active_vertices.size }}}, the inner branch
discovers that 0 should be added to S, but since S is a set, and
Mod(0, 20) is already in there, adding int(0) to it does nothing.

So as I said before, the problem is created by line 1084 (in
sage-4.5/4.5.1) of c_graph.pyx, namely:

{{{ if (not isinstance(u, (int, long, Integer)) or }}}

In fact this should check for any Sage object X for which {{{
Integer(Y) == X }}} is true for any Y.

I hope this better illustrates the problem, and I hope I've convinced
you that int(0) and Mod(0, 20) should *not* be considered distinct by
Sage graphs.

>
> Sorry for this misunderstanding :-)
>
> Nathann
>

A: "Stop apologizing!"
B: "I'm sorry!"

Carl Witty

unread,

Jul 25, 2010, 3:10:06 PM7/25/10

to sage-...@googlegroups.com

On Sun, Jul 25, 2010 at 10:27 AM, Robert Miller <r...@rlmiller.org> wrote:
> On Sun, Jul 25, 2010 at 2:01 PM, Nathann Cohen <nathan...@gmail.com> wrote:
>> Nonononoooo, I understood why there are two copies of what appears to
>> be a "zero", and I think it's fine like that !
>
> This is definitely *not* fine, since we have
>
> sage: int(0) == Mod(0, 20)
> True

You seem to want to make the vertex dictionary respect the equivalence
relation defined by Sage equality. If so, you're going to be in
trouble, since Sage equality actually is not an equivalence relation:

sage: 3 == Mod(7, 4)
True
sage: 3 == Mod(8, 5)
True
sage: Mod(7, 4) == Mod(8, 5)
False

This means that mixing Sage objects of different parents as elements
of a single set or keys of a single dictionary is asking for trouble;
in general, it cannot work (or at least we haven't figured out a way
to make it work). (As long as you stick to a single parent you should
be fine.)

> So as I said before, the problem is created by line 1084 (in
> sage-4.5/4.5.1) of c_graph.pyx, namely:
>
> {{{ if (not isinstance(u, (int, long, Integer)) or }}}
>
> In fact this should check for any Sage object X for which {{{
> Integer(Y) == X }}} is true for any Y.

Sounds like what you want is "u in ZZ", which is equivalent to "u ==
ZZ(u)" except that it returns False if ZZ(u) fails.

Carl

Robert Miller

unread,

Jul 25, 2010, 6:50:26 PM7/25/10

to sage-...@googlegroups.com

On Sun, Jul 25, 2010 at 8:10 PM, Carl Witty <carl....@gmail.com> wrote:
> You seem to want to make the vertex dictionary respect the equivalence
> relation defined by Sage equality. If so, you're going to be in
> trouble, since Sage equality actually is not an equivalence relation:

Is it really too much to ask?

It sounds like the proper thing to do here, then, is to adjust the
circulant graph constructor to use Mod(i) instead of i in the edge
additions, and then profusely document this issue -- perhaps a whole
chapter in the reference manual about graph gotchas... This is just
really unsatisfying. I like the idea of having graphs be free enough
that you can make anything you want a vertex.

A mathematician would not usually say that the integer is equal to the
coset. I can see why we might want

> sage: 3 == Mod(7, 4)
> True

But this highlights yet another place where mathematics and computer
science are less than perfect bedfellows. Just a day or two ago this
sort of thing was discovered in another thread. In short, Python sets
do not implement a total ordering (since < means inclusion), and so
are not sortable...

What to do, what to do?

Definition: Mod(n, m, parent=None)
Docstring:

Return the equivalence class of n modulo m as an element of
ZZ/mZZ.

According to this definition, I would expect

sage: Integer(7) == Mod(7,12)
False

Following the Python set notation, which I think is equally silly but
much harder to change, we would have

sage: Integer(7) < Mod(7,12)
True

..........

Robert Miller

unread,

Jul 27, 2010, 8:24:55 AM7/27/10

to sage-...@googlegroups.com

See trac #9610 for a patch which fixes this issue.

Robert Miller

unread,

Aug 3, 2010, 1:33:55 PM8/3/10

to sage-...@googlegroups.com

On Sun, Jul 25, 2010 at 6:50 PM, Robert Miller <r...@rlmiller.org> wrote:
> On Sun, Jul 25, 2010 at 8:10 PM, Carl Witty <carl....@gmail.com> wrote:
>> You seem to want to make the vertex dictionary respect the equivalence
>> relation defined by Sage equality. If so, you're going to be in
>> trouble, since Sage equality actually is not an equivalence relation:

It gets worse:

sage: a = Set([1,2,3])
sage: b = Set([2,3,4])
sage: a < b
True
sage: b > a
False

sage: b < a
True
sage: a > b
False

Let's take a look at Sage's implementation of comparison:

sage: a.__cmp__??
...
if isinstance(other, Set_object_enumerated):
if self.set() == other.set():
return 0
return -1
else:
return Set_object.__cmp__(self, other)

Aduuhhhh! If it is not true that a == b, then a < b. Not good.

I would like to ask, since obviously nobody cares about doing
comparisons with Sage sets at the moment... We obviously need to
define what it means for a < b and a > b for Sage Sets. Does anyone
mind if we do this so that instead of < meaning subset, and getting a
bad ordering with respect to sorting lists, can we make < give a
lexicographic ordering? I will gladly do all the work, including
implementing an is_subset method.

That way, if you like working with sets themselves and you want < to
mean subset, simply use Python sets. Or if you are Chris Godsil, e.g.,
and like to be able to consistently sort sets because you use them as
vertices for graphs, use Sage Sets.

I think this is the best solution we can get to this problem...

William Stein

unread,

Aug 3, 2010, 2:29:27 PM8/3/10

to sage-...@googlegroups.com

+1 It makes no sense for < to mean "subset" because < should be a
total order.

If you want to check for subsets we should use a method like in python:

sage: a = set([1,2,3])
sage: b = set([2,3,4])
sage: a.issubset(b)
False
sage: b.issubset(a)
False
sage: a.issubset(b.union(a))
True

I don't like "issubset" as the method name, and would prefer
"is_subset", personally.

>
> That way, if you like working with sets themselves and you want < to
> mean subset, simply use Python sets.

Yep.

> Or if you are Chris Godsil, e.g.,
> and like to be able to consistently sort sets because you use them as
> vertices for graphs, use Sage Sets.
>
> I think this is the best solution we can get to this problem...

William

Robert Miller

unread,

Aug 3, 2010, 4:50:59 PM8/3/10

to sage-...@googlegroups.com

On Tue, Aug 3, 2010 at 2:29 PM, William Stein <wst...@gmail.com> wrote:
> +1 It makes no sense for < to mean "subset" because < should be a
> total order.
>
> If you want to check for subsets we should use a method like in python:
>
> sage: a = set([1,2,3])
> sage: b = set([2,3,4])
> sage: a.issubset(b)
> False
> sage: b.issubset(a)
> False
> sage: a.issubset(b.union(a))
> True
>
> I don't like "issubset" as the method name, and would prefer
> "is_subset", personally.