Re: [sage-devel] graph theory: refactor implementation of spanning tree algorithms

Message has been deleted

Robert Miller

unread,

Dec 7, 2010, 5:56:52 AM12/7/10

to sage-...@googlegroups.com

I'm not sure it is a good idea to *remove* the methods from the object
of which they are a natural function. I've seen this argument many
times before, and I really like this as an organizing method.
Everything else you say seems like a good idea to me: improving the
documentation, having the actual implementations in their own module,
cleaning up source code, making it all more accessible.

I think you can accomplish all these goals without "unbloating"
graphs. I think Sage users have gotten used to (please feel free to
correct me on this, Sage users!) looking for a method via tab
completion. I worry about how many features we have for graphs already
which I have seen users surprised to find. Cleaning up documentation
might make more people more aware of what Sage can do, but I think
moving these methods (at least in name) out of the graph classes would
be a push in the other direction.

--
Robert L. Miller
http://www.rlmiller.org/

Message has been deleted

Jason Grout

unread,

Dec 7, 2010, 8:15:53 AM12/7/10

to sage-...@googlegroups.com

On 12/7/10 5:25 AM, Minh Nguyen wrote:
> Hi Robert,

>
> On Tue, Dec 7, 2010 at 9:56 PM, Robert Miller<r...@rlmiller.org> wrote:
>> I think you can accomplish all these goals without "unbloating"
>> graphs.
>

> Of course it can be done without unbloating the various graph classes.

>
>
>> I think Sage users have gotten used to (please feel free to
>> correct me on this, Sage users!) looking for a method via tab
>> completion.
>

> Here are two ways to refactor code, while keeping users happy. Both
> approaches below rely on keeping the current interfaces for spanning
> tree methods, while also creating a new module called
> sage/graphs/spanning_tree.py(x).
>
> * Move the relevant core code to the said proposed module. The core
> code I'm referring to would be code that does the actual computation
> of spanning trees, excluding some basic sanity checks, etc. The
> relevant existing methods would be rewritten to call the corresponding
> interfaces in sage/graphs/spanning_tree.py(x).
>
> Take the case of min_spanning_tree() in sage/graphs/graph.py. We move
> the core code that computes minimum spanning trees into
> sage/graphs/spanning_tree.py(x) and improve the minimum spanning tree
> code in that new module. The method min_spanning_tree() would not be
> removed entirely, but would still retain its current interface and
> have some possible improvement as necessary. It's only the definition
> that would be changed; the rewritten definition would actually call
> the relevant minimum spanning tree function in
> sage/graphs/spanning_tree.py(x) to do the heavy duty work of computing
> minimum spanning trees. Furthermore, we add a new method called
> min_spanning_tree() to the class definition of digraphs in
> sage/graphs/digraph.py to mirror its counterpart in
> sage/graphs/graph.py. But we actually call the relevant function in
> sage/graphs/spanning_tree.py(x) as per the rewritten
> min_spanning_tree() for undirected graphs.
>
> * Do the same as in the previous approach, but add deprecation
> warnings to the effect that the relevant methods would be entirely
> removed in a future release.

I worry that it is too confusing to have a min_spanning_tree function in
the documentation of the spanning_tree module that is different than the
min_spanning_tree method of a graph (different interface, more checks,
etc.). How about an option 3:

* directly import the spanning_tree.min_spanning_tree function into the
graph/digraph namespaces.

That way, spanning_tree.min_spanning_tree(G) is exactly the same as
G.min_spanning_tree().

Thanks,

Jason

Message has been deleted

Robert Miller

unread,

Dec 7, 2010, 10:41:19 AM12/7/10

to sage-...@googlegroups.com

> Goodies like algorithms for randomized spanning tree constructions
> should go into another module like sage/graphs/trees.pyx. I feel this
> is really a time to declare a serious moratorium on adding new methods
> to any of the following modules, unless there is a good reason to do
> so:
>
> * sage/graphs/generic_graph.py
> * sage/graphs/graph.py
> * sage/graphs/digraph.py

-1 on moratoriums. Adding restrictions for restriction sake is
foolish. Yes, these files are too large. Yes, splitting off certain
functionality will make them smaller. However, if someone implements a
function which starts off a new area of graph theory, and they don't
know where else to put it, it seems perfectly reasonable to put it in
here. Especially if this is someone we're trying to rope in to more
serious development. Later on when there are more functions, and it
makes sense to split these off into a new file, that's a good idea and
a good time to do it.

>
> --
> Regards
> Minh Van Nguyen
>
> --
> To post to this group, send an email to sage-...@googlegroups.com
> To unsubscribe from this group, send an email to sage-devel+...@googlegroups.com
> For more options, visit this group at http://groups.google.com/group/sage-devel
> URL: http://www.sagemath.org

Nathann Cohen

unread,

Dec 7, 2010, 12:22:21 PM12/7/10

to sage-...@googlegroups.com

I read this thread once and I will have to think about this subject
again before coming up with anything interesting..... One thing though
:

Would there be any way to ask Sphinx to produce one page per method
instead of having them all on one page ? Having such a long page
containing all our methods means that it is impossible, from Google,
to find the description of a method containing at once a list of
keywords... In its actual state, Google will often give this page as
an answer to any request containing "reference sage graph" and
anything else, because most of the terms are probably scattered
through it. If there was a way to have instead a "list of methods
contained in graph.py", then a link toward a page describing each of
them, it would finally become possible to isolate functions of
interest through Google.

"I'll be back"

Nathann

Jason Grout

unread,

Dec 7, 2010, 6:26:05 PM12/7/10

to sage-...@googlegroups.com

On 12/7/10 9:34 AM, Minh Nguyen wrote:
> Hi Jason,

>
> On Wed, Dec 8, 2010 at 12:15 AM, Jason Grout
> <jason...@creativetrax.com> wrote:
>> I worry that it is too confusing to have a min_spanning_tree function in the
>> documentation of the spanning_tree module that is different than the
>> min_spanning_tree method of a graph (different interface, more checks,
>> etc.).
>

> That is nothing compared to the case where we have implemented minimum
> spanning tree for digraphs in sage/graphs/digraph.py. Now when someone
> wants to maintain the minimum spanning tree code, the maintainer would
> be looking into at least two different modules, namely
> sage/graphs/graph.py and sage/graphs/digraph.py. If an issue affects
> both implementations of minimum spanning trees, i.e. those in the
> directed and undirected cases, what is the chance of someone fixing
> both implementations in the same patch? The idea of centralization is
> that there is one place where a maintainer would go to maintain the
> relevant code. Scattering code for essentially similar functionality
> across more than one file is a recipe for more work on future
> maintainers. NetworkX was designed as a library and many of its
> implementation of graph theoretic algorithms are not written as
> methods of graph classes, but rather as separate functions and
> organized in separate modules. To me this makes the job of maintenance
> easier. But let's move beyond that argument as I get the impression
> that it has failed so far to impress anyone, just as the argument of
> method bloat has failed.

>
>
>> How about an option 3:
>>
>> * directly import the spanning_tree.min_spanning_tree function into the
>> graph/digraph namespaces.
>>
>> That way, spanning_tree.min_spanning_tree(G) is exactly the same as
>> G.min_spanning_tree().
>

> Here's an option that should preserve the current interface of
> min_spanning_tree() while also allowing one to discover that method
> via tab completion. Move min_spanning_tree() higher up the class
> hierarchy and into sage/graphs/generic_graph.py. Once moved there,
> leave the interface alone and proceed to clean-up that method as
> follows:
>
> * Handle the cases of digraphs, multigraphs, and multidigraphs.
>
> * Handle weighted and unweighted cases of the above graph types.

I think maybe I was too brief in my suggestion to be clear. I would
favor refactoring the code out to a spanning_tree.py(x) file. My point
was that your propositions seemed to indicate that the graph class
methods that would call spanning_tree.min_spanning_tree would also do
some additional error checking, maybe have a different interface, etc.
My suggestion was to not have any additional code; put all of that error
checking, etc., into the spanning_tree.min_spanning_tree function. In
other words, the graph class would simply import the functions into the
class namespace:

class Graph:
from spanning_tree import min_spanning_tree

That way we get a standalone function in
spanning_tree.min_spanning_tree, and we also get a convenient method for
all graphs, and they are exactly the same function, have the same
interface, etc. There is no duplication of code or of doctests.

Of course, this import should probably be moved up to the generic graph
class, like you suggest, if the function can handle digraphs too.

Thanks,

Jason

Rob Beezer

unread,

Dec 7, 2010, 6:49:54 PM12/7/10

to sage-devel

I think it is important that a graph has hundreds of methods, since
Sage can do hundreds of things to a graph. Tab-completion is great,
and more robust wild-cards on tab-completion would be even better
(isn't this one of Jason's favorite suggestions?).

That said, not only is graph.py so big that doctesting is annoying,
but in my text editor there is a noticeable lag for the syntax
highlighting to catch-up when I temporarily unbalance a string or
parenthesis. Extremely annoying. (Yes, I should switch to emacs, or
vi, or something.) So a topical (algorithmic) decomposition of the
source would be a great help.

-1 to moratoriums as well. Fewer rules, and more thoughtful
decisions, guidance, discussion, and help. I can see some new very
basic function for graphs that Sage does not have (not sure what that
would be) that naturally belongs at a high level, so maybe graph.py is
exactly where it should be.

Thanks, Minh, et al, for your work to untangle this.

Rob

On Dec 7, 3:26 pm, Jason Grout <jason-s...@creativetrax.com> wrote:
> On 12/7/10 9:34 AM, Minh Nguyen wrote:
>
>
>
> > Hi Jason,
>
> > On Wed, Dec 8, 2010 at 12:15 AM, Jason Grout

> > <jason-s...@creativetrax.com> wrote:

Message has been deleted

Jason Grout

unread,

Dec 7, 2010, 9:14:53 PM12/7/10

to sage-...@googlegroups.com

On 12/7/10 8:06 PM, Minh Nguyen wrote:
> Hi Jason,
>

> The above seems very reasonable to me. Documentation for spanning
> trees would be advertised on the top-level page for graph theory,
> while we refactor spanning tree code into the proposed module
> spanning_tree.py(x) and import min_spanning_tree() somewhere in
> generic_graph.py. This means that one could still discover
> min_spanning_tree() via tab completion (using a graph object) and also
> have the code for min_spanning_tree() centralized somewhere.
>

Yep. Basically, the method becomes a convenient alias to the function
in the spanning_tree module.

With this model, I can see one solution to the "method bloat". Common
algorithms/functions have aliases as methods. Extensions or uncommon
functions don't have aliases as graph class methods, but live in the
spanning_tree module. I'm not sure how I feel about it, but it is an
idea...

Jason

Message has been deleted

Dima Pasechnik

unread,

Dec 8, 2010, 11:51:43 PM12/8/10

to sage-devel

On Dec 7, 6:56 pm, Robert Miller <r...@rlmiller.org> wrote:
> I'm not sure it is a good idea to *remove* the methods from the object
> of which they are a natural function. I've seen this argument many
> times before, and I really like this as an organizing method.
> Everything else you say seems like a good idea to me: improving the
> documentation, having the actual implementations in their own module,
> cleaning up source code, making it all more accessible.
>
> I think you can accomplish all these goals without "unbloating"
> graphs. I think Sage users have gotten used to (please feel free to
> correct me on this, Sage users!) looking for a method via tab
> completion.

10-20 methods, OK. But 200 methods via tab completion?
The latter is largely usless.
It screams for a restructuring...

Message has been deleted

Nathann Cohen

unread,

Dec 11, 2010, 4:56:45 PM12/11/10

to sage-...@googlegroups.com

Hello everybody !!!

My question on sphinx-dev was finally answered : it sounds possible to
have one page per method ! :-)

http://groups.google.com/group/sphinx-dev/browse_thread/thread/41a2fc455ff98d6e?hl=en

Nathann

Johan S. R. Nielsen

unread,

Dec 13, 2010, 3:13:01 AM12/13/10

to sage-devel

On Dec 11, 10:56 pm, Nathann Cohen <nathann.co...@gmail.com> wrote:
> Hello everybody !!!
>
> My question on sphinx-dev was finally answered : it sounds possible to
> have one page per method ! :-)
>

> http://groups.google.com/group/sphinx-dev/browse_thread/thread/41a2fc...
>
> Nathann

Nice! However, what exactly should we do? I guess we're not interested
in doing this for _all_ classes and modules, but only certain large
ones? Should we add a keyword in the module/class docstring that
Sphinx could pick up, indicating that the docs for this file should be
split into many pages in some predefined way? Or something more fine-
grained? In general, we should be careful not to add too much
documentation-markup into the individual source code files, I think.

Johan

Nathann Cohen

unread,

Dec 13, 2010, 3:39:01 AM12/13/10

to sage-...@googlegroups.com

> Nice! However, what exactly should we do? I guess we're not interested
> in doing this for _all_ classes and modules, but only certain large
> ones? Should we add a keyword in the module/class docstring that
> Sphinx could pick up, indicating that the docs for this file should be
> split into many pages in some predefined way? Or something more fine-
> grained? In general, we should be careful not to add too much
> documentation-markup into the individual source code files, I think.

While I was struggling to make Sphinx run with the correct options,
Minh the Great was already building an example and creating a new
thread :-D