doctesting arbitrary Sage files?


Dan Drake

Dec 9, 2008, 11:53:42 PM
to sage-s...@googlegroups.com
I'm preparing an arXiv submission that will include an appendix of Sage
code along with a separate .sage file with the code. Working with Sage I
have come to assume that all code not doctested is broken, so I'd like
to include doctests. I've written up my docstrings, but when I run "sage
-t foo.sage", the doctests don't work -- they complain about the
functions not being defined.

The doctesting mechanism seems to assume that you're only testing code
that is available via "from sage import *". Is there a way to run
doctests on an arbitrary file of Sage code?
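
For concreteness, here is a minimal sketch of the kind of docstring in
question. The function `falling_factorial` is a hypothetical example, not
code from the article; the `EXAMPLES::` header and `sage:` prompt follow
the usual Sage docstring convention, and the body is plain Python so it
behaves the same with or without the Sage preparser.

```python
def falling_factorial(x, k):
    """
    Return the falling factorial x*(x-1)*...*(x-k+1).

    EXAMPLES::

        sage: falling_factorial(5, 2)
        20
    """
    # Hypothetical example function; multiply the k descending factors.
    result = 1
    for i in range(k):
        result *= x - i
    return result
```

Saved in foo.sage, this is the sort of function that "sage -t foo.sage"
reports as undefined when the test runner does not load the file first.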

Dan

--
--- Dan Drake <dr...@kaist.edu>
----- KAIST Department of Mathematical Sciences
------- http://mathsci.kaist.ac.kr/~drake


Jason Grout

Dec 10, 2008, 2:43:49 AM
to sage-s...@googlegroups.com
Dan Drake wrote:
> I'm preparing an arXiv submission that will include an appendix of Sage
> code along with a separate .sage file with the code. Working with Sage I
> have come to assume that all code not doctested is broken, so I'd like
> to include doctests. I've written up my docstrings, but when I run "sage
> -t foo.sage", the doctests don't work -- they complain about the
> functions not being defined.
>
> The doctesting mechanism seems to assume that you're only testing code
> that is available via "from sage import *". Is there a way to run
> doctests on an arbitrary file of Sage code?


Cool. I just posted a note on arXiv with a .sage program file, and I'm
seeing the same problem. (I thought the doctests passed, but apparently
they don't; I think I got around this once by "including" the file as
part of the Sage standard library and then running the doctests.)

Slightly off-topic suggestions for posting to arxiv, for what it's worth:

1. I also included a .sws file for people that use Sage from the
notebook and would not even consider using Sage from the command line.
If you do something like that (and maybe even if you include a .sage
file), you need to tell arxiv to ignore the file (i.e., don't process it
with their autotex program). To do that, you can follow the
instructions here:

http://de.arxiv.org/help/faq/mistakes#auto_ignore

(this is needed in the .sws case since arxiv refuses to proceed when
there is a bzip2 file in your submission).

2. (this is totally a preference thing) I think it's helpful for people
if you mention in the comments that a Sage program is included in the
preprint, just like you would if there was a figure or something. Plus,
everyone that gets the arxiv email starts seeing references to Sage
programs; that helps our marketing department :).


Thanks, and good luck!

Jason

Dan Drake

Dec 10, 2008, 3:28:56 AM
to sage-s...@googlegroups.com
On Wed, 10 Dec 2008 at 01:43AM -0600, Jason Grout wrote:
> Cool. I just posted a note on Arxiv with a .sage program file, and
> I'm seeing the same problem. (I thought the doctests passed, but
> apparently they don't and I'm seeing the same problem; I think I got
> around this once by "including" the file as part of the sage standard
> library and then running the doctests.).

Yeah, I can easily put the file in the standard library and doctest it,
but the point is that it should be easy for other people to run
doctests; they download the code, run a "sage -t", and when everything
passes, they start using it. This preprint will get posted sometime
soon, but (hopefully!) people will find my code useful many years from
now with a future version of Sage.

Doctesting is like putting your code in the refrigerator instead of
leaving it out -- it helps prevent bitrot. :)

> 2. (this is totally a preference thing) I think it's helpful for
> people if you put that there is a Sage program included in the
> preprint in the comments, just like you would if there was a figure or
> something. Plus, everyone that gets the arxiv email starts seeing
> references to Sage programs; that helps our marketing department :).

Exactly! I'm using the listings package to actually put the code into
the PDF and am including some explanatory propaganda. I understand that
very few people will actually download the tarball and get the .sage
file, so I figured I can at least put the code in front of their eyes.

While I am adding to this thread, I'll mention a trick: in the article,
I want to mention how to get the .sage file -- but to do that, I need to
know the arXiv URL. But how do you find out the eprint number before you
submit? The answer is to make your submission early in the day, get it
accepted -- so you find out the eprint number! -- then go back, update
your TeX file with the URL, and resubmit. If you do this early enough
(before 4 pm ET?) it doesn't show up as a new version.


William Stein

Dec 10, 2008, 7:40:47 AM
to sage-s...@googlegroups.com
On Wed, Dec 10, 2008 at 12:28 AM, Dan Drake <dr...@kaist.edu> wrote:
> On Wed, 10 Dec 2008 at 01:43AM -0600, Jason Grout wrote:
>> Cool. I just posted a note on Arxiv with a .sage program file, and
>> I'm seeing the same problem. (I thought the doctests passed, but
>> apparently they don't and I'm seeing the same problem; I think I got
>> around this once by "including" the file as part of the sage standard
>> library and then running the doctests.).
>
> Yeah, I can easily put the file in the standard library and doctest it,
> but the point is that it should be easy for other people to run
> doctests; they download the code, run a "sage -t", and when everything
> passes, they start using it. This preprint will get posted sometime
> soon, but (hopefully!) people will find my code useful many years from
> now with a future version of Sage.
>
> Doctesting is like putting your code in the refrigerator instead of
> leaving it out -- it helps prevent bitrot. :)

That "sage -t foo.sage" doesn't import the functions somehow is a
sucky new-ish bug, that needs to be fixed ASAP, in my opinion. This
is related to recent major changes in how sage -t works on *.sage
files. See this ticket I just made:

http://trac.sagemath.org/sage_trac/ticket/4750

I don't think this will be hard to fix -- probably just a few lines of
code in the right place.

Note that you could also submit a patch to Sage with the code you're doctesting.
I did that with all the tests from both of the books I published, and
I encourage you and many others to do the same with the code from your
article. The code would go in a file

devel/sage/sage/tests/

like the file devel/sage/sage/tests/book_stein_modform.py

In fact, I could imagine having dozens of files in that directory, and
when doctests break there, we could notify the authors before
releasing the version of Sage that breaks their doctests for feedback
-- then they could update their papers or Sage. Maybe this is how
the technical aspect of jsage should work:
http://www.sagemath.org/library/jsage/index.html


>> 2. (this is totally a preference thing) I think it's helpful for
>> people if you put that there is a Sage program included in the
>> preprint in the comments, just like you would if there was a figure or
>> something. Plus, everyone that gets the arxiv email starts seeing
>> references to Sage programs; that helps our marketing department :).
>
> Exactly! I'm using the listings package to actually put the code into
> the PDF and am including some explanatory propaganda. I understand that
> very few people will actually download the tarball and get the .sage
> file, so I figured I can at least put the code in front of their eyes.
>
> While I am adding to this thread, I'll mention a trick: in the article,
> I want to mention how to get the .sage file -- but to do that, I need to
> know the arXiv URL. But how do you find out the eprint number before you
> submit? The answer is to make your submission early in the day, get it
> accepted -- so you find out the eprint number! -- then go back, update
> your TeX file with the URL, and resubmit. If you do this early enough
> (before 4 pm ET?) it doesn't show up as a new version.
>
> Dan

Here's a question for you -- is there a way to embed a block of text
in an extractable way inside a pdf, etc.? If so, I think we could
easily change the notebook so ".pdf" is one of the formats for
uploading a sage worksheet. Then you could somehow embed the
worksheet itself in the pdf. Then tell readers of the pdf -- "hey,
just upload this pdf you're reading right now to any sage notebook
server, and you're good to go!"

Thoughts?

William

Jason Grout

Dec 10, 2008, 9:25:24 AM
to sage-s...@googlegroups.com
Dan Drake wrote:
> While I am adding to this thread, I'll mention a trick: in the article,
> I want to mention how to get the .sage file -- but to do that, I need to
> know the arXiv URL. But how do you find out the eprint number before you
> submit? The answer is to make your submission early in the day, get it
> accepted -- so you find out the eprint number! -- then go back, update
> your TeX file with the URL, and resubmit. If you do this early enough
> (before 4 pm ET?) it doesn't show up as a new version.

I have a friend that had a result that was independently proven by
another person. They agreed that they ought to publish their results
simultaneously. So one posted his result on arxiv, sent the identifier
to my friend, who then included a reference in his preprint and posted
to arxiv. My friend then sent *his* url back to the other person, who
redid his posting with my friend's reference. The result was that both
papers appeared simultaneously on arxiv, and both contained links to the
other proof.

That was cool.

Jason

Dan Drake

Dec 10, 2008, 9:27:50 AM
to sage-s...@googlegroups.com
On Wed, 10 Dec 2008 at 04:40AM -0800, William Stein wrote:
> Note that you could also submit a patch to Sage with the code you're
> doctesting. I did that with all the tests from both of the books I
> published, and I encourage you and many others to do the same with the
> code from your article. The code would go in a file
>
> devel/sage/sage/tests/
>
> like the file devel/sage/sage/tests/book_stein_modform.py
>
> In fact, I could imagine having dozens of files in that directory, and
> when doctests break there, we could notify the authors before
> releasing the version of Sage that breaks their doctests for feedback
> -- then they could update their papers or Sage.

I like that idea, and I do plan on getting some of this code into the
main Sage library -- but for now, I just want to test some functions.
I'll think about submitting my code to that tests/ directory, though.

> Here's a question for you -- is there a way to embed a block of text
> in an extractable way inside a pdf, etc.? If so, I think we could
> easily change the notebook so ".pdf" is one of the formats for
> uploading a sage worksheet. Then you could somehow embed the
> worksheet itself in the pdf. Then tell readers of the pdf -- "hey,
> just upload this pdf you're reading right now to any sage notebook
> server, and you're good to go!"

This should be possible, though it may be really hard. Or, maybe, not so
hard:

http://tug.ctan.org/cgi-bin/ctanPackageInformation.py?id=attachfile

There's also attachfile2 and embedfile on CTAN. I agree that the coolest
thing would be to just upload the PDF and have the notebook extract the
code, but at any rate, we should be able to nicely include code...


Jason Grout

Dec 10, 2008, 9:41:33 AM
to sage-s...@googlegroups.com
William Stein wrote:
>
> That "sage -t foo.sage" doesn't import the functions somehow is a
> sucky new-ish bug, that needs to be fixed ASAP, in my opinion. This
> is related to recent major changes in how sage -t works on *.sage
> files. See this ticket I just made:
>
> http://trac.sagemath.org/sage_trac/ticket/4750
>

So maybe it *did* work before! I thought it did.

> I don't think this will be hard to fix -- probably just a few lines of
> code in the
> right place.

I was looking at the code last night. I think it would boil down to
being able to load a .sage file programmatically (i.e., with a function
call). That involves converting it to a .py file, then loading it
(execfile), right?

>
> Note that you could also submit a patch to Sage with the code you're doctesting.
> I did that with all the tests from both of the books I published, and
> I encourage you and many others to do the same with the code from your
> article. The code would go in a file
>
> devel/sage/sage/tests/
>
> like the file devel/sage/sage/tests/book_stein_modform.py
>
> In fact, I could imagine having dozens of files in that directory, and
> when doctests break there, we could notify the authors before
> releasing the version of Sage that breaks their doctests for feedback
> -- then they could update their papers or Sage. Maybe this is how
> the technical aspect of jsage should work:
> http://www.sagemath.org/library/jsage/index.html


That would be very nice and show people that we are serious about people
using Sage to do research. How many other software systems will include
third-party code in their system to do testing?


> Here's a question for you -- is there a way to embed a block of text
> in an extractable way inside a pdf, etc.? If so, I think we could
> easily change the notebook so ".pdf" is one of the formats for
> uploading a sage worksheet. Then you could somehow embed the
> worksheet itself in the pdf. Then tell readers of the pdf -- "hey,
> just upload this pdf you're reading right now to any sage notebook
> server, and you're good to go!"
>
> Thoughts?


That would be *really* cool! I'll look at this soon, unless someone
beats me to it.

For reference, this package embeds movie files into a pdf:
http://www.ctan.org/tex-archive/macros/latex/contrib/movie15/

Jason

Jason Grout

Dec 10, 2008, 10:12:17 AM
to sage-s...@googlegroups.com
William Stein wrote:
> Note that you could also submit a patch to Sage with the code you're doctesting.
> I did that with all the tests from both of the books I published, and
> I encourage you and many others to do the same with the code from your
> article. The code would go in a file
>
> devel/sage/sage/tests/
>
> like the file devel/sage/sage/tests/book_stein_modform.py
>
> In fact, I could imagine having dozens of files in that directory, and
> when doctests break there, we could notify the authors before
> releasing the version of Sage that breaks their doctests for feedback
> -- then they could update their papers or Sage. Maybe this is how
> the technical aspect of jsage should work:
> http://www.sagemath.org/library/jsage/index.html
>

Also, such code could be loaded into a running Sage session easily,
something like the contributions directory of maxima. Personally, I
would love if our special-purpose code (probably too specialized to be
included in Sage) were accessible to anyone that had Sage.

Thanks,

Jason

mabshoff

Dec 10, 2008, 10:22:52 AM
to sage-support
IMHO there is very little code I would consider too specialized for
Sage, but that depends on whether you are a fan of the kitchen sink model or
not. I really dislike that concept of a contrib directory like in
Maxima, i.e. in Maxima that directory contains at least some code
duplication and generally not well integrated code, for example
various implementations of vectors that are incompatible and so on.
That code is also not well tested, i.e. there are various failures
depending on the lisp you pick to run the contrib code test suite on.

The ultimate goal should be to get code into Sage since there is
nearly always common code to factor out and getting more users for
some infrastructure bits in Sage has always improved that code. And if
you apply the same demands to the contributed code as to Sage library
code, i.e. 100% doctests and so on, you might as well get the code in
the library itself. Obviously some people will likely disagree with me
on the kitchen sink model :)

> Thanks,
>
> Jason

Cheers,

Michael

mabshoff

Dec 10, 2008, 10:25:29 AM
to sage-support


On Dec 10, 7:22 am, mabshoff <Michael.Absh...@mathematik.uni-dortmund.de> wrote:
> On Dec 10, 7:12 am, Jason Grout <jason-s...@creativetrax.com> wrote:

> > Also, such code could be loaded into a running Sage session easily,
> > something like the contributions directory of maxima.  Personally, I
> > would love if our special-purpose code (probably too specialized to be
> > included in Sage) were accessible to anyone that had Sage.

Oops, I am no longer 100% sure I interpreted your intent here
correctly and I now believe my reply is not to what you proposed, so if
I misunderstood you just disregard my last email :)

<SNIP>

Cheers,

Michael

Jason Grout

Dec 10, 2008, 10:36:02 AM
to sage-s...@googlegroups.com
mabshoff wrote:
>
>
>> Also, such code could be loaded into a running Sage session easily,
>> something like the contributions directory of maxima. Personally, I
>> would love if our special-purpose code (probably too specialized to be
>> included in Sage) were accessible to anyone that had Sage.
>
> IMHO there is very little code I would consider too specialized for
> Sage, but that depends on whether you are a fan of the kitchen sink model or
> not. I really dislike that concept of a contrib directory like in
> Maxima, i.e. in Maxima that directory contains at least some code
> duplication and generally not well integrated code, for example
> various implementations of vectors that are incompatible and so on.
> That code is also not well tested, i.e. there are various failures
> depending on the lisp you pick to run the contrib code test suite on.
>
> The ultimate goal should be to get code into Sage since there is
> nearly always common code to factor out and getting more users for
> some infrastructure bits in Sage has always improved that code. And if
> you apply the same demands to the contributed code as to Sage library
> code, i.e. 100% doctests and so on, you might as well get the code in
> the library itself. Obviously some people will likely disagree with me
> on the kitchen sink model :)


You had my intent right. So you think having a "minimum_rank_bounds"
function on graphs and an associated file or two would be okay to be in
the Sage library? I don't think it would pass the "widely-needed"
criteria of a standard spkg. However, if people think it is interesting
enough to go into the Sage proper, then I have no objection. I'd have
to get the approval of the other developers, of course.

I think probably less than 10 research groups may use this code
currently. Those are people that we are actively exposing to Sage,
though :).

Jason

William Stein

Dec 10, 2008, 10:48:08 AM
to sage-s...@googlegroups.com

I don't see why not, as long as it is up to snuff code-quality wise. Just
don't make it a function imported to the global namespace by default on
startup of Sage.

> I don't think it would pass the "widely-needed"
> criteria of a standard spkg. However, if people think it is interesting
> enough to go into the Sage proper, then I have no objection. I'd have
> to get the approval of the other developers, of course.
>
> I think probably less than 10 research groups may use this code
> currently. Those are people that we are actively exposing to Sage,
> though :).

A Sage build is over a gigabyte, involves well over 5 million lines of
code, and is probably bigger than any other single math software
system in the world. And amazingly we're doing fine size-wise. I
think we can handle a few more hundreds of pages of hand-written
Python code.

william

mabshoff

Dec 10, 2008, 11:44:44 AM
to sage-support


On Dec 10, 7:48 am, "William Stein" <wst...@gmail.com> wrote:
> On Wed, Dec 10, 2008 at 7:36 AM, Jason Grout


<SNIP>

> >> The ultimate goal should be to get code into Sage since there is
> >> nearly always common code to factor out and getting more users for
> >> some infrastructure bits in Sage has always improved that code. And if
> >> you apply the same demands to the contributed code as to Sage library
> >> code, i.e. 100% doctests and so on, you might as well get the code in
> >> the library itself. Obviously some people will likely disagree with me
> >> on the kitchen sink model :)
>
> > You had my intent right.

Ok, I didn't want to flame you :)

> >  So you think having a "minimum_rank_bounds"
> > function on graphs and an associated file or two would be okay to be in
> > the Sage library?
>
> I don't see why not, as long as it is up to snuff code-quality wise.   Just
> don't make it a function imported to the global namespace by default on
> startup of Sage.

I 100% agree, the trade off of not having the code in Sage is minimal.
There have been similar discussions on the Linux kernel mailing list
and in the end the consensus is for the kitchen sink model regarding
drivers. Very often some collection of obscure drivers has some common
infrastructure and as a result the code gets refactored and
consequently code quality improves. People who have to write similar
drivers end up using well debugged existing infrastructure code
instead of writing and maintaining their own implementation.

Transferring this to Sage I think there is a lot of potential for
cross fertilizations and the advantage of having something just there
for someone who needs it even if that group of users is a minuscule
percentage of Sage users it will still be a positive contribution to
Sage. This is clearly one of the key advantages of Sage that we can
take in somewhat specialized code, i.e. how many MMA, Maple or Magma
code libraries are out there that have been written for some old
release and no longer work and/or are a pain to use? I conjecture that
in the long term Sage will just hoover up the good quality third party
implementations out there and by doing so make Sage stronger because
instead of downloading some code and either patching it in or doing
all kinds of funny stuff to get the code to do the right thing (think
non-technical grad student or absent minded professor) in Sage all you
need is version Sage x.y.z or later and it just works.

Another thing is that typically new code in Sage goes through various
stages and very often it takes someone to hit the code really hard to
start shaking out the bugs. I.e. the initial code works because the
author uses it in the way it is intended because the person wrote it.
Then code is merged, it passes doctests, but then all of a sudden some
third party starts using the code and finds all kinds of bugs, be it
correctness or performance. That is because a third party does plenty
of "dumb" things that the author of the code never considered doing
because it was obviously the wrong thing to do, didn't happen in that
use case, etc.

Think about the graph isomorphism code by Boyer. You threw all the
graphs with fewer than X vertices at it and lo and behold you found
bugs, in the code as well as our bindings. Now someone else one day
who wants to use some graphs in their computations will have the
benefit of using a well debugged interface and library [once we fix
all the remaining open tickets :)], maybe without ever realizing that
he/she is using the planarity code because it is called from some other
higher level algorithm. I am sure that you will find plenty of people
who might think that all that graph library code in Sage is not very
important, but some of them will end up using that code.

Another example is the "tetration guys" - it seems like a rather
obscure area of mathematical research, but interacting with them has
already led to some design discussions and hopefully future
improvements for the power series code. That is a prime example of why
we want third party code in Sage. In this particular case it might
never get submitted or will mature outside the main tree for a long
time, but we can incorporate such code because the open source
community friendly model allows us to do so.

> > I don't think it would pass the "widely-needed"
> > criteria of a standard spkg.  However, if people think it is interesting
> > enough to go into the Sage proper, then I have no objection.  I'd have
> > to get the approval of the other developers, of course.
>
> > I think probably less than 10 research groups may use this code
> > currently.  Those are people that we are actively exposing to Sage,
> > though :).

10 research groups is potentially a huge audience and many of those
people might start using Sage in teaching or influence other people to
use Sage :)

> A Sage build is over a gigabyte, involves well over 5 million lines of
> code, and is probably bigger than any other single math software
> system in the world.  And amazingly we're doing fine size-wise.  I
> think we can handle a few more hundreds of pages of hand-written
> Python code.

Yes, the secret goal here is world domination after all ;)

But I think Moore's law will protect us from ever growing faster in
relative terms than the computer's average ability. Just compare the
size of recent Sage releases and you will see that the average growth
rate of the tarball is slowing down. And if we ever reach a colossal
size that makes development cease or significantly slow down we can
always consider creating a smaller core of Sage for those who want it.

> william

Cheers,

Michael

William Stein

Dec 10, 2008, 12:15:10 PM
to sage-s...@googlegroups.com
On Wed, Dec 10, 2008 at 8:44 AM, mabshoff
<Michael...@mathematik.uni-dortmund.de> wrote:
>
>> A Sage build is over a gigabyte, involves well over 5 million lines of
>> code, and is probably bigger than any other single math software
>> system in the world. And amazingly we're doing fine size-wise. I
>> think we can handle a few more hundreds of pages of hand-written
>> Python code.
>
> Yes, the secret goal here is world domination after all ;)
>
> But I think Moore's law will protect us from ever growing faster in
> relative terms than the computer's average ability. Just compare the
> size of recent Sage releases and you will see that the average growth
> rate of the tarball is slowing down. And if we ever reach a colossal
> size that makes development cease or significantly slow down we can
> always consider creating a smaller core of Sage for those who want it.
>

We can also do an "audit" and cut out bits that aren't used or needed anymore.
e.g., quaddouble.

-- William

mabshoff

Dec 10, 2008, 12:26:59 PM
to sage-support
Yep, for me the next library on the chopping block will be symmetrica
unless the relationship with upstream drastically improves (or someone
mean would call that "exist"). clisp is on its way out and is being
replaced by ecl, but other than that there is nothing else in Sage I
would consider myself to be unhappy about.

With the extcode as well as documentation reorg we will drop a couple
MB each, so the 3.3 release will probably be below 200 MB for the
source distribution again.

>
>  -- William

Cheers,

Michael

John Cremona

Dec 10, 2008, 12:32:33 PM
to sage-s...@googlegroups.com
2008/12/10 mabshoff <Michael...@mathematik.uni-dortmund.de>:

> but other than that there is nothing else in Sage I
> would consider myself to be unhappy about.

Wow, that's the best news I have heard all day!

John

Jason Grout

Dec 10, 2008, 11:15:56 PM
to sage-s...@googlegroups.com
mabshoff wrote:
>
>
> On Dec 10, 7:48 am, "William Stein" <wst...@gmail.com> wrote:
>> On Wed, Dec 10, 2008 at 7:36 AM, Jason Grout
>
>
> <SNIP>
>
>>>> The ultimate goal should be to get code into Sage since there is
>>>> nearly always common code to factor out and getting more users for
>>>> some infrastructure bits in Sage has always improved that code. And if
>>>> you apply the same demands to the contributed code as to Sage library
>>>> code, i.e. 100% doctests and so on, you might as well get the code in
>>>> the library itself. Obviously some people will likely disagree with me
>>>> on the kitchen sink model :)
>>> You had my intent right.
>
> Ok, I didn't want to flame you :)
>
>>> So you think having a "minimum_rank_bounds"
>>> function on graphs and an associated file or two would be okay to be in
>>> the Sage library?
>> I don't see why not, as long as it is up to snuff code-quality wise. Just
>> don't make it a function imported to the global namespace by default on
>> startup of Sage.
>
> I 100% agree, the trade off of not having the code in Sage is minimal.


Thanks for the clarification. In Sage tradition, I've made a trac
ticket now: http://trac.sagemath.org/sage_trac/ticket/4754

Thanks,

Jason
