On Sat, 6 Sep 2014 11:52:33 -0700 (PDT)
Bill Hart <goodwi...@googlemail.com> wrote:
> On Saturday, 6 September 2014 20:34:56 UTC+2, Robert Bradshaw wrote:
> >
> > Note that Cython supports cProfile these days:
> >
> > http://docs.cython.org/src/tutorial/profiling_tutorial.html
> > However, that won't help too much as the real missing pieces are
> > the calls from Cython into the various C libraries.
> >
> > I'm also -1 to an approach that slows down all of Sage to track
> > this unconditionally.
>
>
> I am also not comfortable with an approach that slows down the whole
> of Sage
Let me clarify a few points:
- the current (profiling) approach slows everything down when you turn
it on.
It simply turns on the profiler, which either logs _all_ function
calls or periodically samples the running process and logs the active
function in addition to loads of other data that is irrelevant for
citations.
- the (decorator) approach suggested in #3317 [1] notes which
implementation is being used only at the point it is used.
[1] http://trac.sagemath.org/ticket/3317
It would be simple to add a global switch to turn this logging on or
off. We didn't think this was necessary because the overhead is
incredibly small. See below for details.
Since the profiling approach is so slow, it can only be used on toy
examples which often use a completely different code path. I will copy
from my comment in #16854 [2]:
The profiling approach is broken for several reasons:
- the code used for different problem sizes is often different.
Profiling a small example will not give you the correct
information. If you are really working on the cutting edge of
what is computable, then you don't want to run the whole
computation under the profiler once more.
- you have to guess what is being used from the data obtained from
the profiler.
There is no clean way to associate citation information to
functions this way.
- it does not allow tracking more fine grained information than
function names.
If a Sage function wraps several algorithms by calling an
external package with different arguments, you cannot
differentiate these.
[2] http://trac.sagemath.org/ticket/16854#comment:6
Decorators & Speed:
We spent a lot of effort on speeding up the implementation in
#3317 and measuring the effect of adding citation information via
decorators.
IIRC, the only additional operation performed by the decorated function
is to add a string to a Python set. Compared to the overhead of calling
a function in Python, this is negligible.
There are some benchmarks in this blog post [3]. The title may give the
wrong idea, but the numbers are quite impressive. Note that we are not
suggesting decorating arithmetic operations like addition and
multiplication, only calls to higher-level routines, like Groebner
basis computation or symbolic integration.
[3] http://sage-citation.blogspot.de/2011/08/awful-benchmarks.html
Here are some numbers from the link above:
- calling a pass-function (empty Python function):
100000 loops, best of 3: 110 ns per loop
- calling the above function after decorating:
100000 loops, best of 3: 295 ns per loop
- calling a pass-function, that takes some parameters:
100000 loops, best of 3: 399 ns per loop
- calling the above function after decorating:
100000 loops, best of 3: 796 ns per loop
This added cost of roughly 200 to 400 ns per call would be lost in
measurement noise if the function in question did any real work.
> , just so people like me and my colleagues can get more credit.
We suggest citing not only libraries used by Sage but also papers on the
algorithms used. See the example in the ticket description here:
http://trac.sagemath.org/ticket/3317
> Especially when we are trying to speed Sage up. :-)
I sincerely hope that you are not saying I am trying to "slow Sage
down."
> > The decorator approach could be good for annotating
> > functions (e.g. attaching them to some database the citations
> > module would use) but recording every call could be prohibitively
> > expensive.
Let me emphasize again: we definitely do not want to record every
function call. The goal is to add annotations to functions that
implement / wrap relatively expensive computational routines.
In short, I vote +1 for decorators.
Cheers,
Burcin