get_systems() is totally unreliable without profiling enabled in Cython

Jeroen Demeyer

Jun 6, 2017, 9:18:11 AM
to sage-devel
Do you know about sage.misc.citation.get_systems()?

It's supposed to tell which underlying "system" (library, package, ...)
is used for a particular computation.

One example from a docstring:

sage: from sage.misc.citation import get_systems
sage: get_systems('((x+1)^2).expand()')
['ginac']

This is implemented by using cProfile to look at which modules implement
the functions called when executing the code.
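
To make the mechanism concrete, here is a minimal sketch of the idea in plain Python. It is not the actual code in sage.misc.citation; the helper names `profiled_files` and `match_systems` and the pattern table passed to them are made up for illustration.

```python
import cProfile
import pstats
import re


def profiled_files(func, *args, **kwargs):
    """Run func under cProfile; return the source files of all profiled functions."""
    profiler = cProfile.Profile()
    profiler.runcall(func, *args, **kwargs)
    stats = pstats.Stats(profiler)
    # Keys of stats.stats are (filename, line number, function name) tuples.
    return {filename for filename, _lineno, _name in stats.stats}


def match_systems(files, patterns):
    """Return the sorted names whose regex matches at least one profiled file.

    `patterns` is a hypothetical table mapping a system name to a regex
    that is searched for in the profiled file names.
    """
    return sorted(
        name
        for name, regex in patterns.items()
        if any(re.search(regex, f) for f in files)
    )
```

Anything the profiler never records as a call cannot show up in the result, whatever the pattern table says.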

The problem is that this is totally unreliable when Cython is compiled
without profiling support (which is the default). The above example only
works because `Expression.expand()` is called from Python instead of
Cython. If that call were inside some other Cython code, then
Python's profiler would not detect it:

sage: cython('def callexpand(x): return x.expand()')
sage: from sage.misc.citation import get_systems
sage: get_systems('callexpand(((x+1)^2))')
[]


So what should we do?

(A) Silently ignore this issue (status quo).

(B) Give a warning when get_systems() is called and Cython profiling
is not enabled.

(C) Deprecate get_systems() completely.


I am asking because #22747 (compiling Cython code with binding=True)
will "break" profiling even further as even the top-level call of
Expression.expand() would not be detected as something to be entered in
the profiler.

Erik Bray

Jun 6, 2017, 10:11:17 AM
to sage-devel
I wonder if there isn't at least some partially technical solution to
this, such as providing a way to manually list systems/libraries used
by a specific function, and make that easily introspectable. Of
course, this is far from foolproof--the developer would have to
hand-code it, and it might not remain accurate.

Absent something like that, I would give a warning, along with
instructions to rebuild with Cython profiling enabled or something.
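
A rough sketch of what such a warning could look like, using only the standard `warnings` module. The helper name `warn_if_unprofiled` and its `profiling_enabled` argument are hypothetical; how Sage would actually detect whether Cython profiling is enabled is not covered here.

```python
import warnings


def warn_if_unprofiled(profiling_enabled):
    """Hypothetical helper: warn when the profiler-based result may be incomplete."""
    if not profiling_enabled:
        warnings.warn(
            "get_systems() may miss calls made from Cython code; "
            "rebuild with Cython profiling enabled for reliable results",
            RuntimeWarning,
            stacklevel=2,  # point the warning at the caller, not this helper
        )
```

The result would still be returned; the warning only tells the user that the list may be incomplete.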

Jeroen Demeyer

Jun 6, 2017, 3:31:20 PM
to sage-...@googlegroups.com
On 2017-06-06 16:11, Erik Bray wrote:
> such as providing a way to manually list systems/libraries used
> by a specific function

That's not the problem. The problem is how to determine which functions
are actually called.

Sébastien Labbé

Jun 13, 2017, 5:02:21 AM
to sage-devel
I vote for (B) and I am against option (C).

See also this 2014 discussion related to these issues:
https://groups.google.com/d/topic/sage-devel/xoPszCcMtns/discussion

Erik Bray

Jun 14, 2017, 5:07:58 AM
to sage-devel
I guess I don't really understand what it's for then. To me
get_systems() implies, "tell me what programs/libraries were used to
obtain this result so that I can cite them properly"--not necessarily
individual functions (though I suppose that can be relevant in some
cases too).

Jeroen Demeyer

Jun 14, 2017, 5:58:23 AM
to sage-...@googlegroups.com
On 2017-06-14 11:07, Erik Bray wrote:
> I guess I don't really understand what it's for then.

What's your proposed user interface?

Currently, it is for example

sage: get_systems("(x^2 - 1).expand()")

to get the list of "systems" involved in the execution of the Sage
command (x^2 - 1).expand()

Erik Bray

Jun 14, 2017, 6:36:38 AM
to sage-devel
On Wed, Jun 14, 2017 at 11:58 AM, Jeroen Demeyer <jdem...@cage.ugent.be> wrote:
> On 2017-06-14 11:07, Erik Bray wrote:
>>
>> I guess I don't really understand what it's for then.
>
>
> What's your proposed user interface?

I don't understand your question. How do you figure I'm proposing a
"user interface"?

> Currently, it is for example
>
> sage: get_systems("(x^2 - 1).expand()")
>
> to get the list of "systems" involved in the execution of the Sage command
> (x^2 - 1).expand()

The fact that you put "systems" in quotes I think really gets to what
my question is: How is "systems" defined in this case?

Jeroen Demeyer

Jun 14, 2017, 7:52:36 AM
to sage-...@googlegroups.com
On 2017-06-14 12:36, Erik Bray wrote:
> The fact that you put "systems" in quotes I think really gets to what
> my question is: How is "systems" defined in this case?

Essentially, a "system" is any external math package/library that a
computation uses. So things like Pynac, MPFR, PARI, Maxima, OpenBLAS, ...

I agree that this is not really objective, for various reasons:

1. Why restrict to "math" packages? We could consider Python, Cython,
ECL or CyPari2 to be a "system" too.

2. What about trivial uses? For example, if we ask to list all elements
of FiniteField(16), does that really "use" Givaro? Technically, it does
but only in a very trivial way.

Erik Bray

Jun 15, 2017, 4:28:11 AM
to sage-devel
On Wed, Jun 14, 2017 at 1:52 PM, Jeroen Demeyer <jdem...@cage.ugent.be> wrote:
> On 2017-06-14 12:36, Erik Bray wrote:
>>
>> The fact that you put "systems" in quotes I think really gets to what
>> my question is: How is "systems" defined in this case?
>
>
> Essentially, a "system" is any external math package/library that a
> computation uses. So things like Pynac, MPFR, PARI, Maxima, OpenBLAS, ...

That's what I thought. So you don't necessarily *need* a trace of
every function that's called. That's just how this happens to be
implemented in an automated fashion. I think that's still useful to
have though, which is why I think option (B) makes sense: Keep the
functionality but make it clear that it's not going to work properly
without profiling enabled in Cython.

> I agree that this is not really objective, for various reasons:
>
> 1. Why restrict to "math" packages? We could consider Python, Cython, ECL or
> CyPari2 to be a "system" too.

I agree, I think all of these count in some fashion. It sort of
depends on what the individual user's purpose is. If they just want
an informal list of technologies they used, they might mention Python,
etc. Though another more serious use case is if one wants to give
citations to the systems they used, and that's only relevant if those
systems have a way of being cited (a paper, for example).

> 2. What about trivial uses? For example, if we ask to list all elements of
> FiniteField(16), does that really "use" Givaro? Technically, it does but
> only in a very trivial way.

This is another reason I was thinking about some manual system of
listing what "systems" are involved in a calculation. That is, for an
individual function, what systems are most relevant to obtaining the
result it returns? On some level this is subjective, and something
that only the human implementing that function can really know.
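
As a sketch of what I mean by a manual system, something like a decorator that records the relevant systems on the function itself would do. The names `uses_systems`, `cited_systems` and `systems_of` are invented for this example and do not exist in Sage.

```python
def uses_systems(*names):
    """Hypothetical decorator: record which systems a function relies on."""
    def decorate(func):
        func.cited_systems = tuple(names)
        return func
    return decorate


def systems_of(func):
    """Return the systems declared on func, or an empty list if none were declared."""
    return list(getattr(func, "cited_systems", ()))


@uses_systems("ginac")
def expand_expression(expr):
    # Placeholder body; a real implementation would call into the library.
    return expr
```

With this, `systems_of(expand_expression)` gives `['ginac']`. It only covers functions that someone has annotated, and it does not by itself answer which functions a given computation ends up calling.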