potential slowdown in list_plot to support complex numbers

13 views
Skip to first unread message

Dan Drake

unread,
Nov 15, 2011, 10:01:40 PM11/15/11
to sage-devel
Hello,

At ticket #12035, I posted a patch that makes list_plot do the obvious
thing when given a list such as

[1+I, 2+I, 4-I]

The only way I could see for detecting "genuine" complex number input
was to iterate over the entire input, running CC() on everything and
looking for nonzero imaginary parts (and also looking for input that is
of a complex number type).

This represents a slowdown, of course. My question is, is it too much of
a slowdown?

My thinking is this: list_plot is typically used interactively, on lists
with maybe 1000 entries. On my machine, I get:

sage: foo = [random() for _ in range(1000)]
sage: %timeit [CC(_) for _ in foo]
125 loops, best of 3: 7.05 ms per loop

I doubt anyone will mind if their plot takes an additional 7
milliseconds to appear. Moreoever, I really like it when the computer
does the obvious thing -- if I have a function that plots points in two
dimensions, and I give it [1+I, 2+I, 4-I], there's only one reasonable
thing I could mean.

What are your thoughts? Suggestions for the patch are welcome.

Thanks.

Dan

--
--- Dan Drake
----- http://mathsci.kaist.ac.kr/~drake
-------

signature.asc

William Stein

unread,
Nov 15, 2011, 11:19:27 PM11/15/11
to sage-...@googlegroups.com

Right now, we have:

sage: list_plot([1+I, 2+I, 4-I])
Traceback (most recent call last):
...
TypeError: unable to simplify to float approximation

I suppose you could rewrite list_plot so that it calls a function
_list_plot that works just like the one now, and if it catches a
TypeError, then -- and only then -- CC'ify all the entries of the
input list. That would probably solve the problem with no slowdown at
all in cases that are currently supported.

Anyway...

>
> Thanks.
>
> Dan
>
> --
> ---  Dan Drake
> -----  http://mathsci.kaist.ac.kr/~drake
> -------
>

--
William Stein
Professor of Mathematics
University of Washington
http://wstein.org

Keshav Kini

unread,
Nov 16, 2011, 12:33:26 AM11/16/11
to sage-...@googlegroups.com
Why separate it out into another function _list_plot? Why not just put a try block into the function? That seems like it would make the source harder to read and tracebacks longer without much benefit. It's also another matter whether or not complex numbers being in the list are an "exceptional" situation. We currently do have if/else blocks handling the different types of input for the list_plot function. What's so special about complex numbers that they should be separated out with a try block and/or a wrapper function?

What about if we just check is_ComplexNumber for things in the dataset? Is that not enough? Should be faster than coercing them, I imagine. Do we need list_plot to be smart enough to figure out if things in the data set are any object that could possibly be a complex number in disguise...?

In any case, a 7 microsecond slowdown per datapoint is not a huge deal for a point plotter, I think.

-Keshav

----
Join us in #sagemath on irc.freenode.net !

William Stein

unread,
Nov 16, 2011, 12:49:48 AM11/16/11
to sage-...@googlegroups.com
On Tue, Nov 15, 2011 at 9:33 PM, Keshav Kini <kesha...@gmail.com> wrote:
> Why separate it out into another function _list_plot? Why not just put a try
> block into the function? That seems like it would make the source harder to
> read and tracebacks longer without much benefit. It's also another matter
> whether or not complex numbers being in the list are an "exceptional"
> situation.

As far as I know, nobody has requested this feature in 5 years as far
as I know, and at least nobody has implemented it until just now; it
thus seems to be less likely to be what a random person will try to
do. I therefore consider it exceptional by definition.

> We currently do have if/else blocks handling the different types
> of input for the list_plot function. What's so special about complex numbers
> that they should be separated out with a try block and/or a wrapper
> function?

Not doing that way is slow, as Dan points out. Dan's question was
basically: "is it possible to make this faster". I described a way
that would allow for that exceptional case without any speed penalty.
That's all.

> What about if we just check is_ComplexNumber for things in the dataset? Is
> that not enough?

That's not enough -- you have to check both "is_ComplexNumber" and also for CDF:

sage: is_ComplexNumber(CDF(1,1))
False

I often draw plots using list_plot because it allows me to control
exactly where the points are put and how many samples I take (as
opposed to using the plot command, which refines). If I'm going to
make 10 plots on one graph, each evaluated at several thousand points,
the overhead might matter.

sage: v = [(1,2)]*10^5
sage: time w = [sage.rings.complex_double.is_ComplexDoubleElement(x)
or sage.rings.complex_number.is_ComplexNumber(x) for x in v]
Time: CPU 0.06 s, Wall: 0.07 s

OK, that's no so bad.

> Should be faster than coercing them, I imagine.

Coercing is really, really slow in comparison:

sage: time w = [CDF(x) for x in v]
Time: CPU 1.49 s, Wall: 1.49 s
sage: 1.49/.07
21.2857142857143

An extra 1.5 seconds of waiting for my plot would suck.

> Do we need
> list_plot to be smart enough to figure out if things in the data set are any
> object that could possibly be a complex number in disguise...?
>
> In any case, a 7 microsecond slowdown per datapoint is not a huge deal for a
> point plotter, I think.

If there 10^5 points that nearly a second. I do consider that a deal.

-- William

>
> -Keshav
>
> ----
> Join us in #sagemath on irc.freenode.net !
>

> --
> To post to this group, send an email to sage-...@googlegroups.com
> To unsubscribe from this group, send an email to
> sage-devel+...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/sage-devel
> URL: http://www.sagemath.org

Dan Drake

unread,
Nov 16, 2011, 2:10:28 AM11/16/11
to sage-...@googlegroups.com
On Tue, 15 Nov 2011 at 08:19PM -0800, William Stein wrote:
> Right now, we have:
>
> sage: list_plot([1+I, 2+I, 4-I])
> Traceback (most recent call last):
> ...
> TypeError: unable to simplify to float approximation
>
> I suppose you could rewrite list_plot so that it calls a function
> _list_plot that works just like the one now, and if it catches a
> TypeError, then -- and only then -- CC'ify all the entries of the
> input list. That would probably solve the problem with no slowdown at
> all in cases that are currently supported.

That's a smart idea. I'll do that. Thanks for the suggestion.

signature.asc

Dan Drake

unread,
Nov 16, 2011, 2:21:02 AM11/16/11
to sage-...@googlegroups.com
On Tue, 15 Nov 2011 at 09:49PM -0800, William Stein wrote:
> On Tue, Nov 15, 2011 at 9:33 PM, Keshav Kini <kesha...@gmail.com> wrote:
> > Why separate it out into another function _list_plot?

I do like the idea of _list_plot() accepting only one format, and the
user-facing list_plot() doing the work of mangling your data into that
accepted format. Seems like a nice separation of function. (Although
that's very close to the current situation with list_plot, since all the
real work is done by line() or point().)

> I often draw plots using list_plot because it allows me to control
> exactly where the points are put and how many samples I take (as
> opposed to using the plot command, which refines). If I'm going to
> make 10 plots on one graph, each evaluated at several thousand points,
> the overhead might matter.

I didn't think of this yesterday, but consider making animations: now
you might have to create several hundred plots. My 7 millisecond delay
is now on the order of seconds.

signature.asc

Jason Grout

unread,
Nov 16, 2011, 7:47:27 AM11/16/11
to sage-...@googlegroups.com
On 11/16/11 1:21 AM, Dan Drake wrote:
> I do like the idea of _list_plot() accepting only one format, and the
> user-facing list_plot() doing the work of mangling your data into that
> accepted format. Seems like a nice separation of function. (Although
> that's very close to the current situation with list_plot, since all the
> real work is done by line() or point().)

That's why I advocate just using points() or line(), depending on what
you want. At first, I thought list_plot should go away, since it was
just redundant. However, it's still useful for plotting:

list_plot([1,2,3,4])

and automatically generating the x-axes points, as a shortcut to doing:

points(enumerate([1,2,3,4]).

It's also compatible with the mma ListPlot function, at least to some
degree, so I guess that's another reason for it to stay. There is some
inconsistent cruft with it, though, that should be cleaned up;
inconsistencies with how different options are available depending on
whether it is a shortcut for points() or line().

Thanks,

Jason


Dan Drake

unread,
Nov 16, 2011, 11:12:53 PM11/16/11
to sage-...@googlegroups.com
On Tue, 15 Nov 2011 at 08:19PM -0800, William Stein wrote:
> Right now, we have:
>
> sage: list_plot([1+I, 2+I, 4-I])
> Traceback (most recent call last):
> ...
> TypeError: unable to simplify to float approximation
>
> I suppose you could rewrite list_plot so that it calls a function
> _list_plot that works just like the one now, and if it catches a
> TypeError, then -- and only then -- CC'ify all the entries of the
> input list. That would probably solve the problem with no slowdown at
> all in cases that are currently supported.

As a cautionary tale: I wrapped the relevant bits in a "try / except
TypeError", but that didn't work for this input:

list_plot([1, 1+I, 2+I, 4-I])

which raised an IndexError. This is what happened:

* list_plot calls point([(0, 1), (1, 1+I), (2, 2+I), (3, 4-I)]), which
calls point2d() on the same input
* point2d() raises a TypeError because of the 1+I
* point() catches the TypeError and calls point3d()
* point3d() tries to do z[2] on (0,1) and raises an IndexError before
it ever gets to the 1+I.

So, back in list_plot(), we need to catch IndexError as well.

Something about that function call sequence doesn't sit right with me; I
almost wonder if point() should try point3d() *first* and only catch
IndexError to punt to point2d(). It seems like point() should not have
caught that particular TypeError and called point3d(). Comments welcome.

But, in any case, #12035 has a patch ready for review. No slowdown at
all for currently-supported cases, and new support for complex numbers.

signature.asc
Reply all
Reply to author
Forward
0 new messages