Porting fast Add/Mul

5 views
Skip to first unread message

Ondrej Certik

unread,
Dec 11, 2007, 1:48:27 PM12/11/07
to symp...@googlegroups.com
Hi,

in order to port fast Add and Mul to SymPy, is the basic functionality
hidden in the TermCoeffDict class? If so, it should be pretty easy to
port it, right (nice job!)?

The Add in sympycore has some other funcitonality, some of which is in
SymPy, some other things aren't. But I am just concerned with the
speed, and it seems to me the TermCoeffDict is enough.

Another question - how does the ordering of classes work? I didn't
find it in the docs, nor I am completely sure I understand the sources
- if you could give me a few hints, it'd be awesome.

Ondrej

Fredrik Johansson

unread,
Dec 11, 2007, 2:22:31 PM12/11/07
to symp...@googlegroups.com
On Dec 11, 2007 7:48 PM, Ondrej Certik <ond...@certik.cz> wrote:
>
> Hi,
>
> in order to port fast Add and Mul to SymPy, is the basic functionality
> hidden in the TermCoeffDict class? If so, it should be pretty easy to
> port it, right (nice job!)?
>
> The Add in sympycore has some other funcitonality, some of which is in
> SymPy, some other things aren't. But I am just concerned with the
> speed, and it seems to me the TermCoeffDict is enough.

TermCoeffDict, BaseExpDict do most of the job.

> Another question - how does the ordering of classes work? I didn't
> find it in the docs, nor I am completely sure I understand the sources
> - if you could give me a few hints, it'd be awesome.

Ordering of classes is no longer needed: dicts store their keys in an
unordered fashion. Ordering is only required for printing expressions
in a consistent way.

Fredrik

Pearu Peterson

unread,
Dec 11, 2007, 2:28:39 PM12/11/07
to Sympy Core


On Dec 11, 6:48 pm, "Ondrej Certik" <ond...@certik.cz> wrote:

> in order to port fast Add and Mul to SymPy, is the basic functionality
> hidden in the TermCoeffDict class? If so, it should be pretty easy to
> port it, right (nice job!)?

Yes, TermCoeffDict is very similar to MutableAdd but it was
changed because now Add is a subclass of Function holding
the operants data + Add has internal attribute holding
TermCoeffDict instance that holds data in (term, coeff)
form. Now, using either iterAdd or iterTermCoeff methods,
one can Add data in both forms efficiently.

> The Add in sympycore has some other funcitonality, some of which is in
> SymPy, some other things aren't. But I am just concerned with the
> speed, and it seems to me the TermCoeffDict is enough.

I'd suggest also porting iterAdd, iterTermCoeff methods
as those are crucial for the performance of varios algorithms:
some algorithms would use just the operants of Add, some
would need operants in the (term, coeff) format.

> Another question - how does the ordering of classes work? I didn't
> find it in the docs, nor I am completely sure I understand the sources
> - if you could give me a few hints, it'd be awesome.

It is not in docs as it is implementation detail.
First, one should realize that comparison is only
needed in producing a nice output and the
order is insignificant in commutative operations,
hence, internally data will not be ordered which can
be expensive. So, data is sorted in, say tostr, methods.

For equality checks of composite objects that represent
commutative operations, the data is transformed
to frozenset instances which will be checked for
equality.

Pearu

Pearu Peterson

unread,
Dec 11, 2007, 2:34:25 PM12/11/07
to Sympy Core


On Dec 11, 7:28 pm, Pearu Peterson <pearu.peter...@gmail.com> wrote:

...
> So, data is sorted in, say tostr, methods.

Btw, sorting a sequence of Basic objects is implemented
in sympy/core/sorting.py

Pearu

Ondrej Certik

unread,
Dec 11, 2007, 3:01:56 PM12/11/07
to symp...@googlegroups.com
On Dec 11, 2007 8:28 PM, Pearu Peterson <pearu.p...@gmail.com> wrote:
>
>
>
> On Dec 11, 6:48 pm, "Ondrej Certik" <ond...@certik.cz> wrote:
>
> > in order to port fast Add and Mul to SymPy, is the basic functionality
> > hidden in the TermCoeffDict class? If so, it should be pretty easy to
> > port it, right (nice job!)?
>
> Yes, TermCoeffDict is very similar to MutableAdd but it was
> changed because now Add is a subclass of Function holding
> the operants data + Add has internal attribute holding
> TermCoeffDict instance that holds data in (term, coeff)
> form. Now, using either iterAdd or iterTermCoeff methods,
> one can Add data in both forms efficiently.
>
> > The Add in sympycore has some other funcitonality, some of which is in
> > SymPy, some other things aren't. But I am just concerned with the
> > speed, and it seems to me the TermCoeffDict is enough.
>
> I'd suggest also porting iterAdd, iterTermCoeff methods
> as those are crucial for the performance of varios algorithms:
> some algorithms would use just the operants of Add, some
> would need operants in the (term, coeff) format.

I see. Like polynomials. The most important slowdown is imho spent in
things like
evaluating Add(x, x) and similar. So this should be all implemented in
TermCoeffDict.

> > Another question - how does the ordering of classes work? I didn't
> > find it in the docs, nor I am completely sure I understand the sources
> > - if you could give me a few hints, it'd be awesome.
>
> It is not in docs as it is implementation detail.
> First, one should realize that comparison is only
> needed in producing a nice output and the
> order is insignificant in commutative operations,
> hence, internally data will not be ordered which can
> be expensive. So, data is sorted in, say tostr, methods.

So printing is probably quite slow, isn't it?

> For equality checks of composite objects that represent
> commutative operations, the data is transformed
> to frozenset instances which will be checked for
> equality.

So let's say I want to do:

e = x+x

so TermCoeffDict.__new__(TermCoeffDict, x, x) get's called, which does

obj = dict.__new__(dict)
obj += x
obj += x

and dict.__iadd__() does the simplification to 2*x, is that correct?
That's pretty clever to subclass dict to provide a specialized very
fast dict for our purposes and then just call it from Add. Similarly
for Mul.

What is the difference between __iadd__ and inplaceadd()?

And canonical is for converting it to Add/Mul/Pow? I am not seeing if I write

e = x + y

where exactly it gets converted to the Add instance. In Add.__new__(),
the TermCoeffDict gets created, then as_Basic gets called, which
basically just returns canonical(), which returns just self (i.e.
TermCoeffDict), so I am not sure how it works exactly.

Ondrej

Fredrik Johansson

unread,
Dec 11, 2007, 3:15:13 PM12/11/07
to symp...@googlegroups.com
On Dec 11, 2007 9:01 PM, Ondrej Certik <ond...@certik.cz> wrote:
> > It is not in docs as it is implementation detail.
> > First, one should realize that comparison is only
> > needed in producing a nice output and the
> > order is insignificant in commutative operations,
> > hence, internally data will not be ordered which can
> > be expensive. So, data is sorted in, say tostr, methods.
>
> So printing is probably quite slow, isn't it?

Not really.

t = x**2 + 3*y*(x+2*(1+x**2)) + 4*x + 5*y*x + y**3
s = str(t)

sympycore time:
0.0019
0.0010

sympyhg time:
0.0093
0.0018

I'd say printing is fast enough that we could afford to slow it down further,
for example, properly sorting terms by polynomial degree.

Fredrik

Ondrej Certik

unread,
Dec 11, 2007, 3:22:52 PM12/11/07
to symp...@googlegroups.com
> Not really.
>
> t = x**2 + 3*y*(x+2*(1+x**2)) + 4*x + 5*y*x + y**3
> s = str(t)
>
> sympycore time:
> 0.0019
> 0.0010
>
> sympyhg time:
> 0.0093
> 0.0018
>
> I'd say printing is fast enough that we could afford to slow it down further,
> for example, properly sorting terms by polynomial degree.

Is the first number for the "t=" line and the second number for the "s=" line?

Ondrej

Fredrik Johansson

unread,
Dec 11, 2007, 3:27:14 PM12/11/07
to symp...@googlegroups.com
On Dec 11, 2007 9:22 PM, Ondrej Certik <ond...@certik.cz> wrote:
> Is the first number for the "t=" line and the second number for the "s=" line?

Yes. Hit me if I made an error.

Fredrik

Pearu Peterson

unread,
Dec 11, 2007, 3:35:59 PM12/11/07
to Sympy Core


On Dec 11, 8:01 pm, "Ondrej Certik" <ond...@certik.cz> wrote:

> I see. Like polynomials. The most important slowdown is imho spent in
> things like
> evaluating Add(x, x) and similar. So this should be all implemented in
> TermCoeffDict.

This is covered in TermCoeffDict.

> > > Another question - how does the ordering of classes work? I didn't
> > > find it in the docs, nor I am completely sure I understand the sources
> > > - if you could give me a few hints, it'd be awesome.
>
> > It is not in docs as it is implementation detail.
> > First, one should realize that comparison is only
> > needed in producing a nice output and the
> > order is insignificant in commutative operations,
> > hence, internally data will not be ordered which can
> > be expensive. So, data is sorted in, say tostr, methods.
>
> So printing is probably quite slow, isn't it?

Why should it be slower?

> > For equality checks of composite objects that represent
> > commutative operations, the data is transformed
> > to frozenset instances which will be checked for
> > equality.
>
> So let's say I want to do:
>
> e = x+x
>
> so TermCoeffDict.__new__(TermCoeffDict, x, x) get's called, which does
>
> obj = dict.__new__(dict)
> obj += x
> obj += x
>
> and dict.__iadd__() does the simplification to 2*x, is that correct?

This will be the result, right.

> That's pretty clever to subclass dict to provide a specialized very
> fast dict for our purposes and then just call it from Add. Similarly
> for Mul.
>
> What is the difference between __iadd__ and inplaceadd()?

inplaceadd takes additional coefficent argument, __iadd__ is
just an optimized version of inplaceadd.

> And canonical is for converting it to Add/Mul/Pow? I am not seeing if I write
>
> e = x + y
>
> where exactly it gets converted to the Add instance. In Add.__new__(),
> the TermCoeffDict gets created, then as_Basic gets called, which
> basically just returns canonical(), which returns just self (i.e.
> TermCoeffDict), so I am not sure how it works exactly.

Look at the TermCoeffDict.as_Basic method. The result
of canonical can either be TermCoeffDict or Basic
instance. If the former then the Add instance will be
created and the TermCoeffDict instance will be set
as _dict_content attribute.

Pearu

Ondrej Certik

unread,
Dec 11, 2007, 4:26:35 PM12/11/07
to symp...@googlegroups.com
> > I see. Like polynomials. The most important slowdown is imho spent in
> > things like
> > evaluating Add(x, x) and similar. So this should be all implemented in
> > TermCoeffDict.
>
> This is covered in TermCoeffDict.

Yes, that's what I mean.

>
> > > > Another question - how does the ordering of classes work? I didn't
> > > > find it in the docs, nor I am completely sure I understand the sources
> > > > - if you could give me a few hints, it'd be awesome.
> >
> > > It is not in docs as it is implementation detail.
> > > First, one should realize that comparison is only
> > > needed in producing a nice output and the
> > > order is insignificant in commutative operations,
> > > hence, internally data will not be ordered which can
> > > be expensive. So, data is sorted in, say tostr, methods.
> >
> > So printing is probably quite slow, isn't it?
>
> Why should it be slower?

It isn't as Fredrik pointed out. But I am rather asking why is it faster? :)

I mean if you just print, or you first sort and then print, it seems
to me the second option should be slower.

> inplaceadd takes additional coefficent argument, __iadd__ is
> just an optimized version of inplaceadd.

I see, so iadd could be made using inplaceadd, but it was done faster? Ok.

> > And canonical is for converting it to Add/Mul/Pow? I am not seeing if I write
> >
> > e = x + y
> >
> > where exactly it gets converted to the Add instance. In Add.__new__(),
> > the TermCoeffDict gets created, then as_Basic gets called, which
> > basically just returns canonical(), which returns just self (i.e.
> > TermCoeffDict), so I am not sure how it works exactly.
>
> Look at the TermCoeffDict.as_Basic method. The result
> of canonical can either be TermCoeffDict or Basic
> instance. If the former then the Add instance will be
> created and the TermCoeffDict instance will be set
> as _dict_content attribute.

I am stupid, I've looked into that, but I misinterpreted the code.
Yep, so the TermCoeffDict seems to be pretty selfcontained and easy to
port. Let's see how it goes.

Thanks for the explanations,
Ondrej

Fredrik Johansson

unread,
Dec 11, 2007, 4:37:55 PM12/11/07
to symp...@googlegroups.com
On Dec 11, 2007 10:26 PM, Ondrej Certik <ond...@certik.cz> wrote:
> I am stupid, I've looked into that, but I misinterpreted the code.
> Yep, so the TermCoeffDict seems to be pretty selfcontained and easy to
> port. Let's see how it goes.

If you find things that are not obvious, and figure them out, please add
comments to the code (and offer them back to sympycore).

Fredrik

Pearu Peterson

unread,
Dec 11, 2007, 4:40:20 PM12/11/07
to Sympy Core


On Dec 11, 9:26 pm, "Ondrej Certik" <ond...@certik.cz> wrote:

> > > So printing is probably quite slow, isn't it?
>
> > Why should it be slower?
>
> It isn't as Fredrik pointed out. But I am rather asking why is it faster? :)
>
> I mean if you just print, or you first sort and then print, it seems
> to me the second option should be slower.

In sympy sorting is carried out when constructing an Add instance.
In sympycore, no sorting is carried out in Add construction, the sort
is executed when needed, eg by the tostr method.

So, the work done from constructing an Add instance to
printing it, is basically the same.. with the difference that
sympycore is more efficient in many aspects.

What comes to sorting, then the sympycore compare algorithm
is also revised and made faster. But that is another story..

> > inplaceadd takes additional coefficent argument, __iadd__ is
> > just an optimized version of inplaceadd.
>
> I see, so iadd could be made using inplaceadd, but it was done faster? Ok.

Right.

Pearu
Reply all
Reply to author
Forward
0 new messages