Hi Mario,
I finally got to this. I checked out your branch:
https://github.com/sympy/sympy/pull/609
Well, actually the Mateusz' improved branch, and I updated it further
to run with the latest gmpy2:
https://github.com/certik/sympy/tree/lpoly2-improved
And tested an example on my computer with gmpy:
In [1]: %time s = series(sin(x)*cos(x), x, 0, 1000)
CPU times: user 4.52 s, sys: 14 ms, total: 4.54 s
Wall time: 4.5 s
In [2]: %time s = series(log(1+x)*exp(x), x, 0, 500)
CPU times: user 4.15 s, sys: 14 ms, total: 4.16 s
Wall time: 4.13 s
Then I ran the same benchmark using the latest sympy master, using
your code that got merged at
https://github.com/sympy/sympy/pull/2630,
I put the scripts here:
https://github.com/certik/sympy/compare/formal_power_series, it's just
the a.py and b.py. Your merged code doesn't seem to support sin/cos
benchmark [1] above, so I just ran the benchmark [2]:
https://github.com/certik/sympy/blob/59492520b443ea5f0ef31fc018e9bc700b93b818/b.py
$ python b.py
import
done
initialize series
done
start
stop
1.85430693626
Seems to be a little bit faster. Finally, I then wrote my own implementation:
https://github.com/certik/sympy/blob/59492520b443ea5f0ef31fc018e9bc700b93b818/a.py
$ python a.py
import
done
initialize series
done
start
1.17188405991
stop
1.17299985886
It's using the same algorithm, the speed improvement is just caused by
sorting the dictionary first before multiplying things out in rs_mul:
items1 = list(p1._data.items())
items1.sort(key=lambda e: e[0])
items2 = list(p2._data.items())
items2.sort(key=lambda e: e[0])
You only sort the items2 in sympy, while I noticed a speedup if I sort
items1 as well. But this is minor.
All this is using sparse representation, as a dictionary. I also
implemented dense representation, e.g. you can try the dense
representation on the example [2] by applying the following patch:
diff --git a/a.py b/a.py
index 5fabdd9..f7caf4a 100644
--- a/a.py
+++ b/a.py
@@ -179,20 +179,20 @@ def div(a, b):
for i in range(1, n):
t = t/i
data.append(t)
-exp = FormalPowerSeriesSparse.from_dense(data)
+exp = FormalPowerSeries(data)
# log(1+x)
data = [QQ.dtype(0)]
t = QQ.dtype(1)
for i in range(1, n):
data.append(t/i)
t = -t
-log = FormalPowerSeriesSparse.from_dense(data)
+log = FormalPowerSeries(data)
#x = FormalPowerSeries([0, 1, 0, 0, 0, 0, 0, 0, 0])
#onemx = FormalPowerSeries([1, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
print "done"
print "start"
t1 = clock()
-s = mul_sparse(log, exp, n)
+s = mul(log, exp)
#s = mul(sin, cos)
t2 = clock()
print "stop"
Then you get:
$ python a.py
import
done
initialize series
done
start
1.11554312706
stop
1.11559510231
So it's a little bit faster, but the difference is small.
Conclusion --- a.py now contains simple implementation of the taylor
series that is as fast as Mario's fast code.
Anyway, I think this is the right approach, we should support more
functions (sin, cos, ...) in sympy.polys.ring_series and then create a
class like the FormalPowerSeriesSparse() in a.py that would hold the
series and would know how to print it, and also had methods to
manipulate it (or those can be external functions like in a.py).
Finally, this should then be hooked into our series expansion in
sympy.
Now, this works great for a Taylor expansion. We then need to extend
this to support Laurent series and eventually Puiseux series (that's
what SymPy's series is doing). An example of a Puiseux series is:
In [1]: %time print (1/cos(x/log(x))).series(x, 0, 10)
1 + x**2/(2*log(x)**2) + 5*x**4/(24*log(x)**4) +
61*x**6/(720*log(x)**6) + 277*x**8/(8064*log(x)**8) + O(x**10)
CPU times: user 3 s, sys: 14 ms, total: 3.01 s
Wall time: 3.03 s
As you can see, currently it takes 3s, and it should be immediate in
my opinion. You can read the references that I quoted in the docstring
of FormalPowerSeries() in a.py. Essentially, I think these are all
series of the form a*X^n, where n = 0, 1, 2, ... for Taylor series, n
= n0, n0+1, n0+2, ... for negative n0 for Laurent series, and X is a
symbol. For Puiseux series, the X is then something more complicated,
either a fractional power like x^(1/2), or something like x/log(x) in
the last example. But the contents of X is stored separately, and the
rest is just like polynomials. I am currently not sure if things like
multiplication of two Puiseux series needs to be changed somehow.
Sparse/Dense: I would just go with Sparse, it seems the speed is very
good both for really sparse series, as well as dense series.
Overall, I think this is the way to go. Let me know what you think.
Ondrej