[sympy] r3290 committed - Edited wiki page through web user interface.

3 views

Skip to first unread message

codesite...@google.com

unread,

Oct 9, 2009, 10:31:36 PM10/9/09

to sympy-...@googlegroups.com

Revision: 3290
Author: asmeurer
Date: Fri Oct 9 19:30:39 2009
Log: Edited wiki page through web user interface.
http://code.google.com/p/sympy/source/detail?r=3290

Modified:
/wiki/ODEsModuleReport.wiki

=======================================
--- /wiki/ODEsModuleReport.wiki Thu Oct 8 18:06:19 2009
+++ /wiki/ODEsModuleReport.wiki Fri Oct 9 19:30:39 2009
@@ -1,4 +1,91 @@
#summary Review of the 2009 Summer of Code project
#labels GSoC

-I have written up a review on my blog
[http://asmeurersympy.wordpress.com/2009/09/07/google-summer-of-code-2009-wrap-up/
here], but I still need to transfer it to here.
+*This page is still a work in progress. See my blog post for the full
report.*
+
+This review comes from my
[http://asmeurersympy.wordpress.com/2009/09/07/google-summer-of-code-2009-wrap-up/
blog]. See my blog for more details on my 2009 Google Summer of Code
experiences. Also, the blog post has some advice for anyone who may want
to apply for GSoC in the future.
+
+=About Me=
+My name is Aaron Meurer, and I am a second year math and computer science
major at New Mexico Tech in Socorro, New Mexico, USA.
+
+=Introduction=
+My project was to implement Ordinary Differential Equations solvers in
sympy. The final module that I completed is able to do the following:
+
+ * Solve ODEs using the following methods:
+ * 1st order separable differential equations
+ * 1st order differential equations whose coefficients or dx and dy are
functions homogeneous of the same order.
+ * 1st order exact differential equations.
+ * 1st order linear differential equations
+ * 1st order Bernoulli differential equations.
+ * 2nd order Liouville differential equations.
+ * nth order linear homogeneous differential equation with constant
coefficients.
+ * nth order linear inhomogeneous differential equation with constant
coefficients using the method of undetermined coefficients.
+ * nth order linear inhomogeneous differential equation with constant
coefficients using the method of variation of parameters.
+ * Classify an ODE that fits one of the above methods into that method
using the classify_ode() function.
+ * Check a solution to an ODE using the checkodesol() function.
+ * Separate the variables multiplicatively in an expression using the
separatevars() function.
+ * Determine the homogeneous order of an expression using the
homogeneous_order() function.
+
+In addition, I also did work to do the following.
+ * Remove automatic combination of exponents (exp(x)*exp(y) => exp(x +
y)).
+ * Expand hints for the expand() function.
+ * Refactor powsimp() to be able to simplify only the exponent or only
the base.
+
+=GSoC Review=
+== Beginning ==
+Let me start from the beginning. Around late February to early March of
this year, I discovered the existence of Google Summer of Code. I knew
that I wanted to do some kind of work this summer, preferably an
internship, so it piqued my interest. At that time, the mentoring
organizations were still applying for GSoC 2009, so I could only look at
the ones from 2008. Most of them were either Linux things or Web things,
neither of which I had any experience in or am I much interested in. I
took a free course in Python at my University the previous semester, and it
was the programming language that I knew best at the time. I had learned
some Java in my first semester CS class (did I mention that this was my
first year at college?), and I hated it, and I was still learning C for my
second semester CS class. So I looked at what the Python Foundation had to
offer. I am a double major in math and computer science, so I looked under
the math/science heading. That's when I saw SymPy.
+
+I should not that I have been ahead in Math. It was my second semester,
and I was taking Discrete Mathematics, Ordinary Differential Equations,
Basic Concepts of Math, and Vector Analysis. So I looked for project ideas
on the SymPy page that related to what I knew. The only one that I saw,
other than core improvements, was to improve the ODE solving capabilities.
I got into contact with the community and looked at the source, finding
that it was only capable of solving 1st order linear equations and some
special cases of 2nd order linear homogeneous equations with constant
coefficients. I already at that point knew several methods from my ODE
course, and I knew much of what I would learn.
+
+==Application Period==
+The most difficult part of the Google Summer of Code, in my opinion, is
the application period. For starters, you have to do it while you are
still in classes, so you pretty much have to do it in your free time.
Also, if you have never applied for Google Summer of Code before, you do
not really know what a good application should look like. I have long had
my application available on the <a
href="http://wiki.sympy.org/wiki/User:Asmeurer/GSoC2009_Application">SymPy
Wiki</a>, and I will reference it here a few times. First off, it was
recommended to me by some of the SymPy developers that I put as many
potential things that I could do in the summer in my application as I
though I could do. I was still only about half way through my ODEs course
when I wrote the application, but I had the syllabus so I knew the methods
I would be learning at least by name. So that is exactly what I did: I
packed my application with every possible thing that I knew we would be
learning about in ODEs.
+
+After I felt that I had a strong application, and Ondrej had proofread it
for me, I submitted it. There were actually two identical applications,
one for the Python Software Foundation, and one for Portland State
University. This is because SymPy was not accepted as a mentoring
organization directly, so they had to use those two foundations as
proxies. A requirement of acceptance is to submit a patch that passes
review. I decided to add a Bernoulli solver, because Bernoulli can be
solved in the general case much like the 1st order linear ODE, which was
already implemented.
+
+After I applied, there was an acceptance period. I used that period to
become aquatinted with the SymPy community and code base. A good way to do
this is to try to fix <a
href="http://code.google.com/p/sympy/issues/list?q=label:EasyToFix">EasyToFix
issues</a>. I found <a
href="http://code.google.com/p/sympy/issues/detail?id=694">issue 694</a>,
which is to implement a bunch of tests from a paper by Michael Wester for
testing computer algebra systems. The tests cover every possible thing
that a full featured CAS could do, so it was a great way to learn SymPy.
The issue is still unfinished, so working on it is still a good way to
learn how to use SymPy.
+
+Also, it was important to learn git, SymPy's version control system. The
learning curve it pretty steep if you have never used version control
system before, but once you can use it, it becomes a great tool at your
disposal.
+
+==Acceptance==
+After being accepted, I toned down my work with SymPy to work on finishing
up my classes. My classes finished a few weeks before the official start,
so I used that period to get a jump start on my project.
+
+==The GSoC Period==
+For the start of the period, I followed my timeline. I implemented 1st
order ODEs with homogeneous coefficients and 1st order exact ODEs. These
were both pretty simple to do, as I expected.
+
+The next thing I wanted to do was separable. My goal at this point was to
get every relevant exercise from my textbook to work with my solvers. One
of the exercises from my <a
href="http://books.google.com.np/books?id=29utVed7QMIC&lpg=PA24&ots=uxLSUKt_3P&hl=en&pg=PA56#v=onepage&q=&f=false">book</a>
(Pg. 56, No. 21) was $latex dy=e^{x + y}dx$. I soon discovered that it was
impossible for SymPy to separate $latex e^{x + y} \rightarrow e^{x}e^{y}$,
because the second would be automatically combined in the core. I also
discovered that <code>expand()</code>, which should have been the function
to split that, expanded using all possible methods indiscriminately. Part
of my <code>separatevars()</code> function that I was writing to separate
variables in expressions would be to split things like $latex x + yx
\rightarrow x(1 + y)$ and $latex 2 x y + x^{2} + y^{2} \rightarrow (x +
y)^{2}$, but <code>expand()</code>
+as it was currently written would expand those.
+
+So I spent a few weeks hacking on the core to make it not auto-combine
exponents. I came up with a rule that exponents would only auto-combine if
they had the same term minus the coefficient, the same rule that
<code>Add</code> uses to determine what terms should auto combined by
addition. So it would combine $latex e^{x}e^{x} \rightarrow e^{2x}$, but
$latex e^{x}e^{y}$ would be left alone. It turns out that some of our
algorithms, namely the Gruntz limit algorithm, relies on auto-combining.
We already had a function that could combine exponents,
<code>powsimp()</code>, but it also combined bases, as in $latex x^zy^z
\rightarrow (xy)^z$, so I had to split the behavior so that it could act
only as auto-combining once did (by the way, use <code>powsimp(expr,
combine='exp', deep=True)</code> to do this). Then, after some help from
Ondrej on pinpointing the exact location of the bugs, I just applied the
function there. The last thing I did here was to split the behavior of
expand, so that you could do <code>expand(x*(y + 1), mul=False)</code> and
it would leave it alone, but <code>expand(exp(x + y), mul=False)</code>
would return <code>exp(x)*exp(y)</code>. This split behavior turned out to
be useful in more than one place in my code.
+
+This was the first non bug fix patch of mine that was pushed in to SymPy,
and at the time of this writing, it is the last major one in the latest
stable version. It took some major rebasing to get my convoluted commit
history ready for submission, and it was during this phase that I git
finally clicked for me, especial the <code>git rebase</code> command. This
work took a few weeks from my ODEs time, and it became clear that I would
not be doing every possible thing from my application. The reason that I
included so much in my application was that my project was non-atomic. I
could implement a little or a lot and still have a working useful module.
+
+If you look at my timeline on my application, you can see that the first
half is symbolic methods, and the second half is other methods, things like
series. It turns out that we didn't really learn much about systems of
ODEs in my course and we learned very little about numerical methods (and
it would take much more to know how to implement them). We did learn
series methods, but they were so annoying to do that I came to hate them
with a passion. So I decided to just focus on symbolic methods, which were
my favorite anyway. My goal was to implement as many as I could.
+
+After I finished up separable equations, I came up with an idea that I did
not have during the application period. <code>dsolve()</code> was becoming
cluttered fast with all of my solution methods. The way that it worked was
that it took an ODE and it tried to match methods one by one until it found
one that worked, which it then used. This had some drawbacks. First, as
I mentioned, the code was very cluttered. Second, the ODEs methods would
have to be applied in a predetermined order. There are several ODEs that
match more than one method. For example, $latex 2xy + (x^2 +
y^2)\frac{dy}{dx}=0$ has coefficients that are both homogeneous of order 2,
and is also exact, so it can be solved by either method. The two solvers
return differently formatted solutions for each one. A simpler example is
that 1st order ODEs with homogeneous coefficients can be solved in two
different ways. My working solution was to try them both and then apply
some heuristics to return the simplest one. But sometimes, one way would
create an impossible integral that would hand the integration engine. And
it made debugging the two solvers more difficult because I had to override
my heuristic. This also touches on the third point. Sometimes the
solution to an ODE can only be represented in the form of an unevaluatable
integral. SymPy's <code>integrate()</code> function is supposed to return
an unevaluated <code>Integral</code> class if it cannot do it, but all too
often it will just hang forever.
+
+The solution I came up with was to rewrite dsolve using a hints method. I
would create a new function called <code>classify_ode()</code> that would
do all of the ODE classification, removing it from the solving code. By
default, dsolve would still use a predetermined order of matching methods.
But you could override it by passing a "hint" to <code>dsolve</code> for
any matching method, and it would apply that method. There would also be
options to only return unevaluated integrals when applicable.
+
+I ended up doing this and more (see the docstrings for
<code>classify_ode()</code> and <code>dsolve()</code> in the current git
master branch), but before I could I needed to clean up some things. I
needed to rewrite all of <code>dsolve()</code> and related functions.
Before I started the program, there were some special cases in dsolve for
second order linear homogeneous ODEs with constant coefficients and one
very special case ODE for the expanded form of $latex
\frac{d^2}{dx^2}(xe^{-y}) = 0$.
+
+So the first thing I did was implement a solver for the general
homogeneous linear with constant coefficients case. These are rather
simple to do: you just find the roots of the characteristic polynomial
built off of the coefficients, and then put the real parts of the roots in
front of the argument of an exponential and the imaginary parts in front of
the arguments of a sine and cosine (for example, $latex 3 \pm 2i$ would
give $latex C1e^{3x}\sin{2x} + C2e^{3x}\cos{2x}$. The thing was, that if
the imaginary part is 0, then you only have 1 arbitrary constant on the
exponential, but if it is non-zero, you get 2, one for each trig function.
The rest falls out nicely if you plug 0 in for $latex b$ into
$e^{ax}(C1\sin{bx} + C2\cos{box})$ because the sine goes to 0 and the
cosine becomes 1. But you would end up with $latex C1 + C2$ instead of
just $latex C1$ in that case. I had already planned on doing arbitrary
constant simplification as part of my project, so I figured I would put
this on hold and do that first. Then, once that was done, the homogeneous
case would be reduced to 1 case instead of the usual 2 or 3.
+
+My original plan was to make an arbitrary constant type that automatically
simplified itself. So, for example, if you entered <code>C1 + 2 + x</code>
with <code>C1</code> an arbitrary constant, it would reduce to just
<code>C1 + x</code>. I worked with Ondrej, including visiting him in Los
Alamos, and we build up a class that worked. The problem was that, in
order to have auto-simplification, I had to write the simplification
directly into the core. Neither of us liked this, so we worked a little
bit on a basic core that would allow auto-simplification to be written
directly in the classes instead of in the <code>Mul.flatten()</code> and
<code>Add.flatten()</code> methods. It turns out that my constant class
isn't the only thing that would benefit from this. Things like the order
class (O(x)) and the infinity class (oo) are auto-simplified in the core,
and things could be much cleaner if they happened in the classes
themselves. Unfortunately, modifying the core like this is not something
that can happen overnight or even in a few weeks. For one thing, it needed
to wait until we had the new assumptions system, which was another Google
Summer of Code project running parallel to my own. So we decided to shelf
the idea.
+
+I still wanted constant simplification, so I settled with writing a
function that could do it instead. There were some downsides to this.
Making the function as general as the classes might have been would have
been far too much work, so I settled on making it an internal-only
function that only worked on symbols named <code>C1</code>,
<code>C2</code>, etc. Also, unlike writing the simplification straight
into <code>Mul.flatten()</code> which was as simple as removing any terms
that were not dependent on x, writing a function that parsed an expression
and simplified it was considerably harder to write. I managed to churn out
something that worked, and so I was ready to finish up the solver I had
started a few paragraphs ago.
+
+After I finished that, I still needed to maintain the ability to solve
that special case ODE. Apparently, it is an ODE that you would get
somewhere in deriving something about relativity, because it was in the
relativity.py example file. I used Maple's excellent
<code>odeanalyser()</code> function (this is where I go the idea for my
<code>classify_ode()</code>)to find a simple general case ODE that it fit
(Liouville ODE). After I finished this, I was ready to start working on
the hints engine.
+
+It took me about a week to move all classification code into
<code>classify_ode()</code>, move all solvers into individual functions,
separate simplification code into yet other functions, and tie it all
together in <code>dsolve()</code>. In the end, the model worked very
well. The modularization allowed me to do some other things that I had not
considered, such as creating a special "best" hint that used some
heuristics that I originally developed for first order homogeneous which
always has two possible solutions to try to give the best formatted
solution for any ODE that has more than one possible solution method. It
also made debugging individual methods much easier, because I could just
use the built in hint calls in <code>dsolve()</code> instead of commenting
out lines of code in the source.
+
+This was good, because there was one more method that I wanted to
implement. I wanted to be able to solve the inhomogeneous case of a nth
order linear ode with constant coefficients. This can be done in the
general case using the method of variation of parameters. It was quite
simple to set up variation of parameters up in the code. You only have to
set up a system of integrals using the Wronskian of the general solutions.
It would usually be a very poor choice of a method if you were trying to
solve an ODE by hand because taking the Wronskian and computing n integrals
is a lot of work. But for a CAS, the work is already there. I just have
to set up the integrals.
+
+It soon became clear that even though, in theory, the method of variation
of parameters can solve any ODE of this type, in practice, it does not
always work so well in SymPy. This is because SymPy have very poor
simplification, especially trigonometric simplification, so sometimes there
would be a trigonometric Wronskian that would be identically equal to some
constant, but it could only simplify it to some very large rational
function of sines and cosines. When these were passed to the integral
engine, it would cause it to fail, because it could not find the integral
for such a seemingly complex expression.
+
+In addition, taking Wronskians, simplifying them, and then taking n
integrals is a lot of work as I said, and even when SymPy could do it, it
took a long time. There is another method for solving these types of
equations called undetermined coefficients that does not require
integration. It only works on a class of ODEs where the right hand side of
the ODE is a simple combination of sines, cosines, exponentials, and
polynomials in x. It turns out that these kinds of functions are common
anyway, so most ODEs of this type that you would encounter could be solved
with this method. Unlike variation of parameters, undetermined
coefficients requires considerable setup, including checking for different
cases. This would be the method that you would want to use if you had to
solve the ODE by hand because, even with all the setup, it only requires
solving a system of linear equations vs. solving n integrals with variation
of parameters, but for a CAS, it is the setup that matters, so this was a
difficult prospect.
+
+I spent the last couple of weeks writing up the necessary algorithms to
setup the required system of linear equations and handling the different
cases. After I finally worked out all of the bugs, I ran some profiling
against my variation of parameters solver. It turned out that for ODEs
that had trigonometric solutions (which take longer to simplify), my
undetermined coefficients solver was an order of magnitude faster than the
variation of parameters solver (and that is just for the ODEs that the
variation of parameters engine could even solve at all). For ODEs that
only had exponentials, it was still 2-4 times faster.
+
+I finished off the summer by writing extensive documentation for all of my
solvers and functions. Hopefully someone who uses SymPy to solve ODEs can
learn something about ODE solving methods as well as how to use the
function I wrote when they read my documentation.
+
+<strong> Post-GSoC</strong>
+I plan on continuing development with SymPy now that the Google Summer of
Code period is over. SymPy is an amazing project, mixing Python and
Computer Algebra, and I want to help it grow. I may even apply again in a
future year to implement some other thing in SymPy, or maybe apply as a
mentor for SymPy to help someone else improve it.

Reply all

Reply to author

Forward

0 new messages