Discussion regarding the GSoC project - Benchmarks and performance

praneeth ratna

Mar 11, 2022, 9:53:10 AM
to sympy
Hi all,

As mentioned in the project description here, https://github.com/sympy/sympy/wiki/GSoC-Ideas#benchmarks-and-performance, the sympy_benchmarks repo is missing some benchmarks that still need to be moved over from the main sympy repo. It is also missing benchmarks for the combinatorics, series, stats, and many other modules, which could be added. (Please correct me here.)
Also, I do not yet have a clear understanding of what the new implementation of the benchmarks should do or how it should be done.

I'm not sure who the potential mentors for this project are, since they are not listed on the Ideas page, so I would request the mentors to guide me on how to move forward with this project.

Thanks,
Praneeth

praneeth ratna

Apr 15, 2022, 7:58:34 AM
to sympy
Hi all,

I have already mailed regarding my interest in the project "Benchmarks and performance" but have not received a reply. Could a potential mentor please guide me on this project?

Thanks,
Praneeth

Oscar Benjamin

Apr 15, 2022, 8:54:31 AM
to sympy
On Fri, 11 Mar 2022 at 14:53, praneeth ratna <praneet...@gmail.com> wrote:
>
> Hi all,
>
> As mentioned in the project description here, https://github.com/sympy/sympy/wiki/GSoC-Ideas#benchmarks-and-performance, the sympy_benchmarks repo is missing some benchmarks that still need to be moved over from the main sympy repo. It is also missing benchmarks for the combinatorics, series, stats, and many other modules, which could be added. (Please correct me here.)
> Also, I do not yet have a clear understanding of what the new implementation of the benchmarks should do or how it should be done.

I think that with some of the project ideas the intention is not
necessarily that anyone has a clear idea of exactly what should be
done but that someone wanting to do a project would do some research
themselves to come up with an idea.

In relation to benchmarks the immediate issues I see are:

1. Some old benchmarking code is in the main repo and should be moved
over to the benchmarks repo.
2. The benchmarks repo is currently broken because some test fails so
no PR can be merged there until that is fixed.
3. The benchmarks are affected by caching which means that many of the
timings reported are completely unrepresentative.
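
To illustrate point 3: any benchmark that wants a cold timing has to
disable or clear the cache itself, otherwise the reported numbers
mostly measure cache lookups. A rough sketch of what that could look
like in an ASV-style benchmark (the class and method names here are
made up for illustration):

    # Alternative 1: disable the cache entirely; this has to happen
    # before sympy is first imported (ASV starts a fresh process):
    # import os; os.environ["SYMPY_USE_CACHE"] = "no"

    from sympy import expand, symbols
    from sympy.core.cache import clear_cache

    class TimeExpand:
        # Hypothetical ASV benchmark; names are illustrative only.
        def setup(self):
            x, y = symbols("x y")
            self.expr = (x + y) ** 50

        def time_expand_binomial(self):
            # Alternative 2: clear the cache so every call starts cold,
            # at the cost of including the clear in the timing.
            clear_cache()
            expand(self.expr)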

Bigger issues:

1. The benchmarks are based on ASV which is awkward to use. It's
unnecessarily difficult even just to run a single benchmark. A better
framework should be found.
2. The benchmarks mostly only measure things that are already quite
fast because otherwise they would take too long to run. To guide
future improvements we really need benchmarks for things that are
currently slow.
3. The benchmarks are too tied up with sympy itself. I would rather
have an independent benchmark suite that collects good examples and
can be used by other projects as well. For this you need benchmarks
that are language agnostic and not simply written in sympy code. (I
also think it would be better if a lot of the test suite was like this
as well.)
4. The things that are tested in the benchmarks are a bit random. It
would be better to focus on core features like differentiation,
integration, linear algebra, polynomials, solvers, numerical
evaluation.
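
To make points 3 and 4 a little more concrete: a benchmark case could
be described as plain data, say a string expression plus the operation
to time, which sympy (or any other CAS) could then load and run. A
hypothetical sketch, with a completely made-up case format:

    import json
    import time

    from sympy import Symbol, integrate, sympify

    # Made-up, language-agnostic case description: just strings,
    # so other projects could consume the same data.
    CASE = json.loads("""
    {
      "name": "integrate_rational",
      "operation": "integrate",
      "expression": "(x**2 + 3*x + 1)/(x**4 + 2*x + 7)",
      "variable": "x"
    }
    """)

    def run_case(case):
        x = Symbol(case["variable"])
        expr = sympify(case["expression"])
        start = time.perf_counter()
        integrate(expr, x)  # only "integrate" is handled in this sketch
        return time.perf_counter() - start

    print(CASE["name"], run_case(CASE), "seconds")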

A big part of the difficulty in making good benchmarks is that it
really requires having some understanding of what are the core
operations that should or could be made faster.

--
Oscar

Aaron Meurer

Apr 15, 2022, 4:46:53 PM
to sy...@googlegroups.com
On Fri, Apr 15, 2022 at 6:54 AM Oscar Benjamin
<oscar.j....@gmail.com> wrote:
>
> On Fri, 11 Mar 2022 at 14:53, praneeth ratna <praneet...@gmail.com> wrote:
> >
> > Hi all,
> >
> > As mentioned in the project description here, https://github.com/sympy/sympy/wiki/GSoC-Ideas#benchmarks-and-performance, the sympy_benchmarks repo is missing some benchmarks that still need to be moved over from the main sympy repo. It is also missing benchmarks for the combinatorics, series, stats, and many other modules, which could be added. (Please correct me here.)
> > Also, I do not yet have a clear understanding of what the new implementation of the benchmarks should do or how it should be done.
>
> I think that with some of the project ideas the intention is not
> necessarily that anyone has a clear idea of exactly what should be
> done but that someone wanting to do a project would do some research
> themselves to come up with an idea.
>
> In relation to benchmarks the immediate issues I see are:
>
> 1. Some old benchmarking code is in the main repo and should be moved
> over to the benchmarks repo.
> 2. The benchmarks repo is currently broken because some test fails so
> no PR can be merged there until that is fixed.
> 3. The benchmarks are affected by caching which means that many of the
> timings reported are completely unrepresentative.
>
> Bigger issues:

For me the biggest issue with the benchmarks right now is that they
aren't very effective towards actually making SymPy faster, and
preventing it from getting slower (which are related but distinct
things). So any ideas suggested for a benchmarking project should, in
my opinion, be somehow linked to this end goal. This includes things
like

- Making the benchmarks easier to run (including running them automatically)
- Making it easier to add benchmarks
- Making it easier to interpret the results of benchmarks
- Making the benchmarks suite itself more likely to catch performance
regressions
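
On that last point, the basic idea is just comparing fresh timings
against saved ones and flagging slowdowns beyond some tolerance. A
minimal sketch of the idea (the baseline file name and the threshold
are made up):

    import json
    import time

    from sympy import expand, symbols

    BASELINE_FILE = "baseline_timings.json"  # hypothetical saved timings
    TOLERANCE = 1.5  # flag anything more than 1.5x slower than baseline

    def time_expand():
        x, y = symbols("x y")
        start = time.perf_counter()
        expand((x + y) ** 40)
        return time.perf_counter() - start

    def check(name, timing):
        with open(BASELINE_FILE) as f:
            baseline = json.load(f)
        if timing > TOLERANCE * baseline[name]:
            print(f"REGRESSION: {name}: {timing:.3f}s "
                  f"vs baseline {baseline[name]:.3f}s")

    check("time_expand", time_expand())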

>
> 1. The benchmarks are based on ASV which is awkward to use. It's
> unnecessarily difficult even just to run a single benchmark. A better
> framework should be found.

Just a small note on this point. I've done a small bit of research on
this myself and have come up with nothing. It's possible something
better does exist out there, but it's also likely that it doesn't and
if we want something better we will have to write it ourselves.

> 2. The benchmarks mostly only measure things that are already quite
> fast because otherwise they would take too long to run. To guide
> future improvements we really need benchmarks for things that are
> currently slow.
> 3. The benchmarks are too tied up with sympy itself. I would rather
> have an independent benchmark suite that collects good examples and
> can be used by other projects as well. For this you need benchmarks
> that are language agnostic and not simply written in sympy code. (I
> also think it would be better if a lot of the test suite was like this
> as well.)
> 4. The things that are tested in the benchmarks are a bit random. It
> would be better to focus on core features like differentiation,
> integration, linear algebra, polynomials, solvers, numerical
> evaluation.
>
> A big part of the difficulty in making good benchmarks is that it
> really requires having some understanding of what are the core
> operations that should or could be made faster.

This is true, but there's also enough work to be done on the
benchmarking tooling itself that most of the work could focus on that,
rather than necessarily adding too many actual benchmarks, which could
be done later.

Aaron Meurer

>
> --
> Oscar

Oscar Benjamin

Apr 15, 2022, 7:02:37 PM
to sympy
On Fri, 15 Apr 2022 at 21:46, Aaron Meurer <asme...@gmail.com> wrote:
>
> On Fri, Apr 15, 2022 at 6:54 AM Oscar Benjamin
> <oscar.j....@gmail.com> wrote:
> >
> > 1. The benchmarks are based on ASV which is awkward to use. It's
> > unnecessarily difficult even just to run a single benchmark. A better
> > framework should be found.
>
> Just a small note on this point. I've done a small bit of research on
> this myself and have come up with nothing. It's possible something
> better does exist out there, but it's also likely that it doesn't and
> if we want something better we will have to write it ourselves.

Indeed. The best alternative I found was pytest-benchmark:
https://pypi.org/project/pytest-benchmark/

I didn't get very far with evaluating it but I can say that at least
it makes it easy to run selected subsets of the benchmarks.
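
For example, with pytest-benchmark a benchmark is just a test function
that uses the benchmark fixture, and subsets can be selected with
pytest's usual -k filtering. A minimal sketch (file and test names are
made up):

    # test_bench_expand.py
    # run with:          pytest test_bench_expand.py
    # select a subset:   pytest -k expand
    from sympy import expand, symbols

    def test_expand_binomial(benchmark):
        x, y = symbols("x y")
        expr = (x + y) ** 30
        # pytest-benchmark calls the function repeatedly and reports
        # min/mean/stddev timings. Note that sympy's cache would still
        # need disabling for cold timings, as discussed earlier.
        benchmark(expand, expr)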

--
Oscar