Some may remember previous discussions about setting up compiler benchmarks for comparing the various NumPy-focused compilers and/or machines, depending on your use case. Some people graciously set up a nice initiative here:
https://github.com/numfocus/python-benchmarks and it has been useful; I thank all contributors to that suite.
However, it was not clear whether the benchmarks are a representative set of scientific computing applications. Researchers in other languages/domains have faced the same issue and came up with the idea of "Dwarfs". Dwarfs contain representative samples of various types of computations such as dense matrix algebra, sparse matrix operations, structured grid, FFT, unstructured grid, map-reduce etc.
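To make the idea of a dwarf concrete, here is a minimal sketch of one kernel from the "sparse matrix operations" category: a sparse matrix-vector product over a CSR matrix. This is only an illustration and is not taken from any of the benchmark suites mentioned in this post; the function name spmv_csr and the small test matrix are made up for the example.

```python
import numpy as np

def spmv_csr(indptr, indices, data, x):
    """Compute y = A @ x for a matrix A stored in CSR form.

    indptr, indices, data are the usual CSR arrays (row pointers,
    column indices, stored values). Hypothetical helper for illustration.
    """
    n_rows = len(indptr) - 1
    y = np.zeros(n_rows, dtype=np.result_type(data, x))
    for row in range(n_rows):
        start, end = indptr[row], indptr[row + 1]
        # Dot product of this row's stored values with the matching entries of x.
        y[row] = np.dot(data[start:end], x[indices[start:end]])
    return y

# Small made-up 3x3 matrix:
# [[4, 0, 1],
#  [0, 3, 0],
#  [2, 0, 5]]
indptr = np.array([0, 2, 3, 5])
indices = np.array([0, 2, 1, 0, 2])
data = np.array([4.0, 1.0, 3.0, 2.0, 5.0])
x = np.array([1.0, 2.0, 3.0])

y = spmv_csr(indptr, indices, data, x)
```

The explicit loop over rows is the part a compiler would be expected to speed up, which is why kernels of this shape are useful for comparing compilers.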
Other researchers have made benchmark sets such as Rodinia and OpenDwarfs, providing C, C++, OpenCL, OpenMP and CUDA implementations (depending on the benchmark). Some students from my lab also ported some of the benchmarks to JavaScript (see
https://github.com/Sable/Ostrich). I think these benchmark sets provide a very good baseline, and an established methodology for comparing our work against work done by other researchers.
So I have started a benchmark repository here:
http://bitbucket.org/codedivine/pydwarfs which provides initial implementations of some of the dwarfs in Python. It is written to run on CPython 3, and provides Cython implementations of many of the benchmarks. In most cases, I have taken care to match the output of the Cython version against the versions provided by other researchers in C/C++ etc., so performance is directly comparable for the same inputs. You might also compare performance against the JS implementations (which are actually very competent) using Ostrich, linked above.
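For anyone doing a port, the general idea of checking output against a reference before comparing timings could be sketched as follows. This is not the harness used in the repository; check_and_time, its parameters and the stand-in kernel are all hypothetical, and the tolerance is an arbitrary choice.

```python
import time
import numpy as np

def check_and_time(kernel, args, reference, rtol=1e-6, repeats=3):
    """Run kernel(*args) a few times, keep the best wall-clock time,
    and check the last result against a reference output.
    Hypothetical helper, not part of any existing suite."""
    best = float("inf")
    result = None
    for _ in range(repeats):
        t0 = time.perf_counter()
        result = kernel(*args)
        best = min(best, time.perf_counter() - t0)
    # Floating-point results from different implementations rarely match
    # bit-for-bit, so compare with a relative tolerance.
    ok = bool(np.allclose(result, reference, rtol=rtol))
    return ok, best

# Stand-in kernel: a dense matrix product, checked against a
# reference computed entry by entry with plain Python loops.
a = np.arange(6.0).reshape(2, 3)
b = np.arange(12.0).reshape(3, 4)
reference = np.array([[sum(a[i, k] * b[k, j] for k in range(3))
                       for j in range(4)] for i in range(2)])

ok, seconds = check_and_time(np.dot, (a, b), reference)
```

A timing is only worth reporting when ok is true; otherwise the port is computing something different from the reference.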
I am also working on ports for my own compiler framework (Velociraptor).
Feel free to fork and do ports for Numba, Parakeet, PyPy etc. as you wish.
This is a work in progress, so things *may* break. Not all dwarfs are currently represented, nor is there much documentation. I would suggest starting from LUD, SRAD, Pagerank and SPMV. The benchmark suite is generally licensed under Apache 2, but see specific files for more details.