gensim + PyPy = good recipe?

Bruno CHAMPION

Mar 11, 2014, 10:50:57 AM3/11/14
to gen...@googlegroups.com
Hi all
Has anybody tried running gensim with PyPy (http://www.pypy.org/, a fast Python JIT compiler)?
PyPy seems to be much faster than the default Python interpreter. It could be helpful for heavy computation tasks.
Bruno

Radim Řehůřek

Mar 11, 2014, 11:32:44 AM3/11/14
to gen...@googlegroups.com
Hello Bruno,

that's an exciting approach, using JIT.

Currently Gensim uses NumPy for low-level performance (NumPy itself wraps fast system BLAS libs etc).

AFAIK PyPy is not compatible with NumPy, though there are plans: http://pypy.org/numpydonate.html

If it WAS compatible, we could get some speedup from PyPy in the non-numerical code in gensim. But most time is spent in number crunching anyway, where I suspect a good BLAS (=SSE, AVX instructions, cache awareness...) via NumPy will get you more than PyPy possibly could.

I could be wrong though, I'm curious to hear if anyone has more practical experience with this.

Best,
Radim

--
Radim Řehůřek, Ph.D.
consulting @ machine learning, natural language processing, big data
 

Skipper Seabold

Mar 11, 2014, 11:56:58 AM3/11/14
to gensim
On Tue, Mar 11, 2014 at 11:32 AM, Radim Řehůřek <m...@radimrehurek.com> wrote:
> Hello Bruno,
>
> that's an exciting approach, using JIT.
>
> Currently Gensim uses NumPy for low-level performance (NumPy itself wraps
> fast system BLAS libs etc).
>
> AFAIK PyPy is not compatible with NumPy, though there are plans:
> http://pypy.org/numpydonate.html

My (possibly outdated) understanding is that this is a bit of a
misnomer. It provides a numpy-like array, but as you mention below, no
linear algebra, no scipy, no matplotlib, no numpy C-api, so you're a
bit stuck if you want to actually do (data-centric) things. Indeed,
linear algebra operations are glaringly absent from the above link.
This is what I want to hear about before I get excited about this. I
don't much care how long it takes to do "operations on stacked arrays"
or do indexing (usually).

>
> If it WAS compatible, we could get some speedup from PyPy in non-numerical
> code in gensim. But most time is spent in number crunching anyway, where I
> suspect a good BLAS (=SSE, AVX instructions, cache awareness...) via NumPy
> will get you more than PyPy possibly could.
>
> I could be wrong though, I'm curious to hear if anyone has more practical
> experience with this.
>

tl;dr Benchmarking might be a good first step to get an idea of where
the time is spent; then it'll be clearer what the best approach is.
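
For example, a minimal profiling sketch using the standard library's cProfile (the `hotspot` function here is a hypothetical stand-in; in practice you'd profile an actual gensim training run):

```python
# Profile a function and print the top entries by cumulative time.
# "hotspot" is a toy stand-in for whatever a real profile would reveal.
import cProfile
import io
import pstats

def hotspot(n):
    total = 0.0
    for i in range(n):
        total += i * 0.5
    return total

profiler = cProfile.Profile()
profiler.enable()
result = hotspot(100000)
profiler.disable()

stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```

The report shows per-function call counts and cumulative times, which is exactly the information needed before deciding between PyPy, numba, or Cython.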

FWIW, I've mainly stuck with Cython for performance.

A bit of the early back and forth (from the numpy side)

http://technicaldiscovery.blogspot.com/2011/10/thoughts-on-porting-numpy-to-pypy.html
http://blog.streamitive.com/2011/10/17/numpy-isnt-about-fast-arrays/

This is Continuum's answer to the JIT craze, which sounds more
realistic than PyPy for scientific work to me (depending on the hotspot).

http://numba.pydata.org/

IMO, actually finding the hotspots in the code and replacing them with
numba, or better yet just plain Cython, would be the best way to speed
things up. It depends, though, on what exactly is slow. I don't think
there's a one-stop "do this and everything is faster."
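
As a sketch of the numba route (assumes numba is installed; the row-norm kernel below is just an illustrative hotspot with explicit loops, not gensim code):

```python
# Sketch: compile a loop-heavy hotspot with numba's njit decorator.
# Falls back to plain Python if numba isn't available.
import numpy as np

try:
    from numba import njit
except ImportError:
    njit = lambda f: f  # no-op fallback: same results, no speedup

@njit
def row_norms(x):
    # Explicit loops: exactly the pattern a JIT compiles well.
    n, m = x.shape
    out = np.empty(n)
    for i in range(n):
        s = 0.0
        for j in range(m):
            s += x[i, j] * x[i, j]
        out[i] = s ** 0.5
    return out

x = np.arange(6.0).reshape(2, 3)
print(np.allclose(row_norms(x), np.linalg.norm(x, axis=1)))
```

The point of the pattern: the decorated function stays ordinary Python, so you can swap the JIT in and out per hotspot rather than committing the whole codebase to it.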

Last I looked at LDA, for example, there wasn't too much optimizing to
be done besides moving the whole thing to C via Cython (i.e., no Python
calls in the loop), but I didn't profile it.

I recently did something similar for our Kalman filter implementation
in statsmodels. The problem there was repeated calls to np.dot (in
Python) on small arrays from Cython code. I moved it all to pure C
(calling blas funcs from Cython) and found a good 5-100x speedup
(average 10-25x) depending on the problem.

https://github.com/jseabold/statsmodels/blob/arma-speedup/statsmodels/tsa/kalmanf/kalman_loglike.pyx
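
A rough Python-level analogue of that change (the statsmodels code goes further, calling the BLAS routines from Cython with no Python in the loop; this sketch just shows resolving and calling a BLAS `gemm` directly via SciPy):

```python
# Sketch: call a BLAS gemm routine directly instead of np.dot.
# The real win in the statsmodels change comes from doing this from
# Cython, avoiding Python-level call overhead on small arrays entirely.
import numpy as np
from scipy.linalg import get_blas_funcs

a = np.asfortranarray(np.random.rand(4, 3))
b = np.asfortranarray(np.random.rand(3, 2))
gemm, = get_blas_funcs(("gemm",), (a, b))  # resolves to dgemm for float64
c = gemm(alpha=1.0, a=a, b=b)  # c = 1.0 * a @ b
```

Fortran-ordered inputs let BLAS work without internal copies, which matters most for exactly the small-array, repeated-call case described above.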

It becomes a bit more complicated in gensim because of the parallelism
and would require a bit more care, depending on the (possibly
threaded) BLAS backend IIUC.

Skipper

> On Tuesday, March 11, 2014 3:50:57 PM UTC+1, Bruno CHAMPION wrote:
>>
>> Hi all
>> Has anybody tried running gensim with PyPy (http://www.pypy.org/, a fast
>> Python JIT compiler)?
>> PyPy seems to be much faster than the default Python interpreter. It could
>> be helpful for heavy computation tasks.
>> Bruno
>
>
> --
> Radim Řehůřek, Ph.D.
> consulting @ machine learning, natural language processing, big data
> http://radimrehurek.com
>
>

Radim Řehůřek

Mar 11, 2014, 12:42:33 PM3/11/14
to gen...@googlegroups.com
Great links, thanks Skipper!

Gensim tries to push as much work as possible into BLAS/LAPACK, so that's where the time goes: large matrix multiplications, QR decompositions... Which is why I say I wouldn't expect "going PyPy" to bring much gain. The relevant parts are already "compiled" and well optimized, via NumPy.
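
To make that concrete, a rough, machine-dependent comparison (a sketch, not a benchmark) of a BLAS-backed matrix multiply against the same multiply in pure Python:

```python
# Compare NumPy/BLAS matrix multiply to a naive pure-Python triple loop.
# Absolute numbers depend on the BLAS NumPy links against.
import timeit
import numpy as np

n = 100
a = np.random.rand(n, n)
b = np.random.rand(n, n)
al, bl = a.tolist(), b.tolist()

def py_matmul(x, y):
    # The work BLAS does in optimized native code, done in interpreted Python.
    size = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(size))
             for j in range(size)] for i in range(size)]

t_blas = timeit.timeit(lambda: a.dot(b), number=3)
t_py = timeit.timeit(lambda: py_matmul(al, bl), number=3)
# On any recent machine, t_py exceeds t_blas by orders of magnitude.
```

That gap is the part a JIT like PyPy would have to close to compete with NumPy here, which is why the non-numerical glue code is the only place PyPy plausibly helps.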

I mean, there are certainly optimizations to be gained by going C (Cython), mostly by bypassing Python/NumPy memory management: avoiding memory copies during fancy indexing, etc. LDA included, where I'd expect roughly an order-of-magnitude speedup from dropping Python and going fully optimized low-level.

Not on the roadmap, although the "pure Python" Rubicon has already been crossed with gensim's Cython-optimized word2vec :-)

Best,
Radim