Agreed. For me a fairly high priority is threads---I've got applications that
are dirt slow on one CPU (I'm talking days here...) and for which the DArray
approach is a nonstarter (due to the overhead of data transport; these are
"big data" as well as "big computation" problems). SMP is the obvious answer.
I don't think we're actually that far away from supporting low-level
multithreading, the main lack being a mutex around codegen. (Obviously all of
the threaded algorithms have to avoid touching gc, but that's getting easier
all the time.) I've spent a little time playing with this but have nothing
sufficiently finished for general consumption (or even for my own). But
my own need is getting sufficiently dire that I am contemplating picking these
efforts up again, although in the near term other more immediate commitments
will almost certainly win the competition for my time.

A "nice" threading interface (i.e., something fancier than a wrapper around
pthreads) would be more work, of course. And of course GPUs are very
attractive, too, but again more work.

Finally, don't forget Krys' very nice work on delayed execution, which
implements something along exactly the lines you propose. That work has never
gotten the attention it deserves (and here I view myself as the #1 guilty
party in not picking that up and running with it, since it would in principle
be quite useful to me).

--Tim

On Friday, September 06, 2013 04:59:13 AM Dahua wrote:
> In recent traveling, I met a lot of people working on machine learning and
> computer vision. I found that Python has become increasingly popular lately
> -- people are talking about new packages in Python for developing high
> performance learning algorithms, notably
> Theano <http://deeplearning.net/software/theano/> and NumbaPro
> <http://docs.continuum.io/numbapro/>.