On 31.05.2016 07:17, Mikera wrote:
> I've been working with a number of collaborators on a deep
> learning library for Clojure.
>
> Some key features:
> - An abstract API for key machine learning functionality
> - Ability to declare graphs / stacks of operations (somewhat
>   analogous to tensorflow)
> - Support for multiple underlying implementations (ClojureScript,
>   JVM, CPU, GPU)
> - Integration with core.matrix for N-dimensional data processing
>
> We intend to release as open source. We haven't released yet
> because we want to get the API right first but it is looking very
> promising.

Almost all deep learning development is done in Python, so having to
reproduce this work on a different runtime (and language) seems
un-Clojure-like to me, given that Clojure's strength is being hosted
on the JVM and leveraging its ecosystem. deeplearning4j is already
attempting it from the Java side with the argument of distributed
scaling, but again, a ton of work has gone into the Python toolkits,
including massive scaling (e.g. with TensorFlow), and most research
happens there. So my question is: what would the actual goals for a
Clojure stack be?

Machine learning systems can usually be implemented very well as a
"microservice" because they have no state (assuming the model is
trained and constant) and just need to return an output for a given
sample (i.e. act like a pure function). This has worked out fine for
me in Clojure so far.
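
To make the "pure function" point concrete, here is a minimal Python
sketch of the model side of such a service. Everything in it is an
assumption for illustration: the weights, the `predict` and
`handle_request` names, and the JSON request shape are made up and not
taken from any real system.

```python
import json
import math

# Hypothetical frozen parameters of an already-trained logistic
# regression model; nothing below ever modifies them.
WEIGHTS = (0.5, -0.25)
BIAS = 0.1


def predict(sample):
    """Pure function: the same sample always yields the same probability."""
    z = BIAS + sum(w * x for w, x in zip(WEIGHTS, sample))
    return 1.0 / (1.0 + math.exp(-z))


def handle_request(body):
    """Map a JSON request body to a JSON response body (assumed wire format)."""
    sample = json.loads(body)["sample"]
    return json.dumps({"probability": predict(sample)})
```

Any HTTP layer, in Clojure or elsewhere, can sit in front of a handler
like this, and because nothing is mutated between calls, instances can
be replicated freely.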

There are many other important machine learning concepts and papers
beyond deep learning that can be used the same way. The network I/O
can be problematic in some cases, of course, e.g. small models with a
low-latency requirement, but I think the leverage is much higher and
will increase unless the ML community switches from Python to the JVM
(highly unlikely at the moment). Personally, I would rather focus on
ML papers and on improving the algorithm on paper and in Python than
reimplement a ton of work in Clojure just to reach parity in my
favourite language.

Another point is performance. In many areas it doesn't matter much
whether you are 5 or 10% faster as long as you can scale. In machine
learning it is often critical, though, because these 5-10% already
amount to hours and scaling can be hard model-wise. A year ago I tried
to get performance on par with Theano out of Clatrix (both on CPU),
and Theano was 2-4x faster on my machine. Even if a Clojure stack
addressed this, a lot of work is necessary to constantly optimize the
stack for different GPU architectures and compute graphs. Theano
currently has 40 contributors or so, not to speak of TensorFlow being
backed by Google & DeepMind now.

I think a much better approach would be to bring Python native
bindings to the JVM. There is http://www.jyni.org/ and they seem to be
getting closer to numpy support, but it is only two people at the
moment (I think). While this is a more low-level effort and has little
to do with Clojure, it would allow using all Python libraries natively
from Clojure (together with core.matrix backends for numpy, etc.).
Jython is considered about as slow as CPython, so one could expect
Python-like performance with a small wrapper library, because most of
the performance-critical code is already in the native libraries. From
there, gradually becoming independent with a core.matrix-based Clojure
stack would not be an uphill battle, similar to how the Clojure
ecosystem itself developed.

What points would speak against an approach like this?

Christian