In essence, what am I telling you here? Two things, basically.
Firstly, think about performance before you actually design your
solution. Some tasks may not be well suited to immutable
data structures, and that can be a show-stopper. Secondly, don't
think for a second that you can convert your gorgeous, purely
immutable approach to a purely mutable (better performing) one
without any cost. These are two fundamentally different choices that
lead to completely different algorithmic designs.
So where does this leave me? Well, I am going to stick with the
elegant, all-Clojure solution and perhaps find an 8- or 12-core machine
to do my training on. In a different setting, however, I might
choose otherwise. At the moment I don't have time to spend
hundreds of hours fixing all the bugs in the mutable version so I
can get it to perform better. I hate the process as well. If
someone was paying me, though, my ego would be far smaller!
Of course, there are going to be people who will say: "You're
using the wrong algorithm!", "You can prune the tree!" and so
on. That is not the point, though. Even if I do pruning, does
anybody think I can get to level 4 in less than 5 minutes? That would
require pruning half the tree! Anyway, what I'm trying to say is
that in algorithms like these, performance matters a lot. Most
machine learning algorithms involve matrix multiplication. Why do
you think all the machine learners use Matlab? You think they
enjoy writing Matlab code? No, it just goes super fast! If
performance matters to you, then perhaps your time is best spent
thinking about a mutable approach from day 1. I cannot believe I said
that, but there you go: I said it!
No hard feelings eh? I still love Clojure... :-)
Jim
I would love to have some time to look into the details of your specific problem more, but in the absence of time, might I suggest two quick points:
1. Gary Bernhardt has been playing with a "new" approach he calls "Functional Core, Imperative Shell". Essentially, it's another take on the question of how to limit the scope of mutation in order to get the most out of the correctness of mutation-free algorithms and the performance of mutating data instead of replicating it.
2. Along the same lines, have you made the most of transients in your code? From your description, it seems like you have less work happening within methods than between methods, but perhaps if you "manually" inline some of the work, transients could provide improved performance.
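
To make the transients suggestion concrete, here is a minimal sketch. It is not taken from Jim's code: the function names and the toy "vector of squares" task are hypothetical, chosen only to show the shape of the technique. The mutation stays inside one function, and callers still receive an ordinary persistent vector.

```clojure
;; Hypothetical example: the names and the task are illustrative only.

(defn squares-persistent
  "Builds a vector of squares using plain conj.
  Every step returns a new persistent vector."
  [n]
  (reduce (fn [acc i] (conj acc (* i i)))
          []
          (range n)))

(defn squares-transient
  "Same result, but accumulates into a transient vector.
  The transient never escapes this function; persistent! turns it
  back into an ordinary immutable vector before returning."
  [n]
  (persistent!
    (reduce (fn [acc i] (conj! acc (* i i)))
            (transient [])
            (range n))))
```

Both functions return equal vectors, so the transient version can be swapped in without callers noticing. Whether it actually helps depends on how much of the work happens inside a single function like this, so it is worth timing both on real data before committing.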
Hi Jim,
Reading your story, I get the impression that you treat 'functional' and 'immutable' as synonyms, rather than treating immutability as a default.
The implementation should be more transparent.
In the APL family of functional and vector programming languages there are tools that amend values in place.
It feels so natural: it is part of the language, used in the ordinary functional way even at a higher level of abstraction.
People use those languages for ML because the solutions are much faster than Matlab while still being very neat functional solutions.
Killing performance for the sake of a religious paradigm of immutability may kill the language.
cheers
patryk