genuine help needed to speed up minimax algorithm!

Jim - FooBar();

unread,

Aug 17, 2012, 8:47:21 PM8/17/12

to clo...@googlegroups.com

Hi all,

I really need your advice/help with something I was not expecting! let
me explain...
after spending almost a week experimenting with different designs for my
chess minimax algorithm, I settled down on a design that I like. I like
it because it is rather functional (as opposed to my Java one 3 years
ago) and at least in theory, most of the job should be done in parallel.
To be hoenst it never occurred to me to use hash-maps for such a purpose
until I realised that a purely 'looping via map' -approach would not
allow me to keep track of the indices easily so even though i would have
the numeric result easily, I would not know which move this result
originated from. So, that is where it occurred to me to nest hash-maps
(instead of seqs via mapping) where I can always keep the originating
node as a key. I do think my solution is quite elegant and completely in
line with the functional lessons/obsessions/practices etc etc and for
that I'm really happy...Of course you can argue the oppposite - no
problem about that...:-)

As i was saying the design is pretty and elegant (and
lazy!)...Unfortunately it is terribly slow! I mean unbelievably
slow...4-5 min to reach level 4? How am I ever going to train by
evolutionary means if searching takes so much time is beyond me! Ok
let's see inside:

the core algorithm is here:
-------------------------------------------------------------------------------------------------------------------------------
(defn game-tree
"Generate the game tree up to a level lazily (via 'map')."
[root? dir board successors-fn depth]
{:node board
:direction dir
:children (if (zero? depth) '()
((if root? pmap map) ;start futures at the top via pmap
#(game-tree false (- dir) % successors-fn (dec depth))
(successors-fn board dir)))})

(defn search "The recursion of min-max algorithm."
[eval-fn tree]
(letfn [(minimize [tree] (if (seq (:children tree))
(apply min
(map #(maximize %) (:children tree)))
(eval-fn (:node tree) (:dir tree))))

(maximize [tree] (if (seq (:children tree))
(apply max
(map #(minimize %) (:children tree)))
(eval-fn (:node tree) (:dir tree))))]
(minimize tree)))

(defn evaluator
"Returns a fn that expects a tree to start searching using this eval-fn."
[eval-fn]
(fn [t]
(search eval-fn t)))

(defn best-next-state
"Get the best next state for the given game tree."
[tree eval-fn]
(->> (:children tree)
(apply max-key (evaluator eval-fn))
(:node)))

(defn next-level
"Generate the next level of the tree using b as the current state."
[b dir]
(let [team (filter #(= dir (:direction %)) b) ;all the team-mates (with
same direction)
team-moves (concat (for [piece team
coords (core/getMoves piece)]
(core/dest->Move @curr-game piece coords)))]
(vec (map core/try-move team-moves))))
--------------------------------------------------------------------------------------------------------------------------

As you can see all the usual suspects are there...letfn, pmap, max-key,
closures etc etc. I thought it was rather clever using pmap when
branching out from the root node but it turns out it only makes a tiny
positive difference overall (20-30 sec)...also looking at mu cpus, not
all of them are working even though there is so much work to be done on
each branch (where futures were created). Is pmap just too good to be
true? It is the first time i am using it seriously...

Or am I looking at it completely the wrong way here? I mean, I have done
this before in Java but it was only for checkers - not chess...we all
know that chess has a staggering branching factor especially as soon as
the game opens up a bit. Is it perhaps plain naive hoping to reach at
least level 6 in reasonable time (< 2min) functionally?

I really want to hear your thoughts especially if someone has done
anything similar...I honestly wish i had started coding checkers first
so I could compare with the Java one! at least i would knew whether to
pursuit it further or not...

thanks a lot for your patience and looking forward to comments :-)

Jim

Lars Nilsson

unread,

Aug 17, 2012, 9:46:51 PM8/17/12

to clo...@googlegroups.com

On Fri, Aug 17, 2012 at 8:47 PM, Jim - FooBar(); <jimpi...@gmail.com> wrote:
> As i was saying the design is pretty and elegant (and lazy!)...Unfortunately
> it is terribly slow! I mean unbelievably slow...4-5 min to reach level 4?
> How am I ever going to train by evolutionary means if searching takes so
> much time is beyond me! Ok let's see inside:

[...]

>
> As you can see all the usual suspects are there...letfn, pmap, max-key,
> closures etc etc. I thought it was rather clever using pmap when branching
> out from the root node but it turns out it only makes a tiny positive
> difference overall (20-30 sec)...also looking at mu cpus, not all of them
> are working even though there is so much work to be done on each branch
> (where futures were created). Is pmap just too good to be true? It is the
> first time i am using it seriously...
>
> Or am I looking at it completely the wrong way here? I mean, I have done
> this before in Java but it was only for checkers - not chess...we all know
> that chess has a staggering branching factor especially as soon as the game
> opens up a bit. Is it perhaps plain naive hoping to reach at least level 6
> in reasonable time (< 2min) functionally?

I have no particular insight regarding your code, and how to make
better. However, in general, I'd hazard a guess that high performance
game programs try to avoid memory allocations as much possible.
Preallocated arrays of sufficient sizes, etc, where values are filled
in and iterated over, stuff like that, rather than using functions
that yield temporary data structures that require memory allocation.

Perhaps check out http://www.tckerrigan.com/Chess/TSCP for a
non-optimized and written for clarity C Chess program that does not do
any memory allocation but uses fixed size arrays for the
search/principal variation, etc.

Lars Nilsson

Michael Gardner

unread,

Aug 18, 2012, 8:13:37 AM8/18/12

to clo...@googlegroups.com

If you haven't already, start by eliminating reflection warnings[1].

As for pmap, it's unfortunately useless. You could roll your own using e.g. Java's thread pools, or you could wait for the new reducers library[2]. Reducers should offer not only useful parallelism, but also better performance across-the-board compared to clojure's sequence functions (since they don't pay the performance price of laziness).

[1] http://clojuredocs.org/clojure_core/clojure.core/*warn-on-reflection*
[2] http://clojure.com/blog/2012/05/08/reducers-a-library-and-model-for-collection-processing.html

Jim - FooBar();

unread,

Aug 18, 2012, 8:27:44 AM8/18/12

to clo...@googlegroups.com

The only reflection warnings I get are from core.logic and seesaw but
these are used/required in other namespaces...I get not warnings from my
own code apart from a couple of cases where I'm calling Math/abs but
that is easy to fix.

As far as pmap goes, I originally thought of starting a new future for
each starting branch but that is what pmap does essentially, so it
looked very handy at first... I don't see how reducers can help here
because I'm not reducing anything really. actually comparing the scores
of the leaf nodes is not so expensive as it sounds (at least I can't
imagine why it would be). The trick would be to start branching in
parallel wouldn't it? So, let's assume you've got a brand new board.
White moves first and he has 20 potential moves to explore (8x2-pawns +
2x2 knights). There is a lot of work to be done in each branch so
assigning a future to each, sounds reasonable. However if that was
actually happening I should be seeing all 4 cores working non-stop!

I just wish pmap worked!

Jim

Michael Gardner

unread,

Aug 18, 2012, 8:57:47 AM8/18/12

to clo...@googlegroups.com

On Aug 18, 2012, at 7:27 AM, Jim - FooBar(); wrote:

> As far as pmap goes, I originally thought of starting a new future for each starting branch but that is what pmap does essentially, so it looked very handy at first...

Yes, pmap is essentially a trap in that it looks like the perfect tool for some light parallelism, but it works so poorly in the vast majority of cases that it's basically useless. There have been a few discussions on this list about it in the past.

> I don't see how reducers can help here because I'm not reducing anything really.

You should read the link I posted about reducers. It's good stuff.

> I just wish pmap worked!

So do I. Until the reducers library is ready, you could try something like this (no guarantees that this is an optimal implementation!):

(defn with-thread-pool* [num-threads f]
(let [pool (java.util.concurrent.Executors/newFixedThreadPool num-threads)]
(try (f pool)
(finally
(when pool (.shutdown pool))))))

(defmacro with-thread-pool [[name num-threads] & body]
`(with-thread-pool* ~num-threads (fn [~name] ~@body)))

(defn pmap-pool [f coll]
(with-thread-pool [pool (.. Runtime getRuntime availableProcessors)]
(doall
(map #(.get %)
(.invokeAll pool
(map (partial partial f)
coll))))))

Jim - FooBar();

unread,

Aug 18, 2012, 2:24:12 PM8/18/12

to clo...@googlegroups.com

On 18/08/12 13:57, Michael Gardner wrote:
> Until the reducers library is ready, you could try something like this (no guarantees that this is an optimal implementation!):
>
> (defn with-thread-pool* [num-threads f]
> (let [pool (java.util.concurrent.Executors/newFixedThreadPool num-threads)]
> (try (f pool)
> (finally
> (when pool (.shutdown pool))))))
>
> (defmacro with-thread-pool [[name num-threads] & body]
> `(with-thread-pool* ~num-threads (fn [~name] ~@body)))
>
> (defn pmap-pool [f coll]
> (with-thread-pool [pool (.. Runtime getRuntime availableProcessors)]
> (doall
> (map #(.get %)
> (.invokeAll pool
> (map (partial partial f)
> coll))))))

Hi Michael,

Unfortunately I get exactly the same performance with pmap when using
the above code... and again 2 of my cpus are sitting around waiting!
If use pmap or map-pool on all levels i get almost 100% utilisation of
cpu but nothing useful happens!!! anyway, I'll have a look at reducers!
thanks for the snippet anyway!

Jim

Jim - FooBar();

unread,

Aug 18, 2012, 2:52:53 PM8/18/12

to clo...@googlegroups.com

JESUS CHRIST!
What the hell just happened? I used clojure 1.5 alpha 3 that has the new
reducers and now i get back the same result in 3 seconds!!!!
HOW ON EARTH IS THAT POSSIBLE? I mean i've watched the videos but i
would never expect so much performance increase!!! what is happening? Is
my algorithm completely wrong? this is scary stuff...

the bad news is that i always get the same answer regardless of the
depth...also something suspicious is the fact the no matter the depth
the r/map returns in constant time! something smells here!

Jim

Bill Robertson

unread,

Sep 3, 2012, 11:42:34 PM9/3/12

to clo...@googlegroups.com

Did you figure out what was going on?

Jim - FooBar();

unread,

Sep 4, 2012, 6:37:46 AM9/4/12

to clo...@googlegroups.com

everything is working fine...see the post from yesterday called 'when
looking for performance consider cheating' for an up to date explanation
of how and where i cheated to speed it up...

Jim

On 04/09/12 04:42, Bill Robertson wrote:
> Did you figure out what was going on?

> --
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to clo...@googlegroups.com
> Note that posts from new members are moderated - please be patient
> with your first post.
> To unsubscribe from this group, send email to
> clojure+u...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en

Reply all

Reply to author

Forward