On 25/02/2015 16:08, Ron Peacetree wrote:
> Enough time has passed since the original design of edax in ~1998 that
> there have been some significant changes in HW architecture:
I do not think there is a single line of Edax 1.0 (the 1998 version) in
Edax 4.x. Edax has been rewritten from scratch for each major
version. It does take into account some recent changes in hardware
architecture (64-bit instructions, multi-core CPUs, etc.) that were not
available (at a reasonable price) in 1998.
> =as of this writing, one can buy commodity systems (albeit servers at
> present) that support 1-2 TeraBytes of RAM
> =1+ TeraByte SSDs with ~RAM speed IO paths (SATA Express) are now reality.
They won't replace RAM. RAM is still ~1000x faster than an SSD in
access time.
> =GPU functionality is becoming much more tightly integrated to CPUs
> =GPUs are far better general purpose computing devices than they were
> even 2 years ago
> =HW support for transactional memory
> =HW support for automated fine grain parallelism
All this is very recent (Haswell CPUs or later), and sometimes buggy
(transactional memory on the first Haswell and Broadwell CPUs).
>
> All of the above suggests that it might be time to see if the
> performance of edax could be significantly improved:
> =support for transposition tables up to 1+ TB
That's easy to do but, when it comes to transposition table size,
/bigger/ does not mean /better/.
> =seeing if the magic bitboard code can better leverage the GPU
> infrastructure
Probably not. As far as I know, it is still very slow to switch between
GPU & CPU. That means you cannot run the move generator on the GPU & the
search on the CPU.
> =seeing where else move generation and evaluation could be improved by
> the HW changes
Move generation can benefit from some new instructions available on
Haswell, for example with the following code:
https://code.google.com/r/okuharaandroid-edax-reversi/source/browse/src/flip_avx.c
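
For readers unfamiliar with bitboard move generation, here is a minimal
portable sketch of the idea in plain C. It is not the code at the link
above and not Edax's actual implementation; the function names and the
bit layout (bit 0 = A1, bit 63 = H8) are assumptions made for
illustration. Each of the four line orientations is an independent
shift-and-mask loop, which is the kind of work that wide vector
instructions could in principle process in parallel.

```c
#include <stdint.h>

/* Hypothetical helper: squares reached from `player` by stepping over a
 * contiguous run of `mask` discs, in both senses of one line orientation.
 * `dir` is the bit distance between neighbours on that line (1, 7, 8 or 9). */
static uint64_t moves_along(uint64_t player, uint64_t mask, int dir)
{
    uint64_t l = (player << dir) & mask;   /* first opponent disc, one sense */
    uint64_t r = (player >> dir) & mask;   /* first opponent disc, other sense */

    /* A run holds at most 6 discs: 1 found above + 5 extensions here. */
    for (int i = 0; i < 5; i++) {
        l |= (l << dir) & mask;
        r |= (r >> dir) & mask;
    }
    /* The square just past each run is a candidate move. */
    return (l << dir) | (r >> dir);
}

/* Hypothetical entry point: bitboard of legal moves for `player`. */
uint64_t legal_moves(uint64_t player, uint64_t opponent)
{
    /* Opponent discs on files A and H can never be bracketed horizontally
     * or diagonally; masking them out also prevents wrap-around between rows. */
    uint64_t inner = opponent & 0x7E7E7E7E7E7E7E7EULL;
    uint64_t empty = ~(player | opponent);

    uint64_t moves = moves_along(player, inner, 1)      /* horizontal    */
                   | moves_along(player, opponent, 8)   /* vertical      */
                   | moves_along(player, inner, 7)      /* anti-diagonal */
                   | moves_along(player, inner, 9);     /* diagonal      */
    return moves & empty;
}
```

With Black on E4 and D5 and White on D4 and E5 (the starting position),
legal_moves() returns the four squares D3, C4, F5 and E6.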
>
> The laptop i'm writing this on is an i7-4860HQ with 32GB of RAM and a
> nVidia GeForce GTX 980M running Win 8.1
> the "stock" Edax 4.3.2 distro routinely searches 20-40M nps with peaks
> of 50-65M nps on this system.
> (hash-table-size 30, n-tasks 4, level 32 or 34)
A few tricks to run Edax faster:
- Try a smaller hash-table-size (25 should be near optimal for level
32-34) & n-tasks 8.
- Try to recompile Edax (stock Edax is optimized for a generic CPU; if you
recompile it, it will be optimized for your own CPU).
- Avoid Windows (10% slower than Linux or Mac OS X).
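
To put the hash-table-size numbers in perspective, here is a small
sketch assuming the option is the base-2 logarithm of the number of
entries, so that the table is a power of two and can be indexed with a
simple mask. The function names are hypothetical and the real entry
layout in Edax is not shown.

```c
#include <stdint.h>

/* Number of entries in a table sized as a power of two.
 * Assumption: `bits` is the hash-table-size setting and is below 64. */
uint64_t table_entries(unsigned bits)
{
    return 1ULL << bits;
}

/* Slot for a position: keep the low `bits` bits of its hash code. */
uint64_t table_index(uint64_t hash_code, unsigned bits)
{
    return hash_code & (table_entries(bits) - 1);
}
```

Under that assumption, going from 30 to 25 makes the table 32 times
smaller.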
> With larger transposition tables and improvements I think could be
> made to the engine, edax might be able to routinely do 100+ M nps
> searches and search 40-42 plies reasonably efficiently. Especially on
> HW better than my laptop ;-)
When solving fforum 40-59, Edax reaches 140 Mnps on my computer
(i7-2600K at 4 GHz).
--
Richard