to openbla...@googlegroups.com
I compiled OpenBLAS on a recent Core i7-5930K, which belongs to the new Haswell-E class of processors. With 1 thread I'm getting 46.5 GFlops in DGEMM (tested with N=9216), which is 79% of peak at the Turbo frequency of 3.7 GHz.
The Haswell dgemm kernel file in OpenBLAS lists 53.3 GFlops in its comments, which is 85% of peak at the 3.9 GHz Turbo frequency of your Core i7-4770. 79% vs. 85% is not a huge gap, but I'm curious what might cause the poorer performance on the Haswell-E chip. Is it a difference in cache or main memory latency (my system uses DDR4 rather than DDR3 memory)? Maybe some of the parameters should be set differently? Which parameters should I try changing to fine-tune the performance?
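For reference, here is roughly how I arrived at the percentages above. This is only a sketch: it assumes 16 double-precision flops per cycle per core on Haswell (2 FMA units x 4 doubles per 256-bit register x 2 flops per FMA) and that the core really holds the quoted Turbo frequency during the run, which I have not verified. The function names are just mine for illustration.

```python
# Assumption: 16 DP flops per cycle per core on Haswell
# (2 FMA units x 4 doubles per 256-bit AVX2 register x 2 flops per FMA).
FLOPS_PER_CYCLE = 16


def peak_gflops(turbo_ghz, flops_per_cycle=FLOPS_PER_CYCLE):
    """Theoretical single-core peak in GFlops at the given clock (GHz)."""
    return turbo_ghz * flops_per_cycle


def efficiency(measured_gflops, turbo_ghz):
    """Measured DGEMM rate as a fraction of the theoretical peak."""
    return measured_gflops / peak_gflops(turbo_ghz)


# Figures taken from this post: (CPU, measured GFlops, Turbo GHz).
results = [
    ("Core i7-5930K", 46.5, 3.7),
    ("Core i7-4770", 53.3, 3.9),
]

for name, gflops, ghz in results:
    print(f"{name}: peak {peak_gflops(ghz):.1f} GFlops, "
          f"efficiency {efficiency(gflops, ghz):.1%}")
```

If the chip actually runs below the nominal Turbo clock under sustained AVX2 load, the true peak is lower and the real efficiency is higher than these figures suggest.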