Dear OpenBLAS users,
I'm relatively new to Linux.
I want to install Quantum ESPRESSO on our server, and I chose OpenBLAS as our BLAS/LAPACK library since it is well maintained.
Q1. How can I optimize OpenBLAS for our server? Will "make" do all the work for us, or do I need to set the number of threads manually when I compile?
Q2. I've successfully installed OpenBLAS on my own PC (Intel i7, 8 cores) with the make command, and I linked both libraries (BLAS, LAPACK) to "libopenblas.a". I ran the same example calculations with the different installations, but OpenBLAS doesn't seem to have a compelling speed advantage over the default BLAS and LAPACK files that come with Quantum ESPRESSO. Is that because I didn't install the OpenBLAS package properly?
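To make the comparison in Q2 concrete, a timing check of this kind can be sketched roughly as below. This is only a minimal sketch: the helper name median_wall_time is made up for illustration, and the command shown is a placeholder that just starts the Python interpreter. It would need to be replaced with the actual example run for each build.

```python
import statistics
import subprocess
import sys
import time


def median_wall_time(cmd, repeats=3):
    """Run cmd `repeats` times and return the median wall-clock time in seconds."""
    times = []
    for _ in range(repeats):
        start = time.perf_counter()
        # check=True raises if the command fails, so a broken run is not timed silently
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        times.append(time.perf_counter() - start)
    return statistics.median(times)


if __name__ == "__main__":
    # Placeholder command (an assumption): substitute the real example
    # calculation for the build being measured.
    cmd = [sys.executable, "-c", "pass"]
    print(f"median wall time: {median_wall_time(cmd):.3f} s")
```

Taking the median over a few repeats makes a single slow run (for example, a cold disk cache) less likely to distort the comparison between the two builds.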
Thanks in advance.
PS:
Server system info: (128 cpus)
processor : 0~127
vendor_id : GenuineIntel
cpu family : 6
model : 47
model name : Intel(R) Xeon(R) CPU E7- 8830 @ 2.13GHz
stepping : 2
cpu MHz : 1067.000
cache size : 24576 KB
physical id : 0
siblings : 16
core id : 0
cpu cores : 8
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 11
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 popcnt aes lahf_lm ida arat epb dts tpr_shadow vnmi flexpriority ept vpid
bogomips : 4255.98
clflush size : 64
cache_alignment : 64
address sizes : 44 bits physical, 48 bits virtual
power management:
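
In case it helps with the thread-count question, this is how I read the topology from the fields above, assuming all sockets are identical and that "siblings" and "cpu cores" are per physical package. The function name is just for illustration.

```python
def topology(logical_cpus, siblings, cpu_cores):
    """Derive socket and core counts from /proc/cpuinfo-style fields.

    logical_cpus: total number of "processor" entries
    siblings:     logical CPUs per physical package
    cpu_cores:    physical cores per physical package
    """
    sockets = logical_cpus // siblings
    physical_cores = sockets * cpu_cores
    threads_per_core = siblings // cpu_cores
    return sockets, physical_cores, threads_per_core


# Values taken from the listing above: 128 processors, siblings 16, cpu cores 8
print(topology(128, 16, 8))  # → (8, 64, 2)
```

So the 128 CPUs appear to be 8 sockets with 8 cores each (64 physical cores), with 2 hardware threads per core.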