Thanks. While producing the diagnostics I noticed that SciPy might not have been configured properly, only NumPy. I had reinstalled NumPy from a local build configured with OpenBLAS, but I hadn't reinstalled SciPy. The configurations after and before the rebuild are shown below. I have now rebuilt SciPy as well, and LSA over Simple English Wikipedia (400 topics) now takes a third of the time it did before.
The debugging trace for Simple English Wikipedia is attached (redux-simplewiki-20140410-pages-articles.gensim-prep.lsa400.log). I'll post the full trace for Wikipedia in a day or two.
You might want to revise the BLAS diagnostics notes in the distributed computing tutorial to use scipy.show_config().
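In case it helps anyone else checking their setup, here is a minimal Python sketch that captures what show_config() prints for both NumPy and SciPy, so a mismatch like mine is easier to spot. The helper names blas_config and mentions_openblas are made up for illustration, and the substring check for "openblas" is only a rough heuristic that assumes the library name appears in the printed output.

```python
import contextlib
import importlib
import io


def blas_config(module_name):
    """Return the text printed by <module>.show_config(), or None if unavailable."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None  # package is not installed in this environment
    show = getattr(module, "show_config", None)
    if show is None:
        return None  # module has no show_config() function
    buf = io.StringIO()
    # show_config() prints to stdout, so capture it instead of returning a value.
    with contextlib.redirect_stdout(buf):
        show()
    return buf.getvalue()


def mentions_openblas(config_text):
    """Rough heuristic (assumption): look for 'openblas' in the printed config."""
    return config_text is not None and "openblas" in config_text.lower()


if __name__ == "__main__":
    for name in ("numpy", "scipy"):
        text = blas_config(name)
        if text is None:
            status = "show_config() not available"
        elif mentions_openblas(text):
            status = "mentions OpenBLAS"
        else:
            status = "no mention of OpenBLAS"
        print(f"{name}: {status}")
```

If the two packages report different results, one of them was probably built against a different BLAS, which is what happened in my case.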
new NumPy and SciPy config:
>>> scipy.show_numpy_config()
lapack_opt_info:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/usr/local/openblas/lib']
    language = f77
blas_opt_info:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/usr/local/openblas/lib']
    language = f77
openblas_info:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/usr/local/openblas/lib']
    language = f77
blas_mkl_info:
  NOT AVAILABLE
>>> scipy.show_config()
lapack_opt_info:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/usr/local/openblas/lib']
    language = f77
blas_opt_info:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/usr/local/openblas/lib']
    language = f77
openblas_info:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/usr/local/openblas/lib']
    language = f77
blas_mkl_info:
  NOT AVAILABLE
$ lscpu
Architecture: x86_64
CPU op-mode(s): 64-bit
CPU(s): 2
Thread(s) per core: 1
Core(s) per socket: 1
CPU socket(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 26
Stepping: 5
CPU MHz: 2266.746
Hypervisor vendor: Xen
Virtualization type: para
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 4096K
old SciPy config:
>>> scipy.show_config()
blas_info:
    libraries = ['blas']
    library_dirs = ['/usr/lib64']
    language = f77
amd_info:
    libraries = ['amd']
    library_dirs = ['/usr/lib64']
    define_macros = [('SCIPY_AMD_H', None)]
    swig_opts = ['-I/usr/include/suitesparse']
    include_dirs = ['/usr/include/suitesparse']
lapack_info:
    libraries = ['lapack']
    library_dirs = ['/usr/lib64']
    language = f77
atlas_threads_info:
  NOT AVAILABLE
blas_opt_info:
    libraries = ['blas']
    library_dirs = ['/usr/lib64']
    language = f77
    define_macros = [('NO_ATLAS_INFO', 1)]
atlas_blas_threads_info:
  NOT AVAILABLE
umfpack_info:
    libraries = ['umfpack', 'amd']
    library_dirs = ['/usr/lib64']
    define_macros = [('SCIPY_UMFPACK_H', None), ('SCIPY_AMD_H', None)]
    swig_opts = ['-I/usr/include/suitesparse', '-I/usr/include/suitesparse']
    include_dirs = ['/usr/include/suitesparse']
lapack_opt_info:
    libraries = ['lapack', 'blas']
    library_dirs = ['/usr/lib64']
    language = f77
    define_macros = [('NO_ATLAS_INFO', 1)]
atlas_info:
  NOT AVAILABLE
lapack_mkl_info:
  NOT AVAILABLE
blas_mkl_info:
  NOT AVAILABLE
atlas_blas_info:
  NOT AVAILABLE
mkl_info:
  NOT AVAILABLE
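For comparing the old and new listings without reading them line by line, here is a rough Python sketch that parses text in the format pasted above into a dictionary and pulls out the libraries linked under blas_opt_info. The function names parse_show_config and linked_libraries are hypothetical, the OLD and NEW strings are excerpts copied from the listings in this post, and the parser assumes this older "section:" / "key = value" layout, so it may not work on output from other NumPy or SciPy versions.

```python
import ast


def parse_show_config(text):
    """Parse 'section:' / 'key = value' text into {section: dict or None}."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(">>>"):
            continue  # skip blank lines and the interpreter prompt
        if line == "NOT AVAILABLE":
            if current is not None:
                sections[current] = None  # section present but unavailable
        elif " = " in line:
            if isinstance(sections.get(current), dict):
                key, value = line.split(" = ", 1)
                try:
                    # Lists and tuples are printed as Python literals.
                    sections[current][key] = ast.literal_eval(value)
                except (ValueError, SyntaxError):
                    # Bare words such as f77 stay as plain strings.
                    sections[current][key] = value
        elif line.endswith(":"):
            current = line[:-1]
            sections[current] = {}
    return sections


def linked_libraries(sections, name="blas_opt_info"):
    """Return the 'libraries' list of a section, or [] if it is missing."""
    info = sections.get(name) or {}
    return info.get("libraries", [])


# Excerpts copied from the old and new SciPy listings in this post.
OLD = """blas_opt_info:
libraries = ['blas']
library_dirs = ['/usr/lib64']
language = f77
define_macros = [('NO_ATLAS_INFO', 1)]
blas_mkl_info:
NOT AVAILABLE"""

NEW = """blas_opt_info:
libraries = ['openblas', 'openblas']
library_dirs = ['/usr/local/openblas/lib']
language = f77
blas_mkl_info:
NOT AVAILABLE"""

print("old:", linked_libraries(parse_show_config(OLD)))
print("new:", linked_libraries(parse_show_config(NEW)))
```

On these excerpts the old config lists the reference 'blas' library and the new one lists 'openblas', which matches the speedup I saw after rebuilding.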