Need DLL files to run the code


ying wu

Oct 20, 2014, 5:32:41 AM
to openbla...@googlegroups.com
Hi,
 
I am using OpenBLAS-v0.2.12-64bit on Windows 7 64-bit. The code compiles and links without problems under Visual Studio, but when I run it, it asks for libgfortran-3.dll, then libquadmath_64-0.dll, then libgcc_s_seh-1.dll. I cannot find libgcc_s_seh-1.dll. I guess all the required DLL files come from MinGW, which is not installed on my PC.
 
So could anyone kindly send me all the DLL files required to run the code with OpenBLAS? My email address is wuying...@gmail.com. I guess the files are not very big. Thanks very much.
 
Best wishes,
 
Ying
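
The loader errors above appear one DLL at a time, so it can help to check for all three names at once. The following is a minimal sketch, not part of OpenBLAS: `missing_dlls` and `file_exists` are hypothetical helpers, and the sketch only looks in one directory, whereas Windows also searches PATH and the system directories.

```cpp
#include <cassert>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: true if `path` can be opened for reading.
bool file_exists(const std::string& path)
{
    std::ifstream f(path.c_str(), std::ios::binary);
    return f.good();
}

// Hypothetical helper: return the names in `required` that cannot be opened
// inside `dir`. Only this one directory is checked.
std::vector<std::string> missing_dlls(const std::string& dir,
                                      const std::vector<std::string>& required)
{
    assert(!dir.empty() && "pass a directory such as \".\"");
    std::vector<std::string> missing;
    for (const std::string& name : required) {
        // A forward slash is accepted as a separator on Windows as well.
        if (!file_exists(dir + "/" + name)) {
            missing.push_back(name);
        }
    }
    return missing;
}
```

Calling `missing_dlls(".", {"libgfortran-3.dll", "libquadmath_64-0.dll", "libgcc_s_seh-1.dll"})` from the directory that holds the executable would list whichever of the three files are not there.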

Zhang Xianyi

Oct 20, 2014, 12:58:51 PM
to ying wu, openbla...@googlegroups.com
Hi Ying,

I have uploaded the missing mingw64 DLLs to sf.net.



Xianyi


Zhang Xianyi

Oct 21, 2014, 12:01:21 PM
to ying wu, openbla...@googlegroups.com
Hi Ying,

How about setting OPENBLAS_NUM_THREADS=4? Because your CPU only has 4 physical cores, I suggest using 4 threads.

Xianyi
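
One way to apply this suggestion in code is to cap whatever thread count is requested at the number of physical cores. This is only a sketch: `choose_thread_count` is a hypothetical helper, and the physical core count is passed in by hand because `std::thread::hardware_concurrency()` normally reports logical processors rather than physical cores.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical helper: pick a thread count from the text of an environment
// variable, capped at the number of physical cores. Falls back to the
// physical core count when the variable is unset or not a positive number.
int choose_thread_count(const char* env_value, int physical_cores)
{
    assert(physical_cores >= 1);
    if (env_value == nullptr) {
        return physical_cores;  // variable not set
    }
    char* end = nullptr;
    const long requested = std::strtol(env_value, &end, 10);
    if (end == env_value || requested < 1) {
        return physical_cores;  // empty, non-numeric, or non-positive value
    }
    return requested < physical_cores ? static_cast<int>(requested)
                                      : physical_cores;
}
```

With OpenBLAS linked in, the result could be passed on as `openblas_set_num_threads(choose_thread_count(std::getenv("OPENBLAS_NUM_THREADS"), 4));`, where 4 is the physical core count mentioned above.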

2014-10-21 3:48 GMT+08:00 ying wu <wuying...@gmail.com>:
Hi Xianyi,
 
Thanks very much for the DLLs. I got the code running without any problems.
 
However, there is one more thing I really need your help with. I used OpenBLAS on Mac OS before, with LLVM, and I remember it being extremely fast, at least much faster than Intel MKL. I could also turn on parallel execution, by which I mean I could use all the cores to run the code. I was therefore very happy with this library.
 
However, I recently had to switch back to Windows, and I have to use Visual Studio 2013 (C++). I have two problems:
 
The first is that I cannot turn on parallel execution. My environment is Windows 7 64-bit + Visual Studio 2013 + Armadillo + OpenBLAS. My code is built as a 64-bit release build. I use the OpenBLAS binary for Windows, downloaded from your website. I have turned on all the parallel settings in Visual Studio, and I tried adding OPENBLAS_NUM_THREADS as a system environment variable. I tried adding OPENBLAS_NUM_THREADS in Visual Studio. I tried adding #include "cblas.h" and openblas_set_num_threads(8) as a runtime setting. But my CPU usage is always below 15%, so obviously the parallel execution does not work. Does this mean that the binary on your website doesn't support parallel execution and I have to compile the library myself, or did I do something wrong? Could you kindly help me?
 
The other is that I find the speed slow under Windows. At least it is much slower than under LLVM and Mac OS. Maybe that is because parallel execution is not turned on. I attach the code I used below. Is this normal?
 
The CPU in my new workstation is an Intel i7 4900MQ (2.8GHz), which is a very powerful CPU. The funny thing is that I ran the same code on another, smaller laptop whose CPU is only an Intel i5-2520M (2.5GHz). On the small laptop I use Intel C++ as the compiler, with Intel MKL. With the same code below, the running time is THE SAME! How is this possible? My impression was that OpenBLAS should be faster than Intel MKL. I am confused. Maybe you have a better idea.
 
Thank you very much for your help. My work is machine learning, so your library is very important to me.
 
Best wishes,
 
Ying

#include "stdafx.h"   // Visual Studio precompiled header
#include <stdio.h>
#include <stdlib.h>
#include <conio.h>    // _getch (Windows only)
#include <iostream>
#include <time.h>
#include <armadillo>
#include "cblas.h"    // OpenBLAS: openblas_set_num_threads, openblas_get_config

using namespace arma;
using namespace std;

int main()
{
    openblas_set_num_threads(8);
    //goto_set_num_threads(4);

    // Print the library build configuration and its parallel mode.
    cout << openblas_get_config() << " parallel=" << openblas_get_parallel() << endl;

    //int m = 2000, p = 200, n = 1000;
    int m = 100, p = 10, n = 100;
    double alpha = 1.0, beta = 0.0;

    // A is m x p, filled with 1, 2, 3, ... in row-major order.
    mat A = zeros<mat>(m, p);
    for (int i = 0; i < m; i++)
    {
        for (int j = 0; j < p; j++)
        {
            A(i, j) = (double)(i*p + j + 1);
        }
    }

    // B is p x n, filled with -1, -2, -3, ... in row-major order.
    mat B = zeros<mat>(p, n);
    for (int i = 0; i < p; i++)
    {
        for (int j = 0; j < n; j++)
        {
            B(i, j) = -(double)(i*n + j + 1);
        }
    }

    mat C = zeros<mat>(m, n);

    // Time the same product with both arma::wall_clock and clock().
    arma::wall_clock timer;
    timer.tic();
    clock_t start = clock();

    for (int i = 0; i < 400000; i++)
    {
        C = alpha*(A*B) + beta*C;
    }

    double diff = (clock() - start) / (double)CLOCKS_PER_SEC;
    std::cout << diff << std::endl;
    printf("\n Computations completed.\n\n");
    cout << "took " << timer.toc() << " seconds" << endl;

    _getch();   // wait for a key press before the console window closes
    return 0;
}
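
The program above reports time from both `clock()` and `arma::wall_clock`. The C standard defines `clock()` as processor time, while Microsoft's runtime returns elapsed wall-clock time, so the two numbers can mean different things on different platforms. A portable wall-clock measurement can be sketched with `std::chrono::steady_clock`; `time_wall_seconds` is a hypothetical helper, not part of Armadillo or OpenBLAS.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: run `work` `repeats` times and return the elapsed
// wall-clock time in seconds, measured with a monotonic clock.
template <typename Work>
double time_wall_seconds(Work work, int repeats)
{
    assert(repeats >= 0);
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < repeats; ++i) {
        work();
    }
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}
```

In the program above, the timed loop could then be written as `time_wall_seconds([&] { C = alpha*(A*B) + beta*C; }, 400000)`.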


Zhang Xianyi

Oct 21, 2014, 2:47:19 PM
to ying wu, openbla...@googlegroups.com
Which OpenBLAS binary are you using? Please try

OpenBLAS-v0.2.12-Win64-int32.zip


2014-10-22 2:32 GMT+08:00 ying wu <wuying...@gmail.com>:
Hi Xianyi,

I already tried 1, 2, 4, 8, and 16 threads, but it made no difference. This is why I suspect either the downloaded binary or VC++. This is quite annoying. I could try MinGW, but I would prefer not to change the compiler. Any thoughts? Thanks very much.

Best wishes,

Ying




 

Zhang Xianyi

Oct 23, 2014, 3:54:34 AM
to ying wu, openbla...@googlegroups.com
Could you provide the configuration file for Armadillo?

I want to make sure it is calling OpenBLAS correctly.

2014-10-22 16:30 GMT+08:00 ying wu <wuying...@gmail.com>:
That is the binary package from http://www.openblas.net/, and OpenBLAS-v0.2.12-Win64-int32.zip is what I am trying now. Sorry I didn't make that clear. Thanks, Xianyi.
 
Ying

Zhang Xianyi

Oct 28, 2014, 1:57:30 AM
to ying wu, openbla...@googlegroups.com
The config file looks fine.

Could you try some bigger matrices?
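
To see how much the matrix size matters, the work per multiplication can be estimated with the usual count of about 2·m·p·n floating-point operations for an (m x p) times (p x n) product. `gemm_flops` below is a hypothetical helper written for this estimate only.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: approximate floating-point operations for C = A*B
// with A of size m x p and B of size p x n. Each of the m*n outputs needs
// p multiplications and about p additions.
std::int64_t gemm_flops(std::int64_t m, std::int64_t p, std::int64_t n)
{
    assert(m >= 0 && p >= 0 && n >= 0);
    return 2 * m * p * n;
}
```

For the sizes in the posted program (m = 100, p = 10, n = 100) this gives about 200 thousand operations per product, while the commented-out sizes (m = 2000, p = 200, n = 1000) give about 800 million. With so little work per call at the small size, threading overhead is one plausible reason the extra threads show little effect.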

2014-10-23 17:58 GMT+08:00 ying wu <wuying...@gmail.com>:
Hi Xianyi,
 
I am attaching the armadillo configuration file.
 
There are some new findings: I finally set up MinGW and MSYS. I compiled OpenBLAS from source on my workstation and compared it with VC++ 12.
 
Without OpenBLAS (Armadillo only):
VC     37.62 s
GCC    22.28 s

** So GCC can be faster than VC++. Without OpenBLAS, Armadillo itself does the matrix computation.

With OpenBLAS and Armadillo:
VC     16.50 s
GCC     7.5 s

** I can see that OpenBLAS improves the speed greatly. At least I finally have good speed now.
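
The improvement can be put in numbers by dividing the time without OpenBLAS by the time with it. `speedup` is a hypothetical helper used only for this arithmetic.

```cpp
#include <cassert>

// Hypothetical helper: how many times faster `after_seconds` is than
// `before_seconds`. Both arguments must be positive.
double speedup(double before_seconds, double after_seconds)
{
    assert(before_seconds > 0.0 && after_seconds > 0.0);
    return before_seconds / after_seconds;
}
```

With the timings above, `speedup(37.62, 16.50)` is about 2.3 for VC and `speedup(22.28, 7.5)` is about 3.0 for GCC.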
 
 
HOWEVER, why can I still not run the code in parallel? I tried everything I had tried before, and the CPU usage is always under 20%. If I want to use multiple cores with OpenBLAS, how can I do that under Windows? OpenBLAS is compiled with NO_AFFINITY. Should I set NO_AFFINITY=0 and use OpenMP?
 
Thanks,
 
Ying 