Did you link with the math library? (-lm)
> Demo_LAPACK.obj : error LNK2019: unresolved external symbol _DGESV
> referenced in function _main
Did you link with the (Goto)BLAS? Is the symbol for DGESV in the
GotoBLAS library the same as the extern symbol used in the code example?
(There may be a difference between Fortran symbols and C symbols -- some
compilers bind Fortran symbols to their lowercase versions with an
underscore appended.)
You should use a pastebin website to show us the error messages, rather
than attaching them all in a very long e-mail sent to everyone on the
list ;-)
http://en.wikipedia.org/wiki/Pastebin
mfh
You have to use -mno-cygwin if you want to use the libraries with VS. Have
you used it? Without it, linking will not work.
You may need to rename DGESV in the code. It depends on how it is named
in GotoBLAS.
Using libraries compiled by different compilers is not a trivial task.
For C++, I guess, it will not work in general. For C and Fortran 77 it
may work if you are lucky. Actually, if you try to mix different
system libraries (-MD, -MT and so on) with VC, you will have the same
problem. For the latter please look at
http://support.microsoft.com/kb/154753
I see the page in German, but I hope it will appear for you in your language.
If you use different versions of a compiler, you may again have the same
problem. Anyway, basically you have to be patient and understand what is
going on. Everything is possible and the opportunities are endless. The
question is how much time it will take. It might just be faster to
recompile everything with one compiler.
Your friend here is nm under Cygwin. It tells you which names are defined
in the libraries.
Unfortunately I did not have time yet to look at GotoBLAS. But let us
compile the code with ATLAS. Do not forget that libraries from TAUCS are
compiled with -MT. So let us start with -MT
$ cl -MT -EHsc dgesv.cpp liblapack.lib libf77blas.lib libcblas.lib
libatlas.lib vcf2c.lib
dgesv.obj : error LNK2019: unresolved external symbol
"_DGESV" referenced in function "_main".
dgesv.exe : fatal error LNK1120: 1 unresolved externals.
Well, _DGESV is missing and this is correct as ATLAS uses different BLAS
names. We can easily figure this out with nm
$ nm liblapack.lib | grep -i dgesv
00000010 T _dgesv_
Note that I knew that this must be in liblapack. In the general case you
have to search over all the libraries. So we see that we need to use
dgesv_. If now I edit dgesv.cpp and change DGESV to dgesv_, then the
command above compiles.
As the next step, let us switch to -MD
$ cl -MD -EHsc dgesv.cpp liblapack.lib libf77blas.lib libcblas.lib
libatlas.lib vcf2c.lib
There are already many missing symbols, as vcf2c.lib was compiled with
-MT. We have already discussed this and found a combination of libraries
that works with -MD
http://matrixprogramming.com/TAUCS/taucs-MD.html
See Using ATLAS as optimized BLAS
Let us try it
$ cl -MD -EHsc dgesv.cpp liblapack.lib libf77blas.lib libcblas.lib
libatlas.lib libg2c.lib libgcc.lib
On my machine everything is okay.
Once more: patience, patience, and patience. This is what is necessary if
you want to mix libraries from different compilers.
*nods* let me also reassure you that this is the case. Mr. Goto has
done an excellent job here, even if his spelling in the README isn't
so good.
> Here is the output of nm on the GotoBLAS lib http://pastebin.com/f14d74312
> (thanks Mark). GotoBLAS too uses _dgesv_ .
Sorry if I was a bit mean about using pastebin -- it's just that it's
much easier to read short e-mails ;-)
> Quick question: its just occurred to me most of the other linker
> errors I'm getting are about not finding the symbols _clock, _rand,
> _atoi etc. (used in dgesv.cpp). Why are the functions(?) prepended
> with an underscore?
Different compilers have different conventions for transforming the
name of a function, method, or globally visible variable into
a symbol in the machine code. Those machine code symbols are the
"actual names" that represent the "interface" of a library or object
file. This transformation process is often called "name mangling"
(especially in the context of C++, which actually does _mangle_ the
names almost beyond recognition). Name mangling is why you need to
use extern "C" to declare the interface of a C function in a C++ file;
otherwise the C++ compiler assumes that the function is a C++ function
and performs C++ mangling on the name to get the linker symbol.
mfh
Why not just use a static library?
mfh
It seems that the question was about something else. The compiler
usually prepends an underscore to functions to distinguish them from
internal symbols, if I remember correctly. Anyway, the leading
underscore seems to be quite common. Fortran compilers, on the other
hand, traditionally both prepend and append an underscore.
Name mangling comes on top of this. The underscores already happen
with Fortran and C.
Cygwin is basically a tool for porting Unix code to Windows. As such, it
emulates Unix calls. To achieve this, it uses cygwin1.dll, and code
compiled with gcc depends on it. Or one could say that Cygwin by default
uses Unix-like system libraries that in turn depend on cygwin1.dll.
Hence, the object file produced by default basically contains Unix-like
system calls, and it is hard to link it with VS. It could probably be
made to work, but with a lot of headache.
An alternative to Cygwin is MinGW, which allows us to use gcc but
targets Win32 directly. MinGW is now a part of Cygwin, and one turns it on
with the option -mno-cygwin. However, this way the code must not make
Unix system calls, only standard C library calls or Win32 calls. It
might be possible to insert -mno-cygwin into the GotoBLAS build, but if it
uses Unix system calls, then it will not work.
The DLL solution seems to be the best. This way the code will
depend on cygwin1.dll, but this should not be a problem for you. A legal
issue with cygwin1.dll is that you must either distribute your code under
the GPL or purchase a special license to distribute cygwin1.dll.
I see, the question was about something particular to VS. (Not all
compilers prepend underscores to functions.)
> Name mangling comes on top of this. The underscores already happen
> with Fortran and C.
I usually use the term "name mangling" to refer to any compiler's
convention for transforming programmer-visible symbols into
linker-visible symbols, regardless of the language. I suppose that's
a quirk of mine ;-)
mfh
In my experience it is quite common. It is certainly not VS specific.
Say, gcc does it, and actually I do not remember a compiler that does not
do it.
Well, it was probably not the case on the BESM-6,
but that was so long ago that I do not remember anymore.
g77 and gfortran both append an underscore to Fortran symbols. I've
encountered some compilers (IBM's xlf, perhaps?) that do not.
The Fortran 2003 standard introduced the ISO_C_BINDING module, which
lets you declare Fortran functions to have C binding (so you can specify
what symbol should be exported to the linker). It also lets you pass
scalars by value instead of by pointer. I'm using ISO_C_BINDING in a
project now and it's a big help, especially in mixed Fortran / C
projects. It's also handy if you want to roll your own C BLAS interface.
mfh
Actually, you can control it in g77 with -fno-underscoring. But this is
only one part of the story. The second part is that all these
compilers, including the C compilers, prepend an underscore. And I am not
sure that this behavior can be changed.
I should confess that I have never compiled CLAPACK. I have just used
LAPACK directly. Clearly one needs a Fortran compiler to compile LAPACK,
but then one can call it from C and C++.
From the docs it seems that they have used f2c to convert the Fortran code
to C and then compiled it. I would expect that compiling the
Fortran code directly should be faster.
Yet, if there is no Fortran compiler at hand, then CLAPACK could be a
solution.
But why do you use these strange names in your code? You cannot expect
to find them in any normal BLAS library. Say, ATLAS does not have them.
It seems that this is very CLAPACK specific.
There are standard BLAS functions
http://www.netlib.org/lapack/lug/node145.html
Well, the word standard unfortunately does not say what these
names will look like in the library: uppercase or lowercase, with an
appended underscore or not. So, the first goal is to learn how
these functions are named in a particular BLAS, say GotoBLAS, and then
just use those names.
I understand. But my point was that these names come from your code. If
you change them in your code to those that are available in GotoBLAS,
then it may work. Also, I see that GotoBLAS already has some LAPACK
routines; for example, dgesv seems to be already there.
The reason for that is that LAPACK routines actually should be tuned as
well. At least this concerns ILAENV
http://www.netlib.org/lapack/lug/node131.html
but usually people tune other routines as well. So GotoBLAS and ATLAS
are actually more than just an optimized BLAS. It might well be that this
is just enough for you.
If not, then for example ATLAS has instructions on how to build a
full LAPACK
http://math-atlas.sourceforge.net/errata.html#completelp
Check what the GotoBLAS docs say.
I guess this sounds pretty confusing, but life is usually a mess
anyway.
> 1. The sample code is not mine. Its a CLAPACK example.
>
> 2. GotoBLAS indeed implements several LAPACK routines. Kazushige Goto,
> the author of the library confirms this.
Look in the lapack directory in Goto's BLAS' source distribution to see
exactly which LAPACK routines Goto has implemented, to see if it affects
you. I see only various LU and Cholesky factorizations in there.
mfh
Well, you are trying to use it. I am really surprised that they chose
these strange names.
> 2. GotoBLAS indeed implements several LAPACK routines. Kazushige Goto,
> the author of the library confirms this. I imagine such LAPACK
> functions implemented in BLAS are "superior" to their native/untuned
> counterparts. If so, I wonder how the linker(?) can be made to use the
> BLAS versions instead of the LAPACK ones. Will placing the GotoBLAS
> lib before the LAPACK lib guarantee this?
Linking with two libraries that define the same symbol is tricky and
highly linker dependent.
A better solution is to leave only one, the right symbol. For example,
this is how it is done in ATLAS.
When you start working with any BLAS and LAPACK library, the first
question is how the functions are named there. You can check it with nm
nm lapack.lib | grep -i dgesv
By default, Intel Fortran on Windows uses uppercase and does not
append an underscore. If this is the case with the libraries, you need to
rename the functions in the sample: dgesv_ to DGESV and dgels_ to DGELS.
DGESV is correct. The compiler always prepends the underscore. The next
step would be to check how this function is named in the object file. Run
nm on the object file.
However the message
unresolved external symbol "void __cdecl _DGESV
says that it is okay.
Just double check that the linker searches the library. With gcc it should
be on the search path (-L), and additionally it should be named on the
command line (-l). Well, for cl the flags are different.
Correct. This is the name mangling in C++. Use extern "C" before the
function name, that is
extern "C" void dgesv_(...);
My guess is that the missing functions belong to the system libraries of
the Intel Fortran compiler. It seems that you have to link with them as
well. If no luck, ask the author of the library.
> BTW, I'm compiling in Visual Studio, not the command-line.
This does not matter provided everything is done correctly.
Great work! You should know that this whole process of
reverse-engineering the linker and putting together a bunch of libraries
in order to get the right answer is something that occupies a lot of
time for folks like us. In particular, don't feel bad that it took this
long!
> Next: (Goto)BLAS integration and scaling the linear system.
>
> Evgenii, perhaps you could make a page about this to help the next
> n00b that comes around asking for help on how to use LAPACK in a VC++
> project.
I shouldn't speak for Evgenii, but when I encounter a similarly difficult
problem, it's always useful for me to write such a page myself and
then let experts critique it. That can be humbling, though, as I've found
out myself ;-)
mfh
A very good paper in this respect is
David M. Beazley, Brian D. Ward, and Ian R. Cooke,
The Inside Story On Shared Libraries and Dynamic Loading,
Computing in Science & Engineering, September/October 2001, N 5, p. 90-97.
There is also a very good book
John R. Levine, Linkers and Loaders
http://linker.iecc.com/
but it is more academic in nature.
>
>>> Next: (Goto)BLAS integration and scaling the linear system.
>
> I'm pleased to announce that I've successfully replaced the reference
> BLAS with GotoBLAS compiled for my system :-) . Also, I've managed to
> reduce the list of fortran libraries to libifcorertd.lib and
> libmmdd.lib.
Good news. Does GotoBLAS on Windows understand uppercase BLAS names
without an underscore?
...
>>> Evgenii, perhaps you could make a page about this to help the next
>>> n00b that comes around asking for help on how to use LAPACK in a VC++
>>> project.
>> I shouldn't speak for Evgenii, but it's always useful for me when I
>> encounter a similarly difficult problem, to write such a page myself and
>> then let experts critique it.
>
> I don't mind writing such a doc with input from you guys, if Evgenii
> will put it up on matrixprogramming.com. I could put it on some blog,
> but I think its best if all the useful information remained under "one
> roof".
It would be my pleasure to put such a document on the site.
Well, this shows that GotoBLAS actually accepts dtrsv_ as the only
allowable name. I do not know what the other names mean, but U stands for
undefined anyway. So, this is an interesting question: how is it
possible that LAPACK compiled with Intel Fortran on Windows is able
to call dtrsv_ directly? In my understanding it should call DTRSV instead.
Well, this just shows that there are many things that are beyond
normal understanding. Yet, fortunately, they work!
Generally, yes. The idea is that if you "just want the answer," the
nonexpert routine should just return a reasonable answer. If you
really understand what the error means and you want to compute it, you
should call the expert routine.
> It would be interesting to see how expert routine DGESVX performs in
> comparison to the driver routine DGESV i.e. if the error computation
> slows it down significantly. Problem is that some parts of the
> manual [ http://cc.in2p3.fr/doc/phpman.php/man/dgesvx/l ] are hard to
> decipher. Can someone please help?
You should read the LAPACK Users' Guide instead (it is available
online for free as in beer). The terms used in the manual pages will
make more sense once you do that.
In this case, iterative refinement (which both improves the solution's
accuracy and computes error bounds) costs a (usually) constant number
of dense matrix-vector multiplies. It should be much less
time-consuming than factoring the matrix, unless the matrix is very
very small.
mfh
Actually it should be possible to use BLAS routines to compute the error.
I was too lazy to call them.
By the way, have you turned the safe iterators off in VS? See two
discussions about it - follow the links at the bottom of the next page
Wouldn't that be ScaLAPACK, rather than LAPACK? OOC ScaLAPACK was
designed to exploit parallelism in the following ways:
1. on the panel factorizations and updates (which are performed using
ordinary ScaLAPACK factorization routines, if I recall correctly), and
2. on disk reads and writes (assuming your disk hardware and software
support that).
As for #1, ScaLAPACK was designed with clusters and distributed memory
in mind. It can be hard (perhaps impossible?) to run with only one
processor. If you use a shared-memory MPI backend, you can productively
use ScaLAPACK on a single node with multiple processors, but it's known
that approaches specific to the shared-memory domain tend to perform
better. (I'm not talking about OOC yet.)
As for #2, the only way you'll get anything out of that on a single node
is if you have configured a RAID array of disks with striping to improve
aggregate disk bandwidth. On a cluster, you would need a
high-performance parallel filesystem (not NFS, something like Lustre or
GPFS).
> ] using Cygwin. It doesn't seem to perform any better than the basic
> LAPACK, and that's probably why it's remained a prototype for more than
> 14 years (it's dated 1995).
You should be proud you got it to work at all -- I'm impressed actually
;-) There has been some more recent work on parallel OOC solvers -- the
FLAME people (van de Geijn and ilk at UT Austin) worked on this.
mfh
...
> I've heard of the FLAME project, although I'm not entirely sure that I
> understand what they are trying to do. Is it a framework, a library,
> or both? Anyway, page 6 of the FLAME complete reference document
> states that the OOC feature is currently not available but that
> anyone interested in such functionality should contact their Spanish
> colleagues, which I did, but I got mixed messages from them; namely:
> the library is commercial, stability issues mean it cannot be
> integrated into (or distributed with) liblflame etc.
At one point I ordered and read the book written by the FLAME authors (it
is actually free to download):
The Science of Programming Matrix Computations
http://www.lulu.com/content/1911788
The book is great, but I have decided not to try FLAME. There are many
interesting things in this world, but the available time is unfortunately
limited. In this respect you may want to look at a small document that I
have recently written
Engineering Computing: Mixing Knowledge Transfer, Programming, and Numerics
http://evgenii.rudnyi.ru/doc/misc/EngineeringComputing.html
At some point it is important to choose the main goal and try to reach
it by just cutting the extra branches. Well, hopefully there will be
some more time after retirement.
OOC is for solving problems that don't fit in RAM. LAPACK is designed
to reduce memory bandwidth requirements (and therefore increase
computational intensity) between cache and RAM. The problem is, the
relationship between virtual memory (or rather, disk swap space) and
RAM is very different than the relationship between RAM and cache:
1. The cache reads lines of data (usually 32, 64, or 128 bytes) from
RAM, whereas virtual memory comes in pages at a time (often >= 4K).
Bringing in a page from disk swap space to RAM, or pushing it out
again, is a much more expensive operation than bringing in a cache
line from RAM.
2. The operating system has to change between processes every so often
and every time it does that, it has to bring in a page belonging to
that process (since different processes aren't allowed to share memory
pages generally). So not only does the disk have to grind to fetch
and put back pages from your app, it has to grind whenever the OS
decides to let some other process (like a system daemon or your GUI)
run. This is also true for cache and DRAM, but a cache line fetch is
handled by hardware and has a latency of about 100 cycles, whereas
swapping pages is a huge system call kind of thing and disk latency
could be around a million cycles.
3. RAM may be >1000x as big as (L2 or L3) cache, but virtual memory
isn't much bigger than RAM itself. Thus, it's still easy to run out
of virtual memory, which is generally a disaster (on Linux with the
default settings, it results in some random process on the machine
being killed).
This means that OOC codes generally take the effort to manage data
movement between RAM and disk explicitly. They bound the amount of
data in RAM, and carefully do as little data movement between RAM and
disk as possible. LAPACK doesn't do this; it relies on the properties
of a cache in order to reduce data movement.
> I've heard of the FLAME project, although I'm not entirely sure that I
> understand what they are trying to do. Is it a framework, a library,
> or both?
Both, sort of. You can use it as a library, or you can use it to
experiment with different matrix partitionings in the three one-sided
factorization algorithms (Cholesky, LU, QR).
> Anyway, page 6 of the FLAME complete reference document
> states that the OOC feature is currently not available but that
> anyone interested in such functionality should contact their Spanish
> colleagues, which I did, but I got mixed messages from them; namely:
> the library is commercial, stability issues mean it cannot be
> integrated into (or distributed with) liblflame etc.
That's unfortunate. Van de Geijn seemed quite proud of their work
when he contacted us to clarify a point in one of our papers. If you
have a strong interest in the OOC feature and would find it very
useful for your work, I would recommend explaining the situation to
him and asking him to intervene so you can get the code or at least a
library. If that doesn't work, let me know and I'll try talking to
him.
mfh
First, you should get the documentation for LAPACK functions from
www.netlib.org, as it is most likely to be up to date. (Nothing bad
about U of Utah per se.)
Second, any LAPACK function that has both a WORK and an LWORK parameter
lets you determine the best LWORK value by setting LWORK = -1 and
passing in a one-element array for WORK. If the call succeeds (INFO =
0), then WORK(1) will have been set to the optimal LWORK value. Then
you can set LWORK to that value, allocate the WORK array to that length
and call again.
The LWORK = -1 procedure is called an "LWORK query" and should be
discussed in the LAPACK Users' Guide.
mfh
If you look at the code (first follow
http://www.netlib.org/lapack/double/dsysv.f
and then
http://www.netlib.org/lapack/double/dsytrf.f
), you will see that this query is very fast. So, not every call leads to
a real solve. It is useful to try to read the code. Otherwise, it is
simpler just to run a test.
In fact, all you need for an LWORK query are the problem dimensions;
you don't even need to have allocated the arrays containing the
problem data, because the LWORK query won't touch them. The LWORK
query does some error checking which can save you the trouble of doing
it yourself.
mfh