I'm trying to build julia on FreeBSD (FreeBSD 8.2-RELEASE amd64).
After a few tweaks I'm now stuck at:
llvm[4]: Installing Release Archive Library
/usr/home/peer/julia/external/root/lib/libprofile_rt.a
CC src/support/dirpath.o
LINK src/support/libsupport.a
CC src/flisp/flisp
assert-failed: (equal? +nan.0 +nan.0)
in file unittest.lsp
#0 (lambda)
gmake[2]: *** [flisp] Error 1
gmake[1]: *** [flisp/libflisp.a] Error 2
gmake: *** [julia-release] Error 2
Any ideas what would cause this?
How could I further debug it?
Cheers,
Peer Stritzinger
$ gcc -g -mtune=native -o diag diagnose_mxcsr.c
$ ./diag
mxcsr was 0x1f80
So it looks as if bit 7 is set ...
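For reference, here is a small sketch of how the value 0x1f80 breaks down. It assumes the standard x86 MXCSR layout (bit 6 is DAZ, bits 7 to 12 are the six exception masks, bit 15 is FTZ). The macro and function names are made up for illustration, and this is not the diagnose_mxcsr.c that produced the output above.

```c
#include <stdint.h>
#include <stdio.h>

/* MXCSR bit positions (x86 SSE control/status register). */
#define MXCSR_DAZ   (1u << 6)     /* denormals-are-zero */
#define MXCSR_IM    (1u << 7)     /* invalid-operation exception mask */
#define MXCSR_MASKS (0x3fu << 7)  /* all six exception masks, bits 7..12 */
#define MXCSR_FTZ   (1u << 15)    /* flush-to-zero */

/* Hypothetical helper: returns 1 if all of the given bits are set. */
static int mxcsr_has(uint32_t mxcsr, uint32_t bits)
{
    return (mxcsr & bits) == bits;
}

/* Hypothetical helper: prints the fields that matter for this thread. */
static void mxcsr_describe(uint32_t mxcsr)
{
    printf("mxcsr = 0x%04x\n", (unsigned)mxcsr);
    printf("  invalid-operation masked (bit 7): %d\n", mxcsr_has(mxcsr, MXCSR_IM));
    printf("  all exception masks set:          %d\n", mxcsr_has(mxcsr, MXCSR_MASKS));
    printf("  DAZ (bit 6):                      %d\n", mxcsr_has(mxcsr, MXCSR_DAZ));
    printf("  FTZ (bit 15):                     %d\n", mxcsr_has(mxcsr, MXCSR_FTZ));
}
```

Calling mxcsr_describe(0x1f80) shows bit 7 set together with the other five mask bits, and DAZ and FTZ clear.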
Cheers,
-- Peer
Took a while since I had trouble rebuilding (needed a "make cleanall",
otherwise it seems to skip the tests).
If I comment out only the offending NaN test, the other NaN tests seem
to pass and then I get:
CC src/flisp/flisp
assert-failed: (equal? 0.0 0.0)
in file unittest.lsp
#0 (lambda)
gmake[2]: *** [flisp] Error 1
gmake[1]: *** [flisp/libflisp.a] Error 2
gmake: *** [julia-release] Error 2
It seems to be a problem with equal? for same literal float values.
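To make the distinction these asserts draw concrete, here is a small C sketch. It assumes that equal? is meant to compare the stored bit patterns of the two doubles, while = is the usual IEEE comparison; that reading is my assumption from the unit tests, and the function names are mine, not flisp's.

```c
#include <string.h>

/* IEEE comparison: a NaN never equals anything (itself included),
   and -0.0 compares equal to 0.0. */
static int ieee_equal(double a, double b)
{
    return a == b;
}

/* Bit-pattern comparison: identical bits means equal, so a NaN equals
   itself and -0.0 differs from 0.0. memcmp reads both objects as raw
   bytes, which is permitted for any object type. */
static int bits_equal(double a, double b)
{
    return memcmp(&a, &b, sizeof a) == 0;
}
```

Under that assumption, (equal? +nan.0 +nan.0) and (equal? 0.0 0.0) correspond to bits_equal and should both be true, while (= +nan.0 +nan.0) corresponds to ieee_equal and is false.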
So I commented out the (equal? 0.0 0.0) assert as well.
Now the remaining tests seem to pass, but later on it can't read flisp.boot:
CC src/flisp/flmain.o
CC src/flisp/flisp
FLISP src/julia_flisp.boot
fatal error:
(io-error "file: could not open \"flisp.boot\"")
gmake[1]: *** [julia_flisp.boot] Error 1
Summary:
=======
On FreeBSD 8.2 on an amd64 system I have to disable these tests:
diff --git a/src/flisp/unittest.lsp b/src/flisp/unittest.lsp
index 9ebd491..3b0df0e 100644
--- a/src/flisp/unittest.lsp
+++ b/src/flisp/unittest.lsp
@@ -77,7 +77,7 @@
(assert (equal? (string 'sym #byte(65) #wchar(945) "blah") "symA\u03B1blah"))
; NaNs
-(assert (equal? +nan.0 +nan.0))
+;;;(assert (equal? +nan.0 +nan.0))
(assert (not (= +nan.0 +nan.0)))
(assert (not (= +nan.0 -nan.0)))
(assert (equal? (< +nan.0 3) (> 3 +nan.0)))
@@ -92,7 +92,7 @@
; -0.0 etc.
(assert (not (equal? 0.0 0)))
-(assert (equal? 0.0 0.0))
+;;;(assert (equal? 0.0 0.0))
(assert (not (equal? -0.0 0.0)))
(assert (not (equal? -0.0 0)))
(assert (not (eqv? 0.0 0)))
and then I'm stuck at the next step.
But flisp.boot is there, readable and unchanged:
$ ls -l src/flisp/flisp.boot
-rw-r--r-- 1 peer staff 36288 Feb 22 14:59 src/flisp/flisp.boot
You'd won the bet :-)
This pointed me in the right direction: on FreeBSD this is usually
done with sysctl.
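A rough sketch of the sysctl approach, for anyone following along. The function name exe_path is hypothetical and this is not the get_exename from the pull request; the Linux branch (reading /proc/self/exe) is only there so the sketch compiles and runs somewhere other than FreeBSD.

```c
#if defined(__linux__)
#define _POSIX_C_SOURCE 200809L  /* readlink() under strict -std modes */
#endif

#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>
#if defined(__FreeBSD__)
#include <sys/sysctl.h>
#endif

/* Hypothetical helper: writes the absolute path of the running
   executable into buf. Returns 0 on success, -1 on failure. */
static int exe_path(char *buf, size_t size)
{
    if (buf == NULL || size == 0)
        return -1;
#if defined(__FreeBSD__)
    /* KERN_PROC_PATHNAME with a pid of -1 means the current process. */
    int mib[4] = { CTL_KERN, KERN_PROC, KERN_PROC_PATHNAME, -1 };
    size_t len = size;
    if (sysctl(mib, 4, buf, &len, NULL, 0) != 0)
        return -1;
    return 0;
#elif defined(__linux__)
    /* readlink() does not NUL-terminate, so leave room and do it here. */
    ssize_t n = readlink("/proc/self/exe", buf, size - 1);
    if (n < 0)
        return -1;
    buf[n] = '\0';
    return 0;
#else
    return -1;  /* platform not covered by this sketch */
#endif
}
```

Knowing its own path lets the binary look for files such as flisp.boot relative to the executable rather than the current directory.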
Added a FreeBSD version of get_exename and a few hacks further on:
$ uname -s -r -m
FreeBSD 8.2-RELEASE amd64
./julia
_
_ _ _(_)_ |
(_) | (_) (_) | A fresh approach to technical computing
_ _ _| |_ __ _ |
| | | | | | |/ _` | | Version 0.0.0-prerelease
| | |_| | | | (_| | | Commit 3c3e0aecef (2012-02-21 06:58:08)*
_/ |\__'_|_|_|\__'_| |
|__/ |
julia> 1+2
3
There still is the issue with the broken unit test ...
I'll submit my changes to get to this point and write up a short
description how to get there.
Sent pull request https://github.com/JuliaLang/julia/pull/448
With this and following the description I added to the README.md it
should be easy to reproduce this.
If I can help out debugging the flisp tests please let me know.
-- Peer
GDB works for me for julia and flisp -- any suggestions how to debug this?
$ gdb ./julia
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...
(gdb) b main
Breakpoint 1 at 0x41aaf4
(gdb) r
Starting program: /usr/home/peer/julia/julia
[New LWP 100331]
[New Thread 801a041c0 (LWP 100331)]
[Switching to Thread 801a041c0 (LWP 100331)]
Breakpoint 1, 0x000000000041aaf4 in main ()
$ gdb src/flisp/flisp
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...(no debugging
symbols found)...
(gdb) b main
Breakpoint 1 at 0x41dfd0
(gdb) r
Starting program: /usr/home/peer/julia/src/flisp/flisp
(no debugging symbols found)...(no debugging symbols found)...(no
debugging symbols found)...
Breakpoint 1, 0x000000000041dfd0 in main ()
(gdb) s
Single stepping until exit from function main,
which has no line number information.
; _
; |_ _ _ |_ _ | . _ _
; | (-||||_(_)|__|_)|_)
;-------------------|----------------------------------------------------------
> I am using memcmp() to compare the data of the 2 arguments; while this
> is a bit strange I'm not sure why it wouldn't work. The two NaNs
> originate from the same bit pattern in the reader.
> Not sure what to try next.
Let me know if I can help.
-- Peer
If I build flisp-debug by running "gmake debug" in the flisp directory,
(equal? +nan.0 +nan.0) still returns #f, as in the non-debug version.
The debug version also runs under gdb, and it returns #f there as
well.
Without understanding any of flisp's internals, I found out that a breakpoint in
equal_lispvalue is triggered by the equal? call. Actually it's
triggered multiple times until the REPL prints the #f. I can't make
anything of the values passed since they are only shown as numbers;
I can only say that sometimes they are the same and sometimes they are
different.
If you can tell me where to look this would be no problem at all.
-- Peer
return *(uint64_t*)&da == *(uint64_t*)&db;
I've seen problems with this kind of type-punning before. Could you
try replacing the pointer cast with a union?
On Sun, Feb 26, 2012 at 9:47 PM, Jeff Bezanson <jeff.b...@gmail.com> wrote:
> OK I found something worth trying. On src/support/operators.c:184:
>
> return *(uint64_t*)&da == *(uint64_t*)&db;
>
> I've seen problems with this kind of type-punning before. Could you
> try replacing the pointer cast with a union?
Once you mentioned this, it smelled like a strict aliasing violation to
me. So I built julia with the flag -fno-strict-aliasing in CFLAGS.
And voilà, julia builds without any failing test case.
This also explains why you can't see the problem when building for
debugging (strict aliasing only has an effect when using -O2 or
higher in gcc).
Basically, strict aliasing means the compiler assumes that
pointers to objects of different types never refer
to the same memory location.
Best explanation of this can be found here:
http://cellperformance.beyond3d.com/articles/2006/06/understanding-strict-aliasing.html
More info here:
http://stackoverflow.com/questions/98650/what-is-the-strict-aliasing-rule
This rule is spelled out in C99, and gcc now relies on it by
default when optimizing. It makes some optimizations possible, like keeping
pointed-to values in registers, because the compiler can be sure that
the same memory location is not modified through another pointer of a different type.
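As an illustration, here is one commonly used way to get at the bits of a double without casting the pointer: copy the bytes with memcpy. This is a sketch of my own, not code from julia or flisp; the function names are hypothetical, and it assumes double and uint64_t have the same size (checked at compile time below).

```c
#include <stdint.h>
#include <string.h>

/* Compile-time check of the size assumption (array size -1 is an error). */
typedef char double_is_64_bits[sizeof(double) == sizeof(uint64_t) ? 1 : -1];

/* Copy the object representation of a double into a uint64_t.
   memcpy accesses the bytes as characters, so no object is read through
   a pointer of an incompatible type. */
static uint64_t double_bits(double d)
{
    uint64_t u;
    memcpy(&u, &d, sizeof u);
    return u;
}

/* Same comparison as *(uint64_t*)&da == *(uint64_t*)&db,
   but without the pointer cast. */
static int same_bits(double da, double db)
{
    return double_bits(da) == double_bits(db);
}
```

Optimizing compilers typically turn the memcpy into a plain register move, so this should not cost anything compared to the cast.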
The way around this is consistent use of unions for all these cases
all over the code. There is a warning in gcc that can show you some
places where this is violated. Unfortunately gcc is very inconsistent
with these warnings (IIRC there are two levels of the warning: one
that shows too few and one that shows too many problems; gcc might
even miss some places at the second level).
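A minimal sketch of what the union version of the comparison from operators.c could look like. The union and function names are hypothetical, and I'm assuming a 64-bit double; gcc documents reading a union member other than the one last written as supported, even with strict aliasing enabled.

```c
#include <stdint.h>

/* Hypothetical union overlaying a double with a 64-bit integer. */
union double_u64 {
    double   d;
    uint64_t u;
};

/* Compare two doubles by bit pattern: write through the double member,
   read back through the integer member. */
static int union_bits_equal(double da, double db)
{
    union double_u64 a, b;
    a.d = da;
    b.d = db;
    return a.u == b.u;
}
```

With this, a NaN compares equal to a NaN with the same bit pattern, and -0.0 stays distinct from 0.0, which is what the flisp unit tests expect from equal?.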
My suggestion is:
* switch on -fno-strict-aliasing and then either:
- work with the warnings and maybe other tools to root out all
aliasing problems then switch it off again.
- decide you want to be able to use an aliasing style and keep it switched on.
Personally I'm in favor of keeping -fno-strict-aliasing, since
aliasing problems cause hard-to-detect bugs (I got burnt by
debugging a real-time OS scheduler for 6 weeks full time because of an
aliasing problem). With -fno-strict-aliasing, C basically behaves the way
it is still widely believed to work (but no longer does by default).
OTOH performance of C code might improve if strict aliasing is observed.
If performance is chosen over possible incorrectness here, some way
should be in place to keep aliasing from creeping back in.