Porting Julia to PowerPC

Geert Janssen

Mar 27, 2015, 2:56:47 AM
to juli...@googlegroups.com
I would really like to start a discussion on porting Julia to the PowerPC (powerpc64le) platform.
I have been working on this for several days now. My platform is a PowerNV system running
Ubuntu Utopic Unicorn 14.10. uname tells me this:
Linux XXXX 3.16.0-29-generic #39-Ubuntu SMP Mon Dec 15 22:29:07 UTC 2014 ppc64le ppc64le ppc64le GNU/Linux

So far I have achieved the following:
- Created a Make.user for this platform
- Scrambled together all the dependencies and have them compiled and in .so form
- Can build libjulia.so and have a julia executable:
julia: ELF 64-bit LSB executable, 64-bit PowerPC or cisco 7500, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.32, BuildID[sha1]=bdbb7a443c6e540131d16e5c228a82a1840d5c6d, not stripped

Next thing is bootstrapping julia to build sys.ji and sys.so files.
That's where things run amok. I get a segmentation fault smack in the middle of
processing strings.jl. I have tried all kinds of debugging approaches:
- turned off garbage collection
- changed the order of includes in the sysimg.jl file
- added jl_printf/jl_static_show calls all over the C/C++ source code
Also, there is a strange hiccup right after including osutils.jl.
It seems like the program is stuck for several minutes and then continues.
I saw identical behavior on a 32-bit x86 machine.

I am still investigating.
Any suggestions are much appreciated.

Geert

Ryan Northrup

Mar 27, 2015, 4:58:15 AM
to juli...@googlegroups.com

On Mar 26, 2015 11:56 PM, "Geert Janssen" <gee...@gmail.com> wrote:
>
> I would really like to start a discussion on porting Julia to the PowerPC (powerpc64le) platform.
> I have been working on this for several days now. My platform is a PowerNV system running
> Ubuntu Utopic Unicorn 14.10. uname tells me this:
> Linux XXXX 3.16.0-29-generic #39-Ubuntu SMP Mon Dec 15 22:29:07 UTC 2014 ppc64le ppc64le ppc64le GNU/Linux
>
> So far I have achieved the following:
> - Created a Make.user for this platform
> - Scrambled together all the dependencies and have them compiled and in .so form

Do you mind sharing your details here (which libraries are being used and which versions, as well as your Make.user)?  In addition to potential troubleshooting usefulness, I've been hoping to do something similar (though in my case it's OpenBSD on PowerPC rather than GNU/Linux) and would love to see how you've gone about it (since you've definitely gotten further than I have).

> - Can build libjulia.so and have a julia executable:
> julia: ELF 64-bit LSB executable, 64-bit PowerPC or cisco 7500, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.32, BuildID[sha1]=bdbb7a443c6e540131d16e5c228a82a1840d5c6d, not stripped
>
> Next thing is bootstrapping julia to build sys.ji and sys.so files.
> That's where things run amok. I get a segmentation fault smack in the middle of
> processing strings.jl. I have tried all kinds of debugging approaches:
> - turned off garbage collection
> - changed the order of includes in the sysimg.jl file
> - added jl_printf/jl_static_show calls all over the C/C++ source code
> Also, there is a strange hiccup right after including osutils.jl.
> It seems like the program is stuck for several minutes and then continues.
> I saw identical behavior on a 32-bit x86 machine.
>
> I am still investigating.
> Any suggestions are much appreciated.
>
> Geert

I'm not particularly familiar with PowerNV; do you happen to know which CPU model it is exactly?  /proc/cpuinfo might help with that identification.

I happen to have a G5 Power Mac and XServe (PowerPC 970); assuming I can get Ubuntu installed on one of them, and assuming there aren't particularly significant differences CPU-wise, I can likely help test/troubleshoot once I know how to reproduce your environment.

-- Ryan S. Northrup

Isaiah Norton

Mar 27, 2015, 8:34:40 AM
to juli...@googlegroups.com
In case you are not already, you should definitely be building against a new version of LLVM, either llvm-svn or 3.6.0 (LLVM_VER in Make.user). Some quick googling suggests there have been patches supporting PPC on MCJIT, but the old JIT hasn't gotten any attention for several release cycles.

The other suggestion is to run the build under a debugger so you can see what is actually happening. You need to look at the makefile for the command line where sysimg0.ji is built. Running the debugger will look something like:

(in base/)
gdb --args ../julia --build ../usr/lib/julia/sysimg0.ji sysimg.jl

and the second one:
gdb --args ../julia --build ../usr/lib/julia/sysimg.ji -I ../usr/lib/julia/sysimg0.ji sysimg.jl

(it's been a while since I've run this, so double-check in the Makefile)

Geert Janssen

Mar 27, 2015, 11:06:01 AM
to juli...@googlegroups.com
Dear Ryan,

I am very much encouraged to learn that others are struggling with a PowerPC port as well.
I will gather all necessary info and post it here soon.
I am using LLVM 3.5.0 now since that comes standard with Ubuntu 14.10.
The machine uses Power 8 processors:
...
processor    : 191
cpu        : POWER8E (raw), altivec supported
clock        : 3325.000000MHz
revision    : 2.1 (pvr 004b 0201)
(Yes a big machine with 192 cores!)

Geert Janssen

Mar 27, 2015, 12:28:23 PM
to juli...@googlegroups.com
I should have been more clear on the Julia version I am trying to port.
It is: julia-0.3.6_0c24dca65c.tar.gz.

I tried with llvm-3.6.0 which compiles fine by itself.
But the Julia build gives me all kinds of problems now because of missing llvm includes.
Should I try a newer version of Julia?

Isaiah Norton

Mar 27, 2015, 12:38:19 PM
to juli...@googlegroups.com

You should build master from scratch. Don't use system llvm.

Isaiah Norton

Mar 27, 2015, 1:44:29 PM
to juli...@googlegroups.com
> Also there is a strange hiccup occurring right after including osutils.jl.
> It seems like the program is stuck for several minutes and then continues.
> I saw identical behavior on a 32-bit x86 machine.

This one is normal: it is the compilation of the type inference code. The pause comes later than the point where you see `inference.jl` because the actual compilation isn't triggered until it is needed.

Ryan Northrup

Mar 27, 2015, 1:53:26 PM
to juli...@googlegroups.com
On Fri, Mar 27, 2015 at 8:06 AM, Geert Janssen <gee...@gmail.com> wrote:
> Dear Ryan,
>
> I am very much encouraged to learn that others are struggling with a PowerPC
> port as well.
> I will gather all necessary info and post it here soon.
> I am using LLVM 3.5.0 now since that comes standard with Ubuntu 14.10.

Cool, thanks.

Also, as Isaiah mentioned, LLVM is something that you should
probably let Julia pull in and compile, since Julia's pretty picky
when it comes to LLVM. I'd try again with USE_SYSTEM_LLVM=0 in your
Make.user (or just leave USE_SYSTEM_LLVM out of it and fall back to
the defaults in Make.inc).

I'm pretty sure Julia defaults to LLVM 3.3. Per the README.md, 3.5
and newer *mostly* work, but there are still some bugs.

> The machine uses Power 8 processors:
> ...
> processor : 191
> cpu : POWER8E (raw), altivec supported
> clock : 3325.000000MHz
> revision : 2.1 (pvr 004b 0201)
> (Yes a big machine with 192 cores!)

Certainly beefier than what I've got, that's for sure! Are those
actual cores or is it reporting hardware threads as their own cores?

Whatever the case, can't wait to be able to run Julia on these sorts
of systems :)

-- Ryan S. Northrup

Isaiah Norton

Mar 27, 2015, 1:58:16 PM
to juli...@googlegroups.com
> I'm pretty sure Julia defaults to LLVM 3.3.  Per the README.md, 3.5
> and newer *mostly* work, but there are still some bugs.

You need a version newer than 3.5 to get the newer JIT engine (MCJIT), which will probably have the best chance of being supported. Put the following in Make.user:

LLVM_VER = 3.6.0

Geert Janssen

Mar 27, 2015, 2:54:14 PM
to juli...@googlegroups.com
Dear Isaiah, Ryan,

Thanks for the continued encouragement.
I downloaded the latest julia-master.zip and I am compiling it now.
Indeed, I had already put LLVM_VER=3.6.0 in my Make.user.
Right off the bat something goes wrong with configuring libuv.
Will report more later.

Isaiah Norton

Mar 27, 2015, 3:01:28 PM
to juli...@googlegroups.com
If possible it is best to work from a git clone so that you can track changes and submit them to upstream. The zip download does not appear to have a .git folder.

Simon Byrne

Mar 27, 2015, 4:32:11 PM
to juli...@googlegroups.com
If anyone else is interested, apparently IBM offers free cloud access to Power machines to open source developers here:
https://www-304.ibm.com/partnerworld/wps/servlet/ContentHandler/stg_com_sys_power-development-platform

I haven't tried it out, but I'd be interested to hear if someone does.

s

Geert Janssen

Mar 27, 2015, 5:13:55 PM
to juli...@googlegroups.com
Indeed Isaiah, just found that out too.
No .git in the zip. I am now cloning the Julia git repository and starting afresh (for the 3rd time...).
Will keep posting here on progress.

Geert Janssen

Mar 27, 2015, 5:46:12 PM
to juli...@googlegroups.com

  • 100% of hardware agnostic Linux on x86 applications written in scripting (Java) or interpretive languages will run as is with no changes¹
  • 95% of Linux on x86 applications written in C/C++ port to Linux on Power with no source code change, just a simple recompile and test²
Unfortunately Julia does not fall into either of these categories...

Jameson Nash

Mar 27, 2015, 9:50:56 PM
to juli...@googlegroups.com
Gotta love that fine print though:

Note 1 - Interpretive languages include PHP, Python, Perl, Ruby, Java, etc. Assumes 8 hours of dedicated time and prior experience with the application code and its dependencies (e.g. language, libraries, web application, database) and that dependencies already ported and installed. Assumes no platform or device specific dependencies
Note 2 - Includes C/C++ and other compiled languages. Assumes 16 hours of dedicated time and prior experience with the application code and its dependencies (e.g. language, libraries, web application, database) and that dependencies already ported and installed. Assumes no platform or device specific dependencies.

It's sure easy to get to 100% if you exclude anything that fails!

Julia itself has many platform and device specific dependencies, to optimize performance (llvm, blas, libunwind, gc frames), so it doesn't really fit in IBM's descriptions. But with LLVM3.6 (and 16 hours), I think it should be doable. I'm happy to help answer questions on the mailing list here (or Github issues). I may even be able to try IBM's virtual service thing sometime this weekend and see if I can support your work that way.

For debugging the sysimg build, you don't necessarily need to give `--build` a long path; any location and any name will do:
```
gdb --args ../julia --build sys sysimg.jl
```

The `-J` option runs julia with the specified system image (and can be combined with the option above to run with the specified image and generate a new one):
```
gdb --args ../julia -J ./sys.ji
```

(Note: there's a section of the manual that is specifically geared towards helping answer these sorts of questions: http://docs.julialang.org/en/latest/devdocs/julia/. Please let us know where it can be improved. I already know that more information on the codegen process would probably be very useful, but it's harder for me to know what else might be missing: since I can write it, I don't need to read it, which is a bit of a conundrum when building a useful set of documentation :)

Geert Janssen

Mar 30, 2015, 10:15:04 AM
to juli...@googlegroups.com
Dear Jameson,

I am aware of the devdocs directory and have read its contents.
So far that didn't really help me much.

I am getting a segmentation violation in the type inference code
right after (or during) the processing of osutils.jl in the sysimg.jl file.
I do have a stack trace but unfortunately the PPC64 machine I was using is in maintenance today.
I will repeat the run as soon as it is available.
As I said before, I am now using a git clone of Julia with LLVM 3.6.0.
I cannot use openlibm because of compilation errors.
Hence I resort to the system libm. Here is my Make.user file:


USE_SYSTEM_LIBM=1
USE_SYSTEM_BLAS=1
USE_SYSTEM_LAPACK=1
USE_SYSTEM_FFTW=1
USE_SYSTEM_GMP=1
USE_SYSTEM_MPFR=1

# __PPC64__ defined but __ppc64__ is not!
CFLAGS += -D__ppc64__
CXXFLAGS += -D__ppc64__
LDFLAGS = -llzma
VERBOSE = 1

LLVM_VER=3.6.0
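
The comment in the Make.user above notes that the compiler defines `__PPC64__` but not `__ppc64__`. As an alternative to adding `-D` flags, source code can test several spellings at once. The following is a minimal sketch under that assumption; the helper names `built_for_ppc64` and `is_little_endian` are made up for illustration and are not part of Julia.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 when any of the common 64-bit PowerPC
 * macro spellings is predefined by the compiler, 0 otherwise. */
static int built_for_ppc64(void)
{
#if defined(__PPC64__) || defined(__ppc64__) || defined(__powerpc64__)
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical helper: detects little-endian byte order at run time by
 * looking at the first byte of a known 32-bit value. */
static int is_little_endian(void)
{
    uint32_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 1;
}
```

On a powerpc64le system one would expect both helpers to return 1; on x86_64 the first returns 0 and the second 1.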

Ryan Northrup

Mar 30, 2015, 5:17:11 PM
to juli...@googlegroups.com


On Mar 30, 2015 7:15 AM, "Geert Janssen" <gee...@gmail.com> wrote:
> Hence I resort to the system libm. Here is my Make.user file:
>
>
> USE_SYSTEM_LIBM=1
> USE_SYSTEM_BLAS=1
> USE_SYSTEM_LAPACK=1
> USE_SYSTEM_FFTW=1
> USE_SYSTEM_GMP=1
> USE_SYSTEM_MPFR=1
>
> # __PPC64__ defined but __ppc64__ is not!
> CFLAGS += -D__ppc64__
> CXXFLAGS += -D__ppc64__
> LDFLAGS = -llzma
> VERBOSE = 1
>
> LLVM_VER=3.6.0
>

Thanks for that.  I'll see if I can reproduce (and if so, troubleshoot) on my own hardware.

-- Ryan

Jameson Nash

Mar 30, 2015, 9:09:20 PM
to juli...@googlegroups.com
> I am aware of the devdocs directory and have read its contents.
> So far that didn't really help me much.

As I said, I know there's definite room for improvement there. If you have some ideas for section headers that you think would help your use case, I would be happy to try to write the content.

I've committed a few build improvements over the weekend (through my own testing with the IBM service). See the Make.powerpc file for a starter template.

For that segfault, I found that turning on MEMDEBUG (in options.h) seemed to help. I'm still looking into why it fails in the first place. In my tests, it seemed like some of the jl_binding_t objects were getting incorrectly freed (and put back into the memory pool).

Geert Janssen

Mar 31, 2015, 9:51:58 AM
to juli...@googlegroups.com
Thanks Jameson,

So it was you that checked in the Make.powerpc. I did a git pull and make.
Now I get the following error:

geert@tulgb001:~/src/julia/base$ gdb /gpfs/DDNgpfs1/geert/src/julia/usr/bin/julia-debug
GNU gdb (Ubuntu 7.8-1ubuntu4) 7.8.0.20141001-cvs
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "powerpc64le-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /gpfs/DDNgpfs1/geert/src/julia/usr/bin/julia-debug...done.
(gdb) r -C native --build /gpfs/DDNgpfs1/geert/src/julia/usr/lib/julia/sys0 sysimg.jl
Starting program: /gpfs/DDNgpfs1/geert/src/julia/usr/bin/julia-debug -C native --build /gpfs/DDNgpfs1/geert/src/julia/usr/lib/julia/sys0 sysimg.jl
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/powerpc64le-linux-gnu/libthread_db.so.1".

Program received signal SIGSEGV, Segmentation fault.
0x00003fffb7bf0080 in julia.new_0 ()
(gdb) bt
#0  0x00003fffb7bf0080 in julia.new_0 ()
#1  0x00003fffb7c7530c in jl_apply (f=0x3ffdb4870a38, args=0x3fffffffe098,
    nargs=2) at julia.h:1261
#2  0x00003fffb7c7af40 in jl_trampoline (F=0x3ffdb4870a38,
    args=0x3fffffffe098, nargs=2) at builtins.c:1010
#3  0x00003fffb7c65d88 in jl_apply (f=0x3ffdb4870a38, args=0x3fffffffe098,
    nargs=2) at julia.h:1261
#4  0x00003fffb7c6d1ac in jl_apply_generic (F=0x3ffdb4870970,
    args=0x3fffffffe098, nargs=2) at gf.c:1706
#5  0x00003fffb7d5beb4 in jl_apply (f=0x3ffdb4870970, args=0x3fffffffe098,
    nargs=2) at julia.h:1261
#6  0x00003fffb7d5c55c in do_call (f=0x3ffdb4870970, args=0x3ffdb48690c8,
    nargs=2, eval0=0x0, locals=0x0, nl=0, ngensym=0) at interpreter.c:64
#7  0x00003fffb7d5d454 in eval (e=0x3ffdb48709c0, locals=0x0, nl=0, ngensym=0)
    at interpreter.c:214
#8  0x00003fffb7d5c144 in jl_interpret_toplevel_expr (e=0x3ffdb48709c0)
    at interpreter.c:25
#9  0x00003fffb7d81904 in jl_toplevel_eval_flex (e=0x3ffdb4870998, fast=1)
    at toplevel.c:502
#10 0x00003fffb7d81c90 in jl_parse_eval_all (fname=0x3fffb7df7918 "boot.jl")
    at toplevel.c:550
#11 0x00003fffb7d81f2c in jl_load (fname=0x3fffb7df7918 "boot.jl")
    at toplevel.c:589
#12 0x00003fffb7d6b468 in _julia_init (rel=JL_IMAGE_JULIA_HOME) at init.c:1009
#13 0x00003fffb7d6d5dc in julia_init (rel=JL_IMAGE_JULIA_HOME) at task.c:252
#14 0x0000000010003390 in main (argc=1, argv=0x3ffffffff8a0) at repl.c:482
(gdb)

This means that with the julia-0.3 version I got much farther: then it failed during sysimg.jl processing; now it already fails during loading of boot.jl.
I am curious to learn what your experiences are. Let me know of any experiments I should undertake.

Geert Janssen

Mar 31, 2015, 10:28:17 AM
to juli...@googlegroups.com
Jameson,

Your MEMDEBUG suggestion is an excellent one.
(Apart from adding Make.powerpc you must also have edited src/sys.c for cpuid().)
I don't see the immediate seg fault anymore and sysimg.jl starts being processed.
It gets sort of stuck at osutils.jl as usual.
But then moves on and actually completes till gmp.jl.

...
version.jl
gmp.jl
error during bootstrap:
LoadError(at "sysimg.jl" line 173: LoadError(at "gmp.jl" line 24: ErrorException("error compiling gmp_version: could not load module libgmp: libgmp: cannot open shared object file: No such file or directory")))
rec_backtrace at /gpfs/DDNgpfs1/geert/src/julia/src/task.c:638
record_backtrace at /gpfs/DDNgpfs1/geert/src/julia/src/task.c:683
jl_throw at /gpfs/DDNgpfs1/geert/src/julia/src/task.c:798
jl_rethrow_with_add at /gpfs/DDNgpfs1/geert/src/julia/src/codegen.cpp:602
to_function at /gpfs/DDNgpfs1/geert/src/julia/src/codegen.cpp:625
jl_compile at /gpfs/DDNgpfs1/geert/src/julia/src/codegen.cpp:775
jl_trampoline_compile_function at /gpfs/DDNgpfs1/geert/src/julia/src/builtins.c:994
jl_trampoline at /gpfs/DDNgpfs1/geert/src/julia/src/builtins.c:1010
jl_apply at /gpfs/DDNgpfs1/geert/src/julia/src/julia.h:1261
jl_apply_generic at /gpfs/DDNgpfs1/geert/src/julia/src/gf.c:1706
jl_apply at /gpfs/DDNgpfs1/geert/src/julia/src/julia.h:1261
do_call at /gpfs/DDNgpfs1/geert/src/julia/src/interpreter.c:64
eval at /gpfs/DDNgpfs1/geert/src/julia/src/interpreter.c:214
eval at /gpfs/DDNgpfs1/geert/src/julia/src/interpreter.c:220
eval_body at /gpfs/DDNgpfs1/geert/src/julia/src/interpreter.c:575
jl_toplevel_eval_body at /gpfs/DDNgpfs1/geert/src/julia/src/interpreter.c:515
jl_toplevel_eval_flex at /gpfs/DDNgpfs1/geert/src/julia/src/toplevel.c:496
jl_eval_module_expr at /gpfs/DDNgpfs1/geert/src/julia/src/toplevel.c:148
jl_toplevel_eval_flex at /gpfs/DDNgpfs1/geert/src/julia/src/toplevel.c:394
jl_parse_eval_all at /gpfs/DDNgpfs1/geert/src/julia/src/toplevel.c:550
jl_load at /gpfs/DDNgpfs1/geert/src/julia/src/toplevel.c:589
jl_load_ at /gpfs/DDNgpfs1/geert/src/julia/src/toplevel.c:597
unknown function (ip: -1677459380)
jl_apply at /gpfs/DDNgpfs1/geert/src/julia/src/julia.h:1261
jl_apply_generic at /gpfs/DDNgpfs1/geert/src/julia/src/gf.c:1686
unknown function (ip: -1907752664)
jl_apply at /gpfs/DDNgpfs1/geert/src/julia/src/julia.h:1261
do_call at /gpfs/DDNgpfs1/geert/src/julia/src/interpreter.c:64
eval at /gpfs/DDNgpfs1/geert/src/julia/src/interpreter.c:214
jl_interpret_toplevel_expr at /gpfs/DDNgpfs1/geert/src/julia/src/interpreter.c:25
jl_toplevel_eval_flex at /gpfs/DDNgpfs1/geert/src/julia/src/toplevel.c:502
jl_eval_module_expr at /gpfs/DDNgpfs1/geert/src/julia/src/toplevel.c:148
jl_toplevel_eval_flex at /gpfs/DDNgpfs1/geert/src/julia/src/toplevel.c:394
jl_parse_eval_all at /gpfs/DDNgpfs1/geert/src/julia/src/toplevel.c:550
jl_load at /gpfs/DDNgpfs1/geert/src/julia/src/toplevel.c:589
unknown function (ip: 268447084)
unknown function (ip: 268447856)
unknown function (ip: 268448696)
unknown function (ip: -1640739200)
__libc_start_main at /lib/powerpc64le-linux-gnu/libc.so.6 (unknown line)
unknown function (ip: 0)

Basic Block in function 'julia_gmp_version_5950' does not have terminator!
label %top
LLVM ERROR: Broken function found, compilation aborted!
Makefile:168: recipe for target '/gpfs/DDNgpfs1/geert/src/julia/usr/lib/julia/sys0.o' failed
make[1]: *** [/gpfs/DDNgpfs1/geert/src/julia/usr/lib/julia/sys0.o] Error 1
make[1]: Leaving directory '/gpfs/DDNgpfs1/geert/src/julia'
Makefile:82: recipe for target 'julia-sysimg-debug' failed
make: *** [julia-sysimg-debug] Error 2
geert@tulgb001:~/src/julia$

May we conclude that there is some kind of memory allocation problem?


Geert Janssen

Mar 31, 2015, 10:29:18 AM
to juli...@googlegroups.com
OK, missing libgmp. That should be easy to fix. Working on it now.
Almost there...

Geert Janssen

Mar 31, 2015, 12:39:11 PM
to juli...@googlegroups.com
Wow, after some 1.5 hours it finally processed sysimg.jl and went into its second phase:

deprecated.jl
basedocs.jl
precompile.jl
g++ -shared -fPIC -L/gpfs/DDNgpfs1/geert/src/julia/usr/lib/julia -L/gpfs/DDNgpfs1/geert/src/julia/usr/lib -L/gpfs/DDNgpfs1/geert/src/julia/usr/lib -o /gpfs/DDNgpfs1/geert/src/julia/usr/lib/julia/sys0.so /gpfs/DDNgpfs1/geert/src/julia/usr/lib/julia/sys0.o $([ Linux = Darwin ] && echo '' -Wl,-undefined,dynamic_lookup || echo '' -Wl,--unresolved-symbols,ignore-all ) $([ Linux = WINNT ] && echo '' -ljulia -lssp)
true -ignore /gpfs/DDNgpfs1/geert/src/julia/usr/lib/julia/sys0.so
cd base && /gpfs/DDNgpfs1/geert/src/julia/usr/bin/julia-debug -C native --build /gpfs/DDNgpfs1/geert/src/julia/usr/lib/julia/sys -J/gpfs/DDNgpfs1/geert/src/julia/usr/lib/julia/$([ -e /gpfs/DDNgpfs1/geert/src/julia/usr/lib/julia/sys.ji ] && echo sys.ji || echo sys0.ji) -f sysimg.jl || { echo '*** This error is usually fixed by running `make clean`. If the error persists, try `make cleanall`. ***' && false; }
exports.jl

Waiting for it to finish...

Geert Janssen

Mar 31, 2015, 2:18:54 PM
to juli...@googlegroups.com
SUCCESS!

After 2 more hours I finally have a Julia running on powerpc64le:

geert@tulgb001:~/src/julia$ ./julia
               _
   _       _ _(_)_     |  A fresh approach to technical computing
  (_)     | (_) (_)    |  Documentation: http://docs.julialang.org
   _ _   _| |_  __ _   |  Type "help()" for help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 0.4.0-dev+4074 (2015-03-30 16:50 UTC)
 _/ |\__'_|_|_|\__'_|  |  Commit dc9d7b6* (1 day old master)
|__/                   |  powerpc64le-linux-gnu

julia>

This is really a julia-debug so I will change that first.
Then we should fix the MEMDEBUG problem.
The startup time of Julia is pretty slow; I am not sure why.


Jameson Nash

Mar 31, 2015, 2:31:40 PM
to juli...@googlegroups.com
MEMDEBUG has a definite performance impact. I'm not sure whether there is something wrong with the pool, or whether that just masks another issue.

In Sys.dllist(), can you see whether it was able to load the sys.so image? (You can also step through dump.c in gdb when starting Julia to see whether it got used)

How far do the tests get for you? (I only ran core)

Geert Janssen

Mar 31, 2015, 4:31:18 PM
to juli...@googlegroups.com
I did a make clean and make to get a non-debug version.
This one gives me a seg fault right at the start of the second bootstrap phase.
Now I am doing a make cleanall and a make debug again to see if I can get a
working version again. Then I will run make testall.

On my system some libraries are under /usr/lib/powerpc64le-linux-gnu/.
If I tell Julia to use the system gmp and mpfr libs it says it cannot find them during bootstrap
(sysimg.jl, load of gmp.jl). When I copy these (or symlink) in julia/usr/lib everything is fine.
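
The "cannot open shared object file" message seen earlier in the thread is the kind of text the dynamic loader reports through `dlerror()`. A minimal sketch for checking whether the loader can resolve a library by a given name, using the POSIX `dlopen`/`dlerror`/`dlclose` calls; the helper name `can_load_library` is hypothetical, and on older glibc versions the program has to be linked with `-ldl`.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 if the dynamic loader can open `name`.
 * Otherwise it prints the loader's own error message and returns 0. */
static int can_load_library(const char *name)
{
    void *handle = dlopen(name, RTLD_LAZY);
    if (handle == NULL) {
        fprintf(stderr, "cannot load %s: %s\n", name, dlerror());
        return 0;
    }
    dlclose(handle);
    return 1;
}
```

Calling it once with a bare name such as `"libgmp.so"` and once with a full path under the system library directory would show whether the problem is the search path or the file name being requested. Which of the two applies on this machine is not established in the thread.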

Jameson Nash

Mar 31, 2015, 4:52:27 PM
to juli...@googlegroups.com
The debug and release versions will happily live next to each other, and don't require a make clean to switch. You may even want to keep backup copies of the usr/lib/julia/* files so you don't need to wait for the sys image to do testing.

Geert Janssen

Apr 1, 2015, 9:51:38 AM
to juli...@googlegroups.com
I am back to running julia-debug.
I do not see a Sys.dllist() function. I have this:

julia> names(Sys)
13-element Array{Symbol,1}:
 :uptime
 :cpu_info
 :Sys
 :OS_NAME
 :cpu_summary
 :cpu_name
 :total_memory
 :MACHINE
 :CPU_CORES
 :WORD_SIZE
 :loadavg
 :ARCH
 :free_memory

julia> versioninfo()
Julia Version 0.4.0-dev+4074
Commit dc9d7b6* (2015-03-30 16:50 UTC)
DEBUG build
Platform Info:
  System: Linux (powerpc64le-linux-gnu)
  CPU: unknown
  WORD_SIZE: 64
  BLAS: libblas
  LAPACK: liblapack
  LIBM: libm
  LLVM: libLLVM-3.6.0
 
So how do I know sys.so was loaded?
 I can only say that working interactively with julia-debug with MEMDEBUG defined is awfully slow.
I will make a non-debug julia and see how I fare.

Isaiah Norton

Apr 1, 2015, 10:02:40 AM
to juli...@googlegroups.com
> I do not see a Sys.dllist() function. I have this:

It is Libdl.dllist() now.

Geert Janssen

Apr 1, 2015, 10:44:57 AM
to juli...@googlegroups.com
Well how about this:
geert@tulgb001:~/src/julia$ ./julia
               _
   _       _ _(_)_     |  A fresh approach to technical computing
  (_)     | (_) (_)    |  Documentation: http://docs.julialang.org
   _ _   _| |_  __ _   |  Type "help()" for help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 0.4.0-dev+4074 (2015-03-30 16:50 UTC)
 _/ |\__'_|_|_|\__'_|  |  Commit dc9d7b6* (1 day old master)
|__/                   |  powerpc64le-linux-gnu

julia> Libdl.dllist()

signal (11): Segmentation fault
unknown function (ip: 0)
Segmentation fault

I am afraid there are still some serious bugs lingering in the code...
Looking at the C/C++ source I am unfortunately not blown away by its clarity and beauty; on the contrary...

I am running make test-core now and so far so good.
make testall does not work for me:

 
/gpfs/DDNgpfs1/geert/src/julia/usr/bin/julia --check-bounds=yes --startup-file=no ./runtests.jl all
Master process (id 1) could not connect within 60.0 seconds.
exiting.
Master process (id 1) could not connect within 60.0 seconds.
exiting.
Master process (id 1) could not connect within 60.0 seconds.
exiting.
Master process (id 1) could not connect within 60.0 seconds.
exiting.
Master process (id 1) could not connect within 60.0 seconds.
exiting.
Master process (id 1) could not connect within 60.0 seconds.
exiting.
Master process (id 1) could not connect within 60.0 seconds.
exiting.
Master process (id 1) could not connect within 60.0 seconds.
exiting.
Worker 2 terminated.
ERROR: LoadError: connect: connection refused (ECONNREFUSED)
while loading /gpfs/DDNgpfs1/geert/src/julia/test/runtests.jl, in expression starting on line 3
ERROR (unhandled task failure): EOFError: read end of file

Makefile:9: recipe for target 'all' failed
make[1]: *** [all] Error 1
make[1]: Leaving directory '/gpfs/DDNgpfs1/geert/src/julia/test'
Makefile:496: recipe for target 'testall' failed
make: *** [testall] Error 2


Geert Janssen

Apr 1, 2015, 11:41:19 AM
to juli...@googlegroups.com
Indeed test-core completes successfully. At least something works.
But look at the time it takes!
 
/gpfs/DDNgpfs1/geert/src/julia/usr/bin/julia --check-bounds=yes --startup-file=no ./runtests.jl core
     * core                 in 1717.86 seconds
    SUCCESS
make[1]: Leaving directory '/gpfs/DDNgpfs1/geert/src/julia/test'



Geert Janssen

Apr 1, 2015, 5:14:16 PM
to juli...@googlegroups.com
In src/gc.c in function jl_gc_setmark()  shouldn't there be a check on the size of the jl_value_t object?
If indeed MEMDEBUG is off, the small (<= 2048) values are pooled and bigger ones are not. In other spots in the same file
I see checks for 2048 but not here. Is this incorrect or am I missing something?
What I am saying is that I would expect a check on 2048 here and if bigger a call to gc_setmark_big().

void jl_gc_setmark(jl_value_t *v) // TODO rename this as it is misleading now
{
    //    int64_t s = perm_scanned_bytes;
    jl_taggedvalue_t *o = jl_astaggedvalue(v);
    if (!gc_marked(o)) {
        //        objprofile_count(jl_typeof(v), 1, 16);
#ifdef MEMDEBUG
        gc_setmark_big(o, GC_MARKED_NOESC);
#else
        gc_setmark_pool(o, GC_MARKED_NOESC);
#endif
    }
    //    perm_scanned_bytes = s;
}





Isaiah Norton

Apr 1, 2015, 11:25:26 PM
to juli...@googlegroups.com
That function is only called for small boxed values (see alloc.c).

Geert Janssen

Apr 2, 2015, 9:52:14 AM
to juli...@googlegroups.com
Ok Isaiah, then I guess the code as is is correct.
Here are my notes on MEMDEBUG.

src/julia.h:

On powerpc64le we have:
sizeof(jl_taggedvalue_t) = 32
sizeof(jl_value_t) = 16
sizeof(jl_sym_t) = 40

src/alloc.c: mk_symbol()

Symbols are normally created from a pool of allocated memory.
The pool size is about 0.5M bytes and is a static variable within the function.
The size of a symbol is about 40 + the length of its name (rounded up
to a multiple of 8 bytes). A symbol is stored as a tagged value, so
that adds another 32 bytes. If the average length is 8 characters, then
a pool can store 6400 symbols.
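
The estimate above can be reproduced with a few lines of arithmetic. This is a sketch of the model as described in the post (struct size plus padded name plus tag overhead), not Julia's actual allocator; the helper names are hypothetical, and taking "about 0.5M" as exactly 512,000 bytes is an assumption chosen because it yields the quoted 6400. With 524,288 bytes (2^19) the same model gives 6553.

```c
#include <stddef.h>

/* Round n up to the next multiple of 8. */
static size_t round_up8(size_t n)
{
    return (n + 7u) & ~(size_t)7u;
}

/* Hypothetical helper: number of symbols with names of length name_len
 * that fit in one pool, following the model in the post:
 * per-symbol cost = symbol struct + name padded to 8 bytes + tag. */
static size_t symbols_per_pool(size_t pool_bytes, size_t sym_struct_bytes,
                               size_t tag_bytes, size_t name_len)
{
    size_t per_symbol = sym_struct_bytes + round_up8(name_len) + tag_bytes;
    return pool_bytes / per_symbol;
}
```

For example, `symbols_per_pool(512000, 40, 32, 8)` evaluates to 512000 / 80 = 6400.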

There is a nonsensical test for whether a single symbol is larger than the pool
size. However, when the pool is exhausted a new one is simply allocated
and used. The old one lives on but without any managed reference.
Presumably these symbols are never freed or garbage collected.

If MEMDEBUG is defined the behavior is slightly different: no pool is used at
all and every single symbol creation does its own small malloc.

src/gc.c

This is the only other file that checks MEMDEBUG.
Again MEMDEBUG controls whether some form of pooling is used or not.
Pools are for objects that are at most 2048 in size. Anything beyond
is allocated individually. With MEMDEBUG no pools are used and all allocation
is treated as being big.
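
The pooling rule described here can be written down as a small decision function. This is a sketch only: `choose_allocator`, `alloc_kind` and `POOL_MAX_BYTES` are hypothetical names, the 2048 threshold is the one quoted in the post, and the `memdebug` parameter stands in for building with MEMDEBUG defined.

```c
#include <stddef.h>

#define POOL_MAX_BYTES 2048  /* threshold quoted in the post */

typedef enum { ALLOC_POOL, ALLOC_BIG } alloc_kind;

/* Hypothetical helper, not Julia's real code: pick the allocation path
 * for an object of `size` bytes. A non-zero `memdebug` models a build
 * with MEMDEBUG defined, where pooling is bypassed entirely. */
static alloc_kind choose_allocator(size_t size, int memdebug)
{
    if (memdebug)
        return ALLOC_BIG;  /* every object gets its own allocation */
    return size <= POOL_MAX_BYTES ? ALLOC_POOL : ALLOC_BIG;
}
```

Routing everything through individual allocations is slower, but it makes each object visible to tools that track `malloc` and `free`.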

With MEMDEBUG:
gc_setmark_big(o, mark_mode);
memset(pg->data, 0xbb, GC_PAGE_SZ);
memset(v, 0xee, allocsz);
Without MEMDEBUG:
gc_setmark_pool(o, mark_mode);

src/flisp/flisp.c: void gc(int mustgrow) and others
Hopefully this is not the same MEMDEBUG? options.h is not included there.

I turned MEMDEBUG off in alloc.c and also removed all related memsets in gc.c.
That still gives me a good julia-debug build. So I am concentrating on gc.c.
Something goes wrong in the pooling code.

Isaiah Norton

Apr 2, 2015, 10:13:59 AM
to juli...@googlegroups.com
The idea of MEMDEBUG is that everything goes through malloc, so you can -- with a good deal of patience -- use valgrind to find problems (whereas with the pool allocator valgrind is mostly useless because it can't track the small allocations).

Does test-core pass with MEMDEBUG off? If so, that is good progress.

Jameson Nash

Apr 2, 2015, 10:18:49 AM
to juli...@googlegroups.com
Symbols are allocated in a separate and unrelated pool, and never freed.

On x86_64 we have:
sizeof(jl_taggedvalue_t) = 16
sizeof(jl_value_t) = 0
these are the correct values; a non-zero value for jl_value_t doesn't make sense, and would probably result in a corrupted pool allocator.
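
One way to picture why a zero-size value type is natural is a header placed directly in front of the payload, so that the header can be recovered from a payload pointer by subtracting a fixed offset. The sketch below uses a standard C flexible array member; `tagged_box`, `box_alloc` and `box_of` are hypothetical stand-ins, not Julia's definitions, and the sketch does not try to reproduce the 16-byte header size quoted above.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a tagged allocation: a header followed
 * directly by the payload. The flexible array member contributes no
 * bytes of its own, so the payload starts right after the header. */
typedef struct {
    uintptr_t type_bits;      /* assumed: type tag, possibly with GC bits */
    unsigned char payload[];
} tagged_box;

/* Allocate header plus payload and return a pointer to the payload. */
static void *box_alloc(size_t payload_bytes, uintptr_t type_bits)
{
    tagged_box *box = malloc(sizeof(tagged_box) + payload_bytes);
    if (box == NULL)
        return NULL;
    box->type_bits = type_bits;
    return box->payload;
}

/* Recover the header from a payload pointer. */
static tagged_box *box_of(void *payload)
{
    return (tagged_box *)((unsigned char *)payload
                          - offsetof(tagged_box, payload));
}
```

If the value member had a non-zero size of its own, the header size and the payload offset would no longer agree with what an allocator computes from header plus payload, which is consistent with the corruption concern raised above.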

flisp has its own GC. Generally you won't need to worry about it, since it's much simpler than Julia's GC and the two don't interact.

Geert Janssen

Apr 2, 2015, 12:38:05 PM
to juli...@googlegroups.com
Jameson, I was wrong about the sizes mentioned earlier.
I wrote a dedicated program and configured it for powerpc64le.
This is the output I get:

sizeof(int)=4
sizeof(void *)=8
sizeof(JL_DATA_TYPE)=0
sizeof(jl_value_t)=0
sizeof(jl_taggedvalue_t)=16
sizeof(jl_sym_t)=24
sizeof(jl_gensym_t)=8
sizeof(jl_tuple_t)=8
sizeof(jl_array_t)=40
sizeof(jl_lambda_info_t)=144
sizeof(jl_function_t)=24
sizeof(jl_typector_t)=16
sizeof(jl_typename_t)=32
sizeof(jl_uniontype_t)=8
sizeof(jl_fielddesc_t)=4
sizeof(jl_datatype_t)=88
sizeof(jl_tvar_t)=32
sizeof(jl_weakref_t)=8
sizeof(jl_binding_t)=40
sizeof(jl_module_t)=560
sizeof(jl_methlist_t)=48
sizeof(jl_methtable_t)=56
sizeof(jl_expr_t)=24

Surprisingly sizeof(jl_value_t) = 0. I first thought that was incorrect, but it seems intentional.
But I do suspect that that might cause some trouble.
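
A size report like the one above can be produced with a small helper. The sketch below is not the program used for that output; format_size and SHOW_SIZE are hypothetical names. The real program would include Julia's headers and apply the macro to the jl_* types.

```c
#include <stddef.h>
#include <stdio.h>

/* Write "sizeof(<name>)=<sz>" into buf; returns what snprintf returns. */
static int format_size(char *buf, size_t buflen, const char *name, size_t sz)
{
    return snprintf(buf, buflen, "sizeof(%s)=%zu", name, sz);
}

/* Print one line per type, e.g. SHOW_SIZE(int); or SHOW_SIZE(void *); */
#define SHOW_SIZE(T)                                          \
    do {                                                      \
        char line_[128];                                      \
        format_size(line_, sizeof line_, #T, sizeof(T));      \
        puts(line_);                                          \
    } while (0)
```

Using %zu avoids truncating size_t on 64-bit targets, which matters when the whole point is to compare layouts across architectures.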

@Isaiah: no, with MEMDEBUG off in both alloc.c and gc.c, I still get a segmentation violation!
That's what I am debugging now.

Jameson Nash

Apr 2, 2015, 8:08:14 PM4/2/15
to juli...@googlegroups.com
That looks like a great program to add to the repository. Perhaps contrib/ or examples/embedding.c?

I requisitioned another machine from the IBM program for the weekend to look into this further as well.

Geert Janssen

Apr 8, 2015, 11:23:39 AM4/8/15
to juli...@googlegroups.com
Here is a follow up on my investigations in "Julia on Power":

Julia uses a generational mark-sweep garbage collector (gc.c).
All objects managed in this way are divided among 2 generations,
young (age 0) and old (age 1).
Earlier I mentioned that completely switching off the
use of memory pools (#define MEMDEBUG in options.h) helped me to complete
a make debug build. When I turn this define back off and instead avoid
doing any "full" (or major) garbage collections (just quicksweeps) then
again I succeed in doing a make debug. A regular make still gives me
a segmentation fault right at the start of the second bootstrap phase:

signal (11): Segmentation fault
unknown function (ip: -1395172488)
_ULppc64_is_signal_frame at /usr/lib/powerpc64le-linux-gnu/libunwind.so.8 (unknown line)
_ULppc64_step at /usr/lib/powerpc64le-linux-gnu/libunwind.so.8 (unknown line)
rec_backtrace_ctx at /gpfs/DDNgpfs1/geert/src/julia/src/task.c:655
rec_backtrace at /gpfs/DDNgpfs1/geert/src/julia/src/task.c:639
record_backtrace at /gpfs/DDNgpfs1/geert/src/julia/src/task.c:683
jl_vexceptionf at /gpfs/DDNgpfs1/geert/src/julia/src/builtins.c:61
jl_errorf at /gpfs/DDNgpfs1/geert/src/julia/src/builtins.c:68
jl_load_and_lookup at /gpfs/DDNgpfs1/geert/src/julia/src/ccall.cpp:119
blas_vendor at ./util.jl:115
unknown function (ip: -535265232)
unknown function (ip: 0)
/bin/sh: line 1: 72294 Segmentation fault      /gpfs/DDNgpfs1/geert/src/julia/usr/bin/julia -C native --build /gpfs/DDNgpfs1/geert/src/julia/usr/lib/julia/sys -J/gpfs/DDNgpfs1/geert/src/julia/usr/lib/julia/$([ -e /gpfs/DDNgpfs1/geert/src/julia/usr/lib/julia/sys.ji ] && echo sys.ji || echo sys0.ji) -f sysimg.jl



So now I tend to believe that something goes wrong during a full garbage
collection phase (with pooling enabled) where also the old generation is
swept.
If I am correct, using just quicksweep does cause promotions in the sense
that an object's age is incremented (from 0 to 1), but since no full sweep
ever follows up, the promoted age is not acted upon therefore no object will
ever get a GC_QUEUED status.
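
To make the quick-versus-full distinction concrete, here is a toy model. It is a simplification under stated assumptions, not the logic of gc.c: toy_obj_t and toy_sweep are invented, there are no pools or write barriers, and promotion happens on the first survived sweep.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    bool live;    /* slot currently holds an object */
    bool marked;  /* reached during the preceding mark phase */
    int  age;     /* 0 = young, 1 = old */
} toy_obj_t;

/* Sweep n objects and return how many were reclaimed.
   A quick sweep (full == false) skips the old generation entirely.
   A full sweep also reclaims unmarked old objects.
   Young survivors are promoted to age 1. */
static size_t toy_sweep(toy_obj_t *objs, size_t n, bool full)
{
    size_t freed = 0;
    for (size_t i = 0; i < n; i++) {
        toy_obj_t *o = &objs[i];
        if (!o->live)
            continue;
        if (o->age == 1 && !full)
            continue;               /* quick sweep leaves old objects alone */
        if (!o->marked) {
            o->live = false;        /* unreachable: reclaim */
            freed++;
            continue;
        }
        o->age = 1;                 /* survivor: promote */
        o->marked = false;          /* reset for the next cycle */
    }
    return freed;
}
```

In this model, running only quick sweeps means old objects are never examined again, so any bug that lives in the old-generation path stays hidden, which matches the observation that the build succeeds when full collections are avoided.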

Notice that live objects typically have their 2
GC flags set to 0 (GC_CLEAN status); would we use a full sweep, old
generation objects would have the GC_QUEUED flag set to 1 on some pointer
field (See also diagram in gc.c).
I am not sure if all access to such pointer fields masks the GC bits.
It might explain getting a segmentation violation. But then again why
would that occur on a Power architecture and not on Intel?
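
The concern about masking can be illustrated with a generic tagged-pointer sketch. Assumptions: two flag bits live in the low bits of an aligned pointer, as the description of the GC flags suggests; the names TOY_GC_BITS_MASK, tag_with_gc_bits, tag_pointer and tag_gc_bits are hypothetical and not taken from gc.c.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_GC_BITS_MASK ((uintptr_t)3)   /* the two low bits */

/* Pack a 2-bit GC state into the low bits of an aligned pointer. */
static uintptr_t tag_with_gc_bits(void *type, unsigned bits)
{
    assert(((uintptr_t)type & TOY_GC_BITS_MASK) == 0);  /* needs 4-byte alignment */
    assert(bits <= 3);
    return (uintptr_t)type | bits;
}

/* Recover the real pointer; every dereference must go through this. */
static void *tag_pointer(uintptr_t tagged)
{
    return (void *)(tagged & ~TOY_GC_BITS_MASK);
}

/* Read the GC state back. */
static unsigned tag_gc_bits(uintptr_t tagged)
{
    return (unsigned)(tagged & TOY_GC_BITS_MASK);
}
```

If any code path dereferences the tagged word without masking, it works as long as the bits are 0 and reads from a misaligned address once a bit is set, which is one way a bug could show up only after a full collection.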

Apart from this GC problem I am still not clear why the bootstrap phase
takes so much time: typically it takes several hours (on a POWER8E,
2GHz CPU) to get a sys.so. Then starting julia using this sys.so is
also incredibly slow. I wonder whether sys.so is really used? Does it
contain LLVM compiled native code?

Many questions and no real progress...

Jameson Nash

Apr 8, 2015, 11:38:33 AM4/8/15
to juli...@googlegroups.com
I fixed the gc bug and inability to load sys.so over the weekend. I also disabled libunwind since it segfaults on llvm-generated binaries.

Rather than tracking issues in this thread, please open issues on GitHub. That way it can also email you when there is any activity on an item. It also helps with keeping discussions from getting sidetracked by too many different bugs getting intermixed in the same conversation.

Geert Janssen

Apr 8, 2015, 12:06:52 PM4/8/15
to juli...@googlegroups.com
Thanks very much Jameson!
I agree with streamlining the bug tracking.
Although so far I could hardly define/describe my "bugs".
Did you check in your changes?

Geert Janssen

Apr 8, 2015, 4:32:24 PM4/8/15
to juli...@googlegroups.com
Wow, the latest git allows me to build to completion.
No more segmentation faults and a reasonably responsive system.
Still I get this:

geert@tulgbfen1:~/src/julia$ ./julia
               _
   _       _ _(_)_     |  A fresh approach to technical computing
  (_)     | (_) (_)    |  Documentation: http://docs.julialang.org
   _ _   _| |_  __ _   |  Type "help()" for help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 0.4.0-dev+4185 (2015-04-08 06:47 UTC)
 _/ |\__'_|_|_|\__'_|  |  Commit 9ad3aa0* (0 days old master)
|__/                   |  powerpc64le-linux-gnu

julia> Libdl.dllist()

signal (11): Segmentation fault
Segmentation fault
geert@tulgbfen1:~/src/julia$





Jameson Nash

Apr 8, 2015, 4:44:34 PM4/8/15
to juli...@googlegroups.com
Yes (https://github.com/JuliaLang/julia/commit/b28015ef521d0cccfc41409dc54e8f6db76fa57c).

Your emails have been good bug reports already. Anything of the following form is usually good: I tried doing X (running make on a system with a large page size) which should have done Y (built Julia) but instead did Z (segfaulted while building the sys.ji file with the following backtrace).

Geert Janssen

Apr 9, 2015, 10:06:40 AM4/9/15
to juli...@googlegroups.com
Wonderful progress so far but still a few hitches.
This time I did a hard reset of my git and used Make.powerpc without change as Make.user.
It gives me a clean build except for the fact that during bootstrap gmp.jl and mpfr.jl cannot find
their shared libraries. They are to be found here:
/usr/lib/powerpc64le-linux-gnu/libgmp.so.10.2.0
/usr/lib/powerpc64le-linux-gnu/libmpfr.so.4.1.2
Copying these to $(JULIA_ROOT)/usr/lib and linking them to .so versions fixes things.

Here is a run of this julia executable:

geert@tulgbfen1:~/src/julia$ uname -a
Linux tulgbfen1 3.16.0-30-generic #40-Ubuntu SMP Mon Jan 12 22:07:11 UTC 2015 ppc64le ppc64le ppc64le GNU/Linux
geert@tulgbfen1:~/src/julia$ ./julia
               _
   _       _ _(_)_     |  A fresh approach to technical computing
  (_)     | (_) (_)    |  Documentation: http://docs.julialang.org
   _ _   _| |_  __ _   |  Type "help()" for help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 0.4.0-dev+4187 (2015-04-08 19:01 UTC)
 _/ |\__'_|_|_|\__'_|  |  Commit 961bc33* (0 days old master)
|__/                   |  powerpc64le-linux-gnu

julia> versioninfo()
Julia Version 0.4.0-dev+4187
Commit 961bc33* (2015-04-08 19:01 UTC)
Platform Info:
  System: Linux (powerpc64le-linux-gnu)
  CPU: unknown
  WORD_SIZE: 64
  BLAS: libblas
  LAPACK: liblapack
  LIBM: libm
  LLVM: libLLVM-3.6.0

julia> f(x) = x*x
f (generic function with 1 method)

julia> code_native(f, (Int32,))
julia: codegen.cpp:1016: const jl_value_t* jl_dump_function_asm(void*): Assertion `fptr != 0' failed.

signal (6): Aborted
Aborted

I suspect that there are still problems using LLVM on Power.
That might also explain the sluggishness.

I would love to hear confirmation of others that run Julia on Power.
Any issues?

Jameson Nash

Apr 9, 2015, 10:16:57 AM4/9/15
to juli...@googlegroups.com
The linker often needs the -dev version of a package installed to be able to find a shared library.

Please open an issue for the code_native issue. It's probably something minor -- it was working for me last week.

Geert Janssen

Apr 13, 2015, 11:11:57 AM4/13/15
to juli...@googlegroups.com
Since I still have the feeling that Julia starts up very slowly on Power, even though now it compiles and builds
out of the box from GitHub master, I did some debugging using the LD_DEBUG feature of the dynamic linker.
I get tons of error messages for undefined symbols. Is this something to worry about? (I think yes.)
Here are snapshots of what I see:

LD_DEBUG=bindings ./julia
...
     95608:    binding file ./julia [0] to /gpfs/DDNgpfs1/geert/src/julia/usr/bin/../lib/libjulia.so [0]: normal symbol `jl_compress_ast'
     95608:    ./julia: error: symbol lookup error: undefined symbol: jlcall_print_44163 (fatal)
     95608:    ./julia: error: symbol lookup error: undefined symbol: julia_print_44163 (fatal)
     95608:    ./julia: error: symbol lookup error: undefined symbol: julia_show_44940 (fatal)
     95608:    ./julia: error: symbol lookup error: undefined symbol: julia_dec_44045 (fatal)
     95608:    ./julia: error: symbol lookup error: undefined symbol: julia_write_sub_44050 (fatal)


Using LD_DEBUG=libs ./julia, I see these errors:

...
     95789:    calling init: /gpfs/DDNgpfs1/geert/src/julia/usr/lib/julia/sys.so
     95789:
     95789:    ./julia: error: symbol lookup error: undefined symbol: jlcall_print_44163 (fatal)
     95789:    ./julia: error: symbol lookup error: undefined symbol: julia_print_44163 (fatal)
     95789:    ./julia: error: symbol lookup error: undefined symbol: julia_show_44940 (fatal)
     95789:    ./julia: error: symbol lookup error: undefined symbol: julia_dec_44045 (fatal)
     95789:    ./julia: error: symbol lookup error: undefined symbol: julia_write_sub_44050 (fatal)
     95789:    ./julia: error: symbol lookup error: undefined symbol: julia_pointer_44052 (fatal)
     95789:    ./julia: error: symbol lookup error: undefined symbol: julia_write_44055 (fatal)

I suspect that although sys.so gets created, it is somehow corrupt and symbol lookups fail.
Is that possible?



Jameson Nash

Apr 14, 2015, 12:21:28 AM4/14/15
to juli...@googlegroups.com
I think those symbol lookups are occurring from symbols that should only exist in memory (in the JIT code). I'm not sure why PPC is trying to look them up on disk. It does seem that llvm is getting confused and is unable to find a function by name after emitting it (in at least some cases) and is thus doing some amount of extra work.

I'm still seeing a significant improvement in runtime however.

Geert Janssen

Apr 14, 2015, 11:03:28 AM4/14/15
to juli...@googlegroups.com
Julia is dead slow on Power. It is simply unusable.
Something needs to be done here. Just doing a Pkg.update() takes minutes.
Console I/O (just typing) feels like using a 2400 Baud modem.

Geert Janssen

Apr 14, 2015, 11:05:57 AM4/14/15
to juli...@googlegroups.com
Doing strace ./julia gives me tons of mprotect calls.
Are they really necessary? Do they slow things down so much?
Is this a particular Power thing?


mprotect(0x3ffda03f0000, 65536, PROT_READ|PROT_EXEC) = 0
mprotect(0x3ffda03d0000, 65536, PROT_READ|PROT_EXEC) = 0
mprotect(0x3ffda03b0000, 65536, PROT_READ|PROT_EXEC) = 0
mprotect(0x3ffda0390000, 65536, PROT_READ|PROT_EXEC) = 0
mprotect(0x3ffda0370000, 65536, PROT_READ|PROT_EXEC) = 0
mprotect(0x3ffda0350000, 65536, PROT_READ|PROT_EXEC) = 0
mprotect(0x3ffda0330000, 65536, PROT_READ|PROT_EXEC) = 0
mprotect(0x3ffda02f0000, 65536, PROT_READ|PROT_EXEC) = 0
mprotect(0x3ffda02e0000, 65536, PROT_READ|PROT_EXEC) = 0
mprotect(0x3ffda02c0000, 65536, PROT_READ|PROT_EXEC) = 0
mprotect(0x3ffda02a0000, 65536, PROT_READ|PROT_EXEC) = 0
mprotect(0x3ffda0270000, 65536, PROT_READ|PROT_EXEC) = 0
mprotect(0x3ffda0260000, 65536, PROT_READ|PROT_EXEC) = 0
mprotect(0x3ffda0240000, 65536, PROT_READ|PROT_EXEC) = 0
mprotect(0x3ffda0210000, 65536, PROT_READ|PROT_EXEC) = 0
mprotect(0x3ffda0200000, 65536, PROT_READ|PROT_EXEC) = 0
mprotect(0x3ffda01e0000, 65536, PROT_READ|PROT_EXEC) = 0



Jameson Nash

Apr 14, 2015, 10:26:12 PM4/14/15
to juli...@googlegroups.com

Julia is slower the first time it tries to execute something, while it tries to JIT compile optimized code. It gets much faster for me for subsequent characters.

Profiling julia (operf -g -- usr/bin/julia-debug -E "rand(10,10)*randn(10,10)") doesn’t seem to reveal any particular “smoking guns” (tested with llvm3.6):

CPU: ppc64 POWER8, speed 3425 MHz (estimated)
Counted CYCLES events (Cycles) with a unit mask of 0x00 (No unit mask) count 1500000
samples  cum. samples  %        cum. %     image name               symbol name
1223     1223           2.5583   2.5583    libjulia-debug.so        llvm::sys::Memory::InvalidateInstructionCache(void const*, unsigned long)
964      2187           2.0165   4.5748    libjulia-debug.so        llvm::Value::getValueID() const
730      2917           1.5270   6.1019    libc-2.19.so             malloc
712      3629           1.4894   7.5913    libc-2.19.so             free
599      4228           1.2530   8.8443    libjulia-debug.so        jl_egal
523      4751           1.0940   9.9383    libjulia-debug.so        llvm::Type::getTypeID() const
512      5263           1.0710  11.0093    libjulia-debug.so        type_eqv_
467      5730           0.9769  11.9862    libjulia-debug.so        llvm::simplify_type<llvm::Value const* const>::getSimplifiedValue(llvm::Value const* const&)
442      6172           0.9246  12.9108    libc-2.19.so             _int_malloc
437      6609           0.9141  13.8249    libjulia-debug.so        llvm::simplify_type<llvm::Value const*>::getSimplifiedValue(llvm::Value const*&)
401      7010           0.8388  14.6637    libjulia-debug.so        computeKnownBits(llvm::Value*, llvm::APInt&, llvm::APInt&, llvm::DataLayout const*, unsigned int, (anonymous namespace)::Query const&)
400      7410           0.8367  15.5005    libjulia-debug.so        jl_method_table_assoc_exact
395      7805           0.8263  16.3267    libjulia-debug.so        llvm::isa_impl_cl<llvm::Instruction, llvm::Value const*>::doit(llvm::Value const*)
369      8174           0.7719  17.0986    libjulia-debug.so        llvm::isa_impl_wrap<llvm::Instruction, llvm::Value const* const, llvm::Value const*>::doit(llvm::Value const* const&)
361      8535           0.7552  17.8538    libjulia-debug.so        jl_tupleref
352      8887           0.7363  18.5901    libjulia-debug.so        llvm::Use::get() const
347      9234           0.7259  19.3160    libjulia-debug.so        lookup_type
335      9569           0.7008  20.0167    libjulia-debug.so        llvm::SmallPtrSetImplBase::insert_imp(void const*)
325      9894           0.6798  20.6966    libstdc++.so.6.0.19      /usr/lib/powerpc64le-linux-gnu/libstdc++.so.6.0.19
322      10216          0.6736  21.3701    libjulia-debug.so        llvm::InstCombiner::DoOneIteration(llvm::Function&, unsigned int)
306      10522          0.6401  22.0102    libjulia-debug.so        llvm::isa_impl<llvm::Instruction, llvm::Value, void>::doit(llvm::Value const&)
294      10816          0.6150  22.6252    libjulia-debug.so        cache_match
294      11110          0.6150  23.2402    libjulia-debug.so        llvm::cast_convert_val<llvm::Instruction, llvm::Value const*, llvm::Value const*>::doit(llvm::Value const* const&)
293      11403          0.6129  23.8532    libjulia-debug.so        jl_tupleref
292      11695          0.6108  24.4640    libc-2.19.so             __memset_power7
287      11982          0.6004  25.0643    libjulia-debug.so        jl_is_type
284      12266          0.5941  25.6584    libjulia-debug.so        jl_subtype_le
281      12547          0.5878  26.2462    libjulia-debug.so        int64hash
268      12815          0.5606  26.8068    libjulia-debug.so        llvm::PMDataManager::findAnalysisPass(void const*, bool)
267      13082          0.5585  27.3653    libjulia-debug.so        jl_apply_generic
266      13348          0.5564  27.9218    libjulia-debug.so        bool llvm::DenseMapBase<llvm::DenseMap<llvm::Instruction*, unsigned int, llvm::DenseMapInfo<llvm::Instruction*>, llvm::detail::DenseMapPair<llvm::Instruction*, unsigned int> >, llvm::Instruction*, unsigned int, llvm::DenseMapInfo<llvm::Instruction*>, llvm::detail::DenseMapPair<llvm::Instruction*, unsigned int> >::LookupBucketFor<llvm::Instruction*>(llvm::Instruction* const&, llvm::detail::DenseMapPair<llvm::Instruction*, unsigned int> const*&) const
254      13602          0.5313  28.4531    libjulia-debug.so        llvm::isa_impl_wrap<llvm::Instruction, llvm::Value const*, llvm::Value const*>::doit(llvm::Value const* const&)

The mprotect calls are needed to enable the eXecute bit on the pages. But there aren’t really that many calls, so they shouldn’t contribute much to the overall speed. (they barely show up in the profile, if at all).
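
For readers unfamiliar with the pattern, the following sketch shows what such a call sequence looks like in isolation: map a page read/write, then switch it to read/execute. It is not LLVM's memory manager, and the function name make_page_executable_demo is made up for illustration. It assumes a Linux-like system where anonymous mappings may be made executable.

```c
#define _DEFAULT_SOURCE          /* for MAP_ANONYMOUS under strict -std modes */
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Returns 0 if a fresh page could be switched to read/execute, -1 otherwise. */
static int make_page_executable_demo(void)
{
    long pagesz = sysconf(_SC_PAGESIZE);
    if (pagesz <= 0)
        return -1;

    void *p = mmap(NULL, (size_t)pagesz, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    /* A JIT would copy freshly generated machine code into p here. */

    int rc = mprotect(p, (size_t)pagesz, PROT_READ | PROT_EXEC);
    munmap(p, (size_t)pagesz);
    return rc;
}
```

Each such transition is one system call per region, so a trace full of mprotect lines mostly reflects how many separate code regions were emitted.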

Any idea why PPC64 ubuntu (as provided by https://www-304.ibm.com/partnerworld/wps/servlet/mem/ContentHandler/stg_com_sys_power-development-platform) is running with the kernel option CONFIG_PPC_64K_PAGES=y? I’m not really sure of the effect of this kernel option on performance.
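
One visible effect of 64K pages is granularity: page protection applies to whole pages, so every region is rounded up to a multiple of 65536 bytes, which is the length seen in each mprotect line of the strace output earlier in the thread. A small sketch of that rounding follows; round_up_to_page is a hypothetical helper, not something from Julia or LLVM.

```c
#include <assert.h>
#include <stddef.h>

/* Round len up to the next multiple of pagesz (pagesz must be a power of two). */
static size_t round_up_to_page(size_t len, size_t pagesz)
{
    assert(pagesz != 0 && (pagesz & (pagesz - 1)) == 0);
    return (len + pagesz - 1) & ~(pagesz - 1);
}
```

For example, a 100-byte function occupies 4096 bytes of protected memory with 4K pages and 65536 bytes with 64K pages if each function gets its own region.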

Geert Janssen

Apr 15, 2015, 9:32:10 AM4/15/15
to juli...@googlegroups.com
Jameson that's a good tip. I did an operf run too, exactly the same command.
I am using the non-debug julia executable, though. About 3/4 of the time seems to be spent in LLVM.
Will check with the sysadmin about the CONFIG_PPC_64K_PAGES option. Is that bad?

geert@tulgpu505:~/src/julia$ opreport
Using /home/geert/src/julia/oprofile_data/samples/ for samples directory.
CPU: ppc64 POWER8, speed 3923 MHz (estimated)
Counted CYCLES events (Cycles) with a unit mask of 0x00 (No unit mask) count 1500000
   CYCLES:1500000|
  samples|      %|
------------------
    17971 100.000 julia
       CYCLES:1500000|
      samples|      %|
    ------------------
        13884 77.2578 libLLVM-3.6.so
         1794  9.9827 libjulia.so
         1600  8.9032 libc-2.19.so
          246  1.3689 no-vmlinux
          225  1.2520 sys.so
          205  1.1407 libstdc++.so.6.0.20
           15  0.0835 ld-2.19.so
            1  0.0056 anon (tgid:84195 range:0x3ffd92ae0000-0x3ffd92ccffff)
            1  0.0056 libdl-2.19.so

geert@tulgpu505:~/src/julia$ opreport -a -l -s sample | more
Using /home/geert/src/julia/oprofile_data/samples/ for samples directory.
CPU: ppc64 POWER8, speed 3923 MHz (estimated)
Counted CYCLES events (Cycles) with a unit mask of 0x00 (No unit mask) count 1500000
samples  cum. samples  %        cum. %     image name               symbol name
1106     1106           6.1547   6.1547    libLLVM-3.6.so           llvm::sys::Memory::InvalidateInstructionCache(void const*, unsigned long)
353      1459           1.9644   8.1191    libc-2.19.so             malloc
342      1801           1.9032  10.0223    libc-2.19.so             _int_free
264      2065           1.4691  11.4914    libLLVM-3.6.so           computeKnownBits(llvm::Value*, llvm::APInt&, llvm::APInt&, llvm::DataLayout const*, unsigned int, (anonymous namespace)::Query const&)
246      2311           1.3689  12.8603    no-vmlinux               /no-vmlinux
228      2539           1.2688  14.1291    libc-2.19.so             _int_malloc
214      2753           1.1909  15.3200    libLLVM-3.6.so           llvm::PassRegistry::getPassInfo(void const*) const
211      2964           1.1742  16.4942    libLLVM-3.6.so           llvm::InstCombiner::DoOneIteration(llvm::Function&, unsigned int)
205      3169           1.1408  17.6349    libstdc++.so.6.0.20      /usr/lib/powerpc64le-linux-gnu/libstdc++.so.6.0.20
193      3362           1.0740  18.7090    libc-2.19.so             __memset_power7
192      3554           1.0684  19.7774    libLLVM-3.6.so           bool llvm::DenseMapBase<llvm::DenseMap<llvm::Instruction*, unsigned int, llvm::DenseMapInfo<llvm::Instruction*>, llvm::detail::DenseMapPair<llvm::Instruction*, unsigned int> >, llvm::Instruction*, unsigned int, llvm::DenseMapInfo<llvm::Instruction*>, llvm::detail::DenseMapPair<llvm::Instruction*, unsigned int> >::LookupBucketFor<llvm::Instruction*>(llvm::Instruction* const&, llvm::detail::DenseMapPair<llvm::Instruction*, unsigned int> const*&) const
178      3732           0.9905  20.7679    libLLVM-3.6.so           llvm::PMDataManager::findAnalysisPass(void const*, bool)
175      3907           0.9738  21.7418    libLLVM-3.6.so           llvm::SmallPtrSetImplBase::insert_imp(void const*)
175      4082           0.9738  22.7156    libjulia.so              jl_method_table_assoc_exact
167      4249           0.9293  23.6450    libjulia.so              lookup_type
140      4389           0.7791  24.4240    libjulia.so              jl_egal
137      4526           0.7624  25.1864    libjulia.so              ios_getc
134      4660           0.7457  25.9321    libLLVM-3.6.so           llvm::PMTopLevelManager::findAnalysisPassInfo(void const*) const
132      4792           0.7346  26.6667    libc-2.19.so             __strncmp_power7
117      4909           0.6511  27.3178    libc-2.19.so             __memcmp_power7
116      5025           0.6455  27.9633    libLLVM-3.6.so           llvm::TargetLibraryInfo::getLibFunc(llvm::StringRef, llvm::LibFunc::Func&) const
107      5132           0.5954  28.5587    libjulia.so              jl_deserialize_value_
104      5236           0.5787  29.1375    libLLVM-3.6.so           llvm::PMTopLevelManager::findAnalysisPass(void const*)
103      5339           0.5732  29.7106    libLLVM-3.6.so           llvm::DataLayout::getTypeSizeInBits(llvm::Type*) const
100      5439           0.5565  30.2671    libLLVM-3.6.so           llvm::Value::stripPointerCasts()


Jameson Nash

Apr 19, 2015, 2:31:15 PM4/19/15
to juli...@googlegroups.com
> Will check with the sysadmin about the CONFIG_PPC_64K_PAGES option. Is that bad?

It's good for some workflows and bad for others. Since I was using the system without hitting swap space, it probably didn't have any impact. Is Power simply slower than x86_64? I'm primarily used to dealing with higher-end x64 chips, so I'm not in a good position to make an objective judgement. fwiw, the performance feels similar to a mid-range Atom processor I use for Windows development.