It is important to note that GHC does not generate loops.
At least, not in the sense that a C compiler like LLVM recognises. Thus, questions surrounding optimisations such as polly or vectorisation are complete non-starters.
More concretely, consider this program:
test1 :: Array U DIM1 Double -> IO (Array U DIM1 Double)
test1 xs = computeUnboxedP $ R.map (+1) xs
Compile with the usual swath of options and add -keep-llvm-files, or just look at the output I saved here (as the generated names are likely to change): https://gist.github.com/tmcdonell/490e336274d8bff943b0259b13382907
The work happens with the fadd instruction, for example in function @siOS_info$def. That function then does a tail call to @siOR_info$def, which itself then tail calls back to @siOS_info$def. This kind of looping structure is, demonstrably, entirely opaque to LLVM’s optimisations.
As Ben mentioned above, if you really care about this sort of thing you need to write your own code generator. Here is what happens if you give LLVM the above in a form it recognises: https://gist.github.com/tmcdonell/3efb7345b227e4b980b36fe4ca315ba5
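As an illustration (a minimal stdlib-only sketch, not repa's actual generated code, and `plusOneLoop` is a hypothetical name): the kernel behind R.map (+1) boils down to a tail-recursive index loop like `go` below. GHC compiles `go` as a tail call through its own info table rather than as a backwards branch to a loop header, which is exactly the shape LLVM's loop passes cannot recognise.

```haskell
{-# LANGUAGE BangPatterns #-}
module Main where

-- Hedged sketch: roughly the loop shape that an element-wise (+1)
-- kernel reduces to. GHC emits `go` as a tail call between info
-- tables, not a branch-based loop, so LLVM's vectoriser never fires.
plusOneLoop :: Int -> (Int -> Double) -> [Double]
plusOneLoop n index = go 0
  where
    go !i
      | i >= n    = []
      | otherwise = (index i + 1) : go (i + 1)

main :: IO ()
main = print (plusOneLoop 4 fromIntegral)  -- prints [1.0,2.0,3.0,4.0]
```

In the real repa case the loop body writes into an unboxed mutable array instead of building a list, but the control flow GHC produces has the same tail-call structure.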
-Trevor
--
You received this message because you are subscribed to the Google Groups "Haskell Repa" group.
To unsubscribe from this group and stop receiving emails from it, send an email to haskell-repa...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
I don’t know if anybody has explicitly studied the difference, but if you come across anything send it my way. My intuition tells me that LLVM is just generally better, but the NCG (GHC’s native code generator) knows more about the peculiarities of the code GHC produces. Depending on your program, one of those is more important than the other.
LLVM has many different optimisation passes and you are free to specify as many or as few as you want, in any order and multiple times. In fact, if you just feed the output from my previous email through opt -O3 again, it turns the tail calls into local branches. It still doesn’t vectorise, though…
Repeating this experiment may be interesting:
https://donsbot.wordpress.com/2010/03/01/evolving-faster-haskell-programs-now-with-llvm/
-Trev
On Sun, 12 Feb 2017 at 04:37 ‘Dominic Steinitz’ via Haskell Repa [haskel...@googlegroups.com](mailto:haskel...@googlegroups.com) wrote: