I usually use a manually written loop, but you can use Data.Vector for this and it should fuse.
John L.
It's not the lack of list fusion per se. If you have a
forM_ [a .. b] $ \n -> do stuff
where the type is Int, GHC is perfectly capable of eliminating the list and
rewriting it to a loop (usually on par with a hand-written loop, although the
core you get from a hand-written loop often is smaller and nicer to look at).
The problem is that you use the same list multiple times, and GHC thinks "hey,
let me re-use that", so it gets floated to the top-level, and the inner "loop"
really uses the list, which, apart from the `case`s on the list, forces it
to use boxed `Int`s instead of unboxed `Int#`s. The boxed `Int`s then need
to be unboxed again before they can be used, oops.
If you manage to conceal from GHC the fact that the lists are the same, it
eliminates the lists and you get fast code with forM_ too.
In your matmult example, using
forM_ [k-k .. _SIZE-1] $ \j -> do ...
does the trick for GHC-7.0 to 7.8.
That is of course a brittle workaround: future GHCs might constant-fold the
k-k to 0 and share the list again.
> a manually written loop is 10x faster.
>
> How do you people work around this?
Do the idiomatic thing and write a loop ;)
>
> [1]: https://ghc.haskell.org/trac/ghc/ticket/8763
Cheers,
Daniel
More or less. The `_SIZE - 1` is constant-folded to 511, so writing the list
as [0 .. 511] would probably not prevent the sharing. And using the same list
in a different function may or may not affect the produced code, I don't know
GHC well enough to predict that.
If you write it in a form that GHC doesn't recognise as the same value, GHC
can and does optimise it as if the value were used only once.
>
> If so, how far does that go, and would GHC even notice that it is the
> same as a potential `[0.._511]` I might write elsewhere?
Depends on "elsewhere". I've seen GHC produce multiple top-level values for
the very same thing used in different functions in the same module, so if
"elsewhere" means a different function and inlining doesn't cause GHC to see
the two at the same time, it will probably not notice. But ask somebody who
knows how GHC works if you want to be sure.
>
> > Do the idiomatic thing and write a loop ;)
>
> Unfortunately, `forM_ [1..n]` is pretty much the most idiomatic and
> beautiful way I can come up with when asked for an iteration over 1 up to
> n!
Well, my idiom is looping, precisely because of this.
> So this better be fixed :)
Agreed.
On Apr 27, 2014 1:10 PM, "Niklas Hambüchen" <ma...@nh2.me> wrote:
>
> On 27/04/14 03:28, John Lato wrote:
> > No, I mean this:
> >
> >> V.forM_ (V.enumFromN 1 n)
>
> Hey John,
>
> unfortunately that's not fast, and it eats all the RAMs.
>
> Here comes the sad comparison of looping in Haskell:
>
> - loop (maxBound :: Word32)
> 3 seconds, constant space
>
> - forM_ [0..maxBound :: Word32]
> 36 seconds, constant space
>
> - V.forM_ (V.enumFromTo 0 (maxBound :: Word32))
> 37 seconds, constant space
>
> - V.forM_ (V.enumFromN 0 (fromIntegral (maxBound :: Word32)))
> linear space -> crashes
>
> All loops execute `return ()`.
Hmm. Are you using regular vectors or unboxed? Also what sort of crash? Is it a stack space overflow or are you on a 32-bit platform?
On 27/04/14 23:18, John Lato wrote:
> Hmm. Are you using regular vectors or unboxed? Also what sort of crash?
I'm using regular Data.Vector as V.
> Is it a stack space overflow or are you on a 32-bit platform?
It's just eating my 8GB and then I kill it (or let my ulimit do it). It
can't be the stack space since I'm on 7.6 where stack space is limited
by default.
It's on 64 bit Linux.
On 28/04/14 00:22, John Lato wrote:
> Are unboxed vectors faster? My rule of thumb is to use them over
> Data.Vector whenever possible.
I haven't checked yet, but should it matter? Because my goal is that the
vector never be created *at all*, and boxed or not shouldn't make a
difference on that!
> I would expect it's because you never force the argument. With
> `enumFromTo` the argument is forced because it needs to be checked for
> termination, but `enumFromN` is probably building up a big chain of
> thunks. I guess for this case `enumFromN` has no benefit over
> `enumFromTo` because the intention is to create a single loop instead of
> actually allocating the vector, so the warning in the documentation
> doesn't necessarily apply.
Also haven't checked that yet, but I suspect that instead of something
thunk-related, the thing plainly allocates the vector.
Just to clarify: `V.enumFromTo` works much better than `V.enumFromN`
because in contrast to the latter it doesn't actually try to create the
fully sized vector.
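To make the distinction concrete, here is a small runnable sketch (my own, with the bound shrunk so it finishes instantly; with the real maxBound :: Word32 it is the enumFromN variant that was observed above to allocate and eat all the RAM):

```haskell
import qualified Data.Vector as V
import Data.IORef
import Data.Word (Word32)

-- Drive the same body with both enumeration styles and count iterations.
countBoth :: IO Int
countBoth = do
  counter <- newIORef 0
  -- enumFromTo: fuses with forM_; per the thread, no intermediate
  -- vector survives, so it runs in constant space.
  V.forM_ (V.enumFromTo (0 :: Word32) 1000) $ \_ ->
    modifyIORef' counter (+ 1)
  -- enumFromN: per the observations above, this one actually builds
  -- the 1001-element vector before the traversal.
  V.forM_ (V.enumFromN (0 :: Word32) 1001) $ \_ ->
    modifyIORef' counter (+ 1)
  readIORef counter
```

Both enumerations produce 1001 elements here; the difference is only in whether the vector is materialised along the way.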
I've just uploaded my benchmarks to https://github.com/nh2/loop.
Please take a look. There are some interesting results.
The first thing I don't understand at all is:
http://htmlpreview.github.io/?https://github.com/nh2/loop/blob/master/results/bench.html
See how w32/loop and w32/unsafeLoop are equally fast. Then look at
http://htmlpreview.github.io/?https://github.com/nh2/loop/blob/master/results/bench-traverse-w32.html
Here I run the same thing over the whole of Word32.
See how `loop` is faster here than `unsafeLoop`? How does that make sense?
-- Note that some types (e.g. Word32) have bounds checks even for
-- `toEnum`.
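That note about toEnum bounds checks would explain it. I don't have the package source in front of me, but roughly the difference is between stepping with succ (which checks for maxBound on bounded types) and stepping via fromEnum/toEnum, and for Word32 even toEnum performs a range check. A sketch of the two stepping strategies (names and signatures are mine, not the loop package's API):

```haskell
import Data.IORef
import Data.Word (Word32)

-- Step with succ: on bounded types, succ checks for maxBound.
-- Precondition: start <= end.
loopSucc :: (Enum e, Ord e) => e -> e -> (e -> IO ()) -> IO ()
loopSucc start end f = go start
  where
    go x = do
      f x
      if x >= end then return () else go (succ x)

-- Step via fromEnum/toEnum: avoids succ's maxBound check, but for some
-- types (e.g. Word32) toEnum itself still performs a bounds check, so
-- this is not necessarily cheaper.
loopToEnum :: (Enum e, Ord e) => e -> e -> (e -> IO ()) -> IO ()
loopToEnum start end f = go start
  where
    go x = do
      f x
      if x >= end then return () else go (toEnum (fromEnum x + 1))

-- Count iterations over Word32 with a given looper.
countWith :: (Word32 -> Word32 -> (Word32 -> IO ()) -> IO ()) -> IO Int
countWith looper = do
  ref <- newIORef 0
  looper 0 100 $ \_ -> modifyIORef' ref (+ 1)
  readIORef ref
```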
Next thing:
It seems that V.enumFromTo and V.fromList are actually the same:
http://hackage.haskell.org/package/vector-0.10.9.1/docs/src/Data-Vector-Fusion-Stream-Monadic.html#enumFromTo
However, in my benchmark, at least for `Int`, the performance is
different; am I overlooking something?
On 28/04/14 01:34, John Lato wrote:
> It can make a difference in that, with unboxed vectors, the compiler can
> statically determine that it is able to use unboxed values, and
> therefore is more likely to do so. Having finally broken down and run
> some tests, I can report that on my system using V.enumFromTo with
> unboxed vectors results in the same performance as the hand-written loop.
I cannot see a difference between Vector.enumFromTo and
Vector.Unboxed.enumFromTo in my benchmark.
Vector.enumFromTo is as fast as the hand-written loop, but only for
`Int`. For `Word32`, it is 5 times slower. Any idea why?
Rule fired: enumFromTo<Int> [Stream]
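That rule firing is the likely explanation: vector's stream enumFromTo has a rewrite rule specialised to Int, while Word32 falls through to a generic Enum-based path. One hedged workaround (my suggestion, assuming the range fits in Int, which holds for Word32 on 64-bit) is to drive the loop with Int and convert in the body:

```haskell
import qualified Data.Vector as V
import Data.IORef
import Data.Word (Word32)

-- Loop over Int so the specialised enumFromTo<Int> rule can fire, and
-- convert each index to Word32 inside the body.
sumWord32 :: Int -> IO Word32
sumWord32 n = do
  ref <- newIORef 0
  V.forM_ (V.enumFromTo (0 :: Int) n) $ \i ->
    modifyIORef' ref (+ fromIntegral i)
  readIORef ref
```

Whether this recovers the full Int-speed path is something to verify against the benchmark; the conversion itself should be free on 64-bit.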