Google Groups

Re: More generic parfoldr than this...


Kevin Hammond Jul 23, 2011 8:37 AM
Posted in group: parallel-haskell
My first reaction is that you should be defining parFold rather than parFoldr. This will naturally have a different type signature (the function f should be both commutative and associative). But I'll look on Monday when I don't just have my phone.

Best Wishes,
Kevin

---------

Kevin Hammond
Professor in Computer Science
University of St Andrews

E: k...@cs.st-andrews.ac.uk
T: +44-1334-473241
F: +44-1334-463278

On 23 Jul 2011, at 16:15, Prabhat Totoo <prabha...@gmail.com> wrote:

> My aim is to have a parallel foldr function. At first, it seemed
> rather straight forward to achieve and this is what I had in mind:
>
> First break up the input list into partitions based on the number of
> cores (numCapabilities). Then apply foldr to each partition, which
> will result in a list of folded values for each partition. Then do a
> foldr again on that list to obtain the final value.
>
> listChunkSize = numCapabilities
>
> chunk n [] = []
> chunk n xs = ys : chunk n zs
>  where (ys,zs) = splitAt n xs
>
> parfoldr f z [] = z
> parfoldr f z xs = res
>  where
>    parts = chunk listChunkSize xs
>    partsRs = map (foldr f z) parts `using` parList rdeepseq
>    res = foldr f z partsRs
>
> The above code does not work because obviously the definition of
> foldr, (a -> b -> b) -> b -> [a] -> b, implies that the input list
> type is (well, can be) different from the accumulator and result type.
>
> For example,
> 1) foldr (+) 0 [1..10] => list type = accumulator type (Integer)
> 2) foldr (\i acc -> (i>5) && acc) True [1..10] => list type (Integer) !
> = accumulator type (Bool)
>
> So, looking at my code above, the map will generate a list of type `b`
> which is then passed as argument to the second foldr. But the second
> foldr accepts list of type `a`. So, that won't work.
>
> An ugly solution would be to provide a different type signature for
> the parfoldr, e.g.
> parfoldr :: (NFData a) => (a -> a -> a) -> a -> [a] -> a
>
> This will work but then it is not exactly equivalent to foldr. Example
> 1 above will do just fine, but not example 2.
> So, question 1 is: how to define parfoldr to have same type signature
> as foldr?
>
> Comparing the 2 folds:
> input = [1..1000000]
> seqfold = foldr (+) 0
> parfold = parfoldr (+) 0
>
> I get the foll. times on a dual core machine:
> (no -threaded flag)
> $ ./test
> seqfold: 4.99s
> parfold: 25.16s
>
> (-threaded flag on)
> $ ./test
> seqfold: 5.32s
> parfold: 25.55s
> $ ./test +RTS -N1
> seqfold: 5.32s
> parfold: 25.53s
> $ ./test +RTS -N2
> seqfold: 3.48s
> parfold: 3.68s
> $ ./test +RTS -N3
> seqfold: 3.57s
> parfold: 2.36s
> $ ./test +RTS -N4
> seqfold: 3.03s
> parfold: 1.70s
>
> Observations from these measurements:
> - foldr seems to give lower runtime when num of cores is increased.
> why is that?
> - parfold gives better runtimes for N => 3.
>
> Any suggestions and ideas for improvement is appreciated :)
>
> Prabhat