My first reaction is that you should be defining parFold rather than parFoldr. This will naturally have a different type signature (the function f should be both commutative and associative). But I'll look on Monday when I don't just have my phone.

Best Wishes,

Kevin

---------

Kevin Hammond

Professor in Computer Science

University of St Andrews

E: k...@cs.st-andrews.ac.uk

T: +44-1334-473241

F: +44-1334-463278

On 23 Jul 2011, at 16:15, Prabhat Totoo <prabha...@gmail.com> wrote:

> My aim is to have a parallel foldr function. At first, it seemed

> rather straight forward to achieve and this is what I had in mind:

>

> First break up the input list into partitions based on the number of

> cores (numCapabilities). Then apply foldr to each partition, which

> will result in a list of folded values for each partition. Then do a

> foldr again on that list to obtain the final value.

>

> listChunkSize = numCapabilities

>

> chunk n [] = []

> chunk n xs = ys : chunk n zs

> where (ys,zs) = splitAt n xs

>

> parfoldr f z [] = z

> parfoldr f z xs = res

> where

> parts = chunk listChunkSize xs

> partsRs = map (foldr f z) parts `using` parList rdeepseq

> res = foldr f z partsRs

>

> The above code does not work because obviously the definition of

> foldr, (a -> b -> b) -> b -> [a] -> b, implies that the input list

> type is (well, can be) different from the accumulator and result type.

>

> For example,

> 1) foldr (+) 0 [1..10] => list type = accumulator type (Integer)

> 2) foldr (\i acc -> (i>5) && acc) True [1..10] => list type (Integer) !

> = accumulator type (Bool)

>

> So, looking at my code above, the map will generate a list of type `b`

> which is then passed as argument to the second foldr. But the second

> foldr accepts list of type `a`. So, that won't work.

>

> An ugly solution would be to provide a different type signature for

> the parfoldr, e.g.

> parfoldr :: (NFData a) => (a -> a -> a) -> a -> [a] -> a

>

> This will work but then it is not exactly equivalent to foldr. Example

> 1 above will do just fine, but not example 2.

> So, question 1 is: how to define parfoldr to have same type signature

> as foldr?

>

> Comparing the 2 folds:

> input = [1..1000000]

> seqfold = foldr (+) 0

> parfold = parfoldr (+) 0

>

> I get the foll. times on a dual core machine:

> (no -threaded flag)

> $ ./test

> seqfold: 4.99s

> parfold: 25.16s

>

> (-threaded flag on)

> $ ./test

> seqfold: 5.32s

> parfold: 25.55s

> $ ./test +RTS -N1

> seqfold: 5.32s

> parfold: 25.53s

> $ ./test +RTS -N2

> seqfold: 3.48s

> parfold: 3.68s

> $ ./test +RTS -N3

> seqfold: 3.57s

> parfold: 2.36s

> $ ./test +RTS -N4

> seqfold: 3.03s

> parfold: 1.70s

>

> Observations from these measurements:

> - foldr seems to give lower runtime when num of cores is increased.

> why is that?

> - parfold gives better runtimes for N => 3.

>

> Any suggestions and ideas for improvement is appreciated :)

>

> Prabhat