On Fri, Feb 13, 2015 at 05:39:42AM -0800, Dan wrote:
> It isn't strange that you just can't? Why FBP can not be used all the
> way through?
FBP can be used to implement a sort component.
Here is an FBP mergesort written in bash shell.
http://sam.nipl.net/code/mergesort
It uses recursion to create a FBP network right down to the lowest level
of the mergesort. This could be fast if implemented using electronic
circuitry, however in bash it is quite slow and impractical due to the
large number of processes that will be created, the overhead of IPC, the
slowness of bash, and the limited parallism offered by current
computers. (There are better designs for sorting quickly in circuitry.)
It would work better (on current computers) if we stop splitting into
more processes after we have enough processes to keeep all the cores
busy, and use a regular single-process sort after that. For an 8 core
system - split input into 8 parts, sort each part using a single process
(in parallel), merge by pairs (7x parallel).
Here's a program sort8 which does that:
text:
http://sam.nipl.net/code/net2sh/eg/sort8
graph:
http://sam.nipl.net/code/net2sh/g/eg/sort8.dot.png
perf. test, sorting 99171 words, a shuffled dictionary, on 8 cores:
regular GNU sort(1):
real 0m0.261s
user 0m0.256s
sys 0m0.004s
sort8:
real 0m0.123s
user 0m0.508s
sys 0m0.053s
It uses quite a lot more power but completes in less than half the time
compared to a regular single-process sort. Splitting the merge phase
like this makes it quicker compared to a single process merge.
In fact GNU sort(1) uses multiple threads for large files, but not in
this case.
Sam