FBP and number crunching

Paul Morrison

Mar 2, 2017, 10:40:41 AM
to Flow Based Programming
Recently I came across some articles suggesting that FORTRAN and C are the two front-runners for intensive number crunching apps, e.g. http://stackoverflow.com/questions/8997039/why-is-fortran-used-for-scientific-computing .

Given that FBP supports multiple cores very naturally, and is also good for converting legacy code, what would be a good language for doing the computation-intensive parts of an FBP application?  We do have a C[++] FBP implementation, CppFBP -  https://github.com/jpaulm/cppfbp - which uses BOOST to handle its multi-processing, but the authors of that article feel C (or C++?) is too complex, and probably does not support arrays of more than one dimension (?).  If you like scripting languages, CppFBP has a Lua interface, but I don't know how its table manipulation performance compares with FORTRAN's array handling...

Sam Watkins

Mar 2, 2017, 10:24:58 PM
to flow-based-...@googlegroups.com
hi Paul,

Python can also be good for rapid number crunching using numpy, or so I've heard.

I would personally prefer to use C.

If you really need top performance for parallel number crunching, you need a
GPU backend (or an ASIC but that would be going too far for most applications).

Fortran can have better performance than C, but in my opinion it's not worth
it, and I'd rather use C.

As for C++.... I've studied it and used it, but I don't love it!

C++: an octopus made by nailing extra legs onto a dog. — Steve Taylor
*YOU* are full of bullshit. C++ is a horrible language. ... - Linus Torvalds

As you know, for many applications if your pipes / channels are long enough,
and the processes are substantial enough, and the network is well-designed, the
processes can act on a large batch of records at once and inter-process
communications and context switching should not slow it down very much.


Sam
> --
> You received this message because you are subscribed to the Google Groups "Flow Based Programming" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to flow-based-progra...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

Paul Morrison

Mar 3, 2017, 10:11:13 AM
to Flow Based Programming, s...@nipl.net
Hi Sam,

Yes, it seems to me that interpreted languages can never compete performance-wise with C - but IMO writing a compiler is a lot harder than writing an interpreter.  Plus, there seem to be quite a lot of different hardware instruction sets, so it makes sense to have an intermediate layer.   I assume C does support a wide range of instruction sets... 

OTOH I think C only supports one-dimensional arrays.  PL/I supported multiple dimensions, but not very well - APL was the most intuitive at array handling but is still interpreted, and Jim Weigang said in 1994 "well-written (i.e., no unnecessary loops) real-life APL code typically runs about 4 times slower than compiled code, give or take a factor of two. (Assuming the problem can be vectorized to some extent.) " - http://www.chilton.com/~jimw/bnchmrks.html .

For the difference between PL/I and APL array handling, see p. 180, 2nd ed. of my book:

In PL/I you can write the following statement:

A = A + A(2,3);

This is implemented in a very procedural manner. PL/I executes this statement one element at a time, and does it in such a way that the rightmost dimension is the one which cycles most rapidly. So the sequence will be A(1,1), A(1,2), ..., A(1,n), A(2,1), A(2,2), ..., and so on. When these additions hit A(2,3), all “later” elements (those following A(2,3)) will have the new (doubled) value added to them, rather than the original one.

In APL, on the other hand, matrices are treated as “aggregates”, which are conceptually handled all at the same moment of time. Thus, you can write

A ← A + A[2;3]

and it behaves almost like the PL/I example, but A[2;3] does not change halfway through execution [italics added].
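
A minimal C sketch of the difference described above, not taken from the book. The function names `add_elementwise` and `add_aggregate` and the 3 x 4 array size are illustrative assumptions. C indices are zero-based, so `a[1][2]` stands in for PL/I's A(2,3).

```c
#include <assert.h>
#include <stddef.h>

enum { ROWS = 3, COLS = 4 };  /* illustrative size, not from the post */

/* Element-at-a-time update in the order the post describes for PL/I:
   the rightmost index varies fastest, and a[1][2] is re-read on every
   step, so elements after it see the already doubled value. */
void add_elementwise(double a[ROWS][COLS])
{
    assert(a != NULL);
    for (size_t i = 0; i < ROWS; i++)
        for (size_t j = 0; j < COLS; j++)
            a[i][j] += a[1][2];
}

/* Aggregate-style update, as the post describes for APL: the scalar is
   read once, before any element of the array is modified. */
void add_aggregate(double a[ROWS][COLS])
{
    assert(a != NULL);
    const double addend = a[1][2];
    for (size_t i = 0; i < ROWS; i++)
        for (size_t j = 0; j < COLS; j++)
            a[i][j] += addend;
}
```

With every element initialised to 1, `add_elementwise` leaves 2 in the elements up to and including `a[1][2]` and 3 in all later ones, while `add_aggregate` leaves 2 everywhere.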

Regards,

Paul


John Cowan

Mar 3, 2017, 12:00:17 PM
to flow-based-...@googlegroups.com

On Fri, Mar 3, 2017 at 10:11 AM, Paul Morrison <paul.m...@rogers.com> wrote:

> OTOH I think C only supports one-dimensional arrays.

No, it supports multidimensional arrays in the same style as Fortran, except that it uses row-major rather than column-major order.
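
A small C sketch of what row-major versus column-major means for element offsets. The helper names (`row_major_offset`, `col_major_offset`, `layout_is_row_major`) and the 2 x 3 size are assumptions made up for illustration; only the layout rule itself comes from the C language.

```c
#include <assert.h>
#include <stddef.h>

enum { NROWS = 2, NCOLS = 3 };  /* illustrative size */

/* Offset, in elements, of a[i][j] from the start of the array under
   C's row-major layout: the last index varies fastest. */
size_t row_major_offset(size_t i, size_t j)
{
    assert(i < NROWS && j < NCOLS);
    return i * NCOLS + j;
}

/* Offset the same element would have under a Fortran-style
   column-major layout: the first index varies fastest. */
size_t col_major_offset(size_t i, size_t j)
{
    assert(i < NROWS && j < NCOLS);
    return j * NROWS + i;
}

/* Returns 1 if the compiler's actual layout of int a[NROWS][NCOLS]
   matches the row-major formula for every element, 0 otherwise. */
int layout_is_row_major(void)
{
    static int a[NROWS][NCOLS];
    for (size_t i = 0; i < NROWS; i++) {
        for (size_t j = 0; j < NCOLS; j++) {
            size_t bytes = (size_t)((const char *)&a[i][j] - (const char *)a);
            if (bytes != row_major_offset(i, j) * sizeof a[0][0])
                return 0;
        }
    }
    return 1;
}
```

In practice the difference matters mostly for loop order: in C the inner loop should normally run over the last index so that memory is walked contiguously, whereas in Fortran it should run over the first.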
 
> PL/I supported multiple dimensions, but not very well - APL was the most intuitive at array handling but is still interpreted,

The first JIT compiler ever used in a commercial product was for APL, in the HP implementation of 1977, though it JIT-compiled to bytecode rather than HP machine language.  See also <https://en.wikipedia.org/wiki/APL_(programming_language)#Compilers>.

ern0

Mar 4, 2017, 6:29:09 AM
to Flow Based Programming
>> OTOH I think C only supports one-dimensional arrays.

C supports anything and everything.

> No, it supports multidimensional arrays in the same style as Fortran, except
> that it uses row-major rather than column-major order.

Whatever you call it, it's really first index and second index, but
you can also name it street and house number.
--
ern0
dataflow evangelist

Paul Morrison

Mar 4, 2017, 11:32:38 AM
to flow-based-...@googlegroups.com
Thanks, Ernő, I didn't know that!  So maybe C in combination with CppFBP would be a pretty good solution for this area...?


Paul Morrison

Mar 4, 2017, 11:38:13 AM
to flow-based-...@googlegroups.com
Interesting, John!  I never understood why Iverson used all those special characters - why not just use names, with some kind of escape mechanism?  I just looked at J - https://en.wikipedia.org/wiki/J_(programming_language) - and actually it looks pretty powerful, and uses that technique IIUC.


Paul Tarvydas

Mar 4, 2017, 2:25:28 PM
to flow-based-...@googlegroups.com
On 2017-03-04 11:32 AM, Paul Morrison wrote:
> ... So maybe C in combination with CppFBP would be a pretty good
> solution for this area...?

It is my impression that "new" / "unusual" ideas are more likely to be
accepted in technologies at the bleeding edge, rather than technologies
with existing solutions.

The bleeding edge of number crunching appears to be TensorFlow, open
sourced by Google recently. TF is used for ML - machine learning, deep
learning, self-driving cars, etc.

The front-end for TF is Python. The back-end is C++. [I believe that
the core of numpy is C].

At present, IIUC, ML is hit and miss. They create graphs using Python,
then visualize them using TensorBoard, then compile and execute them in C++.

Andrew Ng says that ML is the "new electricity".

pt

Samuel Lampa

Mar 6, 2017, 6:25:35 AM
to Flow Based Programming
On Saturday, March 4, 2017 at 8:25:28 PM UTC+1, Paul Tarvydas wrote:
> The bleeding edge of number crunching appears to be TensorFlow, open
> sourced by Google recently.

Indeed. Maybe a discussion of TensorFlow vs. FBP would be useful some time as well (in another thread perhaps). 
It seems pretty data flow-ish, albeit unfortunately not incorporating useful FBP stuff like ports :/

Also, it surprises me a bit they are not using Go more, as it is a Google language and would give them a lot of data flow / fbp-features out of the box.

// Samuel

Tom Young

Mar 6, 2017, 1:33:56 PM
to Flow Based Programming
Yes, C supports fixed-size multidimensional arrays, e.g. 'int hwd[100][100][100];', but
dynamically allocated multidimensional arrays require special handling, which an interpreted language can presumably avoid, at some performance cost.
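
One common form of that special handling, sketched in C under stated assumptions: the whole matrix lives in a single heap block and the two-dimensional index is computed by hand. The `Matrix` type and the `matrix_alloc`, `matrix_at` and `matrix_free` helpers are hypothetical names invented for this example, not part of any library mentioned in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper type: a rows x cols matrix of doubles stored in
   one contiguous heap block, in row-major order. */
typedef struct {
    size_t rows, cols;
    double *data;
} Matrix;

/* Allocates the matrix with all bytes zeroed. data is left NULL if a
   dimension is zero, if rows * cols would overflow size_t, or if the
   allocation itself fails. */
Matrix matrix_alloc(size_t rows, size_t cols)
{
    Matrix m = { rows, cols, NULL };
    if (rows != 0 && cols != 0 && cols <= SIZE_MAX / rows)
        m.data = calloc(rows * cols, sizeof *m.data);
    return m;
}

/* Returns a pointer to element (i, j); the index arithmetic that a
   fixed-size declaration would do automatically is written out here. */
double *matrix_at(const Matrix *m, size_t i, size_t j)
{
    assert(m->data != NULL && i < m->rows && j < m->cols);
    return &m->data[i * m->cols + j];
}

/* Releases the storage and marks the matrix as empty. */
void matrix_free(Matrix *m)
{
    free(m->data);
    m->data = NULL;
}
```

A caller would write something like `Matrix m = matrix_alloc(3, 4); *matrix_at(&m, 1, 2) = 7.5; matrix_free(&m);`, checking `m.data` for NULL after allocation. Keeping the data contiguous, rather than allocating an array of row pointers, preserves the same memory layout a fixed-size array would have.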

Tom Young
47 MITCHELL ST.
STAMFORD, CT  06902


When bad men combine, the good must associate; ...
  -Edmund Burke 'Thoughts on the cause of the present discontents' , 1770



On Fri, Mar 3, 2017 at 11:59 AM, John Cowan <co...@ccil.org> wrote:


David Barbour

Mar 6, 2017, 5:31:39 PM
to flow-based-...@googlegroups.com
On Thu, Mar 2, 2017 at 9:40 AM, Paul Morrison <paul.m...@rogers.com> wrote:
> Given that FBP supports multiple cores very naturally, and is also good for converting legacy code, what would be a good language for doing the computation-intensive parts of an FBP application?

While FBP adequately supports pipeline parallelism, that specific form of parallelism is not suitable for all parts of most problems. Especially not the intense number crunching parts. If your goal is high performance computing, I'd recommend OpenCL [1] or something you can bind or compile to it (like WebCL [2]).

[2] https://www.khronos.org/webcl/

Paul Morrison

Mar 7, 2017, 11:11:33 AM
to Flow Based Programming
Took a quick look at TensorFlow, then looked up Tensor on Wikipedia.  Almost immediately the article gets into stuff that I don't have a hope of understanding, but the very first sentence says:

"In mathematics, tensors are geometric objects that describe linear relations between geometric vectors, scalars, and other tensors."

This seems pretty clear, but the Wikipedia article seems to be saying that tensors are different from matrices, although it does say that tensors can be represented as multi-dimensional arrays...    APL to me was very clear - it just dealt with multi-dimensional arrays (0, 1, 2 or 'n' dimensions) as its primary data structures.  I guess I am confused by TensorFlow's use of the word "tensor".

TIA

Paul Tarvydas

Mar 7, 2017, 12:09:27 PM
to flow-based-...@googlegroups.com
On 2017-03-07 11:11 AM, Paul Morrison wrote:
> Took a quick look at TensorFlow, then looked up Tensor on Wikipedia.
> Almost immediately the article gets into stuff that I don't have a
> hope of understanding, but the very first sentence says:

[I posted a response in the thread "TensorFlow vs. FBP"]

pt
