Flow-Based vs Data-Flow


mark taylor

Sep 14, 2010, 1:04:38 PM
to Flow Based Programming
Hi,

I've recently become interested in Data-Flow and/or Flow-Based
programming, and have been reading as much as I can find about them.

Often, these terms seem to be used interchangeably, but in Flow-Based
circles, Data-Flow seems to be ignored almost as though it's a "dirty
word".

Can anyone enlighten me about:

1. Exactly what are the differences between the two?
2. Why is the term Data-Flow not mentioned in Flow-Based circles?

Many thanks,
Mark.

John Cowan

Sep 14, 2010, 2:26:33 PM
to flow-based-...@googlegroups.com
mark taylor wrote:

> Often, these terms seem to be used interchangeably, but in Flow-Based
> circles, Data-Flow seems to be ignored almost as though it's a "dirty
> word".

"Dataflow" often refers to fine-grained dataflow parallelism, at the
level of machine instructions. What we have here is coarse-grained
dataflow at the level of processes.

--
John Cowan    co...@ccil.org    http://www.ccil.org/~cowan
Note that nobody these days would clamor for fundamental laws
of *the theory of kangaroos*, showing why pseudo-kangaroos are
physically, logically, metaphysically impossible.
Kangaroos are wonderful, but not *that* wonderful. --Dan Dennett on zombies

jpaulm

Sep 14, 2010, 8:11:42 PM
to Flow Based Programming
> Often, these terms seem to be used interchangeably,  but in Flow-Based
> circles, Data-Flow seems to be ignored almost as though it's a "dirty
> word".
>
Personally, I wouldn't say "data flow" is a dirty word - I tend to
think of FBP as part of the data flow universe. I checked my book,
and found 40 references to the phrase "data flow". It's just that,
over the last several decades, so many different approaches all
described themselves as data flow, that my feeling was that the term
had become so broad as to become almost meaningless. You will find
that much of the early work was done using this title, or phrases that
included it. Actually, it's good to know (thanks, John) that the term
has now become more specialized - maybe this will reduce the
confusion!

Paul Tarvydas

Sep 15, 2010, 9:56:15 AM
to Flow Based Programming
I formed my notion of what "data-flow" represents several decades ago.

To me, data-flow is automagic (or declarative) whereas FBP / reactive-
programming is explicit.

For example, take an adder component a = b + c.

In data-flow, the component would automagically fire only when both
inputs, b and c, have arrived. The underlying data-flow kernel
suspends the component and collects up its inputs, checking to see if
all parameters have arrived, waking the component up only when the
inputs are all present. This raises the question of what happens if two
b's arrive before any c's arrive.
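
A minimal C++ sketch of the "fire only when all inputs have arrived" rule
described above. It rests on assumptions: DataflowAdder and its methods
are hypothetical names, not part of any real data-flow kernel, and queuing
surplus b's in arrival order is only one possible answer to the two-b's
question.

```cpp
#include <deque>
#include <vector>

// Hypothetical data-flow style adder. The bookkeeping that a data-flow
// kernel would do on the component's behalf lives in offerB/offerC.
class DataflowAdder {
public:
    // An input arrives on port b or c; the addition fires only when
    // at least one value is pending on both ports.
    void offerB(int b) { pendingB_.push_back(b); tryFire(); }
    void offerC(int c) { pendingC_.push_back(c); tryFire(); }

    // Every result produced so far, in firing order.
    std::vector<int> results;

private:
    void tryFire() {
        if (pendingB_.empty() || pendingC_.empty()) return;  // keep waiting
        // Assumed policy: pair the oldest b with the oldest c.
        results.push_back(pendingB_.front() + pendingC_.front());
        pendingB_.pop_front();
        pendingC_.pop_front();
    }

    std::deque<int> pendingB_;
    std::deque<int> pendingC_;
};
```

With this pairing policy, feeding b=1, b=2, c=10, c=20 fires twice, giving
11 and then 22. A kernel that kept only the latest b would give different
results, which is exactly the ambiguity in question.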

In reactive / FBP programming, the component wakes up for every
input. If it receives a "b", and wants to wait for a "c", the
software architect explicitly writes the code to do so, e.g. put "b"
into a buffer, set a state variable that says that we're waiting for a
"c" and go back to sleep. It is also the architect's responsibility
to write code to handle the case of two b's before any c's - or to
ignore this case, if desired.
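
The explicit style described above might look roughly like this in C++.
It is a sketch under assumptions: ReactiveAdder, Port and onMessage are
hypothetical names, and the two policies in the comments (a later b
overwrites an earlier one, a c with no buffered b is dropped) are
arbitrary choices that the component author is free to change.

```cpp
#include <vector>

// Hypothetical reactive adder: the handler runs for every arriving
// message, and all waiting logic is written out by hand.
class ReactiveAdder {
public:
    enum class Port { B, C };

    void onMessage(Port port, int value) {
        if (port == Port::B) {
            // Explicit policy: a second b before any c overwrites the first.
            bufferedB_ = value;
            waitingForC_ = true;
            return;  // go back to sleep until the next message
        }
        // A c arrived.
        if (!waitingForC_) return;  // explicit policy: drop a c with no b
        results.push_back(bufferedB_ + value);
        waitingForC_ = false;
    }

    // Every result produced so far, in firing order.
    std::vector<int> results;

private:
    int bufferedB_ = 0;
    bool waitingForC_ = false;
};
```

Here the sequence b=1, b=2, c=10 produces a single result, 12, because the
second b replaced the first. Changing that behaviour means editing the
handler, not the kernel.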

In my mind, data-flow is a (weak) subset of FBP concepts (and, yes, a
dirty word in my mind :-).

I want to be able to engineer code, which means that I need to be able
to manipulate every aspect of the code, including the parallelism,
race conditions, etc.

Similarly, products such as Smalltalk Parts, Prograph and the short-
lived Sun Java "visual" programming language whose name I can't even
remember, all failed because they employed the implicitly automagic
concept of call-return connectors instead of something more atomic.

pt

jpaulm

Sep 15, 2010, 10:30:00 AM
to Flow Based Programming

> In data-flow, the component would automagically fire only when both
> inputs, b and c, have arrived.  
>
> In reactive / FBP programming, the component wakes up for every
> input.

Interestingly, this debate came up within the FBP framework a few
years ago, because a chap who was a C# expert liked the former option
(which is also characteristic of what are called Petri nets). I tried
to summarize some of the pros and cons in http://www.jpaulmorrison.com/cgi-bin/wiki.pl?AnyVsAll

ern0

Sep 15, 2010, 7:35:52 PM
to flow-based-...@googlegroups.com

Our prototype home automation server is a simple reactive / actor model
system, so it uses "any", but the component programmer can mark consumer
ports as "fire it last" (the patch programmer can't change this). This
works like a priority attribute: a message fired to a port marked "last"
goes to the end of the message queue, so it is processed after the other
messages are done.
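
One way to picture the "last" marking is the C++ sketch below. It is not
the server's actual scheduler: Message, MessageQueue, markLast, fire and
next are hypothetical names, and using a second queue that is drained
only when the ordinary one is empty is an assumed reading of "gets to the
end of the message queue".

```cpp
#include <deque>
#include <set>
#include <string>

// Hypothetical message: destination port name plus an integer payload.
struct Message {
    std::string port;
    int value;
};

// Hypothetical queue in which messages for ports marked "last" are
// delivered only after all other pending messages.
class MessageQueue {
public:
    // Done by the component programmer, not the patch programmer.
    void markLast(const std::string& port) { lastPorts_.insert(port); }

    void fire(const Message& m) {
        if (lastPorts_.count(m.port) > 0) {
            deferred_.push_back(m);  // port is marked "last"
        } else {
            normal_.push_back(m);
        }
    }

    // Pops the next message to process; returns false when nothing is left.
    bool next(Message& out) {
        std::deque<Message>& q = normal_.empty() ? deferred_ : normal_;
        if (q.empty()) return false;
        out = q.front();
        q.pop_front();
        return true;
    }

private:
    std::set<std::string> lastPorts_;
    std::deque<Message> normal_;
    std::deque<Message> deferred_;
};
```

If a message for a port marked "last" is fired before a message for an
unmarked port, the unmarked one is still handed out first.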

The shortest example for this feature is the Add component. Interface:

/////////////////////////////////////
component Add {

    interface {
        consumer first(Integer)
        consumer last(Integer)
        producer result(Integer)
    } // interface

} // component Add
/////////////////////////////////////

Implementation (part):

/////////////////////////////////////
void AddUnit::messageHandler(int numero, Message* message) {

    if (numero == 1) { // port "first": only remember the value

        firstValue = message->getValue();

    } else { // port "last": add and fire if both operands are valid

        if (Nan::isNan(firstValue)) return; // no valid first operand
        int value = message->getValue();
        if (Nan::isNan(value)) return;      // no valid second operand
        result->fire( firstValue + value );

    } // if numero

} // messageHandler()
/////////////////////////////////////

So some kind of "all" can be forced this way. I have to say that in our
system "all" is the exception. It is a home automation system, and there
are usually no multiple inputs at the same time (no, you are not that
quick), but the problem comes up when we measure something and want to
do operations on the results.

(I don't really understand the DF vs FBP thing. There are so many kinds
of systems (where apps are nets of ready-made components sending messages
to each other) that one good term should be chosen, and then sub-types
should be identified, from make (??) and spreadsheets (??) to soft synths
(synchronous).)
--
ern0
dataflow programmer

Tom Young

Sep 16, 2010, 11:20:35 AM
to flow-based-...@googlegroups.com
Hi Mark,

Once upon a time, we learned a better way than flowcharting to design
and diagram programs. Data Flow diagrams were a significant element
in this new 'structured design' approach, which required that (once
the design was 'complete') someone convert this nice, clear diagram
into a modular, hierarchical calling structure (HCS), which was
a better paradigm than the typical monolithic non-structure of the
day.

Along came Messrs. Morrison and Stevens to show how this conversion
could be avoided, that it was unnecessary, and that a number of
problems could be avoided in the process (so to speak). This new
paradigm was called 'Data Flow Design' (not to be confused with
hardware architecture data flow). FBP derived from this paradigm as
JPM has described elsewhere.

Alas, the development world committed to HCS and rapidly built more
and more extensive programming libraries for more and more languages,
enabling ever more complex layers of HCSs: see CPAN, C/C++ libs, Java,
Squid, etc. Many developers were even able to avoid modular
approaches and were often encouraged to do so. The programmer now needs
to be aware of and specify hundreds of top-level libraries to build many
programs, these days.

The myriad difficulties, especially with multi-threading, multi-core,
and multi-hosting in this complex HCS world may now, finally, make the
case for data flow design a compelling one, but that has to be
balanced against the huge investment the industry has made in these
libraries.

Does this help answer your question?

-twy

--
Tom Young
47 MITCHELL ST.
STAMFORD, CT  06902

"The penalty good men pay for indifference to public affairs, is to be
ruled by evil men."
   -Plato

jpaulm

Sep 17, 2010, 10:11:54 PM
to Flow Based Programming


> Along came Messrs. Morrison and Stevens to show how this conversion
> could be avoided, that it was unnecessary, and that a number of
> problems could be avoided in the process (so to speak). This new
> paradigm was called 'Data Flow Design' (not to be confused with
> hardware architecture data flow). FBP derived from this paradigm as
> JPM has described elsewhere.
>
> Alas, the development world committed to HCS and rapidly built more
> and more extensive programming libraries for more and more languages,
> enabling ever more complex layers of HCSs: see CPAN, C/C++ libs, Java,
> Squid, etc. Many developers were even able to avoid modular
> approaches and were often encouraged to do so. The programmer now needs
> to be aware of and specify hundreds of top-level libraries to build many
> programs, these days.
>
> The myriad difficulties, especially with multi-threading, multi-core,
> and multi-hosting in this complex HCS world may now, finally, make the
> case for data flow design a compelling one, but that has to be
> balanced against the huge investment the industry has made in these
> libraries.
>

Takes me back! It's pretty strange, when you come to think of it,
that what you call HCS is still so solidly entrenched, when it's been
obvious for years that it's a terrible way to program computers! And,
as I show in my book, FBP-like approaches have been popping up since
at least the '70s (e.g. CHIEF), and yet they didn't catch on until the
difficulty of programming multiprocessor machines finally started HCS
collapsing under its own weight!