The idea that software should be componentized - built from prefabricated components - first became prominent with Douglas McIlroy's address at the NATO conference onsoftware engineering in Garmisch, Germany, 1968, titled Mass Produced Software Components.[2] The conference set out to counter the so-called software crisis. McIlroy's subsequent inclusion of pipes and filters into the Unix operating system was the first implementation of an infrastructure for this idea.
Brad Cox of Stepstone largely defined the modern concept of a software component.[3]He called them Software ICs and set out to create an infrastructure and market for these components by inventing the Objective-C programming language. (He summarizes this view in his book Object-Oriented Programming - An Evolutionary Approach 1986.)
Brad Cox, who is the inventor of Objective-C and one of the acknowledged gurus of OO, has come to feel that OO alone is not adequate for building large systems. He came to the conclusion that FBP concepts should be implemented on top of Objective-C, and then could be used as building blocks for applications. Using a hardware analogy, he refers to Objective-C as "gate-level", and FBP as "chip-level". He had in fact already started experimenting with processes and data flows independently when he found out about our work and contacted me.
Operations consist of "black boxes" with inputs and outputs, all of which are always explicitly defined. They run as soon as all of their inputs become valid, as opposed to when the program encounters them. Whereas a traditional program essentially consists of a series of statements saying "do this, now do this", a dataflow program is more like a series of workers on an assembly line, who will do their assigned task as soon as the materials arrive. This is why dataflow languages are inherently parallel; the operations have no hidden state to keep track of, and the operations are all "ready" at the same time.
Dataflow programs are generally represented very differently inside the computer as well. A traditional program is just what it seems, a series of instructions that run one after the other. A dataflow program might be implemented as a hash table instead, with uniquely identified inputs as the keys, used to look up pointers to the instructions. When any operation completes, the program scans down the list of operations until it finds the first operation where all of the inputs are currently valid, and runs it. When that operation finishes it will typically put data into one or more outputs, thereby making some other operation become valid.
Processes are connected by means of FIFO (first-in, first-out) queues or connections which can hold up to some maximum number of IPs (called a queue's "capacity"). Processes are connected by means of FIFO (first-in, first-out) queues or connections which can hold up to some maximum number of IPs (called a queue's "capacity").
However, the nature of that assembly line, or connection between processes, is different. Dataflow uses a logical connection while FBP uses a physical connection.a dataflow program is more like a series of workers on anassembly line, who will do their assigned task as soon as the materials arrive.
I think you have really brought the differences to the fore - but I might also add that a part of the difference is between the idea of "valid input states" and IPs: it seems to me (I may be wrong) that "valid input states" would tend to be simple values, whereas IPs are structured chunks of data with a defined lifetime. Would you agree?
Hi Ged,I agree with you in general. There are a few notes though.The boundaries of Dataflow programming itself are not distinct. For example, VHDL is included in the list but as far as I know VHDL it is component-based. Is COMPONENT BASED orthogonal to DECLARATIVE? I'm not sure, but if it is, then VHDL obviously belongs to FBP instead of Dataflow.
[...]
I've always found it puzzling that this industry never bought in to the dichotomy, if not the terminology, used above. The discussion derails as soon as the word granularity comes up. To me (and real engineers) its just common sense.
Now suppose I have incoming IPs, and we have a process which modifies a
particular field in each IP (e.g. the Assign component), is it better to
destroy the incoming IP and create a new one with the same structure as
the incoming one - or can we just modify the incoming one and send it
on? Or doesn't it matter? From my point of view, I find the
[im]mutability issue confusing :-) Does this have a bearing on the
dataflow vs. (?) FBP issue?
If each process is a separate microprocessor, does this change the answer?
Ged Byrne: “I would say that the key difference is that FBP is COMPONENT BASED while Data Flow programming is DECLARATIVE”
This is not correct. We can not compare “component based” with “declarative”. These are two different aspects but you can compare “declarative” with “imperative” for instance. “Component based” means: the system is based on components. It highlights the aspect of composition and nothing more.
FBP is declarative and component based in the same time. This is the essence.
I don't see any difference between Data Flow Programming and Flow Based Programming. I have said this before: unfortunately the programmers associate different meanings to these common words: flow, data, components.
The eventual differences are related to implementations. By implementation I mean how the declarative program (FBP or DataFlow, whatever) is translated / compiled into the executable program for a specific platform / microprocessor.
I am really against the idea that the
parameters / data / IPs represent an aspect that makes the
difference between FBP and DataFlow. We have had this discussion
before. An IP is mutable. You can follow this simple rule: what moves
(IP in our case) can be changed but in a single place at one time. It
can change again in another place at another time. “Place” is a
component. The components that modify the IPs (other components) don't move.
The order is strictly given by
the connections: TIME is ORDER and ORDER is SPACE. Voilà: space-time continuum
:-). The sequence (time
order we are used with in the imperative programming) in FBP or
DataFlow is given by ordering the processing of the IPs in space, moving
the IPs from one place (component) to another.
I have said before that in my opinion an IP could be a component. The components are first-class citizens. In fact it is the only way a “polymorphic aspect” (as it is defined in OOP) could manifest.
Vladimir Sibirov: “FBP has 2 levels of abstraction: coordination/structure level (visual) which is a graph of components and connections and behavior level (textual) which is imperative code for component operations”.
This is interesting. At first glance this assertions makes one to think that the border between “declarative” and “imperative” in FBP is made by components' membrane: what is outside is “declarative” and what is inside should be imperative at some more basic level. This could be misleading.
It is true that a FBP program could be just that (i.e. more coordination and less processing/computation) but a FBP program could be declarative all the way down to the level with the finest granularity i.e. atomic components (basic operations).
In the end all the FBP (declarative) program should be translated / compiled into an executable. The result of this translation could be very different depending on the target microprocessor or the number of processors available. The FBP program should be stretched / map and partitioned in the translation process (compilation) on a single or a limited number of threads. A thread is an ordered set of microprocessor instructions that feed the microprocessor. The executable remains a program that respects the same data flow principles but this time the code is data (bytecode) that feeds a single (or a limited set of) real microprocessor(s). In other words you start with a network of interconnected virtual processors (components) and end up with a single or limited number of real processor(s). This translation / compilation is not trivial but it could be done all the way from the FBP declarative program (source code) down to the executable.
Vladimir Sibirov: “FBP uses bounded buffers for communication, Dataflow uses unbounded call chains or tuple spaces”
This is the implementation detail I was talking about. I don't see it as an important aspect that makes a paradigm difference. How do you implement the following diamond shaped diagram?
Result z = D(B(A(a).b), C((A(a).c));
All the components / processors (A,B,C and D) can be executed in parallel but x and y need synchronization or at least a mechanism that assures they are triggered by the same IP that originated them i.e. the input of A that is a. If all these components are on the same computer / process / thread everything seems to be simple but imagine that A, B, C and D are complex processes that executes on different computers. Streams can break, the order is an issue. The FBP program needs to be resilient to errors. a,b,c,x,y and z could be streams but a specific x element is linked with a specific y element. Is FBP and DataFlow different with this respect? How do you assure the consistency in this case?
By the way, is out there anybody that can explain to us what a MONAD is, explaining it in plain English and with a little bit of FBP flavor? Of course I have read some documentation about monads in functional languages but I am eager to know how a FBP programmer thinks about monads.
Regards,
Dan
of course it is.
everything ends up being implemented in C anyway, at the end of the
day, under the covers ;-)
All the components / processors (A,B,C and D) can be executed in parallel but x and y need synchronization or at least a mechanism that assures they are triggered by the same IP that originated them i.e. the input of A that is a. If all these components are on the same computer / process / thread everything seems to be simple...
it all depends on what level you are talking about. compilers all the
way down. except for the interpreters. oh but those are mostly jits
now anyway. oy veh, the humanity.
> JavaFBP is written in Java that compiles into Bytecode that is translated
> into machine code by the Hotspot VM. No C is generated at any point.
but hotspot is written in C, or at least used to be.
one can pretty much layer any turing complete language paradim on top
of or underneath any other turing complete language paradigm.
if i as the end user am using
http://en.wikipedia.org/wiki/DataRush_Technology or
https://github.com/trifork/erjang/wiki or http://clojure.org/ or
http://openquark.org/Welcome.html or http://shenlanguage.org/ running
on ABCL on a JVM, what am i doing?
sincerely.
it is quite right that Java itself doesn't change.
but it is not quite right to say that the layer on top cannot be
called anything other than imperative.
sincerely.
Could you point to the properties of JavaFBP that make it declarative. Each line is a command that instructs the JVM to create new object instances. It declares component objects and links them together with connections that are bounded queues. All of this is specifying exactly what is to be done, not the computation that is to be achieved. The code has side effects and does not clearly correspond to computational logic.JavaFBP is written in Java that compiles into Bytecode that is translated into machine code by the Hotspot VM. No C is generated at any point.
> Could you point to the properties of JavaFBP that make it declarative.
JavaFBP mini-language has both declarative ("These are the components,
this is how they are wired up") and imperative ("Create these components,
wire them up like this") readings. This is normal for code written
in a functional language, or as in this case a functional subset of a
non-functional language.
Since the mini-language is just Java, it's perfectly possible to create
components on the fly, or change the writing on the fly. But normally
FBP programmers don't, which is why the mini-language is effectively
functional/declarative.
(The comment below is chosen randomly, but it's apropos.)
--
Man has no body distinct from his soul, John Cowan
for that called body is a portion of the soul co...@ccil.org
discerned by the five senses, http://www.ccil.org/~cowan
the chief inlets of the soul in this age. --William Blake
> So all FBP programs are declarative, regardless of language?
That depends on the design of the FBP package. In JavaFBP and C#FBP,
the *normal* use of the mini-language features is functional no
mutations, no reassignment of variables), so both imperative and
declarative readings exist.
> The same is true of any program that constructs a user interface, such
> as a swing program?
It could be. I'm not familiar with the details of using Swing.
> The distinction between imperative and declarative programmer is
> therefore meaningless?
There are programs that have only imperative readings, not declarative
ones. Anything involving assignment (as opposed to initialization),
or that involves changing the state of an object (as opposed to just
setting it up), has only an imperative reading. If a text has only a
declarative reading, it's markup rather than code.
--
I marvel at the creature: so secret and John Cowan
so sly as he is, to come sporting in the pool co...@ccil.org
before our very window. Does he think that http://www.ccil.org/~cowan
Men sleep without watch all night?
I've not looked at JavaFBP lately and am not sure if Paul stuck to my programming-as-configuring closed boxes model. Suspect so knowing his proclivities. And mine. ;)
> The connecting step is separate, and originates from drag and drop actions
> to "draw" connections between components. I'm a bit puzzled as to why
> assembler macros were needed to do what I think of as (and implement as)
> workplace.connect(upstreamComponentID, downstreamComponentID).
The whole AMPS system was written in (System 360) assembler. The macros
made the mini-language more readable.
--
Normally I can handle panic attacks on my own; John Cowan <co...@ccil.org>
but panic is, at the moment, a way of life. http://www.ccil.org/~cowan
--Joseph Zitt
Maybe I wasn't clear - AMPS was written using mainframe Assembler
language and macros - no high level languages supported at all.
Remember we are talking about the late '60s, early '70s! The (what we
would now call FBP) networks were defined using a few different macros,
the most heavily used one being the one that specified a process and the
connections that attached to it. As you would expect, this macro
generated a process control block and zero or more connection control
blocks. The last macro in the program essentially said "run it". HTH !
Building construction has evolved since time began as shelter is one of our basic needs. We need to protect ourselves from the heat, cold, rain and other weather elements. What has happened over time is that we built our homes based on local building techniques and availability of materials. Maybe we started with stone or pieces of wood and that grew into buildings of stone and mortar or later timberframes. That was the method for a long period of time.
This gives us Flow Based Programming as a Construction Style that complements the Pipe and Filter Style of Dataflow Architecture. It describes the standard for the component blocks and the connections used to configure them.
We should be able to compile some clear, easy to understand statements that everybody can agree with.
> Maybe too far, perhaps mention that the data types can contain object
> references so streams contents can be lists or trees, unlike Unix byte
> streams.
This is not such a contrast as JPM traditionally makes it out to be.
It's true that at the OS level, pipes are byte streams, as are files.
However, most tools used in the Unix pipes and filters architecture treat
both pipes and files as string IPs delimited by newline characters (or
NUL characters in a few cases), or even as records with multiple fields
separated by a TAB character (usually changeable).
--
You are a child of the universe no less John Cowan
than the trees and all other acyclic http://www.ccil.org/~cowan
graphs; you have a right to be here. co...@ccil.org
--DeXiderata by Sean McGrath
Interesting post! Is your AOP Aspect-Oriented Programming or
Attribute-Oriented Programming (see https://en.wikipedia.org/wiki/Aop
)? If the former, could you maybe give us an example, and show how it
could be handled in FBP?
Thanks in advance,
Paul
On 11/04/2012 2:08 PM, Dan wrote:All the components / processors (A,B,C and D) can be executed in parallel but x and y need synchronization or at least a mechanism that assures they are triggered by the same IP that originated them i.e. the input of A that is a. If all these components are on the same computer / process / thread everything seems to be simple...
Hi Dan,
Just wondered why you picked this particular topology for your example - this is a perfect example of a deadlock-prone network. I would say it is anything but simple: if you can guarantee that A emits the same number of IPs to B and C, and that they in turn emit the same number of IPs on the connections to D, then you avoid a deadlock. You have to pay attention to this even when you are sharing memory. I confess I don't see what it means to say that you can run that network in one process or thread - unless you are using the arrows to mean something different from standard FBP usage. I certainly agree the situation becomes even more complex if the arrows are connections between machines, but it's pretty complex even on one machine...
> Unix bytestreams don't work for lists or trees; anything that contains
> references/pointers. JSON or XML DOM trees, for example. Not talking
> about ability to parse XML text to trees, but passing the trees
> between components directly, without conversions.
Well, it's true that when you are no longer sharing memory, you need
some kind of transducer. But if you want something very close to pure
data representations, consider ASN.1 BER or, more flexibly, Protocol
Buffers. Such things will always be needed by FBP programs that require
distribution beyond the boundaries of a single shared memory.
--
John Cowan co...@ccil.org http://www.ccil.org/~cowan
C'est la` pourtant que se livre le sens du dire, de ce que, s'y conjuguant
le nyania qui bruit des sexes en compagnie, il supplee a ce qu'entre eux,
de rapport nyait pas. --Jacques Lacan, "L'Etourdit"
FBP deadlocks (as opposed to transaction deadlocks) are always a design
problem.
> But what is different comparing to the classical imperative languages? We
> have compilers for many platforms. Well, *in FBP we know all the possible
> parallel paths*, we can map these one in so many ways on the real hardware.
Up to a point. Unless we have deep knowledge of all the components and which
ones are stateful, we don't know how much actual parallelism (as opposed to
concurrency) we can get.
--
John Cowan co...@ccil.org http://ccil.org/~cowan
The penguin geeks is happy / As under the waves they lark
The closed-source geeks ain't happy / They sad cause they in the dark
But geeks in the dark is lucky / They in for a worser treat
One day when the Borg go belly-up / Guess who wind up on the street.
> Also you say: "Unless we have deep knowledge of all the components and
> * which ones are stateful*, we don't know how much actual parallelism
> (as opposed to concurrency) we can get." We don't need to know which
> ones are stateful in order to translate them properly and execute them
> concurrently (some say in parallel).
FBP components are executed concurrently, but not in general in parallel. See
http://recycledknowledge.blogspot.com/2012/04/concurrency-vs-parallelism.html
for my explanation of the difference.
--
Your worships will perhaps be thinking John Cowan
that it is an easy thing to blow up a dog? http://www.ccil.org/~cowan
[Or] to write a book?
--Don Quixote, Introduction co...@ccil.org