Flow-based programming and Erlang style message passing -- A Biology-inspired idea of how they fit together

Samuel Lampa

Jun 15, 2015, 9:56:08 AM
to flow-based-...@googlegroups.com
JFYI, Inspired by some conversations at the Erlang User Conference last week, I wrote a blog post on some thoughts on how I think Flow-based programming and Erlang-style message passing might optimally fit together:

- http://bionics.it/posts/flowbased-vs-erlang-message-passing

I'm sure my thoughts are nowhere near finished, but this is a topic that's been occupying my mind for some time, and I'm very interested in getting the discussion going, to learn more about it!

The post is actually making the front page of HackerNews right now as well(!), so feel free to comment (and of course vote if you like it) at:

- https://news.ycombinator.com/item?id=9718868

Cheers
// Samuel

Samuel Lampa

Jun 15, 2015, 9:58:10 AM
to flow-based-...@googlegroups.com
In particular, I would like to see more interaction between the FBP and Erlang communities, as both seem to consist of extremely knowledgeable people who are used to solving hard real-world problems, and I hope that this post might help in that direction.

// Samuel

Ged Byrne

Jun 15, 2015, 10:29:59 AM
to flow-based-...@googlegroups.com
Hi Samuel,

Great post.

I was a bit confused by the heading "FBP at all suitable for distributed computing?" because the paragraphs that follow do not discuss FBP or distributed computing.  Instead they discuss Erlang VM's poor performance for certain parallelism.

I can't find anything that explains why you don't think FBP is well suited for distributed computing.  Have you dropped some paragraphs in an edit?

I was hoping to see a discussion around the nature of FBP's IPs as persistent entities compared to Erlang's transient messages.

Regards, 


Ged


--
You received this message because you are subscribed to the Google Groups "Flow Based Programming" group.
To unsubscribe from this group and stop receiving emails from it, send an email to flow-based-progra...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Samuel Lampa

Jun 15, 2015, 11:01:30 AM
to flow-based-...@googlegroups.com
On Monday, June 15, 2015 at 4:29:59 PM UTC+2, Ged Byrne wrote:
> Hi Samuel,
>
> Great post.

Thanks!

> I was a bit confused by the heading "FBP at all suitable for distributed computing?" because the paragraphs that follow do not discuss FBP or distributed computing.  Instead they discuss Erlang VM's poor performance for certain parallelism.
>
> I can't find anything that explains why you don't think FBP is well suited for distributed computing.  Have you dropped some paragraphs in an edit?
>
> I was hoping to see a discussion around the nature of FBP's IPs as persistent entities compared to Erlang's transient messages.

Thanks for noticing this, I'll look over the headings (I think I mostly just accidentally didn't adhere enough to the heading).

The comparison of persistent vs transient IPs is a very interesting one that I have not actually dug into more deeply, although at some point I did realize that it is an important distinguishing feature.

Also, I think it is actually things like this that make me see FBP as much more similar to the DNA/RNA-string-processing machinery in the cell (where at least the DNA-data exists in only one copy), whereas cell-to-cell signalling is leveraging redundancy of messaging a lot more.

(Though the RNA/DNA analogy does not hold completely here, if we assume persistent IPs, since often multiple mRNAs are produced from one DNA region (gene), and similarly multiple proteins are produced from each mRNA strand (actually often in parallel as well, to provide increased performance when the concentration of a certain protein needs to increase quickly).)

Best Regards
// Samuel

Samuel Lampa

Jun 15, 2015, 11:17:52 AM
to flow-based-...@googlegroups.com
On 2015-06-15 17:01, Samuel Lampa wrote:
> On Monday, June 15, 2015 at 4:29:59 PM UTC+2, Ged Byrne wrote:
>
> I was hoping to see a discussion around the nature of FBP's IPs as
> persistent entities compared to Erlang's transient messages.
>
>
> Thanks for noticing this, I'll look over the headings (I think I
> mostly just accidentally didn't adhere enough to the heading)
>
> The comparison of persistent vs transient IPs is a very interesting
> one that I have not actually dug into more deeply, although at some
> point I did realize that it is an important distinguishing feature.

I'd be very interested to hear your more detailed thoughts on this.

Best
// Samuel

Ged Byrne

Jun 15, 2015, 11:47:41 AM
to flow-based-...@googlegroups.com
Hi Samuel,

In FBP the Information Packet designates a physical thing.  As Paul explains in the original edition:

  • Now we can reinterpret "a": rather than thinking of it as an area of storage, let us think of "a" as a "handle" which designates a particular "thing" - it is a way of locating a thing, rather than the storage area containing the thing. In fact, these data things should not be thought of as being in storage: they are "somewhere else" (after all, it does not matter where "read" puts them, so long as the information they contain becomes available to the program). These things really have more attributes than just their images in storage. The storage image can be thought of as rather like the projection of a solid object onto a plane - manipulating the image does not affect the real thing behind the image. Now a consequence of this is that, if we reuse a handle, we will lose access to the thing it is the handle of. This therefore means that we have a responsibility to properly dispose of things before we can reuse their handles.
  • http://www.jpaulmorrison.com/fbp/concepts_book.shtml
When an FBP processor reads or sends an IP it is a transfer of ownership, not the receiving of a message.  FBP prevents an IP from being in two places at the same time.  

If process A is on machine I and process B is on machine II then when A passes to B there must be a handover of responsibility.  This is a powerful concept but it can stand in the way of parallelism.
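
The handover of responsibility described above can be illustrated with a minimal Python sketch. The class and method names (`InformationPacket`, `Process`, `send`, `receive`) are hypothetical and not taken from any FBP implementation; the sketch only shows the rule that a packet has at most one owner at a time.

```python
class InformationPacket:
    """A hypothetical IP: a payload plus a record of its current owner."""

    def __init__(self, payload):
        self.payload = payload
        self.owner = None  # None means "in flight" or not yet owned


class Process:
    """A hypothetical FBP process that may only send IPs it owns."""

    def __init__(self, name):
        self.name = name
        self.owned = set()

    def create(self, payload):
        ip = InformationPacket(payload)
        ip.owner = self
        self.owned.add(ip)
        return ip

    def send(self, ip, receiver):
        # Sending is a handover: the sender gives up ownership first.
        if ip.owner is not self:
            raise PermissionError(f"{self.name} does not own this IP")
        self.owned.discard(ip)
        ip.owner = None          # in flight: owned by nobody
        receiver.receive(ip)

    def receive(self, ip):
        ip.owner = self
        self.owned.add(ip)


a = Process("A")
b = Process("B")
packet = a.create({"order": 42})
a.send(packet, b)

try:
    a.send(packet, b)            # A no longer owns the packet
except PermissionError as err:
    second_send_error = str(err)
```

After the first `send`, only B can pass the packet on; A's second attempt is rejected, which is the "cannot be in two places at the same time" property in miniature.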

Compare this with the "Message Passing" approach.  A message is a piece of information that is not bound to anything.   We can make multiple copies of a message without worrying because it isn't bound to any solid object.

Many of the techniques we might use in distributed computing are incompatible with the idea of being bound to a solid object.  Consider, for example, this discussion around the use of redundant messages for fault tolerance:
  • > Let's start with how we do error recovery.
    >
    > Imagine two linked processes A and B on separate machines. A is the
    > master process.
    > B is a passive processes that will take over if A fails.
    >
    > A sends a stream of state update messages S1, S2, S3,... to B. The
    > state messages contain enough information
    > for B to do whatever A was doing should A fail. If A fails B will
    > receive an EXIT signal.
    >
    > If B does receive an exit signal it knows A has failed, so it carries
    > on doing whatever A was doing
    > using the information it last received in a state update message.
    >
    > That's it and all about it - nothing to do with supervisors etc,
    
    Ooooohh! THANK YOU, THANK YOU, THANK YOU.
    
    *Now* I get it. This is precisely what I was trying to understand. The  
    missing link was sending redundant messages. It all makes sense now --  
    it's so simple. All I have to do to get fault tolerance at the process  
    level is have a group of N redundant processes waiting for exit-signals  
    and forward state changes to N-1 of them. I could even send to N/2 and  
    have some decent tolerance at the expense of consistency of the nodes.
  • http://erlang.org/pipermail/erlang-questions/2011-January/055531.html
In FBP we have an IP that is a handle to a single, solid object.  In Erlang we have a stream of state update messages that can be copied freely.  That makes the message-passing approach a good fit for parallel paths, and therefore for distributed computing.
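
The redundant state update scheme from the quoted discussion can be sketched in a few lines of Python. This is a single-threaded simulation under stated assumptions, not Erlang: the names `Primary`, `Standby`, `on_state_update` and `on_exit_signal` are invented for illustration, and the "exit signal" is just a method call.

```python
class Standby:
    """Passive process B: remembers the last state update from the primary."""

    def __init__(self):
        self.last_state = None
        self.active = False

    def on_state_update(self, state):
        # Each update is an independent copy; older ones are simply replaced.
        self.last_state = dict(state)

    def on_exit_signal(self):
        # The primary has failed: carry on from the last state received.
        self.active = True
        return self.last_state


class Primary:
    """Master process A: does the work and forwards state to its standbys."""

    def __init__(self, standbys):
        self.standbys = standbys
        self.state = {"processed": 0}

    def do_work(self):
        self.state["processed"] += 1
        for standby in self.standbys:      # the same message goes to every copy
            standby.on_state_update(self.state)


standbys = [Standby(), Standby()]
primary = Primary(standbys)
for _ in range(3):
    primary.do_work()

# Simulate A crashing: the first standby gets the exit signal and takes over.
resumed_state = standbys[0].on_exit_signal()
```

Note that every standby holds its own copy of the state, which is exactly what the single-owner IP model does not allow without explicit copying.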

I think having multiple instances of FBP microservices communicating via a message passing protocol is a great idea.

To go with the biological metaphors we have something like the Synapse and the way that an electrical impulse is converted into a chemical message so that neurons may communicate.


Regards, 


Ged






Samuel Lampa

Jun 15, 2015, 12:39:52 PM
to flow-based-...@googlegroups.com
Thank you very much for the elaborate answer, I appreciate it! 

It confirms my understanding of the differences, although I didn't remember super-clearly exactly how the IPs in FBP were designed.

I had understood what you describe, that IPs are more of a "data carrier" than a state change notification, but wasn't sure about the view that an IP is the same while travelling through multiple processes.

But I'm also a little confused about that, since this can not truly be the case e.g. for a merge-operation process? A merge would need to produce new (merged) IPs out of the two or more IPs received at a time as input?

Kind Regards
// Samuel

Samuel Lampa

Jun 15, 2015, 12:47:10 PM
to flow-based-...@googlegroups.com
On Monday, June 15, 2015 at 5:47:41 PM UTC+2, Ged Byrne wrote:
> If process A is on machine I and process B is on machine II then when A passes to B there must be a handover of responsibility.  This is a powerful concept but it can stand in the way of parallelism.

Btw, I'm sure I'm not the only one who finds this very reminiscent of the "borrowing-enabled" ownership model in the Rust programming language:
I wonder if anybody is working on an FBP implementation in Rust?

Best
// Samuel

Ged Byrne

Jun 15, 2015, 1:05:19 PM
to flow-based-...@googlegroups.com
Hi Samuel,
In that case the original IP would be destroyed and two new ones created.

The rule is that once an IP has been received it must be either sent or deleted.

"
We said earlier that IPs are things which have to be explicitly destroyed - in our multi-process implementation, we require that all IPs be accounted for: any process which receives an IP has its "number of owned IPs" incremented, and must reduce this number back to zero before it exits. It can do this in essentially the same ways we dispose of a memo: destroy it, pass it on or file it. Of course, you can't dispose of an IP (or even access it) if you don't have its handle (or if its handle has been reused to designate another IP). Just like a memo, an IP cannot be reaccessed by a process once it has been disposed of (in most FBP implementations we zero out the handle after disposing of the IP to prevent exactly that from happening).
"
http://www.jpaulmorrison.com/fbp/concepts_book.shtml
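
The accounting rule in the quote above (the "number of owned IPs" must be back at zero before a process exits) can be sketched as follows. This is an illustration only: `AccountedProcess`, `LeakedPacketError` and the dict-based packets are hypothetical and do not reflect how JavaFBP or any other implementation does its bookkeeping.

```python
class LeakedPacketError(Exception):
    """Raised when a process tries to exit while still owning IPs."""


class AccountedProcess:
    """Hypothetical process that tracks its "number of owned IPs"."""

    def __init__(self, name):
        self.name = name
        self.owned_count = 0

    def receive(self, ip):
        self.owned_count += 1
        return ip

    def send(self, ip, outbox):
        outbox.append(ip)
        self.owned_count -= 1
        return None   # caller overwrites its handle with this, zeroing it out

    def drop(self, ip):
        self.owned_count -= 1
        return None   # same convention: the handle is zeroed after disposal

    def exit(self):
        if self.owned_count != 0:
            raise LeakedPacketError(
                f"{self.name} exited while still owning {self.owned_count} IP(s)"
            )


outbox = []
proc = AccountedProcess("filter")
first = proc.receive({"payload": "keep me"})
second = proc.receive({"payload": "discard me"})
first = proc.send(first, outbox)    # passed on
second = proc.drop(second)          # destroyed
proc.exit()                         # fine: the count is back at zero

leaky = AccountedProcess("leaky")
held = leaky.receive({"payload": "forgotten"})
try:
    leaky.exit()
except LeakedPacketError as err:
    leak_message = str(err)
```

Returning `None` from `send` and `drop` mimics the "zero out the handle" convention mentioned in the quote, so the caller cannot reach the packet again by accident.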

If you look at the source for COLLATE you'll see that happening:
https://github.com/jpaulm/javafbp/blob/master/src/main/java/com/jpmorrsn/fbp/components/Collate.java

Samuel Lampa

Jun 15, 2015, 1:18:35 PM
to flow-based-...@googlegroups.com
IC, thanks for explaining!

I should go back and study the FBP book in better detail! :)

// Samuel

co...@ccil.org

Jun 15, 2015, 3:02:53 PM
to flow-based-...@googlegroups.com
Samuel Lampa scripsit:

> But I'm also a little confused about that, since this can not truly be the
> case e.g. for a merge-operation process? A merge would need to produce new
> (merged) IPs out of the two or more IPs received at a time as input?

Certainly. An FBP component can freely create new packets as needed, or
drop, pass on, or mutate packets it receives from its inputs. IMO it is
better to drop and create rather than to mutate packets, but mutation
is not forbidden by the programming style, because it is guaranteed
that a packet is either in flight between components or is owned by
exactly one component.
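
A minimal sketch of the "drop and create rather than mutate" style for a merge, in Python. The helper names `make_packet` and `merge` are hypothetical, and representing a packet as a read-only mapping is an assumption made for the example, not something any FBP implementation prescribes.

```python
from types import MappingProxyType


def make_packet(**fields):
    """Build a read-only packet (a hypothetical representation of an IP)."""
    return MappingProxyType(dict(fields))


def merge(left, right):
    """Create one new packet from two inputs without touching the inputs.

    The caller is expected to drop both inputs afterwards.
    """
    combined = {**left, **right}
    return make_packet(**combined)


customer = make_packet(customer_id=7, name="Ada")
order = make_packet(customer_id=7, total=99.5)
merged = merge(customer, order)
del customer, order   # drop the inputs; only the merged packet moves on
```

Because the packets are read-only, a downstream component cannot change `merged` in place; it would have to create yet another packet, which keeps the single-owner guarantee easy to reason about.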

--
John Cowan http://www.ccil.org/~cowan co...@ccil.org
Being understandable rather than obscurantist poses certain
risks, in that one's opinions are clear and therefore falsifiable
in the light of new data, but it has the advantage of encouraging
feedback from others. --James A. Matisoff


Samuel Lampa

Jun 15, 2015, 3:05:05 PM
to flow-based-...@googlegroups.com
On 2015-06-15 21:02, co...@ccil.org wrote:
> Samuel Lampa scripsit:
>
>> But I'm also a little confused about that, since this can not truly be the
>> case e.g. for a merge-operation process? A merge would need to produce new
>> (merged) IPs out of the two or more IPs received at a time as input?
> Certainly. An FBP component can freely create new packets as needed, or
> drop, pass on, or mutate packets it receives from its inputs. IMO it is
> better to drop and create rather than to mutate packets, but mutation
> is not forbidden by the programming style, because it is guaranteed
> that a packet is either in flight between components or is owned by
> exactly one component.

That's a great clarification, thanks!

Best
// Samuel

Paul Morrison

Jun 15, 2015, 3:13:13 PM
to flow-based-...@googlegroups.com, joe...@gmail.com
For your interest - not directly connected with the ongoing discussion, Joe Armstrong (the inventor of Erlang) sent me the following in 2009 - some good stuff here!   I am assuming he won't mind my republishing it here, as it was public domain then...

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

I've been thinking about concurrent fault-tolerant programming for 25 odd years.

It seems to me that your FBP is bringing programming to what I might call a "process control" view of the world. Imagine the control panel in a power station - it shows flows and has coloured lamps etc showing what is working.

It would be very nice to visualise the "state" of an FBP system - green lights on components with low CPU load, red lights on high CPU components etc. little scale indicators on buffers to show how full they are etc.

As for out of band data I think we just need to have a general rule - all data is either {data, X} or {meta, X}. All meta data is just passed from inputs to outputs (which outputs? - all of them????) but can be (doesn't have to be) processed.
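
The {data, X} / {meta, X} rule can be sketched with Python tuples standing in for Erlang tuples. The function `run_component` and its parameters are hypothetical, and the sketch assumes a component with a single output, so it leaves the "which outputs?" question open.

```python
def run_component(packets, transform, handle_meta=None):
    """Apply `transform` to data packets and forward meta packets unchanged."""
    out = []
    for kind, value in packets:
        if kind == "data":
            out.append(("data", transform(value)))
        elif kind == "meta":
            if handle_meta is not None:
                handle_meta(value)       # optional: a component may look at it
            out.append(("meta", value))  # but it is always passed on
        else:
            raise ValueError(f"unknown packet kind: {kind!r}")
    return out


stream = [
    ("meta", "open:report.txt"),
    ("data", "hello"),
    ("data", "world"),
    ("meta", "close"),
]
seen_meta = []
result = run_component(stream, str.upper, handle_meta=seen_meta.append)
```

Here the component upper-cases the data and records the meta packets it saw, while the meta packets themselves come out in their original positions.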

I also feel the need for some special symbols.

(Now I feel hampered by ascii)

- a circle with an arrow out = file source: this opens the file, reads it an entry at a time until eof and sends the data along the arrow

- circle with arrow in = file sink (the opposite)

So file copy is just a dumbbell with an arrow between the circles :-)


I have several long-term problems "storing things" and "finding things" - now Google has made inroads on finding things. Rob Pike (from plan9, inferno, bell-labs fame) http://labs.google.com/people/r/ has written a paper Interpreting the Data: Parallel Analysis with Sawzall.

Very interesting - this is flow based programming (with a different name) on a grand scale. All these Google machines are just pumping out data in a gigantic flow network.

One idea I had would be to try and create a web from unlinked data. Take, for example, the erlang mailing list - there are tens of thousands of interesting articles in it. If we could automatically link each post to (say) the ten most similar posts then we could turn the email archive into a web site.

This needs the idea of similarity between two texts - implying automatic extraction of keywords (difficult) - also "recommendations algebras" - ie if X likes A and B and Y likes A - then possibly Y might also like B.
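
Both ideas can be made concrete with very small stand-ins. In the Python sketch below, word-set overlap (Jaccard similarity) is swapped in for real keyword extraction, which the text rightly calls difficult, and `recommend` implements only the literal "X likes A and B, Y likes A" rule. The function names and sample data are invented for illustration.

```python
def jaccard(text_a, text_b):
    """Word-set overlap between two texts, from 0.0 to 1.0."""
    words_a = set(text_a.lower().split())
    words_b = set(text_b.lower().split())
    if not words_a and not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def recommend(likes, user):
    """Suggest items liked by users who share at least one like with `user`."""
    mine = likes[user]
    suggestions = set()
    for other, theirs in likes.items():
        if other != user and mine & theirs:
            suggestions |= theirs - mine
    return suggestions


likes = {"X": {"A", "B"}, "Y": {"A"}}
for_y = recommend(likes, "Y")
similarity = jaccard("erlang message passing", "erlang message queues")
```

Linking each post to its ten most similar posts would then amount to ranking all other posts by `jaccard` (or a better measure) and keeping the top ten.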

Ultimately I'd like self-organising documents that allow feedback.

Another fun thing to do would be to implement a system for FlowBasedProgramming in ErlangLanguage with a nice graphics front-end - I think this is rather easy.

One "device" that would make life *very* easy is the "named persistent pipe" ie a persistent FIFO with named end-points that survives crashes.
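
A crude version of such a named persistent FIFO can be sketched in Python with a JSON file as the backing store. This is a single-machine illustration under stated assumptions: the class `PersistentFifo` and the file naming are hypothetical, there is no replication, and it is not safe for concurrent writers.

```python
import json
import os
import tempfile


class PersistentFifo:
    """A hypothetical named FIFO whose contents live in a JSON file on disk."""

    def __init__(self, directory, name):
        self.path = os.path.join(directory, f"{name}.fifo.json")
        if not os.path.exists(self.path):
            self._save([])

    def _load(self):
        with open(self.path, "r", encoding="utf-8") as handle:
            return json.load(handle)

    def _save(self, items):
        # Write to a temporary file and rename it, so a crash mid-write
        # leaves the previous queue contents intact.
        tmp_path = self.path + ".tmp"
        with open(tmp_path, "w", encoding="utf-8") as handle:
            json.dump(items, handle)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp_path, self.path)

    def put(self, item):
        items = self._load()
        items.append(item)
        self._save(items)

    def get(self):
        items = self._load()
        if not items:
            raise IndexError("FIFO is empty")
        head = items.pop(0)
        self._save(items)
        return head


workdir = tempfile.mkdtemp()
writer = PersistentFifo(workdir, "orders")
writer.put("first")
writer.put("second")
del writer                                   # simulate the writer crashing
reader = PersistentFifo(workdir, "orders")   # a fresh process reattaches by name
recovered = [reader.get(), reader.get()]
```

The name is the only thing the two end-points share, so either side can crash and reconnect without the queued items being lost.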

That and a graphic front-end to draw the connections would be fun.

Actually a crude "no bells and whistles" variant of this would be very easy.

Then we need a set of filters - to get "legacy code" talk to the FIFO's.

(Incidentally the persistent FIFO model solves the problem of updating code on-the-fly: just turn off the FIFO. When I give talks I think the "plumbing" analogy is great - to change my boiler, the plumber turns off the water, drains any extra water out of the system, changes the boiler, re-connects the pipes and turns on the water. The pipe contains the water - if you're careful nothing drips out. *This* is exactly how we should upgrade software.)
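
The plumbing analogy can be walked through in a small single-threaded Python simulation. The `Pipe` class, its `is_open` flag and the two handler functions are hypothetical, and an in-memory deque stands in for the persistent FIFO.

```python
from collections import deque


class Pipe:
    """A hypothetical in-memory stand-in for the persistent FIFO."""

    def __init__(self):
        self.buffer = deque()
        self.is_open = True

    def put(self, item):
        # Upstream may keep writing while the pipe is off; items just wait.
        self.buffer.append(item)

    def drain(self, handler):
        # Nothing flows to the consumer while the pipe is turned off.
        results = []
        while self.is_open and self.buffer:
            results.append(handler(self.buffer.popleft()))
        return results


def old_version(x):
    return x + 1


def new_version(x):
    return x * 10


pipe = Pipe()
pipe.put(1)
pipe.put(2)
before_upgrade = pipe.drain(old_version)   # old code handles what has arrived

pipe.is_open = False                       # turn off the water
pipe.put(3)                                # data keeps arriving, held in the pipe
pipe.put(4)
during_upgrade = pipe.drain(old_version)   # returns nothing: the pipe is off
pipe.is_open = True                        # new boiler fitted, water back on
after_upgrade = pipe.drain(new_version)
```

Items 3 and 4 arrive while the pipe is off and are processed only by the new code once it is turned back on, so nothing "drips out" during the swap.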

BTW I have a suggestion for a language-neutral data format for describing the water :-)

BTW - have you tried Erlang? - (www.erlang.org) http://www.erlang.org/download/getting_started-5.4.pdf is a good introduction.

If you'd like to learn Erlang I would be delighted to help you with any problems you might encounter.

I can set you your first homework "make a fault-tolerant persistent named FIFO - replicate the FIFO on two machines - so that it will still work if one machine fails" - hint use MnesiaDatabase - a simple version is probably less than 100 lines of code - we'll need this :-)


Erlang does not have a native interface to C or C++. All interfacing is done by pipes or sockets - this is deliberate - interfacing erlang by linking code into the system would be very dangerous if the code were incorrect - so we require that it be isolated in a different process.

Erlang runs on all major platforms. The windows distribution is a binary executable - on unix/linux machines we distribute source but the build process is known to be trouble free on all major platforms.


For about ten years I have been thinking about what's missing in Erlang - how can we make systems that are easier to understand.

I had some ideas here

http://www.erlang.org/ml-archive/erlang-questions/200508/msg00340.html      (Sorry, link seems to be dead!)

The central idea is of "reactive components"

These are Black Boxes characterised by three things

		 IP, Port and protocol

saying

		 bind Ip,port,protocol

Means there is a thing at (IP, Port) that obeys protocol

		 Step 1 *specify the protocol* (usually in terms of RPCs)

		 (Example FTP)

		 protocol FTP
		 ...
		 ls() => [file()]
		 getFile() => ok() | error()

		 Step 2 - I refine the types

		 Type ls() = "ls"
		      getFile() = {"get", str()}
		      ok() = {"ok", str()}

		 Step 3 - I serialise the type instances

		 And send them to the (IP, Port)

		 To fetch a file I send

		 Content-Type: xml
		 Content-Length: NNN
		 <tuple>
		  <str>get</str>
		  <str>The File</str>
		 </tuple>

		 and wait for

		 Content-Type: xml
		 Content-Length: NNN
		 <tuple>
		  <str>ok</str>
		  <str>  the file content ....</str>
		 </tuple>

		 Or

		 Content-Type: xml
		 Content-Length: NNN
		 <tuple>
		  <str>error</str>
		  <str> ... string ....</str>
		 </tuple>

because that's what the protocol spec says.
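
Step 3 (serialising the type instances) can be sketched in Python with the standard library's `xml.etree.ElementTree`. The function names `encode_tuple` and `decode_tuple` are hypothetical, and the framing (two header lines directly followed by the body, with `Content-Length` counted in UTF-8 bytes) is an assumption read off the example messages above, not a confirmed specification.

```python
import xml.etree.ElementTree as ET


def encode_tuple(*strings):
    """Serialise strings as <tuple><str>..</str>..</tuple> with two headers."""
    root = ET.Element("tuple")
    for text in strings:
        child = ET.SubElement(root, "str")
        child.text = text
    body = ET.tostring(root, encoding="unicode")
    length = len(body.encode("utf-8"))
    return f"Content-Type: xml\nContent-Length: {length}\n{body}"


def decode_tuple(message):
    """Parse a message produced by encode_tuple back into a tuple of strings."""
    type_line, length_line, body = message.split("\n", 2)
    if type_line != "Content-Type: xml":
        raise ValueError("unexpected content type")
    declared = int(length_line.split(":", 1)[1])
    if declared != len(body.encode("utf-8")):
        raise ValueError("Content-Length does not match body")
    root = ET.fromstring(body)
    return tuple(child.text or "" for child in root)


get_file_request = encode_tuple("get", "The File")
reply = encode_tuple("ok", "the file content")
status, payload = decode_tuple(reply)
```

A client following the FTP example would send `get_file_request` to the (IP, Port) and then branch on whether the decoded reply starts with "ok" or "error".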

What more do I need?

  • a graphic front-end to draw the interconnections
  • a universal pipe mechanism *between* components


Samuel Lampa

Jun 15, 2015, 3:19:06 PM
to flow-based-...@googlegroups.com, joe...@gmail.com
On 2015-06-15 21:13, Paul Morrison wrote:
> For your interest - not directly connected with the ongoing
> discussion, Joe Armstrong (the inventor of Erlang) sent me the
> following in 2009 - some good stuff here! I am assuming he won't
> mind my republishing it here, as it was public domain then...

Thanks Paul, these were indeed super-interesting notes! :)

For the record, I guess the closest to an FBP implementation in Erlang
is the one implemented in Elixir (which in turn compiles to bytecode for
the Erlang VM), ElixirFBP: http://www.elixirfbp.org/ (mentioned
elsewhere on the mailing list).

Best Regards
// Samuel

Sam Watkins

Jun 16, 2015, 5:36:34 AM
to flow-based-...@googlegroups.com
On Mon, Jun 15, 2015 at 03:47:30PM +0000, Ged Byrne wrote:
> When an FBP processor reads or sends an IP it is a transfer of ownership,
> not the receiving of a message. FBP prevents an IP from being in two
> places at the same time.

If necessary, we can be more flexible without breaking the FBP paradigm.

We can have multiple references to immutable IPs, with reference
counting. This works exactly the same as making copies of the IPs, but
could be more efficient if those IPs are very large (e.g. videos,
images, documents, read-only databases).
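
A minimal sketch of reference-counted immutable IPs in Python. The class `SharedPacket` and its methods are hypothetical, the counter is not thread-safe, and Python's own garbage collector is ignored so that the counting is explicit.

```python
class SharedPacket:
    """Hypothetical immutable payload shared through counted references."""

    def __init__(self, payload):
        self._payload = bytes(payload)   # bytes objects cannot be modified
        self._refs = 1                   # the creator holds the first reference

    @property
    def refs(self):
        return self._refs

    def acquire(self):
        if self._refs == 0:
            raise RuntimeError("packet already released")
        self._refs += 1
        return self

    def release(self):
        if self._refs == 0:
            raise RuntimeError("packet already released")
        self._refs -= 1
        if self._refs == 0:
            self._payload = None         # last holder gone: free the data

    def read(self):
        if self._payload is None:
            raise RuntimeError("packet already released")
        return self._payload


video = SharedPacket(b"large video frame")
for_encoder = video.acquire()
for_thumbnailer = video.acquire()
video.release()                  # the creator is done with it
for_encoder.release()
still_readable = for_thumbnailer.read()
for_thumbnailer.release()        # count reaches zero, payload is freed
```

From each holder's point of view this behaves like owning a private copy, but the large payload exists only once in memory.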

It would also be okay to have multiple references to a thread-safe
capable object. I guess this only makes sense if the IPs have OO
methods and behaviour (e.g. Java objects). For a physical metaphor,
we could think of those IPs as telephones that can communicate with the
underlying object at a distance.

This is straying a little from normal FBP, but doesn't really contradict
the principle that an IP cannot be in two places at the same time.

Both of these methods would require extra attention when passing IPs
over the network to another computer.

That discussion on state update messages for redundant processes is
interesting, thanks for sharing that.

Paul Morrison

Jun 16, 2015, 2:13:50 PM
to flow-based-...@googlegroups.com, Sam Watkins
We did something like what you describe in large apps written using AMPS, the first production FBP environment, running on IBM mainframes (green threads).

The requirement was to give multiple processes simultaneous access to read-only tables.

A table was built using one process (A), which then "sent" it to another process (B) (transferring ownership) which would store its address in a "global" table.  The variables in the global table were accessed associatively, via character strings, and could only point at immutable objects held in IPs.  In fact this was the only use we made of globals in any AMPS application.

To accommodate FBP ownership rules, process B had to stay alive until all processes that referenced the table had terminated.  This could be handled by means of a second input port on process B.  I don't remember this as causing any hardship...  :-)    This design also implied that building the table was done once and the table never changed after that - if this restriction could not be satisfied, we would normally have used a database.
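
The arrangement described above can be sketched in Python. Everything here is illustrative and not a reconstruction of AMPS: `TableHolder` plays the role of process B, the dict keyed by strings plays the role of the associative global table, and the reader counter stands in for the signals arriving on B's second input port.

```python
from types import MappingProxyType


class TableHolder:
    """Process B: owns read-only tables and stays alive while they are in use."""

    def __init__(self):
        self.global_table = {}      # character-string name -> immutable table
        self.active_readers = 0

    def accept(self, name, rows):
        # Ownership passes from the builder (process A) to the holder;
        # the table is frozen so that readers cannot change it.
        self.global_table[name] = MappingProxyType(dict(rows))

    def lookup(self, name):
        return self.global_table[name]

    def reader_started(self):
        self.active_readers += 1

    def reader_finished(self):
        # Corresponds to a "done" signal on the second input port.
        self.active_readers -= 1

    def may_terminate(self):
        return self.active_readers == 0


holder = TableHolder()
holder.accept("currency_rates", {"USD": 1.0, "EUR": 0.9})

holder.reader_started()
holder.reader_started()
rate = holder.lookup("currency_rates")["EUR"]

holder.reader_finished()
alive_with_one_reader = not holder.may_terminate()
holder.reader_finished()
can_stop_now = holder.may_terminate()
```

The holder may only terminate, and thereby dispose of the table it owns, once every reader has reported that it is finished.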
