> Ralf, I have to say that I have absolutely no recollection of saying
> anything like that quote - I don't even use the phrase "shared
> state"... I have to confess I don't know exactly what it means -
> perhaps someone else said it...?
Shared state consists of variables or objects which are writable by
more than one process in such a way that the changes are visible in all
processes that examine them.
In order to manage shared state, it's necessary to regulate access to
it. The power of FBP is that it confines shared state to the buffers
inside connections, which are accessed only in a highly disciplined
fashion. Of course, JavaFBP and C#FBP don't actually prevent the use of
shared state, but they make it unnecessary.
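A minimal sketch of that idea, in Python rather than JavaFBP or C#FBP: two components run as threads, and the only object they both touch is the bounded buffer of the connection between them. The names `producer`, `doubler` and `run_network`, and the use of `None` as an end-of-stream marker, are assumptions made for illustration and are not part of any FBP implementation.

```python
import queue
import threading


def producer(out_port: queue.Queue) -> None:
    # Keeps its own local state; the connection is the only thing it shares.
    for n in range(5):
        out_port.put(n)      # blocks while the bounded buffer is full
    out_port.put(None)       # hypothetical end-of-stream marker


def doubler(in_port: queue.Queue, results: list) -> None:
    # Receives packets one at a time; never touches the producer's variables.
    while (packet := in_port.get()) is not None:
        results.append(packet * 2)


def run_network() -> list:
    connection = queue.Queue(maxsize=2)   # the bounded connection buffer
    results: list = []                    # read by the caller only after join()
    threads = [
        threading.Thread(target=producer, args=(connection,)),
        threading.Thread(target=doubler, args=(connection, results)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Nothing here prevents a component from reaching for a global variable, which matches the point above: the discipline makes shared state unnecessary rather than impossible.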
--
John Cowan  co...@ccil.org  http://www.ccil.org/~cowan
Unless it was by accident that I had offended someone, I never apologized.
        --Quentin Crisp
That said, however, instead of, as Ralf suggested, restricting the
problem to using small in-storage tables of questions and answers, let
us assume a database comprising the questionnaire and possible answers
sitting on the cloud, with the desktop app looking after formatting.
Now you could have the server generate HTML, or you can just store raw
data in the cloud, and use some of the processing power on the users'
desktops to do the formatting. The server side of this app might then
look something like the diagram shown on the cover of the 2nd edition of
my book - although it shows multiple back-ends, which might not be
necessary in Ralf's example. In this case, the server is servicing
multiple users, and the two sections of the attached diagram are running
on separate machines. An app which only services a single user,
running on a single machine, could of course be simpler, but you might
still want to allow for the possibility of migrating to a server
arrangement at some later date - most of the components would not need
to change...
One last thought: in light of the thread about FBP on hardware, perhaps
we should visualize FBP components running on different microprocessors,
and see what that does to our designs!
Tom,
Yes we could.
But should we?
Regards,
Ged
On Feb 9, 10:48 pm, Tom Young <f...@twyoung.com> wrote:
> It should be simple enough to separate the UI concerns from the domain
> logic by splitting 'Input Answers' into two components.
>
> -Tom
Tom,
Then we are only creating shared state artificially.
I'm saying that we should first define requirements that actually introduce the problem of shared state.
Paul's suggestion of making it a multi-user system does this.
Cheers,
On 10 February 2012 17:56, Tom Young <twy...@twyoung.com> wrote:
Yes, in this case we should, because the point is to demonstrate a 'shared state' situation. If all functionality is confined to one component, or the UI and the logic issues are separated, then the 'shared state' issue is hidden from view.
I think the shared state issue will not be a problem in a well-designed FBP -- just keep state in the appropriate component (or IP). The object here is to show what that design might look like.
Regards,
The argument was that separating out the UI would expose the issue or at least demonstrate how to address the issue. How else can we address the argument than by doing the separation?
> I'm saying that we should first define requirements that actually introduce the problem of shared state.
Perhaps we should agree on whether we are discussing the Distributed Shared State Concurrency Problem or some other shared state problem. If the former, then I would argue that the problem does not arise unless we imagine that an FBP is intended to survive (i.e. continue to produce correct results) despite one or more component nodes failing.
> Paul's suggestion of making it a multi-user system does this.
Paul's diagram does not tell us how the problem is addressed. It is possible to imagine that all state is confined to the IPs and all components are stateless; or that some component (perhaps 'Display Questions') is keeping track of what is going on; or something in between.
seems to me the trade off is: shared state vs. inconsistency. that is,
if the state is not shared but is duplicated then there is always some
period of time after a change when the duplicated state isn't the
same.
Actually, we had one fairly common shared state situation that I
described earlier in this thread: shared tables - these are still IPs,
however.
I also think there is a well known shared state (?) situation at a
"coarser-grained" level: namely shared databases. Here, IMO, the whole
sharing issue has been very thoroughly explored in the area of
traditional transaction-based systems. However, you might not even need
that complexity, as Ralf's questionnaire and its possible answers are
pretty much read-only...
What am I missing?!
> [T]here are at least three different ways to get the effect. The first
> is to include the state in the IP. The second is to have a separate
> process send the state to the two processes; this works nicely for
> read-only. The third is to have the two sharing processes ship the
> state back and forth as it is modified.
There is a fourth approach, provided each change can be made atomically
under the implementation language's memory model. All components that
need read-only access to the state have direct pointers to it, but there
is a dedicated component that updates the state as it receives command
IPs from other processes.
Note that this design provides eventual consistency; that is, it's
possible for a process to see an old version of the state because the
server component has not yet updated it. If that's good enough, as it
often is, then it works cleanly.
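A rough sketch of this fourth approach, in Python for brevity. It assumes that rebinding a single reference is effectively atomic in the runtime (true of CPython in practice, but an assumption here, not a language guarantee). `StateHolder`, `publish`, `updater`, and the `None` shutdown marker are hypothetical names invented for this example.

```python
import queue
import threading
from types import MappingProxyType


class StateHolder:
    """Hypothetical holder: readers keep a reference to it and call snapshot()."""

    def __init__(self, initial: dict) -> None:
        self._current = MappingProxyType(dict(initial))

    def snapshot(self):
        # Readers get a read-only view of whichever version is current.
        return self._current

    def publish(self, new_state: dict) -> None:
        # A single reference assignment; only the updater component calls this.
        self._current = MappingProxyType(new_state)


def updater(holder: StateHolder, commands: queue.Queue) -> None:
    # The one component that is allowed to change the state.
    while (cmd := commands.get()) is not None:
        key, value = cmd
        new_state = dict(holder.snapshot())  # copy, then modify the copy
        new_state[key] = value
        holder.publish(new_state)            # swap in the new version


def demo():
    holder = StateHolder({"x": 1})
    commands = queue.Queue()
    old = holder.snapshot()                  # a reader grabs the current version
    worker = threading.Thread(target=updater, args=(holder, commands))
    worker.start()
    commands.put(("x", 2))                   # a command IP from some other process
    commands.put(None)                       # hypothetical shutdown marker
    worker.join()
    return old["x"], holder.snapshot()["x"]
```

The reader that took its snapshot before the command was processed still sees the old value, which is the eventual consistency described above.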
--
John Cowan  co...@ccil.org  http://www.ccil.org/~cowan
A rabbi whose congregation doesn't want to drive him out of town
isn't a rabbi, and a rabbi who lets them do it isn't a man.
        --Jewish saying
[...] The general answer is that you put the shared
state data in its own process. Thus, in your example, we create
another process called Q that holds XYZ. The flows then could be
A -> B -> C -> D
|
T -> Q
|
E -> F -> G
[...]
I am sure you will ask what (2) means. All components, whether purely functional, pure data, or a mix, have input and output ports. What is important is the dynamic connection between them after they have travelled through pipes (connections).
A pure data component (e.g. C) also has ports, like functional components: at least one for writing and another, distinct one for reading. You manipulate the "shared" state variable C only through these distinct ports. Because you can connect only one pipe at a time to the "write port", you do not share component C for writing, but you can for reading (i.e. it's like copying / duplicating the information). Or you can send two values in order through a pipe to C's input (write) port.
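One way to picture such a data component is sketched below in Python. For brevity the write port and the read port are multiplexed onto a single inbox of tagged packets, which is a simplification of the two distinct ports described above. The tags `"write"`, `"read"` and `"stop"` and the function names are assumptions made for this example only.

```python
import copy
import queue
import threading


def state_holder(inbox: queue.Queue, initial) -> None:
    state = initial                        # private to this component
    while True:
        kind, payload = inbox.get()
        if kind == "write":
            state = payload                # writes arrive one at a time, in order
        elif kind == "read":
            # payload is the requester's reply queue; it receives a copy
            payload.put(copy.deepcopy(state))
        elif kind == "stop":
            break


def demo():
    inbox = queue.Queue()
    holder = threading.Thread(target=state_holder, args=(inbox, {"xyz": 0}))
    holder.start()
    reply = queue.Queue()
    inbox.put(("write", {"xyz": 41}))
    inbox.put(("read", reply))
    first = reply.get()
    first["xyz"] += 1                      # changing the copy leaves the holder alone
    inbox.put(("read", reply))
    second = reply.get()
    inbox.put(("stop", None))
    holder.join()
    return first, second
```

Readers only ever get copies, so nothing is shared for writing. Merging changes made to two separate copies is a separate problem that this sketch does not address.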
[...]
> It helps to use ports to get a copy of the state. The idea is to send
> a request to the process holding the state, which sends a copy back
> the requesting process.
i don't yet see how that is a silver bullet. what if some other
process also wants the data to manipulate? how does it get merged? or
are there locks? if there are locks, how do you coordinate locks to
avoid deadlock? or zombie locks? etc.
sincerely.
> There are issues with that fourth approach. First issue. Significant
> updates take time. I don't think that in place atomic updates are all
> that easy to come by,
What counts as "significant" depends on the application. In the Java
memory model (which is the only one I'm really familiar with) a change to
a single storage location such as an instance variable, array component,
or global variable is guaranteed to happen atomically (unless it is of
type 'long' or 'double', for historical reasons).
So, for example, the update component can take command IPs from other
components, construct a new updated object on the heap, and atomically
substitute it for the old object in an exposed global variable. This is
the general case, and often not necessary: for example, if the global
state is a large matrix of floats, all the update component has to do
is set the specified (x,y) element to the new value: there is no need
to copy the whole matrix.
> Second issue. How do we garbage collect?
We garbage-collect with a garbage collector.
--
And through this revolting graveyard of the universe the muffled, maddening
beating of drums, and thin, monotonous whine of blasphemous flutes from
inconceivable, unlighted chambers beyond Time; the detestable pounding
and piping whereunto dance slowly, awkwardly, and absurdly the gigantic
tenebrous ultimate gods --the blind, voiceless, mindless gargoyles whose soul
is Nyarlathotep. (Lovecraft)
John Cowan  co...@ccil.org
For instance, programmers have been discussing OOP (Object Oriented Programming) for decades now, and many still think an object in the OOP paradigm resembles a real object in nature. At least we, the guys studying FBP, know that this is not quite true. OOP became a religion, like many other paradigms. When I first discovered OOP I thought my problems were over and that objects were like electronic chips that I could connect together. Nothing could be further from the truth. There was a big discrepancy between reality and concepts.
I have observed a strong tendency among people, especially those coming from academia, to create their own terms, eluding the real meaning of words (i.e. as usually defined in dictionaries). Common sense is broken, and programmers are speaking their own language, using the same words but with different meanings.
Well, an object should remain an object, a component is a component, and a message or parameter remains what the dictionary says it is. Also, something could be a component, a node in a graph, an object and a message at the same time. Nothing is wrong with this either, because that particular entity has all the aspects / qualities described by these words. Sometimes an object is not a message but is a component, just because it is part of something else and not passed along.
Let's take for instance the phrase "Information Packet" (IP). Well, it's good enough to tell me that it contains information, but the whole program (data, functions) contains information. Being a packet suggests to me that it is a container that will transport information from A to B. This is good. But why not use the common terms, parameter or message, which are so common that even a non-programmer understands what they are? You see?
By the way, a quote from Tom Young: "Among other issues: how IPs can be functional escapes me".
IPs can be functional. What is so strange about that? Can you not send an object that includes behaviour to another component? You do this in OOP all the time. You can receive predicates, objects, or whatever else as parameters. Maybe I have not understood your concern.
It is quite hard for me to understand why "my terms" are so special or hard to understand. I have used words that are quite common for programmers and non-programmers alike (component, functional, parameter, message, object, data etc). We don't have to invent special terms for FBP. Even FBP is an object of confusion when talking about it in the context of Data Flow Programming, for instance. But the term "Data Flow Programming" was already taken :) and associated with meanings different from what we want. Besides that, the flow can contain something other than just data (i.e. data without behaviour, to be more clear). It can contain functional components (the wonder of Tom Young).
It seems we starve for uniqueness. I understand that we have to come up with phrases "unique" to our domain in order to differentiate from others. We think that "our domain" is special and deserves special words. I don't think so.
I don't have any need to add other special words for FBP but I do agree that each term, even if it is quite common and well understood should be put in the context of FBP and explained as such.
In the end I am curious to understand why "my terms" (that are so common after all) were so special and hard to understand.
> - Read only shared state can always be eliminated by making it an IP and
> creating copies when necessary.
If something is read-only, it's a value, not state. State by definition is
mutable.
--
John Cowan  co...@ccil.org  http://www.ccil.org/~cowan
You know, you haven't stopped talking since I came here.
You must have been vaccinated with a phonograph needle.
        --Rufus T. Firefly
(i guess i don't yet see what is novel about any of the approaches
described vs. how people have wrestled with state in e.g. distributed
or even just concurrent systems?)
sincerely.
I think many emails on this list, including some of Dan's, Paul's,
and mine, can be difficult to understand!
Many of us have different ideas or variations on FBP.
Dan has some different ideas which might be hard to communicate,
e.g. 'sending components as messages'. Some people will want this,
some people won't. I think it can be useful, and also confusing!
We send queries over the network to visit the database. We can't
send the database over the network to visit the query. I think the
database is like a big collection of data IPs, and the query can be
implemented as a FBP component (or encapsulated network).
So it might be a good idea for our FBP components to be portable from
machine to machine, e.g. as Java bytecode, portable source code, or a
graph definition.
I've been reading a bit about Haskell, like "these functions take other
functions as arguments, and return different functions!" I'm not sure
if this would be a good feature for FBP! It can make it hard to know
what is going on; and it can help to make general and concise programs.
Sam
"So where are we in the discussion on "multiple flows working on same data" (i.e. shared state)?"
It can always be avoided by lugging it around in IPs? That does not
feel right (at least as a general solution). Firstly it makes for
bloated IPs. Secondly it would require loops, since once a flow comes
to an end the IP with the state would get stuck.
Thanks Ron, and well put!
> Dan's thoughts force us to think outside the box.
Yes, I was trying to reply something sensible about processes
and/or components as messages... but I'm still thinking!
I've been reading a book on Haskell, so my head already feels funny.
Functions, types, classes, kinds, functors, monoids, monads...
aargh, they're all blurring together!
Sam
--
"Mr. Osbourne, may I be excused? My brain is full."
http://sam.nipl.net/pix/my-brain-is-full.jpg
Another thing that I predicted is that, as FBP catches on, it will be
applied in a sort of generic sense, just like (in N. America) "bandaid"
came to mean any kind of sticking plaster - or even just a stop-gap
solution. This is where I believe we could start using Vladimir's
http://flowbased.org/ - or Tom's matrix -
https://docs.google.com/spreadsheet/ccc?key=0AqcHR01C1GJ7dDU5T0tkWHgzVURhWGw3cy1kOGJqQVE&hl=en#gid=0
, as good places to say what FBP is not, and what it is. Do we need a
new term for the "book" FBP?
I venture to say that the FBP I describe has been in continuous use at a
bank for around 40 years, evolving over time through many hardware,
software and business changes. Whatever we are going to call it, this
is the context in which these ideas were developed. So I think, at the
very least, it can serve as a solid base for future development. The
so-called "digressions" may not be very close to the book FBP as it
currently exists, but may well point the way to future extensions and
evolution - I would hope incremental, rather than us all having to start
again from scratch :-)
I also wanted to endorse Ron's comment about FBP allowing different
paradigms to coexist - this has been a key motif in my work for years,
but IMO it has become more difficult in recent years due to the
proliferation of byte codes such as JVM and CLR - hence an earlier
thread in this Group about a "universal" infrastructure - which seems to
have died...?
One other comment: has it struck people that the participants in this
conversation are located in - or hail from - at least the following that
I am sure about: Romania, Russia, Canada, UK, US,...? The Internet is
really an amazing thing! :-)
http://en.wikipedia.org/wiki/Actor_model#Later_Actor_programming_languages
now, i know there are differences, and differences do matter! it is
just that it seems like this FBP thing -- the way i read the messages
mostly so far, not absolutely, just mostly -- is that it is a new
unique different never seen before never duplicated idea, which seems
odd, and also that everybody has to come up with their own vocabulary
for some reason.
perhaps one pie in the sky wunderbar solution would be a new wikipedia
page with a big table matrix on it. y axis is "your own personal term
{actor,auton,...}" and the x axis would be the features we can use to
categorize. to see the subtle distinctions.
sincerely.
again, see the experience of the Actor model, in particular via Erlang
since that has been used extensively in real world commercial
nine-nines type software. i think you will discover that while the
rule of DAMMIT THERE'S NO SHARED STATE is a wonderful beginning to get
people out of their really bad habits, the shared state will
eventually crawl back in!
heck, look at actual hardware -- there's a fair bit of logic in cpus
to deal with cache coherency, no?
now i'm not actually an Erlang developer, i'm just a gadfly on the
wall, but my understanding is that most everybody breaks down and uses
mnesia, the Erlang database, which is really just using shared state,
ha ha. oh well. a guy can dream.
i think the main non-shared-state alternative contender these days is
the new-fangled no-sql-eventual-consistency thing.
(stm is about doing shared state better. it is not something that
would support distribution, it would fly in the face of the N
Fallacies of Distributed Computing, i believe.)
sincerely.
personally i wouldn't say it is mistaken. i'm more a visual thinker
myself. and here's somebody cooler and more famous and more productive
than myself who says the same thing.
now, of course, there are a zillion ways to do something wrong, and
often visual programming has been seen as failing, but maybe like
edison some day (if fbp didn't already do it) we'll get through all
the wrong ways to use graphical stuff for programming, and end up with
the good one(s).
sincerely.
> heck, look at actual hardware -- there's a fair bit of logic in cpus
> to deal with cache coherency, no?
Primarily because the original Von Neumann model didn't make distinctions
that later turned out to be useful, like between code and data.
(Admittedly, there has to be a point at which data *becomes* code.)
> now i'm not actually an Erlang developer, i'm just a gadfly on the
> wall, but my understanding is that most everybody breaks down and uses
> mnesia, the Erlang database, which is really just using shared state,
> ha ha. oh well. a guy can dream.
Yes, but it's *encapsulated* shared state. The trouble with "normal"
threaded programs in conventional languages is that everything's unsafely
shared by default, and you have to take special pains to share safely
or to avoid sharing.
--
John Cowan <co...@ccil.org>  http://www.ccil.org/~cowan
Using RELAX NG compact syntax to develop schemas is one of the simple
pleasures in life....
        --Jeni Tennison
i don't follow. i thought RDBMS had lots of tweaky code to deal with
all the concurrency and locking and optimization and performance
issues, so i do not see how encapsulation is a silver bullet. i mean,
OO is supposedly encapsulation, and Java sucks wrt concurrency i
suspect. so there's something about the term *encapsulation* that is
worth expanding upon?
sincerely.
> i don't follow. i thought RDBMS had lots of tweaky code to deal with
> all the concurrency and locking and optimization and performance
> issues, so i do not see how encapsulation is a silver bullet.
Yes, it does, but all that code is encapsulated into the database engine:
it doesn't have to show up in the rest of your code.
--
John Cowan co...@ccil.org http://www.ccil.org/~cowan
O beautiful for patriot's dream that sees beyond the years
Thine alabaster cities gleam undimmed by human tears!
America! America! God mend thine every flaw,
Confirm thy soul in self-control, thy liberty in law!
--one of the verses not usually taught in U.S. schools
i guess it seems less than ideal to me when trying to figure out how
sausage is made to say "don't look at how we make sausage, just eat
it."
sincerely.
> Well, yes, but it's not that simple. I've never been fond of the
> actor model even though I am using something like it myself. Actor
> based (quasi actor based?) generally embed the spawning of processes/
> actors/autons within the code. FBP does not. Instead the network is
> created graphically, i.e., it is visible in advance versus being
> constructed dynamically.
>
> In my probably mistaken view the difference is critical. If you don't
> know what you've got (which you don't in a sufficiently large dynamic
> network) then you need a way to find what you need.
That's merely a special case. The configuration mini-language in
JavaFBP and C#FBP is just a stereotyped use of ordinary Java or C#
methods, and there's nothing stopping you from creating new Components
and Connections at run time, or even dynamically loading new Component
subclasses to provide novel functionality at runtime.
Like any sharp tool, dynamic network construction can cut you if misused.
Static networks can be pretty hard to understand too, if they are big
and complex enough. (I'm maintaining a static network -- a concept
rather than a processing network -- right now with about 350 nodes, and
it's hard to hold it all in my head.) However, it's common for a Unix
process to create a predecessor or successor to itself and insert them
into the pipeline, so a disciplined use of dynamic behavior can be useful.
--
John Cowan  co...@ccil.org  http://www.ccil.org/~cowan
If you have ever wondered if you are in hell, it has been said, then
you are on a well-traveled road of spiritual inquiry. If you are
absolutely sure you are in hell, however, then you must be on the
Cross Bronx Expressway. --Alan Feuer, NYTimes, 2002-09-20
> i guess it seems less than ideal to me when trying to figure out how
> sausage is made to say "don't look at how we make sausage, just eat
> it."
The point of that story is that if you looked at how it's made, you
probably wouldn't want to eat it. Similarly, it's fine to have complex
multithreaded code *somewhere* in the system -- after all, the OS kernel
is just that -- but not in *my* code!
--
"But I am the real Strider, fortunately," John Cowan
he said, looking down at them with his face co...@ccil.org
softened by a sudden smile. "I am Aragorn son http://www.ccil.org/~cowan
of Arathorn, and if by life or death I can
save you, I will." --LotR Book I Chapter 10
see, i would be day dreaming smoking stuff hoping somebody would
actually consider how to do something like an OS kernel w/out having
to do pure evil Linux-kernel-have-you-seen-the-humanity-in-there style
coding. i want the future to have some theory and language and
Sufficiently Smart Compilers (i know that is a bad thing to wish for,
at least until they get over a hump that makes them not suck more than
they help) so that we aren't ever writing horrible, horrible code. :-)
> One of the things about information packets is that they are not
> messages as such. Each IP is a unique object with a well defined
> lifetime. It comes into existence either as an input from outside or
> as a de novo creation. It passes from place to place until it comes
> to the end of its existence. Messages, on the other hand, are not
> unique objects as such.
Again, it depends. In early Smalltalk systems, messages *were* unique
objects, instances of class Message. They were created whenever code
sent a message to an object. In Smalltalk-80 (the current version),
message-sending is implemented as function call with a search for the
correct function, as in Simula-67 and all later OO languages.
However, it is still possible for the Smalltalk programmer to create
and send reified messages (at a cost in efficiency), and if there is
something wrong with a message (for example, the receiving object does
not understand it), it is reified after the fact and delivered using
the message doesNotUnderstand (which by default throws an exception,
but can be redefined).
--
John Cowan co...@ccil.org
"You need a change: try Canada" "You need a change: try China"
--fortune cookies opened by a couple that I know
> On shared state. From my experience of writing flow-based programs so far
> "shared state" problem occurs if there is a data resource that should be
> read/written by several processes concurrently, especially if all events in
> the system are asynchronous. But if you enclose such a resource into a
> component that manages the resource and access to it just like a DBMS does,
> accessing such a shared resource becomes as clumsy as querying a database.
Actually, querying a database is no more clumsy than reading a file from a
file-reader component. You send an IP to the database component's input
port that represents the query, and you get back a stream of IPs representing
the rows followed by an EOF IP.
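As a rough illustration of that shape, here is a Python sketch of a database component built on the standard `sqlite3` module. The `answers` table, its contents, the `EOF` sentinel object and the `(sql, params)` form of the query IP are all assumptions made for this example. A real component would connect to an existing database rather than build one in memory.

```python
import queue
import sqlite3
import threading

EOF = object()  # hypothetical end-of-stream IP


def db_component(in_port: queue.Queue, out_port: queue.Queue) -> None:
    # The connection lives entirely inside this component.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE answers (question TEXT, answer TEXT)")
    conn.executemany(
        "INSERT INTO answers VALUES (?, ?)",
        [("q1", "yes"), ("q1", "no"), ("q2", "maybe")],
    )
    while (query := in_port.get()) is not None:
        sql, params = query                 # the query IP
        for row in conn.execute(sql, params):
            out_port.put(row)               # one IP per row
        out_port.put(EOF)                   # marks the end of this result stream
    conn.close()


def demo():
    requests, rows = queue.Queue(), queue.Queue()
    worker = threading.Thread(target=db_component, args=(requests, rows))
    worker.start()
    requests.put(
        ("SELECT answer FROM answers WHERE question = ? ORDER BY answer", ("q1",))
    )
    received = []
    while (ip := rows.get()) is not EOF:
        received.append(ip)
    requests.put(None)                      # hypothetical shutdown marker
    worker.join()
    return received
```

From the caller's side this looks much like draining a file-reader component: send one packet in, read packets until the end marker arrives.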
--
John Cowan co...@ccil.org
"Not to know The Smiths is not to know K.X.U." --K.X.U.
I am really impressed with the insights you folks have into FBP! I
think Dan has made us all dig a bit more deeply!
Component has exactly the same meaning in Component-oriented programming and Integrated circuits engineering (e.g. VHDL) as in FBP. I can't see what's wrong with it if it has been used in the computer industry for say last 40 years.
On 18.02.2012 2:26, "Aristophanes" <c...@tiac.net> wrote:
> For component based programming all of the focus is on the individual
> components. The interfaces that connect each of them are entirely
> isolated from one another. For example, in Unix there are STDIN,
> STDOUT and STDERR. They are just byte streams.
This is true in principle, but in practice almost all the components
(programs) that participate in the pipe-and-filter architecture are
processing text line-by-line, so that each line (terminated by a newline
character) functions as an IP. Like FBP IPs, lines can be created,
passed through, modified, or dropped by any component.
(Note: Pipes are fixed-size buffers. The size varies from implementation
to implementation: it cannot be less than 512 bytes, but sizes from 8K
to 64K are more usual nowadays.)
> In Flow Based Programming the focus is on the flow, within which the
> components are just actors. The other actor is the Information Packet
> which has structure and follows rules.
Unix-style IPs usually have structure too, though some components ignore
the structure and treat the IP as a string. More typically, components
divide each IP into fields using a delimiter character (often tab
or whitespace, but settable by a parameter). In essence, lines are
dynamically rather than statically typed, but that does not mean they
are typeless.
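A small Python sketch of the idea that a line is an IP whose structure is discovered at run time. The function names and the default tab delimiter are assumptions chosen for illustration. The behaviour loosely resembles what tools like cut do, and is not a reimplementation of any of them.

```python
from typing import Iterable, Iterator, List


def split_fields(lines: Iterable[str], delimiter: str = "\t") -> Iterator[List[str]]:
    # Each line is one IP; its fields are found by splitting on the delimiter.
    for line in lines:
        yield line.rstrip("\n").split(delimiter)


def select_column(records: Iterable[List[str]], index: int) -> Iterator[str]:
    # Passes on one field per IP; IPs that lack the field are dropped.
    for fields in records:
        if index < len(fields):
            yield fields[index]


lines = ["a\t1\n", "b\t2\n", "short\n"]
second_column = list(select_column(split_fields(lines), 1))
```

The third line has no second field, so the check happens while the data flows through, not when the pipeline is wired up. That is the sense in which the lines are dynamically typed.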
> The combination of the two allows those reading a flow network
> to reason about the flow. It is possible to guarantee that an
> information packet that enters the network will always have a final
> destination where it will exit the network.
This is true only for the subset of networks whose components neither
create nor drop IPs. But many useful components do so. Consider a "top
ten" component, which accepts IPs containing a key such as a number,
and passes through the 10 IPs with the largest keys. (In practice,
10 would be a parameter.) Or even simpler, a component which accepts
IPs until there are no more, and then produces a single IP containing
the number of IPs read.
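Both components can be sketched in a few lines of Python, using generators as stand-ins for components. The `(key, payload)` tuple layout of an IP is an assumption made for this example.

```python
import heapq
from typing import Iterable, Iterator, Tuple


def top_n(packets: Iterable[Tuple[int, str]], n: int = 10) -> Iterator[Tuple[int, str]]:
    # Reads all of its input before emitting anything; every IP outside
    # the n largest keys is dropped.
    yield from heapq.nlargest(n, packets, key=lambda ip: ip[0])


def count(packets: Iterable[object]) -> Iterator[int]:
    # Consumes every incoming IP and creates exactly one new IP at the end.
    total = 0
    for _ in packets:
        total += 1
    yield total


packets = [(3, "c"), (1, "a"), (5, "e"), (2, "b")]
largest_two = list(top_n(packets, 2))
how_many = list(count(packets))
```

Neither component passes every incoming IP to an exit, so neither fits the guarantee quoted above.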
> Could I go as far to say that FBP makes the end-to-end flows of the
> network deterministic rather than emergent?
I don't understand what that means. If you mean that there are no race
conditions, then yes. If you mean that the fate of every packet is
statically predictable, then no.
--
<co...@ccil.org>  http://www.ccil.org/~cowan
How they ever reached any conclusion at all is starkly unknowable
to the human mind.
        --"Backstage Lensman", Randall Garrett
> In FBP I can answer this by answering the question will [2] or [3]
> drop any packets? If the answer is no, then I know that every packet
> received by [1] will be received by [4]. As long as [2] and [3] do
> not contain any drop statement I know it to be true.
Provided you can inspect the source code. In any case, the converse is
not true: the mere presence of calls on drop does not guarantee what, if
anything, is dropped. This is an ineluctable consequence of programming
your components in a Turing-complete language with an arbitrary amount
of local storage.
> This is what I mean by the flow being deterministic. "For everything
> that happens there are conditions such that, given them, nothing else
> could happen.": http://en.wikipedia.org/wiki/Determinism
In that sense any single-threaded program is deterministic. The advantage
of all FBP-style programming is that each component is deterministic;
the necessary control of non-determinism happens inside the engine.
> In unix [2] could receive the text of Shakespeare on STDIN, ignore it
> and output a list of swear words to [3]. There are no rules that govern the
> the relationship between input and output. The output is simply a
> product of the code's behaviour.
An FBP component that does exactly that is quite possible; it drops all
its input packets and creates new output packets from scratch. Neither
one of them is particularly sensible, but similar components might be
used for mocking out network connections or databases during testing.
> If this component receives a 100 customer records and drops them all,
> replacing them with a single count value then this looks wrong to me.
> Where has all the customer data gone?
>
> It tells me that if I am sending my customer records to this component
> then they should be copies, not the original.
If your customer records only exist as transient IPs within a network,
you have far worse problems than programming style. Of course they
would be copies of the persistent versions in some database.
It is quite common for me to create a network whose first few steps are
massive data reductions; they throw away almost all the input in order
to focus on what I care about at the time.
> I think more work is needed to fully realise the potential. For
> example, in the COUNT example 100 customer records could be read, 99
> dropped and one of them replaced by a single integer. That shouldn't
> be allowed. It should be necessary to drop all 100 customer records
> and then create a single count integer.
Arguably IPs should be treated as immutable. That gains most of the
performance of a move-pointer architecture while keeping the conceptual
simplicity of a copy-all-data architecture like Unix. Unfortunately this
cannot be enforced in Java.
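For comparison only, here is what treating IPs as immutable might look like in Python, using a frozen dataclass. `CustomerIP` and `rename` are hypothetical names. Python's enforcement is also only skin-deep, since a determined caller can bypass it with `object.__setattr__`.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class CustomerIP:
    customer_id: int
    name: str


def rename(ip: CustomerIP, new_name: str) -> CustomerIP:
    # A "modifying" component builds a new IP and leaves the original alone.
    return replace(ip, name=new_name)


original = CustomerIP(1, "Ann")
changed = rename(original, "Anne")

try:
    original.name = "Bob"        # ordinary assignment is rejected
    mutation_blocked = False
except FrozenInstanceError:
    mutation_blocked = True
```

Because nothing can change `original` through normal assignment, the same object can be handed to several components by reference, which is where the move-pointer performance comes from.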
--
John Cowan  co...@ccil.org  http://www.ccil.org/~cowan
As you read this, I don't want you to feel sorry for me, because,
I believe everyone will die someday.
        --From a Nigerian-type scam spam
Ged Byrne <ged....@gmail.com> wrote:
>In FBP I can answer this by answering the question will [2] or [3] drop any
>packets? If the answer is no, then I know that every packet received by
>[1] will be received by [4]. As long as [2] and [3] do not contain any
>drop statement I know it to be true.
>
>This is what I mean by the flow being deterministic. "For everything that
>happens there are conditions such that, given them, nothing else could
>happen.": http://en.wikipedia.org/wiki/Determinism
>
>Every packet received by [4] was created by [1], [2] or [3]. They cannot
>have come from anywhere else. Every packet sent by [1] will either be
>received by [4] or dropped by [2] or [3]. These relationships across the
>flow are guaranteed by the engine, I know them to be true without having to
>look at any of the code.
>
>In Unix [2] could receive the text of Shakespeare on STDIN, ignore it and
>output a list of swear words to [3]. There are no rules that govern the
>relationship between input and output. The output is simply a product of
>the code's behaviour.
>
>This is what I mean by emergent: "complex systems and patterns arise out
>of a multiplicity of relatively simple interactions." Each line of code is a
>simple interaction, and I have to read all of them to know what is going to
>happen. That's why software is so difficult.
>
>Consider the example of the COUNT component. It receives all IPs and
>emits a single count value.
>
>If this component receives 100 customer records and drops them all,
>replacing them with a single count value then this looks wrong to me.
> Where has all the customer data gone?
>
>It tells me that if I am sending my customer records to this component then
>they should be copies, not the original. That seems untidy. Instead I
>would have two output ports from my component. One port would pass the
>records on, the other would send a single count. The principles of FBP make
>the need for this obvious.
>
>I think more work is needed to fully realise the potential. For example,
>in the COUNT example 100 customer records could be read, 99 dropped and one
>of them replaced by a single integer. That shouldn't be allowed. It
>should be necessary to drop all 100 customer records and then create a
>single count integer.
>
The problem is to create a file OUT which is a subset of another one IN,
where the records to be output are those which satisfy a given criterion
"c". Records which do not satisfy "c" are to be omitted from the output
file. This is a pretty common requirement and is usually coded using some
form of the following logic:

    read into a from IN
    do while read has not reached end of file
        if c is true
            write from a to OUT
        endif
        read into a from IN
    enddo

Figure 3.2

What action is applied to those records which do not satisfy our
criterion? Well, they disappear rather mysteriously due to the fact that
they are not written to OUT before being destroyed by the next "read".
Most programmers reading this probably won't see anything strange in this
code, but, if you think about it, doesn't it seem rather odd that it
should be possible to drop important things like records from an output
file by means of what is really a quirk of timing?

Suppose we decide instead that a record should be treated as a real thing,
like a memo or a letter, which, once created, exists for a definite period
of time, and must be explicitly destroyed before it can leave the system.
We could expand our pseudo-language very easily to include this concept by
adding a "discard" statement (of course the record has to be identified
somehow). Our program might now read as follows:

    read record a from IN
    do while read has not reached end of file
        if c is true
            write a to OUT
        else
            discard a
        endif
        read record a from IN
    enddo
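For comparison, the version of the loop with the explicit discard might be sketched in Python as below. The `select` function and the sample records are made up for illustration and are not taken from the book. The point is only that every record goes to exactly one visible outcome, so the input can be fully accounted for.

```python
def select(records, criterion):
    """Route every record to exactly one of two outcomes: written or discarded."""
    written, discarded = [], []
    for record in records:
        if criterion(record):
            written.append(record)      # "write a to OUT"
        else:
            discarded.append(record)    # "discard a": an explicit, visible decision
    return written, discarded


records = [3, 8, 1, 12, 7]
out, dropped = select(records, lambda r: r > 5)

# Nothing vanishes: every input record is either written or discarded.
accounted_for = len(out) + len(dropped) == len(records)
```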
> In FBP I can answer this by answering the question will [2] or [3]
> drop any packets? If the answer is no, then I know that every packet
> received by [1] will be received by [4]. As long as [2] and [3] do
> not contain any drop statement I know it to be true.
Provided you can inspect the source code. In any case, the converse is
not true: the mere presence of calls on drop does not guarantee what, if
anything, is dropped. This is an ineluctable consequence of programming
your components in a Turing-complete language with an arbitrary amount
of local storage.
[...]
> It also makes the component code easier to understand. Take for example
> your components that discard most of the packets to focus on what you
> are interested in. It is made explicit that packets are being dropped
> and why.
That is as much to say that manual storage reclamation is easier
to understand and less buggy than automatic storage reclamation.
All experience with garbage collectors is against that claim. Here you
are adding two lines to a trivial component; not all components will
be so trivial.
Paul Morrison scripsit:
> Most programmers reading this probably won't see anything strange in
> this code, but, if you think about it, doesn't it seem rather odd that
> it should be possible to drop important things like records from an
> output file by means of what is really a quirk of timing?
I don't understand where timing comes in. This simply assumes that
all packets belonging to a component and not otherwise disposed of are
dropped when the component terminates, or sooner if it can be proved that
they will not be needed again (which is what a garbage collector does).
Both versions of the code, with and without manual drop, have entirely
deterministic behavior.
--
John Cowan co...@ccil.org http://ccil.org/~cowan
Consider the matter of Analytic Philosophy. Dennett and Bennett are well-known.
Dennett rarely or never cites Bennett, so Bennett rarely or never cites Dennett.
There is also one Dummett. By their works shall ye know them. However, just as
no trinities have fourth persons (Zeppo Marx notwithstanding), Bummett is hardly
known by his works. Indeed, Bummett does not exist. It is part of the function
of this and other e-mail messages, therefore, to do what they can to create him.
I believe that it is this principle, that data should not just be allowed to disappear into the ether, that distinguishes Flow based from Component based.
> My quote above may be pre-OO, but it is (or was) a pretty standard
> coding technique: read a record into an area of storage, and then write
> it out from the same area. If you assume that the record is always read
> into a newly allocated object, then I think your objection holds... but
> you can't force programmers to always do that - even in OO.
Oh, if you are reading directly out of the OS buffer, then yes, it's
timing-dependent. I understand that pre-Unix operating systems often
did that -- just returned a pointer into the buffer, and you had to
grab your stuff before it was overwritten. In Java, what you get out
of a Reader is either a primitive value or else a String, and Strings
are immutable. Even in the C model, you can rely on the contents of
the userland buffer to remain the same (unless you change them yourself)
until you do another read, and for writes not to return until the data
has been copied into the kernel.
--
Let's face it: software is crap. Feature-laden and bloated, written under
tremendous time-pressure, often by incapable coders, using dangerous
languages and inadequate tools, trying to connect to heaps of broken or
obsolete protocols, implemented equally insufficiently, running on
unpredictable hardware -- we are all more than used to brokenness.
--Felix Winkelmann
Oops.
Ged Byrne scripsit:
> It also makes the component code easier to understand. Take for example
> your components that discard most of the packets to focus on what you
> are interested in. It is made explicit that packets are being dropped
> and why.
> Many C++ developers have adopted a similar approach using RAII (Resource
> Acquisition Is Initialisation) as an effective alternative to garbage
> collection:
It isn't, really. RAII only works if the lifetime of the resource is
stack-oriented (that is, it comes into existence when a particular function
is invoked, and is disposed of when the function is about to return).
That works well *when it works*, but often produces convoluted results
or fails altogether, which is why C++ has a variety of (non-built-in)
pointer types that do reference-counting garbage collection.
> The implementation is different but the principle is the same. For any
> resource (memory, file, connection, etc) there is a handle. Responsibility
> for that handle is always held in just one place, so that it is possible to
> ensure that it is released.
Quite so. So now instead of "Who frees memory?" you have to ask "Who drops
the packet?" Not much of a gain.
> Bartosz describes resource transfer, the transfer of a handle between
> scopes, as part of his resource management:
> http://www.relisoft.com/resource/transfer.html.
I'll read up on this.
--
John Cowan <co...@ccil.org> http://www.ccil.org/~cowan
One time I called in to the central system and started working on a big
thick 'sed' and 'awk' heavy duty data bashing script. One of the geologists
came by, looked over my shoulder and said 'Oh, that happens to me too.
Try hanging up and phoning in again.' --Beverly Erlebacher
In FBP Information Packets (IP) are managed by establishing a chain of
responsibility. At any one time there is a single point of responsibility
for an IP, either within a processor or a queue.

A queue can receive and send IPs. Every IP received (R) must be sent (S)
so that (R = S).

A processor can create and destroy IPs in addition to sending and
receiving them. Every IP received (R) or created (C) must be either sent
(S) or destroyed (D) so that (C + R = S + D).
> Would you agree that this is a fair summary of the rules presented by Paul
> in his book?
I would.
> Is the above a fundamental principle of FBP or just a feature of some
> implementations?
It's fundamental to the *analysis* of FBP programs.
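As an aid to that kind of analysis, the rule C + R = S + D can be checked mechanically. The sketch below is a hypothetical bookkeeping class (`Ledger` is an invented name, not part of JavaFBP or any other implementation) that counts the four kinds of event for one processor and reports how many IPs it still owns.

```python
from collections import Counter


class Ledger:
    """Hypothetical per-processor bookkeeping for the rule C + R = S + D."""

    def __init__(self):
        self.counts = Counter()

    def record(self, event):
        if event not in ("created", "received", "sent", "destroyed"):
            raise ValueError(f"unknown event: {event}")
        self.counts[event] += 1

    def outstanding(self):
        """IPs the processor still owns: (C + R) - (S + D)."""
        c = self.counts
        return (c["created"] + c["received"]) - (c["sent"] + c["destroyed"])

    def balanced(self):
        return self.outstanding() == 0


# COUNT as described in the thread: receive 100 records, destroy all of them,
# then create one count IP and send it.
count = Ledger()
for _ in range(100):
    count.record("received")
    count.record("destroyed")
count.record("created")
count.record("sent")

# A component that receives an IP and neither sends nor destroys it.
leaky = Ledger()
leaky.record("received")
```

Here `count.balanced()` holds, while `leaky.outstanding()` is 1, which is the case explicit dropping is meant to expose.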
> Would I be fair to summarise your opinion as being that the above is simply
> an implementation approach taken by some FBP that is not significant and of
> limited usefulness?
No. What I think is that explicit dropping is not a requirement on all
implementations, and implicit dropping backed by garbage collection (whether
done by reference counting, mark-sweep, stop-and-copy, or "drop everything
when the process terminates") is satisfactory and indeed preferable, as it
minimizes the amount of bookkeeping code that needs to be written.
--
Is not a patron, my Lord [Chesterfield], John Cowan
one who looks with unconcern on a man http://www.ccil.org/~cowan
struggling for life in the water, and when co...@ccil.org
he has reached ground encumbers him with help?
--Samuel Johnson
Every IP received (R) or created (C) must be either sent (S) or destroyed (D) so that (C + R = S + D).
> I think I understand where John is coming
> from, but one of the problems that we had in the early days was of
> storage not being freed up, and gradually accumulating until storage was
> full :-)
Quite so. That's why John McCarthy invented garbage collection somewhere
around 1959 for the first implementation of Lisp on the IBM 704. It's not
clear exactly when it was actually added to the implementation, somewhere
between then and 1962. McCarthy credits Daniel J. Edwards with writing
the code, which implements a classical mark-and-sweep algorithm.
> At base, this seems to me to be a philosophical problem - OO GC seems to
> me to be like the old saw: "if a tree falls in the forest,...".
Just so. The collector gains a conspectus of the entire program's state,
constructs a proof that certain parts of that state "cannot possibly
matter to any future computation" (in the words of the Scheme standard),
and frees the space they occupy for reuse.
As you can see above, GC is not tied to OO and is in fact older; the
first recognizable OO system was Simula 67.
> In these FBP implementations, we also provided an IP /tree/ mechanism,
> to allow fairly complex structures of IPs to travel through the network
> using single sends and receives.
I remember implementing that in JavaFBP. Has it been removed from the
current implementations?
> Last point: in today's world, we could have all owned IPs chained into a
> linked list - or, in fact, any collection that supports removal of an IP
> from an arbitrary point in the list. This would then allow the
> programmer to locate undisposed-of IPs much more quickly than having to
> pore through the code... Comments?
A WeakHashMap would work better, I think. (It should be a WeakSet,
but Java doesn't have those.)
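To illustrate the weak-set idea: Python happens to ship one as `weakref.WeakSet`, so a rough sketch is possible there. The `IP` class and the `owned` registry are hypothetical names for this example. The registry lets you list the IPs a process still holds without itself keeping them alive, which is the property a WeakHashMap would give in Java.

```python
import gc
import weakref


class IP:
    """Hypothetical information packet; a plain class so it can be weakly referenced."""

    def __init__(self, payload):
        self.payload = payload


owned = weakref.WeakSet()    # tracks IPs without keeping them alive

kept = IP("customer 1")
lost = IP("customer 2")
owned.add(kept)
owned.add(lost)

del lost        # last strong reference gone, as if the component forgot it
gc.collect()    # make collection explicit; CPython usually frees it at once

# Whatever remains in the registry is still referenced somewhere.
still_owned = sorted(ip.payload for ip in owned)
```

An IP that is still listed when the process ends was neither sent nor dropped but is still reachable, which narrows down where to look in the code.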
--
John Cowan co...@ccil.org http://ccil.org/~cowan
If I have not seen as far as others, it is because giants were standing
on my shoulders.
--Hal Abelson