What is the right granularity level for an FBP component?
This question bothers me because a wrong understanding of the correct granularity creates so much complexity that, after a while, you find yourself asking: why even bother with FBP?
I believe most who have tried "fine-grained" FBP in practice with a non-trivial project have encountered at least once, if not many times, that the running graph is so complicated that maintaining a mental model of it becomes a problem in and of itself.
My mental exploration begins with this question: Why did FBP even appeal to me in the first place? On my first attempt several years back it was mostly that I could visualize my code with a clean separation of concerns between components.
After my “sabbatical” and having re-read JPM’s book, I realized that the real pot of gold at the end of the rainbow was actually to escape the von Neumann model. Yes, visualization is important; modularity is important; reusability is important; but there are well-researched (and implemented) solutions to these problems in the von Neumann world.
I believe that FBP (or at least frameworks that are FBP-like) is gaining momentum because we’re finally catching up with the reality that the physical world cannot be captured in a shared-memory model, nor can our *mental* world handle one gargantuan block of code that shares memory. The need is there, and there’s been a plethora of approaches to managing this "post-von Neumann" world, such as microservices and software agents.
For the sake of a comprehensive analysis, here is my own mental model of different software layers in relative abstraction, from the abstract to the concrete:
- Software system/application, which serves one or more business objectives
- Software agent[1], a logically independent entity that handles a single aspect of a business problem
- Microservice, an independently running service that tackles a single technical problem
- Computer program, which solves an engineering problem with clearly defined domain, parameters, and results (e.g. take two streams of documents and merge them into one, but not present two lists of documents and allow the user to interactively select which to keep as documents arrive)
- Function, the basic unit of logical computation
So I asked myself, where does (or should) FBP break the spectrum?
From what I’ve understood from reading JPM’s book and my limited experience with a few FBP-like implementations, I call this layer the “von Neumann limit”, above which shared memory does not exist, but below which there is the illusion that the entire world is single-threaded with a single memory space, which is where the von Neumann model excels.
In other words, we have two rules (from here on called the “Existential Rules”) to determine whether a system implements FBP:
- There may be no asynchronous operations within a component; e.g. putting a process to sleep or sending an HTTP request must be blocking operations.
- There must be no state shared between components. This may seem obvious, but what this constraint implies is that there cannot be any free variable in a component. An example violation would be two components writing to the same file by name on the filesystem when their processes happen to run on the same machine. (A sketch follows this list.)
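To make the two rules concrete, here is a minimal sketch in JavaScript (a hypothetical port API of my own invention, not any particular framework): the component holds no free variables and does nothing asynchronous; everything it touches arrives on an input port and leaves on an output port.

    // Toy in-memory port so the sketch actually runs (hypothetical API).
    function makePort(items) {
      return {
        receive: function () { return items.length ? items.shift() : null; },
        send: function (ip) { items.push(ip); }
      };
    }

    // The component: synchronous, no free variables, no shared state.
    function upperCaser(inport, outport) {
      var ip;
      while ((ip = inport.receive()) !== null) {  // blocking-style receive
        outport.send(ip.toUpperCase());           // synchronous send
      }
    }

    var results = [];
    upperCaser(makePort(['hello', 'world']),
               { send: function (ip) { results.push(ip); } });
    console.log(results); // [ 'HELLO', 'WORLD' ]

    // A module-level variable read by two components would violate rule 2:
    // var sharedCache = {};  // <-- forbidden between components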
The two constraints are by no means new; they are mentioned in JPM’s book. I highlight them because together they are the single most important concept in FBP: they define what FBP is by placing it along the software abstraction spectrum, at least under this model.
To demonstrate why this limit is significant, imagine that we set the breaking point at a more abstract level (i.e. making a component “coarser”): each component, like a microservice or a full-fledged service, may now take and send asynchronous operations at will. This results in unpredictable behaviors for obvious reasons. Of course, there are strategies for dealing with this problem in “conventional” programming paradigms, but a more robust alternative is exactly what FBP is offering. Whereas conventional solutions require a number of runtime checks, validations, and logs, FBP offers the guarantee at the framework level!
Conversely, let’s push the breaking point down to a more concrete level (i.e. making a component “finer”): each component, like a computer program or a function, may now access some global state. This is an obvious problem to a seasoned programmer, so no further elaboration is needed on this topic.
What is interesting is the corollary: what if there were no shared state whatsoever? For that to happen, a lot of in-memory copying is needed, and it becomes more expensive as one moves down toward finer components.
One may also suggest that a sufficiently sophisticated compiler can effectively remove a lot of run-time copying (a la Haskell). This approach requires purity at the function level and the ability to infer types across all the components, which is impractical given that FBP concerns itself not with the implementation details of individual components, opting for flexibility and modularity over performance.

In short, we have three rules (from here on called the “Granularity Rules”) when it comes to determining the (approximately) right granularity for an FBP component:
- Start with the von Neumann model within a component, i.e. just start programming the way one always has (for most of us, at least), with the caveat that the code must be synchronous.
- As soon as one reaches the point where she needs asynchronous operations within a component, break it apart into multiple components (see the sketch after this list).
- Combine components into a graph; when performance becomes unacceptable (i.e. too much in-memory copying), merge components, without violating the first Existential Rule. Note that the second Existential Rule must (rightfully) be violated in this step, because the point is precisely to introduce shared state in order to remove the copying.
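To illustrate rule 2 with a concrete (hypothetical) case: a component that wanted to fetch a URL and then extract the page title would be split at the asynchronous boundary, so the fetch becomes its own component and the extraction stays purely synchronous. A minimal sketch, assuming the same toy blocking-port API as above:

    // Intended graph: [FetchUrl] --body--> [ExtractTitle] --title--> [Display]
    // FetchUrl owns the asynchronous HTTP edge; ExtractTitle stays von Neumann.
    function extractTitle(inport, outport) {
      var body;
      while ((body = inport.receive()) !== null) {   // blocking receive
        var m = body.match(/<title>([^<]*)<\/title>/i);
        if (m) outport.send(m[1]);                   // synchronous send
      }
    }

    // Toy harness standing in for the framework:
    var bodies = ['<html><title>FBP</title></html>'];
    var titles = [];
    extractTitle(
      { receive: function () { return bodies.length ? bodies.shift() : null; } },
      { send: function (ip) { titles.push(ip); } }
    );
    console.log(titles); // [ 'FBP' ]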
If you have read this far, you must at least understand the frustration that I have. Please let me know what you think, or better yet, let me know if something is not sound! Thank you!
nice, good to see this kind of explanation.
sounds a lot like actors.
My experience with noflo was that the way graph execution works makes some things much more complicated compared to classical fbp. It is not apparent at first but it makes complexity grow.
From what I understand, common dataflow systems are very hard to use because they limit computations to a hierarchy of functions that get evaluated upon some change in inputs; programming this way is much harder than conventional structured/OOP programming.
FBP was inspired by this kind of machine (the mechanical card sorter mentioned below), and this is where your answer may reside. We really like to make small things that do a single thing well - this is quite fashionable today in the world of programming - but each time new information is created by a function, all the meaning it has attached is lost. It needs to be carried over to the next function in some way to complete the program correctly: it might be carried by the memory of the programmer, by the types involved, or by some other technique. In that slice of time where the new information is being built, ready to be returned or sent, we may want to produce more information related to the processed data, and FBP acknowledges this.
In the case of the mechanical card sorter, it produced not only a sorted stack, but additional information like card count, faults, etc. The machine "understands" the domain of what is being done, it is not just a responsibility or a task, it encapsulates a job.
Now, I might be writing too much, but try to follow this idea: the granularity is not dictated by the need for asynchronicity, but by the need for information to perform a job. Is a job something like "function(x){ return x * 2 }"? To me it seems like this is just evaluating a formula, so the job could be "evaluate formula". Or if we go with the machine analogy, "Formula Evaluator". Then we could think about the kind of information it can produce while doing its job, like the greatest result produced, the smallest, divisions by zero, etc. All of this will not be useful for most programs, but it could be available as an output port, or configurable via IIPs.
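A minimal sketch of that "Formula Evaluator" machine in JavaScript (hypothetical port names, just an illustration): the main job is the formula, but side information gathered while doing the job is offered on an optional second port.

    function formulaEvaluator(inport, resultPort, maxPort) {
      var x, max = -Infinity;
      while ((x = inport.receive()) !== null) {
        var y = x * 2;            // the formula itself
        if (y > max) max = y;     // side information gathered along the way
        resultPort.send(y);
      }
      maxPort.send(max);          // reported once the job is done
    }

    // Toy run:
    var xs = [3, 7, 5], results = [], maxes = [];
    formulaEvaluator(
      { receive: function () { return xs.length ? xs.shift() : null; } },
      { send: function (ip) { results.push(ip); } },
      { send: function (ip) { maxes.push(ip); } }
    );
    console.log(results, maxes); // [ 6, 14, 10 ] [ 14 ]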
Another important idea is Information Packets. Originally I understood them as just a number or a string, or any basic type, but operating on that scale everywhere is not very useful, as you might have seen with other dataflow systems. The next step is records, structs, POJOs, dictionaries, hashmaps, or any name you like; the idea is similar. Many languages have immutable versions of them, or you can fake them, so even if you are sharing a reference to a certain record it will be read-only, or a copy is made to create a new version, so the performance cost is paid only if writes are made. Again, we are working on traditional PC hardware; if you used some many-CPU system with dedicated RAM for each CPU, the story would be different.
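In ES5/ES6 terms, the copy-on-write idea might look like this (just a sketch):

    'use strict';
    // Share the reference freely; pay for a copy only on write.
    var ip = Object.freeze({ name: 'order-42', total: 100 });

    // A downstream component "modifies" by creating a new frozen version:
    var updated = Object.freeze(Object.assign({}, ip, { total: 120 }));

    console.log(ip.total, updated.total); // 100 120
    // ip.total = 0; // would throw in strict mode: the record is read-only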
> My experience with noflo was that the way graph execution works makes some things much more complicated compared to classical fbp. It is not apparent at first but it makes complexity grow.
Agreed. My attempt to devise a model is to understand what that something is that makes an FBP-like implementation the wrong approach, so a newcomer to FBP can say, "oh, it doesn't do X and Y, and this is why it wouldn't work."
> From what I understand, common dataflow systems are very hard to use because they limit computations to a hierarchy of functions that get evaluated upon some change in inputs; programming this way is much harder than conventional structured/OOP programming.

Do you have some examples of what these common dataflow systems are?
What do you mean by "a hierarchy of functions"? "Get[ting] evaluated upon some change in inputs" sounds like event-based programming. Am I getting the right idea?
> FBP was inspired by this kind of machine (the mechanical card sorter mentioned below), and this is where your answer may reside. We really like to make small things that do a single thing well - this is quite fashionable today in the world of programming - but each time new information is created by a function, all the meaning it has attached is lost. It needs to be carried over to the next function in some way to complete the program correctly: it might be carried by the memory of the programmer, by the types involved, or by some other technique. In that slice of time where the new information is being built, ready to be returned or sent, we may want to produce more information related to the processed data, and FBP acknowledges this.
>
> In the case of the mechanical card sorter, it produced not only a sorted stack, but additional information like card count, faults, etc. The machine "understands" the domain of what is being done, it is not just a responsibility or a task, it encapsulates a job.
Thanks for the reference! I have certainly not been fortunate enough to see one in action... On this specific problem, though, I believe this is why monads are fairly popular in FP circles: they allow encapsulation of some contextual information while keeping a certain degree of purity. What does FBP bring to the table on this front vs frameworks/languages that provide context-sensitive programming that maintains purity?
> Now, I might be writing too much, but try to follow this idea: the granularity is not dictated by the need for asynchronicity, but by the need for information to perform a job. Is a job something like "function(x){ return x * 2 }"? To me it seems like this is just evaluating a formula, so the job could be "evaluate formula". Or if we go with the machine analogy, "Formula Evaluator". Then we could think about the kind of information it can produce while doing its job, like the greatest result produced, the smallest, divisions by zero, etc. All of this will not be useful for most programs, but it could be available as an output port, or configurable via IIPs.

I see where you are coming from. My only objection would be: that applies to any computation at any granularity level. The example you raised is obviously at the finest level, which still satisfies the constraint "something that needs some information in the relevant domain to perform a job". In the end, that's what computation is: it takes some information that it needs (i.e. defining the domain) and produces some output (i.e. performing the job). Maybe I have misunderstood your argument?
> Another important idea is Information Packets. Originally I understood them as just a number or a string, or any basic type, but operating on that scale everywhere is not very useful, as you might have seen with other dataflow systems. The next step is records, structs, POJOs, dictionaries, hashmaps, or any name you like; the idea is similar. Many languages have immutable versions of them, or you can fake them, so even if you are sharing a reference to a certain record it will be read-only, or a copy is made to create a new version, so the performance cost is paid only if writes are made. Again, we are working on traditional PC hardware; if you used some many-CPU system with dedicated RAM for each CPU, the story would be different.

Yea, I agree. IP is a very powerful concept. The only reason why I put a constraint of performance in granularity evaluation is that there is really no theoretical basis for why a component cannot be just a simple addition or if-else function; I would be delighted to be convinced otherwise, though. In fact, your example above is a valid instance of this.

My intuition is that those of us who have tried "fine-grained" FBP know from experience that "fine-grained" FBP doesn't work, and my exploration is to understand why. Of course, when starting out in solving a new problem, one doesn't know where to begin when it comes to granularity. What I want is a model that can inform me in which direction I should move as I'm exploring and developing software for a particular domain, without accidentally "breaking" FBP just because it is an easy way out.
On Friday, 28 August 2015 17:40:37 UTC-4, Kenneth Kan wrote:
> What is the right granularity level for an FBP component?

Whatever level works best for you. Seriously. FBP is not an appliance. Some assembly required.
> This question bothers me because a wrong understanding of the correct granularity creates so much complexity that, after a while, you find yourself asking: why even bother with FBP?

Because it lets you change things around quickly, without fear of breaking something.
> I believe most who have tried "fine-grained" FBP in practice with a non-trivial project have encountered at least once, if not many times, that the running graph is so complicated that maintaining a mental model of it becomes a problem in and of itself.

If the model is getting too complex, you should try breaking it up into subnets.
> My mental exploration begins with this question: Why did FBP even appeal to me in the first place? On my first attempt several years back it was mostly that I could visualize my code with a clean separation of concerns between components.

FBP is only one kind of programming technique that does this.

> After my “sabbatical” and having re-read JPM’s book, I realized that the real pot of gold at the end of the rainbow was actually to escape the von Neumann model. Yes, visualization is important; modularity is important; reusability is important; but there are well-researched (and implemented) solutions to these problems in the von Neumann world.

FBP also gives you some defence against Amdahl's world.
> I believe that FBP (or at least frameworks that are FBP-like) is gaining momentum because we’re finally catching up with the reality that the physical world cannot be captured in a shared-memory model, nor can our *mental* world handle one gargantuan block of code that shares memory. The need is there, and there’s been a plethora of approaches to managing this "post-von Neumann" world, such as microservices and software agents.

You can believe what you want, but I've seen nothing in the market to reflect this. AFAIK, most programmers just get by with what they have, and they will accept any lumps they get as long as they don't have to learn anything new.
> For the sake of a comprehensive analysis, here is my own mental model of different software layers in relative abstraction, from the abstract to the concrete:
> - Software system/application, which serves one or more business objectives
> - Software agent[1], a logically independent entity that handles a single aspect of a business problem
> - Microservice, an independently running service that tackles a single technical problem
> - Computer program, which solves an engineering problem with clearly defined domain, parameters, and results (e.g. take two streams of documents and merge them into one, but not present two lists of documents and allow the user to interactively select which to keep as documents arrive)
> - Function, the basic unit of logical computation
Unfortunately, these aren't really a hierarchy.
> So I asked myself, where does (or should) FBP break the spectrum?

FBP is somewhere below Microservices, and above functions. Computer programs are some other thing that doesn't belong in the hierarchy. Agents don't quite fit in the hierarchy either.
> From what I’ve understood from reading JPM’s book and my limited experience with a few FBP-like implementations, I call this layer the “von Neumann limit”, above which shared memory does not exist, but below which there is the illusion that the entire world is single-threaded with a single memory space, which is where the von Neumann model excels.

I think of it this way: inside a component, we use stack-oriented programming (von Neumann); outside, we let the framework do its thing.
> In other words, we have two rules (from here on called the “Existential Rules”) to determine whether a system implements FBP:
> - There may be no asynchronous operations within a component; e.g. putting a process to sleep or sending an HTTP request must be blocking operations.
> - There must be no state shared between components. This may seem obvious, but what this constraint implies is that there cannot be any free variable in a component. An example violation would be two components writing to the same file by name on the filesystem when their processes happen to run on the same machine.
> The two constraints are by no means new; they are mentioned in JPM’s book. I highlight them because together they are the single most important concept in FBP: they define what FBP is by placing it along the software abstraction spectrum, at least under this model.

I'm not sure I follow - I think you mean rules for an FBP component, not a system - I will defer to Paul on this.
> To demonstrate why this limit is significant, imagine that we set the breaking point at a more abstract level (i.e. making a component “coarser”): each component, like a microservice or a full-fledged service, may now take and send asynchronous operations at will. This results in unpredictable behaviors for obvious reasons. Of course, there are strategies for dealing with this problem in “conventional” programming paradigms, but a more robust alternative is exactly what FBP is offering. Whereas conventional solutions require a number of runtime checks, validations, and logs, FBP offers the guarantee at the framework level!

You are redefining what it means to be a component here. Meaning you are not really talking about FBP at this point.
> Conversely, let’s push the breaking point down to a more concrete level (i.e. making a component “finer”): each component, like a computer program or a function, may now access some global state. This is an obvious problem to a seasoned programmer, so no further elaboration is needed on this topic.

I would suggest you elaborate - I missed your point here. Components have access to shared state - file system, database, etc.
> What is interesting is the corollary: what if there were no shared state whatsoever? For that to happen, a lot of in-memory copying is needed, and it becomes more expensive as one moves down toward finer components.

I'm not sure I follow this either. How do you lock a file for writing if you don't have shared state?
> One may also suggest that a sufficiently sophisticated compiler can effectively remove a lot of run-time copying (a la Haskell). This approach requires purity at the function level and the ability to infer types across all the components, which is impractical given that FBP concerns itself not with the implementation details of individual components, opting for flexibility and modularity over performance.
>
> In short, we have three rules (from here on called the “Granularity Rules”) when it comes to determining the (approximately) right granularity for an FBP component:
> - Start with the von Neumann model within a component, i.e. just start programming the way one always has (for most of us, at least), with the caveat that the code must be synchronous.
> - As soon as one reaches the point where she needs asynchronous operations within a component, break it apart into multiple components.
> - Combine components into a graph; when performance becomes unacceptable (i.e. too much in-memory copying), merge components, without violating the first Existential Rule. Note that the second Existential Rule must (rightfully) be violated in this step, because the point is precisely to introduce shared state in order to remove the copying.
I'm not sure I would agree with 2 or 3. AFAIK nothing in FBP requires the problem it is solving to have preexisting asynchrony. And neither does FBP require copying.
> If you have read this far, you must at least understand the frustration that I have. Please let me know what you think, or better yet, let me know if something is not sound! Thank you!

I think you might have some misunderstanding about FBP and how it fits into the big picture.
Although some people here have extended or are extending FBP into the higher orders, in general, from an architectural point of view, FBP is similar to an orchestration script, such as BPEL or even Ant. FBP doesn't have a coordination language per se (or at least, not yet), but it does have fbp files that describe a network (very similar to a markup language). This is what allows easy recombination, and even graphical programming.
FBP is also a software engine of sorts, and has to run in a software process, so an FBP network is below a microservice (actually a microservice could be implemented as a set of FBP networks).
An FBP job normally runs like a batch file, from start to completion. But I'm fairly certain you can also leave it running, so that it will continue to accept data on its input ports. (Someone please confirm.) Otherwise, the work moves between components, as in an assembly line, in a manner very similar to streaming. (It is my understanding that FBP is very compatible with data streaming approaches such as audio streaming.) If you want to understand FBP, I recommend you try a craftsman's approach - try to build something small but real, then go for the bigger things. Maybe try one of the other orchestration frameworks, so you can compare. Maybe take a look at a real assembly line. Hopefully, you should be able to get a feel for how FBP fits in somewhere along the way.
> FBP is somewhere below Microservices, and above functions.
I agree. Perhaps FBP and its components should be renamed "Nanoservice
Architecture" and "nanoservices" for marketing purposes. This would also
suggest sexy connections with nanomachines and nanoscale culture generally.
> I think of it this way: inside a component, we use stack-oriented
> programming (von Neumann) outside, we let the framework do its thing.
Pedantic historical note: Except for special cases like program loading,
modern architectures are Harvard rather than von Neumann. I haven't
written any self-modifying programs for many decades.
> FBP doesn't have a coordination language per se (or at least, not yet), but
> it does have fbp files that describe a network (very similar to a markup
> language).
I sketched such a coordination language some time back. It's essentially
just a functional programming language, though with the ability to have
multiple values returned from a function (as distinct from returning a tuple).
It would be easy to make it a subset of Python or Lisp.
A key thing that wasn't mentioned is Groups or Brackets. Dealing with substreams has larger implications.
>> From what I understand, common dataflow systems are very hard to use because they limit computations to a hierarchy of functions that get evaluated upon some change in inputs; programming this way is much harder than conventional structured/OOP programming.

> Do you have some examples of what these common dataflow systems are? What do you mean by "a hierarchy of functions"? "Get[ting] evaluated upon some change in inputs" sounds like event-based programming. Am I getting the right idea?

Yes, systems like shader editors, Excel, procedural geometry generators: they are built around an algorithm that re-evaluates values upon changes in dependencies. So if we had a node that adds two numbers, and a node that squares the result of the first, then if we change one of the addends, the sum is re-evaluated, and then the square is re-evaluated. I bet that you have used Excel, and it works for math formulas, but the paradigm breaks very fast if you need anything more complicated. Hence macros and plugins were created. This is what people think about when they hear "dataflow".
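In miniature, the add-then-square example might look like this (a toy sketch of that re-evaluation style, not any real system):

    function makeGraph() {
      var a = 1, b = 2;
      function evaluate() {               // re-runs on every input change
        var sum = a + b;
        var square = sum * sum;
        console.log('sum =', sum, 'square =', square);
      }
      evaluate();                          // sum = 3 square = 9
      return { setA: function (v) { a = v; evaluate(); } };
    }
    makeGraph().setA(3);                   // sum = 5 square = 25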
FBP components have a similar role, but they are much easier to program: you can explain how to build a component for an FBP framework in minutes; try to do that with monad transformers and arrows.
But I digress. There's a tendency to make everything pure and minimalistic, but as we know from experience, the world isn't pure and requirements are not minimalistic; things change, and it is much easier to alter an FBP graph or some component code than it is to keep the whole system pure. I really like the idea of writing components in the language that is best for the "job" the component has, so if the best is Haskell, then use that; if it is APL, then use that. In other words, the advantage is that the "job" can be approached alone, without the need to comply with the type requirements of the system as a whole. (I would really like to see an FBP framework for F#, and if I have time I might try to make one.)
It's hard to discuss this kind of topic, as the terminology we use has such overloaded meanings, but I'll try again. What I mean by this is that there is a human factor in making components: a human can predict that some events during the job might be useful and some might not, and they might be available for sending before the whole job is done, or maybe the job never stops, like a server, and data is produced at intervals. Computing things is easy once we have the algorithm, but that's not the problem we are trying to solve here. We don't want to do much algorithmic work with the graph; what we want to do is coordination.
Say that you have a component that reads files; it accepts a stream of file names. As file names come, they are read from disk. The usual way of dealing with this kind of task in a conventional system is to read files one by one, check exceptions, allocate memory if needed, etc. In FBP the component programmer can provide several outputs for this kind of task, plus configuration. For example: send file contents as they are read, send the whole thing when done, send an ok or error packet after each file read with the associated file data, output file sizes, only check whether files exist, etc. It works as a machine that deals with the job, with some configuration; it isn't "just a function that returns more than one value". The output of "ok/error" could be used to feed a progress bar, a timeout process watching the file reader, or the next stage of processing. The reality of it is that the component might be a graph made out of simpler components, but that's an implementation detail.
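A sketch of that file-reading machine (hypothetical ports; Node's synchronous fs API, so the component itself stays synchronous):

    var fs = require('fs');

    function readFiles(namesPort, contentsPort, statusPort) {
      var name;
      while ((name = namesPort.receive()) !== null) {
        try {
          contentsPort.send(fs.readFileSync(name, 'utf8')); // whole file at once
          statusPort.send({ file: name, ok: true });
        } catch (err) {
          statusPort.send({ file: name, ok: false, error: err.message });
        }
      }
    }
    // The ok/error port could feed a progress bar, a timeout watcher,
    // or the next stage of processing, exactly as described above.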
I think that Paul would agree with me on this that solving a problem is more important than theoretical purity, but I get where you are trying to go with this.
On Friday, 28 August 2015 17:40:37 UTC-4, Kenneth Kan wrote:
>> What is the right granularity level for an FBP component?

> Whatever level works best for you. Seriously. FBP is not an appliance. Some assembly required.

Yes, I understand that FBP is not an appliance. What I try to avoid, though, is intuition-based development. The more I learn from experience with software, the more I mistrust my intuition. The reason why frameworks in general and FBP specifically exist in the first place is so that we don't program whatever we feel like. Sufficient empirical evidence in our industry has already shown that that is not the ideal approach.
More importantly, any non-trivial software project is not the work of one single person. What works best for me may not work best for a lot of people.
The purpose of a structured mental framework is to provide that common understanding. Wouldn't you say?
>> This question bothers me because a wrong understanding of the correct granularity creates so much complexity that, after a while, you find yourself asking: why even bother with FBP?

> Because it lets you change things around quickly, without fear of breaking something.

So do OOP, FP, AOP, and basically every paradigm/framework/stack; that is what they all promise. I want to understand why; otherwise, the fuzzy definition of what "flow-based" means is exactly why there are many FBP-like frameworks that end up not delivering that promise of productivity.
>> I believe most who have tried "fine-grained" FBP in practice with a non-trivial project have encountered at least once, if not many times, that the running graph is so complicated that maintaining a mental model of it becomes a problem in and of itself.

> If the model is getting too complex, you should try breaking it up into subnets.

That is a common feature of FBP-like implementations. Apparently that by itself is not sufficient, if the implementation does not follow FBP principles (and I'm trying to figure out what those are and why they are important).
>> My mental exploration begins with this question: Why did FBP even appeal to me in the first place? On my first attempt several years back it was mostly that I could visualize my code with a clean separation of concerns between components.

> FBP is only one kind of programming technique that does this.

>> After my “sabbatical” and having re-read JPM’s book, I realized that the real pot of gold at the end of the rainbow was actually to escape the von Neumann model. Yes, visualization is important; modularity is important; reusability is important; but there are well-researched (and implemented) solutions to these problems in the von Neumann world.

> FBP also gives you some defence against Amdahl's world.

Agreed on both points. We're talking about the same thing: those qualities are not what makes FBP shine. My quest is to find what FBP really brings to the table that other techniques don't.
>> From what I’ve understood from reading JPM’s book and my limited experience with a few FBP-like implementations, I call this layer the “von Neumann limit”, above which shared memory does not exist, but below which there is the illusion that the entire world is single-threaded with a single memory space, which is where the von Neumann model excels.

> I think of it this way: inside a component, we use stack-oriented programming (von Neumann); outside, we let the framework do its thing.

What do you mean by "let the framework do its thing"?
Ultimately, I want to have some rules of thumb (note that I'm not calling these "Laws") for determining what makes a component, since components are the building blocks of FBP. In the same spirit, a function, as in functional programming, is defined as a relation between a domain and a codomain, without any side effect. The definition of the foundational entity of any paradigm should be clear to a newcomer; if the axioms are wrong, any discussion on top of that would be meaningless.
>> Conversely, let’s push the breaking point down to a more concrete level (i.e. making a component “finer”): each component, like a computer program or a function, may now access some global state. This is an obvious problem to a seasoned programmer, so no further elaboration is needed on this topic.

> I would suggest you elaborate - I missed your point here. Components have access to shared state - file system, database, etc.

>> What is interesting is the corollary: what if there were no shared state whatsoever? For that to happen, a lot of in-memory copying is needed, and it becomes more expensive as one moves down toward finer components.

> I'm not sure I follow this either. How do you lock a file for writing if you don't have shared state?
Sorry for being fuzzy here. Before I elaborate on this point, let me define what I mean by "shared state". It's a given that any useful system ultimately needs to change state; otherwise it's just a useless box that heats up and does nothing. By shared state I mean an effect that can be accidentally introduced into another part of a software system in a different layer of abstraction.

To illustrate this point, imagine this FBP hierarchy (now this is definitely a hierarchy, because each layer depends on the one below):

- User of the software system/application/program
- Graph
- Component
- Instruction

We can at least agree that all state is "shared" at the highest level. To the user, it's just one giant "thing" that performs some desired task(s). Let's take the database example. A component can of course connect to the database, but to the component it is the *only* component connecting to the database. Of course, what good is it if a database is only "shared" with one? This is the illusion that I mentioned. A component's behavior does not depend on other parts of the system changing the state of what it perceives as its "dedicated" resource. That concern lies one level up. To a graph, the intent is clear: you have these components accessing the same database so that they can work together to produce some useful result. To a component, however, FBP should be able to give this guarantee/illusion: "You are the only one in the world. Don't worry about anything else." It's this separation of concerns that drove me to believe that this "world-splitting" constraint is so fundamental to defining FBP, or any good abstraction.
Paul can confirm on this one: I also believe that this is why externalizing relationships between components (in contrast to the actor model) is so important in FBP. It gives all components the guarantee that they don't need to know anything but their own domains.
>> One may also suggest that a sufficiently sophisticated compiler can effectively remove a lot of run-time copying (a la Haskell). This approach requires purity at the function level and the ability to infer types across all the components, which is impractical given that FBP concerns itself not with the implementation details of individual components, opting for flexibility and modularity over performance.
>>
>> In short, we have three rules (from here on called the “Granularity Rules”) when it comes to determining the (approximately) right granularity for an FBP component:
>> - Start with the von Neumann model within a component, i.e. just start programming the way one always has (for most of us, at least), with the caveat that the code must be synchronous.
>> - As soon as one reaches the point where she needs asynchronous operations within a component, break it apart into multiple components.
>> - Combine components into a graph; when performance becomes unacceptable (i.e. too much in-memory copying), merge components, without violating the first Existential Rule. Note that the second Existential Rule must (rightfully) be violated in this step, because the point is precisely to introduce shared state in order to remove the copying.
> I'm not sure I would agree with 2 or 3. AFAIK nothing in FBP requires the problem it is solving to have preexisting asynchrony. And neither does FBP require copying.

No, FBP doesn't require copying. The point was that copying is inevitable given how a proper FBP implementation should work: because FBP requires isolation of shared state (as defined above), copying is unavoidable, as the system wouldn't produce anything useful otherwise.
The Granularity Rules do not define FBP. They merely serve as a guideline for reaching the "right" granularity for a particular component. One may certainly ignore #2 and #3 if performance isn't critical to the problem that she's trying to solve. In fact, most problems are well served by the von Neumann model. That's why the model is so successful. What FBP brings to the table is a model that works for problems that lie *outside* of the von Neumann model.

On the comment on pre-existing asynchrony, that's the area that I want to explore. What do you think is the value of FBP if there isn't an asynchrony issue in a particular problem? My gut reaction to such a problem would be to just use a pure FP language, as it is well-suited for that domain.
> Although some people here have extended or are extending FBP into the higher orders, in general, from an architectural point of view, FBP is similar to an orchestration script, such as BPEL or even Ant. FBP doesn't have a coordination language per se (or at least, not yet), but it does have fbp files that describe a network (very similar to a markup language). This is what allows easy recombination, and even graphical programming.

That's what I want to know myself: is FBP only just BPEL/Ant/etc. with a graphical interface?
> FBP is also a software engine of sorts, and has to run in a software process, so an FBP network is below a microservice (actually a microservice could be implemented as a set of FBP networks).

Completely agreed.

> An FBP job normally runs like a batch file, from start to completion. But I'm fairly certain you can also leave it running, so that it will continue to accept data on its input ports. (Someone please confirm.) Otherwise, the work moves between components, as in an assembly line, in a manner very similar to streaming. (It is my understanding that FBP is very compatible with data streaming approaches such as audio streaming.) If you want to understand FBP, I recommend you try a craftsman's approach - try to build something small but real, then go for the bigger things. Maybe try one of the other orchestration frameworks, so you can compare. Maybe take a look at a real assembly line. Hopefully, you should be able to get a feel for how FBP fits in somewhere along the way.

Perhaps I have not been fortunate enough to use a real FBP system, because my previous experiences have not been too satisfactory, as a user at least. Do you have some recommendations on this front?
Can I just quote from the end of the reuse chapter?
"I believe that, unless companies start to bring engineering-type disciplines to application development, not only will they fail to take full advantage of the potential of computers, but they will become more and more swamped with the burden of maintaining old systems. You can't do a lot about old systems - I know, I've seen lots of them - but new systems can be built in such a way that they are maintainable. It should also be possible to gradually convert old programs over to the FBP technology piece-meal, rather than "big bang". "
How prophetic. There's so much hype around new software but the majority of developers are stuck nursing legacy systems.
Could this be where FBP's "killer app" lies - in a systematic approach to converting legacy software?
Any ideas what that approach would look like?
Hi Bert,
Is developing new code from scratch required for a strangler application?
Does it allow for mining existing assets and rewriting them within a new framework?
So I could approach it like this?
1. Create a simple pass-through proxy FBP network. All requests and responses are simply forwarded on.
2. Add routing logic that handles requests that meet certain criteria but continues to forward the rest (a rough sketch follows this list).
3. It may be desirable to both handle and forward the request and then compare the results. If the new code produces a different result, log the error and return the legacy response. Once the new code has proven itself reliable, this can be removed.
4. Keep adding additional routing criteria and implementation components so that the coverage of the new code steadily increases.
5. Keep going until the new code coverage is 100% and the old system can be turned off.
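Something like this for the routing component in steps 1-4 (a rough sketch of my own, with hypothetical ports and request shape):

    // Route each request IP to the new implementation if its path has
    // been migrated; otherwise forward it to the legacy system (step 1).
    function router(reqPort, newPort, legacyPort, migratedPrefixes) {
      var req;
      while ((req = reqPort.receive()) !== null) {
        var migrated = migratedPrefixes.some(function (p) {
          return req.path.indexOf(p) === 0;
        });
        (migrated ? newPort : legacyPort).send(req);
      }
    }
    // Step 3 would add a Compare process fed by both outputs, logging any
    // mismatch and returning the legacy response until the new code is trusted.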
FBP is a new/old paradigm - I don't believe you can get to it by starting with the von Neumann (Harvard?) paradigm - you have to start completely afresh - "become as a little child"! Actually that's a good idea!
For me, FBP has always been an engineering type of discipline. What we are trying to do when we build an FBP application is design a data processing factory - using standard types of machines.
An engineering discipline requires training and experience. But such disciplines don't build all their components - there is usually a hierarchy, from small to large. In FBP, these might be components, subnets, networks, applications, company- or world-spanning networks... As I've said elsewhere, you don't give an engineer a pile of girders and a manual, and tell her to build a bridge. There is school, on-the-job training, and progress up through the ranks, all of which takes a while. A bridge or a cathedral is also a team effort, so you have to be able to carve up the design early and hand off sections of it to different team members - for that you need clean interfaces between the chunks, and well-defined functions.
I think the insight is that pure lazy functional programming and dataflow
are two sides of the same coin. A Haskell program, for example, has a
stateless inside and a stateful outside, because the main program lives
in the IO monad. FBP networks have a stateless outside (the graph)
and potentially stateful insides (the components).
Of course a component can be written in Haskell and have a pure core
wrapped in a stateful mantle running in a pure network!
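Sketched in JavaScript terms (hypothetical ports, my own illustration of the shape being described): a pure core doing the computation, a stateful mantle owning the process state, inside a network that is just stateless wiring.

    function square(x) { return x * x; }      // pure core: no state, no effects

    function squarer(inport, outport, countPort) {
      var seen = 0;                            // state lives in the mantle
      var x;
      while ((x = inport.receive()) !== null) {
        seen += 1;
        outport.send(square(x));
      }
      countPort.send(seen);                    // stateful by-product of the job
    }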
I would like to help, since the main problem with FBP is the lack of available material with modern solutions. We started a "documenting" effort with the flowbased wiki but it got stale; I'd rather spend effort on making example programs made with FBP, with the thought process behind them. Nothing beats that kind of material.
I think that you are reflecting the thoughts of many, Kenneth. I think that noflo requires too much setup (software and mental), because the whole experience with noflo-ui is not finished. But that's normal with any big project like that. Since ES6 is a reality now, I started a small project to bring classical FBP anywhere with an ES6-based engine. It still requires a lot of work, but the goals are:
- Almost zero setup (should be as simple as using jQuery)
- A new DSL and editor (already implemented)
- Easy to integrate in existing applications
- As classical as I can get it to be (considering cooperative coroutines)
- Run with io.js for server and batch applications
Since I'm using BabelJs, pretty much any system that can run ES5 will be able to run it.
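For anyone who hasn't seen the generator style: a toy sketch of how such an engine might drive a component with yield (my own illustration, not the project's actual API):

    function* doubler() {
      while (true) {
        var ip = yield 'receive';      // suspend until the scheduler has an IP
        yield ['send', ip * 2];        // hand the result back to the scheduler
      }
    }

    // A minimal scheduler driving one process over a queue of IPs:
    function run(gen, queue) {
      var g = gen(), out = [], step = g.next();
      while (true) {
        if (step.value === 'receive') {
          if (!queue.length) break;    // no more input: retire the process
          step = g.next(queue.shift());
        } else {
          out.push(step.value[1]);
          step = g.next();
        }
      }
      return out;
    }
    console.log(run(doubler, [1, 2, 3])); // [ 2, 4, 6 ]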
To add to what Alfredo has mentioned, I thought that native support of threads wasn't a requirement as the first versions of FBP implementations were done in green threads.
And I thought the problem was that JS was by design "asynchronous" in that it is guaranteed to run to completion without blocking, which is mitigated by ES6 as pointed out by Alfredo.
The issue seems to be that the syntax (i.e. `yield`) is unconventional to those outside of the JS world. Is that right?
I know that it looks unnecessary but it can work as a nice trick to avoid "pass" yields, to allow cooperative coroutines without breaking the illusion. Of course sometimes a "pass" yield will be needed. But don't think of a drop as blocking.
If "run to completion without blocking" refers to deadlocks - we have always considered deadlocks a design issue. We never had a deadlock show up in a properly designed network, with the emphasis on "properly designed". Are you saying that deadlock avoidance is one of the reasons for the limitations in JS - if so, I kind of doubt that!
I don't mind 'yield' so much, but I seem to remember that you can't bury 'yield' inside a service - why does 'yield' have to be 'visible'? Also, I believe you can only use the "top" frame - I think we can live with that, but IMHO it's just a stupid restriction, arising not from philosophical reasons, but because the JS gurus are too stubborn to use multiple fibers!
> I know that it looks unnecessary but it can work as a nice trick to avoid "pass" yields, to allow cooperative coroutines without breaking the illusion. Of course sometimes a "pass" yield will be needed. But don't think of a drop as blocking.
What Kenneth might be referring to with the nature of JS is that there's an event loop, and blocking execution is very frowned upon, so hidden blocking functions can cause much trouble. It might be one of the motivations for the yield limitation, or it might be that generators are translated into a state machine and going more than one level deep could have implications that escape my understanding. It wouldn't be such an issue if generators were implemented differently, I guess, like in Lua or when you use fibers with Node.
A concept I'm experimenting with is "internal inports" to make this even easier, so that a process can wait on callbacks the same way it can read from inports.
Ged, the main graph is run within an interval in the main window, to allow for event processing. Inside the main graph, processes are generators, which may or may not fire up workers or child processes in the case of Node or io.js. Spawned workers or child processes would write to a parent-owned queue, and when the parent's turn to execute comes, the queued data can be used. Communication with the main graph (or other graphs running in the same page) is through pushing data into queues and passing a callback for output data if needed.
Running the graph inside a worker might not be feasible due to how restricted they are , but it is something worth exploring.
I hope that this description is clear.
I think you have to design the overall flow first - you're right that you can't afford to have one user's function block others, so you try to get each user's functions through as fast as possible, or "out of the way" if it's going to take a little time. This is also why we tend to keep screen management in a different subsystem. In our JavaFBP-based Brokerage application we multiplexed processes which were I/O- or CPU-intensive (fed by a Load Balancer process), and also used caching to reduce the amount of I/O. See the cover of my book (2nd edition) for a schematic - it doesn't show multiplexing or caching, but they were key to improving performance. I believe our screen management was managed by IBM's WebSphere, now InfoSphere.

Interestingly, the new Facebook architecture, Flux, uses the same basic architecture, as they found that their previous MVC architecture didn't scale up adequately. See Jing Chen's presentation in https://facebook.github.io/flux/ .
Of course, I don't know if you can do this easily with JavaScript, with its single stack - maybe someone could enlighten me why people are even trying to use JavaScript for this kind of work...? IIUC web workers do support multiple cores, but communication between them doesn't seem to me to readily support FBP-style design. Since JS is now #7 on the TIOBE popularity stats, maybe we should turn our attention to Python or C[++], both of which easily support this kind of thing... See http://www.tiobe.com/index.php/content/paperinfo/tpci/index.html .
Did you guys ever look at JSFBP - https://github.com/jpaulm/jsfbp ? I am becoming more and more convinced that this is the only way to support FBP using JS - and no complicated Promises, Futures, etc. When you allow multiple stacks, everything becomes dead easy!
PS I didn't know Swing was dead! DrawFBP seems to work pretty well, and it's based on Swing!
Maybe someone could correct me but I think the JS community is simply trying to shoehorn things into JS even though they don't make sense.
Paul Morrison scripsit:
> PS I didn't know Swing was dead! DrawFBP seems to work pretty well, and
> it's based on Swing!
Swing is only dead in the sense that the kool kids don't like it any more
(or desktop apps of any sort, only web apps and native mobile apps).
That doesn't mean support for it in Java is going away.
Humberto Madeira scripsit:
> When I switched from Swing to Web - even with all the browser quirkiness of
> IE 5.5 at the time, the improvement was so dramatic that no one was
> inclined to switch back.
You're talking about desktop apps running as local web servers? Running
on centralized web servers is a whole different thing, with completely
different security and performance profiles.
In any case, there are
still a lot of limitations in HTML5, particularly in the area of direct
manipulation UI, though things like drag and drop are now decently supported.
When this happens - when a process tries to send to a full connection, say, or to receive from an empty one - the process is suspended and some other ready process gets control. Eventually, the suspended process can proceed (because the blocking condition has been relieved) and will get control when time becomes available. Dijkstra called such processes "sequential processes", and explained that the trick is that they do not know that they have been suspended. Their logic is unchanged - they are just "stretched" in time.