ern0 scripsit:
This is a very interesting project indeed. It subsumes FBP, with the
following extensions:
1) A network can be distributed across multiple machines transparently
when it is deployed, rather than running in a single process.
2) A specific instance of a component can be run in multiple copies
("tasks") either on the same or different machines. You can provide a
"stream grouping" to specify how packets arriving at the input ports
are distributed among the tasks. See below.
3) There is a standard high-performance serialization format for
transferring IPs between processes or machines.
4) Both networks and components can be written in non-JVM languages
easily. A non-JVM component uses JSON as the format for incoming and
outgoing IPs.
5) Because network transmissions are inherently unreliable, there
is a mechanism for end-to-end acknowledgement and retransmission.
Components are divided into ones with input ports ("bolts") and ones
without ("spouts"). Whenever a bolt drops a packet, it sends a hash
of the packet back to the originating spout. If the bolt creates new
packets based on the old one, it also sends the spout a hash of each
new packet, and the identity of the originating spout is passed along
with the new packets to the next component.
Each spout keeps track of the incoming notifications. If any packet
generated in the network from a given outgoing packet has not been
dropped within a specified timeout interval, the spout should
retransmit the outgoing packet. Some spouts can't retransmit at all,
and some packets can't be safely retransmitted because processing them
is not idempotent.
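To make the timeout rule concrete, here is a minimal sketch of the
bookkeeping a spout might do. The class and method names
(PendingPackets, sent, completed, expired) are my own inventions for
illustration, not the project's API, and the injectable clock is there
only so the behaviour can be tried without waiting.

```python
import time


class PendingPackets:
    """Hypothetical spout-side table of packets awaiting full processing."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self.timeout = timeout_seconds
        self.clock = clock
        self.pending = {}  # packet id -> (payload, deadline)

    def sent(self, packet_id, payload):
        """Record an outgoing packet and start its timer."""
        self.pending[packet_id] = (payload, self.clock() + self.timeout)

    def completed(self, packet_id):
        """Forget a packet once everything derived from it has been dropped."""
        self.pending.pop(packet_id, None)

    def expired(self):
        """Return (id, payload) pairs that are overdue and restart their timers."""
        now = self.clock()
        late = [(pid, payload)
                for pid, (payload, deadline) in self.pending.items()
                if deadline <= now]
        for pid, payload in late:
            self.pending[pid] = (payload, now + self.timeout)
        return late
```

A spout that can replay its source would call expired() periodically
and send each returned payload again; one that cannot would simply
never call it.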
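Packet hashes are XORed together, and so are drop-packet hashes. The
two results are equal exactly when every packet that was created has
also been dropped, so the spout needs only a fixed amount of state per
outgoing packet, and it is extremely unlikely to misjudge whether a
packet has been fully processed.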
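The XOR trick can be sketched in a few lines. This is only an
illustration under my own assumptions: the XorTracker class is
hypothetical, and a random 64-bit number stands in for the packet hash.

```python
import secrets


class XorTracker:
    """Tracks one outgoing spout packet and everything derived from it."""

    def __init__(self):
        self.started = False
        self.created = 0  # XOR of the ids of all packets created so far
        self.dropped = 0  # XOR of the ids of all packets dropped so far

    def create(self, packet_id=None):
        """Register a new packet; a random 64-bit id stands in for its hash."""
        if packet_id is None:
            packet_id = secrets.randbits(64)
        self.started = True
        self.created ^= packet_id
        return packet_id

    def drop(self, packet_id):
        """Register that a bolt has dropped (finished with) a packet."""
        self.dropped ^= packet_id

    def fully_processed(self):
        """True once every packet created so far has also been dropped."""
        return self.started and self.created == self.dropped


tracker = XorTracker()
root = tracker.create()      # the spout emits a packet
child_a = tracker.create()   # a bolt derives two new packets from it
child_b = tracker.create()
tracker.drop(root)           # notifications may arrive in any order
tracker.drop(child_b)
tracker.drop(child_a)
```

Because XOR is commutative and each id cancels itself, the order of the
notifications doesn't matter, and the state stays at two integers no
matter how many packets are derived from the original.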
Here's a quick list of available stream groupings (you can write your
own, too):
1) Randomly distribute input packets over the tasks. This is the default.
2) Specify certain fields of the packet as the key, and send all packets
with the same key to the same task.
3) Send all packets to all tasks.
4) Send all packets to a single task.
5) The sender specifies which task to send to.
6) Randomly distribute packets over the tasks running in this same
process, if any; otherwise, across all tasks.
7) If the next component is a non-looper and it is available in this same
process, execute it in the same thread. Otherwise, randomly distribute.
This is not implemented yet, but will be the default in future.
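For the curious, groupings 1 through 6 can each be sketched as a
function that takes a packet and the list of tasks and returns the
tasks that should receive it. The function names and signatures are
made up for this illustration and are not the project's API; a packet
is assumed to be a plain dict.

```python
import random
import zlib


def shuffle(packet, tasks, rng=random):
    """1) Pick a task at random."""
    return [rng.choice(tasks)]


def by_fields(packet, tasks, fields):
    """2) Same key fields -> same task, via a stable hash of the key."""
    key = "\x1f".join(str(packet[name]) for name in fields)
    return [tasks[zlib.crc32(key.encode("utf-8")) % len(tasks)]]


def to_all(packet, tasks):
    """3) Every task gets a copy."""
    return list(tasks)


def to_one(packet, tasks):
    """4) Always the same task (here simply the first; an assumption)."""
    return [tasks[0]]


def direct(packet, tasks, chosen):
    """5) The sender names the task."""
    if chosen not in tasks:
        raise ValueError(f"unknown task: {chosen!r}")
    return [chosen]


def local_or_shuffle(packet, tasks, local_tasks, rng=random):
    """6) Prefer tasks in this process; otherwise choose among all tasks."""
    candidates = [t for t in tasks if t in local_tasks] or list(tasks)
    return [rng.choice(candidates)]
```

The fields grouping uses zlib.crc32 rather than Python's built-in
hash() because string hashes are randomized per process, and all
senders must agree on where a given key goes.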
--
John Cowan <co...@ccil.org>
http://www.ccil.org/~cowan
Where the wombat has walked, it will inevitably walk again.
(even through brick walls!)