Storm project


ern0

Apr 24, 2013, 3:23:41 AM
to flow-based-...@googlegroups.com
I don't know whether this was mentioned before, but I've found another system:
http://storm-project.net/
--
ern0
Do you have any questions?

John Cowan

Apr 26, 2013, 11:48:49 AM
to flow-based-...@googlegroups.com
ern0 scripsit:

> I don't know whether this was mentioned before, but I've found another system:
> http://storm-project.net/

This is a very interesting project indeed. It subsumes FBP, with the
following extensions:

1) A network can be distributed across multiple machines transparently
when it is deployed, rather than running in a single process.

2) A specific instance of a component can be run in multiple copies
("tasks") either on the same or different machines. You can provide a
"stream grouping" to specify how packets arriving at the input ports
are distributed among the tasks. See below.

3) There is a standard high-performance serialization format for
transferring IPs between processes or machines.

4) Both networks and components can be written in non-JVM languages
easily. A non-JVM component uses JSON as the format for incoming and
outgoing IPs.

5) Because network transmissions are inherently unreliable, there
is a mechanism for end-to-end acknowledgement and retransmission.
Components are divided into ones with input ports ("bolts") and ones
without ("spouts"). Whenever a bolt drops a packet, it sends a hash
of the packet back to the originating spout. If it creates new packets
based on the old one, it tells the spout by sending a hash of each new
packet, and the identity of the originating spout is passed along to
the next component.

Each spout keeps track of the incoming notifications, and if all
packets in the network generated from a given outgoing packet have
not been dropped within a specified timeout interval, the spout should
retransmit the packet. Some spouts can't do that, and other packets
can't be safely retransmitted because they are not idempotent.
Packet hashes are xored together, and so are drop-packet hashes, making it
extremely unlikely that the spout will decide a packet needs retransmission
when it really has been processed successfully.
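
The bookkeeping described in point 5 can be sketched roughly as follows. This is a minimal sketch, not Storm's API: AckTracker, its method names, and new_id are hypothetical, and random 64-bit ids stand in for the packet hashes. Each id is XORed into a per-root checksum once when the packet is created and once when it is dropped, so the checksum returns to zero only when every packet derived from the root has been dropped.

```python
import random


class AckTracker:
    """One running XOR checksum per root packet (hypothetical helper)."""

    def __init__(self):
        self.pending = {}  # root id -> XOR of ids reported so far

    def _toggle(self, root_id, packet_id):
        # XOR is its own inverse: the first report adds the id,
        # the second report of the same id cancels it.
        self.pending[root_id] = self.pending.get(root_id, 0) ^ packet_id

    def created(self, root_id, packet_id):
        """A packet derived from root_id has entered the network."""
        self._toggle(root_id, packet_id)

    def dropped(self, root_id, packet_id):
        """A packet derived from root_id has been fully processed."""
        self._toggle(root_id, packet_id)

    def is_complete(self, root_id):
        """True once every created id has also been reported as dropped."""
        return self.pending.get(root_id) == 0


def new_id():
    """Random non-zero 64-bit id (assumption: stands in for a packet hash)."""
    return random.getrandbits(64) or 1


# Usage: a root packet that spawns two children downstream.
tracker = AckTracker()
root = new_id()
tracker.created(root, root)
children = [new_id(), new_id()]
for child in children:
    tracker.created(root, child)
tracker.dropped(root, root)
for child in children:
    tracker.dropped(root, child)
done = tracker.is_complete(root)
```

A spout built on this idea would retransmit a root packet whose checksum is still non-zero when the timeout expires. The memory cost is one integer per root packet, however many packets are derived from it.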

Here's a quick list of available stream groupings (you can write your
own, too):

1) Randomly distribute input packets over the tasks. This is the default.

2) Specify certain fields of the packet as the key, and send all packets
with the same key to the same task.

3) Send all packets to all tasks.

4) Send all packets to a single task.

5) The sender specifies which task to send to.

6) Randomly distribute packets over the tasks running in this same
process, if any; otherwise, across all tasks.

7) If the next component is a non-looper and it is available in this same
process, execute it in the same thread. Otherwise, randomly distribute.
This is not implemented yet, but will be the default in future.
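
The first four groupings above can be sketched as plain functions that map a packet to the list of tasks that should receive it. This is a sketch under stated assumptions, not Storm's interface: the function names, the dict-shaped packet, the CRC32 key hash, and the choice of the first task for grouping 4 are all illustrative.

```python
import random
import zlib


def shuffle_grouping(packet, tasks, rng=random):
    """Grouping 1: pick one task at random."""
    return [rng.choice(tasks)]


def fields_grouping(packet, tasks, key_fields):
    """Grouping 2: packets with equal key fields go to the same task."""
    key = "|".join(str(packet[field]) for field in key_fields)
    # CRC32 is stable across runs, unlike Python's built-in str hash.
    index = zlib.crc32(key.encode("utf-8")) % len(tasks)
    return [tasks[index]]


def all_grouping(packet, tasks):
    """Grouping 3: every task gets a copy."""
    return list(tasks)


def global_grouping(packet, tasks):
    """Grouping 4: a single task gets everything (here, the first one)."""
    return [tasks[0]]


# Usage: two packets with the same "user" field land on the same task.
tasks = ["task-0", "task-1", "task-2"]
first = fields_grouping({"user": "ann", "n": 1}, tasks, ["user"])
second = fields_grouping({"user": "ann", "n": 2}, tasks, ["user"])
same_task = first == second
```

Keying on fields is what makes per-key state (counters, joins) safe when a component runs as several tasks, since no two tasks ever see the same key.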

--
Where the wombat has walked, John Cowan <co...@ccil.org>
it will inevitably walk again. http://www.ccil.org/~cowan
(even through brick walls!)

Kenneth Kan

Apr 27, 2013, 5:44:09 PM
to flow-based-...@googlegroups.com, co...@mercury.ccil.org
When Twitter open-sourced Storm, I dismissed it almost entirely. FBP itself is already a solid programming paradigm. Storm takes some of the disadvantages away and has a strong community due to its affiliation with Twitter. Now that I compare it to FBP, I can't help but ask: why would a new programmer want to learn FBP rather than Storm? Would you say it's because FBP is a simpler system?

ern0

Apr 27, 2013, 5:49:28 PM
to flow-based-...@googlegroups.com
> 1) A network can be distributed across multiple machines transparently
> when it is deployed, rather than running in a single process.

It's easier than you think: you just write a pair of components.

I've written a prototype dataflow system which has only a "fake"
scheduler (components pass packets by calling each other through a
pointer, in C++), but I've also written UdpSend and UdpReceive components,
so I can set up a multi-host application, even with a half-baked system.

The best part: you don't need multiple machines to set up a multi-host
app. Nobody cares if the two segments of your network run on the same
machine during development.
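
A minimal sketch of such a component pair, in Python rather than the C++ of the prototype. Only the names UdpSend and UdpReceive come from the post; the class bodies, the JSON encoding, the timeout, and the loopback address are assumptions made for illustration.

```python
import json
import socket


class UdpReceive:
    """Network-entry component: turns incoming datagrams into packets."""

    def __init__(self, host="127.0.0.1", port=0):
        self.sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        self.sock.bind((host, port))   # port 0 lets the OS pick a free port
        self.sock.settimeout(2.0)      # do not block forever on a lost datagram
        self.address = self.sock.getsockname()

    def receive(self):
        data, _sender = self.sock.recvfrom(65535)
        return json.loads(data.decode("utf-8"))

    def close(self):
        self.sock.close()


class UdpSend:
    """Network-exit component: forwards each packet as one datagram."""

    def __init__(self, address):
        self.address = address
        self.sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

    def send(self, packet):
        self.sock.sendto(json.dumps(packet).encode("utf-8"), self.address)

    def close(self):
        self.sock.close()
```

Both halves can run on one machine over the loopback address during development; pointing UdpSend at another host's address is the only change needed to split the network. UDP gives no delivery guarantee, so a real deployment would need acknowledgement on top.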
--
ern0
dataflow programmer

John Cowan

Apr 27, 2013, 9:40:18 PM
to flow-based-...@googlegroups.com
ern0 scripsit:

> >1) A network can be distributed across multiple machines transparently
> >when it is deployed, rather than running in a single process.
>
> It's easier than you think: you just write a pair of components.

Sure, but that means as you change the details of deployment, you have
to change the network. Storm separates the concerns of the programmer
from those of ops.

--
John Cowan co...@ccil.org
I amar prestar aen, han mathon ne nen, http://www.ccil.org/~cowan
han mathon ne chae, a han noston ne 'wilith. --Galadriel, LOTR:FOTR

John Cowan

Apr 27, 2013, 9:41:25 PM
to Kenneth Kan, flow-based-...@googlegroups.com
Kenneth Kan scripsit:

> Now that I compare it to FBP, I can't help but ask: why would a
> new programmer want to learn FBP rather than Storm? Would you say
> it's because FBP is a simpler system?

FBP is an architecture, not a specific implementation, and Storm is one
of its implementations. As is the shell, as is JavaFBP, as was AMPS
(in assembler on the '360).

--
You escaped them by the will-death John Cowan
and the Way of the Black Wheel. co...@ccil.org
I could not. --Great-Souled Sam http://www.ccil.org/~cowan

Paul Morrison

Apr 27, 2013, 10:10:02 PM
to flow-based-...@googlegroups.com
On 27/04/2013 5:44 PM, Kenneth Kan wrote:
> When Twitter open-sourced Storm, I dismissed it almost entirely. FBP
> itself is already a solid programming paradigm. Storm takes some of
> the disadvantages away and has a strong community due to its
> affiliation with Twitter. Now that I compare it to FBP, I can't help
> but ask: why would a new programmer want to learn FBP rather than
> Storm? Would you say it's because FBP is a simpler system?
My friend and mentor, Wayne Stevens, used to say that, when it was time
for someone to invent the hula hoop, hula hoops would start appearing
all over the place, produced by different manufacturers! Maybe we should
view FBP as a paradigm shift, triggering all sorts of new approaches,
both in software and hardware, while Storm (and JavaFBP) are specific
implementations. Just my 2 cents!

John Cowan

Apr 27, 2013, 11:38:38 PM
to flow-based-...@googlegroups.com
Paul Morrison scripsit:

> My friend and mentor, Wayne Stevens, used to say that, when it was
> time for someone to invent the hula hoop, hula hoops would start
> appearing all over the place, produced by different manufacturers!

A tree can not find out, as it were, how to blossom, until comes
blossom-time. A social growth cannot find out the use of steam
engines, until comes steam-engine-time.

--Charles Fort, "Lo!" (1931), often misquoted as "It steam engines when
it comes steam-engine time."

> Maybe we should view FBP as a paradigm shift, triggering all sorts
> of new approaches, both in software and hardware, while Storm (and
> JavaFBP) are specific implementations. Just my 2 cents!

Indeed.

--
John Cowan co...@ccil.org http://ccil.org/~cowan
Female celebrity stalker, on a hot morning in Cairo:
"Imagine, Colonel Lawrence, ninety-two already!"
El Auruns's reply: "Many happy returns of the day!"

Kenneth Kan

Apr 28, 2013, 2:03:22 AM
to John Cowan, flow-based-...@googlegroups.com

On Sat, Apr 27, 2013 at 9:41 PM, John Cowan <co...@mercury.ccil.org> wrote:
> FBP is an architecture, not a specific implementation, and Storm is one
> of its implementations. As is the shell, as is JavaFBP, as was AMPS
> (in assembler on the '360).

Very succinctly said! Thank you. Now it's much more clear to me.

As Paul said, this is a paradigm shift, and this new paradigm is far superior for most business logic. If reinventing the hula hoop is what it takes, so be it!