S4 Introduction


Krishna

Nov 3, 2010, 6:45:40 PM
to s4-project
Interesting project. I haven't looked deeply enough yet, but it has some
similarities with Flume.
Anyway, thanks Yahoo.
Cheers
<k/>

Anish Nair

Nov 4, 2010, 8:00:38 PM
to s4-project
Flume is great for scooping logs into HDFS in a streaming manner.
S4 is more of a compute platform. We view it as a way of writing
operations on data streams (in a distributed fashion, of course). In
fact, we could use Flume to plug input streams into S4. They're
complementary, in that sense.
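
To illustrate the split described above (one component delivers events, another computes on them), here is a minimal Python sketch. It does not use the S4 or Flume APIs; `log_source` and `count_by_key` are hypothetical names, with the generator standing in for whatever feeds the stream and the operator standing in for a stream computation.

```python
from collections import defaultdict

def log_source(lines):
    """Hypothetical input stage: stands in for a log collector feeding a stream."""
    for line in lines:
        yield line

def count_by_key(events, key_of):
    """Hypothetical stream operation: emit a running count per key as events arrive."""
    counts = defaultdict(int)
    for event in events:
        key = key_of(event)
        counts[key] += 1
        yield key, counts[key]

if __name__ == "__main__":
    lines = ["GET /a", "GET /b", "GET /a"]
    # The key is the request path, i.e. the second whitespace-separated field.
    for key, n in count_by_key(log_source(lines), lambda l: l.split()[1]):
        print(key, n)
```

The source and the operation only share an iterator, so either side can be swapped without touching the other, which is the sense in which the two roles are complementary.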

OG

Nov 5, 2010, 4:12:15 PM
to s4-project
I skimmed the docs and had a similar reaction. Sounds a bit like
Flume and a bit like JMS. Couldn't one add custom operations in Java
classes acting as sources or sinks in Flume, and their equivalents in
JMS?

Thanks,
Otis

Ted Dunning

Nov 5, 2010, 9:24:06 PM
to s4-pr...@googlegroups.com

I think you could do this with a substrate like JMS, but the purely functional contract of map-reduce gives
you more freedom in terms of semantics than most other frameworks. This contract allows, for instance,
substantial rewrites of the map-reduce dataflow graph that are not allowable with a general JMS framework,
better framework-initiated failure tolerance, and lighter-weight messaging. It isn't clear that S4 delivers on
all of those potentials, but I could imagine that S4 with the equivalent of speculative execution and an
optimization framework like Plume (a clone of FlumeJava, which is not Flume) might give you many of them.

Whether Flume gives you these capabilities, I couldn't say.
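
As a rough illustration of the kind of dataflow rewrite a purely functional contract permits, the Python sketch below fuses a chain of per-record stages into a single pass. It is not S4, Plume, or FlumeJava code; `compose`, `run_unfused`, and `run_fused` are hypothetical names, and the rewrite is only valid under the assumption that every stage is a side-effect-free function of its input record.

```python
from functools import reduce

def compose(*fns):
    """Fuse several per-record functions into one, applied left to right."""
    return lambda x: reduce(lambda acc, f: f(acc), fns, x)

def run_unfused(records, stages):
    """Naive plan: one full pass and one intermediate list per stage."""
    for stage in stages:
        records = [stage(r) for r in records]
    return records

def run_fused(records, stages):
    """Rewritten plan: a single pass over the data.

    Safe only because each stage is assumed pure, so reordering the
    evaluation cannot change the result.
    """
    fused = compose(*stages)
    return [fused(r) for r in records]

if __name__ == "__main__":
    stages = [str.strip, str.lower, len]
    records = ["  Hello ", "S4  ", " Flume"]
    # Both plans must agree for the rewrite to be legal.
    assert run_unfused(records, stages) == run_fused(records, stages)
    print(run_fused(records, stages))
```

With arbitrary message handlers that may write to external state, a framework could not tell whether the two plans are equivalent, so it would have to run the stages exactly as written.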

OG

Nov 6, 2010, 2:33:02 AM
to s4-project
Yeah, it seems S4 is richer than JMS and really made for creating
flows of components that process streams of events and even perform
real-time computations on them. I think Flume is less about
performing computation on streams of data, and more about getting data
from point A to point Z.

Otis
