[Faunus] Hadoop-Based Graph Computing Framework (0.1-alpha)

Marko Rodriguez

Sep 21, 2012, 4:40:30 AM

to gremli...@googlegroups.com

Hello,

For those who like Gremlin and have Big Graph Data, please direct your attention to Faunus.

http://thinkaurelius.github.com/faunus/

Faunus is a Hadoop-based graph computing framework that provides a breadth-first implementation of Gremlin. With Faunus, you simply use the Gremlin REPL and Gremlin expressions to rip out a series of MapReduce jobs on graphs represented across a machine cluster.

[Titan]

gremlin> g.V.out.out.count()

==>28

[Faunus]

gremlin> g.V.out.out.count()

12/09/21 01:52:42 INFO mapreduce.FaunusCompiler: Compiled to 3 MapReduce job(s)

12/09/21 01:52:42 INFO mapreduce.FaunusCompiler: Executing job 1 out of 3: MapSequence[com.thinkaurelius.faunus...

...

==>28
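
To make the breadth-first idea concrete, here is a minimal Python sketch (not Faunus code) of how a query like g.V.out.out.count() can be evaluated one step at a time. Every vertex holds a counter of the paths currently ending at it, and each out step pushes those counters along the outgoing edges, roughly the kind of work a single MapReduce round could do. The graph `toy` and the names `out_step` and `count_paths` are made up for illustration; how Faunus actually compiles steps into jobs is not shown here.

```python
from collections import defaultdict


def out_step(adjacency, counts):
    """One breadth-first 'out' step: push each vertex's path counter along its out-edges."""
    next_counts = defaultdict(int)
    for vertex, n_paths in counts.items():
        for neighbor in adjacency.get(vertex, ()):
            next_counts[neighbor] += n_paths
    return dict(next_counts)


def count_paths(adjacency, steps):
    """Count the paths of length `steps` that start at any vertex (like g.V.out...count())."""
    # Collect every vertex, including those that only appear as edge targets.
    vertices = set(adjacency)
    for targets in adjacency.values():
        vertices.update(targets)
    # g.V: one (empty) path sits at every vertex.
    counts = {v: 1 for v in vertices}
    for _ in range(steps):
        counts = out_step(adjacency, counts)
    return sum(counts.values())


# Hypothetical 4-vertex graph, NOT the Graph of the Gods.
toy = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": []}

print(count_paths(toy, 2))  # 5
```

Because only counters move between steps in this sketch, the work per step grows with the number of edges rather than with the number of paths.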

.......what about a degree distribution?

gremlin> g.V.sideEffect('{it.degree = it.outE.count()}').degree.groupCount

12/09/21 01:56:05 INFO mapreduce.FaunusCompiler: Compiled to 1 MapReduce job(s)

...

==>0 7

==>1 1

==>3 1

==>4 2

==>5 1

......but that is just over the toy Graph of the Gods graph :|.
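
For readers unfamiliar with the idiom, the degree-distribution query boils down to two steps: compute each vertex's out-degree, then count how many vertices share each degree. A minimal Python sketch of that computation follows. It is not Faunus code, and the graph `toy` and the function name `out_degree_distribution` are hypothetical.

```python
from collections import Counter


def out_degree_distribution(adjacency):
    """Map each out-degree to the number of vertices that have it."""
    # Include vertices that only appear as edge targets (out-degree 0).
    vertices = set(adjacency)
    for targets in adjacency.values():
        vertices.update(targets)
    return Counter(len(adjacency.get(v, ())) for v in vertices)


# Hypothetical 4-vertex graph, NOT the Graph of the Gods.
toy = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": []}

for degree, n_vertices in sorted(out_degree_distribution(toy).items()):
    print(degree, n_vertices)
```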

The funnest number to date with Faunus was yesterday -- we did a 4-step traversal off g.V in a social network dataset (Orkut) that yielded 308 trillion paths in 30 minutes over an 8-machine m1.xlarge cluster in EC2.

Finally, note that Faunus currently works over any Rexster-fronted, Blueprints-enabled graph database, natively with Titan (via HBase and Cassandra Hadoop connectivity), and with binary/text representations of graphs in HDFS.

Please enjoy... and if you are good with Hadoop, please feel free to contribute to the effort,

Marko.

http://thinkaurelius.com

Luca Garulli

Sep 21, 2012, 5:57:18 AM

to gremli...@googlegroups.com

Hi,

great job!

Lvc@

Marko Rodriguez

Sep 21, 2012, 6:02:38 AM

to gremli...@googlegroups.com

Thanks.

BTW: Stephen wrote the OrientDB adaptor. That was the reason he was asking about how to convert the RID in OrientDB to a long.

Take care Luca,

Marko.

http://markorodriguez.com

Stephen Mallette

Sep 21, 2012, 6:03:44 AM

to gremli...@googlegroups.com

Luca, here's the wiki page that talks about the OrientDB-specific
configuration for Faunus:

https://github.com/thinkaurelius/faunus/wiki/Rexster-Format

Stephen


Luca Garulli

Sep 21, 2012, 6:44:21 AM

to gremli...@googlegroups.com

Hi Stephen,

this project makes me want to get a bunch of servers from Amazon to test it! :-)

Lvc@

Peter Neubauer

Sep 21, 2012, 9:09:04 AM

to gremli...@googlegroups.com

Cool!

/peter

Sent from mobile device.

Chris Diehl

Sep 21, 2012, 6:52:42 PM

to gremli...@googlegroups.com

Awesome work guys! This is really exciting!

Jonathan Haddad

Sep 26, 2012, 12:35:26 PM

to gremli...@googlegroups.com

Amazing. Awesome work!

On Friday, September 21, 2012 1:40:34 AM UTC-7, Marko A. Rodriguez wrote:
