Gremlin as a modern query language.

Marko Rodriguez

Mar 2, 2016, 8:23:22 AM
to gremli...@googlegroups.com, d...@tinkerpop.incubator.apache.org
Hello,


Note that the various DSLs these systems use are analogous to Gremlin: they all use the "functional-fluent" style. We need to stress to people that if they use Spark/Storm/Flink/Samza/Scala/Java8/Clojure, then Gremlin fits their existing mental model of data flows and aggregations.

When people say a query language needs to be "like SQL," point them to the fact that most modern data processing frameworks don't use that style. When people say that SQL is declarative and thus can be optimized, tell them that these functional-fluent languages also build a query plan that is optimized for the underlying execution engine. By making an "SQL language," all you are doing is adding another layer of indirection: now you have to compile a String down to the underlying execution language (e.g. Java). Modern data processing languages don't spend that effort, because the constructs in modern programming languages already provide enough expressivity. Moreover, these languages lead (I believe) to execution engine designs that naturally support both single-machine and compute-cluster execution, since they have a map/reduce foundation inherent in their representation.
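To make the "fluent calls build a plan that is then optimized" point concrete, here is a minimal sketch in Java. It is not how Gremlin, Spark, or Flink are implemented; the class `FluentPlan` and its methods are hypothetical names invented for illustration. Each `filter(...)` call only records a step, and the "optimizer" is a single toy rewrite that fuses all recorded filters into one predicate before anything runs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical mini-DSL (illustration only): fluent calls record steps
// instead of executing them, so the whole plan can be rewritten first.
public class FluentPlan {
    private final List<Predicate<String>> filters = new ArrayList<>();

    // Record a filter step; nothing is executed yet.
    public FluentPlan filter(Predicate<String> p) {
        filters.add(p);
        return this;
    }

    // Number of steps recorded in the unoptimized plan.
    public int stepCount() {
        return filters.size();
    }

    // Toy "optimizer": fuse every recorded filter into a single predicate.
    public Predicate<String> optimize() {
        return filters.stream().reduce(s -> true, Predicate::and);
    }

    // Execute the optimized plan against an in-memory input.
    public List<String> execute(List<String> input) {
        Predicate<String> fused = optimize();
        return input.stream().filter(fused).collect(Collectors.toList());
    }
}
```

The point of the sketch is only that no String has to be parsed: the host language's method chain is already the plan representation, and the engine is free to rewrite it before execution.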

GREMLIN

text.map(line -> line.split(" "))
    .unfold()
    .groupCount()

STORM

topology.newStream("spout1", spout)
    .each(new Fields("sentence"), new Split(), new Fields("word"))
    .groupBy(new Fields("word"))
    .persistentAggregate(new MemoryMapState.Factory(),
                         new Count(), new Fields("count"));

SPARK

text.flatMap(line => line.split(" "))
    .map(word => (word, 1))
    .reduceByKey(_ + _)


SAMZA

text.split(" ").foldLeft(Map.empty[String, Int]) {
    (count, word) => count + (word -> (count.getOrElse(word, 0) + 1))
}


FLINK

text.flatMap(_.split(" "))
    .map((_, 1))
    .groupBy(0)
    .sum(1)
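Since Java8 is on the list above, the same word count can also be sketched with the Java 8 Streams API. This is a minimal sketch under stated assumptions: the class name `WordCount` is hypothetical, and the input is assumed to be a single String with words separated by single spaces.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical example class: word count in the functional-fluent style
// using only the standard Java 8 Streams API.
public class WordCount {

    // Split the text on spaces, then group identical words and count them.
    public static Map<String, Long> count(String text) {
        return Arrays.stream(text.split(" "))
                .collect(Collectors.groupingBy(Function.identity(),
                                               Collectors.counting()));
    }

    public static void main(String[] args) {
        System.out.println(count("the quick fox jumps over the lazy dog"));
    }
}
```

As in the snippets above, the pipeline is a split step followed by a group-and-count aggregation, expressed directly in the host language.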

Take care,
Marko.