I followed the loading section of the tutorial and just changed the filename to my soc-LiveJournal.txt.
Unfortunately, the Gremlin Java process keeps doing full GCs once the load has been running for a while. I'm looking for a way to load large datasets (1 GB to 100 GB) efficiently. I also tried HadoopGraph, but found that it does not support addVertex. Any suggestions?
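In case it matters, since TinkerGraph holds the entire graph in heap, one knob I can adjust is the console's JVM heap before launching (a sketch, assuming the stock gremlin.sh, which reads JAVA_OPTIONS; the sizes are placeholders for my machine, not recommendations):

```shell
# Give the Gremlin Console a larger heap before starting it;
# gremlin.sh picks these flags up from the JAVA_OPTIONS env var.
export JAVA_OPTIONS="-Xms4g -Xmx8g"
bin/gremlin.sh
```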
graph = TinkerGraph.open()
graph.createIndex('userId', Vertex.class)   // index lookups on userId
g = traversal().withEmbedded(graph)

// return the existing 'user' vertex for this id, or create it
getOrCreate = { id ->
    g.V().has('user', 'userId', id).fold().
      coalesce(unfold(), addV('user').property('userId', id)).next()
}

new File('wiki-Vote.txt').eachLine {
    if (!it.startsWith("#")) {
        // each non-comment line is "fromId<TAB>toId"
        (fromVertex, toVertex) = it.split('\t').collect(getOrCreate)
        g.addE('votesFor').from(fromVertex).to(toVertex).iterate()
    }
}