How to use BLVP to bulk load big data into Titan 1.0.0 with HBase?


HaiHong Zhou

Apr 14, 2016, 5:39:14 AM
to Aurelius
I have 1 billion vertices and billions of edges in CSV files, about 300 GB of data in total. I am using BLVP (BulkLoaderVertexProgram) to load this data into Titan 1.0.0, with HBase as the storage backend.

I had already defined the data schema.

When I load the data, an exception is thrown:

 

gremlin> :load test/bulk-load-BLVP.groovy


 

These are the contents of bulk-load-BLVP.groovy:

 

graph = GraphFactory.open("conf/hadoop-graph/hadoop-script-my.properties")
blvp = BulkLoaderVertexProgram.build().keepOriginalIds(false).writeGraph("conf/titan-hbase-bulk-load-test.properties").
        intermediateBatchSize(10000).create(graph)
graph.compute(SparkGraphComputer).program(blvp).submit().get()

 

 

These are the contents of hadoop-script-my.properties:


gremlin.graph=org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph
gremlin.hadoop.graphInputFormat=org.apache.tinkerpop.gremlin.hadoop.structure.io.script.ScriptInputFormat
gremlin.hadoop.graphOutputFormat=org.apache.hadoop.mapreduce.lib.output.NullOutputFormat
gremlin.hadoop.memoryOutputFormat=org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat
gremlin.hadoop.jarsInDistributedCache=true
gremlin.hadoop.deriveMemory=false
gremlin.hadoop.inputLocation=test/vote.txt
gremlin.hadoop.scriptInputFormat.script=data/script-input-test.groovy
gremlin.hadoop.outputLocation=output


####################################
# SparkGraphComputer Configuration #
####################################
spark.master=local[4]
spark.executor.memory=1g
spark.serializer=org.apache.spark.serializer.KryoSerializer

 

These are the contents of script-input-test.groovy:

def parse(line, factory) {
    def (sourceId, targetId) = line.split(/\t/).toList()
    def v1 = factory.vertex(sourceId, 'user')
    v1.property('userId', sourceId)

    def v2 = factory.vertex(targetId, 'user')
    // v2.property('userId', targetId)
    def edge = factory.edge(v1, v2, 'friend')
    edge.property('voteFor', targetId)
    return v1
}

 

These are the contents of titan-hbase-bulk-load-test.properties:

# My custom configuration
gremlin.graph=com.thinkaurelius.titan.core.TitanFactory
# Below is the batch-loading configuration.
storage.batch-loading=true
storage.read-only=false
storage.hbase.table=test
storage.backend=hbase
storage.hostname=127.0.0.1

cache.db-cache = true
cache.db-cache-clean-wait = 20
cache.db-cache-time = 180000
cache.db-cache-size = 0.5

 

The contents of my vote.txt are as follows:

30  1412
30  3352
30  5254
30  5543
30  7478
3   28
3   30
3   39
3   54
3   214
3   271
3   286
3   590
3   604
3   611
3   8283
25  3
25  6


Can you help me solve this problem?

Thank you!

Zhou

 

Daniel Kuppitz

Apr 14, 2016, 10:14:44 AM
to aureliu...@googlegroups.com
I guess the exception is a FastNoSuchElementException? That has been discussed several times already. Last time here: https://groups.google.com/d/msg/aureliusgraphs/PHfd0c3FRFA/8CNc_-irHAAJ

Cheers,
Daniel


--
You received this message because you are subscribed to the Google Groups "Aurelius" group.
To unsubscribe from this group and stop receiving emails from it, send an email to aureliusgraph...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/aureliusgraphs/35ec8efc-b859-42d5-b5c6-50e9e802f531%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
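
A note for readers who land here with the same error. The thread shows neither the exception nor the eventual fix, so what follows is only a sketch under an assumption: that the failure comes from feeding an edge list (one edge per line, as in vote.txt) to a loader that expects exactly one input line per vertex, so that ids appearing only as targets, or spread over several lines, are not handled the way BLVP expects. Under that assumption, one option is to regroup the file into an adjacency list first and have parse read every target from a single line. The helper name toAdjacencyLines and the tab-separated output layout are hypothetical; the factory calls are only the ones already used in script-input-test.groovy above.

```groovy
// Hypothetical pre-processing helper (not from the thread): turns an edge list
// with "source<whitespace>target" per line into one line per vertex,
// "id<TAB>target1<TAB>target2...". Ids that only occur as targets still get
// a line of their own, with no targets after the id.
def toAdjacencyLines(List<String> edgeLines) {
    // TreeMap keeps the output order deterministic (sorted by id string).
    def adjacency = new TreeMap<String, List<String>>()
    edgeLines.each { line ->
        def parts = line.trim().split(/\s+/)
        if (parts.length != 2) return            // skip blank or malformed lines
        def source = parts[0]
        def target = parts[1]
        if (!adjacency.containsKey(source)) adjacency[source] = []
        if (!adjacency.containsKey(target)) adjacency[target] = []
        adjacency[source] << target
    }
    adjacency.collect { id, targets -> ([id] + targets).join('\t') }
}

// Variant of the parse script for the regrouped file: the first column is the
// vertex itself, every remaining column is the target of one 'friend' edge.
def parse(line, factory) {
    def ids = line.split(/\t/).toList()
    def sourceId = ids.head()
    def v1 = factory.vertex(sourceId, 'user')
    v1.property('userId', sourceId)
    ids.tail().each { targetId ->
        def v2 = factory.vertex(targetId, 'user')
        def edge = factory.edge(v1, v2, 'friend')
        edge.property('voteFor', targetId)
    }
    return v1
}
```

With the vote.txt sample above, the five lines starting with 30 would collapse into a single line listing 1412, 3352, 5254, 5543 and 7478, and target-only ids such as 1412 would get a line of their own. For 300 GB of input the grouping would have to run as a distributed job rather than in memory; the helper only illustrates the target layout.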

HaiHong Zhou

Apr 15, 2016, 3:20:57 AM
to Aurelius
Hi Daniel,

Thanks for the feedback. I get it. I know how to do it.

Thanks,

Zhou


On Thursday, April 14, 2016 at 10:14:44 PM UTC+8, Daniel Kuppitz wrote:
...

Stephen Mallette

Apr 15, 2016, 5:03:04 AM
to Aurelius
Daniel, since this has come up a few times, perhaps you could add a WARNING: to the BLVP docs about this problem.
