Titan Graph Load Failing

namrata jain

Sep 18, 2015, 10:32:56 AM
to aureliu...@googlegroups.com

We are trying to load data into Titan using Titan-Hadoop (Faunus), with GraphSONInputFormat and TitanHBaseOutputFormat on Titan 0.5.4.

We have 17 data nodes, on which 51 mappers/reducers run concurrently.

With 128M vertices and 60M edges, the load was successful.

When we tried the same thing with 250M vertices (twice), it failed with the following exception during the reduce phase (addEdge):

20:07:23 INFO  org.apache.hadoop.mapreduce.Job  - Task Id : attempt_1441096735236_0046_r_000020_2, Status : FAILED

Error: java.io.IOException: Cannot modify unmodifiable vertex: v[27246028821166388]

        at com.thinkaurelius.titan.hadoop.formats.util.TitanGraphOutputMapReduce$EdgeMap.map(TitanGraphOutputMapReduce.java:398)

        at com.thinkaurelius.titan.hadoop.formats.util.TitanGraphOutputMapReduce$EdgeMap.map(TitanGraphOutputMapReduce.java:368)

        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)

        at org.apache.hadoop.mapreduce.lib.chain.Chain$MapRunner.run(Chain.java:321)

Caused by: com.thinkaurelius.titan.core.SchemaViolationException: Cannot modify unmodifiable vertex: v[27246028821166388]

        at com.thinkaurelius.titan.graphdb.transaction.StandardTitanTx.verifyWriteAccess(StandardTitanTx.java:265)

        at com.thinkaurelius.titan.graphdb.transaction.StandardTitanTx.addEdge(StandardTitanTx.java:652)

        at com.thinkaurelius.titan.graphdb.blueprints.TitanBlueprintsTransaction.addEdge(TitanBlueprintsTransaction.java:132)

        at com.thinkaurelius.titan.graphdb.vertices.AbstractVertex.addEdge(AbstractVertex.java:255)

        at com.thinkaurelius.titan.hadoop.formats.util.TitanGraphOutputMapReduce.getCreateOrDeleteRelation(TitanGraphOutputMapReduce.java:252)

        at com.thinkaurelius.titan.hadoop.formats.util.TitanGraphOutputMapReduce.access$000(TitanGraphOutputMapReduce.java:42)

        at com.thinkaurelius.titan.hadoop.formats.util.TitanGraphOutputMapReduce$EdgeMap.getCreateOrDeleteEdge(TitanGraphOutputMapReduce.java:426)

        at com.thinkaurelius.titan.hadoop.formats.util.TitanGraphOutputMapReduce$EdgeMap.map(TitanGraphOutputMapReduce.java:393)

Looking at the Titan code, we found that this error is thrown in the function below:

private void verifyWriteAccess(TitanVertex... vertices) {
    if (config.isReadOnly())
        throw new UnsupportedOperationException("Cannot create new entities in read-only transaction");
    for (TitanVertex v : vertices) {
        if (v.hasId() && idInspector.isUnmodifiableVertex(v.getLongId()) && !v.isNew())
            throw new SchemaViolationException("Cannot modify unmodifiable vertex: "+v);
    }
    verifyAccess(vertices);
}

We cannot work out what is wrong here. Is it failing because a vertex with that long ID already exists in the DB? If so, we would expect to find it, but looking the vertex up in Gremlin returns null:

gremlin> g.v(27246028821166388)

==>null


Regards,

Namrata

namrata jain

Sep 23, 2015, 9:58:48 AM
to aureliu...@googlegroups.com
Please reply if anyone has faced this issue before or knows the root cause of this error.
We are stuck here.

TIA,
Namrata

David

Sep 25, 2015, 3:42:55 PM
to Aurelius
It's been a while since I was developing around Faunus, and the details have faded.
But my first suspicion is with your data.

One suggestion is to hand-create a small data set, or use the Graph of the Gods
example, run it through your environment, and see if that succeeds.

If it does, then edit that same file and make it invalid by having an adjacency
reference point to a vertex that is not in the file.  Run that and see if it
matches the errors you are seeing with your large data set.

namrata jain

Sep 28, 2015, 5:05:25 AM
to Aurelius
Thanks, David, for the reply. We tried the same on a small dataset.

Below is the GraphSON for which it fails:


{"vertex_name":"account","_outE":[{"_id":"1775853609619300262","_label":"uses_device","_inV":"5321925535205852"}],"_id":"29189114749474227","value":"ACC1","_inE":[],"vertex_id":"account_ACC1"}
{"vertex_name":"device_id","_outE":[],"_id":"5321925535205852","value":"DID1","_inE":[{"_outV":"29189114749474227","_id":"1775853609619300262","_label":"uses_device"}],"vertex_id":"device_DID1"}



with the exception:

Error: java.io.IOException: Cannot modify unmodifiable vertex: v[29189114749474228] outVertex:num_properties:0:key:Id:value:29189114749474228 inVertex:num_properties:3:key:vertex_name:value:device_id:key:value:value:DID1:key:vertex_id:value:device_DID1:key:Id:value:163856464 label:name:uses_device
        at com.thinkaurelius.titan.hadoop.formats.util.TitanGraphOutputMapReduce$EdgeMap.map(TitanGraphOutputMapReduce.java:398)
        at com.thinkaurelius.titan.hadoop.formats.util.TitanGraphOutputMapReduce$EdgeMap.map(TitanGraphOutputMapReduce.java:368)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
        at org.apache.hadoop.mapreduce.lib.chain.Chain$MapRunner.run(Chain.java:321)
Caused by: com.thinkaurelius.titan.core.SchemaViolationException: Cannot modify unmodifiable vertex: v[29189114749474228] outVertex:num_properties:0:key:Id:value:29189114749474228 inVertex:num_properties:3:key:vertex_name:value:device_id:key:value:value:DID1:key:vertex_id:value:device_DID1:key:Id:value:163856464 label:name:uses_device
        at com.thinkaurelius.titan.graphdb.transaction.StandardTitanTx.addEdge(StandardTitanTx.java:679)
        at com.thinkaurelius.titan.graphdb.blueprints.TitanBlueprintsTransaction.addEdge(TitanBlueprintsTransaction.java:132)
        at com.thinkaurelius.titan.graphdb.vertices.AbstractVertex.addEdge(AbstractVertex.java:255)
        at com.thinkaurelius.titan.hadoop.formats.util.TitanGraphOutputMapReduce.getCreateOrDeleteRelation(TitanGraphOutputMapReduce.java:252)
        at com.thinkaurelius.titan.hadoop.formats.util.TitanGraphOutputMapReduce.access$000(TitanGraphOutputMapReduce.java:42)
        at com.thinkaurelius.titan.hadoop.formats.util.TitanGraphOutputMapReduce$EdgeMap.getCreateOrDeleteEdge(TitanGraphOutputMapReduce.java:426)
        at com.thinkaurelius.titan.hadoop.formats.util.TitanGraphOutputMapReduce$EdgeMap.map(TitanGraphOutputMapReduce.java:393)
        ... 3 more


Please note that the vertex ID mentioned in the exception (29189114749474228) does not exist in our GraphSON. We are not sure where this ID is being generated.


If we change just the IDs in the same GraphSON, it works.

The modified GraphSON looks like this:


{"vertex_name":"account","_outE":[{"_id":"1","_label":"uses_device","_inV":"2"}],"_id":"3","value":"ACC1","_inE":[],"vertex_id":"account_ACC1"}
{"vertex_name":"device_id","_outE":[],"_id":"2","value":"DID1","_inE":[{"_outV":"3","_id":"1","_label":"uses_device"}],"vertex_id":"device_DID1"}


Please suggest. We get the same exception for multiple edges across our entire GraphSON.


TIA,

Namrata

Anish Bansal

Sep 28, 2015, 10:33:37 AM
to Aurelius
Hi David,

Please note that we changed titan/titan-core/src/main/java/com/thinkaurelius/titan/graphdb/transaction/StandardTitanTx.java to add more logging. Please find the modified file attached.

Method changed: addEdge at line 651

This file was checked out from the Titan GitHub repository (commit 141cd4c), corresponding to the 0.5.4 release.

cheers,
Anish
StandardTitanTx.java

Anish Bansal

Dec 31, 2015, 6:02:39 AM
to Aurelius
Hi guys,

We had sidelined this issue for some time. Now that we have revisited it, we have found the real cause. It is a bug in a dependency, the org.codehaus.jettison:jettison library (1.3.3), which returned the wrong long value when reading a JSON field (example below). We fixed this by including a later version, 1.3.7, of the same library.

Example JSON: {"_outV":"29189114749474227","_id":"1775853609619300262","_label":"uses_device"}
When we read _outV as a long, it returns 29189114749474228 instead of 29189114749474227.

Attaching test file for the same.
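
For anyone who wants to reason about the symptom without the attachment: the off-by-one above is consistent with the ID passing through a double on its way to a long, since doubles cannot represent every integer above 2^53. That mechanism is an assumption on our part; we have not confirmed it is what jettison 1.3.3 does internally. The sketch below uses only the JDK (no jettison calls), and the class and method names are made up for illustration.

```java
public class LongPrecisionDemo {

    /** Parses a decimal string directly as a 64-bit integer (exact). */
    static long parseExact(String raw) {
        return Long.parseLong(raw);
    }

    /**
     * Parses the string as a double first and then narrows to long.
     * Assumed stand-in for a lossy JSON number conversion; values above
     * 2^53 may be rounded to the nearest representable double.
     */
    static long parseViaDouble(String raw) {
        return (long) Double.parseDouble(raw);
    }

    /** True if the ID comes back unchanged from the double round trip. */
    static boolean survivesDoubleRoundTrip(String raw) {
        return parseExact(raw) == parseViaDouble(raw);
    }

    public static void main(String[] args) {
        // IDs taken from the GraphSON samples earlier in this thread.
        String[] ids = {"29189114749474227", "5321925535205852", "3"};
        for (String id : ids) {
            System.out.println(id + " -> " + parseViaDouble(id)
                    + " (survives: " + survivesDoubleRoundTrip(id) + ")");
        }
    }
}
```

With this conversion, 29189114749474227 comes back as 29189114749474228, the same value that appears in the exception, while the small IDs from the modified GraphSON (1, 2, 3) are unaffected. A check like survivesDoubleRoundTrip could be run over the _id, _inV and _outV values of an input file to see which IDs are at risk before upgrading the library.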

cheers,
Anish
Test.java