Best way to load exported medium-sized graphs

Carlos Bobed

Apr 13, 2021, 3:42:44 AM
to Gremlin-users
Hi all,

As suggested, while I wait for an answer on whether I can share the graph, I am opening a separate thread on this particular subject.

I'm trying to load a GraphML export into JanusGraph 0.5.3. The graph is not especially large (1.68M nodes, 8.8M edges). However, at some point the TinkerPop layer reports that a batch is too large and the load crashes. I suspect there is an edge carrying far too much information, but finding it is difficult.

I've tried splitting the GraphML file into non-overlapping partitions, but JanusGraph does not seem to honor the IDs (I'm not sure whether they are dropped at the TinkerPop or the JanusGraph level), so when I reach the files containing the split-off edges, it inserts new nodes for both source and target.

Has anyone faced this GraphML loading problem before? Should I try to get a GraphSON version of my exported graph so that everything runs more smoothly?

What are the recommendations for handling this kind of load? I'd like to avoid implementing my own graph loader at this point.

Thank you very much in advance,

Best,

Carlos Bobed

Stephen Mallette

Apr 15, 2021, 7:34:52 AM
to gremli...@googlegroups.com
>  However, I reach a point where Tinkerpop layer tells me that the Batch is too large and it crashes 

What is the actual error?

--
You received this message because you are subscribed to the Google Groups "Gremlin-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to gremlin-user...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/gremlin-users/e99af74b-40be-4e4d-a8ca-b391e6cbc929n%40googlegroups.com.

Carlos Bobed

Apr 15, 2021, 7:53:55 AM
to gremli...@googlegroups.com
Hi Stephen, 

On Thu, Apr 15, 2021 at 13:34, Stephen Mallette (<spmal...@gmail.com>) wrote:
>>  However, I reach a point where Tinkerpop layer tells me that the Batch is too large and it crashes 
>
> What is the actual error?

Digging in the logs, I've seen that TinkerPop raises an exception stating that a batch was too large, but it seems to be propagated from the Cassandra backend in JanusGraph.
Loading a DBpedia2016 export in GraphML has not raised any problem, so I'm trying to find the culprit among the edges. It seems that some edges are bigger than what is accepted in a batch, but I cannot find the particular parameter to adjust.

Thank you, 

Carlos 
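
One way to hunt for oversized edges like the ones described above is to stream through the GraphML file and rank edges by how much property data they carry. The following is only a minimal sketch under stated assumptions, not something taken from the thread: `largest_edges` and `local_name` are hypothetical helper names, edge properties are assumed to be stored as `<data>` children of each `<edge>` (as in standard GraphML exports), and the summed property text is just a rough proxy for the size of the mutation that Cassandra eventually sees.

```python
import heapq
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an XML namespace prefix such as '{http://...}edge' down to 'edge'."""
    return tag.rsplit("}", 1)[-1]


def largest_edges(source, top_n=10):
    """Return up to top_n (size_in_bytes, edge_id, source, target) tuples, largest first.

    `source` is a file name or a binary file object containing GraphML.
    The size is the total UTF-8 length of the text of the edge's child elements.
    """
    heap = []  # min-heap holding the top_n largest edges seen so far
    for _event, elem in ET.iterparse(source, events=("end",)):
        name = local_name(elem.tag)
        if name == "edge":
            size = sum(len((child.text or "").encode("utf-8")) for child in elem)
            entry = (size, elem.get("id", ""), elem.get("source", ""), elem.get("target", ""))
            if len(heap) < top_n:
                heapq.heappush(heap, entry)
            else:
                heapq.heappushpop(heap, entry)
        if name in ("node", "edge"):
            # Drop children and text of finished elements to keep memory use down.
            elem.clear()
    return sorted(heap, reverse=True)
```

Calling `largest_edges("export.graphml", top_n=20)` (the file name is a placeholder) would list the twenty heaviest edges together with their endpoints, which may be enough to decide whether a handful of edges are responsible for the oversized batches.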
 

Stephen Mallette

Apr 15, 2021, 9:42:46 AM
to gremli...@googlegroups.com
Yeah, I would expect that sort of error to be JanusGraph/Cassandra specific, since batching in the TinkerPop context, as it relates to the GraphMLReader, refers to the number of mutations made before the transaction is committed. While you can control that setting, I don't think it will specifically raise the sort of error you are describing. It might help JanusGraph experts if you supplied the exact error message and stack trace you're seeing. You may also want to take this question to the JanusGraph mailing list.

Carlos Bobed

Apr 15, 2021, 9:52:10 AM
to gremli...@googlegroups.com
Hi again, 

Yes, I did post it there once I learned that the problem comes from the JanusGraph side. According to the logs, while loading this graph the Cassandra driver warns almost constantly that batches are over the limit of 5120 (I haven't yet found where to change that limit).

Thank you very much for your answer! 

Best, 

Carlos 
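
For reference, 5120 is 5 KB expressed in bytes, which matches the default of Cassandra's `batch_size_warn_threshold_in_kb` setting in `cassandra.yaml`. To my knowledge a separate setting, `batch_size_fail_threshold_in_kb` (default 50), is the point at which a batch is rejected outright. The thread does not confirm that this cluster uses those defaults, so the sketch below is only a simplified model of the two thresholds; `classify_batch` is a hypothetical name used for illustration and not part of any library.

```python
# Assumed defaults from cassandra.yaml; the actual cluster may differ.
WARN_THRESHOLD_KB = 5    # batch_size_warn_threshold_in_kb
FAIL_THRESHOLD_KB = 50   # batch_size_fail_threshold_in_kb


def classify_batch(size_bytes, warn_kb=WARN_THRESHOLD_KB, fail_kb=FAIL_THRESHOLD_KB):
    """Return 'ok', 'warn', or 'fail' for a batch of the given size in bytes.

    Simplified model: a batch larger than the warn threshold is logged,
    and one larger than the fail threshold is rejected.
    """
    if size_bytes > fail_kb * 1024:
        return "fail"
    if size_bytes > warn_kb * 1024:
        return "warn"
    return "ok"


# 5 KB in bytes, the figure quoted from the logs.
warn_limit_bytes = WARN_THRESHOLD_KB * 1024
```

Under these assumptions, constant warnings mean most batches exceed 5 KB, while the crash would correspond to at least one batch exceeding the larger fail threshold. Raising either value in `cassandra.yaml` is one option; shrinking the batches sent by the loader is another.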
