Vertex with id already exists: on multiple graphml reads

Fred Eisele

Jan 8, 2021, 2:56:07 PM
to Gremlin-users
I wish to combine several graphml graphs into a single graph.
These graphml files are created using TinkerGraph.
As a result, the vertex ids are unique within each file but are reused across files.
When I read these files into JanusGraph there is no problem as it generates new node ids.
When I read these files into a TinkerGraph object I get the error...
```
Vertex with id already exists: 0
```
When I change the <node id="0"> in a graphml file to some other value it simply moves the collision to another node.
Is there a way to change this default behavior to generate new unique ids on a read?

Here is a sample of the type of commands being used.
```
conf = new BaseConfiguration()
conf.setProperty("gremlin.tinkergraph.defaultVertexPropertyCardinality","list")
conf.setProperty("gremlin.tinkergraph.vertexIdManager","LONG")
graph1 = TinkerGraph.open(conf)
g1 = traversal().withEmbedded(graph1)

// read each file through the traversal source created above (g1)
g1.io('/tmp/base.graphml').with(IO.reader, IO.graphml).read().iterate()
g1.io('/tmp/foo.graphml').with(IO.reader, IO.graphml).read().iterate()
g1.io('/tmp/bar.graphml').with(IO.reader, IO.graphml).read().iterate()
```
The configuration shown here is my attempt to find a vertexIdManager that generates new unique ids. My original attempt used no configuration at all.

Fred Eisele

Jan 8, 2021, 3:42:52 PM
to Gremlin-users
Since I control how the graphml files are created, I can configure the writer so that it produces UUIDs:
conf.setProperty("gremlin.tinkergraph.vertexIdManager","UUID")
conf.setProperty("gremlin.tinkergraph.edgeIdManager","UUID")
When the files are read back, either "UUID" or "ANY" may be used as the id manager.
I would still be interested in knowing whether there is a way to force the reader to generate unique "LONG" ids.

Stephen Mallette

Jan 11, 2021, 12:01:32 PM
to gremli...@googlegroups.com
GraphMLReader calls Graph.Features.willAllowId() to determine whether the graph can consume the identifier given to it. If it can, the reader passes the id as the value paired with T.id in addVertex(). If it cannot, those arguments are omitted. This is the path that JanusGraph takes, as it doesn't allow id assignment. To get TinkerGraph to do the same, you just need willAllowId to return false. TinkerGraph consults the IdManager to determine what that method returns, so you simply need to create your own IdManager implementation that does so. I'd basically copy the LongIdManager and have allow(id) return false by default.
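
The mechanism described above can be modeled in plain Java with no TinkerPop dependency. This is only a sketch under stated assumptions: `ToyIdManager`, `ToyLongIdManager`, `ToyGraph`, and `readVertex` are hypothetical names made up for illustration, loosely modeled on the getNextId/convert/allow roles of an id manager, and are not the TinkerPop classes themselves. It shows how an `allow(id)` that returns false makes the reader-side logic drop the id from the file and ask the id manager for a fresh one.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for an id manager (NOT the TinkerPop interface).
interface ToyIdManager {
    Long getNextId(ToyGraph graph);   // generate a fresh, unused id
    Long convert(Object id);          // coerce a supplied id to a Long
    boolean allow(Object id);         // may a caller supply this id?
}

final class ToyLongIdManager implements ToyIdManager {
    private final boolean acceptSuppliedIds;
    private final AtomicLong counter = new AtomicLong(-1L);

    ToyLongIdManager(boolean acceptSuppliedIds) {
        this.acceptSuppliedIds = acceptSuppliedIds;
    }

    @Override
    public Long getNextId(ToyGraph graph) {
        long id;
        // Skip any value that is already taken in the graph.
        do { id = counter.incrementAndGet(); } while (graph.containsVertex(id));
        return id;
    }

    @Override
    public Long convert(Object id) {
        if (id instanceof Number) return ((Number) id).longValue();
        if (id instanceof String) return Long.parseLong((String) id);
        throw new IllegalArgumentException("Expected a numeric id but got: " + id);
    }

    @Override
    public boolean allow(Object id) {
        // With acceptSuppliedIds == false this always returns false,
        // which is the "copy the long manager and reject ids" idea.
        return acceptSuppliedIds && (id instanceof Number || id instanceof String);
    }
}

// Hypothetical in-memory graph, just enough to show the collision.
final class ToyGraph {
    private final Map<Long, String> vertices = new HashMap<>();
    private final ToyIdManager idManager;

    ToyGraph(ToyIdManager idManager) { this.idManager = idManager; }

    boolean containsVertex(long id) { return vertices.containsKey(id); }

    int vertexCount() { return vertices.size(); }

    // Mimics the reader-side decision: use the id from the file only if
    // the graph allows it, otherwise let the id manager generate one.
    Long readVertex(Object fileId, String name) {
        Long id = idManager.allow(fileId)
                ? idManager.convert(fileId)
                : idManager.getNextId(this);
        if (vertices.containsKey(id)) {
            throw new IllegalArgumentException("Vertex with id already exists: " + id);
        }
        vertices.put(id, name);
        return id;
    }
}
```

With `new ToyLongIdManager(true)`, reading a vertex with id "0" twice (as happens when two files both contain `<node id="0">`) throws the collision error; with `new ToyLongIdManager(false)`, both reads succeed and receive distinct generated ids. For TinkerGraph itself, my understanding is that the `gremlin.tinkergraph.vertexIdManager` setting can also name a custom IdManager class on the classpath, but verify that against the documentation for your TinkerPop version.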



