[gremlin-python] Simple addV + addE pattern not working (async execution?)…


Dave vU

Oct 16, 2016, 8:06:46 AM
to Gremlin-users
Hi,

Sorry if I'm missing something obvious, but it seems that, when executing queries through the Python-Gremlin interface, there is some race condition in the graph state, causing vertices just created to not show in the next query:

new_block_id = self.g.addV().property('foo', bar).id().next()
edges = self.g.V(new_block_id).as_('new_block').V().has(T.id, P.within(children_vertices)).addE().from_('new_block').toList()

… will fail (randomly): even though the vertex gets created and an ID is returned, querying with that ID does not return anything…

Debugging further, I found that running this:

new_block_id = self.g.addV().property('foo', bar).id().next()
print("ID created:" + str(new_block_id))
print("ID queried:" + str(self.g.V(new_block_id).toList()))

gives: 

ID created: 1234
ID queried: []

So, clearly the vertex exists in the graph, but for some reason, is not yet visible to the traversal in the next query…

Is there some asynchronous execution issue here that I am missing? (it would appear so, but then how can an ID value be returned?)

What is the recommended way to deal with this in gremlin-python?

Robert Dale

Oct 16, 2016, 8:46:00 AM
to gremli...@googlegroups.com
Which graph provider are you using?
> --
> You received this message because you are subscribed to the Google Groups
> "Gremlin-users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to gremlin-user...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/gremlin-users/3da50e3c-3334-46f2-902e-3c1af85a4822%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.



--
Robert Dale

Dave vU

Oct 16, 2016, 8:54:48 AM
to Gremlin-users
All this is using neo4j as a provider… But should such a basic issue be provider-dependent?

BTW, digging further, I realised that the following query in gremlin console

g.addV().property('foo', 'bar').as('new_block').V().has('id', within([123, 456])).addE().from('new_block').select('new_block').id().next()

also gives me an exception (basically, next() fails because nothing is selected). Both vertex and the edges are created, and running the same query with toList() works fine, but returns an empty list.

This clearly seems linked with the fact that the graph hasn't been updated by the time the select() step is run… The closest to an answer to that problem I can find in the doc, is the barrier() step, but it does not seem to apply here.

Anything I am missing?

-- 
Dave

Robert Dale

Oct 16, 2016, 9:40:15 AM
to gremli...@googlegroups.com

Not sure how you're calling addE() without a label. Maybe the python api lets you do this but the groovy/java api does not.

I'll assume that you're running neo4j embedded in a remote gremlin server.  BTW, which version of gremlin and neo4j are you running?

I haven't experienced your problem where a newly created vertex can not be immediately queried and I've loaded lots of data like this.

One thing I noticed is that in your first traversal, you use T.id.  In later traversals, you use 'id'. These are two very distinctly different properties.  Make sure you're using the one you expect.

This traversal works as expected for me.  First the traversal must meet the has() requirement so let's add the expected vertex with 'id' of 123.

```
gremlin> graph = EmptyGraph.instance()
==>emptygraph[empty]
gremlin> g = graph.traversal().withRemote('conf/remote-graph.properties')
==>graphtraversalsource[emptygraph[empty], standard]
gremlin> g.addV().property('id',123)
==>v[0]
gremlin> g.addV().property('foo', 'bar').as('new_block').V().has('id', within([123, 456])).addE('my-label').from('new_block').select('new_block').id().next()
==>1
```




--
Robert Dale

Dave vU

Oct 16, 2016, 10:21:26 AM
to Gremlin-users
> Not sure how you're calling addE() without a label. Maybe the python api lets you do this but the groovy/java api does not.

All apologies: I was overzealous when cleaning my query of non-essential stuff… The original query definitely had an edge label (and failed).

> One thing I noticed is that in your first traversal, you use T.id.  In later traversals, you use 'id'. These are two very distinctly different properties.  Make sure you're using the one you expect.

Indeed! Changing 'id' to T.id solves the gremlin console issue (I am not very clear on what the difference is, based on the doc)…
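
The distinction can be sketched without a graph database at all: `T.id` refers to the identifier the graph assigns to the element itself, whereas `'id'` is an ordinary property key that happens to be named "id". The toy model below is plain Python, not TinkerPop code; `Vertex`, `has_element_id` and `has_property` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Vertex:
    element_id: int  # what T.id refers to (assigned by the graph)
    properties: dict = field(default_factory=dict)  # user-defined key/value pairs


def has_element_id(vertices, wanted):
    """Rough analogue of has(T.id, within(wanted)): match on the element identifier."""
    return [v for v in vertices if v.element_id in wanted]


def has_property(vertices, key, wanted):
    """Rough analogue of has(key, within(wanted)): match on a property value."""
    return [v for v in vertices if v.properties.get(key) in wanted]


# Mirrors the console session: vertex 0 carries a *property* named 'id' set to 123.
vertices = [Vertex(0, {"id": 123}), Vertex(1, {"foo": "bar"})]

by_property = has_property(vertices, "id", [123, 456])
by_element_id = has_element_id(vertices, [123, 456])
print([v.element_id for v in by_property])
print([v.element_id for v in by_element_id])
```

In this model the property lookup finds vertex 0, while the element-id lookup finds nothing because no vertex was assigned the identifier 123 or 456. The console query behaves the other way round when the values passed to `within()` are real element ids and no vertex has an `'id'` property.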

Unfortunately, this was only an attempt at reducing the issue to an easily-reproducible console version. The python version still fails and is not related to an 'id'/T.id nomenclature (as far as I can tell).

> I'll assume that you're running neo4j embedded in a remote gremlin server.  BTW, which version of gremlin and neo4j are you running?

The very latest install of TinkerPop and the neo4j plugin, as of a week ago (not sure how to get the version string, but as far as I can tell, there hasn't been a new release since then).

> I haven't experienced your problem where a newly created vertex can not be immediately queried and I've loaded lots of data like this.

The issue is very much random and looks like a race condition (when creation and traversal are in two separate queries). Given enough iterations (~10), the following python code will *always* produce an exception for me:

new_block = self.g.addV('block').property('hash', hash_str).property('txt', txt_str).id().next()
test = self.g.V(new_block).id().next()
# test = self.g.V().has('block', 'hash', hash_str).id().next()

This happens whether I use the ID or an indexed property look-up (the commented line).

On the other hand, if I yield a little between creation and traversal:

new_block = self.g.addV('block').property('hash', hash_str).property('txt', txt_str).id().next()
time.sleep(0.1)
test = self.g.V(new_block).id().next()
# test = self.g.V().has('block', 'hash', hash_str).id().next()

The code works fine.
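
The fixed sleep above could be generalized into a bounded retry, so the caller waits only as long as needed. This is a minimal sketch of a workaround, not a fix for the underlying problem; `wait_until_visible` and its parameters are hypothetical names and not part of gremlin-python. The lookup is passed in as a zero-argument callable, and a stand-in is used here in place of a real traversal.

```python
import time


def wait_until_visible(lookup, attempts=10, delay=0.05):
    """Call lookup() until it returns a non-empty result or attempts run out.

    lookup is any zero-argument callable; with a traversal source g it might be
    lambda: g.V(new_block).id().toList()   (assumption, not executed here).
    """
    for attempt in range(attempts):
        result = lookup()
        if result:
            return result
        if attempt < attempts - 1:
            time.sleep(delay)  # pause before asking again
    raise LookupError("vertex not visible after %d attempts" % attempts)


# Stand-in for the real traversal: returns [] twice, then the id.
_responses = iter([[], [], [1234]])
found = wait_until_visible(lambda: next(_responses), delay=0.01)
print(found)
```

Using `toList()` rather than `next()` in the lookup would keep an empty result from raising, so the helper can decide whether to retry.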

Robert Dale

Oct 16, 2016, 10:42:29 AM
to gremli...@googlegroups.com
Yes, I can reproduce in 3.2.2.  Can not reproduce in 3.2.1. Created issue https://issues.apache.org/jira/browse/TINKERPOP-1511





--
Robert Dale

Dave vU

Oct 16, 2016, 10:58:35 AM
to Gremlin-users
Great!

(well, not so great: but I'm happy to see I wasn't completely crazy or clueless on that one)

Thanks for helping me reproduce and for filing the report!

Surprising that such a bug in such a fundamental use-pattern made it through unnoticed… 