Gremlin+Python Unable to add vertex from inside a loop

Debasish Kanhar

Sep 11, 2017, 6:34:03 AM
to Gremlin-users
Hi all,

I have a pandas DataFrame that I want to load into JanusGraph using gremlin-python. The head of the DataFrame looks like this:

node_id    name
1          Indiana Jones
2          Destiney Roob
3          May Hoppe
4          Justine Schroeder
5          Kimberly Sipes

The DataFrame contains 100 unique user names. I'm using gremlin-python to push the data to the graph, and every vertex gets the label 'user'. After the insertion is done, querying g.V().hasLabel('user').count() in the Gremlin shell should return 100 (the number of rows in the DataFrame), but I get a different number. Why is that?

I tried setting cache=false in the settings file, but the issue persists. Here is the code:

keyname = 'name'
label = 'user'
for _, row in df.iterrows():
    keyvalue = row[keyname]
    if label == 'user':
        try:
            print("Element Exists: {}. ID {}".format(self.g.V().hasLabel(label).has(keyname+"_{}".format(k), keyvalue).next(), _))
        except:
            print("No element fount in graph. ID: ", _)
    ret = self.g.addV(label).property(keyname+"_{}".format(k), keyvalue).next()
    time.sleep(0.5)
    if label == 'user':
        try:
            print("Element Added: {}. ID {}".format(self.g.V().hasLabel(label).has(keyname+"_{}".format(k), keyvalue).next(), _))
        except:
            print("Exception! Element couldn't be added. No element fount in graph. ID: ", _)

What should logically happen in each iteration is:
First, it should print "No element found for ID: somenumber".
Then it should add the vertex to the graph.
Then it should print the vertex that was added.

If a vertex was already found, it prints "Element Exists", and then the second print statement prints "Element Added".
But for elements that don't exist yet (the first print statement says "No element found"), the second print statement should show the vertex that was just added. Instead, it prints "Exception! Element couldn't be added".

It looks like some vertices are not getting added at all, yet no exception is thrown either. This is strange behaviour. Does anyone have an idea why this is happening?

Thanks
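
One thing that makes a loop like the one above hard to debug is the bare `except:`, which reports every failure as "not found". The following is a minimal sketch of that effect, using a plain Python iterator as a stand-in for a traversal. The `find` helper and its `fail` flag are hypothetical and not part of gremlin_python; the sketch assumes that `.next()` on an empty traversal raises `StopIteration`, as a Python iterator does.

```python
def find(results, fail=False):
    """Hypothetical stand-in for g.V().has(...): returns an iterator over matches."""
    if fail:
        # Simulates a failure unrelated to "no match", e.g. a dropped connection.
        raise RuntimeError("connection lost")
    return iter(results)


def lookup_bare(results, fail=False):
    # Mirrors the loop in the post: any exception is reported as "not found".
    try:
        return next(find(results, fail))
    except:  # noqa: E722 (deliberately bare, to show the problem)
        return "not found"


def lookup_narrow(results, fail=False):
    # Only an exhausted iterator means "not found"; other errors propagate.
    try:
        return next(find(results, fail))
    except StopIteration:
        return "not found"


print(lookup_bare([]))             # genuine miss
print(lookup_bare([], fail=True))  # real error, but reported as a miss
print(lookup_narrow([]))           # genuine miss
try:
    lookup_narrow([], fail=True)
except RuntimeError as exc:
    print("error surfaced:", exc)
```

With the narrower `except StopIteration:`, a message such as "Element couldn't be added" would only appear when the lookup really returned nothing, and any other error would show its own traceback.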


HadoopMarc

Sep 11, 2017, 2:16:17 PM
to Gremlin-users
What is the value of k? It was never assigned in the visible code.

Cheers,    Marc

Debasish Kanhar

Sep 11, 2017, 2:37:49 PM
to gremli...@googlegroups.com
k is just a subscript; I had initialized it to 0. Sorry, I forgot to mention that.

Cheers. :-)


HadoopMarc

Sep 12, 2017, 4:10:01 PM
to Gremlin-users
Oops, that was not very helpful as an investigative question, my bad.

I could reproduce your issue on a pure TinkerPop 3.2.3 gremlin-server + Python console, configured as in the TinkerPop reference docs. I trimmed down your code until it worked as expected, but I did not try to find where the issue actually came from. If you find it, we would love to hear about it, but it does not seem to be a TinkerPop issue.

Cheers,    Marc

Almost original code, addV gets lost somehow:
Python 2.7.12 (default, Nov 19 2016, 06:48:10)
[GCC 5.4.0 20160609] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from gremlin_python.structure.graph import Graph
>>> from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection
>>> graph = Graph()
>>> g = graph.traversal().withRemote(DriverRemoteConnection('ws://localhost:8182/gremlin','g'))
>>> g.V().count().next()
6L

>>> keyname = 'name'
>>> label = 'user'
>>> k = 0
>>> for _, keyvalue in [(0,'hiep'), (0,'hiep'), (1,'hoi'), (2,'hoera')]:
...     if label == 'user':
...         try:
...             print("Element Exists: {}. ID {}".format(g.V().hasLabel(label).has(keyname+"_{}".format(k), keyvalue).next(), _))
...         except:
...             print("No element fount in graph. ID: ", _)
...     ret = g.addV(label).property(keyname+"_{}".format(k), keyvalue).next()
...     if label == 'user':
...         try:
...             print("Element Added: {}. ID {}".format(g.V().hasLabel(label).has(keyname+"_{}".format(k), keyvalue).next(), _))
...         except:
...             print("Exception! Element couldn't be added. No element fount in graph. ID: ", _)
...
('No element fount in graph. ID: ', 0)
Element Added: v[13]. ID 0
Element Exists: v[13]. ID 0
Element Added: v[13]. ID 0
('No element fount in graph. ID: ', 1)
Element Added: v[17]. ID 1
('No element fount in graph. ID: ', 2)
Element Added: v[19]. ID 2
>>>



Trimmed down code, works as expected:
Python 2.7.12 (default, Nov 19 2016, 06:48:10)
[GCC 5.4.0 20160609] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from gremlin_python.structure.graph import Graph
>>> from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection
>>> graph = Graph()
>>> g = graph.traversal().withRemote(DriverRemoteConnection('ws://localhost:8182/gremlin','g'))
>>>
>>> g.V().count().next()
6L

>>> keyname = 'name'
>>> label = 'user'
>>> k = 0
>>> for _, keyvalue in [(0,'hiep'), (0,'hiep'), (1,'hoi'), (2,'hoera')]:
...     if label == 'user':
...         g.V().hasLabel(label).has(keyname+"_{}".format(k), keyvalue).toList()
...     ret = g.addV(label).property(keyname+"_{}".format(k), keyvalue).next()
...     if label == 'user':
...         g.V().hasLabel(label).has(keyname+"_{}".format(k), keyvalue).toList()
...
[]
[v[13]]
[v[13]]
[v[13], v[15]]
[]
[v[17]]
[]
[v[19]]
>>>
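
The two listings differ mainly in how an empty or multi-element result is reported. As a rough illustration (not gremlin_python itself), the sketch below uses a generator as a stand-in for a traversal; the `matches` helper and the `store` records are hypothetical, and the ids simply mirror the ones printed above.

```python
def matches(store, key, value):
    """Hypothetical stand-in for g.V().has(key, value); yields matching records."""
    return (v for v in store if v.get(key) == value)


store = [{"id": 13, "name_0": "hiep"}, {"id": 15, "name_0": "hiep"}]

# toList()-style: an empty result is just an empty list, no exception.
found = list(matches(store, "name_0", "hoi"))
print(found)  # []

# next()-style: an empty result raises StopIteration ...
try:
    next(matches(store, "name_0", "hoi"))
except StopIteration:
    print("no match")

# ... and a successful next() returns only the first of possibly many matches.
first = next(matches(store, "name_0", "hiep"))
all_hits = list(matches(store, "name_0", "hiep"))
print(first["id"], [v["id"] for v in all_hits])  # 13 [13, 15]
```

This is why a `next()`-based check can print the same vertex twice for a duplicated name, while the list-based check shows both vertices.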






HadoopMarc

Sep 13, 2017, 1:45:35 PM
to Gremlin-users
Contrary to what I said above, my first listing did not reproduce Debasish's issue. It only showed that adding a vertex with gremlin-python does work from inside a loop when used with a gremlin-server running the tinkerpop-modern TinkerGraph. It does not rule out problems with JanusGraph. Maybe it is still of some help to you.

My first listing should have been:

Python 2.7.12 (default, Nov 19 2016, 06:48:10)
[GCC 5.4.0 20160609] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from gremlin_python.structure.graph import Graph
>>> from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection
>>> graph = Graph()
>>> g = graph.traversal().withRemote(DriverRemoteConnection('ws://localhost:8182/gremlin','g'))
>>>
>>>
>>> keyname = 'name'
>>> label = 'user'
>>> k = 0
>>> for _, keyvalue in [(0,'hiep'), (0,'hiep'), (1,'hoi'), (2,'hoera')]:
...     if label == 'user':
...         try:
...             print("Element Exists: {}. ID {}".format(g.V().hasLabel(label).has(keyname+"_{}".format(k), keyvalue).toList(), _))
...         except:
...             print("No element fount in graph. ID: ", _)
...     ret = g.addV(label).property(keyname+"_{}".format(k), keyvalue).next()
...     if label == 'user':
...         try:
...             print("Element Added: {}. ID {}".format(g.V().hasLabel(label).has(keyname+"_{}".format(k), keyvalue).toList(), _))
...         except:
...             print("Exception! Element couldn't be added. No element fount in graph. ID: ", _)
...
Element Exists: []. ID 0
Element Added: [v[13]]. ID 0
Element Exists: [v[13]]. ID 0
Element Added: [v[13], v[15]]. ID 0
Element Exists: []. ID 1
Element Added: [v[17]]. ID 1
Element Exists: []. ID 2
Element Added: [v[19]]. ID 2
>>>
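
Note that the loop above adds a vertex on every iteration, which is why the repeated name ends up as [v[13], v[15]]. If the goal is one vertex per name, a check-before-add step is one option. The sketch below shows the idea against an in-memory stand-in; `FakeGraph` and `get_or_create` are hypothetical names for illustration and are not part of gremlin_python. Against a real server, the same idea is often expressed in a single traversal with the fold().coalesce(unfold(), addV(...)) pattern, which is not shown here.

```python
class FakeGraph:
    """Hypothetical in-memory stand-in for a graph; not a gremlin_python class."""

    def __init__(self):
        self.vertices = []
        self._next_id = 1

    def find(self, label, key, value):
        # Return every vertex with the given label and property value.
        return [v for v in self.vertices
                if v["label"] == label and v.get(key) == value]

    def add_vertex(self, label, key, value):
        vertex = {"id": self._next_id, "label": label, key: value}
        self._next_id += 1
        self.vertices.append(vertex)
        return vertex


def get_or_create(graph, label, key, value):
    """Return (vertex, created); add the vertex only if no match exists."""
    existing = graph.find(label, key, value)
    if existing:
        return existing[0], False
    return graph.add_vertex(label, key, value), True


graph = FakeGraph()
rows = [(0, "hiep"), (0, "hiep"), (1, "hoi"), (2, "hoera")]
created = [get_or_create(graph, "user", "name_0", name)[1] for _, name in rows]
print(created)              # [True, False, True, True]
print(len(graph.vertices))  # 3
```

With this shape, the vertex count matches the number of distinct names rather than the number of loop iterations.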



Cheers,   Marc

