Inconsistent query results in gremlin-python


Lesley Deng

Aug 6, 2024, 6:37:58 AM
to Gremlin-users

Hi,

I hope this message finds you well. I am writing to seek assistance with an issue I've encountered while using the gremlin-python library to interact with a graph database.

Issue Description: I've noticed that running the same query consecutively produces inconsistent results. The first execution of the query returns an empty result set, while the subsequent execution returns a non-empty result set. I have not manually updated the graph database between these two queries.

Example: Here's an example of the issue as demonstrated in an IPython interface:

# first time: query returns empty results
In [43]: db_query = transfer(content, g)

In [44]: db_query
Out[44]: [['V'], ['hasLabel', 'AssignStatement'], ['as', 'a1', 'a1__'], ['out', 'lhs'], ['out', '_type'], ['values', '_ipython_canary_method_should_not_exist_']]

In [45]: db_query.limit(10).to_list()
Out[45]: []

# second time: same query gets results
In [46]: db_query =  db_query = transfer(content, g)

# a comparison operator
In [47]: db_query == g.V().has_label('AssignStatement').as_('a1', 'a1__').out('lhs').out('_type')

Out[47]: True

In [48]: db_query.limit(10).to_list()
Out[48]: [v[898], v[30], v[926], v[30], v[30], v[30], v[58], v[30], v[58], v[33]]


Environment Details:

  • gremlin-python version: 3.7.1
  • IPython and gremlin server run on OS: Ubuntu 20.04.6


Could you please help me understand what might be causing this behavior?

Kind regards,

Lesley

Stephen Mallette

Aug 8, 2024, 7:26:11 AM
to gremli...@googlegroups.com
Are you sure those are really the exact same queries? This bytecode:

> In [44]: db_query
Out[44]: [['V'], ['hasLabel', 'AssignStatement'], ['as', 'a1', 'a1__'], ['out', 'lhs'], ['out', '_type'], ['values', '_ipython_canary_method_should_not_exist_']]

seems to indicate that there is a values("_ipython_canary_method_should_not_exist_") on the end, while your second query doesn't seem to have that.
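A quick way to confirm where two traversals diverge is to compare their step lists position by position. The sketch below is only an illustration: the first list is copied from Out[44], while the second is a hand transcription of the literal query from In [47] (an assumption, not output captured from the session), and diff_steps is a hypothetical helper name.

```python
from itertools import zip_longest

# Step list copied from Out[44] in the thread.
first = [['V'], ['hasLabel', 'AssignStatement'], ['as', 'a1', 'a1__'],
         ['out', 'lhs'], ['out', '_type'],
         ['values', '_ipython_canary_method_should_not_exist_']]

# Assumed step list for the literal query in In [47] (transcribed by hand).
second = [['V'], ['hasLabel', 'AssignStatement'], ['as', 'a1', 'a1__'],
          ['out', 'lhs'], ['out', '_type']]


def diff_steps(a, b):
    """Return (index, step_a, step_b) for every position where the lists differ."""
    return [(i, x, y) for i, (x, y) in enumerate(zip_longest(a, b)) if x != y]


for index, left, right in diff_steps(first, second):
    print(index, left, right)
```

With these two lists, the only difference reported is the trailing values step at index 5.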



Lesley Deng

Aug 14, 2024, 6:05:44 AM
to Gremlin-users

Thank you for your response! I have also observed the difference ['values', '_ipython_canary_method_should_not_exist_'], and I am also puzzled as to why In [47] is returning True in this context.

But I am certain that the db_query variable is identical in both instances after the assignment db_query = transfer(content, g). Yet, after undergoing comparison operations, executing the two queries yielded different outcomes.

Stephen Mallette

Aug 15, 2024, 9:48:49 AM
to gremli...@googlegroups.com
It looks like the ['values', '_ipython_canary_method_should_not_exist_'] step isn't relevant:


I don't have an explanation for why you are seeing that behavior. If the code in transfer() creates no variation in output and the data in the database is static, I'm not sure how you can end up this way. Can you share the code in transfer()?

Lesley Deng

Aug 16, 2024, 9:49:11 AM
to Gremlin-users

I am willing to review the details with you, but the code is a little complex. Perhaps we could first identify which specific details would be most helpful to examine?

The transfer() function operates similarly to an interpreter, translating the text-based content into corresponding GraphTraversal steps and returning a GraphTraversal. In this example, the translation process that returns the traversal involves continuously adding steps:

  • traversal = g.V()
  • traversal = g.V().has_label('AssignStatement')
  • traversal = g.V().has_label('AssignStatement').as_('a1', 'a1__')
  • traversal = g.V().has_label('AssignStatement').as_('a1', 'a1__').out('lhs')
  • traversal = g.V().has_label('AssignStatement').as_('a1', 'a1__').out('lhs').out('_type')
  • finally we have db_query = traversal

I look forward to your suggestions on how to proceed.

Stephen Mallette

Aug 19, 2024, 8:12:36 AM
to gremli...@googlegroups.com
Well, if the transfer() function is complicated, perhaps there is something happening there in traversal construction that is not producing the query you think? Have you tried to omit the transfer() function and just send the query literally and directly to see if you can reproduce the problem more simply? Since I can't think of any explanation for this behavior, we'd need a set of reproducible steps to try to debug it. Also, you don't say what graph database you are using beyond Gremlin Server, so I assume that means you have it configured to use TinkerGraph. If you are not using TinkerGraph, I would suggest trying to recreate the problem with TinkerGraph so that the issue can be isolated to just TinkerPop code.
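A minimal sketch of such a direct reproduction, bypassing transfer() entirely, might look like the following. The URL ws://localhost:8182/gremlin, the traversal source name 'g', and the function name run_literal_query are assumptions for illustration; the query itself is the literal one from In [47].

```python
def run_literal_query(url='ws://localhost:8182/gremlin', source='g'):
    """Run the literal query twice and return both result lists.

    The url and source defaults are assumptions; adjust them to match
    the Gremlin Server configuration actually in use.
    """
    # Imported here so the module can be loaded without gremlin-python installed.
    from gremlin_python.process.anonymous_traversal import traversal
    from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection

    connection = DriverRemoteConnection(url, source)
    try:
        g = traversal().with_remote(connection)

        def query():
            # A fresh traversal is built for each execution.
            return (g.V().has_label('AssignStatement')
                     .as_('a1', 'a1__').out('lhs').out('_type')
                     .limit(10).to_list())

        return query(), query()
    finally:
        connection.close()
```

If the two returned lists match when the function is called from a plain script, the inconsistency is more likely to come from how the traversal object is built or handled interactively than from the server or the data.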



Lesley Deng

Dec 4, 2024, 4:20:59 AM
to Gremlin-users
Hello, I now understand where the .values('_ipython_canary_method_should_not_exist_') step comes from and how to handle it. Evaluating db_query at the In [44] prompt to view the query's contents triggers get_real_method() in /IPython/utils/dir2.py(65):
-> canary = getattr(obj, '_ipython_canary_method_should_not_exist_', None)

This in turn triggers the __getattr__(self, key) method of the class GraphTraversal(Traversal). Adding the following check to __getattr__, before it executes return self.values(key), prevents the issue I encountered:
if key in ['_ipython_canary_method_should_not_exist_', '_repr_mimebundle_']:
    return

As for why executing db_query in IPython (note that this command has not yet added .toList() to trigger the actual Gremlin database query operation) leads to the execution of the dir2.py file, I have not yet figured it out.
