Mixed Indexing doesn't work, or Searching doesn't work on Mixed Indexed vertices


Debasish Kanhar

Nov 18, 2017, 3:02:06 AM
to JanusGraph users
Hi All,

So, I created a mixed index on the property node_id for vertices. Before doing that, I deleted all vertices in the graph as follows:

g.V().drop().iterate();
graph.tx().commit();

I added the mixed index as follows, before adding any vertex:
gremlin> mgmt = graph.openManagement();
gremlin> nodeid = mgmt.makePropertyKey('node_id').dataType(String.class).make(); 
gremlin> mgmt.buildIndex('nodeByID', Vertex.class).addKey(nodeid, Mapping.TEXT.asParameter()).buildMixedIndex("search"); 
gremlin> mgmt.commit();

It worked fine, and the indexing warning went away, so I thought that the index creation was successful.

To get the status of the index, I did the following:
gremlin> mgmt = graph.openManagement(); 
gremlin> index = mgmt.getGraphIndex("nodeByID");
gremlin> index.getIndexStatus(mgmt.getPropertyKey('node_id'))
==>ENABLED 

Now, I have a node as follows, which was added after the index was created:
gremlin> g.V().has("NodeProperties", textContains('5401')).has('isSOI', textContains('True')).valueMap()
08:35:33 WARN  org.janusgraph.graphdb.transaction.StandardJanusGraphTx  - Query requires iterating over all vertices [(NodeProperties CONTAINS 5401 AND isSOI CONTAINS True)]. For better performance, use indexes
==>[NodeProperties:[b'{'SOI': [5401], 'annotations': {}}'],type:[b'unstructured'],isSOI:[b'True'],node_id:[b'f601bfa5-c8de-49a7-b51b-7b15376099be']]



As can be seen, node_id has the value "[b'f601bfa5-c8de-49a7-b51b-7b15376099be']". So searching on that value should work. Shouldn't this query return the node?
g.V().has("NodeProperties", textContains('5401')).has('node_id', textContains('f601bfa5-c8de-49a7-b51b-7b15376099be')); 


When I run the query above, I expect the same node to be returned, but that doesn't happen. Why is that? Should I reindex my nodes each time after I push, or is there something I'm missing?

Thanks


tpr...@gmail.com

Nov 18, 2017, 11:03:23 AM
to JanusGraph users
Which version do you use?
Which mixed index do you use?
Did you try with the String mapping?

Robert Dale

Nov 18, 2017, 7:07:18 PM
to Debasish Kanhar, JanusGraph users
Your results look a bit strange.  Not sure what the "b'...'" is.  And what is NodeProperties, a JSON string?  In any case, I was unable to reproduce using the information provided. It would help if you could post complete steps to reproduce. 

Although, it's not clear why you want full-text searching.  I think you may want exact-match string searching in which case you can use a composite index.

Robert Dale
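
The difference between full-text and exact-match searching can be sketched in a few lines of Python. This is only a rough model, assuming a word-based analyzer that lower-cases text and splits it on non-alphanumeric characters; the actual tokenization depends on the index backend and its configuration, and the function names below are hypothetical.

```python
import re

def tokenize(text):
    """Lower-case the text and split it into alphanumeric words (assumed analyzer)."""
    return re.findall(r"[a-z0-9]+", text.lower())

def text_contains(value, query):
    """Word-based match: every query word must equal a whole word of the value."""
    words = set(tokenize(value))
    query_words = tokenize(query)
    return bool(query_words) and all(w in words for w in query_words)

def exact_match(value, query):
    """Exact string equality, the kind of lookup a composite index serves."""
    return value == query

node_id = "f601bfa5-c8de-49a7-b51b-7b15376099be"

print(tokenize(node_id))                    # the UUID is split at the hyphens
print(text_contains(node_id, "f601bfa5"))   # whole word: matches
print(text_contains(node_id, "f601"))       # fragment of a word: no match
print(exact_match(node_id, "f601bfa5-c8de-49a7-b51b-7b15376099be"))
```

Under this model a full-text predicate matches whole words only, so a partial fragment of a word is not found, while an identifier lookup is a plain equality test.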


Debasish Kanhar

Nov 19, 2017, 2:09:17 AM
to JanusGraph users
Hi Robert,

Yes, all my properties, including NodeProperties, are stringified objects. To be specific, they are byte-encoded strings. I do this to take care of special characters while pushing. I don't think that should be an issue, as I want to do a partial search on the property, which is why I used textContains() on the property node_id.
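
One way the b'...' wrapper seen in the valueMap() output can arise is sketched below. It assumes the values are turned into text by calling str() on a Python 3 bytes object; that is a guess about the client code, and the variable names are hypothetical.

```python
raw = "f601bfa5-c8de-49a7-b51b-7b15376099be"

encoded = raw.encode("utf-8")                 # a bytes object

# str() on bytes returns the repr text, so the leading b and the quotes
# become part of the string that would be stored in the graph.
stored_via_str = str(encoded)

# decode() returns the original text without any wrapper.
stored_via_decode = encoded.decode("utf-8")

print(stored_via_str)
print(stored_via_decode)
print(stored_via_str == raw, stored_via_decode == raw)
```

If the stored value carries the wrapper, an exact comparison against the bare identifier fails, even though the identifier is still visible inside the stored string.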

Please note some extra specifics about the problem:

I use Python as the client, and push data using a GLV.

My workflow for pushing data, including the step where searching on node_id fails, is as follows:

search_case_query = "g.V().has('NodeProperties', Text.textContains('5401')).count()"
num_nodes = dbCON.query(search_case_query)

if len(num_nodes) > 0:
    drop_query = "g.V().has('NodeProperties', Text.textContains('5401')).drop().iterate(); graph.tx().commit()"
    dbCON.query(drop_query)
else:
    print("Not present. Going to add!")

query = self.build_node_addition_query(node)
addedNode = dbCON.query(query)

sourceGraphNode = dbCON.query("g.V().has('node_id', "
                              "textContains('{}')).id()".format(source_id))
targetGraphNode = dbCON.query(
    "g.V().has('node_id', textContains('{}')).id()".format(
        target_id))

edge = source_node.addE(category).to(target_node).id().next()

The above step is where it fails. Logically, I added the node in the previous step, I search for that node by node_id, and then I add an edge based on the returned instance.
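
A side note on the existence check in the workflow above, as a minimal sketch: it assumes the query method returns the list of response messages, so that a count() query yields a one-element list such as [0]. In that case len(...) > 0 is true even when nothing matched, and the count itself has to be inspected. fake_query and vertices_exist are hypothetical stand-ins, not part of any library.

```python
def fake_query(_query):
    """Hypothetical stand-in for dbCON.query: count() yields one message, here 0."""
    return [0]

def vertices_exist(query_fn, count_query):
    """Return True only if the count carried in the first message is positive."""
    results = query_fn(count_query)
    return bool(results) and results[0] > 0

count_query = "g.V().has('NodeProperties', Text.textContains('5401')).count()"

print(len(fake_query(count_query)) > 0)          # length check: True although the count is 0
print(vertices_exist(fake_query, count_query))   # value check: False
```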

Note that my dbCon is a custom class which uses Goblin to query Janus directly. The query method for dbCon is as follows:

if loop is None:
    try:
        loop = asyncio.get_event_loop()
    except RuntimeError:
        logger.debug("Couldn't get event loop for current thread. Creating a new event loop to be used!")
        loop = asyncio.new_event_loop()
        asyncio.set_event_loop(loop)

response = self.__run(loop, query)

if loop is None:
    loop.close()

return response

def __run(self, loop, query):
    messages = []
    async def run(loop, query):
        cluster = await Cluster.open(loop)
        client = await cluster.connect()
        resp = await client.submit(query)
        async for msg in resp:
            messages.append(msg)
        await cluster.close()
    
    loop.run_until_complete(run(loop, query))

    return messages
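
In the query method above, the second `if loop is None` check runs after loop has already been assigned, so loop.close() is never reached. A minimal sketch of one way to track loop ownership follows; run_with_loop and fake_query are hypothetical names, and the coroutine stands in for the real Gremlin call so the example runs without a server.

```python
import asyncio

def run_with_loop(coro_factory, loop=None):
    """Run a coroutine to completion, closing the loop only if it was created here."""
    owns_loop = loop is None          # remember ownership before reassigning loop
    if owns_loop:
        loop = asyncio.new_event_loop()
    try:
        return loop.run_until_complete(coro_factory())
    finally:
        if owns_loop:
            loop.close()

async def fake_query():
    """Stand-in for submitting a query and collecting the response messages."""
    await asyncio.sleep(0)
    return ["message"]

print(run_with_loop(fake_query))
```

A loop passed in by the caller is left open, so the caller keeps control over its lifetime.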




Please let me know whether or not you can reproduce this error.
If not, please also let me know whether this is the proper way to create an index. The steps, in order:
1: Use the Gremlin shell to drop all nodes.
2: Create the index on the graph while it has no nodes in it.
3: Start pushing data.
4: Expect newly pushed vertices to be indexed as soon as they are pushed.

Thanks

Robert Dale
