Hi,
I am working with Cassandra 2.0.8 and Titan 0.5, and I am seeing a performance issue with a Gremlin query. I have indexes on both city and type. The city vertex is a supernode in my graph.
This is the query:
g.V('city', 'Boston').in('cityKey').has('type', 'SOFTWARE').count() (This is slow)
This query is very slow. By contrast, when I run the same traversal without the has step:
g.V('city', 'Boston').in('cityKey').count() (The result is instantaneous).
I have read that .has sometimes does not use indexes, so I also changed the query to use filter, but that doesn't help.
For the amount of data I have, g.V('city', 'Boston').in('cityKey').count() returns 44605, and the count for type SOFTWARE will be greater than 44500. The timings below (in milliseconds) show that the query is very slow in both cases, with has and with filter.
gremlin> s=System.currentTimeMillis();g.V('city', 'Boston').in('cityKey').has('type', 'SOFTWARE').count();System.currentTimeMillis()-s
==>91545
gremlin> s=System.currentTimeMillis();g.V('city', 'Boston').in('cityKey').type.filter{it == 'SOFTWARE'}.count();System.currentTimeMillis()-s
==>92028
Is there any way I can rewrite this query for better performance? Also, if filter can be made to improve the timing, I would likely need multiple filters, for example type as SOFTWARE and language as java. Any help is greatly appreciated.
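To make the multi-filter case concrete, the query would presumably look something like the sketch below. It assumes a Titan 0.5 graph is already open as g in the Gremlin console; the 'language' property key and the 'java' value are taken from the description above as illustrative names and may not match the actual schema.

```groovy
// Sketch only: assumes `g` is an open Titan 0.5 graph in the Gremlin console.
// 'language' / 'java' are assumed property key and value names.

// Variant 1: chain a second has step onto the existing traversal.
g.V('city', 'Boston').in('cityKey').has('type', 'SOFTWARE').has('language', 'java').count()

// Variant 2: one filter closure that checks both properties on each vertex.
g.V('city', 'Boston').in('cityKey').filter{ it.getProperty('type') == 'SOFTWARE' && it.getProperty('language') == 'java' }.count()
```

Both variants check the properties on every vertex that comes out of in('cityKey'), so they are only meant to show the shape of the query, not a faster version of it.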
Thanks,
Sabari