Small initial size of place holder for query results in MultiKeySliceQuery.java

34 views
Skip to first unread message

Calvin Lei

unread,
Oct 16, 2017, 7:57:00 PM10/16/17
to JanusGraph developers
HI, 
   I am curious why the result place holder is set to size of 4 in MultiKeySliceQuery.java:49? If the incoming query is of size 1000 with corresponding 1000 results. There will be unnecessary resizing taking place. Wouldn't it be fair to set the initial capacity of the array list to be queries.size()?

thanks
 

Robert Dale

unread,
Oct 17, 2017, 7:50:25 AM10/17/17
to Calvin Lei, JanusGraph developers
Can you share the queries and/or profile() (only useful from v0.2) that show multiKSQ[x] where x is the size of the queries?

What portion of overall queries/frequency do these queries represent?

Do you have any other quantitative analysis supporting your theory?

Looking at the loop, the result size is not necessarily the same as the query size. It is limited by the aggregate of the sub keySliceQuery sizes and a hard limit. I'm not familiar enough with the internals to understand how these are affected or controlled. However, in my own queries, only ones with within() predicates seem to affect the query size. And these are rarely used.



Robert Dale

--
You received this message because you are subscribed to the Google Groups "JanusGraph developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to janusgraph-dev+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/janusgraph-dev/4cac10c4-e001-4789-bb1e-c5b3f995be85%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Calvin Lei

unread,
Oct 17, 2017, 12:50:54 PM10/17/17
to JanusGraph developers
Hi Robert,
   I am using 0.2 pre-release for the testing. My queries are as followed:
          tx.query().has("customId", Contain.IN, customIds1000)
   where I have different queries with lookup of 500, 1000, 5000, 100000, 50000 IDs of type String
  
   As JanusGraph do not support non-numeric IDs, this is our main use case for the ID indirection to map the string ID to JanusGraph's vertices. 

   I ran the queries with and without the change, I see the performance boost of roughly 30% in query time. I agree that the result size is not necessarily the same as the query size, but in best (worst?) case, the results would be the same as the query size. Regarding the hard limit, it is a very good point and we can address it by simply using Math.max(getLimit(), queries.size())

thanks
C


On Tuesday, October 17, 2017 at 4:50:25 AM UTC-7, Robert Dale wrote:
Can you share the queries and/or profile() (only useful from v0.2) that show multiKSQ[x] where x is the size of the queries?

What portion of overall queries/frequency do these queries represent?

Do you have any other quantitative analysis supporting your theory?

Looking at the loop, the result size is not necessarily the same as the query size. It is limited by the aggregate of the sub keySliceQuery sizes and a hard limit. I'm not familiar enough with the internals to understand how these are affected or controlled. However, in my own queries, only ones with within() predicates seem to affect the query size. And these are rarely used.



Robert Dale

On Mon, Oct 16, 2017 at 7:57 PM, Calvin Lei <ckp...@gmail.com> wrote:
HI, 
   I am curious why the result place holder is set to size of 4 in MultiKeySliceQuery.java:49? If the incoming query is of size 1000 with corresponding 1000 results. There will be unnecessary resizing taking place. Wouldn't it be fair to set the initial capacity of the array list to be queries.size()?

thanks
 

--
You received this message because you are subscribed to the Google Groups "JanusGraph developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to janusgraph-de...@googlegroups.com.

Robert Dale

unread,
Oct 18, 2017, 6:32:30 AM10/18/17
to Calvin Lei, JanusGraph developers
Did you mean Math.min() ?

Robert Dale

To unsubscribe from this group and stop receiving emails from it, send an email to janusgraph-dev+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/janusgraph-dev/5700a0d4-222a-4440-8f37-de121ac9e78a%40googlegroups.com.

Calvin Lei

unread,
Oct 18, 2017, 12:41:46 PM10/18/17
to JanusGraph developers
correct :). Or we can choose to use LinkedList. I see similar improvement in my test. 

thanks,
C

Robert Dale

Robert Dale

unread,
Oct 18, 2017, 1:07:37 PM10/18/17
to JanusGraph developers
You should create an issue - https://github.com/JanusGraph/janusgraph/issues - and make a pull request.

Calvin Lei

unread,
Oct 18, 2017, 6:37:42 PM10/18/17
to JanusGraph developers
Thanks Robert. I am in the process of getting the CLA signed. Once done I will create a PR. Let me create an issue in the mean time.
Reply all
Reply to author
Forward
0 new messages