The query execution is unstable, sometimes fetch more data

120 views
Skip to first unread message

Hung Tran

unread,
May 11, 2016, 1:41:21 PM5/11/16
to OrientDB
Hi,

I am using OrientDB 2.1.8 on Windows 2012 server platform.

My SQL is


SELECT
@rid, EmailAddress FROM Contact WHERE (OwnedBy.EntityInfo.State <> 200) AND (Account = #20:0) AND (EntityInfo.State = 0) ORDER BY EmailAddress ASC skip 100 limit 100


When I do a filter with Account = #20:0 which contains about 120000 contacts, the EXPLAIN is

{
   
"result": [
       
{
           
"@type": "d",
           
"@version": 0,
           
"fullySortedByIndex": true,
           
"documentReads": 200,
           
"current": "#41:11015",
           
"documentAnalyzedCompatibleClass": 200,
           
"recordReads": 200,
           
"involvedIndexes": [
               
"IX_Contact_EmailAddress"
           
],
           
"fetchingFromTargetElapsed": 250,
           
"indexIsUsedInOrderBy": true,
           
"evaluated": 200,
           
"elapsed": 251.53171,
           
"resultType": "collection",
           
"resultSize": 100,
           
"@fieldTypes": "documentReads=l,current=x,documentAnalyzedCompatibleClass=l,recordReads=l,involvedIndexes=e,fetchingFromTargetElapsed=l,evaluated=l,elapsed=f"
       
}
   
],
   
"notification": "Query executed in 0.575 sec. Returned 1 record(s)"
}

there are only 200 records evaluated and took 0.575 sec.

When I change my filter to Account = #20:1, the EXPLAIN is

{
   
"result": [
       
{
           
"@type": "d",
           
"@version": 0,
           
"fullySortedByIndex": true,
           
"documentReads": 110866,
           
"current": "#41:111016",
           
"documentAnalyzedCompatibleClass": 110866,
           
"recordReads": 110866,
           
"involvedIndexes": [
               
"IX_Contact_EmailAddress"
           
],
           
"fetchingFromTargetElapsed": 5517,
           
"indexIsUsedInOrderBy": true,
           
"evaluated": 110866,
           
"elapsed": 5514.2456,
           
"resultType": "collection",
           
"resultSize": 100,
           
"@fieldTypes": "documentReads=l,current=x,documentAnalyzedCompatibleClass=l,recordReads=l,involvedIndexes=e,fetchingFromTargetElapsed=l,evaluated=l,elapsed=f"
       
}
   
],
   
"notification": "Query executed in 5.835 sec. Returned 1 record(s)"
}

There are 110866 records evaluated and took 5.835 sec.

I could not understand what caused a big different here, is it a bug? Any explanation will be very appreciated!

My Best,
Hung Tran


Luca Garulli

unread,
May 12, 2016, 4:24:21 AM5/12/16
to OrientDB
Hi Hung,
Your title is misleading: "The query execution is unstable, sometimes fetch more data". Why is fetching more data? Did you receive more than 100 records?

About the reason of the difference in timing, it looks like the record #20:1 has much more Contacts, right? This query is slow because it uses the index only for the order by, not for the lookup:

  "documentAnalyzedCompatibleClass": 110866,

If you have 10M of contacts, it scans all the clusters to evaluate all the following conditions:


  (OwnedBy.EntityInfo.State <> 200) AND (Account = #20:0) AND (EntityInfo.State = 0)

If you had a graph structure, you could start from the Account and traverse the connected contacts. So the following query:

SELECT @rid, EmailAddress
FROM Contact
WHERE (OwnedBy.EntityInfo.State <> 200) AND (Account = #20:0) AND (EntityInfo.State = 0)
ORDER BY EmailAddress ASC skip 100 limit 100

Would be something like:

SELECT @rid, EmailAddress FROM (
  SELECT expand( out('HasContacts') ) FROM #20:0
)
WHERE OwnedBy.EntityInfo.State <> 200 AND EntityInfo.State = 0
ORDER BY EmailAddress ASC SKIP 100 LIMIT 100



Best Regards,

Luca Garulli
Founder & CEO


--

---
You received this message because you are subscribed to the Google Groups "OrientDB" group.
To unsubscribe from this group and stop receiving emails from it, send an email to orient-databa...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Hung Tran

unread,
May 12, 2016, 4:50:45 AM5/12/16
to orient-...@googlegroups.com
Hi Lugi,

I am sorry about the title, I means the SQL engine fetches more data internally. 

No, the account #20:0 has 120000 contacts while #20:1 only has 100000 contacts.



About Indexing, I see when WHERE uses index, ORDER BY cannot use index, so I dropped index on Account. 

I will try your suggestion and give a feedback soon.

Last Update: About your suggestion, I have tried the query is fast as twice when traverse from the Account "2,56 seconds to take 100 rows of 100000 rows", however it does not explain why the explain is big different, and the performance in this case is slow.

My Best,
Hung Tran

Luca Garulli

unread,
May 12, 2016, 7:56:06 AM5/12/16
to OrientDB
Another question, are OwnedBy and EntityInfo embedded properties in Contact?


Best Regards,

Luca Garulli
Founder & CEO


On 12 May 2016 at 10:50, Hung Tran <tdhu...@gmail.com> wrote:
Hi Lugi,

I am sorry about the title, I means the SQL engine fetches more data internally. 

No, the account #20:0 has 120000 contacts while #20:1 only has 100000 contacts.



I will try your suggestion and give a feedback soon.

Hung Tran

unread,
May 12, 2016, 11:51:52 AM5/12/16
to orient-...@googlegroups.com
Hi Luca,

The OwnedBy is a Link property, and the EntityInfo is an object-embedded property. I have an index on OwnedBy, but I could not create index on property of an object-embedded property (I guess that's a limitation, maybe I need to break it down to properties).

If you found anything, please let me know, thank you very much!

My Best,
Hung Tran

Luca Garulli

unread,
May 16, 2016, 7:46:04 AM5/16/16
to OrientDB
Could you please try the time of the following queries and post results?

(1) 

SELECT @rid, EmailAddress FROM (
  SELECT expand( out('HasContacts') ) FROM #20:0
)
WHERE OwnedBy.EntityInfo.State <> 200 AND EntityInfo.State = 0
ORDER BY EmailAddress ASC SKIP 100 LIMIT 100

(2)

SELECT @rid, EmailAddress FROM (
  SELECT expand( out('HasContacts') ) FROM #20:0
ORDER BY EmailAddress ASC SKIP 100 LIMIT 100

(3)

SELECT @rid, EmailAddress FROM (
  SELECT expand( out('HasContacts') ) FROM #20:0
LIMIT 100


Best Regards,

Luca Garulli
Founder & CEO


On 12 May 2016 at 17:51, Hung Tran <tdhu...@gmail.com> wrote:
Hi Luca,

The OwnedBy is a Link property, and the EntityInfo is an embedded property. I have a index on OwnedBy, but I could not create index on property of an object-embedded property (I guess that's a limitation, maybe I need to break it down to properties).

David López Pérez

unread,
May 17, 2016, 5:32:33 AM5/17/16
to OrientDB
Hi Luca, I am Hung co-worker. 
These are the results:

(1)
{
   
"result": [
       
{
           
"@type": "d",
           
"@version": 0,

           
"fetchingFromTargetElapsed": 0,
           
"expandElapsed": 0,
           
"orderByElapsed": 0,
           
"current": "#20:0",
           
"user": "#5:0",
           
"elapsed": 1.983832,
           
"resultType": "collection",
           
"resultSize": 0,
           
"@fieldTypes": "fetchingFromTargetElapsed=l,expandElapsed=l,orderByElapsed=l,current=x,user=x,elapsed=f"
       
}
   
],
   
"notification": "Query executed in 0.036 sec. Returned 1 record(s)"
}

(2)
{
   
"result": [
       
{
           
"@type": "d",
           
"@version": 0,

           
"fetchingFromTargetElapsed": 0,
           
"expandElapsed": 0,
           
"orderByElapsed": 0,
           
"current": "#20:0",
           
"user": "#5:0",
           
"elapsed": 1.491263,
           
"resultType": "collection",
           
"resultSize": 0,
           
"@fieldTypes": "fetchingFromTargetElapsed=l,expandElapsed=l,orderByElapsed=l,current=x,user=x,elapsed=f"
       
}
   
],
   
"notification": "Query executed in 0.043 sec. Returned 1 record(s)"
}

(3)
{
   
"result": [
       
{
           
"@type": "d",
           
"@version": 0,

           
"fetchingFromTargetElapsed": 0,
           
"expandElapsed": 0,
           
"current": "#20:0",
           
"user": "#5:0",
           
"elapsed": 0.981974,
           
"resultType": "collection",
           
"resultSize": 0,
           
"@fieldTypes": "fetchingFromTargetElapsed=l,expandElapsed=l,current=x,user=x,elapsed=f"
       
}
   
],
   
"notification": "Query executed in 0.044 sec. Returned 1 record(s)"
}

Thank you very much, I hope this could help you to help us :)

Best Regards,
David

Hung Tran

unread,
May 19, 2016, 1:45:40 PM5/19/16
to orient-...@googlegroups.com
Hi Luca,

I have deployed a new instance on Ubuntu 14.04 and redo tests, the performance is stable and good enough, so I guess that's an OrientDB issue on Windows Server 2012 R2 platform, or it is just a kind of machine issue.

Here is the test result on 1.3M contacts, the query is done for an account with 300k contacts



Especially, the query only uses an single index on a field in ORDER BY. The same query, if I have an index on Account property, the query engine will take it instead of EmailAddress index, and that's really a nightmare for my situation.



I have also tried the same tests on OrientDB v2.1.17, the results are slower than v2.1.8 and v2.1.9 about 100 ms per query, not sure why !?

If possible, could you explain a bit about the rules of query engine to pick up index for a query?

My Best,
Hung Tran
Reply all
Reply to author
Forward
0 new messages