Slow performance

184 views
Skip to first unread message

Ivan Šturlić

unread,
Aug 4, 2015, 1:51:10 AM8/4/15
to OrientDB
Hi,

I was very excited about OrientDB beacuse of features but I get more and more frustrated because of slow performance for simple queries.
For example, I have about 130.000 vertices in MeasurementValue class. I have NOTUNIQUE index on date property.
This simple query:

select mea from MeasurementValue where date = '2014-07-19' unwind mea limit 1000

takes 12.4 seconds which is unacceptable beacuse it has index on date property and returns only 1000 rows. There are total of only 365 vertices with date '2014-07-19' in database and each vertex has 96 values in mea embedded list.
Also it is very easy to run out of memory (default Xmx512m) if I want to return all rows (365x96 = 35.040 rows).

You can download database from:



Ivan.
SelectWithLimit.PNG

SavioL

unread,
Aug 4, 2015, 3:22:34 AM8/4/15
to OrientDB
Hi Ivan,
which version of Orientdb are you using?

regards,
Savio L.

Ivan Šturlić

unread,
Aug 4, 2015, 6:30:41 AM8/4/15
to OrientDB
I'm using the 2.1 rc 6

Ivan.

SavioL

unread,
Aug 4, 2015, 9:08:46 AM8/4/15
to OrientDB
hi,
i tried your db, leaving everything as it is and making the query I got this result:


















                                       SBTREE


To speed it up you could delete the index of date and recreate it by putting it as Type: NOT UNIQUE HASH INDEX (should be the fastest of all and you can use it since there aren't ranges of queries). Putting this index was 15% faster than the previous time, putting 2.775 seconds. This is the result obtained:


HASH INDEX

have you got some improvement?

regards,
Savio L.




































Ivan Šturlić

unread,
Aug 4, 2015, 10:27:04 AM8/4/15
to OrientDB
Thanks Savio for help.
Difference between yours and my result is probably beacuse of hardware. I'm trying on my laptop which is slower than your machine.
But I can see that this is extremly bad performance because I tried similar in MS SQL 2008 database on the same laptop.
I loaded those 130.000 x 96 records (I could only insert them all as rows so I got  more than 12 million rows).

Equivalent query with TOP 1000 limit executed in 50 ms and query without limit which returned 30.000 rows executed in 500 ms.
And all that without date index!
Also, there were no problems with memory overflow like in OrientDB. Actually, SQL Server memory usage stays below 600 MB with total of 12 databases some of which are larger than this one.

Also, HASH index is not good because I also need range queries.

Regards, Ivan.

James Wang

unread,
Aug 4, 2015, 10:39:57 AM8/4/15
to OrientDB
I think OrientDB shines where JOINs are needed in RDBMS.

Especially, JOINs are very expensive for bigs tables.

Not surprised if it may slow in some simple queries.

Luca Garulli

unread,
Aug 4, 2015, 8:05:59 PM8/4/15
to OrientDB
Hi Ivan,
What's the MS SQL equivalent query to extract the information you need?

Best Regards,

Luca Garulli
Founder & CEO


--

---
You received this message because you are subscribed to the Google Groups "OrientDB" group.
To unsubscribe from this group and stop receiving emails from it, send an email to orient-databa...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Ivan Šturlić

unread,
Aug 5, 2015, 4:48:58 AM8/5/15
to OrientDB
Hi Luca,

In MS SQL I tried
select hour,minute,val,stat from MeasurementValue where date = '2014-07-19'

In OrientDB those values are stored inside JSON, but in MS SQL I can only store them in rows.
I will try similar in PostgreSQL since it supports JSON data type.

Of course, for real application I need much complicated queries with grouping, aggregation and finding MeasurementValue vertices traversing the graph and by date.

But this was the test for beginning which is not encouraging since more complicated queries can only be slower.

Regards, Ivan.

Ivan Šturlić

unread,
Aug 5, 2015, 5:04:53 PM8/5/15
to OrientDB
Here is the test with equivalent dataset and PostgreSQL 9.4 which supports embedded JSON documents.

OrientDB query:
select mea from MeasurementValue where date = '2014-07-19' unwind mea limit 1000
which takes 12.4 seconds

is equivalent to PostgreSQL query:
select jsonb_array_elements(mea) as mj from measurementvalue where date = '2014-07-19' limit 1000
which takes 44 ms as you can see in screenshot below.
This is 280 times faster!


If I remove limit and also make projection of JSON elements it takes only 920 ms to return over 130.000 rows (second screenshot).
Like in MS SQL I haven't created index for date column.


Regards, Ivan.
Reply all
Reply to author
Forward
0 new messages