New benchmarks from Arango


scott molinari

Jun 11, 2015, 11:36:55 AM
to orient-...@googlegroups.com
Hi,

I just ran into this and I would imagine, you gents would want to know about it and also, more than likely, do something about it. 


Scott

Chaitanya

Jun 11, 2015, 2:38:13 PM
to orient-...@googlegroups.com
+1


Chaitanya

Jun 11, 2015, 3:05:15 PM
to orient-...@googlegroups.com
I think these benchmarks were run from a Node.js app, but OrientDB is a JVM-based database. Why can't we make our own benchmark against ArangoDB?

Ziink A

Jun 12, 2015, 1:11:05 AM
to orient-...@googlegroups.com
I tried running the benchmarks on OrientDB. It causes a memory dump.

Riccardo Tasso

Jun 12, 2015, 1:31:40 AM
to orient-...@googlegroups.com

Thank you, very interesting.

Riccardo


scott molinari

Jun 12, 2015, 2:33:09 AM
to orient-...@googlegroups.com
For me personally, I can't believe the results. However, I'd like to see some proper results for OrientDB too.

I am sold on Orient over Arango currently, mainly due to the binary interface and because it has multi-master replication and easy sharding capabilities, which Arango still doesn't have and probably won't (the multi-master replication that is). But, if Orient is actually that much slower, that would be a no-go for our project too. 

Scott  

Riccardo Tasso

Jun 12, 2015, 2:39:30 AM
to orient-...@googlegroups.com
Unfortunately, performance is very difficult to test, and benchmarks are very difficult to trust, especially if implemented by vendors.

Riccardo


Luca Garulli

Jun 12, 2015, 3:02:01 AM
to orient-...@googlegroups.com
Hey guys,

I wish somebody from ArangoDB had asked for help on using OrientDB properly for such a benchmark before posting results.

But I can also understand that a minor vendor tries to get attention in that way ;-)

We had the chance to look at the code and it doesn't use OrientDB properly. We'll send a Pull Request to that project with our changes soon. I hope they will update their numbers with the new ones on their web site.


Best Regards,

Luca Garulli
CEO at Orient Technologies LTD
the Company behind OrientDB

scott molinari

Jun 12, 2015, 8:06:17 AM
to orient-...@googlegroups.com
Cool! I hope the results speak much better for OrientDB. I don't expect blowing the socks off of Arango, but I'd be happy with results comparable to Arango's. I'd also suggest you run the tests on your own too or maybe someone from the community could do it, just to validate the results as factual.

Scott

Ziink A

Jun 12, 2015, 10:40:37 AM
to orient-...@googlegroups.com
I have a machine with 16GB RAM. Here are some numbers. 

I moved the shortest path test to the end for Orient because it won't run on 2.1rc3 due to a parsing bug and because it crashes 2.0.10. No connection pooling for Orient. 

Since the database doesn't fit in RAM, ArangoDB's numbers are not that impressive (and actually worse for aggregation).  

ArangoDB 2.6 alpha 3
INFO using server address  127.0.0.1
INFO start
INFO step 1/2 done
INFO step 2/2 done
INFO warmup done
INFO executing shortest path for 19 paths
INFO total paths length: 85
INFO -----------------------------------------------------------------------------
INFO ArangoDB: shortest path, 19 items
INFO Total Time for 19 requests: 133919 ms
INFO Average: 7048.368421052632 ms
INFO -----------------------------------------------------------------------------
INFO executing neighbors for 500 elements
INFO total number of neighbors found: 9102
INFO -----------------------------------------------------------------------------
INFO ArangoDB: neighbors, 500 items
INFO Total Time for 500 requests: 1618 ms
INFO Average: 3.236 ms
INFO -----------------------------------------------------------------------------
INFO executing distinct neighbors of 1st and 2nd degree for 500 elements
INFO total number of neighbors2 found: 418236
INFO -----------------------------------------------------------------------------
INFO ArangoDB: neighbors2, 500 items
INFO Total Time for 500 requests: 4807 ms
INFO Average: 9.614 ms
INFO -----------------------------------------------------------------------------
INFO executing single read with 100000 documents
INFO -----------------------------------------------------------------------------
INFO ArangoDB: single reads, 100000 items
INFO Total Time for 100000 requests: 32723 ms
INFO Average: 0.32723 ms
INFO -----------------------------------------------------------------------------
INFO executing single write with 100000 documents
INFO -----------------------------------------------------------------------------
INFO ArangoDB: single writes, 100000 items
INFO Total Time for 100000 requests: 45902 ms
INFO Average: 0.45902 ms
INFO -----------------------------------------------------------------------------
INFO executing aggregation
INFO -----------------------------------------------------------------------------
INFO ArangoDB: aggregate, 1 items
INFO Total Time for 1 requests: 199743 ms
INFO Average: 199743 ms
INFO -----------------------------------------------------------------------------
DONE



orientdb 2.1rc3
INFO using server address  127.0.0.1
INFO start
INFO warmup done
INFO executing neighbors for 500 elements
INFO total number of neighbors found: 9102
INFO -----------------------------------------------------------------------------
INFO OrientDB: neighbors, 500 items
INFO Total Time for 500 requests: 5206 ms
INFO Average: 10.412 ms
INFO -----------------------------------------------------------------------------
INFO executing distinct neighbors of 1st and 2nd degree for 500 elements
INFO total number of neighbors2 found: 418236
INFO -----------------------------------------------------------------------------
INFO OrientDB: neighbors2, 500 items
INFO Total Time for 500 requests: 80235 ms
INFO Average: 160.47 ms
INFO -----------------------------------------------------------------------------
INFO executing single read with 100000 documents
INFO -----------------------------------------------------------------------------
INFO OrientDB: single reads, 100000 items
INFO Total Time for 100000 requests: 75391 ms
INFO Average: 0.75391 ms
INFO -----------------------------------------------------------------------------
INFO executing single write with 100000 documents
INFO -----------------------------------------------------------------------------
INFO OrientDB: single writes, 100000 items
INFO Total Time for 100000 requests: 92721 ms
INFO Average: 0.92721 ms
INFO -----------------------------------------------------------------------------
INFO executing aggregation
INFO -----------------------------------------------------------------------------
INFO OrientDB: aggregate, 1 items
INFO Total Time for 1 requests: 73120 ms
INFO Average: 73120 ms
INFO -----------------------------------------------------------------------------
INFO executing shortest path for 19 paths
ERROR OrientDB.RequestError: Error on parsing command at position #34: Invalid keyword '0].RID'
Command: select shortestPath($a[0].rid, $b[0].rid, "OUT") LET $a = (select @rid from PROFILE where _key = "P714635"), $b = (select @rid from PROFILE where _key = "P117865")
------------------------------------------^


orientdb 2.0.10
INFO using server address  127.0.0.1
INFO start
INFO warmup done
INFO executing neighbors for 500 elements
INFO total number of neighbors found: 9102
INFO -----------------------------------------------------------------------------
INFO OrientDB: neighbors, 500 items
INFO Total Time for 500 requests: 1891 ms
INFO Average: 3.782 ms
INFO -----------------------------------------------------------------------------
INFO executing distinct neighbors of 1st and 2nd degree for 500 elements
INFO total number of neighbors2 found: 418236
INFO -----------------------------------------------------------------------------
INFO OrientDB: neighbors2, 500 items
INFO Total Time for 500 requests: 60942 ms
INFO Average: 121.884 ms
INFO -----------------------------------------------------------------------------
INFO executing single read with 100000 documents
INFO -----------------------------------------------------------------------------
INFO OrientDB: single reads, 100000 items
INFO Total Time for 100000 requests: 55399 ms
INFO Average: 0.55399 ms
INFO -----------------------------------------------------------------------------
INFO executing single write with 100000 documents
INFO -----------------------------------------------------------------------------
INFO OrientDB: single writes, 100000 items
INFO Total Time for 100000 requests: 96845 ms
INFO Average: 0.96845 ms
INFO -----------------------------------------------------------------------------
INFO executing aggregation
INFO -----------------------------------------------------------------------------
INFO OrientDB: aggregate, 1 items
INFO Total Time for 1 requests: 61498 ms
INFO Average: 61498 ms
INFO -----------------------------------------------------------------------------
INFO executing shortest path for 19 paths
ERROR OrientDB.RequestError: Error on retrieving record #13:754877 (cluster: relation)
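
To put the three runs above side by side, the averages can be recomputed from the posted totals and compared per test. This is only a minimal sketch: the totals and request counts are copied from the logs above, while the dictionary layout and the function names are made up for illustration. Shortest path is left out because both OrientDB runs ended in an error.

```python
# Total time (ms) and request count per test, copied from the logs above.
RUNS = {
    "ArangoDB 2.6 alpha 3": {
        "neighbors": (1618, 500),
        "neighbors2": (4807, 500),
        "single reads": (32723, 100000),
        "single writes": (45902, 100000),
        "aggregate": (199743, 1),
    },
    "OrientDB 2.1rc3": {
        "neighbors": (5206, 500),
        "neighbors2": (80235, 500),
        "single reads": (75391, 100000),
        "single writes": (92721, 100000),
        "aggregate": (73120, 1),
    },
    "OrientDB 2.0.10": {
        "neighbors": (1891, 500),
        "neighbors2": (60942, 500),
        "single reads": (55399, 100000),
        "single writes": (96845, 100000),
        "aggregate": (61498, 1),
    },
}


def average_ms(total_ms, requests):
    """Average latency per request, the same figure the benchmark prints."""
    return total_ms / requests


def ratios(runs, baseline):
    """For every non-baseline run, average latency relative to the baseline."""
    base = runs[baseline]
    out = {}
    for name, tests in runs.items():
        if name == baseline:
            continue
        out[name] = {
            test: average_ms(*tests[test]) / average_ms(*base[test])
            for test in tests
        }
    return out


if __name__ == "__main__":
    for name, tests in ratios(RUNS, "ArangoDB 2.6 alpha 3").items():
        for test, ratio in tests.items():
            print(f"{name:16} {test:14} {ratio:6.2f}x ArangoDB's average")
```

A ratio above 1 means the run was slower than ArangoDB on that test. With these numbers, aggregation is the only test where both OrientDB versions come out below 1.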

scott molinari

Jun 17, 2015, 2:39:02 AM
to orient-...@googlegroups.com
Those are interesting numbers Ziink A. 

Luca wrote: "We'll send a Pull Request to that project with our changes soon. I hope they will update their numbers with the new ones on their web site."

Luca, if you do make the improvements, I promise you, I will do my best to get them to change their blog post. I am also pretty sure they will. They are looking for the best comparison, not some lies, from what I gather.

Scott 

scott molinari

Jun 17, 2015, 2:57:32 AM
to orient-...@googlegroups.com
Reading this announcement in its entirety, I would also gather that the improvements in the JS driver will speed up this benchmark considerably.

Scott

Marvin Froeder

Jun 21, 2015, 5:43:18 PM
to orient-...@googlegroups.com
I would love to see this PR... Just to make sure I'm not making the same mistakes :)

Luca Garulli

Jun 22, 2015, 5:56:05 AM
to orient-...@googlegroups.com
We uploaded the database online, in case anybody wants to run tests:




Best Regards,

Luca Garulli
CEO at Orient Technologies LTD
the Company behind OrientDB

On 22 June 2015 at 00:06, Luca Garulli <l.ga...@orientdb.com> wrote:
Hey guys,
We sent the PR, but it seems to take a lot of time to merge it and re-run the tests...

However, Enrico (maggiolo00) took a look at this benchmark and optimized the OrientDB implementation in the following ways:
  • Using OrientDB 2.1-rc4 instead of 2.0.x. This seemed fairer, since the ArangoDB release tested was the latest alpha.
  • The database was created without lightweight edges. Once we re-imported it (by the way, we used OrientDB ETL and a couple of JSON files), everything was much faster.
  • The singleWrite test used an SQL statement, but this is not the fastest way, so Enrico used direct document creation instead. On this test we are the fastest DBMS on insertion.
Everybody can clone this repository and run the benchmark themselves ;-)



However, since we recently adopted the OrientJS driver from the Oriento project, we found some issues and bottlenecks that made the difference in these tests. For example, on "singleRead" OrientDB was very slow. On my PC it took about 60 seconds to execute 100k reads, but once we profiled the time spent, we discovered that 70% of it was in the Node.js driver and only 30% went to executing the query! While the Node.js driver is good enough at marshalling, it is very slow at unmarshalling. For this reason we're going to improve the unmarshalling in the coming weeks. Stay tuned for updates on this.
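
As a rough back-of-the-envelope check of those figures, the split can be worked out per request. This is a sketch only: the inputs are the approximate numbers quoted above (about 60 seconds, 100k reads, a 70/30 split), and the function name is made up for illustration.

```python
def split_read_time(total_s, requests, driver_share):
    """Split the average time per read into driver time and query time (ms)."""
    per_request_ms = total_s * 1000 / requests
    driver_ms = per_request_ms * driver_share
    query_ms = per_request_ms - driver_ms
    return per_request_ms, driver_ms, query_ms


if __name__ == "__main__":
    per_request, driver, query = split_read_time(60, 100_000, 0.70)
    print(f"per read: {per_request:.2f} ms")
    print(f"  in the Node.js driver: {driver:.2f} ms")
    print(f"  executing the query:   {query:.2f} ms")
```

On those rounded inputs, each read costs about 0.6 ms, of which roughly 0.42 ms is spent in the driver and 0.18 ms on the query itself, so the query alone would account for about 18 of the 60 seconds.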

NOTE: it's not that the OrientJS implementation (forked from the Oriento project) was bad; we think Node.js is not so good at manipulating chars/bytes, so we're considering new options to do that ;-)

We ran the benchmark multiple times and OrientDB was the fastest of all the DBMSs tested, except on "singleRead" (see above) and "neighbors2" (on "neighbors", OrientDB is actually the fastest). Looking at how the benchmark works, we understood why: it's not what you would expect from a classic second-level neighbors query, because it returns only the IDs. In Arango, as in any relational DBMS, primary keys are held in indexes.

So that particular query uses the index without even fetching the real documents. That's why it seems faster, but retrieving only the IDs is an edge case without any particular meaning in a real use case. If you're looking for neighbors, you are usually interested in some information about them, like name, city, etc., not just the IDs.
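
The difference between the two kinds of query can be illustrated with a toy in-memory graph. This is a hypothetical sketch, not how ArangoDB or OrientDB implement traversals: `EDGES` stands in for an edge index, `DOCS` for the document store, and excluding the start vertex from the result is an assumption.

```python
# Toy data: an edge index (key -> neighbor keys) and a separate document store.
EDGES = {
    "P1": ["P2", "P3"],
    "P2": ["P4"],
    "P3": ["P4", "P5"],
    "P4": [],
    "P5": ["P1"],
}
DOCS = {key: {"_key": key, "name": f"user {key}", "city": "n/a"} for key in EDGES}


def neighbors2_ids(start):
    """Distinct 1st- and 2nd-degree neighbor keys, using only the edge index."""
    first = set(EDGES[start])
    second = {n for key in first for n in EDGES[key]}
    return (first | second) - {start}


def neighbors2_docs(start):
    """Same traversal, plus one document lookup per distinct neighbor."""
    return [DOCS[key] for key in sorted(neighbors2_ids(start))]


if __name__ == "__main__":
    print(sorted(neighbors2_ids("P1")))
    print([doc["name"] for doc in neighbors2_docs("P1")])
```

The ID-only version never touches `DOCS`; the second version adds one lookup per neighbor, which is the extra work a benchmark returning names or cities would have to measure.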

However, even though this benchmark was created by a vendor to demonstrate that its product is the fastest, the benchmark itself is very simple, so I'd call it a micro-benchmark. Furthermore, the other vendors weren't invited to do any tuning, so I see this as a mere marketing move to make a lot of noise.


Best Regards,

Luca Garulli
CEO at Orient Technologies LTD
the Company behind OrientDB


Ziink A

Jun 22, 2015, 12:58:30 PM
to orient-...@googlegroups.com
Thank you for the update. It would be great if you could post the benchmark numbers that you got. 

scott molinari

Jul 8, 2015, 7:54:32 AM
to orient-...@googlegroups.com
For anyone interested, ArangoDB did rerun the benchmark tests with the ODB corrections and ODB did come out a lot better, but other than straight writes, it looks like Arango is still performing considerably better.


Scott

Luca Garulli

Jul 8, 2015, 10:37:53 AM
to orient-...@googlegroups.com
Hi guys,
I'd have a lot to say about this benchmark and the way they managed it... I wrote a comment on that post, but it has been deleted. They used a dirty database with tons of duplicated links. Download the database they used (https://s3.amazonaws.com/nosql-sample-data/orientdb-community-2.1.rc4.tar.bz2) and check it yourself ;-)

Actually, OrientDB's numbers are better than what was reported in those charts, and OrientDB is faster than any other DBMS on most of them. As for the singleread and neighbors2 tests, I have already explained why OrientDB is not the fastest one.

By the way, those numbers were measured on a machine with 16 vCores and 60GB RAM. On a more common machine (8 cores and 16GB RAM), the numbers are completely different.

Since we're not allowed to comment on the original blog, we'll re-run that benchmark and publish the results somewhere else.

Best Regards,

Founder & CEO



scott molinari

Jul 8, 2015, 11:10:28 AM
to orient-...@googlegroups.com
Thanks Luca,

Looking forward to the results.

Scott

MrFT

Jul 8, 2015, 11:34:15 AM
to orient-...@googlegroups.com
I put some of Luca's remarks in a comment, and F Celler answered:
"We had started with a version fetching the whole documents, but some
of the databases did not survive this tests. However, now there are
newer versions available. So, I can rerun the tests using neighbors
with data."

At first glance, it doesn't seem like they are unwilling to do a fair comparison.



On Wednesday, 8 July 2015 at 17:10:28 UTC+2, scott molinari wrote:

MrFT

Jul 9, 2015, 9:44:46 AM
to orient-...@googlegroups.com
@luca

F Celler is asking which query he should use to fetch the whole document.

Could you join the discussion there and give them the right input?

Here is a link to that comment:




On Wednesday, 8 July 2015 at 17:34:15 UTC+2, MrFT wrote:

Luca Garulli

Jul 9, 2015, 2:38:44 PM
to orient-...@googlegroups.com
Hi guys,
We just published our take on this benchmark:

http://orientdb.com/orientdb-performance-challenge/



Best Regards,

Founder & CEO



Riccardo Tasso

Jul 9, 2015, 3:32:21 PM
to orient-...@googlegroups.com
Very good answer Luca!

I'll have to run benchmarks on my machine too!

Cheers,
   Riccardo

Dário Marcelino

Jul 9, 2015, 7:01:22 PM
to orient-...@googlegroups.com
Hey Luca,

It's good to see OrientDB participating in the benchmark!

One thing though: it seems the results you published marked as "OrientDB test by competitor" are outdated, as they were corrected in "How an open-source competitive benchmark helped to improve databases" (25th June). Your neighbours results don't seem to be comparable with theirs because, I believe, they are adding neighbours1 + neighbours2, while in your article the two are separated. Anyway, if you use the newer article as the source, their OrientDB results seem very close to your OrientDB 2.1 RC4 results.

Cheers,
Dário

Luca Garulli

Jul 9, 2015, 7:12:15 PM
to orient-...@googlegroups.com
Hey Dario,
The blog post shows how performance changed dramatically from the very first results when we weren't contacted to tune the OrientDB implementation.

I hope next time a vendor wants to create a new benchmark to showcase his product, he will contact the other vendors before publishing completely inaccurate results. While I understand that this is just marketing on their side, it's important to be fair and correct.


Best Regards,

Founder & CEO


scott molinari

Jul 10, 2015, 1:17:07 AM
to orient-...@googlegroups.com
It is interesting that the OrientDB results came up with worse results on reads than Arango did. 

I also agree with Dario, that the second report results from Arango should have been used in the result set. It seems they are trying to be fair. The same should be returned. You refer to them reworking the benchmarks, yet by including the incredibly bad results of their first test and not the second set of results, you are sort of committing the same marketing whitewash you accuse them of doing. Not really a noble gesture. 

I'm also interested in what was done, in layman's terms, to improve the neighbors performance. The performance gain in 2.2 is remarkable.

Also, when do you think you will be returning with the improved NodeJS driver for hopefully improved read performance results? Those are the only ones where I am now raising an eyebrow.

And how hard would it be to get this benchmark done with some other languages? Like Java, PHP and Python? Not saying, you gents should do them, just wondering what you think the amount of work would be. A couple hours, couple days, couple weeks?

And at any rate, thanks for the efforts made on this. In the end, everyone wins, if the knife throwing can be avoided.

Scott

normanLinux

Jul 10, 2015, 12:16:05 PM
to orient-...@googlegroups.com
I can well believe that the nodeJS driver is the bottleneck.  node is very useful, but also does have a tendency to be wasteful of resources.  I've been using node/npm with Angular and it tends to generate several hundred support files.  Still, it's not nearly as bad as yeoman.  

Using yeoman to generate a skeleton app (granted with coffeescript support) generated nearly 30,000 files!

So far, my experience with orient is limited, but it suits my use case better than anything else

Luca Garulli

Jul 10, 2015, 12:20:17 PM
to orient-...@googlegroups.com
We analyzed the OrientJS unmarshalling (forked from the Oriento driver) and I think Charles did a great job: it cannot be done better on the Node.js platform. That's why our plan is to rewrite the unmarshalling in C and provide it with the driver (it seems Node.js can auto-compile C code).


Best Regards,

Founder & CEO

