Slowness on big datasets

isart....@gmail.com

Sep 1, 2015, 8:58:24 AM
to OrientDB
Hello,

We are evaluating OrientDB to represent our Users database. Our schema is quite simple: we only have the "User" vertex and the "Follows" edge. It works all right with small datasets (<100 MB), but as soon as I imported a few million users it really slowed down. Our test dataset is only 7 GB, but we were hoping to import a 1 TB database. Are we doing something wrong? Any tips to speed up the queries?

For example, getting the followers for a given user takes more than 100 seconds (the user has more than 1M followers):

> select in(Follows) from #12:9389243;       
----+------+---------
#   |@CLASS|in      
----+------+---------
0   |null  |[1190488]
----+------+---------
1 item(s) found. Query executed in 102.376 sec(s).

Trying to get the intersection of the followers of 2 users takes more than 2 hours (each user has more than 1M followers):
select intersect(in(Follows).id) from User where id in [1,2]   
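
For reference, the result this query asks for is a plain set intersection. A minimal Python sketch, assuming the two follower id lists have already been fetched into memory (the function name and the sample ids are hypothetical, and this says nothing about how OrientDB evaluates intersect() internally):

```python
def common_followers(followers_a, followers_b):
    """Return the ids that appear in both follower collections."""
    # Build hash sets so each membership test is O(1) on average.
    set_a, set_b = set(followers_a), set(followers_b)
    # Iterate over the smaller set and probe the larger one.
    small, large = (set_a, set_b) if len(set_a) <= len(set_b) else (set_b, set_a)
    return {uid for uid in small if uid in large}


# Hypothetical follower ids for two users.
followers_of_user_1 = [10, 11, 12, 13]
followers_of_user_2 = [12, 13, 14]
shared = common_followers(followers_of_user_1, followers_of_user_2)
print(sorted(shared))
```

The work is proportional to the size of the smaller follower set once both sets are in memory.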

We are running OrientDB Community Edition 2.1. The server has 16 GB of RAM and the database on disk is 7 GB. The server runs with the following arguments:

-Dstorage.diskCache.bufferSize=12474 -Xmx4g
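
As a rough sanity check on these settings, the sketch below adds the disk cache size to the maximum heap and compares the sum with physical RAM. It assumes storage.diskCache.bufferSize is expressed in megabytes and that "16G" means 16 * 1024 MB; the helper names are made up for illustration.

```python
def jvm_size_to_mb(value):
    """Convert a JVM-style size such as '4g' or '512m' to megabytes."""
    units = {"m": 1, "g": 1024}
    suffix = value[-1].lower()
    if suffix not in units:
        raise ValueError("unsupported size suffix in %r" % value)
    return int(value[:-1]) * units[suffix]


def memory_budget_mb(server_args, physical_ram_mb):
    """Sum disk cache and max heap from the server arguments, in MB."""
    disk_cache_mb = 0
    heap_mb = 0
    for arg in server_args.split():
        if arg.startswith("-Dstorage.diskCache.bufferSize="):
            # Assumption: the value is a number of megabytes.
            disk_cache_mb = int(arg.split("=", 1)[1])
        elif arg.startswith("-Xmx"):
            heap_mb = jvm_size_to_mb(arg[len("-Xmx"):])
    configured_mb = disk_cache_mb + heap_mb
    return {
        "disk_cache_mb": disk_cache_mb,
        "heap_mb": heap_mb,
        "configured_mb": configured_mb,
        "headroom_mb": physical_ram_mb - configured_mb,
    }


# Assumption: "16G of RAM" is 16 * 1024 MB.
budget = memory_budget_mb("-Dstorage.diskCache.bufferSize=12474 -Xmx4g", 16 * 1024)
print(budget)
```

Under those assumptions the two settings add up to 16570 MB, which is 186 MB more than the 16384 MB of RAM, before counting the operating system or anything else running on the machine.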

Below is the output of the INFO command

orientdb {test}> info                               

Current database: test (url=remote:127.0.0.1/test/)

DISTRIBUTED CONFIGURATION: none (OrientDB is running in standalone mode)

DATABASE PROPERTIES
--------------------------------+----------------------------------------------------+
 NAME                           | VALUE                                              |
--------------------------------+----------------------------------------------------+
 Name                           | null                                               |
 Version                        | 14                                                 |
 Conflict Strategy              | version                                            |
 Date format                    | yyyy-MM-dd                                         |
 Datetime format                | yyyy-MM-dd HH:mm:ss                                |
 Timezone                       | Etc/UTC                                            |
 Locale Country                 | US                                                 |
 Locale Language                | en                                                 |
 Charset                        | UTF-8                                              |
 Schema RID                     | #0:1                                               |
 Index Manager RID              | #0:2                                               |
 Dictionary RID                 | null                                               |
--------------------------------+----------------------------------------------------+

DATABASE CUSTOM PROPERTIES:
 +-------------------------------+--------------------------------------------------+
 | NAME                          | VALUE                                            |
 +-------------------------------+--------------------------------------------------+
 | strictSql                     | true                                             |
 | useLightweightEdges           | false                                            |
 +-------------------------------+--------------------------------------------------+

CLUSTERS
----------------------------------------------+-------+-------------------+----------------+
 NAME                                         | ID    | CONFLICT STRATEGY | RECORDS        |
----------------------------------------------+-------+-------------------+----------------+
 _studio                                      |    11 |                   |             16 |
 default                                      |     3 |                   |              0 |
 e                                            |    10 |                   |              0 |
 follows                                      |    13 |                   |        6890284 |
 index                                        |     1 |                   |              7 |
 internal                                     |     0 |                   |              3 |
 manindex                                     |     2 |                   |              1 |
 ofunction                                    |     6 |                   |              0 |
 orids                                        |     8 |                   |              0 |
 orole                                        |     4 |                   |              3 |
 oschedule                                    |     7 |                   |              0 |
 ouser                                        |     5 |                   |              3 |
 user                                         |    12 |                   |        6086744 |
 v                                            |     9 |                   |              0 |
----------------------------------------------+-------+-------------------+----------------+
 TOTAL = 14                                                               |       12977061 |
------------------------------------------------------+-------------------+----------------+

CLASSES
----------------------------------------------+------------------------------------+------------+----------------+
 NAME                                         | SUPERCLASS                         | CLUSTERS   | RECORDS        |
----------------------------------------------+------------------------------------+------------+----------------+
 _studio                                      |                                    | 11         |             16 |
 E                                            |                                    | 10         |              0 |
 Follows                                      | [E]                                | 13         |        6890284 |
 OFunction                                    |                                    | 6          |              0 |
 OIdentity                                    |                                    | -          |              0 |
 ORestricted                                  |                                    | -          |              0 |
 ORIDs                                        |                                    | 8          |              0 |
 ORole                                        | [OIdentity]                        | 4          |              3 |
 OSchedule                                    |                                    | 7          |              0 |
 OTriggered                                   |                                    | -          |              0 |
 OUser                                        | [OIdentity]                        | 5          |              3 |
 User                                         | [V]                                | 12         |        6086744 |
 V                                            |                                    | 9          |              0 |
----------------------------------------------+------------------------------------+------------+----------------+
 TOTAL = 13                                                                                             12977050 |
----------------------------------------------+------------------------------------+------------+----------------+

INDEXES
----------------------------------------------+------------+-----------------------+----------------+------------+
 NAME                                         | TYPE       |         CLASS         |     FIELDS     | RECORDS    |
----------------------------------------------+------------+-----------------------+----------------+------------+
 dictionary                                   | DICTIONARY |                       |                |          0 |
 ORole.name                                   | UNIQUE     | ORole                 | name           |          3 |
 OUser.name                                   | UNIQUE     | OUser                 | name           |          3 |
 User.id                                      | UNIQUE     | User                  | id             |    6086743 |
----------------------------------------------+------------+-----------------------+----------------+------------+
 TOTAL = 4                                                                                               6086749 |
-----------------------------------------------------------------------------------------------------------------+

Andrey Lomakin

Sep 1, 2015, 2:34:32 PM
to OrientDB

Hi
Which version of the database are you using?
Distributed, remote, or embedded?


--

---
You received this message because you are subscribed to the Google Groups "OrientDB" group.
To unsubscribe from this group and stop receiving emails from it, send an email to orient-databa...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

isart....@gmail.com

Sep 1, 2015, 5:12:15 PM
to OrientDB
I've started the server with the "server.sh" script. The queries are run locally using the "./console.sh" script

isart....@gmail.com

Sep 14, 2015, 7:12:23 AM
to OrientDB
Hi, as Colin Lester suggested in a private message, I've tested release 2.1.2 and the query now runs 6 times faster!


Thanks guys for the quick fix!

scott molinari

Sep 14, 2015, 9:26:43 AM
to OrientDB
6x faster than 102 seconds is still 17 seconds. Is that fast enough?

Scott

Isart Montane

Sep 14, 2015, 9:31:47 AM
to orient-...@googlegroups.com
Can you make it faster? :)


scott molinari

Sep 15, 2015, 8:22:47 AM
to OrientDB
LOL! Nope.

FYI, there is now a plan to introduce vertex-specific indices, which could help with supernodes like the ones you have. Plus 1 here, if you like the idea.

Isart Montane

Sep 15, 2015, 8:33:37 AM
to orient-...@googlegroups.com
Thanks Scott. That looks good. Looking forward to it.

Özgür Sucu

May 25, 2017, 8:12:55 AM
to OrientDB
Hi Isart,
Did you find a solution for the performance issue? If not, have you tried using another system to overcome this problem?
Thanks

On Tuesday, September 15, 2015 at 15:33:37 UTC+3, Isart Montane wrote:

Isart Montane

May 30, 2017, 9:08:51 AM
to orient-...@googlegroups.com
Hi Özgür,

Sorry, but we ended up using Spark instead. It's not exactly the same, but it worked in our case.

According to the issue there seems to be a workaround, so maybe you can give it a go.

Isart

Özgür Sucu

May 31, 2017, 6:55:35 AM
to OrientDB
Hi Isart,
What was the workaround?

On Tuesday, May 30, 2017 at 16:08:51 UTC+3, Isart Montane wrote: