Hello Community,
hello OrientDB developers!
I work in public services, and currently we're evaluating different technologies (OrientDB, Hibernate/postgres, Fedora4 and neo4j) in order to find the best possible backend-solution for our data.
Over the course of the last month we developed a fairly straightforward Java class model that we would like to use regardless of the underlying technology.
In future applications we're going to have more than 1.5 million objects to persist, manage and retrieve.
Handling Java objects directly seems so much more intuitive and flexible than mapping them with JPA, so we were eager to try something new, for example OrientDB's Object Database functionality.
But somehow we can't figure out how to get decent performance out of our experimental setup.
So far we've persisted 95,601 of our objects (books and other media), resulting in 46,995,663 ORecords (see screenshot) and about 18.8 GB of data on our NAS.
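For a rough sense of scale, these figures can be turned into per-object averages. The following is only a minimal Java sketch and not part of our application: the class name StorageRatio and its two methods are made up for illustration, and it assumes decimal gigabytes (1 GB = 10^9 bytes), which may not match how the NAS reports its usage.

```java
// Hypothetical helper, for illustration only: derives averages from the numbers quoted above.
class StorageRatio {

    // Average number of ORecords created per persisted domain object.
    static double recordsPerObject(long records, long objects) {
        return (double) records / objects;
    }

    // Average bytes on disk per record, ASSUMING decimal gigabytes (1 GB = 10^9 bytes).
    static double bytesPerRecord(double gigabytes, long records) {
        return gigabytes * 1_000_000_000d / records;
    }

    public static void main(String[] args) {
        long objects = 95_601L;      // persisted books and other media
        long records = 46_995_663L;  // ORecords reported by the database
        double sizeGb = 18.8;        // data size on the NAS

        System.out.printf("records per object: %.1f%n", recordsPerObject(records, objects));
        System.out.printf("bytes per record:   %.0f%n", bytesPerRecord(sizeGb, records));
    }
}
```

Under that assumption, each of our objects ends up as roughly 490 ORecords of about 400 bytes each, so a single book is spread over a fairly large graph of small records.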
Our test-system:
- Virtual Machine on VMware
- SUSE 10 OS
- 1 TB of NAS.
- Java 1.8
- OrientDB PE Version 2.0.12
- QuadCore CPU
Select a book with a specific relation (like a triple):
select from RecordImpl
where type.uniqueKey = 'TOME'
  and relationNode.relations contains (
    (predicate.uniqueKey = 'IS_PUB_PLACE_OF')
    and subject.relationNodeContainers contains (uniqueKey = 'MILANO')
  )
Query executed in 9.26 sec. Returned 20 record(s)
9.26 seconds. How can we accelerate this query? We have indexes on all the uniqueKey properties.
We managed to accelerate this query a little by rewriting the statement like this:
select from RecordImpl
where type contains [#46:0]
  and (
    relationNode.relations contains (
      predicate contains [#34:0] and subject contains [#30:18]
    )
  )
Query executed in 7.122 sec. Returned 20 record(s)
7.122 seconds is sadly still not acceptable, and this is one of the simpler questions we'd like to get answered in a decent time.
Now the same query with a simple order by:
select from RecordImpl
where type contains [#46:0]
  and (
    relationNode.relations contains (
      predicate contains [#34:0] and subject contains [#30:18]
    )
  )
order by sortIndex desc
Query executed in 133.423 sec. Returned 20 record(s)
So, any ideas how we could accelerate our queries? What are we doing wrong?
Best regards & thanks,
Sebastian
Edit: Added the number of cores (4) to the system specs.