Optimizing Queries for a Prefix Tree

47 views
Skip to first unread message

John J. Szucs

unread,
Dec 22, 2016, 2:57:32 PM12/22/16
to OrientDB
In my project, I have a fairly large prefix tree, potentially containing millions of nodes (about 250K nodes in my development instance), managed in OrientDB (pointing to other vertices in my graph).

The nodes of the prefix tree are represented by a Token vertex type. Each Token has a 'key' property and is connected to its child vertices by a 'child' edge type. So, a sequence like "hello world" would be represented as:

root -child-> "hello" -child-> "world"

Currently, I have a NOTUNIQUE_HASH_INDEX on Token.key and I am querying the data structure like this:

SELECT EXPAND(OUT('child')[key=:k]) FROM :p

where k is the child key I am looking for and p is the RID of the parent node.

Generally, performance is pretty good, but I am looking for ideas on improving the query, the indexing, or both for this use case. In particular, queries starting at the root node, which has many children, take noticeably longer than the other, less-connected nodes.

Any suggestions? Thanks in advance!

-- John

Luigi Dell'Aquila

unread,
Dec 23, 2016, 2:33:49 AM12/23/16
to orient-...@googlegroups.com
Hi John,

You can try with the following:

SELECT FROM YourClass where key = :k where in('Child') contains :p

in a situation where you have supernodes it will be faster, but of course it will be a bit slower when you have small fan out

Thanks

Luigi


--

---
You received this message because you are subscribed to the Google Groups "OrientDB" group.
To unsubscribe from this group and stop receiving emails from it, send an email to orient-database+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

John J. Szucs

unread,
Dec 23, 2016, 10:06:35 AM12/23/16
to OrientDB
Luigi,

This reduced the overall run-time for the method that uses this query by about 75%!

As an FYI for future readers, I believe the second "WHERE" in suggested SQL statement should be an AND.

Thank you very much for the quick and helpful response!

-- John
To unsubscribe from this group and stop receiving emails from it, send an email to orient-databa...@googlegroups.com.

Luigi Dell'Aquila

unread,
Dec 23, 2016, 10:07:51 AM12/23/16
to orient-...@googlegroups.com
Ah yes, sure, the second WHERE should be an AND ;-)

Happy to see it helped 

Luigi

To unsubscribe from this group and stop receiving emails from it, send an email to orient-database+unsubscribe@googlegroups.com.
Reply all
Reply to author
Forward
0 new messages