OrientDB Lucene Index slow

110 views
Skip to first unread message

Erik Pragt

unread,
Mar 11, 2016, 7:04:55 AM3/11/16
to OrientDB

I have a database with 100.000+ records. I want to index them on two fields with Lucene, so I added the following index:

create index Book.search on Book (title,isbn) FULLTEXT ENGINE LUCENE

However, when I search on one of the fields using the following query:

select from Book where [title,isbn] LUCENE "android"

The query is taking a very long time, like it's doing a full table scan. If I use the explain plan, it also suggests that's it's doing that:

explain select from Book where [title,isbn] LUCENE "android"

Result:

{
    "result": [
        {
            "@type": "d",
            "@version": 0,
            "documentReads": 80551,
            "current": "#16:217944",
            "documentAnalyzedCompatibleClass": 80551,
            "recordReads": 80551,
            "_memoryIndex": "isbn:\n\t'[61 6c 6c 61]':1: [(1)]\n\t'[63 6f 6d 70 6c 65 74 6f]':1: [(6)]\n\t'[63 6f 6e]':1: [(3)]\n\t'[63 6f 72 73 6f]':1: [(5)]\n\t'[65 64 69 74 69 6f 6e]':1: [(15)]\n\t'[67 75 69 64 61]':1: [(0)]\n\t'[69 6d 70 61 72 61 72 65]':1: [(8)]\n\t'[69 74 61 6c 69 61 6e]':1: [(14)]\n\t'[70 65 72]':1: [(7)]\n\t'[70 6f 63 6f]':1: [(12)]\n\t'[70 72 6f 67 72 61 6d 6d 61 72 65]':1: [(10)]\n\t'[70 72 6f 67 72 61 6d 6d 61 7a 69 6f 6e 65]':1: [(2)]\n\t'[72]':1: [(4)]\n\t'[74 65 6d 70 6f]':1: [(13)]\n\tterms=14, positions=14, memory=32.9 KB\ntitle:\n\t'[31 35 33 30 30 35 38 32 33 36]':1: [(0)]\n\tterms=1, positions=1, memory=32.9 KB\n\nfields=2, terms=15, positions=15, memory=66.6 KB",
            "fetchingFromTargetElapsed": 17037,
            "evaluated": 80551,
            "user": "#5:0",
            "tips": [
                "Query 'SELECT FROM Book WHERE [title, isbn] LUCENE \"android\"' fetched more than 50000 records: to speed up the execution, create an index or change the query to use an existent index"
            ],
            "elapsed": 17040.559,
            "resultType": "collection",
            "resultSize": 848,
            "@fieldTypes": "documentReads=l,current=x,documentAnalyzedCompatibleClass=l,recordReads=l,fetchingFromTargetElapsed=l,evaluated=l,user=x,elapsed=f"
        }
    ],
    "warnings": [
        "Query 'SELECT FROM Book WHERE [title, isbn] LUCENE \"android\"' fetched more than 50000 records: to speed up the execution, create an index or change the query to use an existent index"
    ],
    "notification": "Query executed in 17.686 sec. Returned 1 record(s)"
}

What am I missing here?

scott molinari

unread,
Mar 11, 2016, 10:31:27 AM3/11/16
to OrientDB
I am no expert, and this is just a big guess, but try just the "title" property in the where filter.

Scott

Erik Pragt

unread,
Mar 11, 2016, 6:15:00 PM3/11/16
to orient-...@googlegroups.com


On 12 Mar 2016, at 02:31, 'scott molinari' via OrientDB <orient-...@googlegroups.com> wrote:

I am no expert, and this is just a big guess, but try just the "title" property in the where filter.


No, that's not how it works.

Erik


Scott

--

---
You received this message because you are subscribed to a topic in the Google Groups "OrientDB" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/orient-database/ZOjm55H375Y/unsubscribe.
To unsubscribe from this group and all its topics, send an email to orient-databa...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

scott molinari

unread,
Mar 13, 2016, 8:53:09 AM3/13/16
to OrientDB
I just tested a similar setup and it certainly does work and the query is considerably faster. My thinking is, why search ISBN's looking for "android", when you know it isn't going to be in that field?

Scott

Erik Pragt

unread,
Mar 13, 2016, 5:35:41 PM3/13/16
to orient-...@googlegroups.com
The question is: how do you know? I have one searchbox, it can be anything, so I need to search through multiple fields. In my example, it can be an isbn, or a title, but what if it's an author or a title or a description?

On Sun, Mar 13, 2016 at 11:53 PM, 'scott molinari' via OrientDB <orient-...@googlegroups.com> wrote:
I just tested a similar setup and it certainly does work and the query is considerably faster. My thinking is, why search ISBN's looking for "android", when you know it isn't going to be in that field?

Scott

--

scott molinari

unread,
Mar 14, 2016, 5:15:50 AM3/14/16
to OrientDB, erik....@jworks.nl
Hmm... I understand your situation now.

I guess one of the ODB Team will have to answer. Sorry for wasting your time.

Scott

Erik Pragt

unread,
Mar 14, 2016, 6:28:40 AM3/14/16
to orient-...@googlegroups.com
No, I appreciate the help!

I also fixed it, thanks to stackoverflow .

I was caused by me creating the index like [isbn, title] and querying like [title, isbn] . Orient db doesn't like that apparently.

Thanks for the help!

Erik
--
Reply all
Reply to author
Forward
0 new messages