Re: [RavenDB] Sort by lucene score


Itamar Syn-Hershko

Oct 24, 2012, 11:45:48 AM
to rav...@googlegroups.com
This is how it works by default: results are sorted by relevance, not lexically. You should be able to verify that by looking at Temp-Index-Score.

Also, don't do things like .Where("FullTextSearch: *" + searchTerm + "*"). Such queries are quite costly and often produce bad results. Just rely on the correct analyzer, or look at using NGrams.
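To illustrate the NGrams suggestion, here is a minimal sketch of what n-gram tokenization does to a single term. It is not RavenDB's or Lucene's analyzer; `NGramSketch` and `NGrams` are hypothetical names used only for illustration. The idea is that every fragment is stored as its own term at indexing time, so a "contains" search becomes an ordinary term lookup instead of a wildcard scan.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, for illustration only. A real index would use an
// NGram analyzer configured on the field rather than code like this.
public static class NGramSketch
{
    // Returns every lowercased substring of `term` whose length is
    // between minGram and maxGram (inclusive).
    public static List<string> NGrams(string term, int minGram, int maxGram)
    {
        if (term == null) throw new ArgumentNullException("term");
        if (minGram < 1 || maxGram < minGram)
            throw new ArgumentOutOfRangeException("minGram");

        var lower = term.ToLowerInvariant();
        var grams = new List<string>();
        for (int start = 0; start < lower.Length; start++)
        {
            // Emit each gram that starts at this position and still fits.
            for (int len = minGram; len <= maxGram && start + len <= lower.Length; len++)
            {
                grams.Add(lower.Substring(start, len));
            }
        }
        return grams;
    }
}
```

For example, `NGramSketch.NGrams("Hank", 2, 3)` yields the grams ha, han, an, ank and nk, so a query for "ank" matches an indexed term directly. The trade-off is a larger index, since each value produces many terms.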

On Wed, Oct 24, 2012 at 5:29 PM, Wyatt Webb <wyat...@gmail.com> wrote:
We are using full text search with an advanced query.
Our map:

Map = searchDocuments => searchDocuments.Select(searchDocument =>
    new
    {
        FullTextSearch = new object[]
        {
            searchDocument.FirstName,
            searchDocument.LastName,
            searchDocument.CompanyName
        }
    });

Index(x => x.FullTextSearch, FieldIndexing.Analyzed);

Our query (with a search document class as the result):

            using (var session = DocumentStore.OpenSession())
            {
                var advancedQuery = session.Advanced.LuceneQuery<SearchDocument, SearchDocumentIndex>();

                advancedQuery = advancedQuery.Where("FullTextSearch: *" + searchTerm + "*");

                RavenQueryStatistics stats;

                var results = advancedQuery
                    .Statistics(out stats)
                    .Skip(pageSize * pageNumber)
                    .Take(pageSize)
                    .ToArray();
                
                return new PagedQueryResult<SearchEntity>(pageSize, stats.TotalResults)
                {
                    PageNumber = pageNumber + 1,
                    PageValues = results.Cast<SearchDocument>()
                };
            }

Is there a way to have the results returned from an advanced query sorted by the Lucene score, i.e. best match first? I've read in the documentation that if no sort is specified, the results are returned in lexical order. I've also seen some Google Groups posts saying that the default is to return them ordered by the Temp-Index-Score metadata value. In my tests the results are sorted by last updated descending (last updated, last). I need the results to be returned in order of Lucene score, highest first. Any help would be appreciated.

Wyatt Webb

Oct 24, 2012, 12:45:42 PM
to rav...@googlegroups.com
We are not seeing that result at all. When the results are returned, the Temp-Index-Score is always 1. Any suggestions as to why that is?

As for the where clause, we are using the advanced query, so how do you perform a "contains" search and not an exact match search?

Oren Eini (Ayende Rahien)

Oct 24, 2012, 5:33:48 PM
to rav...@googlegroups.com
What is your query?

Wyatt Webb

Oct 25, 2012, 8:19:54 AM
to rav...@googlegroups.com
The query is listed in the first post. It is this line:

                advancedQuery = advancedQuery.Where("FullTextSearch: *" + searchTerm + "*");

I've discovered some other things. If we use the advanced query's Search method, it performs an exact match AND the Lucene score is no longer always 1. That's a step forward. If we add a tilde (~) after each search term, it performs a fuzzy search. This is another step forward. So our query now looks like this:

var searchTerms = "Hank Christian Microsoft";
var fuzzyTerms = String.Concat(String.Join("~ ", searchTerms.Split()), "~");
advancedQuery = advancedQuery.Search("FullTextSearch", fuzzyTerms);
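
One caveat with the string-building step above: `searchTerms.Split()` keeps empty entries, so input with repeated spaces (for example "Hank  Christian") produces a stray "~" on its own. A minimal sketch of a more forgiving version follows; `FuzzyTermsSketch` and `ToFuzzyTerms` are hypothetical names, and the sketch does not escape Lucene special characters in user input, which a real query would probably need.

```csharp
using System;
using System.Linq;

// Hypothetical helper, for illustration only.
public static class FuzzyTermsSketch
{
    // Appends "~" to every whitespace-separated term, skipping empty entries.
    public static string ToFuzzyTerms(string searchTerms)
    {
        if (string.IsNullOrWhiteSpace(searchTerms)) return string.Empty;

        // A null separator array means "split on any whitespace".
        var terms = searchTerms.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);
        return string.Join(" ", terms.Select(t => t + "~"));
    }
}
```

The result can be passed to Search in place of fuzzyTerms. For the original input "Hank Christian Microsoft" it produces the same string as the Concat/Join version.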


I have a few more questions.

There is a .Fuzzy() method on the advanced query that accepts an integer. What does that do? 

I'd like to only return results with a score higher than a specific number. Is this the method I should use?

Should I use that method instead of injecting tildes in my terms?

Lastly, if you don't mind:

In our index we have an anonymous array that includes the three fields we're doing a full text search on. Is there a way to Boost one of those fields and still do a full text search on all of them?

Oren Eini (Ayende Rahien)

Oct 26, 2012, 8:03:26 AM
to rav...@googlegroups.com
*foo* is not a Lucene match. It forces us to do a full index scan, and the results are just what you get.
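
As a rough illustration of why a leading wildcard is so much more expensive than a prefix match, consider a sorted list of terms. This is a conceptual sketch only, not how Lucene stores its term dictionary; `TermLookupSketch` and its methods are hypothetical names. A prefix lookup can jump to the first candidate and stop at the first non-match, while a "contains" lookup has to visit every term.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, for illustration only.
public static class TermLookupSketch
{
    // Prefix match ("foo*"): binary-search to the first candidate, then walk
    // forward while the prefix still holds. Returns how many terms the walk
    // visited (the binary-search probes are not counted).
    // `sortedTerms` must be sorted with ordinal comparison.
    public static int PrefixMatch(List<string> sortedTerms, string prefix, List<string> matches)
    {
        int visited = 0;
        int index = sortedTerms.BinarySearch(prefix, StringComparer.Ordinal);
        if (index < 0) index = ~index; // not found: start at the insertion point

        for (int i = index; i < sortedTerms.Count; i++)
        {
            visited++;
            if (!sortedTerms[i].StartsWith(prefix, StringComparison.Ordinal)) break;
            matches.Add(sortedTerms[i]);
        }
        return visited;
    }

    // Contains match ("*foo*"): sort order does not help, so every term is checked.
    public static int ContainsMatch(List<string> sortedTerms, string fragment, List<string> matches)
    {
        int visited = 0;
        foreach (var term in sortedTerms)
        {
            visited++;
            if (term.IndexOf(fragment, StringComparison.Ordinal) >= 0) matches.Add(term);
        }
        return visited;
    }
}
```

With four terms the difference is small, but the contains variant grows with the total number of terms in the index, while the prefix variant grows only with the number of matching terms.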

As you noted, Search gives you the result you want.

Fuzzy allows you to build queries dynamically.


WhereEquals("Foo", "Bar").Fuzzy(1.2)

You can't boost just a single value, no.