Search relevance

Javi

Feb 25, 2011, 3:18:45 PM
to xapian_db
After getting the results of a search, is it possible to get the
relevance of each result?

Is it possible to order the results by relevance?

Thank you.

Javi.

Gernot

Feb 26, 2011, 1:29:40 AM
to xapian_db
Hi Javi

The ordering is by relevance if you do not specify an order
expression. However, the relevance of a document is not accessible
right now. I will add that soon.

Cheers,
Gernot

Javi

Feb 26, 2011, 9:14:07 AM
to xapian_db
OK, I've looked into it some more.

I've got the following blueprint defined for my model:

XapianDb::DocumentBlueprint.setup(Page) do |blueprint|
  blueprint.attribute :title, weight: 10
  blueprint.attribute :body, weight: 4
  blueprint.ignore_if { draft? }
end

And then the following test:

describe "search" do
  let(:search) { "primero" }
  let(:wrong) { "nada" }
  before(:each) do
    @first = Factory :page, title: search, body: wrong
    @second = Factory :page, title: wrong, body: search
    Page.rebuild_xapian_index
    @results = Page.search search
  end
end

Here I would expect the results to be [@first, @second], but I'm
getting [@second, @first].

I've noticed that if I change the "wrong" term, everything works.
Then I saw that the word "nada" is included in the stop words file for
the Spanish language, so that must be the reason.

Is this the expected behavior?

Thanks.

Gernot

Feb 27, 2011, 2:21:12 AM
to xapian_db
The relevance of a match is determined by Xapian. Stop words are not
indexed, so the behaviour you describe seems reasonable to me.
I think stop words are a cool concept for filtering out the "noise".

Are you expecting to find documents by stop words? If so, I could make
the usage of stop words optional.

Cheers,
Gernot

Javi

Feb 27, 2011, 9:33:25 AM
to xapian_db
On Feb 27, 8:21 am, Gernot <gernot.kog...@gmail.com> wrote:
> Are you expecting to find documents by stop words?

No, that wasn't my point :-).

In my example I'm not trying to find a document by a stop word. I'm
searching for the word "primero". But in that example the relevance of
the results depends on whether the word I'm not searching for is a stop
word or not. I find that confusing: I'm not searching for "nada", yet
having that word in the records changes the results of searching for
"primero".

Maybe this is a case that will never happen with real data anyway. It
was funny that I wrote a random test and found it.

Regards.