I need to boost scores, initially based on date but later using a Python
function of my choosing to calculate the value. With Solr I could do this
using FunctionQuery, which augments the existing scoring system:
http://wiki.apache.org/solr/SolrRelevancyFAQ#How_can_I_boost_the_score_of_newer_documents
As far as I can tell, the closest thing in Whoosh is writing a scoring class, as in
http://packages.python.org/Whoosh/api/scoring.html
Ideally what I'd like is a way of providing an additional boost function
that runs after the existing scorers with the entire document available and
can then adjust the score calculated so far.
Another doc "bug" - there is no Cosine class as shown in the example:
http://packages.python.org/Whoosh/searching.html#scoring
Roger
On 04/11/2011 04:51 PM, Roger Binns wrote:
> Ideally what I'd like is a way of providing an additional boost function
> that runs after the existing scorers with the entire document available and
> can then adjust the score calculated so far.
For anyone who comes across this post later, in case it isn't added to the
docs, the answer is:
- Derive a class from one in scoring (e.g. BM25F)
- Set the attribute use_final to True
- Define a final() method
- Supply the class or an instance as the weighting parameter when making a
  Searcher
- The final method will be called after the document has been scored and
  should return an adjusted score
The final method signature is:
    def final(self, searcher, docnum, score):
        # This will get any stored fields for the document
        fields = searcher.stored_fields(docnum)
        # Return the score you want
        return score * 1
Doing this I was able to implement a date bias with the same calculations as
the Solr articles recommend. I also cheated and used monkey-patching to get
my final method used.
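As a rough illustration of the steps above, here is a minimal sketch of a date
bias written as a mixin. It is not the code I actually ran. The names
recency_boost and DateBiasFinal are made up for this example, and it assumes
each document has a stored field called "date" holding a POSIX timestamp in
seconds. The decay follows the reciprocal shape a / (m * age + b) that the Solr
FAQ suggests, with m set so that a one-year-old document gets half the score.

```python
import time

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def recency_boost(age_seconds, m=1.0 / SECONDS_PER_YEAR, a=1.0, b=1.0):
    """Reciprocal decay a / (m * age + b); negative ages are treated as zero."""
    return a / (m * max(age_seconds, 0.0) + b)


class DateBiasFinal(object):
    """Hypothetical mixin meant to be combined with a Whoosh weighting class."""

    # Tells the searcher to call final() for each matching document.
    use_final = True

    def final(self, searcher, docnum, score):
        # Stored fields come back as a dict-like object keyed by field name.
        fields = searcher.stored_fields(docnum)
        stamp = fields.get("date")  # assumed field name, POSIX seconds
        if stamp is None:
            # No date stored: leave the score untouched.
            return score
        return score * recency_boost(time.time() - stamp)
```

With Whoosh installed, the mixin would presumably be combined with an existing
weighting, e.g. "class DateBiasedBM25F(DateBiasFinal, scoring.BM25F): pass",
and an instance passed as the weighting parameter when making the Searcher.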
Roger
- J
No, but you could instantiate your custom weighting class per-request
and just give it the query.
Matt
As an example I put the start time of the request into the weighting class
so that when I do the date bias I can subtract the document date from that
to get its age. Without that I'd have to call time.time() for every document.
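To make that concrete, a small sketch of the per-request idea: the weighting
object is created once per request and remembers the request start time, so
final() only does a subtraction per document. The class name RequestTimeFinal,
the request_start argument, and the stored "date" field (POSIX seconds) are all
assumptions for illustration, not anything Whoosh defines.

```python
import time

SECONDS_PER_YEAR = 365.25 * 24 * 3600


class RequestTimeFinal(object):
    """Hypothetical mixin holding per-request state for final()."""

    use_final = True

    def __init__(self, request_start=None, **kwargs):
        # Pass remaining arguments on to the real weighting class, if any.
        super().__init__(**kwargs)
        # Read the clock once per request instead of once per document.
        if request_start is None:
            request_start = time.time()
        self.request_start = request_start

    def final(self, searcher, docnum, score):
        stamp = searcher.stored_fields(docnum).get("date")  # assumed field
        if stamp is None:
            return score
        age_years = max(self.request_start - stamp, 0.0) / SECONDS_PER_YEAR
        # Same reciprocal shape as the Solr recipe: halves at one year old.
        return score / (1.0 + age_years)
```

Each incoming request would then build a fresh instance (combined with a real
weighting class such as BM25F) and hand it to the Searcher it creates.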
It is probably somewhat obvious but should also be mentioned that the final
method is only called on documents that match in some way and not every
document in the collection. Consequently you cannot use the final method to
change the score of documents that do not match at all.
Roger