Backend interface


Matt Chaput

May 21, 2012, 12:30:26 AM
to django-...@googlegroups.com
Hi all, after a long time I'm back on the case trying to rewrite the way-out-of-date Whoosh backend. I think I understand a little more about Haystack this time around, but I'm hoping people familiar with the backend interface can help me out a bit with some of the internals.

Unfortunately both the existing code and the documentation are unclear in a few places regarding what is being passed in the arguments to the backend methods, and I seem to be preternaturally unable to set up a working Django project to play with them, so if anyone can tell me about the following, I'd appreciate it.

(The more specificity and detail the better, for example "a list of UTC datetime objects" rather than "some dates", "an unparsed query string" rather than "the thing to search for", etc. Examples welcome ;)

Thanks for any help!

Matt


General questions
=================

* Is it necessary for the backend to deal with queries internally as strings? (For example the return value of build_query_fragment, the argument to build_not_query, etc.)

If not I might prefer to change the Whoosh backend to generate Whoosh query objects directly instead of generating strings and then parsing them, because Whoosh has many advanced capabilities that don't have equivalent syntax in the query parser. (For example, I could implement the "startswith" filter type using a positional query object, but there's currently no syntax for that in Whoosh's query parser.)

However, if the queries must be strings, I can try working around this by adding
more complex syntax to the query parser.
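
To make the trade-off concrete, here is a minimal sketch of a backend that builds query objects and renders them to strings only where a string syntax exists. The classes `Term`, `And`, `Not`, `StartsWith` and the helper `build_fragment` are hypothetical stand-ins written for this illustration; they are not Whoosh's or Haystack's actual classes.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Term:
    field: str
    text: str

    def to_string(self) -> str:
        return f"{self.field}:{self.text}"


@dataclass(frozen=True)
class Not:
    child: object

    def to_string(self) -> str:
        return "NOT " + self.child.to_string()


@dataclass(frozen=True)
class And:
    children: Tuple[object, ...]

    def to_string(self) -> str:
        return "(" + " AND ".join(c.to_string() for c in self.children) + ")"


@dataclass(frozen=True)
class StartsWith:
    """Hypothetical positional query with no parser syntax."""
    field: str
    text: str

    def to_string(self) -> str:
        raise ValueError("no query-parser syntax for a 'startswith' query")


def build_fragment(field, filter_type, value):
    """Map a (field, filter_type, value) triple to a query object."""
    if filter_type == "exact":
        return Term(field, value)
    if filter_type == "startswith":
        return StartsWith(field, value)
    raise ValueError(f"unsupported filter type: {filter_type!r}")


query = And((
    build_fragment("title", "exact", "whoosh"),
    Not(build_fragment("body", "exact", "solr")),
))
print(query.to_string())
```

If the interface only ever passes strings between methods, the `StartsWith` case is exactly where this breaks down: the object can be built, but it cannot survive a round trip through a string.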


WhooshSearchBackend.search()
============================

* sort_by -- This is an iterable of Whoosh field names,
with a name preceded by "-" if the order is reversed, correct?

* facets -- What's in this? How is Haystack faceting supposed to work?

* date_facets -- What's in this? How is Haystack date faceting supposed to work?
From the other backends it seems like it's a fairly complex nested dictionary, with stuff
that needs to be interpreted, e.g. "gap_by".

* query_facets -- What's in this? How is Haystack query faceting supposed to work?

* narrow_queries -- If not None, an iterable of query strings, where only documents that match ALL these queries are allowed in the results, correct?

* spelling_query -- Where does this come from? Is it computed from the query string?
Just curious.

* within, dwithin, distance_point -- What's in these, and how should they work?

* Exactly what do I put in the "facets" key of the returned dictionary?
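
For reference, this is how I would interpret sort_by and narrow_queries if my guesses above are right. It is only a sketch under those assumptions: `parse_sort_by`, `apply_narrow_queries`, `run_query` and `TOY_INDEX` are hypothetical names made up for the example, and the toy index stands in for a real searcher.

```python
def parse_sort_by(sort_by):
    """Turn ["-pub_date", "title"] into [("pub_date", True), ("title", False)].

    The boolean is True when the field should be sorted in reverse order.
    """
    parsed = []
    for name in sort_by or []:
        reverse = name.startswith("-")
        parsed.append((name[1:] if reverse else name, reverse))
    return parsed


def apply_narrow_queries(result_ids, narrow_queries, run_query):
    """Keep only the ids that match ALL narrowing queries, preserving order."""
    allowed = set(result_ids)
    for query_string in narrow_queries or ():
        allowed &= set(run_query(query_string))
    return [doc_id for doc_id in result_ids if doc_id in allowed]


# Toy stand-in for an index: query string -> matching document ids.
TOY_INDEX = {
    "django_ct:blog.post": [1, 2, 3],
    "author:matt": [2, 3, 4],
}


def run_query(query_string):
    return TOY_INDEX.get(query_string, [])


print(parse_sort_by(["-pub_date", "title"]))
print(apply_narrow_queries([3, 1, 2, 4], ["django_ct:blog.post", "author:matt"], run_query))
```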


WhooshSearchBackend.create_spelling_suggestion()
================================================

* Is this part of the official API? Or can I change the signature?

* This method seems to take a query_string such as "alpha brxvo charlie dxlta" and return
a string like "bravo delta". Is that right? It doesn't seem very useful because it
doesn't associate the suggestion with the original word, but if that's the way it's
supposed to work, then OK.
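
To illustrate the difference, here is a rough sketch of both behaviours using only the standard library's difflib. The function names and the vocabulary list are hypothetical; this is not how the current backend computes suggestions.

```python
import difflib


def suggest_corrections(query_string, vocabulary):
    """Map each unknown word in the query to its closest known word."""
    suggestions = {}
    for word in query_string.split():
        if word in vocabulary:
            continue  # already a known term, nothing to suggest
        matches = difflib.get_close_matches(word, vocabulary, n=1)
        if matches:
            suggestions[word] = matches[0]
    return suggestions


def flat_suggestion(query_string, vocabulary):
    """The flat form described above: corrected words only, joined by spaces."""
    return " ".join(suggest_corrections(query_string, vocabulary).values())


vocabulary = ["alpha", "bravo", "charlie", "delta"]
query = "alpha brxvo charlie dxlta"

print(suggest_corrections(query, vocabulary))
print(flat_suggestion(query, vocabulary))
```

The mapping form keeps each suggestion tied to the word it replaces, which is the association the flat string loses.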


WhooshSearchQuery
=================

Probably more questions about this later ;)


Matt Chaput

May 21, 2012, 5:07:29 AM
to django-...@googlegroups.com
> * date_facets -- What's in this? How is Haystack date faceting supposed to work?
> From the other backends it seems like it's a fairly complex nested dictionary, with stuff
> that needs to be interpreted, e.g. "gap_by".

OK, after I sent this I realized the trick is to look at the "add_" methods on BaseSearchQuery, and this doesn't seem as complex as I thought it was when looking at it from the receiving side. But if someone can confirm the datatypes, and especially how Haystack expects the "facets" key of the results to be populated, I'd appreciate it :)
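
Here is my current reading as a sketch, so it is easy to correct. It assumes date_facets maps a field name to a dict with "start_date", "end_date", "gap_by" and "gap_amount" keys, and that the returned "facets" value has "fields", "dates" and "queries" sub-dictionaries, as the other backends appear to do. Both shapes are assumptions, and `bucket_dates` and `GAPS` are hypothetical helpers that only handle fixed-length gaps.

```python
from collections import Counter
from datetime import datetime, timedelta

# Assumed shape of the date_facets argument.
date_facets = {
    "pub_date": {
        "start_date": datetime(2012, 5, 1),
        "end_date": datetime(2012, 5, 4),
        "gap_by": "day",
        "gap_amount": 1,
    },
}

# Only fixed-length gaps; "month" and "year" would need calendar arithmetic.
GAPS = {
    "day": timedelta(days=1),
    "hour": timedelta(hours=1),
    "minute": timedelta(minutes=1),
}


def bucket_dates(values, start_date, end_date, gap_by, gap_amount=1):
    """Count values per gap-sized bucket in [start_date, end_date)."""
    step = GAPS[gap_by] * gap_amount
    counts = Counter()
    for value in values:
        if start_date <= value < end_date:
            index = (value - start_date) // step
            counts[start_date + index * step] += 1
    return sorted(counts.items())


pub_dates = [
    datetime(2012, 5, 1, 9),
    datetime(2012, 5, 1, 17),
    datetime(2012, 5, 3, 8),
    datetime(2012, 5, 9),  # outside the range, ignored
]

# Assumed shape of the "facets" value in the returned results.
facets = {
    "fields": {},
    "dates": {
        field: bucket_dates(pub_dates, **details)
        for field, details in date_facets.items()
    },
    "queries": {},
}
print(facets["dates"]["pub_date"])
```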

Thanks,

Matt
