haystack whoosh wildcards - how to really enable it?

DK

May 11, 2011, 3:21:09 AM
to django-haystack
I've read several posts saying that haystack supports whoosh wildcards
(covered by tests), but queries like part* or th?s do not work (via the
standard SearchView); it just tries to make exact matches.

What is the proper way to go from the search query field in a form to
results with wildcards enabled? I really can't find any working
solution for that, and I was thinking this should just work out
of the box as an obvious feature.

Daniel Lindsley

May 11, 2011, 10:41:50 PM
to django-...@googlegroups.com
DK,


If you're submitting queries through the web form, wildcards are intentionally escaped as part of the call to "SearchQuerySet.auto_query". If you trust your users, you can override "SearchForm.search" to call "SearchQuerySet.filter" instead, which does not escape the query (but can 500 if your users submit a malformed query).

Knowing more about why you need the wildcard might help provide a better alternative.


Daniel

DK

May 19, 2011, 12:08:01 PM
to django-...@googlegroups.com
Daniel,

first of all, I resolved a strange problem: haystack==1.2.1 does not work correctly with Whoosh==1.8.3. After rebuilding the index (tried with and without the batch option), SearchQuerySet().all().count() returns a completely wrong number of documents. What's more, searching (with auto_query) behaves unpredictably, returning plenty of Nones (yes, I read the other posts on this group about deleted items and Nones, but this is not that issue).

Anyway, I read in another post that the last combination of haystack and whoosh known to work is 1.1.0 + 1.4.0. That is what I am currently using, and it seems to be fine.


I've dug into the auto_query method, and I still do not understand why it does not work with wildcards. It seems that the clean call on line 367 of query.py

                cleaned_keyword = clone.query.clean(keyword)

does not remove the '*' sign from the keyword. I even put a "print" after this line to get the value of cleaned_keyword, and it DID NOT remove the * sign.

As far as I understand, providing only one keyword to auto_query() will just call clone.filter() on itself:

                clone = clone.filter(content=cleaned_keyword)

So this behavior should be exactly the same (but it isn't).

>>> SearchQuerySet().filter(content='krzysztof')
[<SearchResult: auth.user (pk=u'5')>, <SearchResult: auth.user (pk=u'55')>]
>>> SearchQuerySet().auto_query('krzysztof')
[<SearchResult: auth.user (pk=u'5')>, <SearchResult: auth.user (pk=u'55')>]

>>> SearchQuerySet().filter(content='krzyszto*')
[<SearchResult: auth.user (pk=u'5')>, <SearchResult: auth.user (pk=u'55')>]
>>> SearchQuerySet().auto_query('krzyszto*')
[]


So I really do not understand: if auto_query('krzyszto*') ends up calling clone.filter('krzyszto*'), why is the result an empty set?

DK

May 19, 2011, 12:32:44 PM
to django-...@googlegroups.com
Sorry, mystery solved :) 

clone.query.clean('krzysztof*') returns " 'krzysztof*' ", i.e. the keyword comes back wrapped in quotes, so the * is no longer treated as a wildcard ... and this is obvious now.
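
For anyone finding this thread later, here is a minimal sketch of the quoting behavior described above. It is a simplified stand-in, not haystack's actual code: the function name clean_keyword and the RESERVED_CHARACTERS tuple are assumptions made up for illustration, and the real backend may use a different character list.

```python
# Simplified stand-in for the keyword cleaning step; NOT haystack's real code.
# RESERVED_CHARACTERS is an assumed, illustrative subset.
RESERVED_CHARACTERS = ('*', '?', ':', '(', ')', '"', '^', '~')


def clean_keyword(keyword):
    """Wrap the keyword in single quotes if it contains a reserved character."""
    for char in RESERVED_CHARACTERS:
        if char in keyword:
            return "'%s'" % keyword
    return keyword


print(clean_keyword('krzysztof'))  # no reserved character, returned unchanged
print(clean_keyword('krzyszto*'))  # contains *, so it comes back quoted
```

If the cleaning step behaves roughly like this, the value passed on to filter() after auto_query is the quoted string rather than the bare krzyszto*, which would explain why calling filter(content='krzyszto*') directly matches while auto_query('krzyszto*') does not.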
