Why does 'icontains' not work as a valid filter?


Praveen

May 6, 2009, 4:04:26 AM
to django-haystack
Hi all. I have a Movie class, and one of the movies is named 'pirates of
the carabean'.
When I test in the Django shell like this:

Movie.objects.filter(title__icontains='rates')

it returns the movie object:
[<Movie: pirates of the carabean>]

I found the valid filters in constants.py, added one more value to
VALID_FILTERS, and thought it would work:

VALID_FILTERS = set(['exact', 'gt', 'gte', 'lt', 'lte', 'in',
'icontains'])

But it did not work.

So what is the use of VALID_FILTERS, and why can't we
search like this?

Thanks

Matías Costa

May 6, 2009, 5:45:10 AM
to django-...@googlegroups.com
What's your point? I mean, what are you trying to accomplish?

Anyway, some thoughts:

* Are you querying the database? If so, you are using the SQL
construct LIKE '%rates%'.
* Hacking Django/Haystack internals is not a good idea. But if you
do, be sure you are modifying constants.VALID_FILTERS and not creating
a local copy.
* Setting VALID_FILTERS is not going to add your desired functionality magically.
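
For reference, here is a minimal sketch of what the database-side LIKE '%rates%' match looks like, using Python's built-in sqlite3 module. The table layout and the helper name like_contains are assumptions made for illustration; this is not Django's or Haystack's code, only the kind of SQL that a title__icontains lookup roughly corresponds to.

```python
import sqlite3

# In-memory table standing in for the Movie model (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movie (title TEXT)")
conn.executemany(
    "INSERT INTO movie (title) VALUES (?)",
    [("pirates of the carabean",), ("Pirates",), ("Pirul",), ("Titanic",)],
)


def like_contains(fragment):
    """Return titles that contain `fragment`, using LIKE '%fragment%'.

    SQLite's LIKE is case-insensitive for ASCII by default. The % and _
    characters inside `fragment` are not escaped in this sketch.
    """
    rows = conn.execute(
        "SELECT title FROM movie WHERE title LIKE ?",
        ("%" + fragment + "%",),
    )
    return [title for (title,) in rows]


print(like_contains("rates"))
```

The match is done entirely by the database server scanning the column, which is why it works in the Django shell without any search index being involved.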

Praveen

May 6, 2009, 6:45:22 AM
to django-haystack
Yeah, I was trying to write the LIKE query. Then what should I do?
Do you mean to say we cannot use a 'LIKE' query in Haystack? If we
cannot, then this extra feature should be added to Haystack.

Matías Costa

May 6, 2009, 7:35:56 AM
to django-...@googlegroups.com
But that LIKE query is SQL and is executed by your DB server. It is not
Haystack related.

I ask again: what are you trying to accomplish?

Praveen

May 6, 2009, 7:42:08 AM
to django-haystack
OK, my movie table contains:
'Pirates of the carabean'
'Pirates'
'Pirul'
'Titanic'

If someone types *irate* in the search box (please ignore the *), it
should retrieve:
'Pirates of the carabean'
'Pirates'

If someone types *Pir* in the search box (please ignore the *), it
should retrieve:
'Pirates of the carabean'
'Pirates'
'Pirul'

In query.py there is a function, auto_query.
I do not know why it tries to do an exact match:
# Pull out anything wrapped in quotes and do an exact match on it.
quote_regex = re.compile(r'([\'"])(.*?)\1')
result = quote_regex.search(query_string)

I do not want an ***exact_match***.

Thanks

Matías Costa

May 6, 2009, 8:17:48 AM
to django-...@googlegroups.com
That code does an exact match only if you pass something in quotes. In
your example, if you pass *"pirates of"*, it searches for exactly
that phrase. If you give *pirates of*, it searches for both words.

Now, well, I get your point. What you are trying to do is insane. With
Lucene you can do *irate* (the * is part of the string) and it finds what
you want. Whoosh does not like *.

Someone with more knowledge of Whoosh or Haystack should speak up now.
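
To make the quoting behaviour concrete, here is a small sketch that applies the same regular expression quoted from auto_query to a raw search string. The helper split_query is hypothetical and written only for this illustration; it is not how Haystack itself structures the code.

```python
import re

# Same pattern as in the auto_query snippet quoted in this thread.
quote_regex = re.compile(r'([\'"])(.*?)\1')


def split_query(query_string):
    """Return (exact_phrases, remaining_words) for a raw search string.

    Anything wrapped in matching single or double quotes is treated as an
    exact phrase; everything else is split into individual words.
    """
    exact_phrases = [m.group(2) for m in quote_regex.finditer(query_string)]
    remainder = quote_regex.sub(' ', query_string)
    return exact_phrases, remainder.split()


print(split_query('"pirates of" titanic'))
print(split_query('pirates of'))
```

Without quotes, the list of exact phrases is empty and each word is searched on its own, so the exact-match branch only applies when the user explicitly quotes part of the query.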

Praveen

May 6, 2009, 9:13:07 AM
to django-haystack
Please ignore the (*).
OK, let me explain again.
If I search for irate, it should return:

Pirates of the carabean
Pirates

and if I search for the pattern pir, it should return:
'Pirates of the carabean'
'Pirates'
'Pirul'


In query.py there is a function, auto_query.
I do not know why it tries to do an exact match:
# Pull out anything wrapped in quotes and do an exact match on it.
quote_regex = re.compile(r'([\'"])(.*?)\1')
result = quote_regex.search(query_string)

I do not want an ***exact_match***.

In forms.py:

class SearchForm(forms.Form):
    q = forms.CharField(required=False)

    def __init__(self, *args, **kwargs):
        self.searchqueryset = kwargs.get('searchqueryset', None)

        if self.searchqueryset is None:
            self.searchqueryset = SearchQuerySet()

        try:
            del(kwargs['searchqueryset'])
        except KeyError:
            pass

        super(SearchForm, self).__init__(*args, **kwargs)

    def search(self):
        print "I am called in b/w"
        self.clean()
        #return self.searchqueryset.auto_query(self.cleaned_data['q'])
        print "First:", self.searchqueryset.filter(content__icontains='irate')

The last print did not print any value.

Thanks

Daniel Lindsley

May 6, 2009, 9:27:22 AM
to django-...@googlegroups.com
Praveen,


So, if I'm understanding properly, what you're looking for is the
equivalent of 'LIKE' in SQL. The thing to understand here is that the
search engines that back Haystack aren't like an RDBMS.

You're not wrong in what you were trying to add to Haystack
('icontains'), but just adding it to the VALID_FILTERS list won't
automatically make it work. Just like in Django proper, code has to be
added to each backend to process the new filter type and add it to the
search query. In that regard, Matías is correct. The low-level
implementation will involve using wildcard characters with each
backend.
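
As a rough illustration of that point, the sketch below separates the whitelist of filter names from the code that actually turns a filter into a query string. Everything here is an assumption made for illustration: QUERY_TEMPLATES, build_query_fragment, and the Lucene-style wildcard syntax are hypothetical stand-ins, not Haystack's real backend code.

```python
# Whitelist of accepted filter names (hypothetical, modelled on VALID_FILTERS).
VALID_FILTERS = {'exact', 'startswith', 'icontains', 'endswith'}

# Per-filter translation into a Lucene-style query fragment (hypothetical).
# 'endswith' is whitelisted above but deliberately has no template here.
QUERY_TEMPLATES = {
    'exact': '{field}:"{value}"',
    'startswith': '{field}:{value}*',
    # Leading wildcard: many engines reject or disable this form.
    'icontains': '{field}:*{value}*',
}


def build_query_fragment(field, filter_type, value):
    """Translate one filter into a query-string fragment."""
    if filter_type not in VALID_FILTERS:
        raise ValueError("unknown filter type: %s" % filter_type)
    template = QUERY_TEMPLATES.get(filter_type)
    if template is None:
        # Whitelisted, but no backend code exists to process it.
        raise NotImplementedError("no translation for: %s" % filter_type)
    return template.format(field=field, value=value)


print(build_query_fragment('title', 'startswith', 'pir'))
```

In this sketch, adding a name to the whitelist only gets it past the first check; the filter still does nothing useful until a translation for it exists in each backend.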

I've created an issue to add this feature to Haystack
(http://github.com/toastdriven/django-haystack/issues/#issue/24). What
you need to be wary of is that, once this is in place, performance is
going to drop (because just like SQL, this is an expensive operation)
and your results may not match what you think they should due to how
the engine implements relevancy and how it searches its index.

Hopefully, I'll get to this in the near future.


Daniel




Praveen

May 6, 2009, 9:32:49 AM
to django-haystack
OK, thanks. I hope this feature will be added in the future. For the
time being, I removed the * from the Whoosh backend and it works.

Daniel Lindsley

May 17, 2009, 11:56:12 PM
to django-...@googlegroups.com
Praveen,


After investigation, most of the targeted backends don't support a
leading wildcard, something required for an 'icontains/endswith'
filter. However, since most support trailing wildcards, I've added a
'startswith' filter. This gets you a good part of the way, since most
engines tokenize/stem all of the words in the document. The only case
this doesn't handle is searching for the end of a word, which doesn't
look to be possible.
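
To show why a trailing wildcard covers the 'pir' case but not the 'irate' case, here is a plain-Python sketch of prefix matching over tokenized words. It is only a model of the idea: the tokenize and startswith_search helpers are hypothetical, and real engines also apply stemming and relevancy scoring, which this sketch leaves out.

```python
import re

TITLES = ['Pirates of the carabean', 'Pirates', 'Pirul', 'Titanic']


def tokenize(text):
    """Split text into lowercase word tokens, as an index roughly would."""
    return re.findall(r'\w+', text.lower())


def startswith_search(prefix, documents):
    """Return documents where at least one token starts with `prefix`."""
    prefix = prefix.lower()
    return [
        doc for doc in documents
        if any(token.startswith(prefix) for token in tokenize(doc))
    ]


print(startswith_search('pir', TITLES))
print(startswith_search('car', TITLES))
print(startswith_search('irate', TITLES))
```

Because every word is its own token, a prefix can match a word in the middle of a title (such as 'car'), but a fragment taken from the middle or end of a word (such as 'irate') matches nothing.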


Daniel