SearchQuerySet().models() not behaving correctly

worksology

Feb 13, 2012, 12:35:24 PM
to django-haystack
I've been stuck on this for two days so I thought it was time to
publicly ask for help. While this isn't exactly the full story, this
is the minimum in which the problem is reproducible.

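# media/search_indexes.py

from haystack import indexes

from media.models import Photo, Tweet  # assumed location of the models


class PhotoIndex(indexes.ModelSearchIndex, indexes.Indexable):
    class Meta:
        model = Photo


class TweetIndex(indexes.ModelSearchIndex, indexes.Indexable):
    class Meta:
        model = Tweet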


> sqs = SearchQuerySet().models(Photo)
> len(sqs)
153

But when I try to render them in a template, about 6 render correctly
and the rest are None all the way down. When I comment out TweetIndex
and run rebuild_index, everything works as expected: 153 paginated
photos. When I drop the .models() call from the SearchQuerySet, it
also works correctly (but, of course, the results then include
hundreds of unwanted Tweets, too).

One odd thing I noticed: when the SearchQuerySet with .models()
applied is repr()'d, it says "Error in formatting: list assignment
index out of range", traceable back to line 78 of query.py:

data[-1] = "...(remaining elements truncated)..."

At that point:
> len(self)
153

but:
> print self
[]
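
The two outputs above are enough to explain the formatting error: if
len() reports more items than the slice actually returns, the
truncation line assigns into an empty list. Here is a minimal sketch
of that failure. BrokenResults is a hypothetical stand-in, not
Haystack's actual SearchQuerySet, and the cutoff of 20 is an assumed
value for illustration:

```python
REPR_OUTPUT_SIZE = 20  # assumed cutoff, for illustration only


class BrokenResults(object):
    """Hypothetical stand-in: reports 153 hits but yields none of them."""

    def __len__(self):
        return 153

    def __getitem__(self, key):
        # Simulates a backend that returns no usable rows for any slice.
        return []

    def __repr__(self):
        data = list(self[:REPR_OUTPUT_SIZE])
        if len(self) > REPR_OUTPUT_SIZE:
            # Fails when data is empty, as in the report above.
            data[-1] = "...(remaining elements truncated)..."
        return repr(data)


def repr_error(obj):
    """Return the error message raised by repr(obj), or None if it succeeds."""
    try:
        repr(obj)
    except IndexError as exc:
        return str(exc)
    return None
```

So the repr error looks like a symptom of the count and the contents
disagreeing, rather than a separate bug.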

So, I don't know. I've done a lot of digging but I don't know enough
about what's going on to fix it, apparently.

> pip freeze
Django==1.3.1
Whoosh==2.3.2
django-haystack==2.0.0-beta


Josh

worksology

Feb 14, 2012, 8:17:05 PM
to django-haystack
After a little more digging, I've found something else interesting.
When len(Tweet.objects.all()) > len(Photo.objects.all()), none of the
photos show up. But as I slowly deleted tweets, I noticed that the
number of Photos that worked increased in lock-step. When the number
of Tweets was 10, exactly 10 photos would not work. When exactly one
tweet existed, exactly one photo would not work. When no tweets
existed, all the photos would work. I don't pretend to understand any
of this, but clearly there's some bug here. What else can I try? What
other information can I provide?
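
One way to put numbers on this is to count the None entries directly
and compare the count against Tweet.objects.count(). This is a rough
diagnostic sketch; count_missing is a hypothetical helper name, and it
assumes that iterating the results yields the same None placeholders
that show up in the template:

```python
def count_missing(results):
    """Return (total, missing), where missing counts the None entries."""
    total = 0
    missing = 0
    for result in results:
        total += 1
        if result is None:
            missing += 1
    return total, missing
```

It could be called as count_missing(SearchQuerySet().models(Photo))
after each batch of Tweets is deleted, to check whether the second
number really tracks the Tweet count.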

worksology

Feb 14, 2012, 9:45:39 PM
to django-haystack
Somehow I've overlooked one other thing I'm doing:

sqs.order_by("-pub_date")

which seems to be turning the Tweet objects into None's. I'm still not
sure why, but I'm guessing there's some sort of "pub_date" field
conflict on the Tweet index. I'll update when I find something. Sorry
for the distraction!

worksology

Feb 14, 2012, 10:13:27 PM
to django-haystack
Okay, that can't be the whole story because there's still clearly a
relationship between indexed Tweets and the contents of
SearchQuerySet().models(Photo). This problem is evident only when
the .models() method is used and then the results are ordered
with .order_by(). Here's some illustrating code:

>>> searchqueryset = SearchQuerySet().models(Photo)
>>> photos = searchqueryset.order_by("-pub_date")
>>> paginator = Paginator(photos, 20)
>>> for photo in paginator.page(4).object_list:
...     print photo
...
<SearchResult: media.photo (pk=u'6713176893')>
None
None

(Those two None values correspond directly to the number of Tweets
indexed; right now there are two indexed Tweet objects.)

>>> paginator_unsorted = Paginator(searchqueryset, 20)
>>> for photo in paginator_unsorted.page(4).object_list:
...     print photo
...
<SearchResult: media.photo (pk=u'95306086_137695')>
<SearchResult: media.photo (pk=u'99004819_137695')>
<SearchResult: media.photo (pk=u'99039878_137695')>
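
Since the unsorted query returns complete results, one possible
stopgap is to skip .order_by() and sort in Python instead. This is
only a sketch under assumptions: newest_first is a hypothetical helper,
and it assumes pub_date is a stored field that each SearchResult
exposes as an attribute. It also loads every result into memory, which
is tolerable for ~150 photos but not for a large index.

```python
from operator import attrgetter


def newest_first(results, field="pub_date"):
    """Drop None placeholders and sort the rest by `field`, newest first."""
    present = [r for r in results if r is not None]
    return sorted(present, key=attrgetter(field), reverse=True)
```

The returned list can be handed to Paginator in place of the sorted
SearchQuerySet, e.g. Paginator(newest_first(searchqueryset), 20).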

worksology

Feb 15, 2012, 2:25:21 AM
to django-haystack
Looks like I'm bumping into a known problem with Whoosh, already
documented in Haystack's GitHub issues:

https://github.com/toastdriven/django-haystack/issues/418

In retrospect, installing Solr for local development would probably
have taken much less effort than getting to the root of this
Whoosh-related issue.

worksology

Feb 15, 2012, 5:04:42 AM
to django-haystack
I've confirmed that switching to Solr has removed the problem, but I
was never able to determine whether the bug is in Haystack
(whoosh_backend) or in Whoosh itself. I assume the latter.
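
For anyone making the same switch on Haystack 2.x, the settings change
is roughly the following. The URL is an assumption (the usual address
of a default local Solr install), so adjust it to your setup:

```python
# settings.py (sketch): point Haystack 2.x at a local Solr instance.
HAYSTACK_CONNECTIONS = {
    "default": {
        "ENGINE": "haystack.backends.solr_backend.SolrEngine",
        # Assumed URL of a default local Solr install; adjust as needed.
        "URL": "http://127.0.0.1:8983/solr",
    },
}
```

The Solr backend also needs pysolr installed and a schema generated
with the build_solr_schema management command, followed by a
rebuild_index.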