Django 1.8, Haystack 2.4.1, Boost not affecting score


John Obelenus

May 24, 2016, 1:33:17 PM
to django-haystack, John Obelenus
My project is on Django 1.9, and haystack doesn't have a compatible release yet. I was working off master, and then I realized that a bunch of boost support for elasticsearch was being removed. So I backported my project to 1.8 so I could use the official 2.4.1 release and see whether I can get document and field boosting working. It seems to me that neither is working.

Here is what I am doing:
from haystack.query import SearchQuerySet, SQ, AutoQuery
search_term = 'C100'
sqs = SearchQuerySet().models(Product).filter(SQ(text=AutoQuery(search_term)) | SQ(brand_name=search_term))
for s in sqs:
    print s.boost, s.score, s.brand_name, ':', s.object.name

And no matter how I define the indexes the score never changes. Here are three definitions I am trying and you can see the output of the above code in the class comment:
class ProductIndex(CelerySearchIndex, indexes.Indexable):
    """
1.5 0.84893364 Zacuto : Zacuto Recoil Shoulder Rig for C100, C100II & C300
1.5 0.84868896 Zacuto : Zacuto Z-Finder Pro for Canon C100
1.0 0.70724076 Zacuto : Zacuto Studio Baseplate with 12" Rods for C100/C100II/C300/C500
1.0 0.70021886 Canon : Canon CA-930 Battery Charger for C100/C100II/C300/C500
1.0 0.6494849 Anton Bauer : Anton Bauer QRC-CA940 Battery Plate for Canon C500, C300 & C100
1.0 0.6002867 Canon : Canon BP-970G Battery Pack for C100/C100II/C300/C500
2.0 0.5658623 Canon : Canon C100 Mark II Digital Cinema Camera EF Mount
1.5 0.4332277 Canon : Canon 100 f/2.8L IS Macro
    """
    text = indexes.CharField(document=True, use_template=True)
    product_id = indexes.IntegerField(model_attr='id', null=False)
    brand_name = indexes.CharField(model_attr='brand__name', null=True, weight=1.75)


    def get_model(self):
        return models.Product


    def index_queryset(self, using=None):
        return self.get_model().objects.filter(is_active=True)


    def prepare(self, instance):
        data = super(ProductIndex, self).prepare(instance)
        data['boost'] = 1.5  # even weight everything
        if instance.is_camera:
            data['boost'] = 2.0  # weight cameras higher
        elif instance.is_accessory:
            data['boost'] = 1.0  # weight accessories lower
        return data




class ProductIndex(CelerySearchIndex, indexes.Indexable):
    """

None 0.84893364 Zacuto : Zacuto Recoil Shoulder Rig for C100, C100II & C300
None 0.84868896 Zacuto : Zacuto Z-Finder Pro for Canon C100
None 0.70724076 Zacuto : Zacuto Studio Baseplate with 12" Rods for C100/C100II/C300/C500
None 0.70021886 Canon : Canon CA-930 Battery Charger for C100/C100II/C300/C500
None 0.6494849 Anton Bauer : Anton Bauer QRC-CA940 Battery Plate for Canon C500, C300 & C100
None 0.6002867 Canon : Canon BP-970G Battery Pack for C100/C100II/C300/C500
None 0.5658623 Canon : Canon C100 Mark II Digital Cinema Camera EF Mount
None 0.4332277 Canon : Canon 100 f/2.8L IS Macro
    """

    text = indexes.CharField(document=True, use_template=True)
    product_id = indexes.IntegerField(model_attr='id', null=False)
    brand_name = indexes.CharField(model_attr='brand__name', null=True, weight=5.0)


    def get_model(self):
        return models.Product


    def index_queryset(self, using=None):
        return self.get_model().objects.filter(is_active=True)


class ProductIndex(CelerySearchIndex, indexes.Indexable):
    """
None 0.84893364 Zacuto : Zacuto Recoil Shoulder Rig for C100, C100II & C300
None 0.84868896 Zacuto : Zacuto Z-Finder Pro for Canon C100
None 0.70724076 Zacuto : Zacuto Studio Baseplate with 12" Rods for C100/C100II/C300/C500
None 0.70021886 Canon : Canon CA-930 Battery Charger for C100/C100II/C300/C500
None 0.6889441 Anton Bauer : Anton Bauer QRC-CA940 Battery Plate for Canon C500, C300 & C100
None 0.6002867 Canon : Canon BP-970G Battery Pack for C100/C100II/C300/C500
None 0.5658623 Canon : Canon C100 Mark II Digital Cinema Camera EF Mount
None 0.4332277 Canon : Canon 100 f/2.8L IS Macro
    """
    text = indexes.CharField(document=True, use_template=True)
    product_id = indexes.IntegerField(model_attr='id', null=False)
    brand_name = indexes.CharField(model_attr='brand__name', null=True)


    def get_model(self):
        return models.Product


    def index_queryset(self, using=None):
        return self.get_model().objects.filter(is_active=True)


Compared with the non-boosted index, the score never changes, whether I use document boost or field boost. The haystack documentation says this stuff is supported, and I'm following the documentation example of how to query against boosted fields. Am I doing something entirely wrong? What else can I try to get this working?

John Obelenus

Sep 1, 2016, 2:49:04 PM
to django-haystack, jobe...@activefrequency.com
I've just updated to Django Haystack 2.5.0 on Django 1.9.6 and am still seeing this problem. Whether I define the boost on the brand name field, through the prepare method, or not at all, the `score` attribute on the search results never changes.

I see the boost coming through in the `prepared_results` dictionary that gets sent through the elasticsearch `bulk` call. And I see the `brand_name` field boost in elasticsearch as well:

curl 'localhost:9200/haystack/_mapping/modelresult/field/brand_name/'


{"haystack":{"mappings":{"modelresult":{"brand_name":{"full_name":"brand_name","mapping":{"brand_name":{"type":"string","boost":1.75,"analyzer":"snowball"}}}}}}}
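
The same check can be done from Python by parsing the mapping response. This is only a sketch: `field_boost` is a hypothetical helper, not part of haystack or the elasticsearch client, and the response string is the curl output above with its final closing brace restored so that it parses.

```python
import json

# Response body from the curl call above (final closing brace restored).
MAPPING_RESPONSE = (
    '{"haystack":{"mappings":{"modelresult":{"brand_name":'
    '{"full_name":"brand_name","mapping":{"brand_name":'
    '{"type":"string","boost":1.75,"analyzer":"snowball"}}}}}}}'
)


def field_boost(response_text, index, doc_type, field):
    """Return the boost stored in a field mapping, or None if none is set."""
    body = json.loads(response_text)
    field_info = body[index]["mappings"][doc_type][field]
    return field_info["mapping"][field].get("boost")


print(field_boost(MAPPING_RESPONSE, "haystack", "modelresult", "brand_name"))
```

A value here only confirms that the boost reached the mapping; it says nothing about whether the query ever matches that field.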


So what am I missing? Is this an AWS only issue? Is this an elasticsearch issue?

John Obelenus

Sep 1, 2016, 4:52:24 PM
to django-haystack, John Obelenus
OK, I think I broke through. I captured the query that was being sent to elasticsearch. It was:

{u'query': {u'filtered': {u'filter': {u'terms': {u'django_ct': [u'lenspro.product']}}, u'query': {u'query_string': {u'fuzzy_max_expansions': 50, u'auto_generate_phrase_queries': True, u'default_operator': u'AND', u'analyze_wildcard': True, u'query': u'(text:(Canon C100) OR brand_name:(Canon AND C100))', u'default_field': u'text', u'fuzzy_min_sim': 0.5}}}}, u'from': 0, u'size': 10}

And that brand_name clause is obviously wrong: nothing has C100 in the brand name, so the clause never matched and the boost on that field never came into play. When I manually remove C100 from the brand_name search terms I get some meaningful differences:

{u'query': {u'filtered': {u'filter': {u'terms': {u'django_ct': [u'lenspro.product']}}, u'query': {u'query_string': {u'fuzzy_max_expansions': 50, u'auto_generate_phrase_queries': True, u'default_operator': u'AND', u'analyze_wildcard': True, u'query': u'(text:(Canon C100) OR brand_name:(Canon))', u'default_field': u'text', u'fuzzy_min_sim': 0.5}}}}, u'from': 0, u'size': 10}

1.0 10.491137 Canon : Canon CA-930 Battery Charger for C100/C100II/C300/C500
1.0 10.025042 Canon : Canon BP-970G Battery Pack for C100/C100II/C300/C500
2.0 9.703276 Canon : Canon C100 Mark II Digital Cinema Camera EF Mount
1.5 9.525196 Canon : Canon 100 f/2.8L IS Macro
1.5 3.8322778 Canon : Canon 17-55 f/2.8 IS
1.5 3.8322778 Canon : Canon 24-105 f/4L IS
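
The manual edit above, keeping only the brand word in the brand_name clause, could be sketched in plain Python as follows. This is an illustration only: `build_query_string` and the list of known brands are hypothetical, it is not how haystack builds its query, and it only handles single-word brand names.

```python
def build_query_string(search_term, known_brands):
    """Build a query string shaped like the ones shown in the thread.

    Only tokens that are known brand names go into the brand_name clause,
    so that clause is not forced to also match a model number like C100.
    """
    tokens = search_term.split()
    brands = {b.lower() for b in known_brands}
    brand_tokens = [t for t in tokens if t.lower() in brands]

    text_clause = "text:({})".format(" ".join(tokens))
    if not brand_tokens:
        # No brand word in the search term: query the text field only.
        return "({})".format(text_clause)
    brand_clause = "brand_name:({})".format(" AND ".join(brand_tokens))
    return "({} OR {})".format(text_clause, brand_clause)


# Hypothetical brand list; in a real project it would come from the database.
print(build_query_string("Canon C100", ["Canon", "Zacuto"]))
```

For the search term `Canon C100` this produces the same string as the manually edited query, `(text:(Canon C100) OR brand_name:(Canon))`.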

But the boost that I am adding via the SearchIndex `def prepare` doesn't have ANY effect on the results. I hope I can get that working, because as you can see from the above output I have 1.0, 2.0, and 1.5 boosts on specific documents, and I need them to count.
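
One possible stopgap, while the document boost is ignored by the search engine, is to re-rank the returned results in Python by `score * boost`. This is a sketch under assumptions, not a haystack feature: `Result` is a stand-in for the search result objects, `rerank_by_boost` is a hypothetical helper, and it only reorders the results already fetched, not the whole index.

```python
from collections import namedtuple

# Stand-in for a search result; only the attributes printed in the thread.
Result = namedtuple("Result", ["boost", "score", "name"])


def rerank_by_boost(results):
    """Sort results by score * boost, highest first (no boost counts as 1.0)."""
    def weighted(r):
        return r.score * (r.boost if r.boost is not None else 1.0)
    return sorted(results, key=weighted, reverse=True)


# Boost, score, and name values copied from the output above.
results = [
    Result(1.0, 10.491137, "Canon CA-930 Battery Charger for C100/C100II/C300/C500"),
    Result(1.0, 10.025042, "Canon BP-970G Battery Pack for C100/C100II/C300/C500"),
    Result(2.0, 9.703276, "Canon C100 Mark II Digital Cinema Camera EF Mount"),
    Result(1.5, 9.525196, "Canon 100 f/2.8L IS Macro"),
    Result(1.5, 3.8322778, "Canon 17-55 f/2.8 IS"),
    Result(1.5, 3.8322778, "Canon 24-105 f/4L IS"),
]

for r in rerank_by_boost(results):
    print(r.name)
```

With these numbers the camera (boost 2.0) moves to the top, ahead of the two accessories that currently outrank it.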
--
John Obelenus @ Active Frequency

Bob Donahue

May 3, 2017, 12:15:53 PM
to django-haystack, jobe...@activefrequency.com
I'm having the same issue (ES 2.4, Django 1.9), except that no matter WHERE I try to introduce boosting, there's no effect whatsoever.

Is it simply that ElasticSearch 2.4 does not support boost? Did this ever get fixed? I'm using django-haystack, so I'm stuck at 2.X for elasticsearch...

