Haystack, Facets, Variants & AttributeValues


John P

Mar 26, 2015, 4:14:58 PM
to django...@googlegroups.com
I'm trying to solve an issue that I think is actually a really common use-case and thought I would start a discussion here.

Because a parent can have multiple children, a parent can have multiple attribute-values. Here's a concrete example:

http://www.amazon.com/Optimum-Nutrition-Standard-Double-Chocolate/dp/B000QSNYGI

As you can see, it comes in multiple flavors, as well as multiple sizes. For the purposes of discussion, let's only consider the flavors "Cake Batter" and "Double Rich Chocolate" as well as the sizes "5 Pound" and "2 Pound". Double Rich Chocolate comes in 2 and 5 pound variations, but Cake Batter only comes in the 5 pound variation. If a user facets on "Cake Batter" and on "2 Pound", we would expect the product not to appear in the search results.

Unfortunately, I cannot figure out how to implement this behavior in Haystack. I have two MultiValueFields, one for each dimension, and each field pools the values from every child. When a user drills down on the two facets, each one matches on its own, so we get a false positive.
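To make the false positive concrete, here is a minimal sketch in plain Python (no Haystack or search backend involved). The data layout and the function names are hypothetical and only model the behavior described above: two multi-valued fields pool the values of all children, while a nested structure keeps each child's values together.

```python
# Hypothetical data: one parent product with three child variants.
VARIANTS = [
    {"flavor": "Double Rich Chocolate", "size": "2 Pound"},
    {"flavor": "Double Rich Chocolate", "size": "5 Pound"},
    {"flavor": "Cake Batter", "size": "5 Pound"},
]


def flatten(variants):
    """Pool each attribute across all children, as two multi-valued fields would."""
    return {
        "flavor": {v["flavor"] for v in variants},
        "size": {v["size"] for v in variants},
    }


def matches_flat(doc, flavor, size):
    """Each facet is checked independently against the pooled values."""
    return flavor in doc["flavor"] and size in doc["size"]


def matches_nested(variants, flavor, size):
    """Both facets must hold on the same child."""
    return any(v["flavor"] == flavor and v["size"] == size for v in variants)


if __name__ == "__main__":
    flat_doc = flatten(VARIANTS)
    # The flattened document matches even though no such child exists.
    print(matches_flat(flat_doc, "Cake Batter", "2 Pound"))
    # The per-child check correctly rejects the combination.
    print(matches_nested(VARIANTS, "Cake Batter", "2 Pound"))
```

The flattened check accepts "Cake Batter" plus "2 Pound" because both values exist somewhere on the parent; the per-child check rejects it because no single child has both.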

I've done a decent amount of research on this, and I think it's a limitation of Haystack. There seem to be some solutions using nested-document structures, but they require talking directly to the search backend. (I've found concrete examples for Elasticsearch, and some loose examples for Solr.)

I wrote a custom search backend for another project I've worked on that doesn't have this limitation, but I would really like to avoid having to swap out the entire search app if possible. Ideally, I'd like to come up with a general solution that can be contributed back to Oscar.

Any thoughts?

BugSpencer

Mar 28, 2015, 9:22:34 AM
to django...@googlegroups.com
Maybe the best way to integrate this is to define another SearchHandler that calls Elasticsearch directly,
without using Haystack at all. There's a method in Oscar (get_product_search_handler_class)
that checks whether Solr is supported to decide whether ProductSearchHandler or SimpleProductSearchHandler
should be used.

I think this method could return several independent search handlers based on some configuration.
ElasticsearchProductSearchHandler could be one of them, and I'd rename ProductSearchHandler
to something like "HaystackProductSearchHandler".

Not using Haystack would make complex calls to elasticsearch possible (using elasticsearch-dsl-py,
or maybe just requests), but this will force you to manually link ES search results to a Django queryset.
You would also need to manually handle indexing.
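
As an illustration of the kind of call that becomes possible, here is a rough sketch that builds an Elasticsearch nested query body as a plain Python dict. The field names ("variants", "flavor", "size") and the helper name are assumptions made for this example, not part of Oscar or Haystack. It also assumes the index maps "variants" as a nested field whose values are indexed unanalyzed, so that term queries match exact strings.

```python
import json


def nested_variant_query(path, criteria):
    """Build a request body matching parents that have one child meeting all criteria.

    `path` is the (hypothetical) nested field name; `criteria` maps a child
    attribute name to the exact value it must have.
    """
    must = [
        {"term": {"%s.%s" % (path, field): value}}
        for field, value in sorted(criteria.items())
    ]
    return {
        "query": {
            "nested": {
                "path": path,
                # All clauses are evaluated against the same nested child.
                "query": {"bool": {"must": must}},
            }
        }
    }


body = nested_variant_query("variants", {"flavor": "Cake Batter", "size": "2 Pound"})
print(json.dumps(body, indent=2))
```

The resulting dict could be sent with any client (elasticsearch-py, or plain requests); mapping the returned ids back to a Django queryset would still have to be done by hand.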

Anyway I agree that Haystack's Elasticsearch backend seems quite "limited", and even worse ES
is evolving a lot faster than Haystack, and the gap can only grow in the near future.

My 2 cents.

John P

Mar 30, 2015, 9:03:44 AM
to django...@googlegroups.com
Thanks for the input. I was worried that it would be extremely complicated, but the approach that you described seems pretty straightforward.

John P

Apr 3, 2015, 4:54:56 PM
to django...@googlegroups.com
Thought I would check in. Many hours later and I have a working implementation.

Features:

* ProductAttributeValues, price, availability are all nested attributes
* Category, title, text index, date_created, rating are regular attributes
* Standalone products are normalized as parent-child pairs in Elasticsearch to make querying more sane
* The aforementioned false positives are completely fixed
* There is a sophisticated faceting system that uses Elasticsearch's aggregations. The counts are correct (not vulnerable to false positives)

Some implementation details:

* It's implemented as a custom search app. Not a fork or anything like that
* Uses elasticsearch-dsl. I tried to write the JSON directly, but that ended up being torturous
* Had to overwrite a few things:
    - SearchForm, BrowseSearchForm
    - SearchHandler, ProductSearchHandler
    - FacetedSearchView
    - FacetMunger
    - a few minor tweaks to templates
* Wrote a custom management command to populate the index. The DocType class was super helpful here https://elasticsearch-dsl.readthedocs.org/en/latest/persistence.html#doctype
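
For readers wondering what facet counts over nested attributes might look like, here is one plausible shape, written as a plain Python dict rather than with elasticsearch-dsl. This is a sketch under assumptions, not the implementation described above: the nested path "variants", the attribute names, and the helper name are all hypothetical. It combines Elasticsearch's nested, terms, and reverse_nested aggregations so that each bucket can report how many parent products have at least one child with that value.

```python
def nested_facet_aggs(path, fields):
    """Build an `aggs` clause that counts parent products per child attribute value.

    `path` is the (hypothetical) nested field; `fields` lists child attributes.
    """
    per_field = {}
    for field in fields:
        per_field[field] = {
            "terms": {"field": "%s.%s" % (path, field)},
            # Step back up to the parent so the sub-count is in products,
            # not in nested children.
            "aggs": {"products": {"reverse_nested": {}}},
        }
    return {path: {"nested": {"path": path}, "aggs": per_field}}


aggs = nested_facet_aggs("variants", ["flavor", "size"])
request_body = {"size": 0, "aggs": aggs}
```

Keeping the counts consistent with facets the user has already selected would need additional filter aggregations, which this sketch leaves out.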


BugSpencer

Apr 14, 2015, 6:02:09 AM
to django...@googlegroups.com
On Friday, April 3, 2015 at 10:54:56 PM UTC+2, John P wrote:
Thought I would check in. Many hours later and I have a working implementation.
...

* It's implemented as a custom search app. Not a fork or anything like that

Any chance this can be released as a standalone app to be used as a drop-in replacement for oscar.apps.search?

I managed to extend Oscar's search to get multi-facets working on Elasticsearch, but that doesn't solve your issue,
which definitely is a Haystack limitation, and I'd still like to get rid of Haystack altogether if possible.

John P

Apr 14, 2015, 9:17:21 AM
to django...@googlegroups.com
I can't release it outright for legal reasons, but I can make myself available to answer questions.


BugSpencer

Apr 14, 2015, 9:55:12 AM
to django...@googlegroups.com
On Tuesday, April 14, 2015 at 3:17:21 PM UTC+2, John P wrote:
I can't release it outright for legal reasons, but I can make myself available to answer questions.

I understand perfectly (I'm in the same boat). Thank you for your answer.

Max

Mar 6, 2016, 7:44:05 AM
to django-oscar
Hello John,


Almost a year has passed since the original message; maybe you can now share the code publicly.
I'm specifically interested in the nested prices and availability.
I'm not sure how to implement this with Solr. All my attempts to "flatten" a product for Solr indexing seem too hackish.
So I'm looking for ideas. If it is easier to make this work with Elasticsearch, then I should probably dig into that.

Regards,
Max

YusufSalahAdDin

Aug 15, 2016, 3:13:49 AM
to django-oscar
Any news on this?
It's very important, but no one seems to be helping with it.

In my case, I couldn't use Elasticsearch because faceting in django-haystack 2.5.0 is incompatible with Elasticsearch 2.3.5.