Do suggestions honor the analyzer specified for the index? Maybe not

Sean Rock

Oct 30, 2014, 2:16:38 PM
to rav...@googlegroups.com
Hi

I'm using a suggestion query (cool feature, btw); however, I have a situation where I'd expect to get a specific suggestion but don't.

For example, the word (tshirt) does not exist in any of my documents but (shirts) does and I get (shirts) as a suggestion - as I would expect.

However, (t-shirt) and (t-shirts) both exist but they are not offered as suggestions and I would expect them to be when searching for tshirt.

Also, if I query for (t-shirt) I would expect to get (t-shirts) as a suggestion, but I get (shirt), (shirts) etc.

So, does the suggestion feature not honor the analyzer specified for the index (stopanalyzer)?

Unless I use the stop analyzer, I get nothing when searching for hyphenated words.

Itamar Syn-Hershko

Oct 31, 2014, 1:21:32 AM
to rav...@googlegroups.com
No, suggestions operate on the original text, regardless of the analyzer used.

--

Itamar Syn-Hershko
http://code972.com | @synhershko
Freelance Developer & Consultant

--
You received this message because you are subscribed to the Google Groups "RavenDB - 2nd generation document database" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ravendb+u...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Sean Rock

Oct 31, 2014, 3:28:29 AM
to rav...@googlegroups.com

Oh, that's unfortunate. I'm wondering if that was just overlooked, as it makes the suggestions not very accurate.

Oren Eini (Ayende Rahien)

Oct 31, 2014, 3:31:51 AM
to ravendb
I'm pretty sure that this is by design.
You want the suggestion on the original text, because analyzers can do crazy things to it.



Hibernating Rhinos Ltd  

Oren Eini l CEO Mobile: + 972-52-548-6969

Office: +972-4-622-7811 l Fax: +972-153-4-622-7811

 

Itamar Syn-Hershko

Oct 31, 2014, 3:39:52 AM
to rav...@googlegroups.com
Well, it is by RavenDB's design. There are plenty of ways to implement suggestions and overcome various issues / shortcomings. RavenDB uses one approach (dedicated ngram indexing).

Sean Rock

Oct 31, 2014, 3:40:53 AM
to rav...@googlegroups.com
In this case, it seems odd that a search for tshirt does not return t-shirt as a suggestion, given that t-shirt exists in the collection and returns results when searched for normally.

Sean Rock

Oct 31, 2014, 6:24:59 AM
to rav...@googlegroups.com
Also, it returns suggestions that do not return search results. Surely that can't be by design?

Oren Eini (Ayende Rahien)

Oct 31, 2014, 8:57:27 AM
to ravendb
It is possible if the value _was_ in the index at some point, but was then removed.

Sean Rock

Oct 31, 2014, 1:29:54 PM
to rav...@googlegroups.com
Unlikely, as this is still on my dev machine, so no changes to the document content are happening. I'm going to play with this more, but on the face of it suggestions are not always accurate or even relevant, although they can cater for misspelled words, e.g. drees would suggest dress, which is correct.

A synonym library would probably give the desired results; for example, searching for cardigan would give jumper, hoodie, pullover as suggestions. I don't know immediately how I could implement this feature. Of course it would require manually creating the synonyms (not a problem); it's how to provide those based on the search criteria that I'll need to figure out.
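
One way the synonym idea could be prototyped, independent of RavenDB, is a hand-maintained lookup table consulted per search term. This is only a sketch: the `SYNONYMS` table and the `synonym_suggestions` function are hypothetical names made up for illustration, and the single entry comes from the cardigan example above.

```python
# Hypothetical, manually curated synonym table (not part of RavenDB).
SYNONYMS = {
    "cardigan": ["jumper", "hoodie", "pullover"],
}


def synonym_suggestions(query, synonyms=SYNONYMS):
    """Collect synonyms for each whitespace-separated term in the query.

    Order is preserved and duplicates are dropped.
    """
    found = []
    for term in query.lower().split():
        for candidate in synonyms.get(term, []):
            if candidate not in found:
                found.append(candidate)
    return found


print(synonym_suggestions("cardigan"))
```

The results could then be merged with whatever the built-in suggestion query returns, so misspellings and synonyms are both covered.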

Oren Eini (Ayende Rahien)

Oct 31, 2014, 8:09:09 PM
to ravendb
Can you provide a failing test for this?

Basically, the suggestions in RavenDB just do an ngram search.
See

The actual logic is pretty easy to follow and extend.
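
As a rough illustration of the ngram idea (a simplified sketch, not RavenDB's actual code), candidate words can be ranked by how many character ngrams they share with the search term. The names `ngrams` and `suggest`, the use of bigrams, and the Jaccard scoring are all assumptions made for this example.

```python
def ngrams(word, n=2):
    """Return the set of contiguous character n-grams in word."""
    return {word[i:i + n] for i in range(len(word) - n + 1)}


def suggest(term, vocabulary, n=2, limit=3):
    """Rank vocabulary words by Jaccard overlap of their n-gram sets.

    Illustrative scoring only; a real suggester may rank differently.
    """
    grams = ngrams(term, n)
    scored = []
    for word in vocabulary:
        if word == term:
            continue  # never suggest the term itself
        other = ngrams(word, n)
        union = grams | other
        score = len(grams & other) / len(union) if union else 0.0
        if score > 0:
            scored.append((score, word))
    # Highest score first; ties broken alphabetically.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [word for _, word in scored[:limit]]


print(suggest("drees", ["dress", "shirt", "shirts", "jumper"]))
print(suggest("tshirt", ["t", "shirt", "shirts"]))
```

In this sketch "tshirt" and "shirts" happen to contain exactly the same bigrams, so "shirts" scores highest for "tshirt", while a single-letter term such as "t" has no bigrams and is never suggested.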

Itamar Syn-Hershko

Oct 31, 2014, 8:13:39 PM
to rav...@googlegroups.com
That's easy to explain: t-shirt is analyzed as [t] [shirt], while tshirt is analyzed as [tshirt]. The suggestions are a tad more flexible, but at the end of the day a search for tshirt will return only "tshirt" and not t-shirt (because shirt != tshirt).
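
The mismatch can be reproduced with a toy tokenizer. This is a minimal sketch that assumes a letter-based analyzer which lowercases the text and splits on any non-letter character; the `tokenize` function is a hypothetical stand-in, not the analyzer RavenDB actually runs.

```python
import re


def tokenize(text):
    """Lowercase the text and split on anything that is not a letter (assumed behavior)."""
    return [token for token in re.split(r"[^a-z]+", text.lower()) if token]


indexed_terms = tokenize("t-shirt")  # the hyphen acts as a separator
query_terms = tokenize("tshirt")     # no separator, so one token

print(indexed_terms, query_terms)
# No token is shared, so an exact term match for the query finds nothing.
print(set(indexed_terms) & set(query_terms))
```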

Sean Rock

Nov 1, 2014, 6:59:36 AM
to rav...@googlegroups.com
Thanks for pointing me in the right direction, Oren. I'm going to pull that down and have a closer look.