Problems with the minus char, doesn't find anything


mix

Nov 18, 2009, 5:49:26 AM
to Thinking Sphinx
Hi, I'm having a problem using Sphinx with Thinking Sphinx.
I have a post with a title like "Test title with - should work".
When I search the posts with Post.search :conditions => {:title => "with - should"}, I get 0 results, but it should find that post.
The same problem occurs with code strings like ABC-123: searching for
ABC-123 returns 0 results. I would have to search for "ABC-123" in
quotes, but then I can't search for ABC-123 other text unless I write
"ABC-123" other text, and you can imagine that's quite hard to explain
to users. I can't add the quotes to the submitted query myself either.
I could find the minus char in the query and wrap the text around it
in quotes... but I hope there is a simpler way to solve this.

Pat Allan

Nov 18, 2009, 7:12:42 PM
to thinkin...@googlegroups.com
You probably need to escape it: \-

There's the Riddle.escape method, which escapes all of Sphinx's
special characters.
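
As a rough illustration of what that escaping does to the queries above, here is a self-contained sketch that does not load the Riddle gem. The pattern is the one quoted later in this thread; `escape_all` is a hypothetical stand-in, and the assumption that Riddle.escape simply prefixes each match with a backslash may not hold for every version of the gem.

```ruby
# Pattern as quoted later in this thread; the real gem's pattern may differ.
ESCAPE_PATTERN = /[\(\)\|\-!@~"&\/\\\^\$=]/

# Hypothetical stand-in for Riddle.escape: prefix every special
# character with a single backslash.
def escape_all(query)
  query.gsub(ESCAPE_PATTERN) { |char| "\\#{char}" }
end

puts escape_all("with - should")  # prints: with \- should
puts escape_all("ABC-123")        # prints: ABC\-123
```

With the gem loaded, the equivalent call would presumably be Post.search :conditions => {:title => Riddle.escape(query)}.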

--
Pat

mix

Nov 19, 2009, 5:30:34 AM
to Thinking Sphinx
Hi Pat, thank you for the Riddle method. What if I'd like to keep the
minus so that users can exclude results by keyword (e.g. "key -key2"),
while still letting them find things like "product-code" or
"black ray-ban"?
I think the Riddle escape method is good, but it should escape every
minus except the ones with a space before them. Is there anything
that does that?

James Healy

Nov 19, 2009, 5:49:49 AM
to thinkin...@googlegroups.com
For that kind of control I recommend writing your own escaping method.
It would be impossible for Pat to maintain an escaping method that suits
everyone's needs.

-- James Healy <ji...@deefa.com> Thu, 19 Nov 2009 21:49:30 +1100

mix

Nov 19, 2009, 8:47:14 AM
to Thinking Sphinx
Hi James, you're right, but I think this is quite a normal case. I
mean, if you use boolean search you expect a - to exclude some words,
but not when the - joins two words in a query (like ray-ban, d-link,
and many other examples). Pat, what do you think?
I've found the Riddle escape pattern ( /[\(\)\|\-!@~"&\/\\\^\$=]/ ).
Do you have any suggestion on how to edit it so that it doesn't escape
the - when there is a space before it?

Pat Allan

Nov 19, 2009, 8:55:41 AM
to thinkin...@googlegroups.com
Isn't it that you *do* want to escape the - if there's a leading space?

Try the following:
/([\(\)\|!@~"&\/\\\^\$=])|(\s\-)/
http://rubular.com/regexes/11796

I'm not convinced it's a super common pattern - but it's easy enough
to change Riddle's escape pattern manually, so this isn't hard to slot
in, for those who wish to.

Riddle.escape_pattern = /custom-pattern/

--
Pat

Pat Allan

Nov 19, 2009, 9:00:22 AM
to thinkin...@googlegroups.com
Although having said that, it will escape the space, not the dash. And
I do have it around the wrong way... that's what I get for emailing at
1AM.

--
Pat

mix

Nov 22, 2009, 10:36:10 AM
to Thinking Sphinx
Hi Pat, it's not very clear to me how Sphinx is working (I'm inclined
to think it isn't, in this case). I have a post titled
"test ATT-32 test", and I get:
Post.search :conditions => {:title => "ATT"} -> 1 result
Post.search :conditions => {:title => "ATT-"} -> 0 results (not good...)
Post.search :conditions => {:title => "ATT-3"} -> 1 result
Post.search :conditions => {:title => "ATT-2"} -> 1 result (what?? nothing contains that)
Post.search :conditions => {:title => "ATT\\-"} -> 1 result
Post.search :conditions => {:title => "ATT\\-3"} -> 0 results (ehm... it misses something)
Post.search :conditions => {:title => "ATT\\-2"} -> 0 results

Apart from that, about the Riddle escape method: I'd like it to behave
like this:

Riddle.escape("ATT-32") ---> "ATT\\-32"
As I understand it, the \\ escapes the - so that the boolean logic
isn't applied (i.e. the 32 isn't excluded as -32).

Your regex doesn't work because it matches the space and would escape
" -" instead of the - that has no space before it (if I've understood
the purpose of escaping correctly, as described above).
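
The behaviour asked for here (escape a - inside a word, leave a - that follows a space as the exclusion operator) could be sketched with a lookbehind, which needs Ruby 1.9 or later. `escape_query` and `SELECTIVE_ESCAPE` are hypothetical names, not part of Riddle or Thinking Sphinx. As an extra assumption, the sketch also escapes a - that is followed by a space or the end of the query, so that "with - should" and "ATT-" are treated as literal text.

```ruby
# Hypothetical helper, not part of Riddle or Thinking Sphinx.
# Escapes:
#   * the special characters from the Riddle pattern quoted earlier in
#     the thread, except the hyphen
#   * a hyphen directly preceded by a non-space character ("ray-ban")
#   * a hyphen not followed by a non-space character ("with - should")
# A hyphen that follows a space (or starts the query) and is attached
# to the next word ("key -key2") is left alone.
SELECTIVE_ESCAPE = /[\(\)\|!@~"&\/\\\^\$=]|(?<=\S)-|-(?!\S)/

def escape_query(query)
  query.gsub(SELECTIVE_ESCAPE) { |char| "\\#{char}" }
end

puts escape_query("key -key2")      # prints: key -key2
puts escape_query("black ray-ban")  # prints: black ray\-ban
puts escape_query("ATT-32")         # prints: ATT\-32
puts escape_query("with - should")  # prints: with \- should
```

Whether a pattern like this can simply be assigned to Riddle.escape_pattern is untested here; calling a helper of your own before Post.search avoids relying on that.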

Pat Allan

Nov 25, 2009, 3:37:23 AM
to thinkin...@googlegroups.com
Yeah, I realise the regex is faulty - there's going to have to be some
changes to how Riddle's escape method works if we're going to do those
kinds of custom regular expressions...

As for your search results, do you have - in your charset_table
values? Because I don't think Sphinx will index hyphens by default,
and that's going to mean your search results aren't going to be what
you expect.

Have a read of the following links:
http://freelancing-god.github.com/ts/en/advanced_config.html
(character sets section is second from the bottom)
http://sphinxsearch.com/docs/manual-0.9.8.html#conf-charset-table
http://yob.id.au/2008/05/08/thinking-sphinx-and-unicode.html
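
To make the charset_table point concrete, here is a rough Ruby simulation, not Sphinx itself. It assumes that any character missing from charset_table acts as a word separator, and it approximates the default table as a-z, 0-9 and underscore with case folding. `tokens` is a hypothetical helper.

```ruby
# Rough simulation only: splits text the way an indexer would if every
# character outside the allowed set were a separator.
# extra_chars stands in for characters added to charset_table.
def tokens(text, extra_chars = "")
  allowed = "a-z0-9_" + Regexp.escape(extra_chars)
  text.downcase.scan(/[#{allowed}]+/)
end

p tokens("test ATT-32 test")       # prints: ["test", "att", "32", "test"]
p tokens("test ATT-32 test", "-")  # prints: ["test", "att-32", "test"]
```

Under those assumptions the title is indexed as the separate words att and 32 unless the hyphen is added to charset_table, so an escaped query for ATT\-3 has no word 3 to match.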

Hope this helps

--
Pat