How to extract unmatched keywords with Thinking Sphinx


senthil kumar

May 4, 2012, 7:32:14 AM
to Thinking Sphinx
I'm using Thinking Sphinx in my Rails 3 app.

I have set up Sphinx indexes on two tables in my product catalog app,
namely "titles" and "features".

For every search key, I check the titles table first, and only if no
results are found do I go to the features table. If a title match is
found, I stop and do not search the features table.

Example: the search keyword is "heat and dust hachette book publishing".

Here, "heat and dust" has an exact match in titles, so I do not go to
the features table.

Question:

How do I find the unmatched keys for a search in Sphinx?

I want to find the unmatched words in the search key, i.e.
"hachette book publishing" in "heat and dust hachette book
publishing".

Once I have them, I will check whether @unmatched_title_keys is nil. If
it is not, I will go to the features table and run a feature search for
@unmatched_title_keys.

Please advise whether this approach to searching two tables is sound or
whether I should change it, and tell me how to retrieve the unmatched
keys from a Sphinx search query.
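
For readers following along, the fallback flow described in this post could be sketched roughly as below. This is only an illustration under assumptions: search_titles_then_features is a hypothetical helper, and title_search and feature_search are stand-in callables that return a hash with :matches and :words keys. In a real app they would wrap the actual Thinking Sphinx calls.

```ruby
# Hypothetical helper, not part of Thinking Sphinx. Both search arguments
# are callables that take a query string and return a hash shaped like
# { :matches => [...], :words => { "term" => { :hits => n } } }.
def search_titles_then_features(query, title_search, feature_search)
  title_results = title_search.call(query)

  # Terms that had zero hits in the title index.
  unmatched = title_results[:words].select { |_term, stats| stats[:hits] == 0 }.keys

  feature_results =
    if title_results[:matches].empty?
      # No title match at all: try the whole query against features.
      feature_search.call(query)
    elsif unmatched.any?
      # Title matched, but some words were left over: search only those.
      feature_search.call(unmatched.join(" "))
    end

  { :titles => title_results, :features => feature_results }
end

# Stand-in searches with made-up data, for illustration only.
fake_title_search = lambda do |_query|
  { :matches => [{ :doc => 1 }],
    :words   => { "heat"     => { :hits => 2 },
                  "hachette" => { :hits => 0 },
                  "book"     => { :hits => 0 } } }
end
fake_feature_search = lambda { |query| { :query => query } }

outcome = search_titles_then_features("heat hachette book",
                                      fake_title_search, fake_feature_search)
puts outcome[:features][:query]
```

When every word matches in titles, :features stays nil, which corresponds to the "stop at titles" case in the post.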

senthil kumar

May 7, 2012, 2:54:05 AM
to Thinking Sphinx
No reply yet????

Pat Allan

May 7, 2012, 9:21:39 PM
to thinkin...@googlegroups.com
Hi Senthil

Have a look at the raw Sphinx results for your query; they may provide the information you're after. You can access them like so:

Model.search('query').results

That hash includes, for each search term, how many documents it matched and how many hits it had.

--
Pat




senthil kumar

May 8, 2012, 10:02:10 AM
to Thinking Sphinx
Hi Pat,

Thanks for your reply. This is what I have done.

word_count, which is generated by a MySQL function, is the number of
words in a document excluding stopwords.

with_filter = "*, IF( @weight >= word_count,1,0) AS filter"

sphinx_result = ProductsFilterCollections.search(
  '"heat and dust hachette book publishing"/1',
  :match_mode    => :extended,
  :rank_mode     => :wordcount,
  :sphinx_select => with_filter,
  :with          => { 'filter' => 1 }
)

The value of sphinx_result.results is:

{
  :matches => [
    { :doc => 139, :weight => 2, :index => 0,
      :attributes => { "sphinx_internal_id" => 46, "sphinx_deleted" => 0,
                       "class_crc" => 2200458759,
                       "sphinx_internal_class" => "ProductsFilterCollections",
                       "sub_category_id" => 1, "word_count" => 2, "filter" => 1 } },
    { :doc => 232, :weight => 2, :index => 1,
      :attributes => { "sphinx_internal_id" => 77, "sphinx_deleted" => 0,
                       "class_crc" => 2200458759,
                       "sphinx_internal_class" => "ProductsFilterCollections",
                       "sub_category_id" => 1, "word_count" => 2, "filter" => 1 } }
  ],
  :fields => ["filter_key"],
  :attributes => { "sphinx_internal_id" => 1, "sphinx_deleted" => 1,
                   "class_crc" => 1, "sphinx_internal_class" => 7,
                   "sub_category_id" => 1, "word_count" => 1, "filter" => 1 },
  :attribute_names => ["sphinx_internal_id", "sphinx_deleted", "class_crc",
                       "sphinx_internal_class", "sub_category_id",
                       "word_count", "filter"],
  :words => { "heat"     => { :docs => 2, :hits => 2 },
              "dust"     => { :docs => 2, :hits => 2 },
              "hachette" => { :docs => 0, :hits => 0 } },
  :status => 0, :total => 2, :total_found => 2, :time => 0.001
}

@unmatched_keywords = []

# Collect every query term for which Sphinx reports zero hits.
sphinx_result.results[:words].each do |word, stats|
  @unmatched_keywords << word if stats[:hits] == 0
end

Now the value of @unmatched_keywords is ["hachette", "book",
"publishing"].
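
The same filtering step can be tried without a running Sphinx instance. The sketch below assumes only the :words shape shown in the output above; unmatched_keywords is a hypothetical helper name, not a Thinking Sphinx method, and the sample hash is copied from that output.

```ruby
# Hypothetical helper (not part of Thinking Sphinx): returns the query
# terms that have zero hits in a raw results hash.
def unmatched_keywords(results)
  words = results.fetch(:words, {})
  words.select { |_term, stats| stats[:hits].zero? }.keys
end

# Sample data copied from the :words section of the output above.
sample_results = {
  :words => {
    "heat"     => { :docs => 2, :hits => 2 },
    "dust"     => { :docs => 2, :hits => 2 },
    "hachette" => { :docs => 0, :hits => 0 }
  }
}

puts unmatched_keywords(sample_results).inspect
```

Using fetch with an empty-hash default means a results hash without a :words key yields an empty list instead of raising.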


I get what I want.

Kindly validate.

Thanks,
Senthil.






Pat Allan

May 8, 2012, 12:00:19 PM
to thinkin...@googlegroups.com
That's what I was thinking too. Good to see you've got it figured out.

Cheers

--
Pat

senthil kumar

May 8, 2012, 1:05:22 PM
to Thinking Sphinx
Thanks man. Good to get approved by you.