Sunspot 1.2.rc4 and distance

Will Bridges

Oct 12, 2010, 3:29:52 PM
to Sunspot, m...@patch.com, ni...@onemorecloud.com
I've read this: http://outoftime.github.com/sunspot/docs/classes/Sunspot/DSL/RestrictionWithNear.html#M000116

Still, I'm having an extremely difficult time figuring out how to
translate this into the right code.

I need all results sorted by their proximity to a certain lat/lng...

I was using near but I can't figure out how to get a large enough
bounding box. If I'm in Portland I need to be able to return the
nearest results even if the nearest results are in New York. Can
someone please help me with this? Here's the code I'm using with near.

  @search = Sighting.search do
    paginate(:page => page, :per_page => rpp)
    keywords query if query
    with(:coordinates).near(lat, lng, :boost => 2, :precision => 3, :precision_factor => 200) if lat && lng
    with(:published).equal_to(true)
    order_by(:created_at, :desc) if !lat && !lng
  end

I've tried increasing the precision factor as much as I can, but that
doesn't seem to do anything, so I'm assuming I don't understand how
it works.

Will Bridges

Oct 12, 2010, 4:16:24 PM
to Sunspot
Okay, I just gained a better understanding of how this works by
looking at the code and the docs again... I'm not sure if this will
work at all for what I'm doing. 9q5ctr18dkw15 is Los Angeles... and
c20fbrmcm4qjb is Portland... so the two geohashes share no prefix and
wouldn't match no matter what I do... If anybody has a better solution,
that would be helpful.
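
To make the mismatch concrete, here is a minimal Ruby sketch that counts how many leading characters two geohashes share. The helper name shared_prefix_length is made up for illustration and is not part of Sunspot. The two geohash strings are the ones quoted above.

```ruby
# Count how many leading characters two geohash strings have in common.
# (Hypothetical helper for illustration; not a Sunspot method.)
def shared_prefix_length(a, b)
  a.chars.zip(b.chars).take_while { |x, y| x == y }.length
end

los_angeles = "9q5ctr18dkw15"
portland    = "c20fbrmcm4qjb"

# The first characters already differ, so there is no common prefix.
puts shared_prefix_length(los_angeles, portland)

# Two made-up hashes that share their first three characters.
puts shared_prefix_length("9q5ctr", "9q5xyz")
```

Because prefix matching needs at least one shared leading character, no choice of precision makes these two points land in the same geohash box.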

Mat Brown

Oct 12, 2010, 4:20:47 PM
to ruby-s...@googlegroups.com
Hi Will,

I'm afraid you're probably correct. The Geohash approach is very
efficient and plays nicely with fulltext search, but the disadvantage
is it's imprecise, and ineffective at very large distances. The
tradeoff is worth it in the majority of cases that I can think of, but
it definitely isn't worth it in yours. You might check out some of the
new spatial search functionality in the as-yet-unreleased Solr 1.5 and
see if there's anything there you can make use of -- but fair warning,
calculating spherical distances for points all over the continent is
going to be very, very slow, pretty much no matter how you slice it,
unless your data are very, very sparse.

Mat

Nick Zadrozny

Oct 12, 2010, 4:16:51 PM
to Sunspot
Hi Will,

Would it be accurate to say that you're seeing no results outside of 389 miles? If that is the case, perhaps the default minimum match of "100%" — which treats all clauses as mandatory — is affecting you here.

Try setting "minimum_match 0" to make all clauses optional. Or "minimum_match -1" to make all but one clause mandatory. I published an article on the websolr blog about minimum match with Sunspot just this morning if you want to get into more of the nitty-gritty there.

It's worth noting that results outside of 389 miles (geohash precision of 3) won't have any distance sorting applied. To do distance sorting at that level of precision (or to guarantee exact sorting precision in general), I'd recommend sorting results in your app with something like this:

  @results.sort_by! { |result| result.distance_to(lat, lng) }

That distance_to is left as an exercise to the reader :)
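
One possible way to fill in that exercise is the haversine great-circle formula. The sketch below is self-contained and makes some assumptions: SightingStub is a hypothetical stand-in for the real Sighting model, the lat and lng readers are assumed attribute names, and the Earth radius is an approximate mean value in miles.

```ruby
EARTH_RADIUS_MILES = 3958.8 # approximate mean Earth radius

# Great-circle distance in miles between two lat/lng points (haversine formula).
def haversine_miles(lat1, lng1, lat2, lng2)
  to_rad = Math::PI / 180.0
  dlat = (lat2 - lat1) * to_rad
  dlng = (lng2 - lng1) * to_rad
  a = Math.sin(dlat / 2)**2 +
      Math.cos(lat1 * to_rad) * Math.cos(lat2 * to_rad) * Math.sin(dlng / 2)**2
  2 * EARTH_RADIUS_MILES * Math.asin(Math.sqrt(a))
end

# Hypothetical stand-in for a search result; a real model would expose
# its own coordinate readers.
SightingStub = Struct.new(:name, :lat, :lng) do
  def distance_to(other_lat, other_lng)
    haversine_miles(lat, lng, other_lat, other_lng)
  end
end

results = [
  SightingStub.new("New York",    40.7128,  -74.0060),
  SightingStub.new("Los Angeles", 34.0522, -118.2437),
  SightingStub.new("Seattle",     47.6062, -122.3321)
]

# Sort by distance from Portland, nearest first.
portland_lat, portland_lng = 45.5152, -122.6784
sorted = results.sort_by { |r| r.distance_to(portland_lat, portland_lng) }
puts sorted.map(&:name).inspect
```

Sorting this way happens in application memory, so it only orders the records that the search actually returned.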

This is what I'm gathering from rereading lib/sunspot/query/geo.rb among a few others. Mat, or others, feel free to correct me if I'm off on something.

Also, regarding precision factor: Precision factor is the factor by which the boost is decreased for less-precise geohash quadrants. A precision_factor of 200 makes each less-precise bounding box 1/200 as relevant as the previous. In your example, this doesn't really apply, since you are also using a precision of 3 which is the lowest level of precision. So it gets assigned a boost of 2.
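
The decay described above can be illustrated with a little arithmetic. This is only a sketch of the "each less-precise box is 1/precision_factor as relevant" rule. The function name, and the assumption that the most precise level gets the full base boost, are illustrative. lib/sunspot/query/geo.rb remains the authority on what Sunspot actually does.

```ruby
# Illustrative only: compute a boost per geohash precision level, dividing by
# precision_factor for each step toward a less precise (larger) bounding box.
# Assumes the most precise level receives the full base boost.
def boosts_by_precision(base_boost, precision_factor, most_precise, least_precise)
  most_precise.downto(least_precise).each_with_object({}) do |level, boosts|
    steps = most_precise - level
    boosts[level] = base_boost / (precision_factor.to_f**steps)
  end
end

# Hypothetical levels 5 down to 3, with the boost and factor from the example.
boosts = boosts_by_precision(2, 200, 5, 3)
boosts.each { |level, boost| puts "precision #{level}: boost #{boost}" }
```

With a factor as large as 200, any box beyond the first contributes almost nothing to relevance, which is why raising the factor does not widen the search area.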

--
Nick Zadrozny

Will Bridges

Oct 12, 2010, 4:45:39 PM
to Sunspot
Thanks guys!

The problem is I'll have to drop pagination to get all results and
then sort them.

Geokit will allow sort_by_distance on an array of ActiveRecord
objects... but it's not very speedy...
http://geokit.rubyforge.org/readme.html

On Oct 12, 3:16 pm, Nick Zadrozny <n...@onemorecloud.com> wrote:
> Hi Will,
>
> Would it be accurate to say that you're seeing no results outside of 389
> miles? If that is the case, perhaps the default minimum match of "100%" —
> which treats all clauses as mandatory — is affecting you here.
>
> Try setting "minimum_match 0" to make all clauses optional. Or
> "minimum_match -1" to make all but one clause mandatory. I published an article
> on the websolr blog <http://blog.websolr.com/post/1299174416/how-do-i-query-with-boolean-l...> about

Will Bridges

Oct 13, 2010, 12:29:16 AM
to Sunspot
FYI, I ended up creating a dual solution. If I can't find anything in
a 389-mile bounding box with Solr, I fall back to an optimized Geokit
query to get the closest records. It's not ideal, but it'll work until
we have a better solution.
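
The shape of that fallback can be sketched as follows. The actual code was not posted, so everything here is hypothetical: the method name and the two lambdas merely stand in for the Solr search and the Geokit query.

```ruby
# Try the primary (bounded) search first; run the fallback only when the
# primary returns nothing. Both arguments are callables returning arrays.
def nearest_with_fallback(primary, fallback)
  results = primary.call
  results.empty? ? fallback.call : results
end

# Stand-ins for the two real queries (hypothetical data).
solr_search   = -> { [] }                        # nothing inside the bounding box
geokit_lookup = -> { ["sighting in New York"] }  # database finds a distant match

p nearest_with_fallback(solr_search, geokit_lookup)
```

Passing the queries as lambdas keeps the slower database lookup from running at all when the Solr search already has results.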
