About Redis prefix searching (why geohashing)?

1,736 views
Skip to first unread message

Xiangrong Fang

unread,
Sep 22, 2011, 8:51:24 PM9/22/11
to redis-db
Recently,  geohashing seems hot on the redis list.  I take a look at its wikipedia page, it is really cool, but as regard to redis, is it any better comparing to just use latitude longitude pairs? just simply use 2 zsets you can do proximity search anyway.  Is it faster to use geohashed string than 2 zsets?  I can only imagine that it probably use less memory...

The key question is, how to do geohash matching in redis? I think the question is equivalent to string prefix searching, which is similar to "keys *" and slow, or just use some trick as described in antirez's post about auto complete feature, which I feel is not more efficient than 2 zsets?

Shannon

Josiah Carlson

unread,
Sep 22, 2011, 10:34:42 PM9/22/11
to redi...@googlegroups.com
On Fri, Sep 23, 2011 at 2:51 AM, Xiangrong Fang <xrf...@gmail.com> wrote:
> Recently,  geohashing seems hot on the redis list.  I take a look at its
> wikipedia page, it is really cool, but as regard to redis, is it any better
> comparing to just use latitude longitude pairs? just simply use 2 zsets you
> can do proximity search anyway.  Is it faster to use geohashed string than 2
> zsets?  I can only imagine that it probably use less memory...

To use 2 zsets, you first need to copy the zsets, then you need to
trim out the parts of the zsets that are outside the box, then you
need to intersect. This can be slow, depending on the size of the
zsets. It can be made faster by pre-grouping them into (for example) 1
degree by 1 degree blocks, with a potential union of blocks when you
are near an edge. Overall, however, you are still looking at 3-5
unions/intersections + 4 zremrangebyscore to get what you want. It can
all be done in pipelines, so it's not so bad.

> The key question is, how to do geohash matching in redis? I think the
> question is equivalent to string prefix searching, which is similar to "keys
> *" and slow, or just use some trick as described in antirez's post about
> auto complete feature, which I feel is not more efficient than 2 zsets?

This is where you weren't getting it ;) Geohashing just creates a
prefix of bits by which you can request ranges based on those
prefixes. In Redis, you don't actually use the "hash", you convert the
hash into a number (using the first 32 bits gives you better than 1km
by 1km resolution, 52 bits, which is the limit in Redis thanks to the
prevision of IEE 754 fp doubles gets you better than .01 by .01
nanometer resolution), then use that number as a score in a zset. To
get the objects in a geohash box, you just perform a sequence of
zrangebyscore operations.

No memory churn, no breaking into blocks for better performance, etc.
Just some range scans. Very fast, and very hard to beat. Even better,
if you are expanding the box in which you are searching (start with
.001 degree, or roughly 100m box), because you didn't get enough
items, you don't need to re-intersect, re-union, etc. It's just
another couple zrangebyscore calls.

Regards,
- Josiah

> You received this message because you are subscribed to the Google Groups
> "Redis DB" group.
> To post to this group, send email to redi...@googlegroups.com.
> To unsubscribe from this group, send email to
> redis-db+u...@googlegroups.com.
> For more options, visit this group at
> http://groups.google.com/group/redis-db?hl=en.
>

Josiah Carlson

unread,
Sep 22, 2011, 10:36:36 PM9/22/11
to redi...@googlegroups.com
Serves me right for not proof-reading on this tiny keyboard.

"the prevision of IEE 754 fp doubles"
should read...
"the limitation of IEEE 754 fp doubles"

Regards,
- Josiah

Xiangrong Fang

unread,
Sep 22, 2011, 10:40:20 PM9/22/11
to redi...@googlegroups.com
Thanks Josiah,  the key point here is indeed to use number but not string prefix search :)

2011/9/23 Josiah Carlson <josiah....@gmail.com>

--
Reply all
Reply to author
Forward
0 new messages