Postal code search/autocomplete with Redis

382 views
Skip to first unread message

eatpizaa

unread,
Sep 3, 2014, 3:34:58 PM9/3/14
to redi...@googlegroups.com
I need to create a predictive postal code text input (zip code for us guys).
So if I want to type "10469 - New York" I start typing the numbers 1,0,4.. and then I get postal codes that start with 10, 104, etc. So the options keep appearing as I type until Bronx comes in.

To get that done I followed this excellent post on the topic: http://oldblog.antirez.com/post/autocomplete-with-redis.html

I have a huge set of files with the postal codes for different countries and I have created Redis keys for each country. Since I know how many digits the postal codes have I create the sorted sets reading the number of digits the postal code for each country are expected to have.

The thing is working fine. However I would like to be able to also enter N,e,w, ,y... so I could also predict based on the city name.

Any good hints how to improve the code achieve this?
An issue might be the size of DB to fit the sorted lists required to have this. For example a country like Spain can have more than 40k lines of postal codes.

This piece of code is the one that takes care of reading a line from a file and creating entries to the DB.

r = Redis.new
 
# Create the completion sorted set
if !r.exists(:compl)
 puts
"Loading entries in the Redis DB\n"
 
File.new('postal_codes.txt').each_line{|n|
 n
.strip!
 
(1..(n.length)).each{|l|
 prefix
= n[0...l]
 r
.zadd(:compl,0,prefix)
 
}
 r
.zadd(:compl,0,n+"*")
 
}end

This is the function that makes the actual search

def complete(r,prefix,count)
 results
= []
 rangelen
= 50 # This is not random, try to get replies < MTU size
 start
= r.zrank(:compl,prefix)
 
return [] if !start
 
while results.length != count
 range
= r.zrange(:compl,start,start+rangelen-1)
 start
+= rangelen
 
break if !range or range.length == 0
 range
.each {|entry|
 minlen
= [entry.length,prefix.length].min
 
if entry[0...minlen] != prefix[0...minlen]
 count
= results.count
 
break
 
end
 
if entry[-1..-1] == "*" and results.length != count
 results
<< entry[0...-1]
 
end
 
}
 
end
 
return results
end
 
complete
(r,"marcell",50).each{|res|
 puts res
}


The biggest change would be to search not just from the beginning, but "rcell" should be able to provide a set of results.

Any good advices or hints?

Jan-Erik Rediger

unread,
Sep 3, 2014, 4:59:39 PM9/3/14
to redi...@googlegroups.com
While the post is still fine for some uses cases a lot changed since
back in 2010. We now have ZRANGEBYLEX[1] for an even easier
implementation of a autocomplete search.
Have a look at http://autocomplete.redis.io/ as well as the pretty
simple code: https://gist.github.com/antirez/11126283
Basically all you need to do is adding your items with all the same
score to a sorted set plus some additional server-side code to have map "New York"
to the right postal code too.


[1] http://redis.io/commands/zrangebylex
> --
> You received this message because you are subscribed to the Google Groups "Redis DB" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to redis-db+u...@googlegroups.com.
> To post to this group, send email to redi...@googlegroups.com.
> Visit this group at http://groups.google.com/group/redis-db.
> For more options, visit https://groups.google.com/d/optout.

eatpizaa

unread,
Sep 4, 2014, 2:47:46 AM9/4/14
to redi...@googlegroups.com, jan...@fnordig.de
Nice. I will experiment with ZRANGEBYLEX.

Thanks for the information!

eatpizaa

unread,
Sep 22, 2014, 5:39:25 AM9/22/14
to redi...@googlegroups.com, jan...@fnordig.de
Hi, I finally started checking this out.

I've been playing with the redis console and I am able to retrieve items from a set.
This post was useful also: http://stackoverflow.com/questions/24418807/how-to-imitate-autocomplete-search-with-redis-zrangebylex

But I can only get elements from the set when giving the first character.

zadd autocomplete 0 one 0 two 0 three 0 four 0 five 0 six 0 seven 0 eight 0 nine 0 ten 0 eleven 0 twelve 0 thirteen 0 fourteen 0 fifteen

zrangebylex autocomplete [four "[four\xff"
1) "four"
2) "fourteen"

zrangebylex autocomplete [our "[our\xff"
(empty list or set)

Is there a way to have this last command to return the same set as the one that looks for four? plus any other entries that would contain "our".

I'm not sure what you mean with "...additional server-side code to have map "New York" to the right postal code too"
I'd like to have a single entry like "10469 - New York" and we able to search from inside the string, not just the first character.

Thanks.







On Wednesday, September 3, 2014 11:59:39 PM UTC+3, janerik wrote:

Jan-Erik Rediger

unread,
Sep 22, 2014, 5:49:46 AM9/22/14
to eatpizaa, redi...@googlegroups.com
No, arbitrary string match is not possible with ZRANGEBYLEX, only
prefix-search is possible.
What you want _could_ be achieved with SSCAN, but it will be slow in
Redis.
> > an email to redis-db+u...@googlegroups.com <javascript:>.
> > > To post to this group, send email to redi...@googlegroups.com
> > <javascript:>.

Jesús Gabriel y Galán

unread,
Sep 22, 2014, 5:52:32 AM9/22/14
to redi...@googlegroups.com
On Mon, Sep 22, 2014 at 11:39 AM, eatpizaa <jose...@googlemail.com> wrote:
> Hi, I finally started checking this out.
>
> I've been playing with the redis console and I am able to retrieve items
> from a set.
> This post was useful also:
> http://stackoverflow.com/questions/24418807/how-to-imitate-autocomplete-search-with-redis-zrangebylex
>
> But I can only get elements from the set when giving the first character.
>
> zadd autocomplete 0 one 0 two 0 three 0 four 0 five 0 six 0 seven 0 eight 0
> nine 0 ten 0 eleven 0 twelve 0 thirteen 0 fourteen 0 fifteen
>
> zrangebylex autocomplete [four "[four\xff"
> 1) "four"
> 2) "fourteen"
>
> zrangebylex autocomplete [our "[our\xff"
> (empty list or set)
>
> Is there a way to have this last command to return the same set as the one
> that looks for four? plus any other entries that would contain "our".
>
> I'm not sure what you mean with "...additional server-side code to have map
> "New York" to the right postal code too"
> I'd like to have a single entry like "10469 - New York" and we able to
> search from inside the string, not just the first character.

One straightforward idea would be to add all possible suffixes to the
zset, with an additional syntax to have the original value. Something
like:

127.0.0.1:6379> zadd autocomplete 0 one:one 0 ne:one 0 e:one 0 two:two
0 wo:two 0 o:two 0 three:three 0 hree:three 0 ree:three 0 ee:three 0
e:three 0 four:four 0 our:four 0 ur:four 0 r:four 0 fourteen:fourteen
0 ourteen:fourteen 0 urteen:fourteen 0 rteen:fourteen 0 teen:fourteen
0 een:fourteen 0 en:fourteen 0 n:fourteen
(integer) 23
127.0.0.1:6379> zrangebylex autocomplete [our "[our\xff"
1) "our:four"
2) "ourteen:fourteen"


Then in the client you can split by the ":" and get the second part.
Maybe this is too heavy for your use case (the zset might grow too
much), and there are better ways, but it might give you some ideas.

Jesus.

Itamar Haber

unread,
Sep 22, 2014, 5:57:21 AM9/22/14
to redi...@googlegroups.com

To search for stuff that ends with 'our', use the same auto complete pattern but with the reversed string:

ZADD revac 0 eno 0 owt 0 eerht 0 ruof ...
ZRANGEBYLEX revac [ruo "[ruo\xff"

More at: https://redislabs.com/blog/how-to-use-redis-at-least-x1000-more-efficiently

eatpizaa

unread,
Sep 22, 2014, 8:52:23 AM9/22/14
to redi...@googlegroups.com
Thanks, I'll keep it in mind, however it would present the same limitation as it is either from the beginning or from the end.

eatpizaa

unread,
Sep 22, 2014, 8:56:19 AM9/22/14
to redi...@googlegroups.com
Yeah I have something similar in mind, like having two sets, one with the postal codes and another with the city names, and then after the : they both would have the same postal code - city combination. But to avoid using too much memory I would create the set for the city name with the first 5 or 6 letters.

Thanks for the comment.

eatpizaa

unread,
Sep 22, 2014, 8:58:52 AM9/22/14
to redi...@googlegroups.com, jose...@googlemail.com, jan...@fnordig.de
Ok thanks. I need to keep playing but I think using ZRANGEBYLEX is already a much lighter set than the original I had, so that is already and improvement.
Reply all
Reply to author
Forward
0 new messages