Contact book filtering and sorting in Redis


alok kumar

Nov 16, 2016, 4:17:13 PM
to Redis DB
Hi,

I am implementing a contact book containing name, address, location and phone number, exactly as described in the book 'Redis in Action' by Josiah Carlson. I need to implement searching on the fields, but the book doesn't explain this recipe and it's not straightforward. I want to be able to filter and sort on all fields, but since there can be duplicate names I have not been able to determine the right data structure to store the fields.

Any suggestions would be highly appreciated.

Thanks,

Evans A.

Nov 17, 2016, 2:58:31 AM
to redi...@googlegroups.com
Hi Alok,

You can simply index the fields with a sorted set.
For a name like 'alok' with ID=4, you store the values a:4, al:4, alo:4 and alok:4 (prefix searches are assumed, but the idea also applies to infix searches).
With this method, given a search term 'al', the matching IDs come back in the same order as in the other field indexes; this makes it possible to use a merge-join across the search fields.
Your goal is to avoid sorting at query time; build the order into the index.
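
A minimal sketch of the prefix scheme described above, using plain Python rather than Redis itself. A sorted list stands in for a sorted set whose members all share one score (so they sort lexicographically), and `bisect` plays the role of a ZRANGEBYLEX-style range lookup. The names `prefix_members`, `build_index` and `search_prefix` are hypothetical, and zero-padding the IDs is an assumption made here so that string order matches numeric order.

```python
import bisect

ID_WIDTH = 6  # assumption: zero-pad IDs so "000004" sorts before "000010"


def prefix_members(name, record_id):
    """Return every 'prefix:id' member for one name (a:.., al:.., alo:.., ...)."""
    padded = str(record_id).zfill(ID_WIDTH)
    name = name.lower()
    return [f"{name[:end]}:{padded}" for end in range(1, len(name) + 1)]


def build_index(records):
    """records: iterable of (record_id, name). Returns a sorted list of members."""
    members = set()
    for record_id, name in records:
        members.update(prefix_members(name, record_id))
    return sorted(members)


def search_prefix(index, term):
    """Return the IDs stored under 'term:' in ascending ID order."""
    wanted = f"{term.lower()}:"
    start = bisect.bisect_left(index, wanted)
    ids = []
    for member in index[start:]:
        if not member.startswith(wanted):
            break  # left the contiguous run of matching members
        ids.append(int(member[len(wanted):]))
    return ids


# Duplicate names are not a problem: the ID is part of the member.
index = build_index([(4, "alok"), (5, "alan"), (10, "alok"), (7, "bob")])
al_matches = search_prefix(index, "al")
alok_matches = search_prefix(index, "alok")
```

Because every field index would return IDs in the same ascending order, the results of several such lookups can be intersected by walking them side by side instead of sorting them first.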

Here's an implementation of such a search: https://github.com/scorpevans/redislayer/blob/master/nodejs/example2.js
(I used redislayer, which I introduced to this mailing list on Tue, Nov 15 at 8:07 AM.)


Evans









zenmaster

Nov 17, 2016, 2:13:09 PM
to redi...@googlegroups.com
Thanks a lot, Evans!

I appreciate your help. I came across an article that indexes the strings in the same way you described, and it also keeps a hash that maps to the actual JSON record: http://patshaughnessy.net/2011/11/29/two-ways-of-using-redis-to-build-a-nosql-autocomplete-search-index

Regards,

Alok

Evans A.

Nov 27, 2016, 4:00:41 PM
to redi...@googlegroups.com
Hi Alok,

With respect to implementation, I think zinterstore is too resource-intensive for your use case.
It is also worth considering whether you want to put the prefix in the key.

For an alternative to zinterstore that also lets you page through your results, see my implementation.


Evans




zenmaster

Nov 30, 2016, 10:41:40 AM
to redi...@googlegroups.com
Hi Evans,

I did some scale tests but didn't find any bottlenecks in using zinterstore. For my particular problem I am using a zset to index first_name, last_name and business name. But I also need to find out whether that record exists in a list of states and cities, and for that I have to use zinterstore; I don't see any other option. For example:

A user types a 3-letter keyword and selects a state like 'Illinois' and a city like 'Chicago' from a dropdown. I also need a profession_code (e.g. doctor, lawyer) associated with that record. No single data structure in Redis gives me direct access to all fields using this input.

So, what I am doing right now is creating a UUID for each record and storing first_name, last_name and business name in a zset that indexes every word. Then I create a set <State:City> containing <UUID>, and likewise a set <Prof_code> containing <UUID>. Using zrange I get the UUIDs for the searched name, and then using zinterstore I filter on city and prof_code to return the matching UUID. That UUID also identifies a hash that holds the full record as key/value pairs.
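
The layout described above can be sketched with in-memory stand-ins, assuming one record per UUID. Python dicts and sets replace the Redis zset, sets and hashes here, and the set intersection stands in for the zinterstore step. All names (`add_record`, `search`, `name_index`, `location_sets`, `prof_sets`) are hypothetical, and the sample records are made up for illustration.

```python
import uuid
from collections import defaultdict

name_index = defaultdict(set)     # word prefix -> UUIDs (stand-in for the name zset)
location_sets = defaultdict(set)  # "<State>:<City>" -> UUIDs
prof_sets = defaultdict(set)      # prof_code -> UUIDs
records = {}                      # UUID -> full record (stand-in for one hash per UUID)


def add_record(first_name, last_name, business, state, city, prof_code):
    """Store one record and index every word of the three searchable fields."""
    rid = str(uuid.uuid4())
    records[rid] = {
        "first_name": first_name, "last_name": last_name, "business": business,
        "state": state, "city": city, "prof_code": prof_code,
    }
    for field in (first_name, last_name, business):
        for word in field.lower().split():
            for end in range(1, len(word) + 1):
                name_index[word[:end]].add(rid)
    location_sets[f"{state}:{city}"].add(rid)
    prof_sets[prof_code].add(rid)
    return rid


def search(keyword, state, city, prof_code):
    """Intersect the name matches with the location and profession sets."""
    matches = (
        name_index.get(keyword.lower(), set())
        & location_sets.get(f"{state}:{city}", set())
        & prof_sets.get(prof_code, set())
    )
    return [records[rid] for rid in matches]


add_record("Alok", "Kumar", "Kumar Clinic", "Illinois", "Chicago", "doctor")
add_record("Alan", "Smith", "Smith Law", "Illinois", "Chicago", "lawyer")
add_record("Kumiko", "Sato", "Sato Dental", "Illinois", "Springfield", "doctor")

chicago_doctors = search("kum", "Illinois", "Chicago", "doctor")
```

Only the record that matches the keyword, the location and the profession code survives the intersection; the full record is then read back through its UUID.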

I didn't find a better way to achieve this.

If you know a better way, I would appreciate your suggestions.

Thanks,


Evans A.

Nov 30, 2016, 11:57:31 PM
to redi...@googlegroups.com

Why don't you treat state, city, prof_code ... property_N in the same way as the indexes for the names?
(Your current approach works if you treat all the property indexes the same way, doesn't it?)

ASIDE:

The overhead with zinterstore is that it operates on the entire sorted set AND stores the result AND requires the result to be updated or cleaned up; this extra work and storage goes to waste if the result is not reused in some way.
So you could implement your own merge-join* (i.e. your own equivalent of zinterstore).
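
A client-side merge-join of this kind might look like the sketch below. It assumes each index has already returned its IDs as a strictly ascending list with no duplicates, and it is not taken from redislayer; `merge_join` and its `offset`/`limit` parameters are hypothetical names. It stops as soon as the requested page is full, so nothing is stored and nothing needs cleaning up afterwards.

```python
def merge_join(sorted_lists, offset=0, limit=10):
    """Intersect ascending ID lists and return one page of common IDs."""
    if not sorted_lists:
        return []
    positions = [0] * len(sorted_lists)
    page, skipped = [], 0
    while len(page) < limit:
        # Once any list is exhausted there can be no further common IDs.
        if any(pos >= len(ids) for pos, ids in zip(positions, sorted_lists)):
            break
        heads = [ids[pos] for pos, ids in zip(positions, sorted_lists)]
        target = max(heads)
        if all(head == target for head in heads):
            if skipped < offset:
                skipped += 1          # still inside the skipped prefix of the result
            else:
                page.append(target)
            positions = [pos + 1 for pos in positions]
        else:
            # Advance every cursor that is still behind the largest head.
            for i, ids in enumerate(sorted_lists):
                while positions[i] < len(ids) and ids[positions[i]] < target:
                    positions[i] += 1
    return page


name_ids = [1, 4, 5, 9, 12]
city_ids = [4, 5, 7, 12]
prof_ids = [2, 4, 12, 15]

first_page = merge_join([name_ids, city_ids, prof_ids], offset=0, limit=10)
second_hit = merge_join([name_ids, city_ids, prof_ids], offset=1, limit=1)
```

Against a real server the lists would more likely be fetched in chunks rather than in full, which is where the saving over storing a complete intersection would come from.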

Depending on how you tested, things may seem scalable if your test data generates few Members per Key.
That is, if you store the prefixes in the keys like the article you linked to (probably the only chance for zinterstore), you trade off Members for Keys:
zkey:a    --> 4, 5
zkey:al   --> 4
...
zkey:水  --> 7
With such prefixing, zinterstore deals with very few Members/matches (btw zinterstore and Redis are both fast).
So, the test could give a false impression.
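
To make the trade-off concrete, here is a small sketch that builds both layouts from the same made-up records: one with the prefix in the key (many keys, few members each) and one with the prefix in the member (one key, many members). Plain dicts and lists stand in for Redis keys and sorted sets, and the function names are hypothetical.

```python
from collections import defaultdict


def prefixes(name):
    """All leading substrings of a name: 'alok' -> a, al, alo, alok."""
    return [name[:end] for end in range(1, len(name) + 1)]


def index_prefix_in_key(records):
    """One key per prefix; the members are the matching IDs."""
    keys = defaultdict(list)
    for record_id, name in records:
        for prefix in prefixes(name):
            keys[f"zkey:{prefix}"].append(record_id)
    return {key: sorted(ids) for key, ids in keys.items()}


def index_prefix_in_member(records):
    """A single key; every member is a 'prefix:id' string."""
    members = [
        f"{prefix}:{record_id}"
        for record_id, name in records
        for prefix in prefixes(name)
    ]
    return {"zkey": sorted(members)}


sample = [(4, "alok"), (5, "an"), (7, "水")]
by_key = index_prefix_in_key(sample)
by_member = index_prefix_in_member(sample)
```

With the first layout an intersection only ever touches the handful of IDs under one prefix key, which is why a test on that layout can look fast regardless of how the approach behaves when a single key holds many members.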


In the end, it's a question of whether you store vertically (more Keys) or horizontally (more Members), and whether you use zinterstore for your merge-join.
* Generally I think you can go ahead with your current approach, or check out this implementation: https://github.com/scorpevans/redislayer/blob/master/nodejs/example2.js (ORM + merge-join + other benefits); I'll gladly give you all the support you need to try out redislayer.



Evans



zenmaster

Dec 2, 2016, 2:09:00 AM
to redi...@googlegroups.com
Hi Evans,

I am using the C# StackExchange.Redis driver, and I am using Python to import updates from HBase into the Redis cache.

Fields like state, city, prof_code ... property_N don't need to be indexed. I mean the user won't be searching on them through typeahead/autocomplete/select2. The only things they will search on are first_name, last_name or business name, which is why I am indexing only those 3 fields. Every single name/business has a separate UUID through which I get its respective city, state, etc.

There are 100,000 first_name, last_name and business names in total, with approx. 20 members per record, which is 2,000,000 keys. That's a decent-sized number. City, state and prof_code are just <30 keys, so zinterstore is quick in this scenario.

I am not doing SUNION (merge-join) because I need the unique UUIDs that match in both the zset (indexed first name/last name) and the set (city, state, etc.).

BTW, redislayer looks like a very promising and interesting project. I am fairly new to Redis, and because the JS driver didn't support transactions with Redis when I last looked, I had to stick to C# and Python.

I will be doing more testing in the coming days. I am sure I will find a good use for redislayer when a use case comes up in the future.

Thanks for your help,

