Querying values by pattern?

4,397 views
Skip to first unread message

Mason Jones

unread,
May 11, 2009, 2:20:48 AM5/11/09
to redi...@googlegroups.com
Another question...looking at my company's potential usage of redis,
I'm wondering about current "best practices" for finding a set of keys
where the values match a pattern?

As an example, we're dealing with product data. One of the fields in
my testing is product category. If we're setting a key like
"product:123:category" with a value like "Electronics", what would be
the fastest and most efficient way to find all of the keys with a
matching value -- if I wanted to get all of the products in the
"Electronics" category.

Obviously I could place those into a set based on the category, so
perhaps that's not the best example, but if I tried to create a set
for every field and every value, it will quickly multiply into numbers
that are far too large. Alternately, if a field is product title:

key="product:123:title", value="LCD Television"
key="product:456:title", value="Plasma Television"

And I want to get the keys for all products where title contains
"Television". I can of course get all of the title keys by
constructing an MGET, but that could be a million keys. If I then need
to get the values and compare each against a pattern like
"*Television*", it doesn't seem so efficient. If I could at least get
all of the keys that match a pattern, it would help a bit -- are there
plans for this?

Any suggestions for better ways to handle this would be great.

Thanks.

-- Mason

Salvatore Sanfilippo

unread,
May 11, 2009, 4:52:13 AM5/11/09
to redi...@googlegroups.com
On Mon, May 11, 2009 at 8:20 AM, Mason Jones <maso...@gmail.com> wrote:

Hello Mason,

> Another question...looking at my company's potential usage of redis,
> I'm wondering about current "best practices" for finding a set of keys
> where the values match a pattern?

KEYS foo:*

but it's not suggested to use Redis this way, unless the pattern
matching thing is just used in order to rebuild the DB schema or for
debugging.

> As an example, we're dealing with product data. One of the fields in
> my testing is product category. If we're setting a key like
> "product:123:category" with a value like "Electronics", what would be
> the fastest and most efficient way to find all of the keys with a
> matching value -- if I wanted to get all of the products in the
> "Electronics" category.

You have to do this in a simpler way.
Every time you add a product in the electronic category you "tag" it into a Set:

SADD category:Electronics <your-product-id>

To get all the IDs of the products under Electronics:

SELEMENTS category:Electronics

To get the name and price of all this procuts in Electronics, sorted by prince:

SORT category:Electronics BY product:*:price GET product:*:name GET
product:*:price

> Obviously I could place those into a set based on the category, so
> perhaps that's not the best example, but if I tried to create a set
> for every field and every value, it will quickly multiply into numbers
> that are far too large. Alternately, if a field is product title:

You should place just the IDs in this Sets.
On the other hand it will be very handy, for example, to get the
products that are both in Electronics and Music

SINTER category:Electronics category:Music

and Redis will return the intersection of the IDs. If you also need
SORT against the intersection try SINTERSTORE followed by SORT.
Possibly set an EXPIRE in the output of INTERSTORE so that it will
automatically act as a cache.

> key="product:123:title", value="LCD Television"
> key="product:456:title", value="Plasma Television"
>
> And I want to get the keys for all products where title contains
> "Television". I can of course get all of the title keys by

Ok if you want to match *by value* Redis is not a good idea at all,
otherwise you need to have Sets in order to perform a FULLTEXT
indexing of the fields you want to search against.

> constructing an MGET, but that could be a million keys. If I then need
> to get the values and compare each against a pattern like
> "*Television*", it doesn't seem so efficient. If I could at least get
> all of the keys that match a pattern, it would help a bit -- are there
> plans for this?

Actually to match against all the keys is less efficient than using a
Set + Sort to retrieve the associated keys.
Note that you can use SORT avoiding the sorting stage at all if you
are not interested in getting ordered elements:

SORT foobarkey BY dontsort ....

"dontsort" is an example, any constant key non containing '*' will work.

> Any suggestions for better ways to handle this would be great.

Hope this helps!

Cheers,
Salvatore

>
> Thanks.
>
> -- Mason
>
> >
>



--
Salvatore 'antirez' Sanfilippo
http://invece.org

"Once you have something that grows faster than education grows,
you’re always going to get a pop culture.", Alan Kay

Mason Jones

unread,
May 11, 2009, 11:16:01 PM5/11/09
to redi...@googlegroups.com
On Mon, May 11, 2009 at 1:52 AM, Salvatore Sanfilippo <ant...@gmail.com> wrote:
>
> On Mon, May 11, 2009 at 8:20 AM, Mason Jones <maso...@gmail.com> wrote:
>
> Hello Mason,
>
>> Another question...looking at my company's potential usage of redis,
>> I'm wondering about current "best practices" for finding a set of keys
>> where the values match a pattern?
>
> KEYS foo:*
>
> but it's not suggested to use Redis this way, unless the pattern
> matching thing is just used in order to rebuild the DB schema or for
> debugging.

Yes, I guess I wasn't clear enough -- I meant where the *values* match
a pattern. For example, like a traditional RDBMS query of "select id
where category='electronics'", I want all of the keys where the values
match. I realize this isn't the ideal use of redis; it's a sort of
sideways requirement we have.

>> As an example, we're dealing with product data. One of the fields in
>> my testing is product category. If we're setting a key like
>> "product:123:category" with a value like "Electronics", what would be
>> the fastest and most efficient way to find all of the keys with a
>> matching value -- if I wanted to get all of the products in the
>> "Electronics" category.
>
> You have to do this in a simpler way.
> Every time you add a product in the electronic category you "tag" it into a Set:
>
> SADD category:Electronics <your-product-id>
>
> To get all the IDs of the products under Electronics:
>
> SELEMENTS category:Electronics
>
> To get the name and price of all this procuts in Electronics, sorted by prince:
>
> SORT category:Electronics BY product:*:price GET product:*:name GET
> product:*:price

So that implies that the above does a "query" of the elements in the
"category:Electronics" set and returns just the two name and price
elements?

Doing this might work, but it is possible for a customer to have many
categories, and there are other similar field types that we need to
treat this way (such as Manufacturer). If this results in several
thousand tags being applied to a few million elements, would it scale?

>> key="product:123:title", value="LCD Television"
>> key="product:456:title", value="Plasma Television"
>>
>> And I want to get the keys for all products where title contains
>> "Television". I can of course get all of the title keys by
>
> Ok if you want to match *by value* Redis is not a good idea at all,
> otherwise you need to have Sets in order to perform a FULLTEXT
> indexing of the fields you want to search against.

Yes, I thought so. I believe that perhaps we may have only a few
fields where we must have a full-text search, and it may make sense to
separately index them outside redis, map to ids, and then use redis to
get the rest of the information. We'll see about that.

> Hope this helps!

Yes, thank you. I have some more things to try now!

-- Mason

Reply all
Reply to author
Forward
0 new messages