Or maybe the question should be: What's the best way to represent strings as numbers, such that sorting the numeric representations gives the same order as sorting the strings themselves? I devised a way that can handle up to 9 characters per string, but it seems like there should be a much better way.
In advance, I don't think using Redis's lexicographical commands will work. (See the following example.)
Example: Suppose I want to presort all of the names linked to some ID. Ideally I would have the IDs as the zset's members, and the numeric representation of each name would be the zset's scores.
Does that make sense? Or am I going about it wrong?
--
You received this message because you are subscribed to the Google Groups "Redis DB" group.
To unsubscribe from this group and stop receiving emails from it, send an email to redis-db+u...@googlegroups.com.
To post to this group, send email to redi...@googlegroups.com.
Visit this group at http://groups.google.com/group/redis-db.
For more options, visit https://groups.google.com/d/optout.
Itamar Haber | Chief Developers Advocate
Redis Watch Newsletter - Editor and Janitor
Redis Labs - Enterprise-Class Redis for Developers
Mobile: +1 (415) 688 2443
Mobile (IL): +972 (54) 567 9692
Email: ita...@redislabs.com
Skype: itamar.haber
I suppose I'll start from the beginning and explain what I'm trying to do.
1) Users create any resource with arbitrary properties via a RESTful API, where the properties of that resource can be of any Redis data type. I should probably note that users don't normally need to bother with the API, as most calls to it are handled automatically by the app. Users typically only specify the data types (and even then, some properties' data types are determined automatically).
2) The keys representing each property look something like <namespace>:<property>:<id>, but for certain data types (like strings and numbers) it defaults to Redis hashes in the form of <namespace>:<property>:<id.substring(0,4)> for the key and <id.substring(4)> for the field.
3) The API works similarly to Redis. For example, suppose you have some list called "exampleList". You could POST some value to /<id>/exampleList?_bUnshift=1 to put it at the beginning of the list, or DELETE /<id>/exampleList?_start=1&_stop=3 to trim it.
4) Users can retrieve multiple resources at once via queries. So for example, suppose you wanted to retrieve all the properties of red shirts of a certain brand. That request might look like: GET /query?type=shirt&color=red&brand=foo&_limit=10. Or if you only want to retrieve a few specific properties: GET /query/name,description,price?type=shirt&color=red&brand=foo&_limit=10. This would return the latest 10 resources matching those properties. Or if you wanted to sort by name: GET /query?type=shirt&color=red&brand=foo&_limit=10&_sortBy=name.
All of that is fairly standard, basic stuff, I believe. But hopefully that's enough info to understand my use case, and I'll try to explain my reasoning for wanting to be able to somehow use strings as scores in ZSETs.
On the backend, the queries are handled using SINTER.
When some resource is created or its properties are changed, there are sets in Redis whose keys look like <namespace>:by:<property>:<value> and contain all the IDs of the property:value pairs. So for instance, suppose some resource whose ID is 5 has a property called "color" and its value is "red". The Redis set at key <namespace>:by:color:red would contain the ID "5". And the call to Redis for the above query would look something like "SINTER <namespace>:by:type:shirt <namespace>:by:color:red <namespace>:by:brand:foo" and it would return all of the matching IDs.
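To make the index layout concrete, here is a minimal sketch that maps the filter part of a query string to the set keys described above and intersects them in memory. It does not talk to Redis; the `intersect` function only stands in for SINTER. The names `queryToSetKeys` and `intersect`, the namespace "app", and the sample IDs are assumptions made up for illustration, as is the rule that parameters starting with "_" are options rather than filters.

```javascript
// Hypothetical helper: turn the filters of a query string into index set keys.
// Parameters starting with "_" (such as _limit and _sortBy) are assumed to be options.
function queryToSetKeys(namespace, queryString) {
  const keys = [];
  for (const [property, value] of new URLSearchParams(queryString)) {
    if (!property.startsWith("_")) {
      keys.push(`${namespace}:by:${property}:${value}`);
    }
  }
  return keys;
}

// In-memory stand-in for SINTER: keep the IDs that are present in every set.
function intersect(sets) {
  if (sets.length === 0) return [];
  const [first, ...rest] = sets;
  return [...first].filter((id) => rest.every((s) => s.has(id)));
}

// Toy index; "app" is a made-up namespace and the IDs are made up too.
const index = new Map([
  ["app:by:type:shirt", new Set(["5", "7", "9"])],
  ["app:by:color:red", new Set(["5", "9", "12"])],
  ["app:by:brand:foo", new Set(["5", "8", "9"])],
]);

const keys = queryToSetKeys("app", "type=shirt&color=red&brand=foo&_limit=10");
const ids = intersect(keys.map((k) => index.get(k) || new Set()));

console.log(["SINTER", ...keys].join(" "));
console.log(ids); // IDs "5" and "9" are the only ones in all three sets
```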
Now for the problem of sorting by strings... I anticipate there sometimes being an incredibly large number of IDs resulting from the intersection, and so I would rather not have to retrieve the (string) property for each ID if it's feasible to have everything presorted (which I seem to be able to do up to 9 characters) and then intersected using ZINTERSTORE.
I am aware of Redis's SORT command, but wouldn't it be considerably slower than using ZINTERSTORE?
Plus, I think I would need to restructure certain property types (see point 2 above) to make it work, and I would prefer to not do that.
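A rough sketch of what the presorted intersection could look like, assuming a sorted set whose members are IDs and whose scores are the numeric name representation. As far as I know, ZINTERSTORE also accepts plain sets as input keys and treats their members as having score 1, so giving the filter sets a weight of 0 leaves only the name score in the result under the default SUM aggregation. The code only builds the command arguments and does not connect to a server; the key names "app:sorted:name" and "app:tmp:query1" and the function name are hypothetical.

```javascript
// Build the arguments for a ZINTERSTORE that intersects one presorted zset
// with any number of plain filter sets. Weight 1 keeps the zset's score,
// weight 0 cancels the implicit score of 1 that plain sets contribute.
function buildSortedIntersection(destKey, sortedKey, filterKeys) {
  const keys = [sortedKey, ...filterKeys];
  const weights = [1, ...filterKeys.map(() => 0)];
  return [
    "ZINTERSTORE", destKey, String(keys.length),
    ...keys,
    "WEIGHTS", ...weights.map(String),
  ];
}

const store = buildSortedIntersection(
  "app:tmp:query1",   // hypothetical destination key
  "app:sorted:name",  // hypothetical presorted zset (member = ID, score = name)
  ["app:by:type:shirt", "app:by:color:red", "app:by:brand:foo"]
);

// With _limit=10, the first page would then be the ten lowest scores.
const page = ["ZRANGE", "app:tmp:query1", "0", "9"];

console.log(store.join(" "));
console.log(page.join(" "));
```

The destination key is a temporary result, so in practice it would probably need an expiry or an explicit delete once the page has been read.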
Does all of that make sense? If I'm completely overlooking something or if this is simply a bad design or if there's a flaw in my logic somewhere, please let me know!
Hey Josiah, thanks for the detailed reply!
> This is pretty standard searching in Redis, though your #2 is a little funky. What is more common (in the #2 case) is to just store data as:
>
> <namespace>:id: -> {<property>: <value>, ...}
>
> Aside from minimizing the number of keys in Redis (which can help reduce memory to some extent), why are you sharding by id prefix (other sharding methods may offer better memory savings)?
I've read that storing data using that method, where possible, is the most efficient way and can reduce memory usage quite a bit with the proper config, which makes sense to me. What other methods would you recommend? To me, it doesn't make sense to use the typical property:value hashes unless I were using the SORT command, which I don't think I will actually use. Plus, I obviously can't store other data types (sets, lists, bitmaps, etc.) within said hash table, which are quite common.
I don't anticipate string searches being common, at least not for anything important, so I think a 12 character case-insensitive alphanumeric prefix will work just fine. How can this be achieved?
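One possible way to get such a score, sketched under a few assumptions of my own: each character becomes a base-37 digit (0 is reserved for "no character", then 0-9, then a-z), the string is lowercased, and anything that is not alphanumeric is dropped. Redis stores sorted set scores as double-precision floats, which represent integers exactly only up to 2^53, and 37^12 is larger than that, so this sketch uses a 10-character prefix to keep every score exact. This is not necessarily the algorithm Josiah has in mind, and `prefixScore` is a made-up name.

```javascript
const ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyz";
const BASE = ALPHABET.length + 1; // 37: digit 0 means "no character here"
const WIDTH = 10;                 // 37 ** 10 is below 2 ** 53, so scores stay exact

// Encode a case-insensitive alphanumeric prefix as an integer whose numeric
// order matches the lexicographic order of the (normalized) prefix.
function prefixScore(text) {
  const chars = [...text.toLowerCase()]
    .filter((c) => ALPHABET.includes(c)) // assumption: ignore spaces, punctuation, etc.
    .slice(0, WIDTH);
  let score = 0;
  for (let i = 0; i < WIDTH; i++) {
    // Short strings are padded with digit 0, so "al" sorts before "alice".
    const digit = i < chars.length ? ALPHABET.indexOf(chars[i]) + 1 : 0;
    score = score * BASE + digit;
  }
  return score;
}

const names = ["bob", "Alice", "al", "alice2", "Bo"];
const sorted = [...names].sort((a, b) => prefixScore(a) - prefixScore(b));
console.log(sorted); // [ 'al', 'Alice', 'alice2', 'Bo', 'bob' ]
```

Strings that agree on their first 10 normalized characters get the same score, so they tie and would need a secondary comparison if their relative order matters.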
...
Thanks again for the quick reply!!
> So, if you have ids up to 1235000 (without skipping any), your hash will be 1111 elements in size. And as your number of digits increases, your fully-prefixed hashes will grow. On the other side of things, you will also have hashes like 1, 12, 123, etc., with only a single element.
I probably should have mentioned that every ID is actually the time the resource was created, using JavaScript's Date object and checked for uniqueness: (new Date()).getTime(). Whether or not this was a good decision... I guess I'll find out haha.
I did it mainly for the following reasons: 1) It's extremely common for the client to want to know the time some resource was created. 2) For the sharding reason you explained, although your method is obviously much better. I might convert everything to your method if you think it would be worth the effort.
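For reference, a small sketch of how the key/field split from point 2 behaves with these IDs. With a 13-digit millisecond timestamp, the first four digits change once every 10^9 ms (roughly 11.6 days), so each hash collects the IDs created in that window. The function name `hashLocation`, the namespace "app", and the property "name" are made up for illustration.

```javascript
// Split an ID into the hash key and hash field described in point 2:
// key   = <namespace>:<property>:<id.substring(0,4)>
// field = <id.substring(4)>
function hashLocation(namespace, property, id) {
  const text = String(id);
  return {
    key: `${namespace}:${property}:${text.substring(0, 4)}`,
    field: text.substring(4),
  };
}

// A millisecond timestamp ID (this one falls in early March 2015).
const fromTimestamp = hashLocation("app", "name", 1425168000000);
console.log(fromTimestamp); // { key: 'app:name:1425', field: '168000000' }

// The sequential ID from the quoted example lands in the "1235" hash.
const fromCounter = hashLocation("app", "name", 1235000);
console.log(fromCounter); // { key: 'app:name:1235', field: '000' }
```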
> This functionality will not work in a language that does not have 64 bit integers (Javascript being one in particular, as all numbers in Javascript are IEEE 754 fp doubles). You can test to see if your PHP has 64 bit integers by checking PHP_INT_SIZE.
I will definitely try out the algorithm you outlined. I am using Node, but surely there's a way to somehow make Redis interpret the proper value via JavaScript??
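A quick sketch of the limitation being described, and of one way around it on the client side, assuming a Node version with BigInt support. Plain numbers stop being exact above 2^53, while BigInt can hold a full 64-bit value and hand it over as a decimal string. One caveat I'm fairly sure of: Redis itself stores sorted set scores as doubles, so a string like this avoids rounding in JavaScript but not necessarily on the server. The quoted algorithm is not shown in this message, so none of this is specific to it.

```javascript
// Plain JavaScript numbers are doubles: integers are exact only up to 2 ** 53 - 1.
const limit = Number.MAX_SAFE_INTEGER;
console.log(limit);                   // 9007199254740991
console.log(2 ** 53 === 2 ** 53 + 1); // true: the +1 is lost to rounding

// BigInt keeps full precision for 64-bit values.
const big = 2n ** 63n - 1n;           // largest signed 64-bit integer
const asText = big.toString();        // decimal string, safe to pass as a command argument
console.log(asText);                  // 9223372036854775807

// Converting back to a plain number rounds to the nearest double.
console.log(Number(big) === 2 ** 63); // true
```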