sub-key search /sub-value search?

692 views
Skip to first unread message

adityan

unread,
Sep 26, 2012, 5:02:57 PM9/26/12
to lev...@googlegroups.com
Hi 
I just got to know about leveldb and it looks pretty useful for fast key-value operations.  It is not very clear if leveldb can support sub-key search/sub-value search. So, key-values are stored as byte arrays. Now if my key was a string like say substring1+ substring2 + substring3 = (my key), can I use leveldb and search for all key-value pairs in the db that have say substring1 or substring2 in their keys?

Clarifying further: 
If key1 = substr1 + substr2 + substr3
   key2 = substrA  + substr2 + substrB
   key3 = substrX + substr2 + substrY

is there a way to query for all key-value pairs  which have *substr2* in their keys?

thanks

adityan

unread,
Sep 28, 2012, 5:14:15 PM9/28/12
to lev...@googlegroups.com
No answers? : )
I am hoping it wasnt a confusing question

Does leveldb support sub-key or sub-value search? thanks

Joshua Bell

unread,
Sep 28, 2012, 6:10:41 PM9/28/12
to lev...@googlegroups.com
I don't believe there's an efficient way to iterate over a key infix or suffix.

A typical solution would be to turn the infix into a prefix by creating an index, e.g.:

encode records as:
<type_record, substr1, substr2, substr3> = <value>

then have an index over substr2 encoded e.g.:
<type_index, substr2, primary_key> = <sentinel>

(where primary_key is probably <substr1, substr2, substr3> - but by encoding the primary key into the index key you can handle duplicates without having to invent a counter)

adityan

unread,
Oct 1, 2012, 7:25:22 AM10/1/12
to lev...@googlegroups.com
Thanks very much Joshua for your reply.

Could you explain a bit more about the actual indexing part? 

- where would you store the indexes? would you just store them as another k-v pair in the db itself?
- so, are you suggesting that I store 
   <type_index, substr2, primary_key> = <sentinel>
   <type_index, substr3, primary_key> = <sentinel>
   i.e. for each substring, store one entry in index table? What would the sentinel be??

appreciate your time
thanks

Joshua Bell

unread,
Oct 1, 2012, 12:28:58 PM10/1/12
to lev...@googlegroups.com
On Mon, Oct 1, 2012 at 4:25 AM, adityan <nri.a...@gmail.com> wrote:
Thanks very much Joshua for your reply.

Could you explain a bit more about the actual indexing part? 

- where would you store the indexes? would you just store them as another k-v pair in the db itself?

Yes. Sorry for a somewhat obtuse reply.

This is similar to the coding scheme we use for the Indexed DB implementation in Chromium on top of LevelDB. Different types of data (records, indexes, etc) are given different key prefixes, but all data for e.g. one origin is stored in a single leveldb instance. That's what I was intending to evoke with the "type_index" and "type_record" prefixes.
 
- so, are you suggesting that I store 
   <type_index, substr2, primary_key> = <sentinel>
   <type_index, substr3, primary_key> = <sentinel>
   i.e. for each substring, store one entry in index table?

For each record (i.e. unique primary key) you have an entry in the index table. For example, say you're storing these records (and ignoring type prefixes for the moment):

<aaa, bbb, ccc> = data:xxx
<ddd, bbb, eee> = data:yyy
<fff, ggg, eee> = data:zzz

Then you'd want these entries for an index on substr2:

<bbb, aaa, bbb, ccc> = sentinel
<bbb, ddd, bbb, eee> = sentinel
<ggg, fff, ggg, eee> = sentinel

Then to find all of the records that have substr2 = bbb you look for the index entries with the prefix bbb. The suffix of the key is the primary key of the record, which gets you back to the data. This also assumes you're not simply concatenating bits together, but have a string terminator or strings are length-prefixed.

(I'm not sure if your use of substr is intended to be *any* arbitrary substring of the key, or if you really have three-part keys. I've been assuming the latter.)
 
What would the sentinel be??

The sentinel is a small dummy value for a record, which isn't actually used. You only need the key itself, which gives you the primary key. IIRC, the sentinel can be the empty string.

adityan

unread,
Oct 3, 2012, 5:51:27 PM10/3/12
to lev...@googlegroups.com
Thanks Joshua! I really like the idea of storing entire key as a part of the index itself ! 

thanks

David Yu

unread,
Oct 4, 2012, 12:02:57 AM10/4/12
to lev...@googlegroups.com
On Thu, Oct 4, 2012 at 5:51 AM, adityan <nri.a...@gmail.com> wrote:
Thanks Joshua! I really like the idea of storing entire key as a part of the index itself ! 
That's basically how secondary indexing works (in every database out there).



--
When the cat is away, the mouse is alone.
- David Yu

Harit Himanshu

unread,
Oct 5, 2015, 1:58:06 PM10/5/15
to leveldb
Is there a code sample that I can see to learn more as how to achieve this? I am very new to leveldb. We have a situation (I asked question here), where I think this strategy might help.

Thanks
+ Harit Himanshu
Reply all
Reply to author
Forward
0 new messages