Question about querying views with array keys

39 views
Skip to first unread message

Brendan Duddridge

unread,
Aug 26, 2015, 4:38:28 PM8/26/15
to Couchbase Mobile
Hi,

If I have a view that emits keys something like this:

emit(@[doc[@"Form"], doc[@"Record"], doc[@"Field"]], nil);


I would like to fetch all rows that have a  specific Field no matter what the value of Form and Record are.

Is that possible?

I thought maybe something like this would work:

query.keys = @[@[@"", @"", field]];


But it didn't work. It found 0 rows. I kind of need a wildcard for Form and Record or something like that.

I'm basically trying to support a cascade delete. If the user deletes a Form, Record, or Field, I need to delete all rows that reference any of them.

Would it be better to use an NSDictionary as the key instead of an NSArray? Sort order is unimportant in this case.

Thanks,

Brendan

Brendan Duddridge

unread,
Aug 28, 2015, 1:42:00 PM8/28/15
to Couchbase Mobile
So am I barking up the wrong tree with my approach? Or is there a better way to query this kind of data? I've never used a dictionary for a key, so is that maybe the best approach to take instead of an array key?

Thanks

Brendan

Jens Alfke

unread,
Aug 28, 2015, 2:29:11 PM8/28/15
to Couchbase Mobile
On Aug 26, 2015, at 1:38 PM, Brendan Duddridge <bren...@gmail.com> wrote:

I would like to fetch all rows that have a  specific Field no matter what the value of Form and Record are.

Then you’ll need a view that has Field as its key (or its primary key, if the key is an array).

But it didn't work. It found 0 rows. I kind of need a wildcard for Form and Record or something like that.

No can do; that simply isn’t how indexes (in any database) work. Think of the index as a big on-disk array of {key, value} pairs that’s always sorted by the key. It’s very efficient — O(log n) time — to find an item with a given key, and stepping to the next item takes a very small constant time. But iterating the entire index is slow, O(n). For more details, consult Wikipedia for “search tree” or “b-tree”.

Your view’s index is sorted by Form, then Record, then Field. So its ordering looks like

form1, record1, field1
form1, record1, field2
form1, record1, field3
form1, record2, field1
form1, record2, field2
form2, record1, field1
form3, record1, field1,
form4, record2, field2
where “1”, “2”, etc. are just placeholders showing the lexical order that the different values appear in.

As you can see from this, if you ask for all rows where the key contains “field2”, there’s no quick way to find them, because they’re scattered all through the index. So the database has to do a slow linear scan of the entire index, comparing the 3rd item of each key with “field2”.

Instead, the way to find “field2” efficiently is to create another index, whose primary key is Field. Then all the “field2” values will be adjacent to each other and can quickly be found and scanned.

Again, this holds true for any type of database, it’s just that you’re probably used to SQL which will do more work for you — its approach is to perform any query you ask of it, however inefficient. So a SQL database would let you do what you’re asking; you’d just find as you scaled up your database that the query got really slow. And then you might ask a SQL guru or do some research, and find that the answer is to run another CREATE INDEX statement to make an index with Field as its primary key. Exactly as with Couchbase Lite.

(Note that you could use the CBLQueryBuilder, which is sort of SQL-like in that it can look at your high-level predicates and figure out how to convert them into an index-based query. It acts differently than SQL in that it will automatically create a view and index as necessary to make the query fast. Which is better than SQL in some ways, but has the drawback that you might end up with too many views/indexes, which will slow down querying after an insertion.)

Hope this helps, and hope I didn’t come off as patronizing :)

—Jens

Brendan Duddridge

unread,
Aug 28, 2015, 2:48:24 PM8/28/15
to Couchbase Mobile
Hello Jens,

Ok thanks very much for the explanation. It never hurts to know more about the internal workings of Couchbase and your explanation makes perfect sense.

I just have to decide now if I want to perhaps just do an in-memory search for these objects (after fetching all the objects that match the Form part) or if I want to add 2 more views to manage the records and fields separately on disk, consuming more disk space and slowing down the queries and indexing. I'm leaning towards just doing it in memory. Fortunately for any given record or field, I already know which form it belongs to.

I guess using a dictionary key wouldn't make sense since you didn't address that part.

Thanks,

Brendan

Jens Alfke

unread,
Aug 28, 2015, 3:05:42 PM8/28/15
to Couchbase Mobile
On Aug 28, 2015, at 11:48 AM, Brendan Duddridge <bren...@gmail.com> wrote:

I'm leaning towards just doing it in memory. 

That would be my choice too. “Do the simplest thing that could possibly work.”
The best way to implement it is probably as a postFilter on the query. (That’s the way the QueryBuilder will do it if you ask it to.)

I guess using a dictionary key wouldn't make sense since you didn't address that part.

Dictionary keys turn out to be pretty useless. They work like array keys except that the priority of the items depends not on the position in the array, but on the lexical order of the key. So they don’t do anything that array keys can't do, but they’re harder to understand, mistake-prone (some of the string ordering details are arcane), and use up more memory and disk space.

It would be cool if dictionary keys did what you wanted — letting you search efficiently by any of the keys in the dictionary — but to do that we’d probably have to internally split up the view and generate a separate index for each key that appeared in a dictionary. Which sounds difficult, and it could potentially create an explosion of indexes if misused by a developer.

—Jens
Reply all
Reply to author
Forward
0 new messages