CBL .NET 4.5 :: How to get a set of properties of all documents really fast?

Sherry Ummen

Sep 1, 2014, 4:33:02 AM
to mobile-c...@googlegroups.com
Hi,

In our software we show the user a list of all documents in the database. We also have something like an object browser: a file-open dialog that lists all the documents in the database, and when the user selects one, that document is read from the database.

For this scenario I want to get the list of all documents along with certain properties of each.

I have tried a view, but the first query was extremely slow. In my experience views also sometimes remain stale. Could someone suggest a more efficient way to do this?


-Sherry

Sherry Ummen

Sep 1, 2014, 6:30:30 AM
to mobile-c...@googlegroups.com
I also tried this:

    var view = _database.CreateAllDocumentsQuery();
    var docs = view.Run().Select(x => new DatabaseUtils.CacheData
    {
        DescrName = (string) x.Document.Properties["name"],
        Date = Convert.ToInt32(x.Document.Properties["date"]),
        DescriptionType = Convert.ToInt32(x.Document.Properties["type"]),
        Version = (string) x.Document.Properties["version"]
    });
    _iterator = docs.GetEnumerator();

This is also extremely slow. Please help :(

Jens Alfke

Sep 1, 2014, 1:44:31 PM
to mobile-c...@googlegroups.com

On Sep 1, 2014, at 1:33 AM, Sherry Ummen <sherry...@gmail.com> wrote:

> I have tried a view, but the first query was extremely slow. In my experience views also sometimes remain stale.

What does your view's map function look like?

—Jens

sherry...@gmail.com

Sep 1, 2014, 10:21:44 PM
to mobile-c...@googlegroups.com
Here it is:

    var view = db.GetView("aview");
    view.SetMapReduce((IDictionary<string, object> document, EmitDelegate emitter) => {
        if (document["_id"] != null) {
            emitter(document["_id"], document["_id"]);
        }
    }, null, "1");
    return view;



s.ande...@gmail.com

Sep 2, 2014, 6:25:01 AM
to mobile-c...@googlegroups.com
> var view = db.GetView("aview");
> view.SetMapReduce((IDictionary<string, object> document, EmitDelegate emitter) => {
>     if (document["_id"] != null) {
>         emitter(document["_id"], document["_id"]);
>     }
> }, null, "1");
> return view;

Why do you need to test that the document ID is not null? Would it ever be null?

Why do you emit the document ID as the value in addition to the key? Why not just emit the document ID as the key and null as the value?

Jens Alfke

Sep 2, 2014, 12:08:17 PM
to mobile-c...@googlegroups.com

On Sep 1, 2014, at 7:21 PM, sherry...@gmail.com wrote:

> view.SetMapReduce((IDictionary<string, object> document, EmitDelegate emitter) => {
>     if (document["_id"] != null) {
>         emitter(document["_id"], document["_id"]);
>     }

There's no reason to create a view like this; it's identical to the built-in all-documents view. If you want to iterate over all documents, just call Database.CreateAllDocumentsQuery instead.

(Also, as s.anderson pointed out, document["_id"] is never null, and it's redundant to emit it as the value since it's already the key.)

—Jens

Sherry Ummen

Sep 2, 2014, 1:00:26 PM
to mobile-c...@googlegroups.com
Hi Jens,

Sorry for the confusion.

If you check my second post you will see the code I actually use. I'll paste it here again:


    var view = _database.CreateAllDocumentsQuery();
    var docs = view.Run().Select(x => new DatabaseUtils.CacheData
    {
        DescrName = (string) x.Document.Properties["name"],
        Date = Convert.ToInt32(x.Document.Properties["date"]),
        DescriptionType = Convert.ToInt32(x.Document.Properties["type"]),
        Version = (string) x.Document.Properties["version"]
    });
    _iterator = docs.GetEnumerator();


So I actually use _database.CreateAllDocumentsQuery();

but I chose a bad variable name. I called it "view", which probably caused the confusion.

As for the view creation code I posted earlier:


    view.SetMapReduce((IDictionary<string, object> document, EmitDelegate emitter) => {
        if (document["_id"] != null) {
            emitter(document["_id"], document["_id"]);
        }

that is old code and I don't use it.

But the point is that even though I use _database.CreateAllDocumentsQuery(), it's extremely slow.

To give more info: the database I am querying has about 10,000 documents, and some of the documents are larger than 90 MB. Is that why it is slow?
What I want is just a few properties of each document, not the full document.

Once again sorry for the confusion.

Jens Alfke

Sep 2, 2014, 1:06:02 PM
to mobile-c...@googlegroups.com

> On Sep 2, 2014, at 10:00 AM, Sherry Ummen <sherry...@gmail.com> wrote:
>
> But what I wanted is just few properties of the document but not full document.

Then you should use a view instead, which emits only the properties you need. In your query code, get those properties from the row's Value instead of accessing the document.

What your current code is doing is reading/parsing every document in the database one at a time. That's going to be slow.

> some of the document size is more than 90MB.

Including attachments or not? It's OK to have large attachments because they're not read as part of the document. But 90MB of JSON in the document body is much, much too large for good performance. (Did we talk about this before? Maybe it was with someone else; but there are threads from the past month talking about moving large data out of the document body.)

—Jens

Sherry Ummen

Sep 2, 2014, 1:17:37 PM
to mobile-c...@googlegroups.com
Hmm, OK, then I should try a view. But one thing I don't understand: how do I emit multiple properties?

Is it like this?

    view.SetMapReduce((IDictionary<string, object> document, EmitDelegate emitter) => {
        emitter(document["_id"], document["_id"]);
        emitter(document["version"], document["version"]);
        emitter(document["type"], document["type"]);
    }, null, "1");

And regarding the 90 MB of data: yes, the data is huge and I can't help that, but I have been storing it as JSON in the document body. In fact, I can try using attachments.

I am using CouchDB together with Couchbase Lite so that I can store huge data.

So does that mean attachments are treated differently somehow?

Jens Alfke

Sep 2, 2014, 1:45:20 PM
to mobile-c...@googlegroups.com
On Sep 2, 2014, at 10:17 AM, Sherry Ummen <sherry...@gmail.com> wrote:

> Hmm, OK, then I should try a view. But one thing I don't understand: how do I emit multiple properties?

Call the emitter function with the value being an array or dictionary/map. Basically you can pass any JSON-compatible object as the key or value. Here are the docs on keys/values. There are some subtleties, so be sure to read them.

> And regarding the 90 MB of data: yes, the data is huge and I can't help that, but I have been storing it as JSON in the document body. In fact, I can try using attachments.
> I am using CouchDB together with Couchbase Lite so that I can store huge data.

The fact that you can store huge document bodies doesn't mean that you should :) They'll slow down CouchDB too.

> So does that mean attachments are treated differently somehow?

Yes. Only the attachment metadata (an "_attachments" property) is kept in the document itself. The attachment data is only read on demand, through the CBLAttachment API. There's even a streaming API. Attachments are not read when indexing so they don't slow that down.

I recommend you keep only the important fields in the document body — the ones you need for indexing or for displaying in a bulk UI like a list/table view — and put the rest of them in an attachment of type application/json.

—Jens
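
For readers following along, here is a minimal sketch of the approach suggested above: a view whose map function emits only the list fields as a dictionary value, and a query that reads them from each row's Value rather than loading the document. It reuses GetView and SetMapReduce as they appear in this thread. The view name "doc-list", the class and helper names, and the assumption that row.Value comes back as an IDictionary<string, object> are illustrative and not confirmed by the thread, so check the runtime type in your Couchbase Lite version.

```csharp
using System;
using System.Collections.Generic;
using Couchbase.Lite;

public static class DocumentListing
{
    // Hypothetical view name, chosen for this example only.
    private const string ViewName = "doc-list";

    public static View GetListView(Database db)
    {
        var view = db.GetView(ViewName);
        view.SetMapReduce((IDictionary<string, object> document, EmitDelegate emitter) =>
        {
            // One emit per document: key = document ID,
            // value = only the small fields the list UI needs.
            emitter(document["_id"], new Dictionary<string, object>
            {
                { "name", GetOrNull(document, "name") },
                { "date", GetOrNull(document, "date") },
                { "type", GetOrNull(document, "type") },
                { "version", GetOrNull(document, "version") }
            });
        }, null, "1"); // change the version string whenever the map logic changes
        return view;
    }

    // Returns null instead of throwing when a document lacks the property.
    private static object GetOrNull(IDictionary<string, object> doc, string key)
    {
        object value;
        return doc.TryGetValue(key, out value) ? value : null;
    }

    public static void PrintList(Database db)
    {
        var query = GetListView(db).CreateQuery();
        foreach (var row in query.Run())
        {
            // Assumption: the emitted dictionary is returned as IDictionary<string, object>.
            // If your version returns another type, convert it here.
            var fields = row.Value as IDictionary<string, object>;
            if (fields == null)
            {
                continue;
            }
            Console.WriteLine("{0}: {1} (version {2})",
                row.Key, fields["name"], fields["version"]);
        }
    }
}
```

Note that the first query after the view is created, or after its version string changes, still has to run the map function over every document, so very large document bodies will slow that initial indexing pass. That is why moving bulk data into attachments, as recommended above, matters as well.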
