Views with include_docs=true including old or incomplete results.

Mclean, Adam

Oct 31, 2013, 9:09:58 AM
to us...@couchdb.apache.org
I have a busy database that I'm querying with a view like "_design/[doc]/_view/[view]?include_docs=true"

I'm having trouble with old or incomplete documents being included in the result.

I see that either:
a) include_docs returns the previous version of the document, a version that wouldn't actually match the view.
b) include_docs does not return the document at all. The row includes the 'key' and 'value', but the 'doc' is empty in the output.

This only occurs if I query the view at roughly the same time (within a few ms) as the affected document is being written. The document being written/updated DOES match the view criteria, but include_docs seems to have a timing issue where it retrieves an old copy of the document, or doesn't include anything in the "doc": {} portion of the result.

So in summary I think the view logic is fine, but include_docs seems to retrieve incorrect documents when the timing lines up.

I'm using CouchDB 1.3.1 with Erlang R15B02 and SpiderMonkey 1.8.5 on Linux.

Is this a caveat of using include_docs? Would I be smarter to get the keys first, then make a second query to CouchDB to get the documents? Was this a known issue in 1.3.1 that was fixed in 1.4?

Thanks!

Robert Newson

Oct 31, 2013, 9:39:05 AM
to us...@couchdb.apache.org
https://wiki.apache.org/couchdb/HTTP_view_API

The include_docs option will include the associated document. However,
the user should keep in mind that there is a race condition when using
this option. It is possible that between reading the view data and
fetching the corresponding document that the document has changed. If
you want to alleviate such concerns you should emit an object with a
_rev attribute as in emit(key, {"_rev": doc._rev}). This alleviates
the race condition but leaves the possibility that the returned
document has been deleted (in which case, it includes the "_deleted":
true attribute). Note: include_docs will cause a single document
lookup per returned view result row. This adds significant strain on
the storage system if you are under high load or return a lot of rows
per request. If you are concerned about this, you can emit the full
doc in each row; this will increase view index time and space
requirements, but will make view reads optimally fast.

B.
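
As a rough illustration of the wiki's suggestion, a map function that emits the revision alongside each key might look like the sketch below. Assumptions not taken from the thread: the key (doc._id), the revision strings "1-abc" and "2-def", and the local emit() stub, which only stands in for the emit() that CouchDB's view server provides so that the snippet can run on its own.

```javascript
// Stand-in for CouchDB's built-in emit(); inside a real design
// document only the map function below would be needed.
const rows = [];
function emit(key, value) {
  rows.push({ key, value });
}

// Map function: index only matching documents, and record in the
// row value the revision the row was built from.
function map(doc) {
  if (doc.match) {
    emit(doc._id, { _rev: doc._rev });
  }
}

// Hypothetical revisions of one document, run through the map function.
map({ _id: "1234", _rev: "1-abc", match: false });
map({ _id: "1234", _rev: "2-def", match: true });

console.log(JSON.stringify(rows));
```

Only the second revision produces a row, and that row carries the revision it was indexed from, which is what the wiki text relies on when include_docs looks the document up.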

Mclean, Adam

Oct 31, 2013, 9:59:57 AM
to us...@couchdb.apache.org
Thanks for the pointer. This seems a little bit different to me, as the fetched document is an older version that wouldn't match the view in the first place. There is no change to the document after it matches the view.

If I have a document:
{
  "_id": "1234",
  "_rev": "1-...",
  "match": false
}

And a view defined as:
function (doc) {
  if (doc.match) {
    emit(doc._id, null);
  }
}

Revision 1 of the document does not match the view.

If I write:
{
  "_id": "1234",
  "_rev": "2-...",
  "match": true
}

At about the same time (within a few ms) as I query the view, I get a result showing that the document now matches, but the included "doc": {} is still at "_rev": 1-

There are no additional changes made to the document, and if I add a few more ms between the write and the query, the included document is correctly at "_rev": 2-
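
One way to at least detect this on the client side would be to re-check each included document against the same condition the view uses. The sketch below is only an illustration: splitStaleRows and matchesView are hypothetical names, the revision strings are made up, and the row shape (id, key, value, doc) is the usual include_docs view response.

```javascript
// Separate rows whose included doc satisfies the view's condition
// from rows whose doc is missing or fails it (likely stale).
function splitStaleRows(rows, matchesView) {
  const fresh = [];
  const stale = [];
  for (const row of rows) {
    if (row.doc && matchesView(row.doc)) {
      fresh.push(row);
    } else {
      stale.push(row);
    }
  }
  return { fresh, stale };
}

// Same condition as the map function: the doc must have a truthy "match".
const matchesView = (doc) => Boolean(doc.match);

// Hypothetical response rows: one consistent, one old revision, one empty doc.
const checked = splitStaleRows(
  [
    { id: "1111", key: null, value: null, doc: { _id: "1111", _rev: "2-aaa", match: true } },
    { id: "1234", key: null, value: null, doc: { _id: "1234", _rev: "1-bbb", match: false } },
    { id: "5678", key: null, value: null, doc: null },
  ],
  matchesView
);

console.log(checked.stale.map((row) => row.id));
```

Rows that land in stale could then be re-fetched by id rather than trusted as returned.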


Either way, it sounds like calling the view without include_docs and then retrieving the documents through _all_docs with keys = [] is a better way to go?
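
A minimal sketch of that two-step approach, assuming a server at http://localhost:5984 and a database called "mydb" (both placeholders, as are the helper names idsFromViewRows and buildAllDocsRequest). It only builds the second request, a POST to _all_docs with a "keys" body and include_docs=true, so it can run without a CouchDB instance.

```javascript
// Collect the unique document ids from a view response that was
// queried without include_docs.
function idsFromViewRows(viewResult) {
  return [...new Set(viewResult.rows.map((row) => row.id))];
}

// Describe the follow-up bulk fetch:
// POST /{db}/_all_docs?include_docs=true with {"keys": [...]}.
function buildAllDocsRequest(baseUrl, db, ids) {
  return {
    url: `${baseUrl}/${encodeURIComponent(db)}/_all_docs?include_docs=true`,
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ keys: ids }),
    },
  };
}

// Hypothetical view response: the same doc emitted twice, plus one other doc.
const viewResult = {
  rows: [
    { id: "1234", key: null, value: null },
    { id: "1234", key: null, value: null },
    { id: "5678", key: null, value: null },
  ],
};

const ids = idsFromViewRows(viewResult);
const request = buildAllDocsRequest("http://localhost:5984", "mydb", ids);
console.log(request.url, request.options.body);
```

The request could then be sent with something like fetch(request.url, request.options). The second step reads each document's current revision, so a document can still change between the two requests; it may be worth re-checking the returned documents against the view's condition.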

itl...@schrievkrom.de

Oct 31, 2013, 10:19:22 AM
to us...@couchdb.apache.org
This could be another instance of:

https://issues.apache.org/jira/browse/COUCHDB-1797

??

Marten