Question Regarding Document Counts in Futon Across Cluster Members

Matthew Woodward
Oct 21, 2012, 12:43:48 AM
to bigcou...@googlegroups.com
I emailed the list a few weeks ago about what may be an issue with one of the databases in our six-server BigCouch cluster: when writes go through our load balancer instead of directly to a specific server in the cluster, the document counts don't match across all the servers. This brought to mind some questions that may simply stem from my own lack of knowledge about the expected behavior of servers in a BigCouch cluster, so I hope you'll bear with me, because I love learning more about how this all works.

My first question is this: should the document counts necessarily match across the members of the cluster?

For the vast majority of our databases the document counts always match; this particular one just seems to be the odd man out. Note that the server that's lagging in document count isn't "stuck" at a fixed number of documents, so it's not as if it has somehow dropped out of the cluster. It's just always 10 to 40 documents short compared to the others.

A couple more pieces of information in case they're relevant. The document counts do vary slightly across all the servers, but typically 3-4 of the six are in lockstep, with the other 1-2 lagging behind to varying degrees. Also, the cluster spans two data centers (3 servers in one, 3 in the other, all in the US). Finally, the specific server that consistently has the lower document count happens to be one of the 3 servers that hold shards for this particular database.

The reason I ask whether the document counts should match is that I wrote a Python script that pulls all the document IDs for this database from one of the servers with the higher document count and from the server with the lower document count, then generates a list of the IDs that are missing from the lower-count server.
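
For reference, here is a minimal sketch of that kind of comparison. It is not the actual script. It assumes each server answers the standard CouchDB `_all_docs` endpoint over plain HTTP with no authentication, and the function names are made up for illustration.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen


def fetch_all_docs(base_url, db_name):
    """GET <base_url>/<db_name>/_all_docs and return the parsed JSON body.

    base_url and db_name are supplied by the caller; nothing here is
    specific to any particular cluster.
    """
    url = "{}/{}/_all_docs".format(base_url.rstrip("/"), quote(db_name, safe=""))
    with urlopen(url) as response:
        return json.loads(response.read().decode("utf-8"))


def ids_from_all_docs(payload):
    """Collect the document IDs from an _all_docs response body."""
    return {row["id"] for row in payload.get("rows", [])}


def missing_ids(reference_ids, candidate_ids):
    """Return, sorted, the IDs in reference_ids that candidate_ids lacks."""
    return sorted(set(reference_ids) - set(candidate_ids))


def compare_servers(higher_url, lower_url, db_name):
    """Fetch the ID lists from both servers and return the IDs that only
    the higher-count server reports."""
    higher = ids_from_all_docs(fetch_all_docs(higher_url, db_name))
    lower = ids_from_all_docs(fetch_all_docs(lower_url, db_name))
    return missing_ids(higher, lower)
```

Calling something like `compare_servers("http://couch1.example.com:5984", "http://couch4.example.com:5984", "mydb")` would return the list of "missing" IDs; those host names, the port, and the database name are placeholders, not our real ones.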

The interesting thing is that if I take an ID that's supposedly missing from the server with the lower document count and paste it into Futon on that server, the document comes up fine, even though according to the lists of all the IDs in the database it's missing. So within a cluster, if a specific server doesn't have a document, does it just ask another member of the cluster for it? If so, does that have any impact on view generation?

I ask about views specifically because the issue we're seeing in the application that uses this database is intermittent missing data when a view is called, so I want to eliminate any potential problems on the database side of the equation, since the application code itself is extremely straightforward. I'm not ruling out the possibility that it's an application issue; it just doesn't seem likely based on what we've dug into thus far, since the application is merely calling a view at the point at which we're seeing the problem.

Fundamentally, I suppose I simply want to know whether having the document counts be off between servers is even an issue. If this is expected behavior, then I'll stop obsessively checking document counts and look more deeply into other potential causes.

Thanks for any enlightenment on this issue anyone might be able to provide.

Matt

--
Matthew Woodward
ma...@mattwoodward.com
http://blog.mattwoodward.com
identi.ca / Twitter: @mpwoodward

Please do not send me proprietary file formats such as Word, PowerPoint, etc. as attachments.
http://www.gnu.org/philosophy/no-word-attachments.html