Couchbase server crashed during minimal load test

Kyle Heon

May 30, 2013, 2:32:59 PM
to couchba...@googlegroups.com
We ran a load test against our testing environment this morning that puts a bit more load on our Couchbase server (Version: 2.0.0 enterprise edition (build-1976)), and when we approached the upper threshold of this environment's capabilities (about 335 concurrent users) we started getting tons of 500 errors. Digging into this, we found our code was failing when it called into Couchbase. Because we depend on Couchbase for building out navigation and other key elements of the interface, the site came down hard.

I had to reboot the server to bring things back online.

Looking through our logs, I see hundreds of errors that look exactly like this:

System.ArgumentOutOfRangeException: Index was out of range. Must be non-negative and less than the size of the collection.
Parameter name: index
   at Couchbase.CouchbaseClient.Couchbase.IHttpClientLocator.Locate(String designDocument)
   at Couchbase.CouchbaseViewHandler.GetResponse(IDictionary`2 viewParams)
   at Couchbase.CouchbaseViewHandler.<TransformResults>d__0`1.MoveNext()
   at X.Framework.Tracking.Model.ResourceAccessRepository.<GetResourceAccesses>d__0.MoveNext()
   at System.Collections.Generic.List`1..ctor(IEnumerable`1 collection)
   at System.Linq.Enumerable.ToList[TSource](IEnumerable`1 source)
   at X.Framework.Tracking.PageVisitLogger.GetResourceAccesses(String userId)
   at X.Framework.GlobalApplication.Application_AuthenticateRequest(Object sender, EventArgs e)
   at System.Web.HttpApplication.SyncEventExecutionStep.System.Web.HttpApplication.IExecutionStep.Execute()
   at System.Web.HttpApplication.ExecuteStep(IExecutionStep step, Boolean& completedSynchronously)

Can someone please help me understand what might cause this and how it can be avoided in the future? Our testing environment has a single Couchbase server; our production environment has a two-server cluster. We've run load tests against this test environment before and never seen this. However, this new load test did exercise an area of the site not previously load tested.

Thanks!

-K

Kyle Heon

May 30, 2013, 3:47:39 PM
to couchba...@googlegroups.com
Also, we are using the TapMap repository setup at the moment. What I'm trying to do is find a pattern that we can implement to properly handle situations where Couchbase is not available. There seem to be a number of points within this repository implementation where we need to handle this, and I'd like to make sure that we only have to do it centrally and not have to update every single repository class.

Thanks!

-K
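
One possible shape for the central handling asked about above, sketched in JavaScript for brevity (the same idea should carry over to a C# base repository class). Every store call goes through a single guard that records the failure and returns a fallback value. All names here (`createGuard`, `BaseRepository`, `safe`, `queryView`) are hypothetical and are not part of any Couchbase SDK; the failing client is a stand-in for an unavailable server.

```javascript
// Hypothetical helper: wraps any data-store call so failures are handled in one place.
function createGuard(onError) {
  return function guarded(operation, fallback) {
    try {
      return operation();
    } catch (err) {
      onError(err);    // record the failure once, centrally
      return fallback; // degrade instead of failing the whole request
    }
  };
}

// Hypothetical base class: repositories call this.safe(...) instead of the client directly.
class BaseRepository {
  constructor(client, guarded) {
    this.client = client;
    this.guarded = guarded;
  }

  safe(operation, fallback) {
    return this.guarded(() => operation(this.client), fallback);
  }
}

class ResourceAccessRepository extends BaseRepository {
  getResourceAccesses(userId) {
    // Falls back to an empty list if the view call throws.
    return this.safe((client) => client.queryView('resource_accesses', userId), []);
  }
}

// Stand-in client that always fails, as during the outage described above.
const errors = [];
const guarded = createGuard((err) => errors.push(err.message));
const failingClient = {
  queryView() {
    throw new RangeError('Index was out of range.');
  },
};

const repo = new ResourceAccessRepository(failingClient, guarded);
const result = repo.getResourceAccesses('user-1');
console.log(result, errors);
```

In the stack trace from the first post, the exception surfaces while the results are being enumerated (inside ToList), so a C# version of this guard would presumably need to wrap the enumeration itself and not only the call that builds the query.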

Matt Ingenthron

May 30, 2013, 3:50:30 PM
to couchba...@googlegroups.com
Hi Kyle,

First off, I'd very, very, very highly encourage you to update to 2.0.1.  Highly.  Did I mention you should update?  :)

Second, it would be useful to check the logs on the server to see if there is anything unusual happening at view execution time.  This sounds like it may have come back empty, unexpectedly.

The .NET client should probably handle things better here though.  I'll file an NCBC about that.

Matt

--
Couchbase 2.0 is Here!: http://www.couchbase.com/download
Couchbase 2.0 Learn: http://www.couchbase.com/learn
Couchbase Forums: http://www.couchbase.com/forums
---
You received this message because you are subscribed to the Google Groups "Couchbase Team 8091" group.
To unsubscribe from this group and stop receiving emails from it, send an email to couchbase-809...@googlegroups.com.
To post to this group, send email to couchba...@googlegroups.com.
Visit this group at http://groups.google.com/group/couchbase-8091?hl=en.
To view this discussion on the web visit https://groups.google.com/d/msgid/couchbase-8091/72089e8a-22d4-4dbc-a2fe-6f8d2232bc2f%40googlegroups.com?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.
 
 

Kyle Heon

May 30, 2013, 4:01:35 PM
to couchba...@googlegroups.com
Can you elaborate on why you recommend this? What does it get me? We are a week before launching 2.1 so now is a scary time for us to consider a change like this. Please advise.




Perry Krug

May 31, 2013, 5:16:42 AM
to couchba...@googlegroups.com
2.0.1 added a fair number of stability fixes, especially around timeouts (being further improved in the upcoming release), compaction scheduling and rebalance control.  Upgrading to 2.0.1 won't change any functionality or break anything within your application and will also make it smoother for you to upgrade further when new releases come out (especially since you'll need to do so in production).

I would see this less as a "change" and more as "software is always improving, using the latest version is a good best practice when appropriate".

Also make sure you're using the latest .NET driver.

Lastly, the error message you showed is from the client side, and it will be necessary (both for you and for us) to understand what's going on at the server at the same time.  Did the node crash or was it just an error in something the client was handling?

Perry

Kyle Heon

Jun 3, 2013, 1:40:12 PM
to couchba...@googlegroups.com
Perry,

What data would be of use? I'm looking at the "Log" via the web admin now and do see a couple errors around that time. I'm trying to generate a diagnostics file but it's currently at 170 MB so probably won't be attachable. In the interim I've attached a screenshot for the day in question.

-K
couchbase-log.png

Kyle Heon

Jun 3, 2013, 1:48:09 PM
to couchba...@googlegroups.com
Additionally, I've started troubleshooting an issue where I'm unable to take a backup from our test environment and restore it locally. Well, that isn't totally true: I can restore it locally, but then our application acts as though nothing is found via Couchbase. This isn't our code, because what is local is also deployed to the test environment, and there we have data.

Something very odd that I noticed today: if I back up our bucket in test, it shows a percentage over 200. Test is a single-node environment, but the data that was restored into it about a week ago comes from a two-node environment. Even so, the greater than 200% is really odd to me.

I honestly can't say if this is a new issue or something that only started after the server crashed.

-K

Kyle Heon

Jun 3, 2013, 2:12:09 PM
to couchba...@googlegroups.com
Just for my sanity I took another backup, this time from production and the completion percentages go only to 100% so I'm pretty certain there is something wrong with our bucket in test but I don't know how to troubleshoot further or resolve. See image below.

Thoughts?

Thanks!

-K
couchbase-backup-oddness.png

Matt Ingenthron

Jun 3, 2013, 6:19:00 PM
to couchba...@googlegroups.com
On 6/3/13 10:40 AM, "Kyle Heon" <kyle...@gmail.com> wrote:

What data would be of use? I'm looking at the "Log" via the web admin now and do see a couple errors around that time.

You can pretty safely ignore the log messages about "client side error report" as long as any changes you've tried to make are received by the server.

Those are effectively logging of situations where the Web UI runs into an error.  With the wide variety of browsers, JavaScript issues, etc., those sometimes come up, but they'd most often manifest themselves as parts of the UI not updating as expected.

If you were trying to make a change and it didn't take, that's something different.

Thanks,

Matt

-- 
Matt Ingenthron
Couchbase, Inc.

Matt Ingenthron

Jun 3, 2013, 7:26:16 PM
to couchba...@googlegroups.com
On 6/3/13 11:12 AM, "Kyle Heon" <kyle...@gmail.com> wrote:

Just for my sanity I took another backup, this time from production and the completion percentages go only to 100% so I'm pretty certain there is something wrong with our bucket in test but I don't know how to troubleshoot further or resolve. See image below.

I think the best course of action will be to get a set of diags uploaded (don't worry if they're large, follow the instructions on the wiki) and we'll need to see why backup is reporting that.

If you could, please do so and let me know directly (cc Perry) when the info is up there.

Thanks,

Matt



-- 
Matt Ingenthron - Director, Developer Solutions
Couchbase, Inc.

Kyle Heon

Jun 3, 2013, 8:36:58 PM
to couchba...@googlegroups.com
Thanks guys! I generated diagnostics and created a new backup (stated 197% this time). I have uploaded both the backup and the diagnostics. Look for htestnosql1.zip and backup-20130603.zip in the pixelmedia folder.

I need to get the test environment back to a known state so we are pulling backups from production tonight and will be restoring into our test bucket. Hopefully what I’ve uploaded is sufficient for you to determine what went wrong.

Thanks again!

-K

Perry Krug

Jun 4, 2013, 6:50:40 AM
to couchba...@googlegroups.com
Kyle, 

For the backup, it's most likely due to a known issue in Couchbase which causes deleted items to leave a "tombstone" (about 200 bytes) in the database.  The data itself is gone, but for the purposes of XDCR, we needed to keep this around to make sure that the item actually gets deleted from the other side.  We'll be cleaning this up in a future release, and you can use this command to force them to be cleaned up:

curl -u Administrator:password -X POST http://localhost:8091/pools/default/buckets/default/controller/unsafePurgeBucket

It's only called "unsafe" because it might cause items to not be deleted in an XDCR configuration...if you're using that, you'll want to make sure that any deletes (or expirations) are fully synced across before running it.  I've also seen some cases where it needs to be run multiple times, but that's still under investigation.  This should take care of your >100% backup issue.

For the restore, the only time I've seen data not be available after a restore is when there is a mismatch in the number of vbuckets between source and destination.  And the only time this can occur (assuming you're not manually changing them) is when OS X is either the source or the destination.  In this case, I would recommend using '-x rehash=1' in the cbrestore command, which should cause it to lay out the data correctly.
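
To illustrate why a vbucket-count mismatch can hide restored data, here is a rough sketch of how a key might be mapped to a vbucket. The mapping used (CRC-32 of the key, shifted and masked, then taken modulo the vbucket count) and the counts of 1024 and 64 are assumptions for illustration, not something stated in this thread, and the document key is hypothetical.

```javascript
// Standard CRC-32 (reflected polynomial 0xEDB88320), bitwise version.
// Assumes ASCII keys: only the low byte of each character code is used.
function crc32(str) {
  let crc = 0xffffffff;
  for (let i = 0; i < str.length; i++) {
    crc ^= str.charCodeAt(i) & 0xff;
    for (let bit = 0; bit < 8; bit++) {
      crc = (crc >>> 1) ^ (0xedb88320 & -(crc & 1));
    }
  }
  return (crc ^ 0xffffffff) >>> 0;
}

// Assumed key-to-vbucket mapping, for illustration only.
function vbucketFor(key, vbucketCount) {
  return ((crc32(key) >>> 16) & 0x7fff) % vbucketCount;
}

const key = 'pagevisit::user-1::page-7'; // hypothetical document key
const onLargeLayout = vbucketFor(key, 1024); // assumed count on most platforms
const onSmallLayout = vbucketFor(key, 64);   // assumed count on OS X
console.log(onLargeLayout, onSmallLayout);
```

Under these assumptions the same key usually lands in a different vbucket id on each layout. Data copied vbucket-for-vbucket would then sit where a client does not look for it, which is the situation the rehash option is meant to correct.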

Lastly, moving forward, please ignore the "generate diagnostic report" link and instead use the instructions found here to gather logs and statistics for us: http://www.couchbase.com/wiki/display/couchbase/Working+with+the+Couchbase+Technical+Support+Team

Thanks, let me know what state things are in and what else we can do to help.

Perry




Kyle Heon

Jun 4, 2013, 6:58:58 AM
to couchba...@googlegroups.com
Aha! Thanks Perry for explaining the cause of the > 100%. In testing we are working on updates that have forced us to iterate through all of the documents during an extract and patch them with some additional information that we need. This totally explains the 200%.
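
As a rough check of that reasoning, the arithmetic can be sketched as follows. It assumes the backup's progress figure is the number of items transferred (live documents plus tombstones) divided by the live item count, which this thread does not confirm. The item counts are made up for illustration.

```javascript
// Assumed model: progress = (live items + tombstones) / live items, as a percentage.
function backupProgressPercent(liveItems, tombstones) {
  return Math.round(((liveItems + tombstones) * 100) / liveItems);
}

// Every document re-keyed once: one tombstone per live document.
console.log(backupProgressPercent(10000, 10000)); // 200

// Nearly every document re-keyed.
console.log(backupProgressPercent(10000, 9700)); // 197

// No deletes at all, as in the production backup.
console.log(backupProgressPercent(10000, 0)); // 100
```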

I followed those very instructions and was surprised at how small the zip was compared to the file I was getting through the browser.

To clarify, I was able to restore the backup and it shows the number of documents I was expecting but our code was not finding anything when querying the views even though we know the data is there (it's in test so it should be once pulled local).

We are not using XDCR but we do have a two server cluster in production and at some point we'll have to run this patch there as well. We are safe to run that curl command in production after we've patched the documents, correct? Would we run that command on just one of the servers in the cluster or all?

Thank you for your assistance.

-K

Perry Krug

Jun 4, 2013, 7:18:32 AM
to couchba...@googlegroups.com
Yes, you can just run it on one node and it will apply to all, and yes, it is safe to run in production at just about any time (it just triggers a special form of compaction).

Regarding the restore, did you use the rehash option?


Kyle Heon

Jun 4, 2013, 7:21:41 AM
to couchba...@googlegroups.com
Perry,

I have not tried the rehash option yet but I'm about to start testing again locally by pulling a backup from our newly patched testing environment. I'll let you know how that turns out.

Regarding that purge, I'm getting the following:

Not found.

I'm sure I'm doing something wrong. For instance, we don't have a default bucket anymore; does that mean we don't have a default pool either? I can't determine how to get a list of pools, only buckets.

Thanks again!

-K 

Perry Krug

Jun 4, 2013, 7:23:37 AM
to couchba...@googlegroups.com
When you say "locally"...is that on a Mac?

No, the 'default' is still required there.  Are you sure the username and password are correct?  And the bucket name (case sensitive)?

Perry


Kyle Heon

Jun 4, 2013, 7:29:57 AM
to couchba...@googlegroups.com
Locally is not directly on a Mac; this is a .NET application. I say not directly because I am on a Mac but have Fusion running a VM of my development environment, which also has Couchbase 2.0 for Windows installed (we were evaluating upgrading to 2.0.1 when all of this was discovered).

Yeah, bucket name is all lowercase. Yes, user/pass is correct. I continue to get the same error. To be clear, the user/pass is the Couchbase user/pass?

-K

Perry Krug

Jun 4, 2013, 7:34:49 AM
to couchba...@googlegroups.com
Hmm, then I'm not entirely sure why the restore wouldn't work...that behavior of seeing the right item count but not being able to get any is pretty clearly the symptom I'm thinking of.   I expect the rehash will work well for you anyway.  Maybe send over the exact command you're using to restore as well?

Oh, duh...the purge command is only available on 2.0.1, so of course it's not found on 2.0 :-P  Sorry about that.

Perry



Kyle Heon

Jun 4, 2013, 7:39:01 AM
to couchba...@googlegroups.com
Tried the restore with rehash locally (Windows 2.0) and it doesn't recognize the rehash option:

c:\Program Files\Couchbase\Server\bin>cbrestore c:\clientWork\Couchbase\backup-20130603 http://localhost:8091 --bucket-source=testing-digital-campus-tracking --bucket-destination=dev-digital-campus-tracking -u user -p pass -x rehash=1
 
error: unknown extra option: rehash

-K 

Perry Krug

Jun 4, 2013, 7:41:07 AM
to couchba...@googlegroups.com
Okay, can you use the cbrestore from a 2.0.1 installation?  I'm still not sure why it's not working the way it should, though.





Kyle Heon

Jun 4, 2013, 8:00:12 AM
to couchba...@googlegroups.com
So I have pulled a backup from the test environment and retried locally, but unfortunately I'm getting the same result as yesterday: document counts are correct but I get no data when pulling from our views. My local environment is Couchbase 2.0 on Windows, so I'll upgrade to 2.0.1 and hopefully the rehash will fix this.

-K

Kyle Heon

Jun 4, 2013, 8:40:32 AM
to couchba...@googlegroups.com
Perry,

So I upgraded to 2.0.1 on Windows and that allowed me to use the rehash flag, but functionally things are still banged up. Here is what I've discovered, though: the problem appears to be limited to querying the views. If I do a Get and pass it an ID I get a document back. Does any of this indicate a different, known issue? I'm hopeful that it does.

Just so you understand the patch process that nearly all documents underwent, what we did was as follows:
  • loaded all pertinent documents
  • pulled all data from the old document
  • created a new document with a slightly different key
  • deleted the old document
  • updated the new document with everything pulled from the old document
  • rinse and repeat
What is blowing my mind is that in the testing environment, where we ran this patch process, everything is working correctly. It is only in environments where we've restored a backup from the testing environment that we run into this issue.

Please advise on the next course of action. I very much appreciate all the assistance you've provided so far.
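
For reference, the patch steps listed above can be sketched roughly as follows. This uses an in-memory Map as a stand-in for the bucket; the key formats, the `rekeyAll` helper and the re-keying rule are all hypothetical and only meant to show the order of operations.

```javascript
// In-memory stand-in for a bucket; a real bucket would be reached through an SDK client.
const bucket = new Map([
  ['visit::u1::p1', { type: 'pagevisit', userId: 'u1', pageId: 'p1', visitCount: 3 }],
  ['visit::u2::p9', { type: 'pagevisit', userId: 'u2', pageId: 'p9', visitCount: 1 }],
]);

// Hypothetical rule for the "slightly different key".
function newKeyFor(oldKey) {
  return oldKey.replace(/^visit::/, 'pagevisit::');
}

function rekeyAll(store, isPertinent, makeKey) {
  // Snapshot the keys first so the store is not mutated while iterating it.
  const oldKeys = [...store.keys()].filter(isPertinent);
  let deletedCount = 0;
  for (const oldKey of oldKeys) {
    const data = { ...store.get(oldKey) }; // pull all data from the old document
    const newKey = makeKey(oldKey);
    store.set(newKey, {});                 // create the new document under the new key
    store.delete(oldKey);                  // delete the old document
    store.set(newKey, data);               // update the new document with the old data
    deletedCount += 1;
  }
  return deletedCount;
}

const deletedCount = rekeyAll(bucket, (k) => k.startsWith('visit::'), newKeyFor);
console.log(deletedCount, [...bucket.keys()]);
```

Each delete in this loop corresponds to one removed document, which on a real server would leave a tombstone behind as described earlier in the thread.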

-K

Perry Krug

Jun 4, 2013, 9:04:07 AM
to couchba...@googlegroups.com
To be honest, at the moment it doesn't "sound" like anything is functionally wrong with Couchbase (as in a bug).  Can you describe more about how the views fit into that workflow, and maybe send over the view definitions that you're using?  Can you also explain more exactly what problem you're having with the views?


Kyle Heon

Jun 4, 2013, 9:36:07 AM
to couchba...@googlegroups.com
I'm leaning towards that assessment, although I don't get why this isn't working in development with the exact same code base as what is deployed, the only difference being that we restored a backup from the test environment. Our views are very basic (shown below):


digital_campus_cookies
function(doc, meta) {
  if (doc.type.toLowerCase() == 'cookie') {
    if (doc.userId != null && doc.resourceKey != null) {
      emit(doc.userId, doc._id);
    }
  }
}

digital_campus_pagevisits
function(doc, meta) {
  if (doc.type.toLowerCase() == 'pagevisit') {
    if (doc.pageId != null) {
      emit(doc.userId, doc._id);
    }
  }
}

digital_campus_resource_accesses
function(doc, meta) {
  if (doc.type.toLowerCase() == 'resourceaccess') {
    if (doc.userId != null) {
      emit(doc.userId, doc._id);
    }
  }
}

We use Couchbase to store tracking data for our application. We create a single document that is keyed to the user and a series of other identifiers (course, session, etc.), and before we create the document we look to see if it already exists; if so, we update some key data (like current visit date, increasing visit count, etc.).

The views help us filter this data for the different types of tracking (we track page visits, resource visits and cookies that help us return users to specific areas in epub documents).

When a page loads we have a series of calls to load data specific to that user at that time. Two of those calls are into Couchbase views (digital_campus_pagevisits and digital_campus_resource_accesses), where we load visit information for the current user. We then leverage that information to build out the interface during render.

This is all with the .NET SDK (currently 1.2.6, which we just upgraded to). The test environment is running the 1.2.6 SDK and works; development runs the same thing but doesn't work unless I start with a clean bucket.

I have extracted the code that we are using to work with Couchbase. This isn't a complete library but has the repositories, models and other key code. PageVisitRepository.GetPageVisits is a good place to start.

-K
Tracking-Code.zip

Perry Krug

Jun 4, 2013, 10:02:29 AM
to couchba...@googlegroups.com
Thanks Kyle.

A few quick comments:
-You may want to consider only using one view here.  It's not a big deal (and is a tangential recommendation), but you could basically "prefix" the view output with either cookie, pagevisit or resourceaccess and then be able to query for just those from one view.  It would help reduce your disk space, disk IO and view processing time a bit
-You don't have to emit doc._id in the value since the document ID is included with each row being emitted anyway...this will further save disk space and disk IO.
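
A sketch of what such a combined view might look like, keeping the same conditions as the three views posted above but emitting a compound key of [type, userId] and no value. The name `mapTracking` and the local `emit` stub exist only so the example can run outside the server; in a design document the map function would be pasted as an anonymous function and `emit` would be supplied by the view engine.

```javascript
// Local stand-in for the emit() that the view engine provides.
var rows = [];
function emit(key, value) {
  rows.push({ key: key, value: value });
}

// One map function covering all three tracking types.
function mapTracking(doc, meta) {
  // Skip documents without a string type instead of throwing on toLowerCase().
  if (typeof doc.type !== 'string') return;
  var type = doc.type.toLowerCase();
  var wanted =
    (type === 'cookie' && doc.userId != null && doc.resourceKey != null) ||
    (type === 'pagevisit' && doc.pageId != null) ||
    (type === 'resourceaccess' && doc.userId != null);
  // The document ID accompanies every emitted row, so no value is needed.
  if (wanted) emit([type, doc.userId], null);
}

mapTracking({ type: 'PageVisit', userId: 'u1', pageId: 'p1' }, { id: 'a' });
mapTracking({ type: 'cookie', userId: 'u1' }, { id: 'b' }); // no resourceKey: skipped
mapTracking({ type: 'ResourceAccess', userId: 'u2' }, { id: 'c' });
mapTracking({ name: 'no type field' }, { id: 'd' });          // skipped, no exception
console.log(JSON.stringify(rows));
```

A query for one user's page visits would then pass a compound key such as ["pagevisit","u1"] in place of the bare user ID.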

Can you describe in a bit more detail what the actual problem you're having is?  Are the views not returning any results or are they returning rows that then don't seem to be available?  If the latter, am I right in understanding that you're able to successfully perform a manual 'get' for some of those keys?

Perry


Kyle Heon

Jun 4, 2013, 10:16:01 AM
to couchba...@googlegroups.com
Perry,

Yes, I'm sure we could have done things in a more efficient way, but this is our first foray into the world of NoSQL. Unfortunately right now isn't the time for us to make a change of this type; we are supposed to launch 2.1 of our app next week and things aren't looking good.

Can you describe in a bit more detail what the actual problem you're having is?  Are the views not returning any results or are they returning rows that then don't seem to be available?  If the latter, am I right in understanding that you're able to successfully perform a manual 'get' for some of those keys? 

In the PageVisitRepository the following code is being called:

public IEnumerable<PageVisit> GetPageVisits(string userId)
{
    var visits = this.CouchbaseClient.GetView("digital_campus_pagevisits", "digital_campus_pagevisits")
                     .Key(userId);
    foreach (var visit in visits)
    {
        yield return Get(visit.ItemId);
    }
}

In the debugger what we see is that visits has 0 results. You are correct, I can issue a Get for a document and get it by id without any issue. This issue appears to be related to the views. In the web admin interface I can see data in the views so I know they are working.

-K
 

Perry Krug

Jun 4, 2013, 10:26:07 AM
to couchba...@googlegroups.com
Sorry Kyle, didn't mean to imply you had to scratch everything...just wanted to give you some suggestions as I see them.

Okay, so let's debug that view stuff.  The main thing I'd want to determine is whether the view is returning "successfully" or not.  If you're getting an error that "appears" to be an empty set, that's one problem...if you're getting a successful response (but it's empty), that's another.  The fact that you can see it in the admin UI is a good sign.  Off the top of my head, I'm going to presume that the ".Key(userId)" is what's causing the issue here.  Could you try manually setting that to a key that you know for sure can be seen in the UI?  Or even take that off entirely and just try to get the full resultset to make sure that the data is coming back properly.
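
The distinction between "an error that looks empty" and "a successful but empty response" can be sketched as a small classifier over a view response. The field names used here (`total_rows`, `rows`, `error`, `reason`) reflect the JSON shape a view query is generally understood to return; treat them, and the helper name `classifyViewResponse`, as assumptions and not as SDK behavior.

```javascript
// Classifies a view response as 'error', 'empty' or 'rows'.
function classifyViewResponse(httpStatus, body) {
  if (httpStatus !== 200) {
    return { outcome: 'error', detail: 'HTTP ' + httpStatus };
  }
  if (!body || typeof body !== 'object') {
    return { outcome: 'error', detail: 'unparseable body' };
  }
  if (body.error) {
    return { outcome: 'error', detail: body.error + ': ' + (body.reason || 'no reason given') };
  }
  if (!Array.isArray(body.rows)) {
    return { outcome: 'error', detail: 'response has no rows array' };
  }
  if (body.rows.length === 0) {
    return { outcome: 'empty', detail: 'total_rows=' + body.total_rows };
  }
  return { outcome: 'rows', detail: body.rows.length + ' row(s)' };
}

const samples = [
  classifyViewResponse(200, { total_rows: 0, rows: [] }),
  classifyViewResponse(404, { error: 'not_found', reason: 'missing' }),
  classifyViewResponse(200, { total_rows: 2, rows: [{ id: 'a', key: 'u1', value: null }] }),
];
console.log(samples.map((s) => s.outcome).join(', ')); // empty, error, rows
```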

You may also want to turn on more detailed logging from the SDK to ensure that the requests actually are being serviced properly: http://www.couchbase.com/docs/couchbase-sdk-net-1.2/couchbase-sdk-net-logging.html

Perry



Kyle Heon

Jun 4, 2013, 10:31:42 AM
to couchba...@googlegroups.com
Thanks Perry, I'll try all of those suggestions and report back. Given that this is end of the road things are hectic now that everyone is in the office so it might take me a bit to get back to you. Can I ask, where are you located? I'm east coast, only asking to get a sense of time zone differences.

Thanks so much for all your help.

-K

Perry Krug

Jun 4, 2013, 10:32:34 AM
to couchba...@googlegroups.com
I'm in the UK, so 5 hours ahead of you.

Let us know what we can do to help further.

Perry





Kyle Heon

Jun 4, 2013, 1:48:57 PM
to couchba...@googlegroups.com
Thanks Perry, 

I'm betting you are done for the day now, but I wanted to update you on where things are at. I tried removing the .Key filter to see whether or not we are getting anything back from the view, and we aren't. I'm not quite sure how to determine if there are errors as you indicated. I'm trying to work through setting up logging to see what that might provide, but was hoping that maybe you had some thoughts.

None of these issues happened until we ran the load test and the server/node tanked. I was pulling backups from the test environment on a regular basis to update my local environment, so I'm struggling to understand why this issue would suddenly show up. Given that the code in question for accessing view data hasn't changed, I just don't get it.

-K

Perry Krug

Jun 4, 2013, 1:54:53 PM
to couchba...@googlegroups.com
Thanks Kyle.  Can you be more specific about what you actually are getting back?  Are you getting a response that says "there is no data" or are you not getting a response?

I'm looking through the docs now and don't see anything specific yet on error handling here, but maybe there's an exception you can check or a particular field of the returned object that tells you whether it was successful.

Perry


Kyle Heon

Jun 4, 2013, 2:06:07 PM
to couchba...@googlegroups.com
It looks as though I get a response back, but in debug if I try to open any of the properties up they time out. I've seen that if I expand the ResultsView, which triggers the retrieval of docs, I get 0 TotalRows.

Perry Krug

Jun 4, 2013, 2:07:37 PM
to couchba...@googlegroups.com
Hmm, so possibly you're not getting a successful response.  Any firewalls to think of here?  What OS are the current client and servers running?




Kyle Heon

Jun 4, 2013, 2:12:07 PM
to couchba...@googlegroups.com
No firewalls. Our entire development environment runs inside a VM that contains Windows 7, Couchbase Server 2.0.1 (now) and SQL Server 2008 R2. Outside of the upgrade to Couchbase 2.0.1 there have been no infrastructure updates in months.

The testing (and production) environment is Windows Server 2008 R2, SQL Server 2008 R2 and Couchbase 2.0.0 is on a CentOS server.

-K

Perry Krug

unread,
Jun 4, 2013, 2:19:51 PM6/4/13
to couchba...@googlegroups.com
Hmm, that's pretty strange then.  Can you upload a collect_info from that node and we'll take a look through the logs?

Kyle Heon

unread,
Jun 4, 2013, 2:29:34 PM6/4/13
to couchba...@googlegroups.com
Perry

I just generated a report and uploaded it to the "pixelmedia" folder, just like last night. This file is named localhost-for-perrykrug.zip and contains logs from my local development server. I uploaded logs from the test server last night; look for a file whose name starts with htest.

-K

Perry Krug

unread,
Jun 4, 2013, 2:30:14 PM6/4/13
to couchba...@googlegroups.com
Okay, I'll take a look at that.

Perry

Perry Krug

unread,
Jun 4, 2013, 2:44:36 PM6/4/13
to couchba...@googlegroups.com
Nothing particularly conclusive yet, but I do see on the server side that it "thinks" it is returning correctly at least.

Can you try pasting this in your browser:

Just to see if there is a response...

Kyle Heon

unread,
Jun 4, 2013, 2:46:53 PM6/4/13
to couchba...@googlegroups.com
Sure thing. I get the following:

{"total_rows":23,"rows":[
]
}

Perry Krug

unread,
Jun 4, 2013, 2:50:07 PM6/4/13
to couchba...@googlegroups.com
Okay, now we're actually getting somewhere I think.

That's basically saying that it found 23 possible records in that view but that none of them match the key you specified.  So now we need to figure out if the key is wrong or if the view is wrong.  How does that key get generated and can you debug up the chain a bit to ensure that it is valid?  Looks like it's supposed to match a userId...can we verify that the document(s) it's supposed to match are in the bucket?
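
To make the distinction between the two numbers concrete, here is a minimal Python sketch that parses the response pasted above. The helper name `summarize_view_response` is hypothetical and not part of any Couchbase SDK; it only separates "rows in the index" from "rows that matched the query".

```python
import json

# Response body pasted from the browser earlier in the thread.
RESPONSE = '{"total_rows":23,"rows":[\n]\n}'


def summarize_view_response(body):
    """Return (rows_in_index, rows_matching_query) for a view response body.

    Hypothetical helper for illustration; not part of any Couchbase SDK.
    """
    doc = json.loads(body)
    # total_rows counts the rows in the view index, independent of the key filter.
    indexed = doc.get("total_rows", 0)
    # rows holds only the entries that matched the query parameters (e.g. key=...).
    matched = len(doc.get("rows", []))
    return indexed, matched


indexed, matched = summarize_view_response(RESPONSE)
if matched == 0 and indexed > 0:
    print(f"index has {indexed} rows, but none match the requested key")
```

A non-zero `total_rows` with an empty `rows` list points at the key or the view definition rather than at connectivity.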

Perry

Perry Krug

unread,
Jun 4, 2013, 2:52:21 PM6/4/13
to couchba...@googlegroups.com
Looks like you've tried a few times with this key: "37c3769e-4cc1-4ff2-b9bb-1da5d00dc249"

Can you try for a different user that you know is in there?

Matt Ingenthron

unread,
Jun 4, 2013, 2:55:26 PM6/4/13
to couchba...@googlegroups.com
Note that is a "dev" view. The key you're looking for may be in another vbucket, so I'd actually change that to not use the dev_ prefix. Chances are, you'll then find it in the response.

It'd be odd for this request to be coming from a client though, so this is probably from the console?

________________________________________
From: couchba...@googlegroups.com [couchba...@googlegroups.com] On Behalf Of Perry Krug [perr...@gmail.com]
Sent: Tuesday, June 04, 2013 11:44 AM
To: couchba...@googlegroups.com
Subject: Re: Backup is running to 200.7% after crash (was: Re: [Team 8091] Couchbase server crashed during minimal load test)

Kyle Heon

unread,
Jun 4, 2013, 3:09:09 PM6/4/13
to couchba...@googlegroups.com
Locally we use "dev", should we not be doing that? The app is configured with the DevelopmentModeNameTransformer, for production we created a custom one.

That key is the user key for my account, which is why you probably see it a lot. If I remove the dev_ from the view as Matt indicated, I get a result indicating 18982 total rows but no actual rows. That larger number is closer to the total number of documents in the system than it is to the number of page visits.

-K

Perry Krug

unread,
Jun 4, 2013, 3:14:05 PM6/4/13
to couchba...@googlegroups.com
You probably don't want to use a development mode view for your testing, even locally.  You don't know deterministically which portion of the dataset will be used for that view.  It's supposed to be representative of the entire set, but won't necessarily contain a specific key you're looking for. 
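
A toy Python simulation of why a lookup against a subset can miss a key that exists. Everything here is an assumption made for illustration: the partition count, the use of CRC32 as a stand-in for the real key-to-vbucket hashing, and the way the sample is chosen do not describe how the server actually selects its development subset.

```python
import zlib

NUM_PARTITIONS = 16  # toy value chosen for the sketch, not the real vbucket count


def partition_of(key):
    """Map a key to a partition with CRC32; a stand-in for the real hashing."""
    return zlib.crc32(key.encode("utf-8")) % NUM_PARTITIONS


def build_index(doc_keys, partitions=None):
    """Index every key, or only the keys that live in the given partitions."""
    return {k for k in doc_keys if partitions is None or partition_of(k) in partitions}


keys = [f"user-{i}" for i in range(200)]
known = keys[0]
# Pretend the development view samples only the partition that holds `known`.
sampled = {partition_of(known)}
# Pick some other key that happens to live outside the sampled partition.
other = next(k for k in keys if partition_of(k) not in sampled)

dev_index = build_index(keys, sampled)  # subset, like a dev_ view
full_index = build_index(keys)          # whole dataset, like a published view

print(known in dev_index)   # the key in the sampled partition is found
print(other in dev_index)   # a key stored elsewhere is silently missing
print(other in full_index)  # the full index contains it
```

The subset lookup does not fail with an error; it simply returns no rows, which matches the symptom seen above.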

However, it seems that you're still not getting the results.  I really think there is something wrong with that key.  Can you find and look at the actual document(s) that is/are supposed to contain it?

Kyle Heon

unread,
Jun 4, 2013, 3:14:37 PM6/4/13
to couchba...@googlegroups.com
Well this is interesting. If I change to the ProductionModeNameTransformer in development everything works. And I was incorrect, we didn't create a custom name transformer, we use yours.

So, can someone explain to me how we've been able to do what we've been doing for the past 18-24 months, using DevelopmentModeNameTransformer in development and pulling production/testing backups down locally, without once running into this? Why now? I just don't understand the differences here.

I was under the impression that for development we wanted to use the dev views because they aren't "read only" but that for production use we should be publishing them for performance reasons. Have I been misunderstanding things all this time?

-K

Perry Krug

unread,
Jun 4, 2013, 3:18:57 PM6/4/13
to couchba...@googlegroups.com
I can't say for sure why it's not working now, I agree that is a bit weird.

The only thing I can think of is that previously you were testing with a user that was always in the particular vbucket...but that seems like a pretty wild coincidence.  If you feel like going back to the previous "working" state, I'll take a look at the same set of logs and verify the behavior.

I don't think development views are appropriate for "testing", they're appropriate for "development" where you're able to very closely manipulate the environment to choose the right data (or the actual data doesn't matter).  Mostly I imagine them to be used by someone manually at the Web UI until they have gotten the output to look correct, but the same could certainly be done from within the SDK.  It's just when you want to test the end-to-end functionality, I think you really need to be working with the whole dataset (as we see here).

Perry

Matt Ingenthron

unread,
Jun 4, 2013, 3:20:29 PM6/4/13
to couchba...@googlegroups.com
It's definitely appropriate to use dev_ views in many cases, but not all. The intent with the dev_ view is that you get a deterministic subset of the data so creation and execution of views is faster. This is most often needed when you're changing the view frequently and you have a large amount of data, so you don't want to process all of it for each minor change of the view code. It's also often needed when you have a large deployment and you want to see how that view would operate on a subset of your production deployment.

The flipside of this though is that these kinds of development efforts should be exclusive of particular keys.

Once you have your views pretty well worked out and you aren't constantly changing data, it's good to move to the production version, which is done by dropping the dev_ from the view name. You'll also have to publish the view to production.
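
As a rough Python sketch of what "dropping the dev_" means for the request itself: the function below mimics the effect of the development and production name transformers mentioned earlier in the thread. The URL layout and port 8092 are assumed from the Couchbase 2.0 view REST API, and the host, bucket, design document, and view names are hypothetical.

```python
from urllib.parse import quote


def design_doc_name(name, development):
    """Mimic the effect of the two name transformers: dev mode adds the dev_ prefix."""
    return f"dev_{name}" if development else name


def view_url(host, bucket, design_doc, view, key, development):
    """Build a view query URL (layout assumed from the Couchbase 2.0 view REST API)."""
    ddoc = design_doc_name(design_doc, development)
    # View keys are JSON values, so a string key is sent wrapped in double quotes.
    encoded_key = quote(f'"{key}"', safe="")
    return f"http://{host}:8092/{bucket}/_design/{ddoc}/_view/{view}?key={encoded_key}"


# Hypothetical names; the key is the one quoted earlier in the thread.
USER_KEY = "37c3769e-4cc1-4ff2-b9bb-1da5d00dc249"
dev_url = view_url("localhost", "tracking", "pagevisits", "by_user", USER_KEY, True)
prod_url = view_url("localhost", "tracking", "pagevisits", "by_user", USER_KEY, False)
print(dev_url)
print(prod_url)
```

The two URLs differ only in the design document name, but the production one only returns results after the view has been published.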

See:
http://www.couchbase.com/docs/couchbase-manual-2.0/couchbase-views-types.html


Kyle Heon

unread,
Jun 4, 2013, 3:26:57 PM6/4/13
to couchba...@googlegroups.com
Thanks guys!

I think that all makes sense and we've probably just been lucky all this time because our document count is pretty small (just about to hit 19k). I was just discussing this with the other developer who has worked on this project with me, and we think the catalyst is the patching process, which has effectively created twice the number of documents. As explained earlier, the patch process takes each document, grabs data from it, deletes the old one and recreates it with a new key (and some additional data).

I've verified that the before and after counts following the patch are the same, so we aren't actually increasing the number of documents per se, but as Perry indicated earlier, these deleted documents linger for a bit.

I'm updating my configuration files for local development to use the ProductionModeNameTransformer going forward.

On a separate note, can either of you suggest tools for easily querying the document store? I'm a SQL guy so not being able to sling some SQL at this is killing me. How do you find things? Are there any tools you recommend for this type of need?

Thanks again!

-K


Matt Ingenthron

unread,
Jun 4, 2013, 3:44:13 PM6/4/13
to couchba...@googlegroups.com
It depends on the specific case, but generally I have a set of views that cover the kinds of queries I'll need to run.

I know the model seems a bit different, but we try to keep everything fast by having views. In a SQL system, you can toss in a random query, but it can also randomly slow down the system if there aren't sufficient indexes to plan/optimize/execute that query. In Couchbase, you have to have a view.

If you've not already, definitely hit the .NET tutorial which gives you an introduction to views with the beer-sample dataset: http://www.couchbase.com/docs/couchbase-sdk-net-1.2/stage4.html

That'll probably help you grok how to apply this to your own data. We can definitely help you get to the right view definition if you're working your way through it. I can say from experience, it seems a bit different at first but it's not that hard to do the basic things and you'll find real power in being able to normalize and sort through your data at the time of index building instead of the time of index querying.
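
To illustrate the idea of doing the work at index-build time, here is a small Python model of a view. It is only an analogy: real Couchbase views are defined with JavaScript map functions on the server, and the document shapes and field names (`type`, `userId`) below are hypothetical.

```python
import bisect


def map_page_visit(doc_id, doc):
    """Toy map function: emit (key, value) pairs for page-visit documents only."""
    if doc.get("type") == "page_visit":
        yield doc["userId"], doc_id


def build_view(docs):
    """Run the map function over every document once and keep the rows sorted by key."""
    rows = [row for doc_id, doc in docs.items() for row in map_page_visit(doc_id, doc)]
    return sorted(rows)


def query_by_key(view, key):
    """Find all rows for one key with a binary search instead of scanning documents."""
    start = bisect.bisect_left(view, (key,))
    matches = []
    for row_key, doc_id in view[start:]:
        if row_key != key:
            break
        matches.append(doc_id)
    return matches


docs = {
    "visit::1": {"type": "page_visit", "userId": "alice", "page": "/home"},
    "visit::2": {"type": "page_visit", "userId": "bob", "page": "/home"},
    "visit::3": {"type": "page_visit", "userId": "alice", "page": "/reports"},
    "user::alice": {"type": "user", "name": "Alice"},
}
view = build_view(docs)
print(query_by_key(view, "alice"))
```

The filtering and sorting happen once in `build_view`; each query afterwards is a cheap lookup, which is the trade-off compared with an ad hoc SQL query.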

Kyle Heon

unread,
Jun 5, 2013, 9:07:57 AM6/5/13
to couchba...@googlegroups.com
To follow up this morning, I upgraded the test server to 2.0.1 and restored from a backup. I have never been able to just do the upgrade; I always end up having to nuke the Couchbase install and reinstall. It's not that big of a deal to start fresh every time, but it would be nice if the upgrade worked.

After restoring from the 2.0.0 backup I ran the "purge" script against our bucket and then backed it up again. This time the backup reported 100%, so thanks for the tip on cleaning those ghost items out of the bucket.

-K

Perry Krug

unread,
Jun 5, 2013, 9:11:40 AM6/5/13
to couchba...@googlegroups.com
Was the subsequent restore successful?

Also, when you say you couldn't upgrade...could you provide some more details about the problems you ran into?



Kyle Heon

unread,
Jun 5, 2013, 9:15:21 AM6/5/13
to couchba...@googlegroups.com
Yes, restore went perfectly and the purge ran quick. Things in test look good.

As to the upgrade, I always get some weird error at the end about not being able to stop the server. If I check the RPM list it shows as installed but hitting the web site results in a 404. I stopped/started the service manually (but got the same error the updater reported). Service claims to have started.

-K

Perry Krug

unread,
Jun 5, 2013, 9:18:48 AM6/5/13
to couchba...@googlegroups.com
Okay, it would be very helpful to have a set of logs after the upgrade since this is something that we do test and do expect to work without issue.




Kyle Heon

unread,
Jun 5, 2013, 9:26:51 AM6/5/13
to couchba...@googlegroups.com
I just pushed up a cbcollect_info dump to the pixelmedia folder; the file is named htestnosql1-20130605-post-upgrade.zip. I'm not sure this will be of much use given that I uninstalled and then did a fresh install, but hopefully it is. I'll try to remember to do this when we start the production server upgrades (probably not for another week).

-K

Perry Krug

unread,
Jun 5, 2013, 9:36:54 AM6/5/13
to couchba...@googlegroups.com
Unfortunately looks like there aren't any past log messages in there to indicate what the issues were.  Please do let us know if this is something you can reproduce as we certainly want to make sure that is working properly.

Perry Krug

unread,
Jun 5, 2013, 9:52:15 AM6/5/13
to couchba...@googlegroups.com
FYI Kyle, I'm doing some of my own testing on upgrade and have noticed that with fairly low-powered machines, it can take 10-20 seconds for the Web UI to be available again after installation or upgrade. In your testing, if you see a beam.smp process running, please wait a little bit longer before deciding to uninstall and reinstall.
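
A minimal Python sketch of "wait a little bit longer" as a polling loop instead of a manual check. The function names, the attempt count, and the delay are arbitrary choices for illustration, and the URL in the usage comment is hypothetical; 8091 is Couchbase's default admin console port.

```python
import time
import urllib.error
import urllib.request


def web_ui_responds(url, timeout=2.0):
    """Return True if an HTTP request to the console URL gets a successful response."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (urllib.error.URLError, OSError):
        # Covers connection refused, timeouts, and HTTP errors such as a 404.
        return False


def wait_until_ready(probe, attempts=12, delay=5.0, sleep=time.sleep):
    """Call probe() up to `attempts` times, sleeping `delay` seconds between tries.

    Returns the attempt number that succeeded, or None if every attempt failed.
    """
    for attempt in range(1, attempts + 1):
        if probe():
            return attempt
        if attempt < attempts:
            sleep(delay)
    return None


# Example usage (hypothetical host):
#   ready = wait_until_ready(lambda: web_ui_responds("http://localhost:8091/"))
#   print("console is up" if ready else "console still not responding")
```

With the defaults this waits up to about a minute, comfortably longer than the 10-20 seconds observed above, before giving up.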

Kyle Heon

unread,
Jun 13, 2013, 10:33:03 AM6/13/13
to couchba...@googlegroups.com
Perry,

Thanks for the tip. I have upgraded our production cluster to 2.0.1 this morning using the "upgrade" process and the first server was available immediately as best I can tell, the second took maybe 30 seconds before the web UI was available.

-K

Kyle Heon

unread,
Nov 5, 2013, 2:09:20 PM11/5/13
to couchba...@googlegroups.com
Ok everyone, blast from the past. We are about to launch 2.2 of the same app that was experiencing index out of range errors accessing a view. We have continued to see this error show up during minimal as well as heavy load testing. We have multiple views, and for some reason this one view causes us problems. Its use is almost identical to another view that hasn't errored and is accessed immediately before this failing view.

The content they return is different though. The error, for simplicity's sake, is pasted below:

System.ArgumentOutOfRangeExceptionIndex was out of range. Must be non-negative and less than the size of the collection. Parameter name: index

System.ArgumentOutOfRangeException: Index was out of range. Must be non-negative and less than the size of the collection.
Parameter name: index
   at Couchbase.CouchbaseClient.Couchbase.IHttpClientLocator.Locate(String designDocument)
   at Couchbase.CouchbaseViewHandler.GetResponse(IDictionary`2 viewParams)
   at Couchbase.CouchbaseViewHandler.<TransformResults>d__0`1.MoveNext()
   at System.Linq.Enumerable.WhereSelectEnumerableIterator`2.MoveNext()
   at System.Collections.Generic.List`1..ctor(IEnumerable`1 collection)
   at System.Linq.Enumerable.ToList[TSource](IEnumerable`1 source)
   at ###.Framework.Tracking.PageVisitLogger.GetResourceAccesses(String userId)
   at ###.Framework.GlobalApplication.Application_AuthenticateRequest(Object sender, EventArgs e)
   at System.Web.HttpApplication.SyncEventExecutionStep.System.Web.HttpApplication.IExecutionStep.Execute()
   at System.Web.HttpApplication.ExecuteStep(IExecutionStep step, Boolean& completedSynchronously)


For the 2.1 release that started this thread, we identified issues with version 1.2.6 of the .NET client and had to go back to 1.2.0. For this 2.2 release I tried 1.2.7 and 1.2.8 and still experienced the issue, so I still was not able to upgrade. The issue we ran into is documented in this thread, so I'm not going to rehash it here. I even provided diagnostics to Couchbase for troubleshooting purposes.

Two things:
  1. Is there an ETA on when this issue will be resolved?
  2. What might cause this exception for just one view?
Thanks!

-K

Matt Ingenthron

unread,
Nov 5, 2013, 3:00:30 PM11/5/13
to couchba...@googlegroups.com, Jeffry Morris
Hi Kyle,

I'm including Jeff Morris who has done a lot of work with the .NET client recently that may be relevant.  More inline…

From: Kyle Heon <kyle...@gmail.com>
Reply-To: "couchba...@googlegroups.com" <couchba...@googlegroups.com>
Date: Tuesday, November 5, 2013 12:09 PM
To: "couchba...@googlegroups.com" <couchba...@googlegroups.com>
Subject: Re: Backup is running to 200.7% after crash (was: Re: [Team 8091] Couchbase server crashed during minimal load test)

Ok everyone, blast from the past. We are about to launch 2.2 of the same app that was experiencing index out of range errors accessing a view. We have continued to see this error show up during minimal as well as heavy load testing. We have multiple views and for some reason this one view causes us problems. It's use is almost identical to another that hasn't error and is access immediately before this failing view.

Which version of the cluster are you using?  We did find an issue with views in cluster 2.1 that was addressed in 2.2.  This particular issue was when you had a _sum or _stats reduce.

Also, at the cluster, there are a set of logs for view execution.  The path may vary depending on the OS in use.  For me (on MacOS X) the path is ~/Library/Application Support/Couchbase/var/lib/couchbase/logs.  There may be a mapreduce_errors.1 file and there will always be a views.1 file.

Is there anything related to this view request?

More below…


For the 2.1 release that started this thread, we identified issues with the .NET 1.2.6 and had to go back to 1.2.0. For this 2.2 release I tried 1.2.7 and 1.2.8 and still experienced the issue so was not able to upgrade still. The issue we have run into is documented in this thread so I'm not going to rehash it here. I even provided diagnostics to Couchbase for troubleshooting purposes.

Two things:
  1. Is there an ETA on when this issue will be resolved?
  2. What might cause this exception for just one view?

Can you (offline if needed) send us the case number that you were working with support on?  I'll follow up with them.

I can't say for certain what the ETA is on fixing the issue, as it's not clear what the issue is to me.  It looks like there may be an unexpected response that isn't getting handled well.  That could be related to the view bug fixed in Couchbase Server 2.2.

The reason I do bring up Jeff's work is that I know we've addressed a number of things in thread safety in the core and we'll be recommending that everyone upgrade. I'm not clear what the issues you mention in the above paragraph are. I did look back at the thread and there are a few things in there, like the backup, the use of the dev_ view, etc.

Typically, if this is something that has been reported and the defect has been acknowledged, we'd have an NCBC for it. Do you know if there is one of those?

I'll do some digging as well, but I wanted to get back to you as soon as possible.

Thanks,

Matt


Matt Ingenthron

unread,
Nov 5, 2013, 6:04:50 PM11/5/13
to couchba...@googlegroups.com, Jeffry Morris
Hi Kyle,

I may have jumped to a conclusion there, so let me tell you what Jeff and I found by digging into the code a bit.

It looks like it's not a problem with the view or view execution at all, but rather with identifying a node on which to execute the query.

The interesting thing is we haven't seen this in any of our testing and we test a few scenarios like this.

Do you have a test case we can run or can you give us a description of how to build a test that reproduces this?  We'd like to get some debug information from when this occurs.  I think if we can, we can pretty easily fix the issue based on what we see in the code.  It's just not quite clear what the cause is yet.
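
For readers following along, here is one general way an index-out-of-range error can arise when picking a node, sketched in Python. This is purely hypothetical: it is not the .NET client's code, and the thread has not established that this is what happens inside `Locate`. The `interleave` hook stands in for another thread changing the node list between two steps.

```python
class NoNodesAvailable(Exception):
    """Raised instead of an index error when there is nothing to pick from."""


def pick_unsafe(nodes, counter, interleave=None):
    """Round-robin pick that reads the size and the element in two separate steps."""
    count = len(nodes)
    if interleave is not None:
        interleave()  # stands in for another thread replacing or shrinking the list
    return nodes[counter % count]


def pick_safe(nodes, counter, interleave=None):
    """Round-robin pick that works on a private snapshot of the node list."""
    snapshot = tuple(nodes)
    if interleave is not None:
        interleave()
    if not snapshot:
        raise NoNodesAvailable("no nodes are currently known")
    return snapshot[counter % len(snapshot)]


nodes = ["node-a", "node-b"]
try:
    pick_unsafe(nodes, 1, interleave=nodes.pop)
except IndexError as exc:
    print("unsafe pick failed:", exc)

nodes = ["node-a", "node-b"]
print("safe pick:", pick_safe(nodes, 1, interleave=nodes.pop))
```

The snapshot version can still hand back a node that has just been removed, so a caller would need to handle a failed request, but it cannot index past the end of the list. A failure of this kind would only show up under concurrency, which would be consistent with it being hard to reproduce outside of load tests.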

Thanks,

Matt

-- 
Matt Ingenthron
Couchbase, Inc.

Kyle Heon

unread,
Nov 5, 2013, 6:54:57 PM11/5/13
to couchba...@googlegroups.com, Jeffry Morris
Hi guys, thanks for following up. Honestly I don't really have any way to reproduce this with any level of consistency. In the past couple of weeks we've seen the issue 3, maybe 4 times and 3 of those were during load tests with the environment under a fairly high amount of traffic (hundreds of users). This is a test environment and there is only 1 Couchbase node. Our production environment has a 2 server cluster and we've never seen this issue (to my knowledge).

Tomorrow we deploy 2.2 (of our application), so I won't be able to help much right now, but after that, if it'll help, I can dump server diagnostics from our test node. I can also provide you with a backup from our node if that'll help you recreate the issue. From there I could provide you with the .NET code pieces to try to reproduce it. Our app is way too big to provide you with a running solution, but I can see what I can do to extract the pieces you need. Thankfully this code runs in the GlobalApplication's Application_AuthenticateRequest handler (so at the start of every page load).

We are running Enterprise 2.0.1 in both our testing and our production environments.

-K


Matt Ingenthron

unread,
Nov 5, 2013, 11:03:07 PM11/5/13
to couchba...@googlegroups.com, Jeffry Morris
Hi Kyle,


We may want to put together an instrumented binary if it's that hard to reproduce.  I think perhaps Jeff can pull together one that is based on our 1.3 with some additional logging to help us find the problem.

Jeff: can you file an NCBC on this so we don't lose it?  Maybe we can try to repro it with some kind of longevity test with a lot of load.

Kyle: how many nodes in your cluster?  This would be needed for the repro, based on the quick scan of the code earlier.

Thanks, 

Matt

Jeffry Morris

unread,
Nov 6, 2013, 4:32:31 PM11/6/13
to Matt Ingenthron, couchba...@googlegroups.com

Kyle –

Here is the NCBC for tracking this issue: https://www.couchbase.com/issues/browse/NCBC-326

-Jeff

Kyle Heon

unread,
Nov 6, 2013, 7:12:05 PM11/6/13
to couchba...@googlegroups.com
Thanks guys! What are you going to need from me exactly?


