RAM usage at 92.1% on one node after crash


Juan Simon

Mar 7, 2014, 11:58:13 PM
to couchba...@googlegroups.com
Not sure if it is related, but we've been having issues with a couple of nodes in the cluster going down and reloading all buckets. It happens around 9pm UTC and makes most requests to the cluster fail for about 2 minutes.

This is what I saw after the last issue:

Inline image 1

Here is the cluster info:
Version 2.2.0
Nodes 5
Total RAM 47GB
Total Disk 1.4TB


Nothing special happens around that time on the client side, as far as I can see.

Side info:
I've noticed peaks in different charts:
  • Minor page faults: 200k
  • Disk write queue: 1.25M
  • Disk Queue Items: going from 550K to 350K
    • We rely heavily on document expiry, expiring around 150k to 200k documents per hour. (The expiry peak time does not match the crashes.)
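
For scale, the figures above can be put into per-node and per-second terms. This is only a rough sketch: it assumes the 47GB of RAM is split evenly across the 5 nodes, which the cluster info does not confirm, and the function names are made up for illustration.

```python
# Figures are taken from the post; the even RAM split across nodes is an assumption.
TOTAL_RAM_GB = 47
NODES = 5
RAM_USAGE_PCT = 92.1
EXPIRY_PER_HOUR = (150_000, 200_000)


def per_node_ram_gb(total_gb, nodes):
    """RAM per node, assuming an even split across the cluster."""
    return total_gb / nodes


def used_ram_gb(node_gb, pct):
    """RAM in use on a node at the given usage percentage."""
    return node_gb * pct / 100


def per_second(per_hour):
    """Convert an hourly count into a per-second rate."""
    return per_hour / 3600


node_ram = per_node_ram_gb(TOTAL_RAM_GB, NODES)
used = used_ram_gb(node_ram, RAM_USAGE_PCT)
low, high = (per_second(x) for x in EXPIRY_PER_HOUR)

print(f"RAM per node: {node_ram:.1f} GB, used on the affected node: {used:.2f} GB")
print(f"Expiry rate: {low:.1f} to {high:.1f} docs/s")
```

Under that assumption, the affected node would be using roughly 8.66 of its 9.4 GB, and the expiry load averages roughly 42 to 56 documents per second.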

I'm not sure where to start. We've limited access to this cluster to a minimum, but it's a pretty important piece of the main application, so any kind of help is welcome.

I'm adding screenshots of the charts in case that helps.

Inline image 2

Inline image 3
Inline image 4
Inline image 5
Inline image 6
Inline image 7





Sorry for the long mail and thanks in advance,
Juan S. Simon