'killing myself' errors in 1.8.1


Dmytro

Oct 23, 2012, 4:30:21 AM
to couc...@googlegroups.com
Hello,

We are running Couchbase Server 1.8.1-937 on 10 Linux nodes, with a bucket of about 200M items.

About 1.5 months ago we upgraded the servers from 1.7 to 1.8 and started to see the following error message in the log:

killing myself due to unexpected upstream sender exit with reason:
{{badmatch,{error,closed}},
 [{ebucketmigrator_srv,upstream_sender_loop,1}]}
ebucketmigrator_srv000  ns...@172.19.4.41  03:52:33 - Thu Sep 20, 2012

In many cases this error comes from the same node, but sometimes it comes from different ones.

Since the upgrade we have had one node failure: the node became unresponsive after emitting many errors of this kind. In other cases (a single error about once a day), there seems to be no impact.
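
To check whether the errors cluster on one node or on particular days, the log entries can be tallied per node and date. The following is only a minimal sketch in Python. It assumes the entries have been exported as text with a one-line footer of the form "module node HH:MM:SS - Day Mon DD, YYYY", modeled on the entry quoted above; that layout, the sample node names, and the function name `count_by_node_and_date` are assumptions for illustration, not something Couchbase provides.

```python
import re
from collections import Counter

# Assumed (hypothetical) footer layout, modeled on the log entry in this post:
#   <module> <node> <HH:MM:SS> - <Day Mon DD, YYYY>
FOOTER = re.compile(
    r"^(?P<module>\S+)\s+(?P<node>\S+@\S+)\s+"
    r"(?P<time>\d{2}:\d{2}:\d{2}) - (?P<date>\w{3} \w{3} \d{1,2}, \d{4})$"
)


def count_by_node_and_date(lines):
    """Count ebucketmigrator_srv log footers per (node, date) pair."""
    counts = Counter()
    for line in lines:
        m = FOOTER.match(line.strip())
        # Lines that do not look like a footer are skipped silently.
        if m and m.group("module").startswith("ebucketmigrator_srv"):
            counts[(m.group("node"), m.group("date"))] += 1
    return counts


# Made-up sample data; the node addresses are placeholders.
sample = [
    "ebucketmigrator_srv000 ns_1@10.0.0.1 03:52:33 - Thu Sep 20, 2012",
    "ebucketmigrator_srv000 ns_1@10.0.0.1 11:07:02 - Thu Sep 20, 2012",
    "ebucketmigrator_srv000 ns_1@10.0.0.2 04:15:10 - Fri Sep 21, 2012",
    "some unrelated line",
]
counts = count_by_node_and_date(sample)
for (node, date), n in sorted(counts.items()):
    print(node, date, n)
```

Note that this counts every entry logged by ebucketmigrator_srv, so the input should be filtered to the 'killing myself' entries first if other messages from that module are present.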

A Google search turned up only the following references for this error:


According to the last reference, this error was supposed to be fixed in version 1.8.1.

I also can't find clear information on whether this is a 'bad' error or not, or on how to fix or prevent it.

Thank you in advance,
Dmytro Kovalov



golden Ray

Oct 24, 2012, 12:28:10 AM
to couc...@googlegroups.com
Hi Dmytro,

Nice to meet you here, and thanks for following up on this error log!

I have noticed this error recently as well; each time, it seems to keep appearing for several days. I think it may be caused by an Erlang remote call timeout, and I found that we have also hit a Mnesia error:

:error_logger:ale_error_logger_handler:log_msg:76] Mnesia('ns_1  172.19.4.  '): ** WARNING ** Mnesia is overloaded: {dump_log,
                                                                 write_threshold}


Has anyone else in the community run into this error?

Ray Huang



On Tuesday, October 23, 2012 at 4:30:21 PM UTC+8, Dmytro wrote:

Aliaksey Kandratsenka

Oct 24, 2012, 1:50:58 PM
to couc...@googlegroups.com
On Tue, Oct 23, 2012 at 9:28 PM, golden Ray <raygo...@gmail.com> wrote:
> Hi Dmytro,
>
> Nice to meet you here, and thanks for following up on this error log!
>
> I have noticed this error recently as well; each time, it seems to keep appearing for several days. I think it may be caused by an Erlang remote call timeout, and I found that we have also hit a Mnesia error:
>
> :error_logger:ale_error_logger_handler:log_msg:76] Mnesia('ns_1  172.19.4.  '): ** WARNING ** Mnesia is overloaded: {dump_log,
>                                                                  write_threshold}
>
> Has anyone else in the community run into this error?

This error is harmless. The disk is busy and Mnesia is telling you that. It always happens on somewhat loaded systems.

As for your particular bug, it appears that something inside memcached is closing the producer-side TAP connection. It's hard to say whether the node failure and these errors are related. I suggest you file a bug in the project's JIRA and attach details. Diags (you can grab them via the 'Generate diagnostics' link in the top left corner of the Log section) would help. Alternatively, you can use the cbcollect_info tool to get the logs plus lots of system-related diagnostics. Note that we do have some known issues where passwords are logged. Most folks are unaffected because they run membase/couchbase in closed networks, but you should be aware of that anyway.

Dmytro Koval'ov

Oct 25, 2012, 3:07:02 AM
to couc...@googlegroups.com
Hi Aliaksey,

Thank you for your answer. I have just created JIRA ticket http://www.couchbase.com/issues/browse/MB-7007.

Thank you for the password warning too; I have removed the passwords from the info files.
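
A cleanup like this can be scripted before diagnostics are attached to a public ticket. The following is only a sketch in Python: the two patterns are hypothetical guesses at how a password might appear in a text file, not the actual format of the Couchbase diagnostics, and `redact` is a name made up for illustration. The output should still be checked by hand before uploading.

```python
import re

# Hypothetical patterns; real diagnostics may record passwords differently.
PATTERNS = [
    # Erlang-style tuple such as {password,"..."} or {sasl_password,"..."}
    re.compile(r'(\{(?:sasl_)?password,\s*")[^"]*(")'),
    # Generic key/value form such as password = "..." or password: "..."
    re.compile(r'(password\s*[=:]\s*")[^"]*(")', re.IGNORECASE),
]


def redact(text, placeholder="*****"):
    """Replace quoted password values matched by PATTERNS with a placeholder."""
    for pattern in PATTERNS:
        # Keep the key and the quotes, swap only the quoted value.
        text = pattern.sub(
            lambda m: m.group(1) + placeholder + m.group(2), text
        )
    return text


print(redact('{sasl_password,"s3cret"}'))
print(redact('password = "abc"'))
```

Applied to a whole file, the same function can be called on each line and the result written to a new file, leaving the original untouched for comparison.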

I've uploaded the cbcollect information from the node that was sending most of the errors. Please let me know if additional information is required.


Thank you again and best regards,
Dmytro Kovalov
--
  Dmytro Kovalov  
  http://dmytro.github.com
 