Hi.
We have servers running RabbitMQ 3.5.6 that ran out of disk space at one point.
The server has issues, so we tried copying the files from the mnesia folder to another server that also runs RabbitMQ. We changed what was required on the new server to match the hostname of the previous one.
We tried this with files from a couple of crashed servers and were able to recover the queue messages.
But for some others it does not seem to work: the node just hangs or freezes when I run the service rabbitmq-server start command. Based on rab...@host.log, there seems to be some issue parsing a segment.
The log has a line saying Restarting crashed queue 'clicks' in vhost '/'. and then it either takes too long (I don't know if it's stuck) or repeats some of the lines and restarts the crashed queue again.
Any help will be much appreciated. Thanks.
Here's the log we have from the server when it starts up...
=INFO REPORT==== 30-Sep-2016::16:45:15 ===
Starting RabbitMQ 3.5.6 on Erlang R14B04
Copyright (C) 2007-2015 Pivotal Software, Inc.
Licensed under the MPL. See http://www.rabbitmq.com/
=INFO REPORT==== 30-Sep-2016::16:45:15 ===
node : rabbit@mq-mt10
home dir : /var/lib/rabbitmq
config file(s) : /etc/rabbitmq/rabbitmq.config
cookie hash : OHOm6cAi0pRByqTsrYdG1A==
log : /var/log/rabbitmq/rab...@mq-mt10.log
sasl log : /var/log/rabbitmq/rab...@mq-mt10-sasl.log
database dir : /var/lib/rabbitmq/mnesia/rabbit@mq-mt10
=INFO REPORT==== 30-Sep-2016::16:45:15 ===
Memory limit set to 12838MB of 16047MB total.
=INFO REPORT==== 30-Sep-2016::16:45:15 ===
Disk free limit set to 50MB
=INFO REPORT==== 30-Sep-2016::16:45:15 ===
Limiting to approx 924 file handles (829 sockets)
=INFO REPORT==== 30-Sep-2016::16:45:15 ===
FHC read buffering: ON
FHC write buffering: ON
=INFO REPORT==== 30-Sep-2016::16:45:16 ===
Priority queues enabled, real BQ is rabbit_variable_queue
=INFO REPORT==== 30-Sep-2016::16:45:16 ===
Management plugin: using rates mode 'basic'
=INFO REPORT==== 30-Sep-2016::16:45:16 ===
msg_store_transient: using rabbit_msg_store_ets_index to provide index
=INFO REPORT==== 30-Sep-2016::16:45:16 ===
msg_store_persistent: using rabbit_msg_store_ets_index to provide index
=ERROR REPORT==== 30-Sep-2016::16:45:18 ===
** Generic server <0.240.0> terminating
** Last message in was {init,{<0.159.0>,[non_clean_shutdown]}}
** When Server state == {q,{amqqueue,
{resource,<<"/">>,queue,<<"clicks">>},
true,false,none,[],<0.240.0>,[],[],[],
undefined,[],undefined,live},
none,false,undefined,undefined,
{state,
{queue,[],[],0},
{active,1475268316149018,1.0}},
undefined,undefined,undefined,undefined,
{state,fine,5000,undefined},
{0,nil},
undefined,undefined,undefined,
{state,
{dict,0,16,16,8,80,48,
{[],[],[],[],[],[],[],[],[],[],[],[],[],[],
[],[]},
{{[],[],[],[],[],[],[],[],[],[],[],[],[],
[],[],[]}}},
delegate},
undefined,undefined,undefined,undefined,0,running}
** Reason for termination ==
** {function_clause,
[{rabbit_queue_index,parse_segment_entries,
[<<"y">>,false,
{{array,16384,0,undefined,
{{{{{{{true,
<<6,79,246,149,241,55,30,251,55,72,30,36,11,132,
161,214,0,0,0,0,0,0,0,0,0,0,4,167>>,
<<131,104,6,100,0,13,98,97,115,105,99,95,109,101,
115,115,97,103,101,104,4,100,0,8,114,101,115,
111,117,114,99,101,109,0,0,0,1,47,100,0,8,101,
120,99,104,97,110,103,101,109,0,0,0,0,108,0,0,0,
1,109,0,0,0,6,99,108,105,99,107,115,106,104,6,
100,0,7,99,111,110,116,101,110,116,97,60,100,0,
4,110,111,110,101,109,0,0,0,3,16,0,2,100,0,25,
114,97,98,98,105,116,95,102,114,97,109,105,110,
103,95,97,109,113,112,95,48,95,57,95,49,108,0,0,
0,1,109,0,0,4,167,123,34,64,118,101,114,115,105,
111,110,34,58,34,49,34,44,34,64,116,105,109,101,
115,116,97,109,112,34,58,34,50,48,49,54,45,48,
57,45,50,57,84,49,48,58,48,57,58,50,56,46,54,49,
56,90,34,44,34,98,101,97,116,34,58,123,34,104,
111,115,116,110,97,109,101,34,58,34,105,112,45,
49,48,45,50,53,50,45,50,45,49,48,48,34,44,34,
110,97,109,101,34,58,34,105,112,45,49,48,45,50,
53,50,45,50,45,49,48,48,34,125,44,34,115,111,
117,114,99,101,34,58,34,47,118,97,114,47,108,
111,103,47,102,105,108,101,98,101,97,116,47,114,
97,98,98,105,116,109,113,95,99,108,105,99,107,
115,47,102,105,108,101,98,101,97,116,46,106,115,
111,110,34,44,34,116,121,112,101,34,58,34,108,
111,103,34,44,34,104,111,115,116,34,58,34,105,
.
.
.
50,56,46,57,54,52,55,125,44,34,108,111,103,116,
121,112,101,34,58,34,67,108,105,99,107,79,114,
69,114,114,111,114,34,125,106,109,0,0,0,16,209,
241,202,203,147,189,214,158,240,46,46,46>>},
del,no_ack},
undefined,undefined,undefined,undefined,undefined,
undefined},
10,10},
100,100,100,100,100,100,100},
1000,1000,1000,1000},
10000,10000,10000,10000,10000,10000,10000,10000,10000}},
16384}]},
{rabbit_queue_index,recover_segment,3},
{rabbit_queue_index,'-init_dirty/3-fun-0-',5},
{lists,foldl,3},
{rabbit_queue_index,init_dirty,3},
{rabbit_variable_queue,init,6},
{rabbit_priority_queue,init,3},
{rabbit_amqqueue_process,init_it2,3}]}
=ERROR REPORT==== 30-Sep-2016::16:49:54 ===
Restarting crashed queue 'clicks' in vhost '/'.
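The long runs of numbers in the crash report are the bytes of a binary printed as integers, which makes the log hard to read. A minimal Python sketch for turning such a run back into text is below; the function name is made up for illustration, and the sample bytes are copied from the dump above.

```python
import re

def decode_erlang_bytes(dump: str) -> str:
    """Decode a comma-separated run of byte values, as printed inside <<...>>
    in an Erlang crash report, into text. Non-printable bytes become '.'."""
    values = [int(tok) for tok in re.findall(r"\d+", dump)]
    return "".join(chr(v) if 32 <= v < 127 else "." for v in values)

# Bytes taken from the message body in the crash report above.
sample = "123,34,64,118,101,114,115,105,111,110,34,58,34,49,34"
print(decode_erlang_bytes(sample))  # {"@version":"1"
```

Decoded this way, the payload looks like a JSON log event, so the index entry shown in the report was read correctly and the failure is further along in the segment.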
--
You received this message because you are subscribed to the Google Groups "rabbitmq-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to rabbitmq-user...@googlegroups.com.
To post to this group, send email to rabbitm...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
You can try deleting journal.jif and restarting, but in general you cannot assume that a database that was unable to write to disk at some point is consistent or can be recovered.
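To try the journal.jif suggestion without losing the files for good, they can be moved aside instead of deleted. The Python sketch below assumes the queue index lives in per-queue subdirectories of a queues folder under the database dir shown in the log; that layout, the function name, and the backup path are assumptions rather than something confirmed in this thread. Run it only while the broker is stopped.

```python
import shutil
from pathlib import Path

def set_aside_journals(queues_dir, backup_dir):
    """Move every journal.jif found one level below queues_dir into
    backup_dir, keeping the per-queue subdirectory name.
    Returns the list of original journal paths that were moved."""
    queues_dir, backup_dir = Path(queues_dir), Path(backup_dir)
    moved = []
    for journal in sorted(queues_dir.glob("*/journal.jif")):
        target = backup_dir / journal.parent.name / journal.name
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(journal), str(target))
        moved.append(journal)
    return moved

# Example call (both paths are assumptions, adjust to your node):
# set_aside_journals("/var/lib/rabbitmq/mnesia/rabbit@mq-mt10/queues",
#                    "/root/rabbit-journal-backup")
```

Keeping a copy means the original state can be restored if the restart still fails.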
Hi. We tried removing journal.jif and starting RabbitMQ again, but still no luck. In the admin console, the queue still shows NaN.
We also tried removing some of the latest *.idx files in the queue folder, in case the last few were corrupted, but still no luck.
Here are the logs; they mention that the maximum restart limit was reached. Any suggestions on what we can do to recover the queues?
>>>>> rab...@host.log
=ERROR REPORT==== 3-Oct-2016::12:24:28 ===
** Generic server <0.270.0> terminating
** Last message in was {'$gen_cast',init}
** When Server state == {q,{amqqueue,
{resource,<<"/">>,queue,<<"clicks">>},
true,false,none,[],<0.270.0>,[],[],[],
undefined,[],[],live},
none,false,undefined,undefined,
{state,
{queue,[],[],0},
{active,1475511164366062,1.0}},
undefined,undefined,undefined,undefined,
{state,fine,5000,undefined},
{0,nil},
undefined,undefined,undefined,
{state,
{dict,0,16,16,8,80,48,
{[],[],[],[],[],[],[],[],[],[],[],[],[],[],
[],[]},
{{[],[],[],[],[],[],[],[],[],[],[],[],[],
[],[],[]}}},
delegate},
undefined,undefined,undefined,undefined,0,running}
** Reason for termination ==
** {function_clause,
[{rabbit_queue_index,parse_segment_entries,
[<<"P">>,false,
{{array,16384,0,undefined,
{{{{{{{true,
<<43,207,103,48,84,28,201,202,18,194,206,14,163,
248,97,175,0,0,0,0,0,0,0,0,0,0,4,168>>,
<<131,104,6,100,0,13,98,97,115,105,99,95,109,101,
115,115,97,103,101,104,4,100,0,8,114,101,115,
111,117,114,99,101,109,0,0,0,1,47,100,0,8,101,
120,99,104,97,110,103,101,109,0,0,0,0,108,0,0,0,
1,109,0,0,0,6,99,108,105,99,107,115,106,104,6,
100,0,7,99,111,110,116,101,110,116,97,60,100,0,
4,110,111,110,101,109,0,0,0,3,16,0,2,100,0,25,
114,97,98,98,105,116,95,102,114,97,109,105,110,
103,95,97,109,113,112,95,48,95,57,95,49,108,0,0,
0,1,109,0,0,4,168,123,34,64,118,101,114,115,105,
111,110,34,58,34,49,34,44,34,64,116,105,109,101,
115,116,97,109,112,34,58,34,50,48,49,54,45,48,
57,45,50,57,84,49,50,58,48,56,58,49,56,46,48,50,
54,90,34,44,34,98,101,97,116,34,58,123,34,104,
.
.
.
.
58,34,85,83,34,44,34,114,
101,99,111,114,100,95,
114,101,103,105,111,110,
34,58,34,80,65,34,44,34,
114,101,99,111,114,100,
95,99,105,116,121,34,58,
34,87,105,108,107,101,
115,32,66,97,114,114,101,
34,44,34,114,101,99,111,
114,100,95,112,111,115,
116,95,99,111,100,101,34,
58,34,49,56,55,48,54,34,
44,34,114,101,99,111,114,
100,95,46,46,46>>},
del,no_ack},
undefined,undefined,
undefined,undefined,
undefined,undefined},
10,10},
100,100,100,100,100,100,100},
1000,1000,1000,1000},
10000,10000,10000,10000,10000,
10000,10000,10000,10000}},
14198}]},
{rabbit_queue_index,recover_segment,3},
{rabbit_queue_index,
'-init_dirty/3-fun-0-',5},
{lists,foldl,3},
{rabbit_queue_index,init_dirty,3},
{rabbit_variable_queue,init,6},
{rabbit_priority_queue,init,3},
{rabbit_amqqueue_process,init_it2,3}]}
=ERROR REPORT==== 3-Oct-2016::13:00:59 ===
Restarting crashed queue 'clicks' in vhost '/'.
>>>>>> rab...@host-sasl.log
=SUPERVISOR REPORT==== 3-Oct-2016::13:13:02 ===
Supervisor: {<0.243.0>,rabbit_amqqueue_sup}
Context: shutdown
Reason: reached_max_restart_intensity
Offender: [{pid,<0.276.0>},
{name,rabbit_amqqueue},
{mfargs,
{rabbit_prequeue,start_link,
[{amqqueue,
{resource,<<"/">>,queue,<<"clicks">>},
true,false,none,[],<0.270.0>,[],[],[],
undefined,[],undefined,live},
recovery,<0.242.0>]}},
{restart_type,intrinsic},
{shutdown,4294967295},
{child_type,worker}]
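Before removing more *.idx files by hand, it may help to list them in segment order with their sizes, so that an empty or unusually small file stands out. In both crash reports parse_segment_entries is handed a single leftover byte (<<"y">> and <<"P">>), which may point to a segment file that was cut off mid-entry when the disk filled up, though that is only a guess from the log. The sketch below assumes segment files are named with a number followed by .idx; the function name is made up for illustration.

```python
from pathlib import Path

def list_segments(queue_dir):
    """Return (segment_number, size_in_bytes, path) for every N.idx file
    in queue_dir, ordered by segment number. Other files are ignored."""
    segments = []
    for path in Path(queue_dir).glob("*.idx"):
        if path.stem.isdigit():
            segments.append((int(path.stem), path.stat().st_size, path))
    return sorted(segments, key=lambda seg: seg[0])

# Example call (the queue directory name is elided on purpose):
# for number, size, path in list_segments("/var/lib/rabbitmq/mnesia/rabbit@mq-mt10/queues/..."):
#     print(number, size, path)
```

This only reads the directory, so it is safe to run against a copy of the mnesia folder before changing anything.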
Can you explain that in more detail? You wrote: "in general you cannot assume that a database that was unable to write to disk at some point is consistent or can be recovered."
On Sat, Oct 1, 2016 at 10:23 AM, Michael Klishin <mkli...@pivotal.io> wrote:
You can try deleting journal.jif and restarting, but in general you cannot assume that a database that was unable to write to disk at some point is consistent or can be recovered.