Spontaneous reboot issues on BBB: Jessie 8.4 console image of 4/7/2016, 4.4.9-bone-rt-r10 LTS kernel

Super Twang

May 19, 2016, 12:34:25 PM
to BeagleBoard

PS. Is this the best place to put bug/issue reports?  I believe this relates to the core image & kernel of a Jessie LTS release.

---

I'm seeing semi-regular spontaneous reboots on the BBB.  Here's what I'm seeing:

In 'top', the "Mem buffers" field (4th line, far right) increases steadily until the system reboots at around 41180 buffers.

Anyhow, since I'm going after max stability, this is of concern to me, so I've been doing a little testing.

I'm not entirely sure what I'm looking for, but since I'm not running any of my own apps in this setup, I'm hoping the results might be useful for someone else?


I have logs available from

vmstat -s K -n 1 >> logfile

as well as screengrabs of 'top' and 'slabtop' taken at the time of failure.

Let me know if you have any insights or would like the logs.  I still have the image and am happy to run other tests if it'll help.  
In the meantime, I'm going to revert to a Wheezy console image and hope to find more stability there.
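
For anyone who wants to check a log like this for steady buffer growth, here is a minimal Python sketch. It assumes the log holds vmstat's default column layout (a header row containing `buff`, then one row of numbers per sample); the function name `buffer_growth` and the sample rows are made up for illustration and are not taken from the actual log.

```python
def buffer_growth(lines):
    """Return (first, last, change_per_sample) for the 'buff' column of vmstat output."""
    buff_index = None
    samples = []
    for line in lines:
        fields = line.split()
        if "buff" in fields:
            # Header row: remember which column holds the buffer size.
            buff_index = fields.index("buff")
            continue
        if buff_index is None or len(fields) <= buff_index:
            continue  # Banner row, blank line, or data before any header.
        try:
            samples.append(int(fields[buff_index]))
        except ValueError:
            continue  # Not a data row.
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    rate = (samples[-1] - samples[0]) / (len(samples) - 1)
    return samples[0], samples[-1], rate


# Made-up sample in vmstat's default layout, for illustration only.
sample_log = """\
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 0  0      0 401200  40000  52000    0    0     0     0  120   80  1  1 98  0  0
 0  0      0 401100  40010  52000    0    0     0     4  118   79  1  1 98  0  0
 0  0      0 401000  40020  52004    0    0     0     4  121   83  1  1 98  0  0
""".splitlines()

first, last, rate = buffer_growth(sample_log)
print(f"buff went from {first} to {last}, about {rate:.1f} per sample")
```

Note that a rising buffer count is not by itself a leak: Linux keeps recently used disk blocks in buffers and page cache and reclaims them under memory pressure, so the trend is more telling when read alongside the free and cache columns.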

Best,
Dave

--------- setup is below

Setup:

Running from image:  RCN's jessie 8.4 console image of 4/7/2016  (with some packages removed, and a developer environment installed)

Kernel:  4.4.9-bone-rt-r10

Running from SDCard

SSH'ed in over DHCP-connected Ethernet, most recently capturing vmstat output to a log file.



PS.  Under a different testing scenario, where I am running my own HTTP server/app, I see strange behavior too.  My app has been extensively tested for leaks, etc., BUT I can't rule out that it is my own app causing the issues when I am running it.  Nonetheless, maybe the strangeness will help diagnose the above issue, so I'm including it here.

The strangeness I'm seeing is this:  there is a 24-hour recurring pattern to the reboots.  In the log below, the number to the right of "Startup" is "seconds since last startup".

Any idea what this might be???  Logfile rotation?  Some kind of cron-job interference?

2016-05-17 07:36:14: Startup       15667    31504        * 

2016-05-17 11:57:11: Startup       15657        *        * 

2016-05-17 14:21:10: Startup        8639        *        * 

2016-05-17 16:20:27: Startup        7157        *        * 

2016-05-17 17:22:41: Startup        3734        *        * 

2016-05-17 18:23:23: Startup        3642        *        * 

2016-05-17 19:24:06: Startup        3643        *        * 

2016-05-17 20:25:33: Startup        3687        *        * 

2016-05-17 21:26:16: Startup        3643        *        * 

2016-05-17 22:27:00: Startup        3644        *        * 

2016-05-17 23:27:41: Startup        3641        *        * 

2016-05-18 00:28:24: Startup        3643        *        * 

2016-05-18 01:30:36: Startup        3732        *        * 

2016-05-18 02:29:06: Startup        3510        *        * 

2016-05-18 03:17:03: Startup        2877        *        * 

2016-05-18 03:21:32: Startup         269        *        * 

2016-05-18 03:25:59: Startup         267        *        * 

2016-05-18 03:30:27: Startup         268        *        * 

2016-05-18 03:34:54: Startup         267        *        * 

2016-05-18 07:25:51: Startup       13857        *        * 

2016-05-18 11:22:51: Startup       14220        *        * 

2016-05-18 12:51:18: Startup        5307        *        * 

2016-05-18 13:51:59: Startup        3641        *        * 

2016-05-18 14:54:13: Startup        3734        *        * 

2016-05-18 15:55:42: Startup        3689        *        * 

2016-05-18 16:56:25: Startup        3643        *        * 

2016-05-18 17:57:54: Startup        3689        *        * 
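
As a cross-check on a log like the one above, a short Python sketch can recompute the gap between consecutive startup timestamps and compare it with the logged number. The excerpt is four consecutive lines copied from the log; the function name `startup_intervals` and the 600-second "rapid reboot" threshold are assumptions chosen for illustration.

```python
from datetime import datetime


def startup_intervals(lines):
    """Yield (timestamp, logged_seconds, computed_seconds) for each Startup line."""
    previous = None
    for line in lines:
        if ": Startup" not in line:
            continue
        stamp_text, rest = line.split(": Startup", 1)
        stamp = datetime.strptime(stamp_text.strip(), "%Y-%m-%d %H:%M:%S")
        logged = int(rest.split()[0])
        # The first line has no predecessor, so its gap cannot be recomputed.
        computed = None if previous is None else int((stamp - previous).total_seconds())
        yield stamp, logged, computed
        previous = stamp


# Four consecutive lines from the log above.
excerpt = [
    "2016-05-18 02:29:06: Startup        3510        *        * ",
    "2016-05-18 03:17:03: Startup        2877        *        * ",
    "2016-05-18 03:21:32: Startup         269        *        * ",
    "2016-05-18 03:25:59: Startup         267        *        * ",
]

RAPID_SECONDS = 600  # Hypothetical threshold for flagging a reboot loop.

for stamp, logged, computed in startup_intervals(excerpt):
    flag = "rapid" if logged < RAPID_SECONDS else ""
    print(stamp.strftime("%m-%d %H:%M"), logged, computed, flag)
```

For this excerpt the recomputed gaps agree with the logged values, which suggests the logged number is the wall-clock time between successive startups. Printing the hour alongside each interval makes it easier to see whether the hourly and few-minute clusters recur at the same time of day.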


Robert Nelson

May 19, 2016, 12:37:42 PM
to Beagle Board

That's odd...

http://rcn-ee.homeip.net:81/farm/uptime/test-bbb-6.log

Double check your power supply..


Super Twang

May 19, 2016, 1:08:17 PM
to beagl...@googlegroups.com

Robert,

You make a good point.  Maybe a brownout/blackout caused it.  We had a thunderstorm roll through here last night, and I’m not running power from a UPS.  I wish there were some way to catch a spontaneous power loss in the logs so I could rule that out, but the syslog shows nothing in these cases.  I’ll try again with (more) stable power.
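
One possible way to get at least a partial record is sketched below in Python: a small process periodically writes a timestamp to a heartbeat file and syncs it to storage, and a marker file is created only on a clean shutdown. At the next boot, a missing marker means the previous run ended uncleanly, and the last heartbeat says roughly when. This is only a sketch under assumptions: the file names and function names are hypothetical, it cannot by itself tell a power loss from a kernel crash or watchdog reset, and on a board without a battery-backed clock the timestamps are only meaningful once the time has been set (for example via NTP).

```python
import os
import tempfile
import time


def write_heartbeat(path, now=None):
    """Replace `path` with the current time and push it to storage."""
    now = time.time() if now is None else now
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        f.write(f"{now:.0f}\n")
        f.flush()
        os.fsync(f.fileno())  # Ask the kernel to write the data out now.
    os.replace(tmp, path)  # Rename so readers see the old or the new file, not a partial one.


def describe_last_run(heartbeat_path, clean_marker_path):
    """Return ('clean' | 'unclean' | 'unknown', last_heartbeat_or_None)."""
    try:
        with open(heartbeat_path) as f:
            last = float(f.read().strip())
    except (OSError, ValueError):
        return "unknown", None  # No usable heartbeat was recorded.
    state = "clean" if os.path.exists(clean_marker_path) else "unclean"
    return state, last


# Demonstration in a temporary directory with a fixed, made-up timestamp.
with tempfile.TemporaryDirectory() as demo_dir:
    heartbeat = os.path.join(demo_dir, "heartbeat")
    marker = os.path.join(demo_dir, "clean_shutdown")
    write_heartbeat(heartbeat, now=1463500000)
    print(describe_last_run(heartbeat, marker))
```

In real use the marker would be written by a shutdown hook and deleted early in boot, after `describe_last_run` has logged its result. Frequent synced writes add wear to an SD card, so the heartbeat interval is a trade-off.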

Is it normal for the buffer count to steadily increase like I’m seeing?  Could it be my own stat logging that is causing it?  Maybe I’m doing something and just not seeing it straight, but I’m trying to have as minimal an impact as I can on the running system.

Thanks for your thoughts.

Best,
ST

PS. Out of curiosity, how many machines are in your testing farm?  Do you have a photo?  I’ve got a whopping 2. :)



Wally Bkg

May 19, 2016, 3:37:57 PM
to BeagleBoard
I've had a BBW running 24/7 since December with: BeagleBoard.org Debian Image 2015-11-12, kernel 4.1.18-ti-r49 #1 SMP PREEMPT Fri Feb 26 00:12:54 UTC 2016 armv7l GNU/Linux

It is on a UPS and I've had exactly one lockup, which happened last week -- it actually looked like it had powered down, with all LEDs off.  We also had a thunderstorm roll through about when its last heartbeat message came, but the UPS never "beeped" nor did the lights flicker (unusual; this generally means it was only a very minor thunderstorm as things go around here).

I posted a thread on it when it happened.  What was weird was that neither the reset button nor unplugging and re-plugging the 5V "barrel connector" power supply restarted it, so I thought it was fried.  I left it unplugged while checking some other parts of the system, and when I went back to it 10+ minutes later it booted right up when I plugged it in -- almost like some kind of thermal overload, but nothing is even warm to the touch on the board.  It has been running 24/7 since.  Another small thunderstorm is rolling through as I type this, so I'll see what, if anything, happens.