Beaglebone Black with Ubuntu flasher hanging unpredictably?


Andris Bjornson

Apr 6, 2015, 7:30:45 PM
to beagl...@googlegroups.com
Hi all,

I've deployed a number of BeagleBone Black units in West Africa as part of an emergency connectivity relief effort to support NGOs working to fight the Ebola outbreak. The beaglebones provide a simple network monitoring function.

The beaglebones were imaged in November with the Ubuntu flasher downloaded from http://elinux.org/BeagleBoardUbuntu#Flasher (the image version is BBB-eMMC-flasher-ubuntu-14.04.1-console-armhf-2014-10-29-2gb.img).

I'm having an issue with a few of the beaglebones hanging unpredictably. I know I should provide more information to help diagnose this, but I'm having a hard time finding any "smoking gun" for what's causing the hang. The beaglebones are in remote telco sheds monitoring network equipment, so one of my challenges is that I don't have a monitor connected or anyone I can ask "what's on the screen?" Fortunately, I do have the ability to power cycle remotely (see below).

Here's what I know:
  1. The beaglebones have barely been modified from the standard base flasher image. I've only added a few monitoring tools from apt packages (smokeping and zabbix-proxy). I use these tools elsewhere, and I've never had an issue with them hanging a system.
  2. The systems run fine for weeks at a time.
  3. At some point, the systems in question will "hang". They stop responding to pings, but the Ethernet port of the router they are connected to still shows a link light.
  4. Because each beaglebone is connected to a remotely manageable power strip / PDU, I am able to power cycle it when this happens. The unit then boots normally and functions normally until the problem recurs a few weeks later.

Each beaglebone is powered by a dedicated 5V / 1A power supply connected to its barrel connector.  Other equipment at the site does not hang or reboot - so I know the beaglebone hang does not coincide with a power issue at the site.

Can anyone give me any tips on diagnosing this? I can see the time of the hang and the power cycle in dmesg and syslog, but there's no hint there as to what happened. Everything was "all conditions normal" before the hang.
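
One way to look for a pattern is to pull the last few syslog lines written before each boot and compare them across hangs. The following is only a minimal sketch: it assumes each boot logs a kernel line containing "Linux version", and the function name lines_before_boot and the default of 20 lines are illustrative choices rather than anything provided by the image.

```shell
#!/bin/sh
# Print the last N lines logged before each boot marker in a syslog file.
# Assumption: every boot writes a kernel line containing "Linux version";
# change the marker if the syslog on your units looks different.
lines_before_boot() {
    logfile=$1
    count=${2:-20}   # number of lines of context to show before each boot
    grep -B "$count" 'Linux version' "$logfile"
}

# Example (default Ubuntu syslog path; rotated logs would need zcat first):
# lines_before_boot /var/log/syslog 30
```

If the lines before every hang turn out to share a message or a time of day, that would be a lead; if they look unrelated, the hang is probably not leaving a trace in syslog at all.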

Has anyone seen this behavior before?

Thanks so much - any help greatly appreciated!


Graham

Apr 6, 2015, 8:58:54 PM
to beagl...@googlegroups.com
What is the ambient air temperature the BBB is operating in?
I would measure the temperature of the Sitara chip. 
Perhaps it is running on the high side. 
There is a built-in die temperature sensor, although I don't know how easy it is to read.

Either based on data or as an experiment, put a heat sink on the Sitara and/or blow some air over the BBB.
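
To collect that data, a small heartbeat logger could record the temperature right up to the moment of a hang. This is only a sketch under stated assumptions: it assumes the kernel exposes the generic Linux thermal sysfs node at /sys/class/thermal/thermal_zone0/temp (millidegrees Celsius), which may not be the case on the BBB with this kernel, and the function name log_heartbeat and the paths in the example are hypothetical.

```shell
#!/bin/sh
# Append one heartbeat line: UTC timestamp, 1-minute load average, temperature.
# TEMP_FILE is an assumption: the generic thermal sysfs node, which reports
# millidegrees Celsius where a driver provides it. If it is missing on this
# board/kernel, the line records "n/a" instead.
TEMP_FILE=${TEMP_FILE:-/sys/class/thermal/thermal_zone0/temp}

log_heartbeat() {
    out=$1
    ts=$(date -u +%Y-%m-%dT%H:%M:%SZ)
    load=$(cut -d' ' -f1 /proc/loadavg 2>/dev/null || echo n/a)
    if [ -r "$TEMP_FILE" ]; then
        milli=$(cat "$TEMP_FILE")
        temp="$((milli / 1000))C"   # integer degrees Celsius
    else
        temp=n/a
    fi
    printf '%s load=%s temp=%s\n' "$ts" "$load" "$temp" >> "$out"
    sync   # flush to storage so the last line survives a hard hang
}

# Example cron entry (hypothetical script and log paths), once a minute:
# * * * * * /usr/local/bin/heartbeat.sh /var/log/heartbeat.log
```

After a hang and power cycle, the last line in the log shows the time and temperature shortly before the board stopped responding.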


--- Graham


Andris Bjornson

Apr 6, 2015, 10:04:03 PM
to beagl...@googlegroups.com
Thanks for the response!  I'd thought about temperature as an issue....I'll have to dig into this.

I'd done some testing of beaglebones in a hotbox before this deployment. I ran things up pretty hot (around 65C for multiple hours) and never had an issue with the beaglebones, but let me investigate and see if I can find a correlation between high temperature and these crashes.


---
Andris Bjornson | EveryLayer
skype: andris.bjornson


Robert Nelson

Apr 6, 2015, 10:22:06 PM
to Beagle Board
On Mon, Apr 6, 2015 at 9:03 PM, Andris Bjornson <and...@everylayer.com> wrote:
> Thanks for the response! I'd thought about temperature as an issue....I'll
> have to dig into this.
>
> I'd done some testing of beaglebones in a hotbox before this deployment and
> I ran things up pretty hot (like 65C for multiple hours) and never had an
> issue with the beaglebones....but let me investigate and see if i can find a
> correlation between high temp and these crashes.

uname -r ?

Regards,

--
Robert Nelson
https://rcn-ee.com/

Andris Bjornson

Apr 7, 2015, 12:25:54 AM
to beagl...@googlegroups.com
> uname -r ?

3.14.22-ti-r31


 

Andris Bjornson

Apr 7, 2015, 12:31:37 AM
to beagl...@googlegroups.com
Hi Graham,

Re: temperature - I looked at graphs from the temperature sensor of a router that's located in the same cabinet as the beaglebone. At the time of the crash, the router's temperature sensor was reading 40C (this sensor is inside the router case, so it does not indicate an ambient air temperature of 40C). It actually looks to have been one of the cooler days.


---
Andris Bjornson | EveryLayer
skype: andris.bjornson


Robert Nelson

Apr 7, 2015, 9:51:38 AM
to Beagle Board
On Mon, Apr 6, 2015 at 11:25 PM, Andris Bjornson <and...@everylayer.com> wrote:
>> uname -r ?
>
>
> 3.14.22-ti-r31

Yuck, yeah there are some issues with that old version...

Please upgrade to 3.14.37-ti-r57

sudo apt-get update ; sudo apt-get install linux-image-3.14.37-ti-r57 ; sudo reboot

and retest one of your units in those conditions.
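
After the reboot, it may be worth confirming that the unit actually came up on the new kernel before retesting. A minimal sketch, assuming GNU sort with -V (version sort) is available, as it is in coreutils on Ubuntu 14.04; the function name kernel_at_least is hypothetical:

```shell
#!/bin/sh
# Succeed if kernel release $1 is the same as or newer than release $2.
# Uses GNU sort -V (version sort) to order the two release strings.
kernel_at_least() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example: compare the running kernel against the suggested version.
# if kernel_at_least "$(uname -r)" 3.14.37-ti-r57; then
#     echo "kernel OK: $(uname -r)"
# else
#     echo "still on old kernel: $(uname -r)"
# fi
```

Run over SSH across all deployed units, a check like this would show which ones still need the upgrade.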

Andris Bjornson

Apr 29, 2015, 6:51:36 PM
to beagl...@googlegroups.com
Thank you for the response (and sorry for my lack of update!....been traveling).

I'll update and let you know if there's an improvement.

Thanks!

Andris Bjornson

May 12, 2015, 4:21:32 PM
to beagl...@googlegroups.com
Several weeks after implementing this, stability seems very much improved.

Thank you so much for your help - this is a big relief to have this sorted! 

FYI - there is now a small army of ~15 beaglebones deployed throughout Sierra Leone and Liberia to monitor the health of connectivity there.