[Beaglebone] SD Card Corruption on Read Only File System

Özen Özkaya

Jan 9, 2013, 6:53:45 AM
to beagl...@googlegroups.com
Hi,

My BeagleBone's SD card became corrupted on a read-only file system after 27 days of operation.
Below you can see the errors:

[    4.718902] mmcblk0: error -110 transferring data, sector 415849, nr 24, cmd response 0x900, card status 0x200b00
[    8.636077] mmcblk0: error -110 transferring data, sector 415872, nr 1, cmd response 0x900, card status 0x0
[    8.646362] end_request: I/O error, dev mmcblk0, sector 415872
[   12.546051] mmcblk0: error -110 transferring data, sector 415865, nr 8, cmd response 0x900, card status 0x200b00
[   16.453765] mmcblk0: error -110 transferring data, sector 415872, nr 1, cmd response 0x900, card status 0x0
[   16.464050] end_request: I/O error, dev mmcblk0, sector 415872
[   16.470703] Kernel panic - not syncing: Attempted to kill init!
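
For reference, error -110 is -ETIMEDOUT on Linux, i.e. the card did not answer the host in time. A minimal sketch for pulling the failing sectors out of a log like the one above follows. The function name `mmc_failed_sectors` is made up for illustration, and it assumes the kernel prints `I/O error, dev mmcblkN, sector S` lines as shown here (other kernel versions may word this differently).

```shell
# Summarise block-layer I/O errors on mmcblk devices from kernel log text
# read on stdin. Prints "<count> <sector>" per failing sector, most
# frequent first.
mmc_failed_sectors() {
    grep 'I/O error, dev mmcblk' |
        sed -n 's/.*sector \([0-9][0-9]*\).*/\1/p' |
        sort -n | uniq -c | sort -rn |
        awk '{ print $1, $2 }'
}

# Typical use on the board:
#   dmesg | mmc_failed_sectors
```

If the same few sectors keep failing across boots, that would be consistent with a bad region on the card itself rather than general filesystem damage.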


The only location I write to is /dev/shm/, and as far as I know that is not related to the SD card.
I am really shocked to see SD card corruption on a read-only file system.

Do you have any suggestions?

Regards

Gerald Coley

Jan 9, 2013, 8:30:17 AM
to beagl...@googlegroups.com
Are you unmounting the SD card before power down? If not then you should be.

Gerald

özen özkaya

Jan 9, 2013, 10:37:14 AM
to beagl...@googlegroups.com
Thank you Gerald,

I don't know when a power-down will happen. All power-downs are unwanted and unpredictable, but they do happen. I tried to protect the SD cards by mounting the whole filesystem read-only.
Why do SD cards fail even though they are used with a read-only file system?
Do you have any explanation? I cannot determine the real cause of the corruption. If you have any suggestions, with reasons, that would be great.

Regards.

--
Özen Özkaya

Robert Nelson

Jan 9, 2013, 10:41:32 AM
to beagl...@googlegroups.com
On Wed, Jan 9, 2013 at 5:53 AM, Özen Özkaya <ozeno...@gmail.com> wrote:
> Hi,
>
> My beaglebone's sd card is corrupted on read only file system after 27 days
> of work.
> Here below you can see the error:
>
> [ 4.718902] mmcblk0: error -110 transferring data, sector 415849, nr 24,
> cmd response 0x900, card status 0x200b00
> [ 8.636077] mmcblk0: error -110 transferring data, sector 415872, nr 1,
> cmd response 0x900, card status 0x0
> [ 8.646362] end_request: I/O error, dev mmcblk0, sector 415872
> [ 12.546051] mmcblk0: error -110 transferring data, sector 415865, nr 8,
> cmd response 0x900, card status 0x200b00
> [ 16.453765] mmcblk0: error -110 transferring data, sector 415872, nr 1,
> cmd response 0x900, card status 0x0
> [ 16.464050] end_request: I/O error, dev mmcblk0, sector 415872
> [ 16.470703] Kernel panic - not syncing: Attempted to kill init!

Ah, this looks more like the SD card's controller/NAND is actually
shot, not necessarily just a corrupted file system.

Regards,

--
Robert Nelson
http://www.rcn-ee.com/

Andrew Bradford

Jan 9, 2013, 11:10:11 AM
to beagl...@googlegroups.com, robert...@gmail.com
I concur with Robert, that's not a file system issue, that's an SD card
internal issue.

Buy better SD cards. Samsung and SanDisk make nice ones in 4 and 8 GB
sizes for reasonable prices.

Can you provide the entire boot output to the serial port? Are you
sure you're using read only file systems?

-Andrew

Torkel M. Jodalen

Jan 9, 2013, 2:46:59 PM
to beagl...@googlegroups.com
Oh, these are indeed exactly the same errors that I got from my BBone. Tried several microSD card vendors, they all went down the drain with the same error after a month or so.
 
I didn't mount my cards read-only. All my I/O went to a RAM disk. I normally saw these error messages after close to a month of operation, ranging from 22 to 28 days of continuous power-on (no logged power supply failures during any of these periods).
 
Strange and undesired.
 
Just my two cents to shed light on things.

özen özkaya

Jan 10, 2013, 2:36:43 AM
to beagl...@googlegroups.com
This is a very bad situation. I think there is no real solution yet. I suspect a problem with the hardware design or with the usage scenario, and I will try to find a fix. If I find one, I will share it.

Regards 

--
For more options, visit http://beagleboard.org/discuss
 
 



--
Özen Özkaya

meino....@gmx.de

Jan 10, 2013, 2:46:23 AM
to beagl...@googlegroups.com
özen özkaya <ozeno...@gmail.com> [13-01-10 08:37]:
Hi

from the manpage of "mount"

-r, --read-only
Mount the filesystem read-only. A synonym is -o ro.

Note that, depending on the filesystem type, state and
kernel behavior, the system may still write to the
device. For example, Ext3 or ext4 will replay its
journal if the filesystem is dirty. To prevent this
kind of write access, you may want to mount ext3 or ext4
filesystem with "ro,noload" mount options or set the
block device to read-only mode, see command blockdev(8).

Is it possible that your system still writes to the filesystem and is
slowly but surely wearing out the NAND flash of the SD card,
so that this finally results in the errors?
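
A small sketch of how one might check this, assuming the card shows up as /dev/mmcblk* in /proc/mounts (on some images the root appears as /dev/root instead, which this would miss). The helper name `rw_sd_mounts` is hypothetical; `blockdev --getro` and `--setro` are the blockdev(8) options the man page excerpt points to.

```shell
# Print mount points that live on an SD/MMC block device and are still
# mounted read-write. Reads /proc/mounts unless another file is given.
rw_sd_mounts() {
    awk '$1 ~ /^\/dev\/mmcblk/ && $4 ~ /(^|,)rw(,|$)/ { print $2 }' "${1:-/proc/mounts}"
}

# Typical use on the board (the last command needs root):
#   rw_sd_mounts                    # empty output = nothing on the card is rw
#   blockdev --getro /dev/mmcblk0   # prints 1 if the device is marked read-only
#   blockdev --setro /dev/mmcblk0   # mark the block device read-only
```

Note that a filesystem listed as "ro" here can still have had its journal replayed at mount time, which is exactly the case the man page warns about.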

Just a shot in the dark...

Best regards,
mcc




Andrew Bradford

Jan 10, 2013, 9:15:35 AM
to beagl...@googlegroups.com, jon.k...@gmail.com
On Thu, 10 Jan 2013 04:22:37 -0800 (PST)
jon.k...@gmail.com wrote:

> > My beaglebone's sd card is corrupted on read only file system after
> > 27 days of work.
> > Do you have any suggestion?
>
> I'm up to my third SanDisk/KingMax SD card, and a friend of mine is
> up to his second as well. Seems the beagles just love killing sdcards.
>
> Not sure if it's a power supply issue or what. I notice on the same
> system I'm getting GPIO input bounce off open collector digital pulse
> sources, so perhaps the 5v regulated power supply I have them on has
> ripples or not enough surge capacity or something. I'm going to try
> my third SDcard one powered exclusively off a regulated USB power
> supply (it's not doing anything but network and reading a slow pulse
> output).
>
> But yeah, I've tried two different SDcard manufacturers, and the
> beagle just loves chewing through them. At best the beagle's are just
> really super sensitive to power supply quality, and at worst they
> have some sort of design flaw that chews cards (which would be a huge
> shame and pretty much rule them out from here on in for me as a valid
> dev platform...)

For anecdotes, I've been running exclusively SanDisk Ultra Mobile and
Samsung Plus microSD cards and have yet to have one die on me. I've
been running A3, A5, and A6 bones since November 2011, many of them
with syslog getting hit quite hard with no rate limit. Uptimes of a
week or more between power down/up is not uncommon (usually due to
physically having to move the bones and unplugging causes a clean
shutdown with short transfer to lithium battery).

All are running with custom 5V 2A DC sources, regulated down from 24V
AC-DC power supplies via a custom cape. Some also have lithium
batteries for clean shutdown when DC power goes away. Power might be
your problem, worth trying to fix that first.

-Andrew

Sid Boyce

Jan 10, 2013, 9:32:23 AM
to beagl...@googlegroups.com
On 10/01/13 12:22, jon.k...@gmail.com wrote:
> > My beaglebone's sd card is corrupted on read only file system after 27 days of work.
> > Do you have any suggestion?
>
> I'm up to my third SanDisk/KingMax SD card, and a friend of mine is up to his second as well. Seems the beagles just love killing sdcards.
>
> Not sure if it's a power supply issue or what. I notice on the same system I'm getting GPIO input bounce off open collector digital pulse sources, so perhaps the 5v regulated power supply I have them on has ripples or not enough surge capacity or something. I'm going to try my third SDcard one powered exclusively off a regulated USB power supply (it's not doing anything but network and reading a slow pulse output).
>
> But yeah, I've tried two different SDcard manufacturers, and the beagle just loves chewing through them. At best the beagle's are just really super sensitive to power supply quality, and at worst they have some sort of design flaw that chews cards (which would be a huge shame and pretty much rule them out from here on in for me as a valid dev platform...)
>
> Jon

I am using Ubuntu, and every time I run an upgrade I end up with a corrupt SD card. Kingston, SanDisk, all Class 10: the result is the same on the -XM and the BeagleBone.

Sometimes fsck fixes it but eventually they fail and have to be rebuilt.

No such problem on Pandaboard or ODROID-X that use full-size SD cards.
Regards
Sid.
-- 
Sid Boyce ... Hamradio License G3VBV, Licensed Private Pilot
Emeritus IBM/Amdahl Mainframes and Sun/Fujitsu Servers Tech Support
Senior Staff Specialist, Cricket Coach
Microsoft Windows Free Zone - Linux used for all Computing Tasks

Jon Kloske

Jan 10, 2013, 4:47:35 PM
to beagl...@googlegroups.com
> Power might be your problem, worth trying to fix that first.

Yep, on that assumption I'm already trying a different type of power supply, fed via the usb rather than power port.

These are A8 bones I believe.

Lots of variables to control! Will let you know how the new power supply goes, but of course it could be a month or two before I know :)

Jon

Gerald Coley

Jan 10, 2013, 4:55:12 PM
to beagl...@googlegroups.com
Well, I doubt they are A8, as A6A is the latest revision. From a power standpoint there is really little difference between USB and DC power, as they both go to the same point, other than the fact that USB power limits the current to 500mA and as such limits the processor speed to 600MHz. The PMIC uses internal FETs to select one versus the other to feed all the various regulators.


Gerald



Deimantas Žvirblis

Jan 11, 2013, 3:18:29 AM
to beagl...@googlegroups.com
I had the same problem, and I found a solution that helped me.

SD cards can't handle many write cycles.
For example, I was making INSERTs into my MySQL database several times per second. That was a bad idea. So I changed the MySQL table type to MEMORY (the table is in RAM), and when I need to save that data (e.g. before shutdown) I change the table type to, for example, MyISAM (the table is on the SD card).
The other thing is that you should put folders like var and tmp in RAM, because many programs use them, which leads to writes to the SD card, and we don't want that.
Solution:
echo "tmpfs                     /tmp            tmpfs   nodev,nosuid                            0   0" >> /etc/fstab
echo "tmpfs                     /var/log        tmpfs   nodev,nosuid                            0   0" >> /etc/fstab


With these changes my BeagleBone Rev. A6 (with Robert Nelson's Ubuntu 12.04) has been working for more than two months with a standard SD card (Kingston 4GB Class 4) without any crashes, and believe me, it performs a lot of work.
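
One caveat with appending to /etc/fstab via `echo ... >>` is that running the commands twice adds duplicate entries. A minimal sketch of an idempotent variant, using the same mount options as in this post; the function name `add_tmpfs_entry` and the optional second argument (an alternative fstab path, handy for trying it out safely) are assumptions for illustration.

```shell
# Append a tmpfs entry for mount point $1 to an fstab file ($2, default
# /etc/fstab) unless a tmpfs entry for that mount point is already there.
add_tmpfs_entry() {
    mountpoint=$1
    fstab=${2:-/etc/fstab}
    # awk exits 0 only if a matching "<dev> <mountpoint> tmpfs ..." line exists.
    if awk -v mp="$mountpoint" \
        '$2 == mp && $3 == "tmpfs" { found = 1 } END { exit !found }' "$fstab"; then
        return 0
    fi
    printf 'tmpfs\t%s\ttmpfs\tnodev,nosuid\t0\t0\n' "$mountpoint" >> "$fstab"
}

# As root on the board:
#   add_tmpfs_entry /tmp
#   add_tmpfs_entry /var/log
```

Keep in mind that anything under a tmpfs mount, including /var/log, is lost on reboot or power loss.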

Deimantas Žvirblis

Jan 11, 2013, 3:22:29 AM
to beagl...@googlegroups.com

vjvargas

Jan 11, 2013, 9:17:33 AM
to beagl...@googlegroups.com
Great information. A while ago I posted in another thread about a similar situation with some BBxMs I had working remotely, running a web server and a MySQL service. Due to the heavy read/write load on the SD cards, I had to replace them about every 5 days. The solution I followed was to move the MySQL DB to an external USB hard drive, which improved the whole system, and I haven't had to change the SD card for about 4 months. Now I'm working on porting the project from the BBxM to a BB revC5 with NAND, which leads to no corruption at all, thanks to the filesystem format used on NAND memories :)

Gerald Coley

Feb 5, 2013, 9:38:39 AM
to beagl...@googlegroups.com
Yep. Works great.

Gerald

On Tue, Feb 5, 2013 at 8:33 AM, SKiAt <thes...@gmail.com> wrote:
> Good! Keep us updated on this.
>
> By the way, I found that the CircuitCo guys made a "cape" to have NAND mounted atop the BB:
> http://circuitco.com/support/index.php?title=BeagleBone_4Gb_16-Bit_NAND_Module, but there is a note on the reseller's site saying something like: "no driver, development purpose only".
>
> Has anyone tried it out?
>
> bye
> Luca
---
You received this message because you are subscribed to the Google Groups "BeagleBoard" group.
To unsubscribe from this group and stop receiving emails from it, send an email to beagleboard...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
 
 

SKiAt

Feb 5, 2013, 11:20:45 AM
to beagl...@googlegroups.com
Wow, this could be a definitive solution for me. Can I ask you for some hints, Gerald?

Did you find a manual on how to prepare the NAND to boot with Angstrom?
What do they mean by "Note: There's currently no software support for BeagleBone Memory Expansion Cape. This cape is sold for development purposes only."?

thanks in advance,
Luca

mickeyf

Feb 5, 2013, 11:52:58 AM
to beagl...@googlegroups.com
Someone educate me on this please - isn't NAND what is used in SD cards, and won't any NAND suffer the same concerns with wear levelling, bad block management, and number of write cycles?

You may be moving the direct control out of the SD card's built in controller to somewhere else, but ultimately the issues remain and must be addressed, yes?

Just wondering.

David Goodenough

Feb 5, 2013, 12:34:57 PM
to beagl...@googlegroups.com

If it helps, I have been running small embedded systems using MIDE modules
and CF cards for over eleven years with very few failures. And I did not
even use a filesystem that is flash friendly (I used ext3). The filesystem
was mounted rw and I am using Debian Linux with normal logging enabled to
the MIDE/CF cards.

All these tales of short life seem to be worst cases.

David

Luca Marchesi

Feb 5, 2013, 1:18:53 PM
to beagl...@googlegroups.com
Yes David,
but Compact Flash technology is very different from SD or NAND integrated chips.
I too used to run systems on CF for a long time with almost no storage-related problems.

Here I have 3 BeagleBone A6 boards, each with a Kingston Class 10 4GB SD card, running the same Angstrom version that ships out of the box, except for some packages I removed, such as cloud9 and the graphical stuff.
My app does very few disk writes, and the uptime is about 85 days (that is about how long ago I first switched them on). Now several files all around the filesystem are inaccessible and I get messages like "EXT4 i/o error.." even if I just try to "ls" them. I tried to reboot one of the boards and it never came up again.
In the office I have a 4th board, installed at about the same time as the others; the only difference is the uptime. Because it is the development board, I reboot it from time to time.
Now my thought is that maybe at boot time an implicit and silent fsck repairs errors in the partition from time to time, and the result is a well-running system... uhm..

I have just started a disk stress test on 2 boards with fresh installations on new SD cards; one has a cron job to reboot every 2 days, the other does not. Let's see what happens.

keep in touch
Luca
--

Toan Pham

Feb 5, 2013, 1:58:01 PM
to beagl...@googlegroups.com

I have also had many issues with SD cards, even though they are used in RO mode. We have tried at least 5 different brands, and they are all equally unreliable compared to CF cards. By next week, I should get a new industrial-grade micro-SD card made by ATP to test. Supposedly, ATP's micro-SD cards use Multi-Level Cell (MLC) technology to make them more reliable than other SD cards. I'll update you guys on the reliability of ATP's SD cards in about 10 days. I hope their SD cards will hold up; otherwise, I really need to move to NAND MTD.

Andrew Bradford

Feb 5, 2013, 2:44:49 PM
to beagl...@googlegroups.com, tpha...@gmail.com
Aren't SLC (single level cell) flash more durable than MLC? You can
use MLC as SLC, it just holds less data but should be more durable. At
least that was my understanding.

EEtimes seems to agree with me [1].

[1]:http://www.eetimes.com/General/PrintView/4390427

-Andrew

Andrew Bradford

Feb 5, 2013, 2:51:15 PM
to beagl...@googlegroups.com, mic...@thesweetoasis.com
On Tue, 5 Feb 2013 08:52:58 -0800 (PST)
mickeyf <mic...@thesweetoasis.com> wrote:

> Someone educate me on this please - isn't NAND what is used in SD
> cards, and won't any NAND suffer the same concerns with wear
> levelling, bad black management, and number of write cycles?

Yes, but if you let Linux (or anything with half a wear leveling
algorithm in it) write in a civilized manner, each erase block will get
fewer writes and hence last longer.

Read Arnd's LWN column [1] to understand garbage collection and open
erase blocks.

[1]:https://lwn.net/Articles/428584/

> You may be moving the direct control out of the SD card's built in
> controller to somewhere else, but ultimately the issues remain and
> must be addressed, yes?

Yes, but SD card controllers are built to a dollar amount, not for
performance. Kingston's controllers are usually junk. SanDisk and
Samsung usually have better controllers. Having direct control of the
flash from Linux should give the best results as Linux has quite a lot
of work done on it to provide robust operation with flash file systems.

CF might be better than SD simply due to the use cases, such as pro
cameras and video, whereas SD is more targeted at consumers where
beating the living daylights out of their gear isn't the common
use-case. CF also has a long history of being expensive. SD doesn't.

-Andrew

SKiAt

Mar 5, 2013, 10:58:10 AM
to beagl...@googlegroups.com
Hi guys,
I think it's time to report to you the results of the test I mentioned in my last post.
=> 2 boards with the same SD cards installed, one of them with a scheduled reboot, and both running a stress program found here: http://weather.ou.edu/~apw/projects/stress/

Hence:
after one week:
- the board with the periodic reboot was OK (I kept it on and keep monitoring it).
- the always-on board was inaccessible over ssh, but the network was somehow alive (ping OK).
I connected the USB cable to get to the serial console; the login prompt was shown but there was no way to log in, so no way to do a soft reboot. Still connected to serial, I did a hard reboot. The bootloader failed.
I took out the SD card and tried to read it in a card reader on my laptop. The partitions were there, but a large number of files were inaccessible, as expected. I ran fsck several times (20 or more) and some errors were corrected, but every time the fs was still corrupted at the end (according to fsck).
I prepared a new SD card with a homemade partition table using ext3 instead of ext4. I copied all the content from the original ext4 partition to the new one, creating a tar file to avoid symbolic link loops, and placed the new ext3 SD card into the original board.

Now it has been up and running without problems for more than 20 days, with the same disk stress process writing massively to the SD card!

So my solution is to use ext3. The cause is either a misbehavior of ext4 with SD cards, or the partition table provided from the factory is not correct; I don't know.
Up to you.

byebye


p.s. the other board, with ext4 and the automatic reboot, becomes inaccessible from time to time between 2 reboots.. so that is not a solution

SKiAt

Mar 14, 2013, 12:08:10 PM
to beagl...@googlegroups.com
People,
I unfortunately have to withdraw the above.

A few days after I proudly wrote the report of my tests, the SD filesystem started to get corrupted somewhere and I was no longer able to connect with ssh. A manual reboot from the serial console caused the kernel to panic because it could not load some library after remounting the rootfs.

No way!
We are still looking for a solution! :(

özen özkaya

Mar 14, 2013, 12:45:07 PM
to beagl...@googlegroups.com
Don't use damn Kingston cards. Buy a fresh SanDisk Class 4 or Class 6.
Then mount your filesystem read-only. If you need to keep some files temporarily, use a ramdisk.
I promise you cannot corrupt your SD card in this scenario.

Regards
Ozkaya

--
Özen Özkaya


 

Toan Pham

Mar 14, 2013, 12:50:11 PM
to beagl...@googlegroups.com

> After a few days after a proudly wrote the report of my tests the sd filesystem started to be corrupted somewhere and I was no more able to connect with ssh. A manual reboot from serial console caused the kernel to panic because of impossibility to load some library after remounted the rootfs.
>
> No way!
> We still look for a solution! :(


SkiAt,

I just want to share my experience with you. Using ext3 on a micro-SD flash card is a bad idea because of the journaling in the ext3/ext4 filesystems. It will tend to wear out and kill your SD flash even quicker.

The approach I am taking is to use a FAT filesystem as the root partition of the SD card. I then mount the root filesystem from a squashfs archive. Next, I overlay the root filesystem with a tmpfs using the advanced union filesystem (aufs). Now I have an OS that does not write to the flash card but to RAM. This is by far the best approach I can think of to minimize flash errors. I do still get flash errors, but I believe they occur at the MMC bus level (signal level), not in the memory chip itself.
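
Roughly, the layering described above could look like the following. This is only a sketch under assumptions: the image name, the /ro, /rw and /newroot directories and the function name are placeholders, aufs has to be available in the kernel (it is an out-of-tree filesystem), and on a real system these mounts would normally run from an initramfs before switching root. The function only prints the commands so they can be reviewed first.

```shell
# Print (do not run) a mount sequence for a read-only squashfs root with a
# tmpfs write layer on top, joined by aufs. All paths are placeholders.
print_union_root_plan() {
    image=${1:-/boot/rootfs.squashfs}   # hypothetical image on the FAT partition
    cat <<EOF
mkdir -p /ro /rw /newroot
mount -t squashfs -o ro,loop $image /ro
mount -t tmpfs tmpfs /rw
mount -t aufs -o br=/rw=rw:/ro=ro none /newroot
EOF
}

# Review the plan first:
#   print_union_root_plan /boot/rootfs.squashfs
```

With this layout every write lands in the tmpfs branch, so it disappears at power-off and the card itself is only ever read.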

NeonJohn

Mar 17, 2013, 4:18:22 PM
to beagl...@googlegroups.com


On 03/17/2013 07:13 AM, jon.k...@gmail.com wrote:

> Aaaanyway, quite a shame the beagle keeps chewing cards. I've tried three
> brands (including sandisk) and they all last about as long as each other.
> IMHO this is a pretty big problem and may see me jump ship to pi's or the
> duinos. :(

I jumped ship the other way, from the Arduinos to the BBone because I
needed the raw horsepower.

I think that 3.8 or whatever the stable revision number ends up being is
going to be much better than that. I'm working with 3.8 now. I can pop
out the card from a running Bone, take it back to my office and do
whatever and when I return to the lab, the Bone is still running.
They've apparently grabbed the 3rd LED to indicate RAMdisk activity
because I see that LED flicker a lot but very little activity on the SD LED.

John


--
John DeArmond
Tellico Plains, Occupied TN
http://www.fluxeon.com <-- THE source for induction heaters
http://www.neon-john.com <-- email from here
http://www.johndearmond.com <-- Best damned Blog on the net
PGP key: wwwkeys.pgp.net: BCB68D77

Andrew Bradford

Mar 18, 2013, 4:22:25 PM
to beagl...@googlegroups.com, jon.k...@gmail.com
On Sun, 17 Mar 2013 04:13:00 -0700 (PDT)
jon.k...@gmail.com wrote:

> Gah, yes, sorry, A6 is indeed the revision. My memory is apparently
> as good as the SD cards :) Just had my fourth one die. I tried a very
> well regulated USB power supply for this one, so definitely not power
> supply issues. They seem to last one to two months running 24/7 and
> then pack it in. I suspect they actually die after a month, but aside
> from noticing lots of interactive stalls on the console and in other
> parts of the system which seem to correct themselves fine after 2-5
> seconds, they stay up until the power is removed :) Then they just
> flat out fail to boot, and trying to read the SD card from the laptop
> fails with a device not ready error about 21% into the read!
>
> Aaaanyway, quite a shame the beagle keeps chewing cards. I've tried
> three brands (including sandisk) and they all last about as long as
> each other. IMHO this is a pretty big problem and may see me jump
> ship to pi's or the duinos. :(

What exactly is the workload you're subjecting the SD cards to?

I strongly suspect that the way you're using the cards is what's
hurting them, if you have good power and aren't doing unclean shutdowns.
Moving to some other dev kit won't change this.

-Andrew

Luca Marchesi

Mar 18, 2013, 6:16:03 PM
to beagl...@googlegroups.com
Andrew,
In my case it is just a matter of bringing up Linux with my application, which uses
a couple of serial ports and some IP sockets,
and after some time the filesystem goes..

The last test I'm going to try is to mount the rootfs from a USB device (an
industrial-grade CompactFlash card). Let's see what happens..

bye
keep in touch

Jon Kloske

Mar 18, 2013, 8:44:09 PM
to Andrew Bradford, beagl...@googlegroups.com
Hi Andrew,

Steps to reproduce:

1. install image on sd card that comes with the bone.

2. change timezone and localtime to match where i live

3. add init.d script and rc.d to on startup change one of the gpio pins to interrupt input

4. add a folder to /var containing a script that every 5 mins hits up proc to see how many interrupts the gpio pin has had. add a link to this script to cron.d.

5. save the new number in one of the many ram backed filesystem locations.

6. use curl to upload the difference between the current and last count to a web service.

7. wait a month or two.

So... yeah. Aside from about 5 nonvolatile changes to the fs during initial system setup, nothing should be touching the flash card at all.
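
Steps 4 to 6 could be sketched roughly like this. The helper names, the example IRQ label and the state file path are assumptions; the layout of /proc/interrupts (IRQ number, one count column per CPU, then descriptive fields ending in the handler name) is the usual one but varies a little between kernels.

```shell
# Print the total interrupt count for the /proc/interrupts line whose last
# field equals $1. Reads /proc/interrupts unless another file is given as $2.
irq_count() {
    awk -v name="$1" '
        $NF == name {
            total = 0
            # Count columns follow the "NN:" field; stop at the first non-number.
            for (i = 2; i <= NF && $i ~ /^[0-9]+$/; i++) total += $i
            print total
            exit
        }
    ' "${2:-/proc/interrupts}"
}

# Print how far the count ($1) has moved since the previous call, keeping
# the previous value in a state file ($2) that should live on tmpfs.
irq_delta() {
    now=$1
    state=$2
    prev=0
    if [ -f "$state" ]; then
        prev=$(cat "$state")
    fi
    echo "$now" > "$state"
    echo $((now - prev))
}

# From cron, every 5 minutes (label and path are examples only):
#   delta=$(irq_delta "$(irq_count gpiolib)" /dev/shm/pulse.last)
#   ...then upload "$delta" with curl
```

Keeping the state file under /dev/shm (or another tmpfs mount) means this bookkeeping never touches the card.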

As an aside, the gpio pin is attached to an open collector pulse power monitor and I get really bad bounce on the beagle io pins. I've never seen bounce on digital pulse interfaces before... yet another annoyance with the bones...

Regards,
Jon

Andrew Bradford

Mar 19, 2013, 8:03:36 AM
to beagl...@googlegroups.com, jon.k...@gmail.com
On Tue, 19 Mar 2013 10:44:09 +1000
Jon Kloske <jon.k...@gmail.com> wrote:

> Steps to reproduce:
>
> 1. install image on sd card that comes with the bone.

Which other manufacturer and models of SD cards have you observed this
on?

Have you tried SanDisk Mobile Ultra 4 GB cards or Samsung Plus 8 GB
cards? Do you see the same results over time?

> 2. change timezone and localtime to match where i live
>
> 3. add init.d script and rc.d to on startup change one of the gpio
> pins to interrupt input
>
> 4. add a folder to /var containing a script that every 5 mins hits up
> proc to see how many interrupts the gpio pin has had. add a link to
> this script to cron.d.
>
> 5. save the new number in one of the many ram backed filesystem
> locations.

Which of "many" ram backed locations? Are you sure it's tmpfs or a
ramdisk?

Do you have swap enabled? (sorry, I don't run the stock Angstrom that
comes with bones)
If so, disable it. Even setting your swappiness properly can still end
up writing to swap when you don't expect or want (on an SD card you
_NEVER_ want to write swap).
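
A quick way to check, as a sketch: /proc/swaps lists the active swap areas after a one-line header. The helper name `active_swap_devices` is hypothetical.

```shell
# Print the device or file of every active swap area.
# Reads /proc/swaps unless another file is given; the first line is a header.
active_swap_devices() {
    awk 'NR > 1 { print $1 }' "${1:-/proc/swaps}"
}

# On the board:
#   active_swap_devices    # no output = no swap in use
#   swapoff -a             # as root: turn off all active swap
```

To keep swap off across reboots, any swap line in /etc/fstab would also have to go.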

> 6. use curl to upload the difference between the current and last
> count to a web service.

Are you sure this isn't writing logs somewhere?

> 7. wait a month or two.
>
> So... yeah. Aside from about 5 nonvolatile changes to the fs during
> initial systemsetup, nothing should be touching the flash card at all.
>
> As an aside, the gpio pin is attached to an open collector pulse
> power monitor and I get really bad bounce on the beagle io pins. I've
> never seen bounce on digital pulse interfaces before... yet another
> annoyance with the bones...

Regarding GPIO bounce, the GPIOs are reasonably quick on am335x, do you
see bounce when observing them with a scope? I believe the GPIO
subsystem is on a 100 MHz clock that rotates between each bank (check
TRM for sure) so 25 MHz input is possible (although the timing would
have to line up nicely to avoid aliasing). Sorry, I'm not much help
here other than to suggest you put some hardware filtering on there or
debounce in software.
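
For the software side, a minimal debounce sketch. It assumes edge timestamps in milliseconds are available one per line in ascending order, which is not how the setup in this thread works (that one reads a running count from /proc), so treat the input format and the name `debounced_count` as illustrative.

```shell
# Count pulses from edge timestamps on stdin (milliseconds, one per line,
# ascending). An edge is ignored if it arrives less than $1 ms after the
# last edge that was counted.
debounced_count() {
    awk -v gap="$1" '
        NR == 1 || $1 - last >= gap { n++; last = $1 }
        END { print n + 0 }
    '
}

# Example: a burst at 0-2 ms, another at 500-501 ms, a clean edge at 1000 ms
#   printf '%s\n' 0 1 2 500 501 1000 | debounced_count 50    # prints 3
```

The gap should be longer than the longest bounce burst but shorter than the shortest real pulse interval.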

-Andrew

Jason Kridner

Mar 19, 2013, 8:26:40 AM
to beagl...@googlegroups.com, jon.k...@gmail.com

Can you keep the rootfs marked as read-only? I see very little that could be influenced by the hardware here and advise you to test your use case well on any platform, since the Bone isn't likely causing your issue.

> >
> > As an aside, the gpio pin is attached to an open collector pulse
> > power monitor and I get really bad bounce on the beagle io pins. I've
> > never seen bounce on digital pulse interfaces before... yet another
> > annoyance with the bones...
>
> Regarding GPIO bounce, the GPIOs are reasonably quick on am335x, do you
> see bounce when observing them with a scope?  I believe the GPIO
> subsystem is on a 100 MHz clock that rotates between each bank (check
> TRM for sure) so 25 MHz input is possible (although the timing would
> have to line up nicely to avoid aliasing).  Sorry, I'm not much help
> here other than to suggest you put some hardware filtering on there or
> debounce in software.

The slew rate on the digital outputs is programmable. Try setting the lower slew rate.

>
> -Andrew



Andrew Bradford

Mar 19, 2013, 10:10:06 AM
to beagl...@googlegroups.com, jon.k...@gmail.com
On Tue, 19 Mar 2013 05:47:11 -0700 (PDT)
jon.k...@gmail.com wrote:

> Regarding GPIO bounce, the GPIOs are reasonably quick on am335x, do
> you
>
> > see bounce when observing them with a scope? I believe the GPIO
> > subsystem is on a 100 MHz clock that rotates between each bank
> > (check TRM for sure) so 25 MHz input is possible (although the
> > timing would have to line up nicely to avoid aliasing). Sorry, I'm
> > not much help here other than to suggest you put some hardware
> > filtering on there or debounce in software.
> >
>
> Oh, I meant to say on this front as well - the bounce gets
> worse ...MUCH worse... once the sdcard starts to fail... doesn't seem
> to noticeably bounce other than maybe once or twice in the first
> month, and then you can tell when the card has failed because
> suddenly the bounce rate goes through the roof - you'll get massive
> bounces 5 or 6 times a day.
>
> I should have also mentioned that - when I say bounce, it's not all
> the time. Just really occasionally at first - you'll get normal clean
> interrupts for 100 hours or more, and then maybe an extra 50 or 100
> interrupts on a single level change... then back to normal again for
> days or weeks. Then, when the card fails, you'll get them all over
> the place, but again, plenty of normal interrupts in between.

How exactly are you interfacing to this external device? And what
exactly is the device? It sounds like some kind of power meter or
similar?

-Andrew

Andrew Bradford

Mar 19, 2013, 10:14:57 AM
to beagl...@googlegroups.com, jon.k...@gmail.com
On Tue, 19 Mar 2013 05:40:51 -0700 (PDT)
jon.k...@gmail.com wrote:

> > Which other manufacturer and models of SD cards have you observed
> > this on?
> >
>
> Sandisk class 2 SDHC SDSDQ-4096-P36M and Kingmax 4GB SDMC4KM-C4
> class 4, and whatever originally came with the bone. I can't remember
> what brand it was, but I believe it was a third brand. Sadly, I've
> thrown it away months ago so I can't tell.

Kingmax and Kingston cards are all crap. If they fail, that's kind of
expected operation. The low end SanDisk stuff is questionable, too,
especially if you're buying from anywhere that's not a reputable
retailer of SanDisk (ie: cards bought on eBay would be not reputable).

Buy some Samsung Plus series 8 GB uSD cards from a reputable location,
like NewEgg or Amazon (assuming those are available where you live).

I'm also a fan of SanDisk Mobile Ultra 4 GB class 6 uSD cards but the
Samsung should (it doesn't, but it *should*) have better performance
based on the way the controller inside works.

-Andrew

Ed of the Mountain

unread,
Mar 19, 2013, 1:41:13 PM3/19/13
to beagl...@googlegroups.com

I just want to share with you my experience.  Using Ext3 on a micro-sd flash is a bad idea b/c of journaling support in ext3/ext4 filesystem.  It will tend to wear and kill your sd flash even quicker. 

The approach I am taking is to use a FAT filesystem as the root partition of the SD card. I then mount the root filesystem from a squashfs archive. Next, I overlay the root filesystem with the advanced union filesystem (aufs) backed by a tmpfs. Now I have an OS that writes to RAM rather than to the flash card. This is by far the best approach I can think of to minimize flash errors. I still get flash errors, but I believe they are at the MMC bus level (signal level), not actually in the memory chip.

The squashfs solution sounds like the best solution.  Basically it runs like a LiveCD with nothing to corrupt.

Can you please elaborate on the steps you took to accomplish this?

-Ed 

Andrew Bradford

unread,
Mar 19, 2013, 8:52:40 PM3/19/13
to Jon Kloske, beagl...@googlegroups.com
On Wed, 20 Mar 2013 09:08:59 +1000
Jon Kloske <jon.k...@gmail.com> wrote:

>
>
> On 20/03/2013, at 12:14 AM, Andrew Bradford
> I don't buy memory cards off ebay :) Too many people I know get
> rubbish that way. We have a local supplier with small margins anyway,
> so it's not usually a price issue going the local retailer way.
>
> Next time it fails I'll try one of the more expensive cards if you
> like, though if this turns out to be the answer I'd be a little
> irritated that the cards that come with the devices when you buy them
> are rubbish and fail within a month.

Well, if you read the SD spec, using an SD card as a root file system
isn't really one of the designed use cases. The main goal is for
recording still images or video in very specific ways. Random access,
especially lots of small writes, isn't something SD was meant to do.
It just so happens that the bus is quite simple to implement for single
board computers and the memory itself can be found for < $1 / GB so it's
become popular.

The cards that come with Beagles are crap. All of the reasonably low
priced Kingston cards are crap (I've not seen much testing of the
expensive ones so I reserve judgment on those). Kingston cards ship
with Beagles, in my opinion, because of the pricing and availability,
not because of the quality.

There's the old adage:
1. Fast
2. Cheap
3. Good
Pick two.

Rasp Pi doesn't come with an SD card (at least not in the USA last I
checked). Many other single board computers, if they do come with an
SD card, also come with Kingston cards. Going to some other board
isn't going to help here. Spend the $10 (USD) on a decent uSD card,
it's worth it regardless of what use or board you have.

If you'd like, send me a self addressed stamped envelope and I'll mail
you a bunch of crap Kingston cards. I have at least 15 in a box under
my desk at work and I have no use for them.

-Andrew

Jon Kloske

unread,
Mar 19, 2013, 7:08:59 PM3/19/13
to Andrew Bradford, beagl...@googlegroups.com
I don't buy memory cards off ebay :) Too many people I know get rubbish that way. We have a local supplier with small margins anyway, so it's not usually a price issue going the local retailer way.

Next time it fails I'll try one of the more expensive cards if you like, though if this turns out to be the answer I'd be a little irritated that the cards that come with the devices when you buy them are rubbish and fail within a month.

(Actually if that were the case our local supplier ought to be worried as Australia has some pretty good consumer protections which I would think would kick in if suddenly we all decided to put the sd card issue to them... and I do know others affected; it's not just me!)

Cheers,
Jon

Jon Kloske

unread,
Mar 19, 2013, 7:09:50 PM3/19/13
to Andrew Bradford, beagl...@googlegroups.com
How exactly are you interfacing to this external device?  And what
exactly is the device?  It sounds like some kind of power meter or
similar?

Yep it's a power meter with an open collector pulse output (90ms width). I have the gpio configured for internal pullup and then I have the gpio pin wired through the pulse output open collector and a nominal resistor (not large.. maybe 270ohms? can't remember exactly) to ground.

Jon

Jon Kloske

unread,
Mar 20, 2013, 7:50:38 AM3/20/13
to Andrew Bradford, beagl...@googlegroups.com

> Well, if you read the SD spec, using an SD card as a root file system
> isn't really one of the designed use cases. The main goal is for
> recording still images or video in very specific ways. Random access,
> especially lots of small writes, isn't something SD was meant to do.
> It just so happens that the bus is quite simple to implement for single
> board computers and the memory itself can be found for < $1 / GB so it's
> become popular.

Understood, so I'd imagine these board designers would be doing everything to ensure the software was set up in such a way as to reduce the chance the device would fail in 30 days.

If they aren't, then you can't just say "yes well their product destroys hardware but to be fair that hardware wasn't designed with their product in mind" - the beagle makers made a design decision to use that hardware; either make it work or make it clear you're going to be feeding them 12 cards a year and that they better keep good backups.


> The cards that come with Beagles are crap.

Again, that may well be true, and buying expensive cards may somehow let you do what the spec you referred to implies they weren't designed to do, but regardless of that, the problem is either the hardware or the software and both came with the device in the default config, so....

Anyway, I assume these things are failing due to excessive (compared to their intended use as camera storage) writes, in which case what on earth is writing to the cards? I saw others commenting that ext4 is possibly to blame, but I'd say something is still hitting the disk unintentionally...

I'll ignore the rest of your comments because they were largely and uncharacteristically unhelpful.

Jon

Toan Pham

unread,
Apr 19, 2013, 11:17:14 AM4/19/13
to beagl...@googlegroups.com, mcas...@gmail.com
>Not sure that it is of interest to anyone, but the 2 cards have now written over 2 billion blocks

First of all, how big is each block? You should set the write block
size to the size of a filesystem cluster. That way you do not have
fragmentation and, with every write, you're guaranteed to modify
the entire block array.

Second, make sure that your write operation flushes to the SD card so
that modified data won't live in the system cache.
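
As a rough sketch of what flushing through to the card can look like from
an application, assuming Python and a hypothetical `append_and_flush`
helper (neither is from the original posts): `flush()` empties the
program's own buffer and `os.fsync()` asks the kernel to write the file's
data to the device. What the card's controller does with it afterwards is
still up to the card.

```python
import os


def append_and_flush(path, record):
    """Append one record and ask the kernel to push it to the storage device."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(record + "\n")
        f.flush()             # empty Python's userspace buffer
        os.fsync(f.fileno())  # ask the kernel to write the file's data out
```

Calling something like `append_and_flush("/media/card/test.log", "sample")`
(the path is made up) on every write trades throughput and card wear for
less data sitting in the page cache.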

Whatever you do, a memory cell usually fails at 200,000 writes. Figure
out the size of your SD card, the sector size, and the cluster size;
basic math will tell you how many cluster writes you will need to
perform until failure.

Andrew Bradford

unread,
Apr 22, 2013, 8:47:22 AM4/22/13
to beagl...@googlegroups.com, mcas...@gmail.com
On Sun, 21 Apr 2013 13:01:03 -0700 (PDT)
mcas...@gmail.com wrote:

> Thanks for the feedback.
>
> I do not know the size of the blocks. What I was looking to find out
> was if there really was wear leveling on these cards (as they fail
> after a few weeks, I wanted to check this).

Wear leveling on the card is very hard to detect, in my experience.
It's up to the controller embedded in the card on how the writes take
place physically.

> There should not be any fragmentation anyway as I am writing the same
> file over and over.

Don't be sure of this, the controller may do a read-copy-write
operation and that may involve garbage collection on the flash itself,
depending on the erase block size and the size of your write. Your
file system may not be fragmented but the data on flash may be.

On flash, you can't simply overwrite a physical erase block, you have to
fully erase it first so usually a controller will copy the existing
good bits out of an erase block, append the new data, and write it
somewhere else. The old erase block now can be erased.
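
A back-of-the-envelope sketch of why this matters, under a deliberately
pessimistic assumption that every write forces each erase block it
touches to be rewritten in full, and that the write starts on an erase
block boundary. The function names and the 4 KiB / 4 MiB figures are
illustrative assumptions, not measurements of any card.

```python
def physical_bytes_written(write_bytes, erase_block_bytes):
    """Worst-case bytes programmed if each touched erase block is rewritten whole."""
    blocks_touched = -(-write_bytes // erase_block_bytes)  # ceiling division
    return blocks_touched * erase_block_bytes


def write_amplification(write_bytes, erase_block_bytes):
    """Ratio of bytes physically programmed to bytes the host asked to write."""
    return physical_bytes_written(write_bytes, erase_block_bytes) / write_bytes


KIB = 1024
MIB = 1024 * KIB

# Assumed values: a 4 KiB update landing in a 4 MiB erase block.
print(write_amplification(4 * KIB, 4 * MIB))  # 1024.0
```

Real controllers are usually smarter than this worst case, but it shows
how a small logical write can cost far more than its own size in flash
wear.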

See Arnd's article in lwn [1].

[1]:https://lwn.net/Articles/428584/

> I used sync to mount the card in order to make it gets written every
> time. Before doing so, iostat would increment only every 5 sec, when
> the cache was flushed to the card. After that, iostat would
> increment continuously.

Watch out for wear increasing with sync mounts. You're now writing much
more often when not relying on the kernel buffers. Yes, it's safer in
the sense that if power goes away the card has the most right data on
it, but now you're doing writes (and hence the read-copy-write / garbage
collection above) potentially much more often.

If you want SD/MMC flash cards that do wear leveling, sign an NDA with
the card vendor and get the data sheet / papers on how they do it. Or
buy some really tiny probes and probe a raw die to see what it's doing
(definitely possible but probably not cheap).

-Andrew

Andrew Bradford

unread,
Apr 22, 2013, 2:03:48 PM4/22/13
to beagl...@googlegroups.com, mcas...@gmail.com
On Mon, 22 Apr 2013 02:11:20 -0700 (PDT)
mcas...@gmail.com wrote:

> One of the 2 cards has failed after a bit more than 2 billion block
> writes which corresponds to writing the same file 21 million times.
> My block size (reported by fdisk -l) is 512 bytes and the card size
> is 4GB with at least 2.8GB free.

fdisk is wrong.

Usual erase block sizes for 4 GB SD cards are 1 to 8 MB. Cards usually
have hundreds or thousands of erase blocks, at most.

Read Arnd's lwn article [1]. Use flashbench [2] to determine what your
card has for erase block size.

[1]:https://lwn.net/Articles/428584/
[2]:https://git.linaro.org/gitweb?p=people/arnd/flashbench.git;a=summary

> Am I wrong to assume the following:
> (2.8GB / 512) x 200 000 = max number of block writes (1 094 billions)

See above. You're also not accounting for read-copy-write that might
take place depending on what data already exists at the physical erase
block the card decides to write to (possibly prompting garbage
collection which may have more writes than you expect).
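
To put rough numbers on the difference, here is the same arithmetic done
per 512-byte sector and per erase block. The 2.8 GB, 512 bytes and
200,000 cycles are the figures quoted in this thread, taken as-is; the
erase block sizes are the 1 to 8 MB range mentioned above, treated here
as decimal megabytes. This ignores read-copy-write and garbage
collection entirely, so it is an upper bound on an already optimistic
model.

```python
FREE_BYTES = 2_800_000_000   # "at least 2.8GB free", from the quoted post
SECTOR_BYTES = 512           # what fdisk -l reports
CYCLES_PER_CELL = 200_000    # endurance figure quoted earlier in the thread


def naive_sector_writes():
    """The quoted estimate: every 512-byte sector wears independently."""
    return FREE_BYTES // SECTOR_BYTES * CYCLES_PER_CELL


def erase_budget(erase_block_bytes):
    """Total erase cycles available if wear is counted per erase block."""
    return FREE_BYTES // erase_block_bytes * CYCLES_PER_CELL


print(naive_sector_writes())  # 1093750000000
for megabytes in (1, 4, 8):
    print(megabytes, erase_budget(megabytes * 1_000_000))
# 1 560000000
# 4 140000000
# 8 70000000
```

On these assumptions the budget is tens to hundreds of millions of erase
cycles, roughly three to four orders of magnitude below the sector-based
figure.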

> If the above is correct, the card has failed a lot earlier than it
> can be expected. Hence something else is killing these cards or the
> wear leveling is not working properly.

Most of the cheap cards have poor, if any, wear leveling routines.

-Andrew

Britton Kerin

unread,
Apr 22, 2013, 3:07:56 PM4/22/13
to beagl...@googlegroups.com
On Mon, Apr 22, 2013 at 10:03 AM, Andrew Bradford
<and...@bradfordembedded.com> wrote:
> On Mon, 22 Apr 2013 02:11:20 -0700 (PDT)
> mcas...@gmail.com wrote:
>
>> One of the 2 cards has failed after a bit more than 2 billion block
>> writes which corresponds to writing the same file 21 million times.
>> My block size (reported by fdisk -l) is 512 bytes and the card size
>> is 4GB with at least 2.8GB free.
>
> fdisk is wrong.

[snip]

>> If the above is correct, the card has failed a lot earlier than it
>> can be expected. Hence something else is killing these cards or the
>> wear leveling is not working properly.
>
> Most of the cheap cards have poor, if any, wear leveling routines.

I'm probably wrestling with the same issues. 2/3 of my uSD cards now
cannot be backed up or restored (using dd if=/dev/mmcblk0 etc.); one
doesn't seem to boot, the other hangs the command line on the bone for
long periods. My application writes small files pretty often.

This page:
http://electronics.stackexchange.com/questions/27619/is-it-true-that-a-sd-mmc-card-does-wear-levelling-with-its-own-controller
contains comments pointing out that wear leveling and power cycle tolerance
are conflicting goals, and probably even fewer cards get both right.

And there I was thinking all this mess was sorted out inside the little
plastic wafer.

Andrew, do you happen to be aware of anyone who sells SD cards that *do*
have good leveling/power cycle tolerance and *are* believed to work well
with the bone? I'd be happy to pay.

Thanks,
Britton

Andrew Bradford

unread,
Apr 22, 2013, 3:40:00 PM4/22/13
to beagl...@googlegroups.com, britto...@gmail.com
On Mon, 22 Apr 2013 11:07:56 -0800
Britton Kerin <britto...@gmail.com> wrote:

> On Mon, Apr 22, 2013 at 10:03 AM, Andrew Bradford
> <and...@bradfordembedded.com> wrote:
> > On Mon, 22 Apr 2013 02:11:20 -0700 (PDT)
> > mcas...@gmail.com wrote:
> >
> >> One of the 2 cards has failed after a bit more than 2 billion block
> >> writes which corresponds to writing the same file 21 million times.
> >> My block size (reported by fdisk -l) is 512 bytes and the card size
> >> is 4GB with at least 2.8GB free.
> >
> > fdisk is wrong.
>
> [snip]
>
> >> If the above is correct, the card has failed a lot earlier than it
> >> can be expected. Hence something else is killing these cards or
> >> the wear leveling is not working properly.
> >
> > Most of the cheap cards have poor, if any, wear leveling routines.
>
> I'm probably wrestling with the same issues. 2/3 of my uSD cards now
> cannot be backed up or restored (using dd if=/dev/mmcblk0 etc.); one
> doesn't seem to boot, the other hangs the command line on the bone for
> long periods. My application writes small files pretty often.

What cards are you using?

> This page:
> http://electronics.stackexchange.com/questions/27619/is-it-true-that-a-sd-mmc-card-does-wear-levelling-with-its-own-controller
> contains comments pointing out that wear leveling and power cycle
> tolerance are conflicting goals, and probably even fewer cards get
> both right.

I don't think wear leveling and power cycle tolerance have anything to
do with each other.

Wear leveling is just writing data to erase blocks that have had less
writes rather than putting it somewhere else. Advanced modes may
migrate static data out of low write-count blocks in order to even
things out further. The goal is to wear all blocks at the same rate,
such that no single block fails much before any other.
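
A toy model of that goal, for illustration only: one logical block is
rewritten over and over, and the "controller" either always reuses the
same physical erase block or picks the least-erased one each time. The
`simulate` function and its numbers are made up for this sketch and say
nothing about how any real card's firmware behaves.

```python
def simulate(num_blocks, writes, wear_level):
    """Return per-block erase counts after `writes` rewrites of one logical block."""
    erase_counts = [0] * num_blocks
    current = 0  # physical block currently holding the data
    for _ in range(writes):
        if wear_level:
            # Pick the least-erased block as the new home for the data.
            target = min(range(num_blocks), key=erase_counts.__getitem__)
        else:
            target = current  # always reuse the same physical block
        erase_counts[target] += 1  # the target is erased before it is programmed
        current = target
    return erase_counts


print(max(simulate(8, 1000, wear_level=False)))  # 1000
print(max(simulate(8, 1000, wear_level=True)))   # 125
```

With eight blocks the most-worn block sees one eighth of the erases, which
is the whole point: no single block fails much before any other.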

Power cycle tolerance, as talked about on that stackexchange thread,
isn't quite right. If you are sure you've flushed the kernel buffers
and waited for the card to have written everything, it doesn't really
matter if the data was written in a way that is good at wear leveling
or bad. If you pull power with the kernel buffers having data not yet
written, or only partially written out, then you'll run into fun file
system check issues on next boot, again regardless of wear leveling.

Don't pull power till you're sure the transactions with the card have
completed. Or, simply use a read only file system or boot from a
tmpfs/ramdisk.

> And there I was thinking all this mess was sorted out inside the
> little plastic wafer.

In the expensive ones, yes, probably it'll wear level at least in a
rudimentary way. In the cheap cards, I wouldn't count on it. SD cards
are super price conscious, if it's not required by the spec (it's not)
many manufacturers of the lower end won't do it.

Cameras write in a nice linear fashion (usually), that's what the SD
spec is written for. The crazy random writes that Linux (or any other
non-camera use) does are a brave new world for little SD cards.

> Andrew, do you happen to be aware of anyone who sells SD cards that
> *do* have good leveling/power cycle tolerance and *are* believed to
> work well with the bone? I'd be happy to pay.

Samsung Plus 8 GB uSD cards are generally considered quite good. I
personally like the SanDisk mobile ultra 4 GB uSD cards the best but the
larger sizes of this card are not as good (and 32 GB version has some
major possible issues, avoid those). I've used both Samsung Plus and
SanDisk ultra mobile on bones with no issues.

If you're in the USA, BestBuy.com has 4 GB SanDisk mobile ultra cards
for $4.99 each on sale.

-Andrew

Andrew Bradford

unread,
Apr 23, 2013, 9:23:35 AM4/23/13
to Britton Kerin, beagl...@googlegroups.com
You replied only to me but I'm replying to you and the beagle list.

On Mon, 22 Apr 2013 17:25:01 -0800
Britton Kerin <britto...@gmail.com> wrote:

> On Mon, Apr 22, 2013 at 11:40 AM, Andrew Bradford
> <and...@bradfordembedded.com> wrote:
>
> > On Mon, 22 Apr 2013 11:07:56 -0800
> > Britton Kerin <britto...@gmail.com> wrote:
> >
> >> > Most of the cheap cards have poor, if any, wear leveling
> >> > routines.
> >>
> >> I'm probably wrestling with the same issues. 2/3 of my uSD cards
> >> now cannot be backed up or restored (using dd if=/dev/mmcblk0
> >> etc.); one doesn't seem to boot, the other hangs the command line
> >> on the bone for long periods. My application writes small files
> >> pretty often.
> >
> > What cards are you using?
>
> Kingston 4 GB Class 4 Can't dd onto or off of card
> PNY 4GB Class 4 Can't dd onto or off of card
> Kingston 4 GB Class 4 works but hasn't seen nearly as many writes

Those are cheap cards. I wouldn't use them.

My rule of thumb is to use SD cards from companies who also own a
semiconductor fab. Usually you'll get decent quality stuff following
that rule. To get good stuff, buy as many cards as you can and test
them to find out what they really do.

> None of these cards has seen anywhere near enough writing to get close
> to the normal limits of flash memory assuming even slightly sane wear
> leveling. Unless Angstrom is doing something insane that I don't know
> about (they have considerable uptime).

I doubt any of those cards do any kind of wear leveling, sane or not.
Angstrom, or any other OS, has no control over the wear leveling on an
SD card, the internal controller to the card handles that.

> I think maybe bones should ship with a different, better card if
> possible.

That's not as easy as it sounds.

> >> This page:
> >> http://electronics.stackexchange.com/questions/27619/is-it-true-that-a-sd-mmc-card-does-wear-levelling-with-its-own-controller
> >> contains comments pointing out that wear leveling and power cycle
> >> tolerance are conflicting goals, and probably even fewer cards get
> >> both right.
> >
> > I don't think wear leveling and power cycle tolerance have anything
> > to do with each other.
> >
> > Wear leveling is just writing data to erase blocks that have had
> > less writes rather than putting it somewhere else. Advanced modes
> > may migrate static data out of low write-count blocks in order to
> > even things out further. The goal is to wear all blocks at the
> > same rate, such that no single block fails much before any other.
>
> I would hope you would be correct, but if the migration process fails
> to correctly use some sort of atomic lock then the posts on that page
> saying it could corrupt any data on the disk (not just what's being
> written) could be true. This panasonic ad implies (FUDs?) that many
> cards aren't power-cycle tolerant:
> http://panasonic.net/avc/sdcard/industrial_sd/function.html

During a garbage collection or other background operations, yes, if
power is lost you can lose data or end up with a disk where the state
is not known even by the controller. That is a risk with any flash based
storage.

Background operations might happen at any time, including garbage
collection or rewriting static data. This doesn't really have anything
to do with wear leveling, other than that some wear leveling algorithms
may do these background operations, as may other kinds of algorithms.
So my earlier statement about wear leveling and power loss sensitivity
isn't completely right, but rarely will SD cards do intense background
operations or wear leveling schemes like this. It's cost (dollars)
prohibitive to implement such abilities when 99% of customers won't
ever need them.

Unless you can get a data sheet on an SD card that describes the way it
does wear leveling and other background operations, we're all just
stabbing in the dark. Maybe Panasonic's cards do a great job, but I
have no data on that (and I assume if I did I wouldn't be allowed to
share it due to NDA).

I'm not sure where those Panasonic microSD cards can even be purchased.

> > Cameras write in a nice linear fashion (usually), that's what the SD
> > spec is written for. The crazy random writes that Linux (or any
> > other non-camera use) does are a brave new world for little SD cards.
>
> Yes it seems like cameras would need to worry about it less. But.
> My camera recently silently ate a bunch of pictures. And my droid SD
> card has now entirely quit working. All together I've pretty well
> lost all belief in SD card reliability unless somebody is ready to
> promise otherwise.

No one's ready to promise otherwise :)
At least not me.

> >> Andrew, do you happen to be aware of anyone who sells SD cards that
> >> *do* have good leveling/power cycle tolerance and *are* believed to
> >> work well with the bone? I'd be happy to pay.
> >
> > Samsung Plus 8 GB uSD cards are generally considered quite good. I
> > personally like the SanDisk mobile ultra 4 GB uSD cards the best
> > but the larger sizes of this card are not as good (and 32 GB
> > version has some major possible issues, avoid those). I've used
> > both Samsung Plus and SanDisk ultra mobile on bones with no issues.
> >
> > If you're in the USA, BestBuy.com has 4 GB SanDisk mobile ultra
> > cards for $4.99 each on sale.
>
> Thanks, I'll try these. On a related note I see the blackbone has 2
> GB eMMC which is supposedly more reliable, I wonder if it boots from
> there out of the box or is an easy setup.

eMMC should work better than cheap SD cards. There are other benefits of
eMMC, too, mainly that the bus is wider so read throughput can be higher
than SDv2.00 (the write throughput on the eMMC used on the black isn't
stellar[1], assuming the BOM hasn't changed since then). Setup, if
you don't buy a black that boots from eMMC directly, is easy.

[1]:http://lists.linaro.org/pipermail/flashbench-results/2013-January/000353.html

-Andrew

Andrew Bradford

unread,
Apr 24, 2013, 11:10:14 AM4/24/13
to beagl...@googlegroups.com, mcas...@gmail.com
On Wed, 24 Apr 2013 06:23:17 -0700 (PDT)
mcas...@gmail.com wrote:

> I think that you might have misunderstood my goal.
> My goal was to find out if there was some wear leveling on the
> kingston card shipped with the beagle bone.
> This is why I used sync to write as much as possible to the SD card.
>
> The result is that there is certainly wear leveling on that card as I
> have overwritten the same file 21 million times. If there was no
> wear leveling, it should have failed much earlier.

The ability to write to the same block multiple times more than you
think you should does not indicate wear leveling. It may simply
indicate that there was a read-copy-write operation. All cards will do
this, it's required by the physics of flash when attempting to write
back to the same block data that won't fit based on the page sizes
(sorry, I'm not good at explaining this).

Wear leveling is doing this activity with a stated goal of optimizing
the write cycles on each erase block. Doing wear leveling intelligently
will lead to longer life. Simply doing read-copy-write operations on a
write to a block does not imply wear leveling.

If the controller was really simply overwriting the same erase block
every time, performance would really suck. You'd have to read out the
entire erase block, erase it, and write back the new data. This would
require quite a large amount of cache (at least 1 erase block's worth,
measured in MB). The controllers in cheap SD cards have a few kB of
RAM, at most, as cost is very critical. Thus, they don't simply write
back to the same erase block but move the useful data along with the new
write to another erase block and then go back and erase the now
no-longer-needed erase block after the operation completes. This gives
decent write performance with small caches. It looks like wear
leveling at a high level but it's not.

-Andrew

> I am aware of the fact that a full block erase occurs before writing
> to flash.
> Thanks for the links, I will look them up.

Britton Kerin

unread,
Apr 24, 2013, 12:30:30 PM4/24/13
to Andrew Bradford, beagl...@googlegroups.com
>> >> This page:
>> >> http://electronics.stackexchange.com/questions/27619/is-it-true-that-a-sd-mmc-card-does-wear-levelling-with-its-own-controller
>> >> contains comments pointing out that wear leveling and power cycle
>> >> tolerance are conflicting goals, and probably even fewer cards get
>> >> both right.
>> >
>> > I don't think wear leveling and power cycle tolerance have anything
>> > to do with each other.
>> >
>> > Wear leveling is just writing data to erase blocks that have had
>> > less writes rather than putting it somewhere else. Advanced modes
>> > may migrate static data out of low write-count blocks in order to
>> > even things out further. The goal is to wear all blocks at the
>> > same rate, such that no single block fails much before any other.
>>
>> I would hope you would be correct, but if the migration process fails
>> to correctly use some sort of atomic lock then the posts on that page
>> saying it could corrupt any data on the disk (not just what's being
>> written) could be true. This panasonic ad implies (FUDs?) that many
>> cards aren't power-cycle tolerant:
>> http://panasonic.net/avc/sdcard/industrial_sd/function.html
>
> During a garbage collection or other background operations, yes, if
> power is lost you can lose data or end up with a disk where the state
> is not known even by the controller. That is a risk with any flash based
> storage.

Isn't it possible to handle this (at least theoretically) with an atomic
write that certifies that a (possibly larger) recent write has completed?
This is how databases work if I understand right, I would have guessed that
the firmware on the SD cards would do the same sort of thing. Or is this
not possible with flash?

Thanks very much for all your info on this stuff.

Britton

Andrew Bradford

unread,
Apr 24, 2013, 12:58:31 PM4/24/13
to Britton Kerin, beagl...@googlegroups.com
It's possible with flash. I have no way of telling if a controller
does it or not though, at least not without probing the part in
question and taking a logic analyzer to it.
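
For comparison, the same "commit only after the new data is safely down"
idea can be sketched one level up, at the file system rather than inside
the card. This is an application-side analogue on Linux, not a claim
about what any SD controller firmware does; `atomic_write` and the
`.tmp` suffix are hypothetical names for the sketch.

```python
import os


def atomic_write(path, data):
    """Replace `path` with `data` so readers see either the old or the new file."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())      # new contents are written out first
    os.replace(tmp_path, path)    # atomic rename on POSIX file systems
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)          # then persist the rename itself
    finally:
        os.close(dir_fd)
```

The rename plays the role of the small "certifying" write: if power goes
away before it, the old file is still intact, assuming the layers
underneath honor the flush.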

Now you have me interested in doing this... :)

-Andrew

Andrew Bradford

unread,
May 7, 2013, 8:54:41 AM5/7/13
to beagl...@googlegroups.com, wil...@pcfish.ca
On Mon, 6 May 2013 23:24:46 -0700 (PDT)
willem <wil...@pcfish.ca> wrote:

> I recently deployed 100 BeagleBones into the wild with another 50 to
> follow in a month. Stupidly, I just used the stock Kingston 4GB that
> came with the board from circuitco. I'm already experiencing card
> failures with errors just like those that have been posted. I'm
> using a backup battery and have the beagle talking with a MSP430
> microcontroller so that it can gracefully shutdown but there are
> still issues. I thought I'd save some money but now realize I will
> have to visit each and every one of my deployed units to replace the
> Kingston crap with a better card. So far I'm considering the Sandisk
> Ultra 4GB which Andrew has suggested. However, if wear leveling is
> an issue does it make sense to throw even more memory into the
> equation such as an 8GB or 16GB? i.e. more to wear down before failure

It only makes sense if the wear leveling actually works better and
the erase block sizes stay small. A better bet would be, if you're
going to deploy new images, use any SD card you want but not write to
it. Boot to a ramdisk as the root file system, then all your problems
go away :)

Boot will possibly take longer, some logging ability will be lost (or
will become more convoluted), but graceful powerdown becomes literally
pull the plug.

Regarding deployment, cycle through your customers. Build 25 units or
so and mail them out asking customers to mail the defunct units back
after they receive the new one. That way your customers don't have down
time. Then repair the 25 you get back and repeat.

-Andrew

Andrew Bradford

unread,
May 7, 2013, 2:52:36 PM5/7/13
to beagl...@googlegroups.com, wil...@pcfish.ca
On Tue, 7 May 2013 09:01:49 -0700 (PDT)
wil...@pcfish.ca wrote:

> Well ironically this morning I've had many more pack it in. This is
> a bit of a disaster. Unfortunately my remote units are tamper proof
> with no one allowed to go into the enclosures. I've started reading
> up on Ram disks. I'm using Ubuntu 12.10 on the boards. Do you have
> any good references for setting up a RAM disk?

Google has a lot of recommendations:
https://www.google.com/search?q=root+on+ramdisk

-Andrew

Andrew Bradford

unread,
May 8, 2013, 8:52:24 AM5/8/13
to beagl...@googlegroups.com, wil...@pcfish.ca
On Tue, 7 May 2013 23:41:09 -0700 (PDT)
wil...@pcfish.ca wrote:

> Is it enough to only ramdisk the known write activity directories
> like /var/log and /tmp or are there other places where write activity
> will occur and therefore it is safer to ramdisk the entire root? I'm
> wondering because 256M is not very much memory and Ubuntu 12.10 is
> not that small.

Running just the highly written directories in a ramdisk or tmpfs is a
good half measure. To be safest, put the full root fs on ram disk.
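
As a sketch of the half measure, the helper below just formats
`/etc/fstab` entries that mount a size-limited tmpfs over the busy
directories. The `tmpfs_fstab_line` name and the sizes are assumptions
chosen for illustration; pick sizes to fit the board's 256 MB of RAM, and
remember that anything in these directories is lost on power-off.

```python
def tmpfs_fstab_line(mount_point, size_mb):
    """Format one /etc/fstab entry that mounts a size-limited tmpfs."""
    options = f"defaults,noatime,size={size_mb}m"
    return f"tmpfs {mount_point} tmpfs {options} 0 0"


# Placeholder sizes, not recommendations.
for mount_point, size_mb in (("/var/log", 16), ("/tmp", 32)):
    print(tmpfs_fstab_line(mount_point, size_mb))
# tmpfs /var/log tmpfs defaults,noatime,size=16m 0 0
# tmpfs /tmp tmpfs defaults,noatime,size=32m 0 0
```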

Use something like Angstrom or BuildRoot to construct your root fs.
Depending on your needs, a root fs of 10s of MB is possible without
much effort using either of those. Ubuntu (and Debian) are bloated
pigs for running root on a ram disk (but they are decent choices when
not).

> I'm noticing that Raspberry Pi is also encountering much of the same
> problem with Sd card corruption.
>
> Did anyone ever figure out why a read-only filesystem is still
> causing problems as an earlier poster already remarked?

Not that I know of. Most likely the file system wasn't as read only as
the user thought.
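
One way to check that, sketched under the assumption that the card shows
up as `/dev/mmcblk*`: scan the mount table for anything on the card that
is still mounted read-write. The `writable_sd_mounts` helper and the
sample mount table are made up for illustration; on some systems the root
device is listed under another name (for example `/dev/root`), which this
simple prefix match would miss.

```python
def writable_sd_mounts(mounts_text, device_prefix="/dev/mmcblk"):
    """Return (device, mount_point) pairs that are mounted read-write."""
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # not a well-formed mount table line
        device, mount_point, _fstype, options = fields[:4]
        if device.startswith(device_prefix) and "rw" in options.split(","):
            found.append((device, mount_point))
    return found


# Invented sample in /proc/mounts format.
SAMPLE = """\
/dev/mmcblk0p2 / ext4 ro,relatime 0 0
/dev/mmcblk0p1 /boot/uboot vfat rw,relatime 0 0
tmpfs /dev/shm tmpfs rw,nosuid,nodev 0 0
"""

print(writable_sd_mounts(SAMPLE))  # [('/dev/mmcblk0p1', '/boot/uboot')]
```

On a running board the input would come from reading `/proc/mounts`.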

> I've also found at this
> link<http://cxcv.de/post/34356721648/fixing-raspberry-pi-sd-card-issues>
> mention of a fix for SD card corruption with the Raspberry Pi based
> on core frequency. I believe the Beaglebone switches clock speed
> based on power from usb or 5V barrel. Are there situations that
> cause the BB to fluctuate clock speed, and is this perhaps causing
> the issue with SD card failure?

It shouldn't but if you want to disable freq scaling, it's easy, just
don't install the cpufreq tools and disable freq scaling in your
kernel. I run my bones at 720 MHz all the time, the power savings of
the lower frequencies isn't worth the hassle for me. If that saves SD
cards, too, that's a nice side benefit. I've never tested the impact
to the SD card of frequency scaling.

> I'm throwing everything at the wall to see what sticks because I need
> to pick a route pronto and figure out how to make the BB bulletproof
> within the next 12 hours.

Don't rush a fix. Find what works, sweet talk the customers, and
deploy as quickly as you can, but don't rush something out that's going
to have issues without testing that you've actually fixed anything and
not broken something else.

-Andrew

fred basset

unread,
May 8, 2013, 12:42:52 PM5/8/13
to beagl...@googlegroups.com
I'm investigating these for our datalogger design that's based on the BB.
Also interested to hear of any others using industrial rated SD cards.


On Wed, May 8, 2013 at 7:21 AM, <wil...@pcfish.ca> wrote:
Thanks Andrew. I appreciate your help. I really wish I had more time to test this properly but alas I don't. One of my remote units is coming back into civilization for only one day before it heads back out into the wild frontier. For that one I'll stick with good half measures and then will get to work on a smaller Angstrom or BuildRoot fs for 50 more expected to head out in a month.

It always seems to be the case: on the benchtop, for months on end and after lots of testing, I had no issues whatsoever with SD card corruption on 2 different units. It was only when I scaled from 2 units to 100 that I started to see more failures. Lesson to be learned I guess - expect the unexpected.

Andrew Bradford

unread,
May 8, 2013, 2:40:57 PM5/8/13
to beagl...@googlegroups.com, fredbas...@gmail.com
On Wed, 8 May 2013 09:42:52 -0700
fred basset <fredbas...@gmail.com> wrote:

> You might also want to look at the so called industrial rated SD
> cards, e.g.
>
> http://www.delkinoem.com/secure-digital-industrial.html
>
> or
>
> http://swissbit.com/index.php?option=com_content&view=article&id=194&Itemid=62
>
> I'm investigating these for our datalogger design that's based on the
> BB. Also interested to hear of any others using industrial rated SD
> cards.

Expect to pay quite a premium for "industrial" SLC cards.
I'm possibly awaiting a few ATP cards to test. If they arrive I'll post
life test and flashbench results.

As a point of reference, one ATP card some of our Windows guys were
looking to get was the ATP 32 GB industrial full size SD [1] and it
priced out around $250 per unit for a few hundred unit order. CDW has
ATP 2 GB industrial cards for $39 [2]. Compare this with 4 GB SanDisk
Mobile Ultra consumer level cards for $5 and you'll get a rough idea of
the price multiplier.
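
To put the multiplier on a per-gigabyte basis, the figures above can be run through a few lines of Python. This is just arithmetic on the prices quoted in this post; normalizing by capacity is an added assumption, since the cards compared are different sizes.

```python
# Prices quoted above (USD): name -> (capacity in GB, unit price)
cards = {
    "ATP 32 GB industrial": (32, 250.0),
    "ATP 2 GB industrial": (2, 39.0),
    "SanDisk 4 GB consumer": (4, 5.0),
}


def price_per_gb(capacity_gb, price_usd):
    """Unit price divided by capacity."""
    return price_usd / capacity_gb


# Use the consumer card as the baseline for the multiplier.
baseline = price_per_gb(*cards["SanDisk 4 GB consumer"])

for name, (capacity_gb, price_usd) in cards.items():
    per_gb = price_per_gb(capacity_gb, price_usd)
    print(f"{name}: ${per_gb:.2f}/GB, {per_gb / baseline:.2f}x the consumer card")
```

Per gigabyte, the small industrial card comes out far more expensive than the large one, so the multiplier depends heavily on which capacity you actually need.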

[1]:http://www.atpinc.com/p2-4a.php?sn=00000395
[2]:https://www.cdw.com/shop/products/ATP-Industrial-Grade-flash-memory-card-2-GB-SD/1446871.aspx

-Andrew

jmelson

unread,
May 8, 2013, 5:02:07 PM5/8/13
to beagl...@googlegroups.com, wil...@pcfish.ca


On Wednesday, May 8, 2013 1:41:09 AM UTC-5, wil...@pcfish.ca wrote:


Did anyone ever figure out why a read-only filesystem is still causing problems, as an earlier poster already remarked?


I'm not an SD card expert, but I think that if there is ANY writeable partition on an SD card, then they are all, in a sense,
writeable, as the pool of erase blocks is constantly being recycled. So, you THINK that you have a partition that
is read-only, but when another partition is written to, blocks from here and there are gathered and moved to another
place by the wear leveling and/or block erase mechanism, including those from the supposedly read-only partition.

Jon

fred basset

unread,
May 8, 2013, 5:27:05 PM5/8/13
to beagl...@googlegroups.com
Don't forget that the card has to do its internal housekeeping (bad block assignment, etc.), so corruption can occur if it loses power whilst doing one of these operations.



Joseph Pearce

unread,
May 8, 2013, 6:11:35 PM5/8/13
to beagl...@googlegroups.com
This was asked a while ago but I did not see a clear response. Playing around with the BeagleBone Black I have had corruption on the SD card but not on the eMMC.

If I can minimize my app to work purely on the onboard 2 GB, am I guaranteed (nearly) very little corruption, or is it going to have the same failure style just further down the road? When/if this happens it would seem the BeagleBone Black would be 'bricked', except for then depending upon external SD cards with periodic replacements.

Andrew Bradford

unread,
May 9, 2013, 8:33:53 AM5/9/13
to beagl...@googlegroups.com, joseph...@gmail.com
On Wed, 8 May 2013 15:11:35 -0700 (PDT)
Joseph Pearce <joseph...@gmail.com> wrote:

> This was asked a while ago but I did not see a clear response. Playing
> around with the BeagleBone Black I have had corruption on the SD card
> but not on the eMMC.
>
> If I can minimize my app to work purely on the onboard 2 GB, am I
> guaranteed (nearly) very little corruption, or is it going to have
> the same failure style just further down the road? When/if this
> happens it would seem the BeagleBone Black would be 'bricked', except
> for then depending upon external SD cards with periodic replacements.

There's no guarantee.

The Black won't be bricked; you can always boot from SD. If you have a
hot air rework station, you can remove the eMMC from the board
(granted, putting a new one down is a tad trickier).

If you write to flash a lot, any kind of flash, eventually erase blocks
will wear out and it won't take writes without corruption. Write less,
or if you need to write a lot, get a spinning rust disk or some other
way to get data off without hitting flash.

The eMMC on the Black is the lowest end device Micron makes. That
being said, I'd expect its wear leveling routines and other background
operations to be quite a lot better than those of Kingston SD cards. If
you're interested in the flashbench results, see [1] (4 bit mode) and
[2] (8 bit mode). Erase block size is 2 MiB, so align your partitions
to that and you will probably see slightly better life and performance.
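
As a rough illustration of what aligning to a 2 MiB erase block means in sector terms, here is a small sketch. It assumes 512-byte logical sectors, and the helper names are made up for the example; check what your partitioning tool actually reports before relying on it.

```python
SECTOR_BYTES = 512                   # assumed logical sector size
ERASE_BLOCK_BYTES = 2 * 1024 * 1024  # 2 MiB erase block, per the flashbench results

# Number of sectors in one erase block (4096 with the values above).
SECTORS_PER_ERASE_BLOCK = ERASE_BLOCK_BYTES // SECTOR_BYTES


def is_aligned(start_sector):
    """True if a partition starting here begins on an erase-block boundary."""
    return start_sector % SECTORS_PER_ERASE_BLOCK == 0


def align_up(start_sector):
    """Round a start sector up to the next erase-block boundary."""
    blocks = -(-start_sector // SECTORS_PER_ERASE_BLOCK)  # ceiling division
    return blocks * SECTORS_PER_ERASE_BLOCK


# Check a legacy start sector (63) and a common 1 MiB start (2048).
for candidate in (63, 2048, 4096):
    print(candidate, is_aligned(candidate), align_up(candidate))
```

With these assumptions neither the legacy sector 63 nor a 1 MiB start at sector 2048 lands on a 2 MiB boundary; the first aligned start after sector 0 is sector 4096.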

[1]:http://lists.linaro.org/pipermail/flashbench-results/2013-January/000353.html
[2]:http://lists.linaro.org/pipermail/flashbench-results/2013-February/000355.html

Also see Arnd's recommendations [3] in response. Don't use ext3.

[3]:http://lists.linaro.org/pipermail/flashbench-results/2013-February/000360.html

-Andrew
