Block erase non-atomicity


Frédéric SERGENT

Feb 13, 2012, 2:13:02 PM
to uf...@googlegroups.com
Hi Ricky,

First of all, I saw that you implemented atomic file deletion, which is really cool, thanks! :-)

I have a concern about being reset or powered down while erasing a block, though.

Here is what happened: one of our tests resets the device while it is erasing a file in a UFFS partition. Sometimes the block ends up only partially erased: the first pages are correctly set to 0xFF, but the last few still hold data, probably the old contents.

The result is that when UFFS builds its tree at the next startup, it sees this block as clean, because it only scans the header and tag of the first page. Later, data is written to this block, which results in a mix of old and new data. This is only noticed the next time one of the written pages is read, when the CRC in the header turns out to be wrong.

I am not sure whether this is universal or only applicable to our hardware. In any case, I don't think erasing a block of flash memory can be atomic: it depends on the specific behaviour of the electronic devices in use, and I don't see how atomicity can be guaranteed for this kind of operation. That is, if a block erase gets interrupted, I think the block should be considered to be in an undetermined state.

What do you think?

I did something to work around this, at least for our use. I am not sure it fits cleanly within the UFFS architecture, and there are probably ways to optimize it, as it reads many more pages at startup than normal, just to check their state. It seems to work here without slowing down our init step too much (on a 1.5 MiB partition).

In uffs_tree.c:

static int _ScanCleanBlock(uffs_Device *dev, int block)
{
    int page;
   
    int size = dev->com.pg_size;
    u8* page_content = MY_ALLOC_FCTN(u8, size*sizeof(u8));
           
    for (page = dev->attr->pages_per_block - 1; page > 0; page--) {
        int ret = UFFS_FLASH_UNKNOWN_ERR;
        if (dev->ops->ReadPageWithLayout) {
            ret = dev->ops->ReadPageWithLayout(dev, block, page, page_content, size, NULL, NULL, NULL);
        }
        else {
            ret = dev->ops->ReadPage(dev, block, page, page_content, size, NULL, NULL, 0);
        }
       
        if(ret == UFFS_FLASH_IO_ERR){
            uffs_Perror(UFFS_MSG_SERIOUS,
                    "IO error reading block %d page %d",
                    block, page);
            page = -1;  // report the I/O error to the caller
            goto ext;   // leave through ext so page_content is freed
        }
       
        u8 *check_FF = page_content, *check_limit = page_content + size;
        for(; check_FF < check_limit; check_FF++){
            if (*check_FF != 0xFF){
                uffs_Perror(UFFS_MSG_NORMAL,
                        "unclean page found, block %d page %d, ",
                        block, page);
                goto ext;
            }
        }
    }
ext:
    MY_FREE_FCTN(page_content);
    return page;
}

static URET _BuildTreeStepOne(uffs_Device *dev)
{
(...)
        }
        else if (uffs_IsPageErased(dev, bc, 0) == U_TRUE) { //@ read one spare: 0
            // page 0 tag shows it's an erased block, we need to check the mini header status to make sure it is clean.
//            if (uffs_LoadMiniHeader(dev, block_lt, 0, &header) == U_FAIL) {
            int page = _ScanCleanBlock(dev, block_lt);
           
            if (page == -1) {
                uffs_Perror(UFFS_MSG_SERIOUS,
                            "I/O error when reading page !"
                            "block %d page %d",
                            block_lt, 0);
                ret = U_FAIL;
                break;
            }

            if (page > 0) {
                // page 0 tag is clean but page data is dirty ???
                // this block should be erased immediately !
                uffs_Perror(UFFS_MSG_NORMAL,
                            "block %d dirty, erasing it",
                            block_lt);
                uffs_FlashEraseBlock(dev, block_lt);
            }
            node->u.list.block = block_lt;
            if (HAVE_BADBLOCK(dev)) {
                uffs_Perror(UFFS_MSG_NORMAL,
                            "New bad block (%d) discovered.", block_lt);
                uffs_BadBlockProcess(dev, node);
            }
            else {
                uffs_TreeInsertToErasedListTail(dev, node);
            }
        }
        else {
(...)

Best regards,
Fred




Frédéric SERGENT

Feb 14, 2012, 3:29:03 PM
to uf...@googlegroups.com

By the way, there is a flaw in the routine I posted yesterday: it doesn't check page 0 properly. If page 0 happens to be partially written, it looks OK from the uffs_IsPageErased function's perspective, because the tag at the end is all 0xFF, especially the SEAL byte. As a result, the rest of the block is neither checked nor erased. This showed up in a stress test I ran today, after a reset occurred while writing to page 0 of a block.

I modified it today and left it under heavy testing overnight. If it's OK, I'll post the correction here.

Best regards,
Fred

Ezio

Feb 14, 2012, 8:08:51 PM
to UFFS
What about this situation:
-----
|empty|
|empty|
...
|full |
|full |
...
|empty|
|empty|
-----
The first few pages are empty after the erase, but a power loss stops
the erase, so the middle pages are still dirty, while the last few
pages had never been written. How can the state of this block be
determined? Only by reading all the pages?

-----
Regards,
Ezio

Ricky Zheng

Feb 14, 2012, 11:24:22 PM
to uf...@googlegroups.com
I got an idea: if the first page is clean, the block is either erased or half-erased; put it on the erased block list as it is.

When picking a block from the erased block list, we then check every page to make sure it IS fully erased; otherwise we erase it immediately.

This solution does not affect system startup speed, but it slows writing files down a bit.
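
The idea can be sketched roughly as follows, assuming the erased list is a plain array of block numbers and the flash layer is reached through two callbacks. `flash_hooks`, `pick_erased_block` and the `fake_*` helpers are hypothetical names for illustration, not the UFFS API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical driver hooks standing in for the real flash layer. */
typedef struct {
    bool (*is_fully_blank)(int block, void *ctx); /* reads every page of the block */
    bool (*erase)(int block, void *ctx);          /* returns true on success */
    void *ctx;
} flash_hooks;

/*
 * Take the next candidate from the erased list and verify it only now,
 * at the moment it is about to be used. Returns the block number, or -1
 * when the list is exhausted.
 */
static int pick_erased_block(const int *list, size_t *head, size_t count,
                             const flash_hooks *hw)
{
    assert(list != NULL && head != NULL && hw != NULL);
    while (*head < count) {
        int block = list[(*head)++];
        if (hw->is_fully_blank(block, hw->ctx))
            return block;               /* really erased: use it as is */
        if (hw->erase(block, hw->ctx))
            return block;               /* was half-erased: fixed by erasing now */
        /* erase failed: skip it (a real file system would run bad-block handling) */
    }
    return -1;
}

/* Tiny in-memory fake used to exercise the function. */
enum { FAKE_BLOCKS = 4 };

typedef struct {
    bool dirty[FAKE_BLOCKS];
    int erase_count;
} fake_flash;

static bool fake_is_blank(int block, void *ctx)
{
    const fake_flash *f = ctx;
    return !f->dirty[block];
}

static bool fake_erase(int block, void *ctx)
{
    fake_flash *f = ctx;
    f->dirty[block] = false;
    f->erase_count++;
    return true;
}
```

Mount time stays unchanged in this scheme; the cost of the full page scan moves to the first write that needs a fresh block.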

What do you think?

- Ricky

--
You received this message because you are subscribed to the Google Groups "UFFS" group.
To post to this group, send email to uf...@googlegroups.com.
To unsubscribe from this group, send email to uffs+uns...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/uffs?hl=en.



Frédéric SERGENT

Feb 15, 2012, 11:17:49 AM
to uf...@googlegroups.com
Hi Ezio,

Yes, exactly, that was part of my concern. My routine should address it, if I did it right. What it does is:

for each block:
  if page 0 seems clean (by reading its tag):
    read every byte of every page starting from the end of the block
    if one byte is not 0xFF:
      erase block
  else:
    check as usual with header and tag

By the way, the correction I made yesterday seems to work, I'll post it later, just in case.

Fred

Frédéric SERGENT

Feb 15, 2012, 11:43:52 AM
to uf...@googlegroups.com

That would work. Depending on the kind of use, spreading the block checks over run time instead of doing them all at init time could keep the time cost from being too painful.

The drawback is that if we check a supposedly blank block before every use, we spend more time reading all its pages each time, instead of once and for all at init time.

I suppose this can be avoided, though, by storing the information: if a block has not been used since startup, check it fully; if it has been erased since startup, it is trustworthy and needs no check.
I am not sure where this information could be stored:

- adding a field to one of the structs/unions in uffs_tree.h/c? But none of them specifically matches the "erased" type.
- having a second list similar to 'erase' and 'erase_tail' in uffs_TreeSt? Something like 'blank' (maybe without a 'blank_tail'), filled at startup as usual instead of 'erase'. Blocks would be picked from it first, and checked at that moment, until it is empty. Erased blocks would go to 'erased' as usual and would not be checked when picked up again.

What do you think?

Fred

Frédéric SERGENT

Feb 15, 2012, 11:57:03 AM
to uf...@googlegroups.com
> how to determine the states of this block? only read all the pages?

I have thought a bit about alternatives, but for now none of them seems totally convincing to me:

After erasing a block, mark the first page with a non-intrusive pattern, something like:
1st byte:  0xAA  (i.e. b10101010)
last byte: 0x55   (i.e. b01010101)

The advantages I saw were:
- it would not prevent these bytes from being used properly when the page is actually written: both will be set to 0 (status byte in the header, seal byte in the tag)
- the chosen pattern is very unlikely to appear by chance after a reset during an erase
- this is atomic: the pattern cannot be written unless we are sure the block is actually erased, and we get to check both the start and the end of this page
- only the 1st page needs to be read, which saves the time of reading all the following pages

Drawbacks:
- it would slow down the block erase procedure by adding a page write at the end. If that is not significantly cheaper than reading all bytes of a page, it's useless.
- it is slightly less safe than checking that every byte really is 0xFF: there is no guarantee that a reset while writing the pattern, or while erasing, would not by accident leave the page showing the same pattern. The pattern has to be very unlikely and very recognizable, I suppose, which is why I thought of 0xAA and 0x55, and this should be checked.
- it should not clutter the rest of the code with special-case checks everywhere just because of that (if page 0 and HEADER() != ... else blabla...)
- and the main drawback, I think: I suppose it wouldn't work on NAND flash with automatic ECC. I guess writing a page a second time without erasing would corrupt the ECC value: the new value could not be stored without erasing the block again, because some of its bits would have to switch from 0 to 1...
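
A minimal sketch of the bit-level reasoning behind the first advantage and the last drawback, assuming the usual flash rule that programming can only clear bits (modelled here as a bitwise AND). All names are hypothetical and the page is just a byte array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MARK_FIRST 0xAAu   /* proposed pattern for the first byte */
#define MARK_LAST  0x55u   /* proposed pattern for the last byte  */

/* Programming can only turn 1 bits into 0 bits, so it behaves like AND. */
static uint8_t flash_program(uint8_t cell, uint8_t value)
{
    return (uint8_t)(cell & value);
}

/* Write the "erase completed" pattern into a freshly erased page. */
static void write_erase_marker(uint8_t *page, size_t size)
{
    assert(page != NULL && size >= 2);
    page[0] = flash_program(page[0], MARK_FIRST);
    page[size - 1] = flash_program(page[size - 1], MARK_LAST);
}

static bool has_erase_marker(const uint8_t *page, size_t size)
{
    assert(page != NULL && size >= 2);
    return page[0] == MARK_FIRST && page[size - 1] == MARK_LAST;
}

/* A value can be written over a cell without erasing only if no bit
 * has to go from 0 back to 1. */
static bool programmable_without_erase(uint8_t cell, uint8_t target)
{
    return (uint8_t)(cell & target) == target;
}
```

Going from 0xAA or 0x55 to 0x00 only clears bits, which is why the marker bytes can still be used later. An ECC value computed for the marked page generally cannot be turned into the ECC of the final page contents the same way, because some bits would have to go from 0 to 1.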

Another idea: some "drivers", where they exist, might be able to provide this information more efficiently. In that case it would make sense to add a function pointer (to the existing collection with read_with_layout etc.), something like "is_block_erased", called only if available. But I did not dig much into this area.

Regards,
Fred



Frédéric SERGENT

Feb 15, 2012, 12:04:03 PM
to uf...@googlegroups.com
By the way, just in case, here is my current corrected routine, which has passed my overnight tests so far:


static int _ScanCleanBlock(uffs_Device *dev, int block)
{
    int page;
   
    int size = dev->com.pg_size;
    u8* page_content = MY_ALLOC_FCTN(u8, size*sizeof(u8));
           
    int ret = UFFS_FLASH_UNKNOWN_ERR;

    for (page = dev->attr->pages_per_block - 1; page >= 0; page--) {

        if (dev->ops->ReadPageWithLayout) {
            ret = dev->ops->ReadPageWithLayout(dev, block, page, page_content, size, NULL, NULL, NULL);
        }
        else {
            ret = dev->ops->ReadPage(dev, block, page, page_content, size, NULL, NULL, 0);
        }
       
        if(ret == UFFS_FLASH_IO_ERR){
            uffs_Perror(UFFS_MSG_SERIOUS,
                    "IO error reading block %d page %d",
                    block, page);
            goto ext;

        }
       
        u8 *check_FF = page_content, *check_limit = page_content + size;
        for(; check_FF < check_limit; check_FF++){
            if (*check_FF != 0xFF){
                uffs_Perror(UFFS_MSG_NORMAL,
                        "unclean page found, block %d page %d, ",
                        block, page);
                ret = UFFS_FLASH_BAD_BLK;
                goto ext;
            }
        }
    }
    ret = UFFS_FLASH_NO_ERR;
ext:
    MY_FREE_FCTN(page_content);
    return ret;

}

static URET _BuildTreeStepOne(uffs_Device *dev)
{
(...)
        }
        else if (uffs_IsPageErased(dev, bc, 0) == U_TRUE) { //@ read one spare: 0
            // page 0 tag shows it's an erased block, but an interrupted erase may
            // have left later pages dirty, so scan the whole block.
            // A separate variable keeps the enclosing function's 'ret' from being shadowed.
            int scan_ret = _ScanCleanBlock(dev, block_lt);

            if (scan_ret == UFFS_FLASH_IO_ERR) {
                uffs_Perror(UFFS_MSG_SERIOUS,
                            "I/O error when scanning block %d",
                            block_lt);
                ret = U_FAIL;
                break;
            }

            if (scan_ret == UFFS_FLASH_BAD_BLK) {
                // page 0 tag is clean but page data is dirty:
                // this block should be erased immediately.
                uffs_Perror(UFFS_MSG_NORMAL,
                            "block %d dirty, erasing it",
                            block_lt);
                uffs_FlashEraseBlock(dev, block_lt);
            }
            node->u.list.block = block_lt;
            if (HAVE_BADBLOCK(dev)) {
                uffs_Perror(UFFS_MSG_NORMAL,
                            "New bad block (%d) discovered.", block_lt);
                uffs_BadBlockProcess(dev, node);
            }
            else {
                uffs_TreeInsertToErasedListTail(dev, node);
            }
        }
        else {
            // this block have valid data page(s).
(...)
}


Best regards,
Fred



Ezio

Feb 15, 2012, 9:16:13 PM
to UFFS
It's safe but slow.
What about this:
Declare an array whose size is the number of blocks, and fill it with 0
to indicate that no block has been visited yet. The first time you
visit block n, set array[n] = 1 to indicate that this block has been
visited.

for each block
    if page 0 is not clean
        erase it
    else if it's the first time to work with this block
        if one page in this block is not clean
            erase it
        else
            do something
    else
        do something
That way, only the first time you work with a block might you have to
read every byte of every page, starting from the end of the block.
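
A minimal sketch of the bookkeeping this needs, assuming a fixed block count and one byte per block. `check_state`, `needs_full_scan` and `mark_erased_by_us` are hypothetical names, and the actual page scan is left to the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

enum { NUM_BLOCKS = 64 };          /* toy partition size */

typedef struct {
    uint8_t visited[NUM_BLOCKS];   /* 0 = not verified since mount */
    unsigned full_scans;           /* how many expensive scans were requested */
} check_state;

/* At mount time nothing has been verified yet. */
static void check_state_init(check_state *s)
{
    assert(s != NULL);
    memset(s, 0, sizeof *s);
}

/* Returns true when the caller must scan every page of the block before
 * using it; the block is marked visited so this happens only once. */
static bool needs_full_scan(check_state *s, int block)
{
    assert(s != NULL && block >= 0 && block < NUM_BLOCKS);
    if (s->visited[block])
        return false;
    s->visited[block] = 1;
    s->full_scans++;
    return true;
}

/* A block the file system erased itself during this session is trusted. */
static void mark_erased_by_us(check_state *s, int block)
{
    assert(s != NULL && block >= 0 && block < NUM_BLOCKS);
    s->visited[block] = 1;
}
```

With this in place the expensive scan happens at most once per block per session, and never for blocks erased after mount.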

By the way, if you write marks in the first two pages as JFFS does, you may
lose them, as some NAND flash does not support partial programming.
-----
Regards,
Ezio


Ricky Zheng

Feb 16, 2012, 12:48:07 AM
to uf...@googlegroups.com
Hi all,

I've added erased-block checking to the master branch. I added a 'need_check' field to 'struct BlockListSt'. When mounting, need_check is set to 1, and the block page check is done when uffs_TreeGetErasedBlock() is called.

However, I'm on holiday and don't have the equipment to verify it, so please give it a try.

- Ricky


Frédéric SERGENT

Feb 16, 2012, 4:27:32 AM
to uf...@googlegroups.com

Great, thanks. I will do that today.

Fred

Frédéric SERGENT

Feb 16, 2012, 5:33:07 AM
to uf...@googlegroups.com

> what about this:
> declare an array, its size is the number of blocks, and make it full of 0
> to represent that no block has been visited. after the first time you
> visit block n, make array[n] = 1 to indicate this block has been
> visited.

Yes, it is similar to the solution discussed with Ricky yesterday, which he implemented by adding a field to the struct used in the linked list of blank blocks.

And you are right: it's faster in the (quite likely) case where we don't write to all blank blocks during a session, and even if we do, it spreads the same time over the whole session, which is less painful.

> by the way, if you write marks in the first two pages as jffs does, you may
> lose them as some nand flash does not support partial program.

I didn't know that. Interesting. Unfortunately I have limited hardware to work on, as we only use a few kinds of flash chips in our devices.

Regards,
Fred

Frédéric SERGENT

Feb 17, 2012, 12:51:38 PM
to uf...@googlegroups.com
Hi all,

I've run Ricky's new need_check code: it looks fine to me. I'll leave modules resetting in a loop this weekend to stress it, but I think it's OK.

Regards,

Fred


Frédéric SERGENT

Feb 22, 2012, 5:21:15 AM
to uf...@googlegroups.com
Hi Ricky,

Just so you know: I have had the need_check code running on a few modules with various flash chips under reset stress tests, and it works fine. I have left one of them running since Friday: it has survived more than 11400 reset cycles since then (I never got past 150 before that).

Regards,
Fred

Ricky Zheng

Mar 15, 2012, 7:01:22 AM
to uf...@googlegroups.com
Hi Fred,

Just back from holiday ... glad to know the result :-)

Thanks for your testing.

- Ricky

To view this discussion on the web visit https://groups.google.com/d/msg/uffs/-/ckesfmT39SkJ.


Frédéric SERGENT

Mar 15, 2012, 9:40:22 AM
to uf...@googlegroups.com

Hey, Ricky! Welcome back! :-)

Fred

Sergei Sharonov

Jun 29, 2015, 12:42:10 AM
to uf...@googlegroups.com
On Monday, February 13, 2012 at 1:13:02 PM UTC-6, Frédéric SERGENT wrote:
> .....
> Here is what happened here: one of the tests we run does reset our device while erasing a file in a UFFS partition. What we get, sometimes, is that the block is actually not totally erased: the first pages are correctly set to 0xFF, but the few last ones keep data, probably the old ones.

I believe matters are even worse than that. On power failure you can get a partially erased page whose bits can flip from read to read. AFAIK that caused a design change in JFFS2, which writes an erase marker immediately after erasing. The original JFFS did not do that and failed during power cycling. I suspect the issue is less severe on NAND, thanks to its short erase cycle (a few ms), and worse on NOR, where the erase cycle can be on the order of hundreds of ms. There is probably enough energy in the power supply capacitors to finish a NAND erase. However, this is tricky to test, as you may not know what else (e.g. other devices) sits on the same power supply in the field and drains the capacitors.
Does it make sense to change the status byte to contain an erase marker (at least for NOR)?
Regards,
Sergei
P.S. I understand the thread is 3 years old. Worth a try though...

Ricky Zheng

Jun 29, 2015, 4:42:12 AM
to uf...@googlegroups.com
UFFS assumes that when a block is erased, the first page gets erased first, then the second page, and so on. At startup, UFFS first scans the first page; if the 'mini header' and the 'status' byte are all 0xFF, it decides the block is 'clean' and adds it to the 'erased block list'. The rest of the pages in the 'erased block' are checked again before the block is confirmed 'clean'.

Furthermore, if a partially erased block passes the check, UFFS will detect two blocks that have the same logical ID but different circular time-stamps. One of the blocks will then be erased. At the moment, the block with the newer time-stamp is erased, but if page data CRC checking is enabled (CONFIG_ENABLE_PAGE_DATA_CRC), we can check the CRC to decide which block has the valid data.
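
A minimal sketch of that decision, under the assumption that the circular time-stamp cycles 0 -> 1 -> 2 -> 0 so that "newer" means "immediate successor". The modulus, the `block_copy` struct and both function names are assumptions made for illustration; they are not taken from the UFFS sources.

```c
#include <assert.h>
#include <stdbool.h>

enum { TS_MODULUS = 3 };   /* assumption: time-stamp cycles 0 -> 1 -> 2 -> 0 */

static int ts_next(int ts)
{
    assert(ts >= 0 && ts < TS_MODULUS);
    return (ts + 1) % TS_MODULUS;
}

/* True if time-stamp a is the immediate successor of b, i.e. a is newer. */
static bool ts_is_newer(int a, int b)
{
    return a == ts_next(b);
}

/* One of two physical blocks claiming the same logical ID. */
typedef struct {
    int  block;    /* physical block number */
    int  ts;       /* circular time-stamp */
    bool crc_ok;   /* result of a page data CRC check, if available */
} block_copy;

/*
 * Return the physical block to erase. Without CRC information the newer
 * copy is erased, as described above. With CRC information, and when
 * exactly one copy fails the check, the failing copy is erased instead.
 * If neither time-stamp is the successor of the other, y is chosen
 * arbitrarily.
 */
static int pick_block_to_erase(block_copy x, block_copy y, bool use_crc)
{
    if (use_crc && x.crc_ok != y.crc_ok)
        return x.crc_ok ? y.block : x.block;
    return ts_is_newer(x.ts, y.ts) ? x.block : y.block;
}
```

The wrap-around case is the one worth noticing: with time-stamps 2 and 0, the copy stamped 0 counts as the newer one.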

Cheers,
Ricky.


Sergei Sharonov

Jun 29, 2015, 12:43:05 PM
to uf...@googlegroups.com


On Monday, June 29, 2015 at 3:42:12 AM UTC-5, Ricky wrote:
> UFFS assume that when erasing a block, the first page get's erased first, then second page...etc. When start up, UFFS first scan the first page, if the 'mini header' and 'status' byte all '0xFF', it then decide it's 'clean', the block is added to the 'erased block list'. The rest of pages in the 'erased block' will be checked again before it's confirmed 'clean'.

And here lies a potential problem. When an erase operation is interrupted (e.g. by a power failure), the analog charge in the floating gate (or whatever your flash uses to store zeros and ones) is somewhere between the valid window for 0 and the valid window for 1. A comparator will then sometimes read it as 0 and sometimes as 1. The scan at power-up may report that page as all 0xFF, and we assume it was properly erased. However, once we use it and program it with data, some of the bits programmed as 1 will sometimes read as 0 and sometimes as 1, as they were never properly erased to the 1 state. See the Linux MTD FAQ entry on the cleanmarker.

As I mentioned in my initial reply, with NAND this problem is less pronounced, or absent altogether on some hardware, because of the short erase cycle. NOR is a different story.
I hope I can either:
1. use the "Read 1s" command (specific to Kinetis flash), which can validate the erased state with tighter margins, or
2. change the status byte to be a cleanmarker, as MTD does.
Option (1) seems like the best choice for me, even though it is specific to Kinetis flash technology.
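
A minimal sketch of option 2, assuming a toy block held in memory and a made-up marker value. It models only the ordering (marker written strictly after a completed erase), not the analog behaviour of half-erased cells. All names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

enum { SIM_BLOCK_BYTES = 32 };   /* toy block size */
#define CLEAN_MARKER 0xA5u       /* hypothetical marker value */

typedef struct {
    uint8_t cell[SIM_BLOCK_BYTES];
} sim_block;

/* Erase up to `budget` bytes; a budget below the block size models a
 * power loss in the middle of the erase. Returns true if it completed. */
static bool sim_erase(sim_block *b, int budget)
{
    assert(b != NULL && budget >= 0);
    int n = budget < SIM_BLOCK_BYTES ? budget : SIM_BLOCK_BYTES;
    memset(b->cell, 0xFF, (size_t)n);
    return n == SIM_BLOCK_BYTES;
}

/* Erase, then write the marker only once the erase has fully completed. */
static bool erase_with_cleanmarker(sim_block *b, int budget)
{
    if (!sim_erase(b, budget))
        return false;                                   /* marker never written */
    b->cell[0] = (uint8_t)(b->cell[0] & CLEAN_MARKER);  /* programming clears bits */
    return true;
}

/* At mount, a block is trusted as erased only if the marker is present;
 * a block that merely reads as all 0xFF is not trusted. */
static bool has_cleanmarker(const sim_block *b)
{
    assert(b != NULL);
    return b->cell[0] == CLEAN_MARKER;
}
```

The point of the scheme is the second comment: a block that reads back as all 0xFF but carries no marker is treated as suspect and erased again, since its cells may only be marginally erased.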

Regards,
Sergei