fatfs seems to confuse the Master Boot Record with the FAT32 boot record.


bob.fe...@rafresearch.com

Jul 20, 2018, 9:21:38 PM
to NuttX
I have been observing the below error messages every time I automount an SD-card.

fat_checkbootrecord: ERROR: Signature: aa55 FS: 0 HW sectorsize: 512
(That is with a new, manufacturer formatted SD-card. After Linux re-formats the card it's FS: 36352.)

It did not seem to cause any failures, so I made it a lower priority for investigation. Now that I had a bit of time to look further, these are my initial findings.

First let's refresh our memories with the FAT32 file system structure.

Master Boot Record (MBR) located in the first sector of an SD-card (sector 0x0).
It contains executable boot code, four partition entries, and the signature bytes 0x55, 0xAA in bytes 510 & 511 (0xAA55 when read as a little-endian 16-bit value).

FAT32 Boot Record located at sector 0x800.
This sector contains a jump instruction, BytesPerSector, SectorsPerCluster, and all of the other things that fs/fat/fs_fat32.h claims are in the MBR. It also has the signature bytes 0x55, 0xAA in bytes 510 & 511.
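A minimal sketch of the two fields that matter here, assuming the standard FAT layout (BytesPerSector as a 16-bit little-endian value at offset 11, signature at offset 510). The helper names are hypothetical and are not the macros from fs/fat/fs_fat32.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define OFF_BYTESPERSEC 11   /* 16-bit little-endian BytesPerSector */
#define OFF_SIGNATURE   510  /* bytes 0x55, 0xAA */

/* Read a little-endian 16-bit value from an arbitrary byte offset. */
static uint16_t get_le16(const uint8_t *p)
{
  return (uint16_t)(p[0] | ((uint16_t)p[1] << 8));
}

/* Both the MBR and the FAT boot record end with 0x55, 0xAA, so this
 * test alone cannot tell them apart.
 */
static int has_boot_signature(const uint8_t *sector)
{
  assert(sector != NULL);
  return get_le16(&sector[OFF_SIGNATURE]) == 0xaa55;
}

/* Only a FAT boot record carries a meaningful BytesPerSector field.
 * In an MBR the same offset falls inside the boot code area, so the
 * value read there is whatever the boot code happens to contain.
 */
static uint16_t bytes_per_sector(const uint8_t *sector)
{
  assert(sector != NULL);
  return get_le16(&sector[OFF_BYTESPERSEC]);
}
```

With an all-zero boot code area this reads 0, which matches the "FS: 0" in the message above. The 36352 seen after the Linux re-format is 0x8E00, which would correspond to the bytes 0x00, 0x8E at offsets 11 and 12.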

The code that generates this error message is...
if (MBR_GETSIGNATURE(fs->fs_buffer) != BOOT_SIGNATURE16 ||
    MBR_GETBYTESPERSEC(fs->fs_buffer) != fs->fs_hwsectorsize)
  {
    ferr("ERROR: Signature: %04x FS sectorsize: %d HW sectorsize: %d\n",
          MBR_GETSIGNATURE(fs->fs_buffer), MBR_GETBYTESPERSEC(fs->fs_buffer),
          fs->fs_hwsectorsize);

    return -EINVAL;
  }


The problem is that fat is really reading these fields from the MBR (sector 0) when they are in the FAT32 Boot Record located at sector 0x800.

I am posting this to the Group because I don't know how far this confusion has spread in the fatfs code. This aspect of it seems to have only cosmetic consequences, but I am still seeing occasional automount bind failures (not debounce related), and I am wondering whether this MBR vs FAT32 Boot Record confusion is more widespread than this cosmetic issue.

Has anyone else been noticing anomalies during mount or automount?

Does anyone know the history of this code and can shed some light on other sections that may have been affected by the MBR vs FAT32 Boot Record confusion?

Regards,
Bob Feretich

patacongo

Jul 21, 2018, 8:53:03 AM
to NuttX
I have seen that error before, but I cannot replicate it now (using STM32F4Discovery with baseboard and microSD card)

bob.fe...@rafresearch.com

Jul 21, 2018, 5:26:46 PM
to NuttX
It's probably something architecture specific then.
I'll look deeper. I think there is an uninitialized variable issue somewhere in there too, an automount bind failure popped up, but went away with a change to an unrelated source file. I haven't been able to reproduce it again.

Gregory Nutt

Jul 21, 2018, 5:30:27 PM
to nu...@googlegroups.com

> It's probably something architecture specific then.
> I'll look deeper. I think there is an uninitialized variable issue
> somewhere in there too, an automount bind failure popped up, but went
> away with a change to an unrelated source file. I haven't been able to
> reproduce it again.

On Cortex-M7, data cache issues (data cache configuration, cache
coherency, data alignment with respect to the cache line) are often the
cause of strange behavior... especially when DMA is involved.

I always disable write back data cache.  I have never seen that work
reliably, mostly due to data alignment issues.  I talk about some of
these issues here:
http://www.nuttx.org/doku.php?id=wiki:howtos:port-drivers_stm32f7
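
To illustrate the alignment point, here is a minimal sketch assuming the 32-byte data cache line of the Cortex-M7. The names are hypothetical and this is not NuttX code; the actual cache clean/invalidate calls are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DCACHE_LINE_SIZE 32u  /* Cortex-M7 data cache line size in bytes */

/* Round a length up to a whole number of cache lines. */
static size_t cacheline_round_up(size_t len)
{
  return (len + DCACHE_LINE_SIZE - 1u) & ~((size_t)DCACHE_LINE_SIZE - 1u);
}

/* A DMA buffer that starts on a cache-line boundary and spans whole
 * lines, so that cleaning or invalidating it cannot touch neighbouring
 * variables that happen to share a line.
 */
static _Alignas(DCACHE_LINE_SIZE) uint8_t g_dma_sector[512];

/* Check that a buffer both starts and ends on cache-line boundaries. */
static int is_cacheline_aligned(const void *p, size_t len)
{
  assert(p != NULL);
  return ((uintptr_t)p % DCACHE_LINE_SIZE) == 0 &&
         (len % DCACHE_LINE_SIZE) == 0;
}
```

A buffer that fails is_cacheline_aligned() shares a cache line with unrelated data, which is the situation where an invalidate after DMA can discard someone else's pending write.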

bob.fe...@rafresearch.com

Jul 21, 2018, 7:34:33 PM
to NuttX
I believe that if you enable file system debug error messages, then you will see the error message too.

The error message is erroneous.

fatfs starts looking for the FAT Boot Record (FBR) (confusingly referred to in the code as the MBR).
  1. It reads sector 0, the true MBR, calls fat_checkbootrecord(), which fails because the uint16_t at offset 11 does not equal the hardware blocksize (512). This causes the error message to print.
  2. The fatfs decides that the record it just read may instead be the MBR, which it is, and looks for partition entries.
  3. It then reads the first sector of each partition and calls fat_checkbootrecord() again.
  4. The first partition is a FAT file system, its FBR is at that location, and it satisfies fat_checkbootrecord(), so the mount continues and completes successfully.
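
The four steps above can be sketched as follows. This is an illustration only, not the fatfs source: the in-memory ramdisk, find_boot_record() and check_boot_record() are hypothetical stand-ins, and the partition table offsets (446, 16-byte entries, start sector at entry offset 8) are the standard MBR layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE     512
#define SIG_OFF         510  /* 0x55, 0xAA in both MBR and boot record */
#define BYTESPERSEC_OFF 11   /* only meaningful in a FAT boot record */
#define PART_TABLE_OFF  446  /* four 16-byte partition entries */
#define PART_ENTRY_SIZE 16
#define PART_LBA_OFF    8    /* 32-bit little-endian start sector */

/* Hypothetical in-memory "card" so the sketch is self-contained. */
struct ramdisk
{
  const uint8_t *data;
  uint32_t nsectors;
};

static int ramdisk_read(const struct ramdisk *d, uint32_t lba, uint8_t *buf)
{
  if (lba >= d->nsectors)
    {
      return -1;
    }

  memcpy(buf, d->data + (size_t)lba * SECTOR_SIZE, SECTOR_SIZE);
  return 0;
}

static uint16_t le16(const uint8_t *p)
{
  return (uint16_t)(p[0] | ((uint16_t)p[1] << 8));
}

static uint32_t le32(const uint8_t *p)
{
  return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
         ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Same two conditions as the check quoted earlier in the thread. */
static int check_boot_record(const uint8_t *s, uint16_t hwsectorsize)
{
  return le16(&s[SIG_OFF]) == 0xaa55 &&
         le16(&s[BYTESPERSEC_OFF]) == hwsectorsize;
}

/* Returns 0 and stores the boot record's sector, or -1 if none found. */
static int find_boot_record(const struct ramdisk *d, uint16_t hwsectorsize,
                            uint32_t *lba_out)
{
  uint8_t mbr[SECTOR_SIZE];
  uint8_t buf[SECTOR_SIZE];
  int i;

  assert(d != NULL && lba_out != NULL);

  if (ramdisk_read(d, 0, mbr) != 0)
    {
      return -1;
    }

  /* Step 1: sector 0 may itself be a boot record (no partition table). */
  if (check_boot_record(mbr, hwsectorsize))
    {
      *lba_out = 0;
      return 0;
    }

  /* Steps 2-4: treat sector 0 as an MBR and probe each partition. */
  for (i = 0; i < 4; i++)
    {
      const uint8_t *entry = &mbr[PART_TABLE_OFF + i * PART_ENTRY_SIZE];
      uint32_t start = le32(&entry[PART_LBA_OFF]);

      if (start != 0 && ramdisk_read(d, start, buf) == 0 &&
          check_boot_record(buf, hwsectorsize))
        {
          *lba_out = start;
          return 0;
        }
    }

  return -1;
}
```

In this sketch the first check_boot_record() call is the one that fails on a partitioned card, which is the point where the message is printed even though the mount goes on to succeed.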

Do you mind if I change the message emitted by fat_checkbootrecord() from ferr() to finfo()?

Mount will produce an error message if it checks all four partitions and does not find a valid FBR.


Also, do you mind if I change the erroneous MBR references to FBR?

bob.fe...@rafresearch.com

Jul 21, 2018, 7:35:38 PM
to NuttX
I have the D-cache in store-through mode.

patacongo

Jul 21, 2018, 7:58:52 PM
to NuttX

> I believe that if you enable file system debug error messages, then you will see the error message too.

No, I do not see the error and I do have file system debug enabled.  The error is definitely NOT occurring.

> Do you mind if I change the message emitted by fat_checkbootrecord() from ferr() to finfo()?


Perhaps a warning.  But don't waste a lot of time for everyone involved with too many cosmetic changes.
 

> Mount will produce an error message if it checks all four partitions and does not find a valid FBR.


> Also, do you mind if I change the erroneous MBR references to FBR?



My preference, again, would be that you not make a lot of such cosmetic changes.  They are kind of annoying.  They could be folded in with some more important changes.

If your scenario is correct, then the best route would be to fix the error.  I am not, however, seeing any such error messages with file system debug on, so I don't know what to think.  I have not dug into the code (recently) as you have, but your analysis seems reasonable.  But then wouldn't I see the error too?  I don't.



patacongo

Jul 21, 2018, 8:02:28 PM
to NuttX
> 1. It reads sector 0, the true MBR, calls fat_checkbootrecord(), which fails because the uint16_t at offset 11 does not equal the hardware blocksize (512). This causes the error message to print.

According to your previous statements, it prints:

fat_checkbootrecord: ERROR: Signature: aa55 FS: 0 HW sectorsize: 512

Why do you say that the block size is incorrect?  Isn't 512 correct?  It might instead be complaining about the signature.

bob.fe...@rafresearch.com

Jul 21, 2018, 8:09:25 PM
to NuttX
Do you mount the SD-card or mount a partition on the SD-card?

If I insert a two-partition SD-card in Linux, I see something like...
ls /dev
...
SDB
SDB1
SDB2
...

Then if I want to mount the second partition, I mount /dev/SDB1

For Nuttx I am using a card with only one partition; /dev/sdcard

Should I be seeing /dev/sdcard and /dev/sdcard1  ?

If mount was given /dev/sdcard1, then the first sector it reads should be the FBR.

patacongo

Jul 21, 2018, 8:14:03 PM
to NuttX
I don't recall, but I think only a single partition will ever be used.

bob.fe...@rafresearch.com

Jul 21, 2018, 8:19:17 PM
to NuttX
I meant that I mount /dev/SDB2 for the second partition.

bob.fe...@rafresearch.com

Jul 21, 2018, 8:23:50 PM
to NuttX
OK, I'll change the ferr() to fwarn(); and leave it at that.

How about a comment just above the erroneous MBR definitions stating that "The below references to MBR actually refer to the Fat Boot Record."?

bob.fe...@rafresearch.com

Jul 21, 2018, 8:26:06 PM
to NuttX
fat_checkbootrecord: ERROR: Signature: aa55 FS sectorsize: 0 HW sectorsize: 512

aa55 is the correct signature for both the MBR and FBR. (Unfortunate that Microsoft made them the same.)
FS sectorsize: is from the sector that was read. (MBR_GETBYTESPERSEC(fs->fs_buffer))
HW sectorsize: is, I think, from the low level driver. (fs->fs_hwsectorsize)

patacongo

Jul 21, 2018, 8:30:01 PM
to NuttX

> How about a comment just above the erroneous MBR definitions stating that "The below references to MBR actually refer to the Fat Boot Record."?

That would be a really silly thing to do.

bob.fe...@rafresearch.com

Jul 21, 2018, 8:35:30 PM
to NuttX
Why silly?
I wasted some time and lost some confidence in fatfs by thinking that fatfs really thought that the described fields were in the MBR.

Also, since this is a one word (ferr to fwarn) change, and maybe a one line comment, would it be easier for you to process a submitted patch, or just make the change yourself?

patacongo

Jul 21, 2018, 8:50:05 PM
to NuttX
Okay, you have convinced me.  I think you are right on all counts.

Look also at apps/fsutils/mkfatfs.  It has the corresponding error.  It looks like I screwed the MBR stuff up when I wrote that code back in August 2008... almost 10 years ago!

I do not see the error because I use the NuttX mkfatfs to format FAT and it does force the block size right in the middle of the boot code.  So there is no error reported.

Renaming MBR->FBR would make some of that clearer.  I even agree with that now.  Any changes to nuttx/fs/fat need to also be made to apps/fsutils/mkfatfs.

Curious however, because I did reformat the SD card on Windows 10 and still did not see the error.  And also Windows 10 is completely happy with the FAT formatted by NuttX.

Still some mysteries.

bob.fe...@rafresearch.com

Jul 21, 2018, 9:25:56 PM
to NuttX
I doubt that Windows has used the boot code in the MBR since the 80286. Linux doesn't either.
When a card is manufactured the boot code section is set to all zeros.
Linux format plugs in the original Microsoft boot code to be officially compliant.
I don't know about the Windows 10 formatting program. I don't believe the default "Quick format" option rewrites the boot code portion of the sector.

Thanks for validating my observations. I could not understand how you were not getting the error message. NuttX mkfatfs forcing the block size explains it.

So, given that this is all cosmetic...
  • I agree that making all the changes risks really breaking the code, and since it has been very stable, that risk should not be taken lightly.
  • I think that the ferr() to finfo() change is appropriate, since if there is really no FBR, then mount will generate an ferr().
  • Maybe a comment like...
    "The below references to MBR actually refer to the Fat File System Master Boot Record."
    That may be enough to never have to fix it.

spudaneco

Jul 21, 2018, 9:48:00 PM
to nu...@googlegroups.com
I don't like the third option.  Such a comment is still silly in my opinion.  It is like having a picture of a dog preceded by the comment that the following dog is really a cat.

I would prefer that you change the naming as you originally suggested.




bob.fe...@rafresearch.com

Jul 21, 2018, 10:47:08 PM
to NuttX
Roger.
I will grep MBR*, specifically look into fatfs and apps/fsutils/mkfatfs, and be very careful.

patacongo

Jul 22, 2018, 8:21:38 AM
to nu...@googlegroups.com
The code is actually fine for the case supported by the FS which is a very old version of FAT before partitions were supported.  In the old version, all of the file system information was in the MBR just as implemented.  And apparently, Windows and Linux still support the old version although it is difficult to find documentation (they must because the NuttX mkfatfs does not create any partition information).

But when partitions were added, things went awry.  No new FBR definitions were created.  I am not sure how things were working with partitions before.  It will take a while to straighten out the details.  But that is okay.  Things are working fine on the master branch as it is now... except for the error message that is annoying you.

patacongo

Jul 22, 2018, 11:04:09 AM
to NuttX
I have committed the changes.

I now have a much better feeling for what is going on.  The MBR may contain a boot record (or it may not) and it may include a partition table (or it may not), and the first sector of each partition also may contain a boot record.  The naming FBR and MBR is really useless.  There is just a boot record and it may lie in any of those five locations (the MBR or one of the four partition boot sectors).  So although I added a lot of useless definitions for the FBR that duplicate those for the MBR, no change was really necessary.  The code was completely correct as it was.

A better resolution would have been to have just generic boot record definitions that apply equally in all cases (which they do).  As it is, I have probably deepened the naming chaos by making the MBR/FBR distinction where there is really none:  Those are just two different "containers" that may hold the common boot record.  Having a single BR would have been cleaner.

So the bottom line is that the only real change was changing the ferr() to fwarn() debug output.  Those were not fwarn() to begin with because the warnings are relatively new (compared to this ancient FAT driver).  There used to be only dbg() and vdbg().  When the conversion was made, dbg() became err() and vdbg() became info().  warn() calls have been manually added since.

Greg

bob.fe...@rafresearch.com

Jul 22, 2018, 1:14:58 PM
to NuttX
OK. Works for me. Let's consider this issue "Solved".