Raspbian on ext4: Structure needs cleaning

Markus Robert Kessler

Mar 19, 2021, 6:55:44 PM

Hi all,

I have several Raspberries running. One of them is used to collect sensor
data, and since it is located > 100 miles distant, it is currently only
accessible via VPN.

The last reboot was at the end of November, and now I see that stat, ls
etc. on some directories on the ext4 partition return the error
"Structure needs cleaning".

Reading in some RPi fora it looks like this is a common problem related
to ext4.

So, some questions (since I don't have the chance to drive to that
location now and replace the SD card / installation):

- Has anyone already installed Raspbian (Buster) on ext3 instead of ext4,
and if so, will this protect the card from further problems like this?

- What can be done remotely now to "repair" the FS? -- I can hardly pull
the card out of it and run an fsck on a second machine via VPN.
B.t.w., badblocks came back with 0 errors, the SD card seems still ok.

I assume that the next reboot will be the last one...
So, any ideas highly appreciated. -- Thanks!

Best regards,

Markus

--
Please reply to group only.
For private email please use http://www.dipl-ing-kessler.de/email.htm
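
For context: "Structure needs cleaning" is the message Linux prints for
errno EUCLEAN, which ext4 returns when it finds inconsistent on-disk
metadata. A minimal sketch of how one might confirm this remotely without
writing to the card follows. The helper names ext4_errors and fs_state are
made up for illustration, and the device name in the usage comment is the
one that appears later in the thread.

```shell
#!/bin/sh
# Print only the ext4 error/warning lines from a kernel log read on stdin.
ext4_errors() {
    grep -E 'EXT4-fs (error|warning)'
}

# Print the "Filesystem state" value from `tune2fs -l` output read on stdin.
fs_state() {
    sed -n 's/^Filesystem state:[[:space:]]*//p'
}

# Usage on the affected Pi (as root):
#   dmesg | ext4_errors
#   tune2fs -l /dev/mmcblk0p2 | fs_state
```

A state other than "clean" (for example "clean with errors") would suggest
that the kernel has already flagged the filesystem for checking.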

Martin Gregorie

Mar 19, 2021, 8:00:58 PM

On Fri, 19 Mar 2021 22:55:43 +0000, Markus Robert Kessler wrote:

> - What can be done remotely now to "repair" the FS? -- I can hardly pull
> the card out of it and run an fsck on a second machine via VPN.
> B.t.w., badblocks came back with 0 errors, the SD card seems still ok.
>

A few questions:

- How full is the card? "df -h" should show that.

- How long has this card been in use?

- Are you running 'fstrim' on the card and, if so, how frequently?
-- but see below ---

- Can you describe how you collect and store data on the Pi4 - by that I
mean how big are the files, how many are held on the Pi, how long is
each file held, or are you using a database?

- If you're using files, how is each written to? IOW is it left open and
data added until it hits a preset limit and a new one is opened, or is
the file normally closed and every so often the file is opened, data
appended to it and the file closed again. How often are old files
discarded to make room for more data and how are the old files chosen
for deletion?

- How easy is it to change the maximum number and size of files [or
rows in the database table(s)] on the Pi?

** Guess **
Could it be that you have never run fstrim? If so, running it may help:
something like "sudo fstrim -v /home" should do the trick. However, I've
just tried it on a Pi 2B running Buster and, since I don't usually run
fstrim on the Pi, I thought it would do its thing. (I run it weekly on a
Lenovo laptop with a Sanyo 120 GB SSD and Fedora Linux; several KB of
blocks are trimmed each time fstrim is run.) However, on the Pi 2 (16GB SD
card fitted) 'fstrim' just reports zero bytes trimmed, i.e. it didn't do
anything.

I assume from this that by design fstrim does nothing when pointed at a
partition on an SD card. Can anybody confirm this? The manpage is silent
about using it on SD cards.

However, the other stuff I asked about should let us make sensible
suggestions.

--
Martin | martin at
Gregorie | gregorie dot org

Markus Robert Kessler

Mar 20, 2021, 4:13:44 AM

On Sat, 20 Mar 2021 00:00:55 +0000 Martin Gregorie wrote:

Hi,

> On Fri, 19 Mar 2021 22:55:43 +0000, Markus Robert Kessler wrote:
>
>> - What can be done remotely now to "repair" the FS? -- I can hardly
>> pull the card out of it and run an fsck on a second machine via VPN.
>> B.t.w., badblocks came back with 0 errors, the SD card seems still ok.
>>
> A few questions:
>
> - How full is the card? "df -h" should show that.
>
> - How long has this card been in use?

Installation was at the end of November on a fresh 16 GB card.

As recommended in this group, I did not perform reboots every day /
night, since there are only some bash and python scripts running, which
consume only a few resources and exit after running properly.

$ who -r
run-level 3 Nov 30 09:21
$
$
$ df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/root        15G  2.4G   12G  18% /
devtmpfs        184M     0  184M   0% /dev
tmpfs           216M     0  216M   0% /dev/shm
tmpfs           216M   22M  194M  11% /run
tmpfs           5.0M  4.0K  5.0M   1% /run/lock
tmpfs           216M     0  216M   0% /sys/fs/cgroup
/dev/mmcblk0p1  253M   54M  199M  22% /boot
tmpfs            44M     0   44M   0% /run/user/0
$
$
$ cd /etc/resolvconf/update-libc.d
$ ll
ls: cannot access 'avahi-daemon': Structure needs cleaning
total 8.0K
drwxr-xr-x 2 root root 4.0K Feb 13 2020 ./
drwxr-xr-x 3 root root 4.0K Feb 13 2020 ../
-????????? ? ? ? ? ? avahi-daemon
$ cat avahi-daemon
cat: avahi-daemon: Structure needs cleaning
$ fstrim -a -v
/boot: 197.4 MiB (206990848 bytes) trimmed on /dev/mmcblk0p1
/: 0 B (0 bytes) trimmed on /dev/mmcblk0p2
$ ll
ls: cannot access 'avahi-daemon': Structure needs cleaning
total 8.0K
drwxr-xr-x 2 root root 4.0K Feb 13 2020 ./
drwxr-xr-x 3 root root 4.0K Feb 13 2020 ../
-????????? ? ? ? ? ? avahi-daemon
$

> - Are you running 'fstrim' on the card and, if so, how frequently?
> -- but see below ---
>

Well, indeed, I have never used fstrim so far.
In this case, though, it seems to do nothing.

> - Can you describe how you collect and store data on the Pi4 - by that I
> mean how big are the files, how many are held on the Pi, how long is
> each file held, or are you using a database?

Data is received via the I2C bus, processed and transmitted to an outside
webserver. This is only a few kilobytes, and no sensor data is stored on
disk.

> - If you're using files, how is each written to? IOW is it left open and
> data added until it hits a preset limit and a new one is opened, or is
> the file normally closed and every so often the file is opened, data
> appended to it and the file closed again. How often are old files
> discarded to make room for more data and how are the old files chosen
> for deletion?
>
> - How easy is it to change the maximum number and size of files [or
> rows in the database table(s)] on the Pi?
>
> ** Guess **
> Could it be that you have never run fstrim? If so, running it may help:
> something like "sudo fstrim -v /home" should do the trick. However, I've
> just tried it on a Pi 2B running Buster and, since I don't usually run
> fstrim on the Pi, I thought it would do its thing. (I run it weekly on a
> Lenovo laptop with a Sanyo 120 GB SSD and Fedora Linux; several KB of
> blocks are trimmed each time fstrim is run.) However, on the Pi 2 (16GB
> SD card fitted) 'fstrim' just reports zero bytes trimmed, i.e. it didn't
> do anything.

Yes, same here. See above.

> I assume from this that by design fstrim does nothing when pointed at a
> partition on an SD card. Can anybody confirm this? The manpage is
> silent about using it on SD cards.
>
> However, the other stuff I asked about should let us make sensible
> suggestions.

So, the errors still persist, and I don't dare to do a reboot...

Martin Gregorie

Mar 20, 2021, 11:22:26 AM

On Sat, 20 Mar 2021 09:12:57 -0400, Dennis Lee Bieber wrote:

> On Sat, 20 Mar 2021 00:00:55 -0000 (UTC), Martin Gregorie
> <mar...@mydomain.invalid> declaimed the following:

>
>
>
>>I assume from this that by design fstrim does nothing when pointed at a
>>partition on an SD card. Can anybody confirm this? The manpage is
>>silent about using it on SD cards.
>>

> It does, however, state:
>
> """
> -a, --all
> Trim all mounted filesystems on devices that support the
> discard operation. The other supplied options, like
> --offset, --length and --minimum, are applied to all these
> devices. Errors from filesystems that do not support the
> discard operation, read-only devices and read-only
> filesystems are silently ignored.
> """
>
> I would suspect SD cards do not have "discard" (after all, they are,
> for the most part, optimized for FAT file systems)

That makes sense. Thanks.
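
The fstrim output quoted earlier in the thread does report bytes trimmed
on /dev/mmcblk0p1, so this particular card appears to accept discard
requests. One hedged way to check directly is to read the discard
granularity the kernel exposes in sysfs, where 0 is documented as meaning
no discard support. The function name and the optional second argument (an
alternative sysfs root, only there to make the function testable) are
assumptions for this sketch.

```shell
#!/bin/sh
# Succeed if the kernel reports discard (TRIM) support for a block device.
# $1: device name without /dev/, e.g. mmcblk0
# $2: sysfs root, defaults to /sys (hypothetical parameter, for testing only)
supports_discard() {
    gran_file="${2:-/sys}/block/$1/queue/discard_granularity"
    [ -r "$gran_file" ] || return 1
    [ "$(cat "$gran_file")" -gt 0 ]
}

# Usage on the Pi:
#   if supports_discard mmcblk0; then echo "discard supported"; fi
#   lsblk --discard /dev/mmcblk0    # shows the same limits as a table
```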

Martin Gregorie

Mar 20, 2021, 12:24:46 PM

On Sat, 20 Mar 2021 08:13:43 +0000, Markus Robert Kessler wrote:

> On Sat, 20 Mar 2021 00:00:55 +0000 Martin Gregorie wrote:
>
> installation was end of November on a fresh 16 GB card.
>

OK, but is it a no-name card or from one of the better brands? I've been
using Sandisk for everything (RPi, camera, glider navigation system and
flight logger) for a few years now and have had no card-related problems.

> As recommended in this group, I did not perform reboots every day /
> night, since there are only some bash and python scripts running, which
> consume only few resources and are ending after running properly.
>
> $ who -r
> run-level 3 Nov 30 09:21
> $
> $
> $ df -h
> Filesystem      Size  Used Avail Use% Mounted on
> /dev/root        15G  2.4G   12G  18% /
> devtmpfs        184M     0  184M   0% /dev
> tmpfs           216M     0  216M   0% /dev/shm
> tmpfs           216M   22M  194M  11% /run
> tmpfs           5.0M  4.0K  5.0M   1% /run/lock
> tmpfs           216M     0  216M   0% /sys/fs/cgroup
> /dev/mmcblk0p1  253M   54M  199M  22% /boot
> tmpfs            44M     0   44M   0% /run/user/0
> $
> $

OK - nothing wrong so far

> $ cd /etc/resolvconf/update-libc.d
> $ ll
> ls: cannot access 'avahi-daemon': Structure needs cleaning
> total 8.0K
> drwxr-xr-x 2 root root 4.0K Feb 13 2020 ./
> drwxr-xr-x 3 root root 4.0K Feb 13 2020 ../
> -????????? ? ? ? ? ? avahi-daemon
> $ cat avahi-daemon
> cat: avahi-daemon: Structure needs cleaning
> $ fstrim -a -v
> /boot: 197.4 MiB (206990848 bytes) trimmed on /dev/mmcblk0p1
> /: 0 B (0 bytes) trimmed on /dev/mmcblk0p2
> $ ll
> ls: cannot access 'avahi-daemon': Structure needs cleaning
> total 8.0K
> drwxr-xr-x 2 root root 4.0K Feb 13 2020 ./
> drwxr-xr-x 3 root root 4.0K Feb 13 2020 ../
> -????????? ? ? ? ? ? avahi-daemon
> $
>

I think you'll find that avahi-daemon is only needed if your RPi needs to
talk to some sort of Apple computer. Here's the synopsis:

The Avahi mDNS/DNS-SD daemon implements Apple's Zeroconf architecture
(also known as "Rendezvous" or "Bonjour"). The daemon registers local IP
addresses and static services using mDNS/DNS-SD and provides two IPC APIs
for local programs to make use of the mDNS record cache the avahi-daemon
maintains. First there is the so called "simple protocol" which is used
exclusively by avahi-dnsconfd (a daemon which configures unicast DNS
servers using server info published via mDNS) and nss-mdns (a libc NSS
plugin, providing name resolution via mDNS). Finally there is the D-Bus
interface which provides a rich object oriented interface to D-Bus
enabled applications.

So, it seems that if you are using Apple kit you need it, otherwise kill
it. I only connect to my RPi from a Linux system, so I don't understand
or use avahi-daemon.

> Well, indeed, I never used fstrim so far.
> But in this case it seems to do nothing, though.
>

OK

> Data is received via I2C bus, processed and transmitted to a webserver
> outside. These are only some Kilobytes, and there is no sensor data
> stored on disk.
>

OK

That looks like all the obvious stuff covered, then.

If nothing else occurs to you or is suggested, check the SD card with
fsck: "fsck -A -s" would seem appropriate. Tell it not to fix any
problems if it finds something and asks whether it should repair it. IOW,
treat this as just a problem scan, and only consider what to do if the
complete fsck scan shows any errors.
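
A read-only scan of this kind reports its findings through fsck's exit
status, which the fsck(8) manual page documents as a sum of flag values.
The sketch below decodes that status. explain_fsck_status is a
hypothetical helper, and the -n option in the usage comment is passed
through to the filesystem-specific checker (e2fsck treats it as "open
read-only and answer no to every question"). Results from checking a
mounted filesystem should be treated with caution.

```shell
#!/bin/sh
# Print one line per condition encoded in an fsck exit status.
explain_fsck_status() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "no errors"
        return 0
    fi
    if [ $((status & 1)) -ne 0 ]; then echo "filesystem errors corrected"; fi
    if [ $((status & 2)) -ne 0 ]; then echo "system should be rebooted"; fi
    if [ $((status & 4)) -ne 0 ]; then echo "filesystem errors left uncorrected"; fi
    if [ $((status & 8)) -ne 0 ]; then echo "operational error"; fi
    if [ $((status & 16)) -ne 0 ]; then echo "usage or syntax error"; fi
    if [ $((status & 32)) -ne 0 ]; then echo "checking canceled by user request"; fi
    if [ $((status & 128)) -ne 0 ]; then echo "shared-library error"; fi
    return 0
}

# Usage (as root):
#   fsck -A -s -n; explain_fsck_status $?
```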

If errors are found, try to back up anything useful (code, scripts etc.
that aren't already backed up) and then, if you're feeling keen, back up
the SD card onto new backup media, i.e. don't overwrite a good backup.

Then:
- try using fsck to fix errors. If that works, great.
- otherwise use gparted to clear the SD card, repartition and reformat it
and copy the backed up stuff back onto it and see if it's now OK.
- if still not fixed, repeat the last step with a new disk unless your
backup was to a new, freshly partitioned and formatted SD card, in
which case, use that as the RPi's main card and junk the original.

Markus Robert Kessler

Mar 20, 2021, 2:02:54 PM

On Sat, 20 Mar 2021 16:24:45 +0000 Martin Gregorie wrote:

> On Sat, 20 Mar 2021 08:13:43 +0000, Markus Robert Kessler wrote:
>
>> On Sat, 20 Mar 2021 00:00:55 +0000 Martin Gregorie wrote:
>>
>> installation was end of November on a fresh 16 GB card.
>>
> OK, but is it a noname card or from one of the better brands? I've been
> using Sandisk for everything (RPi, camera, glider navigation system and
> flight logger) a few years now and have had no card-related problems.

I only use brands like Sandisk, Samsung EVO and similar.

It makes me cry to see that the card is totally ok,

# badblocks -vvv /dev/mmcblk0
Checking blocks 0 to 15446015
Checking for bad blocks (read-only test):
done
Pass completed, 0 bad blocks found. (0/0/0 errors)

and this seems to be one more ext4-issue.

In the meantime the filesystem became more and more corrupted after I
tried to run fsck.ext4.
Finally, there were errors in /home, and even /var was empty (!)...

So, the last thing I tried was to switch to NFS or NAS boot, but I found
that the total storage space at that location was far from sufficient.
Even worse, init didn't work either.

Since even /lib was more and more messed up, not even a shutdown / halt /
poweroff etc. was possible. So, I blocked it at the firewall to
prevent it from doing unpredictable things after sshd also crashed and I
lost the connection.

So, end of the line here. Oh man...

B.t.w.,

I set up one more box here with ext3 rootfs to make some experiments.
It works perfectly, and if the installation survives the next days then I
will switch all of my machines to ext3 one after the other.

So, thank you all for the nice discussion!

Martin Gregorie

Mar 20, 2021, 4:21:09 PM

On Sat, 20 Mar 2021 18:02:53 +0000, Markus Robert Kessler wrote:

>
> I only use brands like Sandisk, Samsung EVO and similar.
>

Good.

> It makes me cry to see that the card is totally ok,
>
> # badblocks -vvv /dev/mmcblk0
> Checking blocks 0 to 15446015
> Checking for bad blocks (read-only test):
> done
> Pass completed, 0 bad blocks found. (0/0/0 errors)
>

Good.

> and this seems to be one more ext4-issue.
>

Not necessarily - see below.

Besides, IME ext3 and ext4 are very reliable filing systems - I've never
had any problems with either, even when retrieving the /home directory
structure from a hard drive that was failing due to old age (50,000
hours).

> In the meantime the filesystem was going more and more corrupted after I
> tried to perform fsck.ext4.
> Finally, there were errors in /home, and even /var was empty (!)...
>
> So, the last thing I tried was to switch to NFS- or NAS-boot but I had
> to see that the total storage space at that location was by far not
> sufficient. Even worse, init didn't work either.
>
> Since even /lib was more and more messed up, not even a shutdown / halt
> / poweroff etc. was possible. So, I kicked it out in the firewall to
> prevent it from doing unpredictable things after sshd also crashed and I
> lost the connection.
>
> So, end of the line here. Oh man...

The reason that I suggested running "fsck -A -s" is because:

- the -A option tells fsck to check every partition in /etc/fstab, using
the appropriate filing system checker, as specified in /etc/fstab, for
each partition

- the -s option tells it to check one partition at a time. The default
is to check them all at once, but this does mean that the error
messages will be jumbled together if more than one partition has errors.
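
To see in advance what "fsck -A" would visit, one can list the /etc/fstab
entries whose sixth field (the pass number) is non-zero, since entries
with pass number 0 are skipped. A small sketch, with fsck_targets as an
invented helper name:

```shell
#!/bin/sh
# Read fstab-formatted text on stdin and print "pass device fstype" for
# every entry that fsck -A would check, lowest pass number first.
fsck_targets() {
    awk '$1 !~ /^#/ && NF >= 6 && $6 > 0 { print $6, $1, $3 }' | sort -n
}

# Usage:
#   fsck_targets < /etc/fstab
```

On a typical Raspbian card this should list the ext4 root partition and
the VFAT boot partition, each with its own checker type.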

So, what exactly did you run? Just fsck.ext4? If so, with what options?

If you let fsck.ext4 loose on both partitions of course it would throw
errors because the boot partition is VFAT, not EXT4, but if you *DID NOT*
let fsck.ext4 make any changes, then the filing system should not have
been damaged (any more than it was already).

Did you make a backup copy, as I also suggested, before running fsck.ext4?
If not, and the errors are due to telling fsck.ext4 to scan a VFAT
partition, then your larger EXT4 partition may well be salvageable, but
we can't tell you how unless we know what other computers you have and
what operating systems they run.

For example everything here, apart from my RPi, runs on X86 chips and has
Fedora Linux installed, so I could transfer only stuff I've written on my
RPi to another SD card by:

- use gparted to make a same-sized pair of VFAT and EXT4 partitions on
a new SD card
- make two tar archives, one containing everything in /home and the
other containing everything in /usr/local, saving them on a Fedora box
- set up a clean copy of Raspbian Buster on the new card
- unpack the contents of the /home and /usr/local tar archives
over the new Debian install
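
The archive-and-restore steps above could be scripted roughly as follows.
This is a sketch under assumptions: backup_tree and restore_tree are
invented names, and the paths in the usage comment are placeholders for
wherever the old and new cards happen to be mounted.

```shell
#!/bin/sh
# Archive one directory tree into a gzip-compressed tarball.
# $1: directory to archive, $2: archive file to create
backup_tree() {
    tar -czf "$2" -C "$(dirname "$1")" "$(basename "$1")"
}

# Unpack an archive made by backup_tree below a destination directory,
# preserving file permissions.
# $1: archive file, $2: destination directory
restore_tree() {
    tar -xzpf "$1" -C "$2"
}

# Usage (as root; all paths are placeholders):
#   backup_tree /mnt/oldcard/home /srv/backup/home.tgz
#   backup_tree /mnt/oldcard/usr/local /srv/backup/usr-local.tgz
#   restore_tree /srv/backup/home.tgz /mnt/newcard
#   restore_tree /srv/backup/usr-local.tgz /mnt/newcard/usr
```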

This would put me pretty much back in business with all my own code and
data reinstalled in an up-to-date Debian Buster system. And I've done
this several times already, as the SD card has grown from 4GB -> 8GB ->
16GB while Raspbian successively outgrew the previous cards.

Finally, back when you wrote your monitoring system, did you make a
backup copy of the source code, binaries, shell scripts etc. on another
machine before putting your RPi system into everyday use? And, I have to
add: if not, why not?

Markus Robert Kessler

Mar 21, 2021, 4:06:14 AM

Hi Martin,

On Sat, 20 Mar 2021 20:21:07 +0000 Martin Gregorie wrote:

many thanks for your very interesting thoughts!

Well, to make it short,

snip [...]

> So, what exactly did you run? Just fsck.ext4? If so, with what options?

The first partition was OK, so I only checked the second one with

fsck.ext4 -y /dev/mmcblk0p2

I knew that this was not a good idea, but the filesystem was already so
corrupted that it was not even possible to switch to runlevel 1 and
unmount the ext4 partition. Very soon it was clear that I could write off
the installation and there was not much to lose.

I just tried to learn from what happened.

snip [...]

> Did you make a backup copy, as I also suggested, before running
> fsck.ext4?

I have the same data as backups and besides this there are several
machines running with almost identical configuration.

To make things easier, some weeks ago I started with setting up one
sample machine very carefully, I installed everything needed, created the
necessary users, made all updates, and then I created two tgz-balls, one
for boot and one for rootfs.

Making a new instance out of it can now be done easily by taking a new SD
card out of the box, overwriting the first hundred MB with
'cat /dev/zero > /dev/sdb' (stopping after a few seconds) on a Linux
machine, then using fdisk to create a 256 MB Win95 (FAT) partition and a
Linux partition filling the remaining space.

Afterwards I only have to format both partitions with the appropriate
filesystem type and restore the backups.

Of course, the machine name has to be changed in /etc/hosts and
/etc/hostname, the most recent updates have to be applied and so on.

But creating a new installation based on this is a matter of half an hour.
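
Instead of interrupting cat by hand, the wipe can be bounded with dd. The
sketch below is an illustration under assumptions, not the procedure used
above: wipe_head is an invented name, and the device in the usage comment
must be checked carefully, because writing to the wrong device destroys
its contents.

```shell
#!/bin/sh
# Overwrite the first N MiB of a device (or file) with zeros.
# $1: target device or file, $2: number of MiB to zero
wipe_head() {
    dd if=/dev/zero of="$1" bs=1048576 count="$2" conv=notrunc 2>/dev/null
}

# Usage (as root, after verifying the device name with lsblk):
#   wipe_head /dev/sdb 100
```

The conv=notrunc flag only matters when the target is a regular file,
such as a card image, where it keeps the rest of the file intact.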

So, only two things still annoy me somewhat in this case:

- I still do not know why the filesystem turned to dust.
I neither overloaded the machine nor rebooted it all the time.

- Now I have to travel to that location to replace the installation :-)

Well, the new box is already there, ready to replace the crashed one.
This one now has ext3 as second partition.
So, let's see how long this will work.

Thanks a lot!

druck

Mar 21, 2021, 8:37:04 AM

On 19/03/2021 22:55, Markus Robert Kessler wrote:
> Hi all,
>
> I have several Raspberries running. One of them is used to collect sensor
> data, and since it is located > 100 miles distant, it is currently only
> accessible via VPN.
>
> Last reboot was end of November and now I have to see that a stat, ls
> etc. to some directories on the ext4 partition return an error alert
> "structure need cleaning".
>
> Reading in some RPi fora it looks like this is a common problem related
> to ext4.
>
> So, some questions (since I don't have the chance to drive to that
> location now and replace the SD card / installation):

You can change your /boot/cmdline.txt to the following (a single line in
the file; keep root= pointing at your own root partition):-

dwc_otg.lpm_enable=0 console=tty1 root=/dev/sda1 rootfstype=ext4
elevator=deadline fsck.repair=yes rootwait rootdelay=5

This lets fsck repair the disk automatically whenever the filesystem is
checked at boot. It might take a good few minutes to come back up again,
which is worrying when doing it remotely.
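
Editing cmdline.txt remotely is risky, since a typo can leave the Pi
unbootable, so it may help to make the change idempotent and keep the file
on a single line. A minimal sketch, assuming a hypothetical helper name
add_fsck_repair. Note that, per the systemd-fsck documentation,
fsck.repair=yes only controls how repair questions are answered, while
fsck.mode=force is the option that forces a check on every boot.

```shell
#!/bin/sh
# Read a kernel command line on stdin and print it with fsck.repair=yes
# appended, unless some fsck.repair= setting is already present.
add_fsck_repair() {
    line=$(head -n 1)
    case " $line " in
        *" fsck.repair="*) printf '%s\n' "$line" ;;
        *) printf '%s fsck.repair=yes\n' "$line" ;;
    esac
}

# Usage (as root, keeping a backup of the original):
#   cp /boot/cmdline.txt /boot/cmdline.txt.bak
#   add_fsck_repair < /boot/cmdline.txt.bak > /boot/cmdline.txt
```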

> - Has someone already installed Raspbian (Buster) on ext3 instead of ext4,
> and if so, will this prevent the card from further problems like this?

ext3 won't stop you getting these problems; rather, they will cause more
serious corruption there, so stick with ext4.

> - What can be done remotely now to "repair" the FS? -- I can hardly pull
> the card out of it and run an fsck on a second machine via VPN.
> B.t.w., badblocks came back with 0 errors, the SD card seems still ok.

It's just soft corruption, so the above should repair it.

> I assume that the next reboot will be the last one...
> So, any ideas highly appreciated. -- Thanks!

There is always the possibility it won't come back from any remote
reboot, so having someone at the remote location who can put in a backup
SD card is always a good move.

---druck

A. Dumas

Mar 21, 2021, 9:20:02 AM

On 21-03-2021 13:37, druck wrote:
> ext3 wont stop you getting these problems, but rather they will cause
> serious corruption, so stick with ext4.

Yeah, going back from ext4 to ext3 to solve some unknown problem doesn't
seem like a good idea.

Even apart from the inherent stability improvements, it seems likely
that modern kernel and application features depend on certain ext4
extensions like the smaller-than-one-second file timestamps (which also
highlights another ext3 disadvantage: its Y2038 problem).

https://en.wikipedia.org/wiki/Ext3#Disadvantages
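
The Y2038 limit mentioned above comes from storing timestamps as signed
32-bit second counts, whose largest value is 2^31 - 1. As a quick worked
example (assuming GNU date, which accepts -d @SECONDS; epoch_to_utc is an
invented helper):

```shell
#!/bin/sh
# Convert seconds since the Unix epoch to a UTC timestamp (GNU date syntax).
epoch_to_utc() {
    date -u -d "@$1" '+%Y-%m-%d %H:%M:%S'
}

max32=$(( (1 << 31) - 1 ))   # largest signed 32-bit value: 2147483647
epoch_to_utc "$max32"        # prints 2038-01-19 03:14:07
```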

Markus Robert Kessler

Mar 21, 2021, 9:44:48 AM

One thing left to mention, just for the record:

On Sun, 21 Mar 2021 08:06:13 +0000 Markus Robert Kessler wrote:

> snip [...]

> To make things easier, some weeks ago I started with setting up one
> sample machine very carefully, I installed everything needed, created
> the necessary users, made all updates, and then I created two tgz-balls,
> one for boot and one for rootfs.
>
> Making a new instance out of it can now be easily done by taking a new
> SD card out of the box, overwrite the first hundred MBs with
> 'cat /dev/zero > /dev/sdb' (stopping after some seconds) on a linux
> machine, then create a Win95 partition with 256MB and the whole
> remaining space with linux filesystem using fdisk.
>
> Afterwards I only have to format both partitions with the appropriate
> filesystem type and restore the backups.
>
> Of course, the machine name has to be changed in /etc/hosts and
> /etc/hostname, the most recent updates have to be applied and so on.

"And so on": The partition IDs have to be adapted also:

When the SD card is mounted on a Linux workstation, the two Raspberry
partitions will show up as (e.g.) /dev/sdb1 and /dev/sdb2 or similar;
just verify. Then get their partition IDs with blkid /dev/sdb{1,2} and
update boot/cmdline.txt and rootfs/etc/fstab on the SD card.

Otherwise your cigarette box won't boot.
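
On Raspbian Buster images the root partition is normally referenced by
PARTUUID in both files, so the update might be sketched like this. The
helper set_root_partuuid is invented for illustration, and the mount point
in the usage comment is a placeholder.

```shell
#!/bin/sh
# Read a kernel command line on stdin and print it with the root= argument
# replaced by root=PARTUUID=<value of $1>.
set_root_partuuid() {
    sed "s/root=[^ ]*/root=PARTUUID=$1/"
}

# Usage (as root, with the card's boot partition mounted at /mnt/boot):
#   id=$(blkid -s PARTUUID -o value /dev/sdb2)
#   set_root_partuuid "$id" < /mnt/boot/cmdline.txt > /tmp/cmdline.new
#   # inspect /tmp/cmdline.new, then copy it over /mnt/boot/cmdline.txt;
#   # the PARTUUID entries in etc/fstab on the root partition need the
#   # same treatment
```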

A. Dumas

Mar 21, 2021, 12:58:44 PM

Dennis Lee Bieber <wlf...@ix.netcom.com> wrote:
> It appears that the R-Pi organization has dropped the NOOBS installer,
> and only provides downloads for each specific OS.

Finally!