Mac Fsck


Endike Baur

Aug 5, 2024, 4:02:16 AM

to plejinunec

Clarification: I run fsck.ext4 -n because it's a mounted filesystem, to check whether there are errors. This tells me that there are. I thought the automatic fsck every 30 mounts during boot-up was there precisely to take care of errors in the root filesystem, but it doesn't do so in my case. I could reboot with a LiveCD, fix the errors, and then reboot again, but that's some serious downtime for a live server. A reboot, auto fsck, then continue booting is much more sustainable on a live server, and I believe it should be the right behaviour.
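One possible explanation worth checking (an assumption, since the post doesn't show the superblock settings): on ext2/3/4, the mount-count-based check only runs when "Maximum mount count" is a positive number, and a value of -1 or 0 disables it. A minimal sketch that reads those two fields from `tune2fs -l` style output; the sample text and the function name are made up for illustration:

```python
def periodic_check_status(tune2fs_output):
    """Extract the mount-count fields from `tune2fs -l` style output."""
    fields = {}
    for line in tune2fs_output.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    mount_count = int(fields["Mount count"])
    max_count = int(fields["Maximum mount count"])
    # A maximum of -1 or 0 means the mount-count-based check is disabled.
    enabled = max_count > 0
    return {
        "mount_count": mount_count,
        "max_mount_count": max_count,
        "enabled": enabled,
        "due": enabled and mount_count >= max_count,
    }


# Hypothetical sample, shaped like part of `tune2fs -l /dev/sda1` output.
SAMPLE = """\
Filesystem volume name:   <none>
Mount count:              41
Maximum mount count:      -1
"""

print(periodic_check_status(SAMPLE))
```

If "enabled" comes back false, the "every 30 mounts" check never triggers, no matter how many times the machine reboots.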

"Note that in general it is not safe to run e2fsck on mounted filesystems. The only exception is if the -n option is specified, and -c, -l, or -L options are not specified. However, even if it is safe to do so, the results printed by e2fsck are not valid if the filesystem is mounted. If e2fsck asks whether or not you should check a filesystem which is mounted, the only correct answer is ''no''. Only experts who really know what they are doing should consider answering this question in any other way."

If you don't check the filesystem while it is mounted, I don't understand why you need to use touch /forcefsck; you can just unmount it and fix it. But if that is the case, and after a fix your FS still has errors, then you can consider using:

The system utility fsck (file system consistency check) is a tool for checking the consistency of a file system in Unix and Unix-like operating systems, such as Linux, macOS, and FreeBSD.[1] The equivalent programs on MS-DOS and Microsoft Windows are CHKDSK, SFC, and SCANDISK.

Generally, fsck is run either automatically at boot time, or manually by the system administrator. The command works directly on data structures stored on disk, which are internal and specific to the particular file system in use - so an fsck command tailored to the file system is generally required. The exact behaviors of various fsck implementations vary, but they typically follow a common order of internal operations and provide a common command-line interface to the user. On modern systems, fsck simply detects the type of filesystem and calls the specialized fsck.type (Linux) or fsck_type (BSD, macOS) program for each type.[1][2]
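The naming convention described above can be sketched in a few lines. This only illustrates how the helper name is formed; the function name and the platform labels are hypothetical, and a real wrapper also has to detect the filesystem type and locate the helper on disk:

```python
def fsck_helper_name(fs_type, platform):
    """Build the per-filesystem helper name the fsck wrapper would call."""
    if platform == "linux":
        separator = "."   # e.g. fsck.ext4
    elif platform in ("bsd", "macos"):
        separator = "_"   # e.g. fsck_hfs
    else:
        raise ValueError("unknown platform: %r" % platform)
    return "fsck" + separator + fs_type


print(fsck_helper_name("ext4", "linux"))
print(fsck_helper_name("hfs", "macos"))
```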

Most fsck utilities provide options for either interactively repairing damaged file systems (the user must decide how to fix specific problems), automatically deciding how to fix specific problems (so the user does not have to answer any questions), or reviewing the problems that need to be resolved on a file system without actually fixing them. Partially recovered files where the original file name cannot be reconstructed are typically recovered to a "lost+found" directory that is stored at the root of the file system.

A system administrator can also run fsck manually if they believe there is a problem with the file system. The file system is normally checked while unmounted, mounted read-only, or with the system in a special maintenance mode.

As boot-time fsck is expected to run without user intervention, it generally defaults to not performing any destructive operations. This may be in the form of a read-only check (failing whenever issues are found), or more commonly, a "preen" -p mode that only fixes innocuous issues commonly found after an unclean shutdown (e.g. a crash or power failure).[2]
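On Linux, whatever runs fsck at boot has to decide what to do next from its exit status, which the fsck(8) man page documents as a sum of condition flags. A minimal decoder for that bitmask is sketched below; the flag values follow the Linux man page, the wording of the messages is paraphrased, and other platforms may use different codes:

```python
from typing import List

# Exit status flags as documented in the Linux fsck(8) man page.
FSCK_EXIT_FLAGS = {
    1: "filesystem errors corrected",
    2: "system should be rebooted",
    4: "filesystem errors left uncorrected",
    8: "operational error",
    16: "usage or syntax error",
    32: "checking canceled by user request",
    128: "shared-library error",
}


def decode_fsck_status(status: int) -> List[str]:
    """Translate an fsck exit status into human-readable conditions."""
    if status == 0:
        return ["no errors"]
    return [text for bit, text in FSCK_EXIT_FLAGS.items() if status & bit]


print(decode_fsck_status(4))
```

A status with the 4 bit set is the case that matters for the boot process: errors remain, so a non-interactive run could not or would not fix them.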

Independent of checking the file system structure, modern file systems may offer a data scrubbing tool to check for silent corruption in stored data against a mirror or a checksum. Scrubs tend to be slow as they cover all data on a disk, but periodic runs can defend against data rot and help identify failing drives.[7]
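The idea of a checksum-based scrub can be illustrated with a toy model. This is only a sketch under loud assumptions: blocks live in a Python dictionary and CRC32 stands in for the checksum, whereas real file systems use their own on-disk layouts and checksum algorithms:

```python
import zlib
from typing import Dict, List


def scrub(blocks: Dict[int, bytes], checksums: Dict[int, int]) -> List[int]:
    """Return the block numbers whose data no longer matches the stored checksum."""
    return sorted(
        number
        for number, data in blocks.items()
        if zlib.crc32(data) != checksums[number]
    )


# Record checksums at write time, then simulate silent corruption of block 1.
blocks = {0: b"hello", 1: b"world", 2: b"fsck"}
checksums = {number: zlib.crc32(data) for number, data in blocks.items()}
blocks[1] = b"worle"

print(scrub(blocks, checksums))
```

Every block has to be read and hashed, which is why a scrub takes time proportional to the amount of stored data rather than to the size of the metadata.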

fsck first appeared in 4.0BSD of 1980. It turned into its modern wrapper form in NetBSD 1.3 (1998). fsck is not defined by any extant standard,[2] but the primitive non-wrapper form is present in the 1995 draft Systems Management: File System and Scheduling Utilities (FSSU) from X/Open.[8]

The severity of file system corruption led to the terms "fsck" and "fscked" becoming used among Unix system administrators as a minced oath for "fuck" and "fucked".[9] It is unclear whether this usage was cause or effect, as a report from a question and answer session at USENIX 1998 claims that "fsck" originally had a different name:

Ted Kowalski, username frodo, may he rest in peace, was the original author, just down the hall from my office in Murray Hill, and his name for the program had a 'u' where there is now an 's'. Management made him change it for distribution, but they couldn't make him change his pronunciation.

"Go fsck yourself", is occasionally used online as an injunction to a person to go and correct their issue (attitude, ignorance of the subject matter, etc.) - in the same way that running fsck involves fixing fundamental errors.

I think more people than just me have found that repairing corrupted indexes takes a long time. At least in my environment, with a single indexer server, I could see that 1 of my many CPUs was 100% utilized while the others were idle during the process. Disk utilization did not seem to be an issue at all...

If I'm right, the fsck command for fixing the index files works per bucket, so my question is:

Why doesn't the feature start several threads handling more than one bucket at a time, as an option, to utilize the available CPU resources?

Maybe including the possibility to decide yourself how many concurrent cores you want to use?

This would have saved us a lot of hours with unavailable Splunk service!!!!!

Since this isn't available, I wrote a quick bash script that generates a list of buckets to fsck and uses the parallel command to kick off many bucket fscks at a time. I then distributed this to a hundred-plus indexers and executed it via pssh.

I would love to see Splunk add this to the startup so that multiple cores can be involved, checking a bucket per index. Right now it seems to parse each index and fsck a single bucket in a single index at a time.
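The original bash script isn't shown in the thread, so the following is not that script, just a minimal Python sketch of the same idea: run one repair command per bucket, with a cap on how many run at once, and wait for all of them to finish. The function and parameter names are hypothetical, and the command that `build_command` returns is left to the caller:

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Sequence


def repair_buckets(
    bucket_paths: Sequence[str],
    build_command: Callable[[str], List[str]],
    max_workers: int = 4,
    run=subprocess.run,
) -> Dict[str, int]:
    """Run one repair command per bucket, at most max_workers at a time."""

    def repair_one(path: str) -> int:
        # Each call blocks one worker thread until its child process exits.
        return run(build_command(path)).returncode

    # Leaving the "with" block waits for every submitted job to complete.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        exit_codes = list(pool.map(repair_one, bucket_paths))
    return dict(zip(bucket_paths, exit_codes))


# Usage sketch. The per-bucket flags below are an assumption; check them
# against `splunk fsck --help` on your version before relying on them.
#
# results = repair_buckets(
#     bucket_list,
#     lambda path: ["splunk", "fsck", "repair", "--one-bucket",
#                   "--bucket-path=" + path],
#     max_workers=8,
# )
```

Threads are enough here because the CPU-heavy work happens in the child processes; `max_workers` is the knob for how many cores get used.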

I am unfamiliar with parallel, but it seems that it is aware of the threads that it makes and will persist its command state until all of the threads are complete. Is this so? The reason I ask is that it seems like it might work to add this to the bottom of your script, @kbecker:

I misspoke earlier when I thought I could run multiple executions of 'splunk fsck repair --all-buckets-one-index' in parallel. I looked closer and saw that it would repeat processing the buckets in the same order. I tried it the way you posted above and that worked great.

I think you're saying you want index repair to go faster, and that you believe that parallelizing it will make it go faster.

We certainly can parallelize repair by repairing buckets in parallel, but a single bucket would be very difficult to parallelize.

Using the GNU Parallel option mentioned above by kbecker did the trick for me... I modified the process a bit and added a loop outside it to check every bucket instead of just the single bucket that their example shows. I don't have the commands handy right now, though.

As support also indicated, this took around 30 min per 10 GB of data on average...

Since only one CPU was active, at 100% the whole time, I assumed there was no multithreading... Might splitting the load onto several cores have helped speed up the processing?
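As a back-of-the-envelope check on that question, here is a sketch of the arithmetic, using the 30 min per 10 GB figure quoted above. It assumes ideal scaling: the buckets split evenly across workers and nothing else (disk, memory) becomes the bottleneck, so treat the result as a best case rather than a prediction. The function name and the 200 GB example are made up:

```python
def estimated_repair_minutes(data_gb, workers=1, minutes_per_10gb=30.0):
    """Best-case repair time, assuming perfect scaling across workers."""
    if workers < 1:
        raise ValueError("workers must be at least 1")
    return data_gb / 10.0 * minutes_per_10gb / workers


# Hypothetical 200 GB of index data: one core vs. four cores.
print(estimated_repair_minutes(200))
print(estimated_repair_minutes(200, workers=4))
```

Under those assumptions, 200 GB drops from 600 minutes on one core to 150 minutes on four.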

I am trying to run e2fsck on my ext4 USB drive, which has errors, but the shared USB drive won't unmount. Per the web GUI, the drive unmounted OK in the OMV "File System" menu, but something is still attached to it. I've also shut down transmission, but I can't figure out how to unmount the drive. I've shut down services including watchdog, transmission-daemon, samba-ad-dc, and smbd, and I'm not sure what else to try at this point ...

Other options would be to reboot your server from a USB stick containing a Linux live/rescue environment, or connect the drive to your main PC. If you're running Windows, use a Linux VM to pass the drive through to. (A bit of overkill, but it's an option.)

I have a small Raspberry Pi in a closet in my open road camper, with OMV running on the Pi attached to a USB drive with media stored on it. The camper TV has Kodi linked to SMB shares and the radio is linked to MP3s on the device. As I hit bumps and various other things, it causes issues with the drive, or the electronics go wacky from short losses of power. So I need to clean things up from time to time after any of the near-daily small catastrophes. I know all the reasons that one wants to keep such a setup in a simulated "server farm" environment, but out on the road, it just doesn't always work out that way. I keep a backup of the USB drive and the Pi image to reload to the SD card, but I'm mostly looking for a few commands to fix things on the fly without spending all day on the project. I've been pulling the drive out of the closet, attaching it to my laptop, booting Ubuntu, and running e2fsck on the drive, but I'd like to somehow just run the routine from the Pi via the console, which would save me having to shut the Pi down, remove the drive, set up the laptop for the check disk, and then mount the drive back in the closet, etc. (P.S. I have none of those other services running on my Pi other than transmission-daemon, which shares access to the USB drive, but I shut it down. Note I also used to run this setup off my router using OpenWrt and a Samba share, which allowed shutting down the respective service, unmounting the drive, and running e2fsck on the drive to fix various issues from undesired power loss.)

The power is non-stop (high-current battery to a 5 V buck for the Pi and a 12 V buck for the USB drive), but when I hit a road hazard the power sometimes comes unplugged, plus for some reason it stops working and a check disk just fixes everything. I'd like to run check disk on the drive without moving it to a laptop like I do now, if you know of any way. Like I say, this was no issue on the OpenWrt platform, but OMV seems to take over the entire machine. If I could simply kill enough processes that the drive would unmount, I'd be satisfied. It's not production work, just media viewing, but what I'm trying to avoid is removing the drive; I'd rather just shut down services, unmount the drive, run e2fsck, then reboot the Pi. Thanks.
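To find out which processes to stop, the usual tools are fuser and lsof. The same idea can be sketched in Python by walking /proc on Linux and looking for processes whose working directory or open files sit under the mount point. The function name and the /mnt/usb path are hypothetical, it needs root to see other users' processes, and it will not spot kernel-level holders or memory-mapped files:

```python
import os
from typing import Dict, List


def find_holders(mount_point: str, proc_root: str = "/proc") -> Dict[int, List[str]]:
    """Map PID -> paths under mount_point that the process has open or as cwd."""
    mount_point = mount_point.rstrip("/") or "/"
    prefix = mount_point if mount_point == "/" else mount_point + "/"
    holders: Dict[int, List[str]] = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        pid_dir = os.path.join(proc_root, entry)
        links = [os.path.join(pid_dir, "cwd")]
        fd_dir = os.path.join(pid_dir, "fd")
        try:
            links += [os.path.join(fd_dir, fd) for fd in os.listdir(fd_dir)]
        except OSError:
            pass  # process exited, or we lack permission to list its fds
        for link in links:
            try:
                target = os.readlink(link)
            except OSError:
                continue
            if target == mount_point or target.startswith(prefix):
                holders.setdefault(int(entry), []).append(target)
    return holders


# Usage sketch (hypothetical mount point):
# for pid, paths in find_holders("/mnt/usb").items():
#     print(pid, paths)
```

Once the listed processes are stopped, the drive should unmount cleanly and e2fsck can be run on the unmounted device.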