Had a drive fail; it went completely unreadable. The first hint was a SMART failure report. Things went downhill from there.
It was part of a 3-drive array: /dev/md0 with 2TB drives.
# grep raid /etc/fstab
/dev/md0 /raid xfs defaults 0 0
# lsblk
NAME              MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINTS
sda                 8:0    0   1.8T  0 disk
sdb                 8:16   0 238.5G  0 disk
├─sdb1              8:17   0   600M  0 part  /boot/efi
├─sdb2              8:18   0     1G  0 part  /boot
└─sdb3              8:19   0 236.9G  0 part
  ├─fedora-root   253:0    0    70G  0 lvm   /
  ├─fedora-swap   253:1    0    24G  0 lvm   [SWAP]
  └─fedora-home   253:2    0 142.9G  0 lvm   /home
sdc                 8:32   0   1.8T  0 disk
└─md0               9:0    0   3.6T  0 raid5
sdd                 8:48   0   1.8T  0 disk
└─md0               9:0    0   3.6T  0 raid5
I have a working drive in there now as sda, but I don't seem able to add it to the array.
[root@simak ~]# mdadm /dev/md0 --add /dev/sda
mdadm: add new device failed for /dev/sda as 3: Invalid argument
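mdadm doesn't say much beyond "Invalid argument", but I assume the kernel log records the actual reason for the rejection. Something like this should show it (assuming journald is collecting kernel messages; plain dmesg works too):

```shell
# Show recent kernel messages mentioning the array or the new disk;
# md usually logs why it refused a device (too small, busy, etc.).
# Needs root.
journalctl -k --no-pager | grep -iE 'md0|sda' | tail -n 20
```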
I think I should have removed it from the array logically, with mdadm --remove, before powering down and pulling the physical drive, but like an idiot, I didn't. Now that the drive is out of the system and sitting on my desk cooling off, I can't seem to remove it after the fact:
[root@simak ~]# mdadm /dev/md0 --remove /dev/sda
mdadm: hot remove failed for /dev/sda: No such device or address
[root@simak ~]# mdadm /dev/md0 --remove /dev/sda1
mdadm: stat failed for /dev/sda1: No such file or directory
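While digging through the man page afterwards, I noticed --remove is documented to accept the keywords failed and detached in place of a device name, which sounds like it's meant for exactly this pulled-disk case (untested on my version):

```shell
# Mark any member whose device node has vanished as faulty, then
# remove it; also remove anything already marked failed. Needs root.
mdadm /dev/md0 --fail detached --remove detached
mdadm /dev/md0 --remove failed
```

Though judging from the -D output, the kernel seems to have dropped the slot on its own anyway.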
BTW, my distro doesn't seem to have an mdadm.conf file. Some docs say it's supposed to be in /etc/mdadm/mdadm.conf; my man pages say /etc/mdadm.conf, but I don't have one in either place. I tried locate and find, and both came up dry.
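For what it's worth, I gather mdadm can emit the config file contents itself, so the file could be created like this (path per my man page; the ARRAY line below is what I'd expect given the array's name and UUID, not verified output):

```shell
# Print the current array definitions in mdadm.conf format.
mdadm --detail --scan
# expected something like:
#   ARRAY /dev/md0 metadata=1.2 name=simak:0 UUID=6670fd7c:6cda3656:9b0930d0:cb21afb4
# To persist it (needs root):
#   mdadm --detail --scan >> /etc/mdadm.conf
```

Assembly from the on-disk superblocks has been working without it, so I haven't bothered.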
So here's what the array looks like now:
[root@simak ~]# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md0 : active raid5 sdd[1] sdc[4]
3906764800 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/2] [UU_]
bitmap: 2/15 pages [8KB], 65536KB chunk
unused devices: <none>
[root@simak ~]# mdadm -D /dev/md0
/dev/md0:
Version : 1.2
Creation Time : Fri Dec 13 21:41:56 2019
Raid Level : raid5
Array Size : 3906764800 (3.64 TiB 4.00 TB)
Used Dev Size : 1953382400 (1862.89 GiB 2000.26 GB)
Raid Devices : 3
Total Devices : 2
Persistence : Superblock is persistent
Intent Bitmap : Internal
Update Time : Mon Feb 3 10:38:29 2025
State : clean, degraded
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0
Layout : left-symmetric
Chunk Size : 512K
Consistency Policy : bitmap
Name : simak:0 (local to host simak)
UUID : 6670fd7c:6cda3656:9b0930d0:cb21afb4
Events : 9255
Number Major Minor RaidDevice State
4 8 32 0 active sync /dev/sdc
1 8 48 1 active sync /dev/sdd
- 0 0 2 removed
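One thing I want to rule out is the replacement being a hair smaller than the old members; "Invalid argument" on --add is apparently a classic symptom of that. Used Dev Size above is reported in 1 KiB blocks, so each member has to supply at least this many bytes (my arithmetic, not mdadm output):

```shell
# Used Dev Size from mdadm -D: 1953382400 KiB per member.
# Convert to bytes to compare against the raw size of the new disk.
needed=$((1953382400 * 1024))
echo "$needed"
# -> 2000263577600
```

blockdev --getsize64 /dev/sda (as root) gives the raw size to compare against; nominal 2 TB disks can differ by a few MB between models.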
The replacement drive could have been part of an md array before, so I did mdadm --zero-superblock /dev/sda
When I still couldn't add it back to the array, I used dd to wipe it as well:
dd if=/dev/zero of=/dev/sda
I let that run for a while and killed it after it had written 7 or 8 GB. Then I re-ran the commands I'd been trying, hoping to get the array rebuilding:
# mdadm --assemble /dev/md0 /dev/sdd /dev/sdc /dev/sda
mdadm: failed to add /dev/sda to /dev/md0: Invalid argument
mdadm: /dev/md0 has been started with 2 drives (out of 3).
# mdadm /dev/md0 --add /dev/sda
mdadm: add new device failed for /dev/sda as 3: Invalid argument
I still can't get it to use sda for anything. I've tried everything I can think of.
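The next thing I plan to try is wipefs. The dd run only zeroed the first few GB, and if the disk was ever GPT-partitioned there's a backup partition table at the very end of the device, which I've read can make the kernel reject it (just a guess at this point):

```shell
# List any remaining filesystem / RAID / partition-table signatures
# anywhere on the disk, including a backup GPT at the end.
wipefs /dev/sda
# Erase them all (destructive; needs root):
#   wipefs -a /dev/sda
```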
-T