On 8/21/25 21:02, Robert wrote:
> On 8/21/2025 3:03 PM, Dag-Erling Smørgrav wrote:
>> You should take a look in /var/backups, you may find a backup of
>> the partition table from the failed drive. Assuming you remove the
>> failed drive first, you can safely `gpart restore -l` this backup onto
>> the replacement drive, which will recreate the labels (but not UUIDs).
>
> Great, had no idea, yes, I see gpart.ada0.bak in /var/backups...
>
> root@db1:~ # cat /var/backups/gpart.ada0.bak <<-- REMOVED disk
> GPT 128
> 1 freebsd-boot 40 1024 gptboot0
> 2 freebsd-swap 2048 16777216 swap0
> 3 freebsd-zfs 16779264 276267008 zfs0
> root@db1:~ # cat /var/backups/gpart.ada1.bak
> GPT 128
> 1 freebsd-boot 40 1024 gptboot1
> 2 freebsd-swap 2048 16777216 swap1
> 3 freebsd-zfs 16779264 276267008 zfs1
> root@db1:~ # cat /var/backups/gpart.ada2.bak
> GPT 128
> 1 freebsd-boot 40 1024 gptboot2
> 2 freebsd-swap 2048 16777216 swap2
> 3 freebsd-zfs 16779264 276267008 zfs2
> root@db1:~ # cat /var/backups/gpart.ada3.bak
> GPT 128
> 1 freebsd-boot 40 1024 gptboot3
> 2 freebsd-swap 2048 16777216 swap3
> 3 freebsd-zfs 16779264 276267008 zfs3
> root@db1:~ # cat /var/backups/gpart.ada4.bak
>
Good. So long as nothing uses GUID/UUID, gpart(8) restore with labels
should work.
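For the record, the whole restore might look something like the sketch below. The device name ada0, the mirror partner ada1p3, and the pool name "mypool" are assumptions; the commands are printed here rather than executed, since they rewrite the replacement disk:

```shell
# Sketch only: build the command list and print it instead of running it,
# because these commands destroy whatever is on the target disk.
# Device ada0, partner ada1p3, and pool "mypool" are placeholders.
disk=ada0
backup=/var/backups/gpart.${disk}.bak
cmds="gpart restore -lF ${disk} < ${backup}
gpart bootcode -b /boot/pmbr -p /boot/gptboot -i 1 ${disk}
zpool attach mypool ada1p3 ${disk}p3"
printf '%s\n' "$cmds"
```

The restore recreates the table and labels from the backup, bootcode rewrites the protective MBR and the freebsd-boot partition, and attach starts the resilver.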
This is my server system disk (BIOS boot; GPT with a protective MBR):
2025-08-21 21:13:19 toor@f5 ~
# gpart show ada0
=> 40 117231328 ada0 GPT (56G)
40 1024 1 freebsd-boot (512K)
1064 29359104 2 freebsd-ufs (14G)
29360168 1564672 3 freebsd-swap (764M)
30924840 86306528 - free - (41G)
I have a backup of the freebsd boot partition:
2025-08-21 21:55:05 toor@f5 ~
# ll /var/backups/boot.ada0p1.bak
-rw-r--r-- 1 root wheel 524288 2024/03/04 03:01:00
/var/backups/boot.ada0p1.bak
And the backup still matches ada0p1:
2025-08-21 21:13:44 toor@f5 ~
# cmp /dev/ada0p1 /var/backups/boot.ada0p1.bak
2025-08-21 21:14:00 toor@f5 ~
# echo $?
0
The last piece of the puzzle is the MBR. I see some possibilities in /boot:
2025-08-21 21:20:36 toor@f5 ~
# ll -S /boot | grep ' 512 ' | grep -v drwx
-r--r--r-- 1 root wheel 512 2025/05/24 14:51:34 boot0
-r--r--r-- 1 root wheel 512 2025/05/24 14:51:34 boot0sio
-r--r--r-- 1 root wheel 512 2023/04/06 21:24:38 boot1
-r--r--r-- 1 root wheel 512 2023/04/06 21:24:38 mbr
-r--r--r-- 1 root wheel 512 2023/04/06 21:24:38 pmbr
Referring to the Wikipedia article "Master boot record", table "Structure
of a classical generic MBR":
https://en.wikipedia.org/wiki/Master_boot_record
The bootstrap code area is the first 446 bytes. Look for a match:
2025-08-21 21:24:08 toor@f5 ~
# cmp -n 446 /dev/ada0 /boot/boot0
/dev/ada0 /boot/boot0 differ: char 12, line 1
2025-08-21 21:25:00 toor@f5 ~
# cmp -n 446 /dev/ada0 /boot/boot0sio
/dev/ada0 /boot/boot0sio differ: char 12, line 1
2025-08-21 21:25:05 toor@f5 ~
# cmp -n 446 /dev/ada0 /boot/boot1
/dev/ada0 /boot/boot1 differ: char 1, line 1
2025-08-21 21:25:08 toor@f5 ~
# cmp -n 446 /dev/ada0 /boot/mbr
/dev/ada0 /boot/mbr differ: char 12, line 1
2025-08-21 21:25:12 toor@f5 ~
# cmp -n 446 /dev/ada0 /boot/pmbr
So, the FreeBSD installer put /boot/pmbr into the MBR of my system disk.
Checking the partition table entries and boot signature:
2025-08-21 21:28:19 toor@f5 ~
# cmp -i 446 -n 16 /dev/ada0 /boot/pmbr
/dev/ada0 /boot/pmbr differ: char 3, line 1
2025-08-21 21:28:50 toor@f5 ~
# cmp -i 462 -n 16 /dev/ada0 /boot/pmbr
2025-08-21 21:28:58 toor@f5 ~
# cmp -i 478 -n 16 /dev/ada0 /boot/pmbr
2025-08-21 21:29:09 toor@f5 ~
# cmp -i 494 -n 16 /dev/ada0 /boot/pmbr
2025-08-21 21:29:17 toor@f5 ~
# cmp -i 510 -n 2 /dev/ada0 /boot/pmbr
So, everything matches except partition entry number 1:
2025-08-21 21:31:33 toor@f5 ~
# dd if=/dev/ada0 count=1 status=none | hexdump -s 446 -n 16
000001be 00 00 02 00 ee ff ff ff 01 00 00 00 2f cf fc 06
|............/...|
000001ce
2025-08-21 21:32:27 toor@f5 ~
# dd if=/boot/pmbr count=1 status=none | hexdump -s 446 -n 16
000001be 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
|................|
000001ce
So, the installer must have populated the first partition entry.
Referring to the Wikipedia page's table "Layout of one 16-byte partition
entry", decoding my MBR's first partition entry:
Status or physical drive
inactive
CHS address of first absolute sector in partition
cylinder = 0
head = 0
sector = 2
Partition type
ee = GPT protective MBR
CHS address of last absolute sector in partition
cylinder = 1023
head = 255
sector = 63
LBA of first absolute sector in the partition
0x00000001 = sector 1
Number of sectors in partition
0x06fccf2f = 117231407 sectors
Convert the number of sectors in partition field value to decimal:
2025-08-21 21:32:37 toor@f5 ~
# perl -e 'printf "%i\n", 0x06fccf2f'
117231407
This matches the disk size minus one sector (LBA 0 holds the MBR itself):
2025-08-21 21:54:57 toor@f5 ~
# diskinfo -v ada0 | grep 'mediasize in sectors'
117231408 # mediasize in sectors
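The decoding above can be checked mechanically. A small sketch that parses the 16 entry bytes from the hexdump, following the Wikipedia layout table (each 3-byte CHS field packs the sector into the low 6 bits of its middle byte and the cylinder's top 2 bits into the high bits):

```python
# The 16 bytes of partition entry 1, as shown by hexdump above.
entry = bytes.fromhex("00000200eeffffff010000002fcffc06")

status = entry[0]   # 0x00 = inactive
type_ = entry[4]    # 0xEE = GPT protective MBR

def chs(b):
    # 3-byte CHS field: head, then sector (low 6 bits) with cylinder
    # bits 8-9 (high 2 bits), then cylinder bits 0-7.
    head, sectcyl, cyl_lo = b
    sector = sectcyl & 0x3F
    cylinder = ((sectcyl & 0xC0) << 2) | cyl_lo
    return cylinder, head, sector

first_chs = chs(entry[1:4])                        # (0, 0, 2)
last_chs = chs(entry[5:8])                         # (1023, 255, 63)
lba_start = int.from_bytes(entry[8:12], "little")  # 1
nsectors = int.from_bytes(entry[12:16], "little")  # 117231407
```

So the protective entry starts at LBA 1 and spans 117231407 sectors, i.e. everything after the MBR; the CHS end 0xFFFFFF is the conventional maxed-out tuple (1023, 255, 63).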
Again, I would check whether the failed disk and the other disks all have
the same MBR. If so, you could clone one of them onto the MBR of the
replacement disk.
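If gpart restore has already written the partition table, cloning only the 446-byte bootstrap code area is enough; dd with conv=notrunc does that without touching bytes 446-511. Demonstrated here on scratch files rather than real disks:

```shell
# Demonstration on scratch files (not real disks): copy only the 446-byte
# bootstrap area; conv=notrunc leaves the target's bytes 446-511
# (partition table and boot signature) untouched.
src=$(mktemp)
dst=$(mktemp)
dd if=/dev/urandom of="$src" bs=512 count=1 status=none
dd if=/dev/urandom of="$dst" bs=512 count=1 status=none
dd if="$src" of="$dst" bs=446 count=1 conv=notrunc status=none
cmp -s -n 446 "$src" "$dst" && bootstrap_match=yes
cmp -s "$src" "$dst" || tail_differs=yes
rm -f "$src" "$dst"
echo "bootstrap_match=$bootstrap_match tail_differs=$tail_differs"
```

On real disks that would be something like `dd if=/dev/ada1 of=/dev/ada0 bs=446 count=1 conv=notrunc` (device names assumed), verified afterwards with `cmp -n 446` as above.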
>>> Would recovering the disk be beneficial versus replace? As far as
>>> faster recovery, not needing to resilver or as much. These are not big
>>> drives as you can see and RAID10 zpool.
>> You can try to use recoverdisk to copy undamaged portions of the failed
>> drive onto the replacement, but it's likely to take longer than
>> resilvering.
> Then I'll stick to the original plan but with attach instead of replace
> using `zpool attach ada0p3 ada0p3`.
>
I think you have a typo -- the replacement ada0p3 should attach to ada1p3.
David