disk problems with Dell PowerEdge r210 / SEAGATE ST3300656SS HS09


Matthias Apitz

Oct 27, 2025, 7:58:36 AM
to freebsd-...@freebsd.org
Hello,
Since 2017 I have owned the above server, which my company wanted to
decommission. Since then I have used it as my bakery for FreeBSD CURRENT and
ports.

The server has two SCSI hard drives: da0 is UFS for /root, /usr etc., and
da1 is ZFS, used for poudriere:

Oct 27 04:55:34 jet kernel: da1: <SEAGATE ST3300656SS HS09> Fixed Direct Access SPC-3 SCSI device
Oct 27 04:55:34 jet kernel: da1: Serial Number 3QP1NF96
Oct 27 04:55:34 jet kernel: da1: 300.000MB/s transfers
Oct 27 04:55:34 jet kernel: da1: Command Queueing enabled
Oct 27 04:55:34 jet kernel: da1: 286102MB (585937500 512 byte sectors)

For some time now this disk has been producing faults like the messages below,
and only a power-off reset helps. Here are the last two faults, from October 25
and 27.

What tests could I run, or how could I map away the bad disk blocks so they
will not be touched again?

Thanks

matthias

/var/log/messages

Oct 25 16:55:37 jet kernel: (da1:mps0:0:4:0): READ(10). CDB: 28 00 1a 99 75 39 00 00 02 00
Oct 25 16:55:37 jet kernel: (da1:mps0:0:4:0): CAM status: SCSI Status Error
Oct 25 16:55:37 jet kernel: (da1:mps0:0:4:0): SCSI status: Check Condition
Oct 25 16:55:37 jet kernel: (da1:mps0:0:4:0): SCSI sense: UNIT ATTENTION asc:29,cd (Vendor Specific ASCQ)
Oct 25 16:55:37 jet kernel: (da1:mps0:0:4:0): Info: 0x22c0f7
Oct 25 16:55:37 jet kernel: (da1:mps0:0:4:0): Field Replaceable Unit: 204
Oct 25 16:55:37 jet kernel: (da1:mps0:0:4:0): Retrying command (per sense data)
Oct 25 16:55:37 jet kernel: (da1:mps0:0:4:0): READ(10). CDB: 28 00 1a 99 75 39 00 00 02 00
Oct 25 16:55:37 jet kernel: (da1:mps0:0:4:0): CAM status: SCSI Status Error
Oct 25 16:55:37 jet kernel: (da1:mps0:0:4:0): SCSI status: Check Condition
Oct 25 16:55:37 jet kernel: (da1:mps0:0:4:0): SCSI sense: NOT READY asc:4,1 (Logical unit is in process of becoming ready)
Oct 25 16:55:37 jet kernel: (da1:mps0:0:4:0): Polling device for readiness
Oct 25 16:55:43 jet kernel: (da1:mps0:0:4:0): TEST UNIT READY. CDB: 00 00 00 00 00 00 length 0 SMID 105 Command timeout on target 4(0x0009) 5000 set, 5.4367562 elapsed
Oct 25 16:55:43 jet kernel: mps0: Sending abort to target 4 for SMID 105


Oct 27 02:32:19 jet kernel: (da1:mps0:0:4:0): READ(10). CDB: 28 00 0e 87 fe 7b 00 00 02 00 length 1024 SMID 1471 Command timeout on target 4(0x0009) 60000 set, 60.68865685 elapsed
Oct 27 02:32:19 jet kernel: mps0: Sending abort to target 4 for SMID 1471
Oct 27 02:32:19 jet kernel: (da1:mps0:0:4:0): READ(10). CDB: 28 00 0e 87 fe 7b 00 00 02 00 length 1024 SMID 1471 Aborting command 0xfffffe00c347b8a8
Oct 27 02:32:20 jet kernel: (da1:mps0:0:4:0): READ(10). CDB: 28 00 0e 87 ff c5 00 00 02 00 length 1024 SMID 1313 Command timeout on target 4(0x0009) 60000 set, 60.32376095 elapsed
Oct 27 02:32:20 jet kernel: (da1:mps0:0:4:0): READ(10). CDB: 28 00 0e b8 20 29 00 01 00 00 length 131072 SMID 1552 Command timeout on target 4(0x0009) 60000 set, 60.118961952 elapsed
Oct 27 02:32:20 jet kernel: (da1:mps0:0:4:0): READ(10). CDB: 28 00 0e b8 1f 29 00 01 00 00 length 131072 SMID 650 Command timeout on target 4(0x0009) 60000 set, 60.119515098 elapsed
Oct 27 02:32:20 jet kernel: (da1:mps0:0:4:0): WRITE(10). CDB: 2a 00 19 dd 6c 84 00 00 07 00 length 3584 SMID 299 Command timeout on target 4(0x0009) 60000 set, 60.75147185 elapsed
Oct 27 02:32:20 jet kernel: (da1:mps0:0:4:0): WRITE(10). CDB: 2a 00 19 dd 6c 7e 00 00 01 00 length 512 SMID 118 Command timeout on target 4(0x0009) 60000 set, 60.75441295 elapsed
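
For reference, the CDB bytes in the log lines above can be decoded to see which blocks the failing commands addressed. The following is only a minimal Python sketch, assuming the standard 10-byte READ(10)/WRITE(10) layout (opcode, flags, 4-byte big-endian LBA, group number, 2-byte big-endian transfer length, control) and the 512-byte sectors that da1 reports. The function name decode_rw10 is made up for illustration and is not part of any FreeBSD tool.

```python
SECTOR_SIZE = 512  # assumption: da1 reports 512-byte sectors in its probe message

OPCODES = {0x28: "READ(10)", 0x2A: "WRITE(10)"}


def decode_rw10(cdb_hex):
    """Decode a 10-byte READ(10)/WRITE(10) CDB given as space-separated hex.

    Returns (operation name, starting LBA, number of blocks).
    """
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] not in OPCODES:
        raise ValueError("not a READ(10)/WRITE(10) CDB: " + cdb_hex)
    lba = int.from_bytes(cdb[2:6], "big")     # bytes 2..5: logical block address
    blocks = int.from_bytes(cdb[7:9], "big")  # bytes 7..8: transfer length in blocks
    return OPCODES[cdb[0]], lba, blocks


if __name__ == "__main__":
    # CDBs copied from the /var/log/messages excerpts above
    samples = [
        "28 00 1a 99 75 39 00 00 02 00",
        "28 00 0e 87 fe 7b 00 00 02 00",
        "28 00 0e b8 20 29 00 01 00 00",
        "2a 00 19 dd 6c 84 00 00 07 00",
    ]
    for cdb_hex in samples:
        op, lba, blocks = decode_rw10(cdb_hex)
        print(op, "LBA", lba, "blocks", blocks, "bytes", blocks * SECTOR_SIZE)
```

The transfer lengths decoded this way (2, 256 and 7 blocks) correspond to the 1024, 131072 and 3584 byte lengths that mps prints for the same commands, which is a quick sanity check on the assumed layout.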


--
Matthias Apitz, ✉ gu...@unixarea.de, http://www.unixarea.de/ +49-176-38902045
Public GnuPG key: http://www.unixarea.de/key.pub

Alexander Burke

Oct 27, 2025, 9:47:50 AM
to ques...@freebsd.org
Hi Matthias,

In my experience, once a drive starts to fail it tends to continue getting worse.

You can get used SAS disks cheaply on eBay. If your two disks have similar power-on hours, I recommend replacing both of them.

Cheers,
Alex

Jack Raats

Oct 27, 2025, 10:30:51 AM
to ques...@freebsd.org
What does smartmontools say about this drive?

Gr.
Jack Raats

On 27-10-2025 at 12:58, Matthias Apitz wrote:

Frank Leonhardt

Oct 27, 2025, 11:16:31 AM
to ques...@freebsd.org

I still run R200s in production - very reliable! The early 210 had heat problems.

SCSI drives (and SATA) do their own bad block mapping, so it is no longer necessary (or possible) to have a bad block map on the system. It's time to get new drives. Drives do a very good job of hiding failures from the OS, so if you are seeing errors you are looking at the "tip of an iceberg". A SCSI drive will detect a bad block, recover the data and move it to a new area of the disk. This can take a long time, and that delay is a big clue that the drive is failing. However, the drive still returns "OK" if the timeout is long enough and it can get the data off.

Regards, Frank.

