HDD error: Current_Pending_Sector

Default User

Feb 20, 2024, 1:00:06 PM

Hi guys!

I am running Debian 12 Stable, up to date, on a low-spec Dell Inspiron
15 3000 Model 3511. Firmware is also up to date.

I have a 4 TB Western Digital external USB SATA HDD, Model WDC
WD40NDZW-11A8JS1. It has only one partition, formatted as ext4. The
filesystem is labeled MSD00012.

Every night, I use rsync to copy all contents of a (theoretically)
identical drive, which has filesystem label MSD00014, to the drive with
MSD00012.

Two nights ago, I could not do the copy correctly. Apparently, as a
safety measure, MSD00012 was automatically re-mounted as read only, due
to a filesystem error.

I used the gnome-disks utility to unmount and then remount it. It was
remounted as read-write.

Now it "works", BUT . . . I ran:

sudo smartctl --test=long /dev/sdb on it,

and it reports a Current_Pending_Sector error, at LBA 3259046960.

From sudo smartctl --all /dev/sdb > backup_drive_b_test.txt:

------------------------------------------------------------

smartctl 7.3 2022-02-28 r5338 [x86_64-linux-6.1.0-18-amd64] (local build)
Copyright (C) 2002-22, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family: Western Digital Elements / My Passport (USB, AF)
Device Model: WDC WD40NDZW-11A8JS1
Serial Number: WD-XXXXXXXXXXXX
LU WWN Device Id: 5 0014ee 269112168
Firmware Version: 01.01A01
User Capacity: 4,000,753,475,584 bytes [4.00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: 5400 rpm
Form Factor: 2.5 inches
TRIM Command: Available, deterministic
Device is: In smartctl database 7.3/5319
ATA Version is: ACS-3 T13/2161-D revision 5
SATA Version is: SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Tue Feb 20 11:32:04 2024 EST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      ( 121) The previous self-test completed having
                                        the read element of the test failed.
Total time to complete Offline
data collection:                (12240) seconds.
Offline data collection
capabilities:                    (0x1b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        No Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        (  24) minutes.
SCT capabilities:              (0x30b5) SCT Status supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   198   051    Pre-fail  Always       -       41
  3 Spin_Up_Time            0x0027   253   253   021    Pre-fail  Always       -       4741
  4 Start_Stop_Count        0x0032   099   099   000    Old_age   Always       -       1175
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       1311
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       693
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       23
193 Load_Cycle_Count        0x0032   199   199   000    Old_age   Always       -       3045
194 Temperature_Celsius     0x0022   114   102   000    Old_age   Always       -       38
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       1
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure       90%      1311         3259046960
# 2  Short offline       Completed without error       00%      1311         -
# 3  Short offline       Completed without error       00%      1311         -

Selective Self-tests/Logging not supported

--------------------------------------------------------------------

I ran "check" in gnome-disks. It showed a disk error, so I ran
"repair" in gnome-disks. Then I ran "check" again. It reported no
errors.

I also ran the partition check/repair utility in Gparted. It did not
report any errors.

Finally, I ran sudo smartctl --test=long /dev/sdb on it, as previously
mentioned.

According to the smartctl report above, there is one
Current_Pending_Sector error.

I have done some research online, which seems to say that the error
will remain until there is an attempt to write to the bad sector
(block). Then it will be "re-mapped", and presumably taken out of
service.

But since the sector already cannot be read, how can it be re-written
to a "good" sector?

If I knew which file (if any) is using the bad sector, I could try just
deleting that file from the "bad" drive, then copy the same file over
from the "Good" drive, at which time the bad sector "should" be
retired, and replaced by a good sector.

Or, as a more "brute force" solution, I could either simply format the
bad drive, or do:

sudo dd if=/dev/zero of=/dev/sdb bs=4M status=progress conv=fdatasync

and then format the drive, or even do:

sudo dd if=/dev/urandom of=/dev/sdb bs=4M status=progress conv=fdatasync

and then format the drive.

But since this is a 4 TB external USB drive, overwriting and formatting
the whole drive might take days! And maybe even work the very low-spec
computer until it cooks!
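
As a rough check on the "days" estimate, the arithmetic can be sketched in shell. The capacity is the User Capacity from the smartctl report; the 100 MB/s write rate is purely an assumption for illustration, and a 5400 rpm USB drive may well sustain much less.

```sh
#!/bin/sh
bytes=4000753475584   # User Capacity from the smartctl report
rate_mb=100           # ASSUMED sustained write rate in MB/s; measure your own drive

# Whole hours needed to write every byte once at the assumed rate.
wipe_hours() {
    echo $(( bytes / (rate_mb * 1000000) / 3600 ))
}

echo "about $(wipe_hours) hours at $rate_mb MB/s"
```

Halving the assumed rate doubles the time, so the real figure depends almost entirely on what the drive and the USB link can sustain.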

Any suggestions?

David Christensen

Feb 20, 2024, 1:10:06 PM

Perhaps use dd(1) to zero the one block(?)
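
A minimal sketch of that idea, using the numbers reported earlier in the thread (LBA 3259046960, 512-byte logical and 4096-byte physical sectors). The device name /dev/sdb and the choice to overwrite the whole 4096-byte physical sector are assumptions. The script only prints the dd command rather than running it, because a wrong device or offset destroys data, and the filesystem should be unmounted first.

```sh
#!/bin/sh
dev=/dev/sdb          # ASSUMED device name, as used in the original post
lba=3259046960        # LBA_of_first_error from the self-test log
logical=512           # logical sector size reported by smartctl
physical=4096         # physical sector size reported by smartctl

# Index of the 4096-byte physical sector that contains the failing LBA.
phys_index=$(( lba * logical / physical ))

# Print (do not run) a dd command that writes zeros to that one sector.
zero_cmd() {
    echo "dd if=/dev/zero of=$dev bs=$physical seek=$phys_index count=1 oflag=direct conv=fsync"
}

zero_cmd
```

Whatever file data lived in that sector is replaced with zeros, so the affected file would still need to be copied back from the good drive afterwards.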

David

Michael Kjörling

Feb 20, 2024, 4:40:07 PM

On 20 Feb 2024 12:51 -0500, from hungupo...@gmail.com (Default User):

> But since the sector already can not be read, How can it be re-written
> to a "good" sector?

Generally, it can't. It will be remapped if necessary when something
else is written to that sector.

> If I knew which file (if any) is using the bad sector, I could try just
> deleting that file from the "bad" drive, then copy the same file over
> from the "Good" drive, at which time the bad sector "should" be
> retired, and replaced by a good sector.

Assuming ext[234]fs, it looks like you can use tune2fs, udisks and
debugfs to determine the pathname to the file at a given LBA offset.
See http://www.randomnoun.com/wp/2013/09/12/determining-the-file-at-a-specific-vmdk-offset/
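
The lookup can be sketched roughly as follows. The partition start (2048) comes from the fdisk output posted later in this thread; the 4096-byte filesystem block size is an assumption (the ext4 default) and should be confirmed with tune2fs -l /dev/sdb1. The debugfs commands are printed, not run; icheck maps a block to an inode and ncheck maps an inode to a path.

```sh
#!/bin/sh
lba=3259046960       # failing LBA from the self-test log
part_start=2048      # start sector of /dev/sdb1, per fdisk -l
sector=512           # logical sector size in bytes
fs_block=4096        # ASSUMED ext4 block size; confirm with tune2fs -l /dev/sdb1

# Convert a whole-disk LBA to a block number inside the filesystem.
fs_block_number() {
    echo $(( (lba - part_start) * sector / fs_block ))
}

blk=$(fs_block_number)
echo "filesystem block: $blk"
echo "sudo debugfs -R 'icheck $blk' /dev/sdb1      # block -> inode"
echo "sudo debugfs -R 'ncheck <inode>' /dev/sdb1   # inode -> path"
```

If icheck reports that the block is not in use, no file is affected and overwriting the sector loses nothing.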

--
Michael Kjörling 🔗 https://michael.kjorling.se
“Remember when, on the Internet, nobody cared that you were a dog?”

David

Feb 20, 2024, 5:20:06 PM

On Tue, 20 Feb 2024 at 17:51, Default User <hungupo...@gmail.com> wrote:
>
> Hi guys!

Hi to all readers,

> Now it "works", BUT . . . I ran:
>
> sudo smartctl --test=long /dev/sdb on it,
>
> and it reports a Current_Pending_Sector error, at LBA 3259046960.

[...]

> According to sudo smartctl --test=long /dev/sdb,
> there is one Current_Pending_Sector error.
>
> I have done some research online, which seems to say that the error
> will remain until there is an attempt to write to the bad sector
> (block). Then it will be "re-mapped", and presumably taken out of
> service.
>
> But since the sector already can not be read, How can it be re-written
> to a "good" sector?
>
> If I knew which file (if any) is using the bad sector, I could try just
> deleting that file from the "bad" drive, then copy the same file over
> from the "Good" drive, at which time the bad sector "should" be
> retired, and replaced by a good sector.
>
> Or, as a more "brute force" solution, I could either simply format the
> bad drive, or do:

[...]

> Any suggestions?

Maybe this document will guide you to a preferred solution:
https://www.smartmontools.org/wiki/BadBlockHowto

Default User

Feb 20, 2024, 8:00:05 PM

On Tue, 2024-02-20 at 21:36 +0000, Michael Kjörling wrote:
> On 20 Feb 2024 12:51 -0500, from hungupo...@gmail.com (Default User):
> > But since the sector already can not be read, How can it be re-written
> > to a "good" sector?
>
> Generally, it can't. It will be remapped if necessary when something
> else is written to that sector.
>
> > If I knew which file (if any) is using the bad sector, I could try just
> > deleting that file from the "bad" drive, then copy the same file over
> > from the "Good" drive, at which time the bad sector "should" be
> > retired, and replaced by a good sector.
>
> Assuming ext[234]fs, it looks like you can use tune2fs, udisks and
> debugfs to determine the pathname to the file at a given LBA offset.
> See http://www.randomnoun.com/wp/2013/09/12/determining-the-file-at-a-specific-vmdk-offset/

Hi, Michael!

I forgot to say that the filesystem of the "bad" drive is a single
partition, formatted as ext4. And the sector and block sizes both seem
to be 512 bytes.

Per sudo fdisk -l /dev/sdb:

----------------------------------------------------------------
Disk /dev/sdb: 3.64 TiB, 4000752599040 bytes, 7813969920 sectors
Disk model: My Passport 2627
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX

Device Start End Sectors Size Type
/dev/sdb1 2048 7813967871 7813965824 3.6T Linux filesystem
-------------------------------------------------------------

Regarding:
> . . . it looks like you can use tune2fs, udisks and
> debugfs to determine the pathname to the file at a given LBA offset.
> See
> http://www.randomnoun.com/wp/2013/09/12/determining-the-file-at-a-specific-vmdk-offset/

That could be a little above my current level of competence. But I
will try to read up on it.

Note: It occurs to me that another idea would be to simply delete all
files from the "bad" drive, then rsync everything fresh from the "good"
drive back onto the "bad" drive.

IIUC, that would then cause the "bad" sector to be retired, and replaced
by a "good" sector.

That might be easier than everything else mentioned so far.

Andy Smith

Feb 21, 2024, 3:20:06 PM

Hi,

On Tue, Feb 20, 2024 at 07:53:38PM -0500, Default User wrote:
> Note: It occurs to me that another idea would be to simply delete all
> files from the "bad" drive, then rsync everything fresh from the "good"
> drive back onto the "bad" drive.

You can do it in one step with rsync --delete … which will delete
anything that doesn't exist on the source.
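
A sketch of what such a command might look like. The mount points below are hypothetical, and the script only prints the commands so that the --dry-run form can be inspected first. Note that by default rsync skips files whose size and modification time already match, so a file that happens to sit on the bad sector would only be rewritten if it is deleted from the destination first.

```sh
#!/bin/sh
src=/media/user/MSD00014/    # HYPOTHETICAL mount point of the good drive
dst=/media/user/MSD00012/    # HYPOTHETICAL mount point of the drive with the pending sector

# Print an rsync command; $1 is an optional extra flag such as --dry-run.
sync_cmd() {
    echo "rsync -a --delete${1:+ $1} $src $dst"
}

sync_cmd --dry-run   # preview what would be copied and deleted
sync_cmd             # the real run
```

The trailing slash on the source matters: it tells rsync to copy the contents of the directory rather than the directory itself.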

> IIUC, that would then cause the "bad" sector to be retired, and replaced
> by a "good" sector.

Yes, a lot of the time a new write is successful, and when it's not
the sector will be remapped. As long as the remapped sector count doesn't
keep going up I'd be fairly comfortable in continuing to use the
drive (assuming backups exist) a while longer.
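
One way to keep an eye on those counts is to pull the relevant rows out of the attribute table. This is only a sketch: it assumes the table layout shown in the smartctl report earlier in this thread, and the helper name sector_counts is made up for illustration.

```sh
#!/bin/sh
# Print NAME=RAW_VALUE for the attributes that track bad sectors.
# Reads smartctl attribute-table text on stdin.
sector_counts() {
    awk '$1 == 5 || $1 == 196 || $1 == 197 || $1 == 198 { print $2 "=" $NF }'
}

# Sample rows copied from the report earlier in this thread.
sector_counts <<'EOF'
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       1
EOF
```

On a live system the same function could be fed from sudo smartctl -A /dev/sdb, and the output compared from one day to the next.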

Thanks,
Andy

--
https://bitfolk.com/ -- No-nonsense VPS hosting
