Time to smartfail a disk


scott

Jul 24, 2013, 2:51:19 PM
to isilon-u...@googlegroups.com
I completed an upgrade from 6.5.5.11 to 7.0.2.1 last Thursday - things seemed to go well. 
Over the weekend OneFS decided a drive was unreliable and it was placed into 'Smartfail' status.
 
FlexProtect has been working to cleanly remove this drive for almost 4 days. 
 
Job                        Impact Pri Policy     Phase Run Time
-------------------------- ------ --- ---------- ----- ----------
FlexProtect[251]           Medium 1   MEDIUM     2/6   3d 17:35
        (Working on nodes: None and drives: 1:bay30)
        Progress: Processed 9602495 lins; 0 zombies and 0 errors

Progress seems to have hung around the 9 million LIN mark, and I have around 200 million LINs on this cluster. 
 
I haven't had a lot of disks fail. How long does it normally take for the smartfail of a disk (FlexProtect) to complete?
 
Thanks
 
Scott

Matt Dey

Jul 24, 2013, 3:48:12 PM
to isilon-u...@googlegroups.com
How big is the disk and what kind of node are you using?

Steven Kreuzer

Jul 24, 2013, 4:02:28 PM
to isilon-u...@googlegroups.com
On Wed, Jul 24, 2013 at 3:48 PM, Matt Dey <matt...@gmail.com> wrote:
> How big is the disk and what kind of node are you using?

Also, how much free space do you have in the smartpool that drive is in? The less free space you have, the longer it is going to take.

You can also bump the policy the FlexProtect job is running at to high (isi job modify -j 251 --policy high), but I have heard conflicting information on whether this actually speeds up the rebuild. Running at high can lead to spindle contention, and if that happens, you won't go any faster than at medium, but you do increase your chances of throwing ECCs and hitting drive stalls.
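
If you do go down that road, here is a minimal sketch (this assumes the OneFS 7.x syntax above; verify the exact subcommands on your release, and keep the spindle-contention caveat in mind):

    # see what the job engine is running and at what impact level
    isi job status
    # raise FlexProtect job 251 to the high impact policy
    isi job modify -j 251 --policy high
    # drop it back down if you start seeing drive stalls or ECC errors
    isi job modify -j 251 --policy medium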

Cory Snavely

Jul 24, 2013, 4:30:17 PM
to isilon-u...@googlegroups.com
The questions all make sense as far as diagnosis, but just to be clear,
no matter what, that's way longer than what you should be seeing. I'd
open an SR on that right away.

Luc Simard

Jul 24, 2013, 5:51:16 PM
to isilon-u...@googlegroups.com, isilon-u...@googlegroups.com
Never raise a smartfail to high. That's a very bad idea.

In that case you are telling the cluster to commit everything to this job. It puts extra pressure on the drives and could cause additional drives to fail. No point in rushing this.

Time to finish varies; OneFS will balance client churn and disk I/O, and it also depends on whether you have SSDs with GNA in the pool where the failure occurred.

If your workflow is based on a large number of small files, on non-GNA nodes, with shallow directories holding a large number of files per directory, that will impact time to completion.

Luc Simard - 415-793-0989
Messages may contain confidential information.
Sent from my iPhone

Jason Davis

Jul 24, 2013, 7:45:56 PM
to isilon-u...@googlegroups.com

Yes, this is pretty much what I see; our clusters have lots and lots of small files. A 4-day SmartFail is not all that improbable on a heavily used pool.

So the short explanation is: if you have tons of LINs, adjust your drive SmartFail expectations accordingly.

I'm not certain whether SmartFail performance is any different from OneFS 6.5.5 to 7.x.

Peter Serocka

Jul 25, 2013, 3:05:24 AM
to Jason Davis, isilon-u...@googlegroups.com
isi statistics drive --long 

lists, among many parameters, the Inode usage per disk.
Can you see the number of Inodes diminishing 
on the smartfailing drive?
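
For example, a rough way to watch just that bay over time (the grep pattern and bay number are illustrative; adjust them to match your node and your version's column layout):

    # sample the per-drive stats once a minute, keeping only node 1, bay 30
    while true; do
        isi statistics drive --long | grep ' 1:30 '
        sleep 60
    done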

-- Peter


Peter Serocka
CAS-MPG Partner Institute for Computational Biology (PICB)
Shanghai Institutes for Biological Sciences (SIBS)
Chinese Academy of Sciences (CAS)
320 Yue Yang Rd, Shanghai 200031, China





Luc Simard

Jul 25, 2013, 9:55:16 AM
to isilon-u...@googlegroups.com, isilon-u...@googlegroups.com
There is a small difference, but the big improvements will be in 7.1 later this year.

So I would favor GNA-capable nodes (w/SSD) in the long run for this use case. Your cluster will scale better with this large footprint of small files.

Think X400 nodes with SSD instead of NL400; if you need speed, S200 w/SSD for your ingest pool, and manage capacity with ATIME or MTIME.

Sure, they are more expensive than NL, but those SSDs will give you a welcome boost across the cluster.

For your directories, as with any other FS, it's about balance: 10,000 directories with 10,000 files each is still manageable vs. a few directories with 100k files per directory, with or without GNA. Platforms like Windows like to enumerate and cache everything before displaying a folder's contents. Sure, GNA will help in the latter case, but you are only postponing the real question: is your workflow and data usage properly balanced?
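
If you want a quick, purely illustrative check of how balanced your own tree is (the /ifs/data path is just an example; point it at your own share):

    # count entries in each immediate subdirectory and list the heaviest ones
    for d in /ifs/data/*/; do printf '%s %s\n' "$(ls -1 "$d" | wc -l)" "$d"; done | sort -rn | head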

Cheers.

Luc Simard - 415-793-0989
Messages may contain confidential information.
Sent from my iPhone

scott

Jul 25, 2013, 12:13:45 PM
to isilon-u...@googlegroups.com, Jason Davis
In this case 1:30 is my troublemaker. Currently 4 days + 15 hours into the smartfail. Nice command; wish I'd known about it a few days ago.
 
-Scott
 
   Drive Type OpsIn BytesIn SizeIn OpsOut BytesOut SizeOut TimeAvg Slow TimeInQ Queued Busy Used Inodes
 LNN:bay        N/s     B/s      B    N/s      B/s       B      ms  N/s      ms           %    %
     1:1 SATA  66.0    2.5M    38K  373.8     3.1M    8.2K     0.1  0.0    15.0    6.6 26.9 65.0   3.2M
     1:2 SATA  67.0    2.6M    39K  348.4     2.9M    8.2K     0.1  0.0    15.7    6.4 25.9 65.0   3.2M
     1:3 SATA  41.8    1.4M    33K  331.8     2.7M    8.2K     0.2  0.0    22.7    9.4 39.5 65.0   3.2M
     1:4 SATA  69.0    2.4M    35K  348.2     2.9M    8.2K     0.2  0.0    15.5    6.5 37.7 65.0   3.2M
     1:5 SATA  59.8    2.2M    36K  341.2     2.8M    8.2K     0.1  0.0    16.8    6.9 43.9 65.0   3.2M
     1:6 SATA  69.2    2.8M    41K  273.0     2.2M    8.2K     0.1  0.0    14.6    6.1 24.7 65.0   3.2M
     1:7 SATA  92.8    4.1M    44K  328.2     2.7M    8.2K     0.1  0.0    19.8    7.5 23.3 66.2   3.2M
     1:8 SATA  79.0    3.5M    44K  356.8     2.9M    8.2K     0.1  0.0    18.4    7.4 29.3 66.2   3.2M
     1:9 SATA 110.8    5.2M    47K  281.6     2.3M    8.2K     0.1  0.0    16.0    6.6 40.9 66.2   3.2M
    1:10 SATA  85.4    3.6M    43K  308.8     2.5M    8.2K     0.1  0.0    18.1    7.0 40.5 66.2   3.2M
    1:11 SATA  79.6    3.6M    45K  277.8     2.3M    8.2K     0.1  0.0    24.7    9.4 48.9 66.2   3.2M
    1:12 SATA  98.8    4.5M    45K  329.4     2.7M    8.2K     0.1  0.0    18.1    7.2 26.1 66.2   3.2M
    1:13 SATA  64.4    2.5M    38K  366.8     3.0M    8.2K     0.1  0.0    15.3    6.4 29.3 64.9   3.2M
    1:14 SATA  55.2    1.8M    33K  347.4     2.8M    8.2K     0.1  0.0    14.9    6.3 30.9 64.9   3.2M
    1:15 SATA  80.8    3.3M    41K  293.2     2.4M    8.2K     0.2  0.0    14.6    6.2 36.1 64.9   3.2M
    1:16 SATA  52.6    1.8M    35K  339.2     2.8M    8.2K     0.1  0.0    15.6    6.6 35.1 64.9   3.2M
    1:17 SATA  74.6    2.9M    39K  365.8     3.0M    8.2K     0.1  0.0    16.9    7.1 44.1 64.9   3.2M
    1:18 SATA  79.2    3.2M    40K  302.6     2.5M    8.2K     0.2  0.0    16.0    6.6 42.7 64.9   3.2M
    1:19 SATA  73.2    3.2M    43K  367.4     3.0M    8.2K     0.1  0.0    16.6    6.8 29.3 66.8   3.2M
    1:20 SATA  51.4    2.1M    41K  274.0     2.2M    8.2K     0.1  0.0    16.5    7.0 26.5 66.8   3.2M
    1:21 SATA  50.0    1.6M    33K  373.6     3.1M    8.2K     0.1  0.0    13.1    6.0 34.1 66.8   3.2M
    1:22 SATA  70.2    2.7M    38K  334.8     2.7M    8.2K     0.1  0.0    17.7    7.5 28.9 66.8   3.2M
    1:23 SATA  82.8    3.6M    44K  315.0     2.6M    8.2K     0.1  0.0    16.7    7.3 26.1 66.8   3.2M
    1:24 SATA  86.6    3.8M    44K  373.4     3.1M    8.2K     0.2  0.0    15.6    6.5 42.1 66.8   3.2M
    1:25 SATA  83.2    3.3M    39K  342.2     2.8M    8.2K     0.1  0.0    17.9    7.7 44.7 73.6   3.4M
    1:26 SATA  68.2    2.7M    40K  373.8     3.1M    8.2K     0.1  0.0    17.1    7.7 25.9 73.6   3.4M
    1:27 SATA  96.0    3.9M    41K  366.6     3.0M    8.2K     0.1  0.0    16.7    7.2 40.9 73.6   3.4M
    1:28 SATA  94.0    4.4M    46K  334.4     2.7M    8.2K     0.1  0.0    16.2    7.4 33.9 73.6   3.4M
    1:29 SATA  57.0    2.0M    35K  449.4     3.7M    8.2K     0.1  0.0    23.3   10.7 42.1 73.6   3.4M
    1:30 SATA   0.2    1.6K   8.2K    0.0      0.0     0.0     0.1  0.0     1.1    0.0  0.0 64.5    17K
    1:31 SATA  80.0    3.3M    41K  393.0     3.2M    8.2K     0.1  0.0    16.7    6.7 29.1 65.2   3.2M
    1:32 SATA  72.6    3.0M    41K  322.6     2.6M    8.2K     0.1  0.0    16.0    6.8 26.1 65.1   3.2M
    1:33 SATA  88.2    3.6M    40K  328.6     2.7M    8.2K     0.1  0.0    16.7    6.8 39.7 65.1   3.2M
    1:34 SATA  88.2    3.5M    40K  316.4     2.6M    8.2K     0.1  0.0    16.3    6.8 26.5 65.1   3.2M
    1:35 SATA  70.2    2.9M    41K  331.6     2.7M    8.2K     0.1  0.0    16.9    6.9 44.7 65.1   3.2M
    1:36 SATA  62.2    2.4M    39K  313.2     2.6M    8.2K     0.2  0.0    16.8    6.9 37.1 65.2   3.2M

scott

Jul 25, 2013, 12:34:10 PM
to isilon-u...@googlegroups.com
3TB Hitachi disk on an IQ 108NL

Luc Simard

Jul 25, 2013, 3:37:48 PM
to isilon-u...@googlegroups.com
One point to mention: do not put too much stock in the %Busy value. I've replicated use cases where a cluster that is 100% idle, with no client connections and no I/O, would still show 100% busy. This is known to Eng and logged as "broken", or at least to-be-improved.

Peter Serocka

Jul 26, 2013, 3:18:58 AM
to isilon-u...@googlegroups.com, Jason Davis
Looks dead...

I suppose you'll open a case; let us know what you find.

BTW, disks 25-29 are 73% full, while all the others are at 65-66%.

When did the most recent MultiScan (or AutoBalance, which is part of it) finish *successfully*?

MultiScan gets more or less silently "system cancelled" by disk stalls (no events, just syslog messages): have you been seeing any disk stalls in the syslog messages?

You also might check whether the recent monthly MediaScan jobs have been successful. (MediaScan phase 5 is affected by SnapshotDelete, so careful monitoring is recommended.)
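
A rough sketch of where I'd look (the syslog path is the standard FreeBSD-style location on a node, and the job-engine log name is from memory, so treat both as assumptions):

    # any disk stall messages in the local syslog?
    grep -i stall /var/log/messages
    # any MultiScan/MediaScan cancellations noted by the job engine?
    grep -iE 'cancel|multiscan|mediascan' /var/log/isi_job_d.log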


-- Peter