Time to smartfail a disk


scott

Jul 24, 2013, 2:51:19 PM
to isilon-u...@googlegroups.com
I completed an upgrade from 6.5.5.11 to 7.0.2.1 last Thursday - things seemed to go well. 
Over the weekend OneFS decided a drive was unreliable and it was placed into 'Smartfail' status.
 
FlexProtect has been working to cleanly remove this drive for almost 4 days. 
 
Job                        Impact Pri Policy     Phase Run Time
-------------------------- ------ --- ---------- ----- ----------
FlexProtect[251]           Medium 1   MEDIUM     2/6   3d 17:35
        (Working on nodes: None and drives: 1:bay30)
        Progress: Processed 9602495 lins; 0 zombies and 0 errors

Progress seems to have hung around the 9 million LIN mark, and I have around 200 million LINs on this cluster. 
 
I haven't had a lot of disks fail. How long does it normally take for the smartfail of a disk (FlexProtect) to complete?
 
Thanks
 
Scott

Matt Dey

Jul 24, 2013, 3:48:12 PM
to isilon-u...@googlegroups.com
How big is the disk and what kind of node are you using?

Steven Kreuzer

Jul 24, 2013, 4:02:28 PM
to isilon-u...@googlegroups.com
On Wed, Jul 24, 2013 at 3:48 PM, Matt Dey <matt...@gmail.com> wrote:
> How big is the disk and what kind of node are you using?

Also, how much free space do you have in the smartpool that drive is in? The less free space you have, the longer it is going to take.

You can also bump the policy the FlexProtect job is running at to high (isi job modify -j 251 --policy high), but I have heard conflicting information on whether this actually speeds up the rebuild. Running at high can lead to spindle contention, and if that happens, you won't go any faster than at medium, but you do increase your chances of throwing ECCs and hitting drive stalls.
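
If you do go down that road, here is a minimal sketch (this assumes the OneFS 7.x syntax above; verify the exact subcommands on your release, and keep the spindle-contention caveat in mind):

    # see what the job engine is running and at what impact level
    isi job status
    # raise FlexProtect job 251 to the high impact policy
    isi job modify -j 251 --policy high
    # drop it back down if you start seeing drive stalls or ECC errors
    isi job modify -j 251 --policy medium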

Cory Snavely

Jul 24, 2013, 4:30:17 PM
to isilon-u...@googlegroups.com
The questions all make sense as far as diagnosis, but just to be clear,
no matter what, that's way longer than what you should be seeing. I'd
open an SR on that right away.

Luc Simard

Jul 24, 2013, 5:51:16 PM
to isilon-u...@googlegroups.com, isilon-u...@googlegroups.com
Never raise a smartfail to high. That's a very bad idea.

In that case you are telling the cluster to commit everything to this job. It puts extra pressure on the drives and could cause additional drives to fail. No point in rushing this.

Time to finish varies; OneFS will balance client churn and disk I/O, and it also depends on whether you have SSDs with GNA in the pool where the failure occurred.

If your workflow is based on a large number of small files, on non-GNA nodes, with shallow directories holding a large number of files per directory, that will impact time to completion.

Luc Simard - 415-793-0989
Messages may contain confidential information.
Sent from my iPhone

Jason Davis

Jul 24, 2013, 7:45:56 PM
to isilon-u...@googlegroups.com

Yes, this is pretty much what I see; our clusters have lots and lots of small files. A 4-day SmartFail is not all that improbable on a heavily used pool.

So the short explanation is: if you have tons of LINs, adjust your drive SmartFail expectations accordingly.

I'm not certain whether SmartFail performance is any different from OneFS 6.5.5 to 7.x.

Peter Serocka

Jul 25, 2013, 3:05:24 AM
to Jason Davis, isilon-u...@googlegroups.com
isi statistics drive --long 

lists, among many parameters, the Inode usage per disk.
Can you see the number of Inodes diminishing 
on the smartfailing drive?
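
For example, a rough way to watch just that bay over time (the grep pattern and bay number are illustrative; adjust them to match your node and your version's column layout):

    # sample the per-drive stats once a minute, keeping only node 1, bay 30
    while true; do
        isi statistics drive --long | grep ' 1:30 '
        sleep 60
    done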

-- Peter


Peter Serocka
CAS-MPG Partner Institute for Computational Biology (PICB)
Shanghai Institutes for Biological Sciences (SIBS)
Chinese Academy of Sciences (CAS)
320 Yue Yang Rd, Shanghai 200031, China





Luc Simard

Jul 25, 2013, 9:55:16 AM
to isilon-u...@googlegroups.com, isilon-u...@googlegroups.com
There is a small difference, but the big improvements will be in 7.1 later this year.

So I would favor GNA-capable nodes (w/SSD) in the long run for this use case. Your cluster will scale better with this large footprint of small files.

Think X400 nodes with SSD instead of NL400; if you need speed, S200 w/SSD for your ingest pool, and manage capacity with ATIME or MTIME.

Sure, they are more expensive than NL, but those SSDs will give you a welcome boost across the cluster.

For your directories, as with any other FS, it's about balance: 10,000 directories with 10,000 files each is still manageable vs. a few directories with 100k files per directory, with or without GNA. Platforms like Windows like to enumerate and cache everything before displaying a folder's contents. Sure, GNA will help in the latter case, but you are only postponing the real question: is your workflow and data usage properly balanced?
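
If you want a quick, purely illustrative check of how balanced your own tree is (the /ifs/data path is just an example; point it at your own share):

    # count entries in each immediate subdirectory and list the heaviest ones
    for d in /ifs/data/*/; do printf '%s %s\n' "$(ls -1 "$d" | wc -l)" "$d"; done | sort -rn | head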

Cheers.

Luc Simard - 415-793-0989
Messages may contain confidential information.
Sent from my iPhone

scott

Jul 25, 2013, 12:13:45 PM
to isilon-u...@googlegroups.com, Jason Davis
In this case 1:30 is my troublemaker. Currently 4 days + 15 hours into the smartfail. Nice command; wish I'd known about it a few days ago.
 
-Scott
 
   Drive Type OpsIn BytesIn SizeIn OpsOut BytesOut SizeOut TimeAvg Slow TimeInQ Queued Busy Used Inodes
 LNN:bay        N/s     B/s      B    N/s      B/s       B      ms  N/s      ms           %    %
     1:1 SATA  66.0    2.5M    38K  373.8     3.1M    8.2K     0.1  0.0    15.0    6.6 26.9 65.0   3.2M
     1:2 SATA  67.0    2.6M    39K  348.4     2.9M    8.2K     0.1  0.0    15.7    6.4 25.9 65.0   3.2M
     1:3 SATA  41.8    1.4M    33K  331.8     2.7M    8.2K     0.2  0.0    22.7    9.4 39.5 65.0   3.2M
     1:4 SATA  69.0    2.4M    35K  348.2     2.9M    8.2K     0.2  0.0    15.5    6.5 37.7 65.0   3.2M
     1:5 SATA  59.8    2.2M    36K  341.2     2.8M    8.2K     0.1  0.0    16.8    6.9 43.9 65.0   3.2M
     1:6 SATA  69.2    2.8M    41K  273.0     2.2M    8.2K     0.1  0.0    14.6    6.1 24.7 65.0   3.2M
     1:7 SATA  92.8    4.1M    44K  328.2     2.7M    8.2K     0.1  0.0    19.8    7.5 23.3 66.2   3.2M
     1:8 SATA  79.0    3.5M    44K  356.8     2.9M    8.2K     0.1  0.0    18.4    7.4 29.3 66.2   3.2M
     1:9 SATA 110.8    5.2M    47K  281.6     2.3M    8.2K     0.1  0.0    16.0    6.6 40.9 66.2   3.2M
    1:10 SATA  85.4    3.6M    43K  308.8     2.5M    8.2K     0.1  0.0    18.1    7.0 40.5 66.2   3.2M
    1:11 SATA  79.6    3.6M    45K  277.8     2.3M    8.2K     0.1  0.0    24.7    9.4 48.9 66.2   3.2M
    1:12 SATA  98.8    4.5M    45K  329.4     2.7M    8.2K     0.1  0.0    18.1    7.2 26.1 66.2   3.2M
    1:13 SATA  64.4    2.5M    38K  366.8     3.0M    8.2K     0.1  0.0    15.3    6.4 29.3 64.9   3.2M
    1:14 SATA  55.2    1.8M    33K  347.4     2.8M    8.2K     0.1  0.0    14.9    6.3 30.9 64.9   3.2M
    1:15 SATA  80.8    3.3M    41K  293.2     2.4M    8.2K     0.2  0.0    14.6    6.2 36.1 64.9   3.2M
    1:16 SATA  52.6    1.8M    35K  339.2     2.8M    8.2K     0.1  0.0    15.6    6.6 35.1 64.9   3.2M
    1:17 SATA  74.6    2.9M    39K  365.8     3.0M    8.2K     0.1  0.0    16.9    7.1 44.1 64.9   3.2M
    1:18 SATA  79.2    3.2M    40K  302.6     2.5M    8.2K     0.2  0.0    16.0    6.6 42.7 64.9   3.2M
    1:19 SATA  73.2    3.2M    43K  367.4     3.0M    8.2K     0.1  0.0    16.6    6.8 29.3 66.8   3.2M
    1:20 SATA  51.4    2.1M    41K  274.0     2.2M    8.2K     0.1  0.0    16.5    7.0 26.5 66.8   3.2M
    1:21 SATA  50.0    1.6M    33K  373.6     3.1M    8.2K     0.1  0.0    13.1    6.0 34.1 66.8   3.2M
    1:22 SATA  70.2    2.7M    38K  334.8     2.7M    8.2K     0.1  0.0    17.7    7.5 28.9 66.8   3.2M
    1:23 SATA  82.8    3.6M    44K  315.0     2.6M    8.2K     0.1  0.0    16.7    7.3 26.1 66.8   3.2M
    1:24 SATA  86.6    3.8M    44K  373.4     3.1M    8.2K     0.2  0.0    15.6    6.5 42.1 66.8   3.2M
    1:25 SATA  83.2    3.3M    39K  342.2     2.8M    8.2K     0.1  0.0    17.9    7.7 44.7 73.6   3.4M
    1:26 SATA  68.2    2.7M    40K  373.8     3.1M    8.2K     0.1  0.0    17.1    7.7 25.9 73.6   3.4M
    1:27 SATA  96.0    3.9M    41K  366.6     3.0M    8.2K     0.1  0.0    16.7    7.2 40.9 73.6   3.4M
    1:28 SATA  94.0    4.4M    46K  334.4     2.7M    8.2K     0.1  0.0    16.2    7.4 33.9 73.6   3.4M
    1:29 SATA  57.0    2.0M    35K  449.4     3.7M    8.2K     0.1  0.0    23.3   10.7 42.1 73.6   3.4M
    1:30 SATA   0.2    1.6K   8.2K    0.0      0.0     0.0     0.1  0.0     1.1    0.0  0.0 64.5    17K
    1:31 SATA  80.0    3.3M    41K  393.0     3.2M    8.2K     0.1  0.0    16.7    6.7 29.1 65.2   3.2M
    1:32 SATA  72.6    3.0M    41K  322.6     2.6M    8.2K     0.1  0.0    16.0    6.8 26.1 65.1   3.2M
    1:33 SATA  88.2    3.6M    40K  328.6     2.7M    8.2K     0.1  0.0    16.7    6.8 39.7 65.1   3.2M
    1:34 SATA  88.2    3.5M    40K  316.4     2.6M    8.2K     0.1  0.0    16.3    6.8 26.5 65.1   3.2M
    1:35 SATA  70.2    2.9M    41K  331.6     2.7M    8.2K     0.1  0.0    16.9    6.9 44.7 65.1   3.2M
    1:36 SATA  62.2    2.4M    39K  313.2     2.6M    8.2K     0.2  0.0    16.8    6.9 37.1 65.2   3.2M

scott

Jul 25, 2013, 12:34:10 PM
to isilon-u...@googlegroups.com
3TB Hitachi disk on an IQ 108NL

Luc Simard

Jul 25, 2013, 3:37:48 PM
to isilon-u...@googlegroups.com
One point to mention: do not put too much stock in the %Busy value. I've replicated use cases where a cluster that is 100% idle, with no client connections and no I/O, would still show 100% busy. This is known to Eng and logged as "broken", or at least to-be-improved.

Peter Serocka

Jul 26, 2013, 3:18:58 AM
to isilon-u...@googlegroups.com, Jason Davis
Looks dead...

I suppose you'll open a case; let us know what you find.

BTW, disks 25-29 are 73% full, while all the others are at 65-66%.

When did the most recent MultiScan (or AutoBalance, which is part of it) finish *successfully*?

MultiScan gets more or less silently "system cancelled" by disk stalls (no events, just syslog messages): have you been seeing any disk stalls in the syslog messages?

You also might check whether the recent monthly MediaScan jobs have been successful. (MediaScan phase 5 is affected by SnapshotDelete, so careful monitoring is recommended.)
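
A rough sketch of where I'd look (the syslog path is the standard FreeBSD-style location on a node, and the job-engine log name is from memory, so treat both as assumptions):

    # any disk stall messages in the local syslog?
    grep -i stall /var/log/messages
    # any MultiScan/MediaScan cancellations noted by the job engine?
    grep -iE 'cancel|multiscan|mediascan' /var/log/isi_job_d.log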


-- Peter