DSR failure

Hemantt Chugh

Mar 17, 2021, 9:41:58 AM

to Isilon Technical User Group

Hello Experts

We have two nodes stuck mounting /ifs with the DSR failures below, and the cluster is not under EMC support. Can anyone advise how we can skip these DSR failures so that FlexProtect can complete successfully?

History

# Node 1 had some issues, and after its boot drive was replaced it did not come up successfully.

# In the meantime we saw boot drive health degrading on node 7 and planned to replace it; unfortunately that node also got stuck at the same point, mounting /ifs.

I need urgent help on the same.

DSR failure on { 5,0,10593525760:8192 }

DSR failure on { 5,0,10549551104:8192 }

Jean-Didier stefaniak

Mar 17, 2021, 5:30:39 PM

to Isilon Technical User Group

This is a very risky situation you find yourself in. At the very least, your data may be at risk. DSR stands for Dynamic Sector Recovery, i.e. one or more drives or nodes have suffered an ECC error. Your LNN 5 has 2 blocks impacted (10593525760 and 10549551104).
If nodes 1 and 7 are also down at the same time, you may have already exceeded your N+M protection level, or be right at the threshold of doing so. You are likely experiencing partial data unavailability at this point, and you risk data loss if you are not careful at every single step from now on.

You say you replaced the boot drives on node#1 but you do not say how you did it. This procedure is a fairly advanced one that requires precise steps in a given order to protect the node's FS-awareness.

It may not be what you want to hear or read, but a forum is definitely not the place to ask for help with this type of issue.
These situations need to be handled extremely carefully, as one wrong step could mean the difference between recovering and losing your data.

Contact your Dell EMC Account Manager and ask them to quote you for "Time and Materials" support.

hemant chugh

Mar 18, 2021, 3:13:51 AM

to isilon-u...@googlegroups.com

Thanks for your response.

The boot drive was replaced by a 3rd-party vendor. This is a DR cluster, so we can remove the DSR failures to get the nodes back online and then replicate the data from the primary cluster.

I am requesting a procedure because the customer has not approved removing the data, but we can overwrite it with SyncIQ; for that, the nodes need to be added back to the cluster. I have currently put the cluster in degraded mode.

Thanks
Hemant
9686630313

--
You received this message because you are subscribed to a topic in the Google Groups "Isilon Technical User Group" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/isilon-user-group/KNquwhdInbM/unsubscribe.
To unsubscribe from this group and all its topics, send an email to isilon-user-gr...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/isilon-user-group/b1a45ae0-44cb-4afa-91ab-f6a9754416f8n%40googlegroups.com.

mandar kolhe

Mar 18, 2021, 3:39:46 AM

to isilon-u...@googlegroups.com

Hemant, DSR failures should not be caused by a boot drive replacement. There is some issue on node ID 5, on the drive with lnum 0. In the logs you can find the LIN, then get its path and use the dd command to see whether the file is readable or returns an I/O error. Support should be engaged: if the corruption spreads, it would be a serious mistake. Also stop the replication and restriping jobs.
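
The readability check suggested above can also be sketched in Python instead of dd. This is only an illustration, not an Isilon tool: the function name probe_readable and the 8192-byte chunk size are assumptions made for the example, and it does nothing more than read a file from start to end and report whether the read hit an I/O error.

```python
import errno


def probe_readable(path, chunk_size=8192):
    """Read a file front to back and report (ok, bytes_read, error_name).

    Hypothetical helper for illustration only. It performs the same kind
    of check as reading the file with dd: either every byte can be read,
    or the read stops with an OS error such as EIO.
    """
    bytes_read = 0
    try:
        with open(path, "rb") as fh:
            while True:
                chunk = fh.read(chunk_size)
                if not chunk:
                    # Reached end of file without any read error.
                    return True, bytes_read, None
                bytes_read += len(chunk)
    except OSError as exc:
        # Map the errno to its symbolic name (EIO, ENOENT, ...).
        name = errno.errorcode.get(exc.errno, "unknown")
        return False, bytes_read, name
```

If the result is (False, n, "EIO"), the read failed after n bytes, which gives a rough idea of where in the file the unreadable region starts.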

hemant chugh

Mar 18, 2021, 4:52:24 AM

to isilon-u...@googlegroups.com

Hello Mandar,

How can I read the DSR location? A LIN can be looked up with isi get; I tried that for the DSR LINs as well, see the details below. We have deleted all snapshots; only the SyncIQ failover snapshots remain. Shall I delete them as well, and then try a smartfail?

sudo isi get -L 4000:0001:0085:0029
isi: Could not find a path to LIN:0x4000000100850029/SNAP:18446744073709551615: Invalid argument

4.9881805 03/15 23:24 C 4 479990 DSR failure on { 5,0,10643939328:8192 } of UNKNOWN:{} owned by 4000:0001:0013:008c::HEAD: syscall failed: _sys_pctl2_advance: EIO
4.9881824 03/15 23:28 C 4 479999 DSR failure on { 3,2,13969637376:8192 } of UNKNOWN:{} owned by 4000:0001:0018:0017::HEAD: syscall failed: _sys_pctl2_advance: EIO
4.9881829 03/15 23:29 C 4 480001 DSR failure on { 5,0,10644971520:8192 } of UNKNOWN:{} owned by 4000:0001:001b:0023::HEAD: syscall failed: _sys_pctl2_advance: EIO
4.9881850 03/15 23:34 C 4 480011 DSR failure on { 5,0,10636500992:8192 } of UNKNOWN:{} owned by 4000:0001:0024:0012::HEAD: syscall failed: _sys_pctl2_advance: EIO
4.9881916 03/15 23:56 C 4 480044 DSR failure on { 5,0,10593525760:8192 } of UNKNOWN:{} owned by 4000:0001:0085:0029::HEAD: syscall failed: _sys_pctl2_advance: EIO
4.9881921 03/15 23:57 C 4 480046 DSR failure on { 5,0,10549551104:8192 } of UNKNOWN:{} owned by 4000:0001:008d:000e::HEAD: syscall failed: _sys_pctl2_advance: EIO
4.9881936 03/16 00:04 C 4 480053 DSR failure on { 5,0,10529497088:8192 } of UNKNOWN:{} owned by 4000:0001:00f2:0007::HEAD: syscall failed: _sys_pctl2_advance: EIO
4.9881942 03/16 00:05 C 4 480055 DSR failure on { 5,0,9801621504:8192 } of UNKNOWN:{} owned by 4000:0001:011f:000f::HEAD: syscall failed: _sys_pctl2_advance: EIO
4.9881944 03/16 00:05 C 4 480055 DSR failure on { 5,0,9801621504:8192 } of UNKNOWN:{} owned by 4000:0001:011f:000f::HEAD: syscall failed: _sys_pctl2_advance: EIO
4.9881954 03/16 00:10 C 4 480061 DSR failure on { 5,0,10545807360:8192 } of UNKNOWN:{} owned by 4000:0001:0173:0016::HEAD: syscall failed: _sys_pctl2_advance: EIO
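
To see at a glance which drives these entries point to, the log lines can be tallied with a short script. The sketch below is an assumption-laden illustration: it reads the tuple as { node ID, drive lnum, offset:length }, following the interpretation given earlier in this thread, and the names DSR_RE, parse_dsr_lines and tally_by_drive are made up for the example.

```python
import re
from collections import Counter

# Assumed tuple layout: { node id, drive lnum, byte offset:length }.
# The "owned by <LIN>::<version>" part is optional, because the short
# form of the message carries only the tuple.
DSR_RE = re.compile(
    r"DSR failure on \{\s*(\d+),(\d+),(\d+):(\d+)\s*\}"
    r"(?:.*?owned by ((?:[0-9a-fA-F]+:)*[0-9a-fA-F]+)::(\w+))?"
)


def parse_dsr_lines(lines):
    """Extract one record per 'DSR failure' line; other lines are skipped."""
    records = []
    for line in lines:
        match = DSR_RE.search(line)
        if match is None:
            continue
        node, drive, offset, length = (int(g) for g in match.groups()[:4])
        records.append({
            "node": node,
            "drive": drive,
            "offset": offset,
            "length": length,
            "owner_lin": match.group(5),  # None when the line has no owner
            "version": match.group(6),    # e.g. "HEAD"
        })
    return records


def tally_by_drive(records):
    """Count failures per (node, drive) pair."""
    return Counter((r["node"], r["drive"]) for r in records)


sample = [
    "4.9881805 03/15 23:24 C 4 479990 DSR failure on { 5,0,10643939328:8192 } "
    "of UNKNOWN:{} owned by 4000:0001:0013:008c::HEAD: syscall failed: "
    "_sys_pctl2_advance: EIO",
    "4.9881824 03/15 23:28 C 4 479999 DSR failure on { 3,2,13969637376:8192 } "
    "of UNKNOWN:{} owned by 4000:0001:0018:0017::HEAD: syscall failed: "
    "_sys_pctl2_advance: EIO",
    "DSR failure on { 5,0,10593525760:8192 }",
]

for (node, drive), count in tally_by_drive(parse_dsr_lines(sample)).items():
    print(f"node {node} drive {drive}: {count} DSR failure(s)")
```

Run against the ten lines above, a tally like this makes it easy to see whether the failures cluster on a single drive or are spread over several.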

Thanks,
Hemant Chugh
+919686630313

hemant chugh

Mar 18, 2021, 5:12:39 AM

to isilon-u...@googlegroups.com

Hello Mandar,

Is there a way to see which snapshot each one refers to? All of them are pointing to snapshots.

sudo isi get -L 1:0a59:7ea1
isi: Could not find a path to LIN:0x10a597ea1/SNAP:18446744073709551615: No such file or directory

sudo isi get -L 4000:0001:0085:0029
isi: Could not find a path to LIN:0x4000000100850029/SNAP:18446744073709551615: Invalid argument

sudo isi get -L 4000:0001:008d:000e
isi: Could not find a path to LIN:0x40000001008d000e/SNAP:18446744073709551615: Invalid argument
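
For cross-checking, the colon-separated LINs from the log can be converted to the hexadecimal form that isi get prints in its error messages. The sketch below assumes that every group after the first is a zero-padded 16-bit hex field, which is consistent with the three examples above; lin_to_int is a hypothetical helper name. The SNAP value shown in the errors is numerically 2**64 - 1; what OneFS uses that value for is not stated here.

```python
def lin_to_int(lin):
    """Convert a LIN such as '4000:0001:0085:0029' to an integer.

    Assumption: every group after the first is a 4-digit hex field, so
    the groups can simply be concatenated. A trailing '::HEAD' style
    suffix, as seen in the log lines, is dropped first.
    """
    lin = lin.split("::")[0]
    groups = lin.split(":")
    hex_digits = groups[0] + "".join(g.zfill(4) for g in groups[1:])
    return int(hex_digits, 16)


# The conversions agree with the values printed by isi get above.
assert hex(lin_to_int("4000:0001:0085:0029")) == "0x4000000100850029"
assert hex(lin_to_int("4000:0001:008d:000e")) == "0x40000001008d000e"
assert hex(lin_to_int("1:0a59:7ea1")) == "0x10a597ea1"

# The SNAP number in the error messages is the largest 64-bit value.
assert 18446744073709551615 == 2**64 - 1
```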

Thanks,
Hemant Chugh
+919686630313

mandar kolhe

Mar 18, 2021, 11:11:59 AM

to isilon-u...@googlegroups.com

Hello Hemant,

Yes, HEAD indicates it is a snapshot LIN. Try something like: isi get -L 4000:0001:0173:0016::HEAD

EIO is an input/output error.

Do you have more devices down? Is your protection lost? Check with isi_group_info and isi_classic stat -q -d -v

There are also some bugs involving snapshot LINs; the logs would need to be reviewed for that. Some of these are false alerts when it is a snapshot LIN in the OneFS 8.x family; I don't remember in which version that was fixed.

Hemantt Chugh

Mar 18, 2021, 1:33:55 PM

to isilon-u...@googlegroups.com

Hello Mandar

Yes, node 1 and node 7 are not able to mount /ifs after their boot drives were replaced. We are on 8.1.2.0.


mandar kolhe

Mar 28, 2021, 6:55:23 PM

to isilon-u...@googlegroups.com

What error does it give? Did you try to mount it manually?
