Isilon hardware refresh procedure (again).


pierluigi...@gmail.com

Jan 14, 2021, 2:57:43 AM
to Isilon Technical User Group
Hello all, and Happy New Year.
I have a question about the subject.
I have read the various info in this group and I'm quite confident about the overall procedure:

1) Cabling: insert the new backend switches in the rack (to have more ports).
2) Disconnect one backend network and move it to the new switch.
3) Verify that the moved BE network is fine.
4) Repeat 2 and 3 for the second BE network.
5) Add the new nodes to the BE switches.
6) Power on the new nodes __but do not join them to the cluster yet__.
7) Join the nodes to the cluster one at a time (a quick check between joins is sketched below).
---- Finish adding the new nodes. Now wait for FlexProtect to do its magic ----
If there are different types of nodes (in my case there are), i.e. separate SmartPools:
8) Create a filepool policy to move data from the old SmartPool to the new one.
When the old pool is empty:
9) Smartfail the old nodes one at a time except the last two (in case of a separate pool): these two should be smartfailed together.
10) Remove the old hardware.

I haven't mentioned adding network cables and IP addresses when and if needed, as that follows by itself (the normal procedure for adding a node).
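Between one join and the next I plan to run a quick check, something like this (just my own sketch, not an official procedure):

isi status -q
isi job status

The first should confirm the cluster is healthy, the second that the FlexProtect/MultiScan started by the previous join has completed (or is at least progressing) before I add the next node.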

Now the questions:
In one of my clusters I have a mix of NL and H400 nodes, so I need to move the data off the NL nodes.
Is this policy OK to move the data?
isi filepool policies create MoveFromOldPool --begin-filter --name="*" --operator=eq --end-filter --data-storage-target=NewPool  --snapshot-storage-target=NewPool --data-ssd-strategy=metadata --snapshot-ssd-strategy=metadata 

And then I wait for the SmartPools job to complete (which will take quite a while).
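Before letting it run I would double-check the policy and kick the job manually, more or less like this (MoveFromOldPool and NewPool are just the names from my example above):

isi filepool policies view MoveFromOldPool
isi job jobs start SmartPools
isi job status -v

The view is just to confirm the filter and the storage targets before the job starts restriping everything.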

I have read in old posts here that I should add nodes "slowly" (I've read between two and four hours per node) to avoid quorum problems.
Is there a command to check that quorum is fine?

One last question: am I missing something?

Thanks in advance 

Pierluigi
 

pierluigi...@gmail.com

Jan 14, 2021, 4:23:10 AM
to Isilon Technical User Group
I will add a little bit I've found (and it could change the procedure):
After adding enough nodes to the new pool to hold all the data from the NL ones, it's better to change the default filepool policy to send all new data to the new pool.
I'm not sure, though, whether modifying the default policy will also move the old data to the new pool.
Just to be on the safe side, I think it's better to add the new nodes first, and once the new space is available modify the default policy (for new writes) and create the additional policy to move the old data.
What do you think?
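Roughly this is what I have in mind for the default policy (assuming the syntax matches the per-policy one, and that NewPool is the name of the new tier):

isi filepool default-policy modify --data-storage-target=NewPool --snapshot-storage-target=NewPool
isi filepool default-policy view

and then the MoveFromOldPool policy from my first message for the data that is already written.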

Thx
Pierluigi

Anurag Chandra

Jan 14, 2021, 4:41:16 AM
to isilon-u...@googlegroups.com
Hi Pierluigi,

Your procedure looks correct.

To verify quorum, use: isi_group_info

Adding nodes one at a time, every minute or so, should be OK. Remember you will need to run MultiScan or AutoBalance to ensure the capacity is balanced across the nodes.
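For example, something along these lines (a sketch; isi_group_info is an internal tool, and on recent OneFS you can also read the group directly with sysctl):

isi_group_info
sysctl efs.gmp.group
isi job jobs start MultiScan

If every node LNN shows up in the group reported by sysctl efs.gmp.group, quorum is fine. MultiScan runs Collect plus AutoBalance; you can also start AutoBalance on its own with isi job jobs start AutoBalance.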

Thanks,
Anurag


pierluigi...@gmail.com

Jan 18, 2021, 11:13:11 AM
to Isilon Technical User Group
Thanks!!!
I need another piece of info, but I'll open a new thread for that!

pierluigi...@gmail.com

Jan 19, 2021, 7:21:39 AM
to Isilon Technical User Group
Yesterday I did some tests in a test environment using the Isilon simulator.
I created a cluster (on VMware) of 3 nodes, moved some data onto it and then added 3 more nodes.
With the disi command I removed the 3 new nodes from the pool:
disi -I diskpools modify  v200_25gb_2gb:2 --remove=4
disi -I diskpools modify  v200_25gb_2gb:2 --remove=5
disi -I diskpools modify  v200_25gb_2gb:2 --remove=6

and then created a new storagepool with these three:
isi storagepool nodepools create v200_25gb_4gb -n 4,5,6

This got me into a situation similar to the one I have in my prod cluster, e.g.:
CL-Isi-Test-1# disi -I diskpools list -v                                                      
Name                      Id  Type Prot Flags    Members              VHS   HDD Used / Size       SSD Used / Size       
------------------------------------------------------------------------------------------------------------------------
v200_25gb_2gb             1   G    +2d: SDH----- 2                    1      585M /   32G (2%   )     0 /     0 (n/a  ) 
                                   1n                                                                                   
v200_25gb_2gb:2           2   D    +2d: S------- 1-3:bay2-7           -      585M /   32G (2%   )     0 /     0 (n/a  ) 
                                   1n                                                                                   
v200_25gb_4gb             18  G    +2d: SDH-M--- 19                   1      6.9G /   32G (22%  )     0 /     0 (n/a  ) 
                                   1n                                                                                   
v200_25gb_4gb:19          19  D    +2d: S------- 4-6:bay2-7           -      6.9G /   32G (22%  )     0 /     0 (n/a  ) 
                                   1n                                                                                   

------------------------------------------------------------------------------------------------------------------------

At this point I played with filepool policies to move data from one pool to the other.
It worked as expected, but... as expected, some data remained on the old pool.
E.g.:
CL-Isi-Test-1# isi get -D ./.ifsvar/modules/nfs/nsm/nsm_1_10.155.235.12.db-journal | grep pool
*  Disk pools:         policy system policy ID -> data target v200_25gb_2gb:2(2), metadata target v200_25gb_2gb:2(2)
        Disk pool policy ID      0      5   
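To get a rough idea of what is still pointing at the old pool I walked the tree with isi get (slow on a big tree, and /ifs/data is just an example path here):

isi get -DR /ifs/data | grep "data target v200_25gb_2gb"

Anything that still shows the old diskpool as its data target has not been moved yet.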

Now I have 2 more questions:
1) How do I smartfail two nodes at a time?
    There is no command to remove two nodes in a single invocation (as there used to be in the 7.x versions).
    Should I issue the two commands sequentially?
2) Will the remaining data in the old pool be moved to the remaining one, or, as stated in previous messages, do I have to open an SR with EMC support to help decommission the old pool?

Thanks in advance.
Pierluigi

mandar kolhe

Jan 19, 2021, 7:59:50 AM
to isilon-u...@googlegroups.com
The remaining data will be moved to the other storagepool as per global spillover. When a smartfail happens it restripes the data, it doesn't delete it, so nothing should happen to the data. Sometimes, though, there is a chance that manually managed files are left in the existing pool. You can smartfail using:

# isi devices node smartfail --node-lnn=4,5

or run the command twice; it should work.
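You can check where global spillover points from the SmartPools settings (syntax from 8.x, if I remember right):

isi storagepool settings view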


Anurag Chandra

Jan 19, 2021, 8:17:13 AM
to isilon-u...@googlegroups.com
Note that the smartfail command starts a FlexProtect job, visible via isi job status -v. The nodes will complete the smartfail when each of the FlexProtect jobs completes.
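To follow the progress, something like this is usually enough (the job ID is just an example, take it from the list output):

isi job jobs list
isi job jobs view 273

The view shows the phase and progress of the running FlexProtect job.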

Thanks,
Anurag


mandar kolhe

Jan 19, 2021, 8:43:32 AM
to isilon-u...@googlegroups.com
In case you are running the smartfail command twice, make sure you cancel the existing FlexProtect job and let the new job run, so it will smartfail both nodes at the same time.

Pierluigi Frullani

Jan 20, 2021, 2:08:02 AM
to isilon-u...@googlegroups.com
I've tried this on an 8.1.2.0 cluster (simulator).
It does not work:

CL-Isi-Test-4# isi devices smartfail --node-lnn=1,2  
Option --node-lnn has an error:
Cannot parse value: 1,2 as type int.
Usage:
    isi devices drive smartfail { <bay> | --lnum <integer> | --sled <string> }
        [--node-lnn <integer>]
        [{--force | -f}]
        [{--verbose | -v}]
        [{--help | -h}]
See 'isi devices drive smartfail --help' for more information.

Pierluigi

Erik Weiman

Jan 20, 2021, 2:09:21 AM
to isilon-u...@googlegroups.com
It’s not recommended to smartfail more than 1 node at a time. 

--
Erik Weiman 
Sent from my iPhone 7


mandar kolhe

Jan 20, 2021, 3:58:27 AM
to isilon-u...@googlegroups.com
Then you should run the command twice. After running the first command it will start a FlexProtect job, so cancel that job before running it the second time; when you run the second command it will automatically kick off another job.
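In practice, something like this (the LNNs and the job ID are just examples):

isi devices node smartfail --node-lnn=4
isi job jobs list
isi job jobs cancel 284
isi devices node smartfail --node-lnn=5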


--
You received this message because you are subscribed to the Google Groups "Isilon Technical User Group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to isilon-user-gr...@googlegroups.com.

Pierluigi Frullani

Jan 21, 2021, 2:44:28 AM
to isilon-u...@googlegroups.com
I understand, but I need to smartfail the last two nodes of a cluster in a mixed configuration (old Gen5 and new Gen6 nodes).
Dell EMC says "smartfail the last two nodes together".

Pierluigi

mandar kolhe

Jan 21, 2021, 2:49:46 AM
to isilon-u...@googlegroups.com
They recommend smartfailing the last 2 nodes together when you want to smartfail that entire storage pool; for other scenarios they recommend doing it one by one.


Pierluigi Frullani

Jan 21, 2021, 6:50:57 AM
to isilon-u...@googlegroups.com
It surely helps. I can have a script that does this, or else open two terminals on the two nodes (or even on a single one) and issue the commands at almost the same time.
Thanks!

On Thu, Jan 21, 2021 at 10:35 AM <karim....@gmail.com> wrote:
Hi,
If you issue the two smartfail commands consecutively, max ~15 seconds apart, OneFS will create a single SmartFail job for these two nodes and they will smartfail together.

I hope this helps.
Karim

Pierluigi Frullani

Apr 10, 2021, 4:47:24 AM
to isilon-u...@googlegroups.com
Just for future reference, this advice did the job.
I ran:
isi devices node smartfail --node-lnn=14 && isi devices node smartfail --node-lnn=16
and now the remaining nodes from the old smartpool are smartfailing together.
Thanks to all!
