Isilon Unbalanced Storage

sc...@tacomadata.com

Dec 2, 2025, 7:37:25 PM
to Isilon Technical User Group
This year I was asked to help with an Isilon project for our storage team. We were planning a datacenter move and wanted to eject the oldest tech (the first 20 nodes) from a 40-node Isilon cluster.

To make the smartfail of nodes 1-20 complete more quickly, our tech support had us enable a trial SmartPools license, then configure it to move data from nodes 1-20 to nodes 21-40. This all went well.

The cluster has been operating on 20 nodes, the move is still pending, but here's where it gets interesting:

Last month I got a 'pool is getting full' warning. Our SmartPools license expired in July as expected. I'd noticed AutoBalance and FSAnalyze had each failed twice, so I started them manually (they did complete).

I realized I'd overlooked that my networked nodes 33-40 from pool 'a2000_200tb_800gb-ssd-sed_64gb' were all filled to >99%, while nodes 21-32 from pool 'a40_200tb_800gb-ssd-sed_16gb' are only filled to 36%.

I made sure AutoBalance completed correctly, but I think it might have only balanced data within each nodepool.

Here is what I wonder: did we inadvertently turn off a setting that keeps the whole cluster balanced when we enabled the temporary SmartPools license? When I look here I see 'Balanced' set to 'No' on one of the nodepools. Maybe there is a way to set this to 'Yes'? :-)

# isi storagepool nodepools list -v
                     ID: 19
                   Name: a40_200tb_800gb-ssd-sed_16gb
                  Nodes: 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32
          Node Type IDs: 2
      Protection Policy: +3d:1n
                 Manual: No
             L3 Enabled: Yes
    L3 Migration Status: l3
                   Tier: Archive
                  Usage
                Avail Bytes: 1.35P
            Avail SSD Bytes: 0.00
                   Balanced: No
                 Free Bytes: 1.36P
             Free SSD Bytes: 0.00
                Total Bytes: 2.11P
            Total SSD Bytes: 0.00
    Virtual Hot Spare Bytes: 19.12T
--------------------------------------------------------------------------------
                     ID: 50
                   Name: a2000_200tb_800gb-ssd-sed_64gb
                  Nodes: 33, 34, 35, 36, 37, 38, 39, 40
          Node Type IDs: 3
      Protection Policy: +2d:1n
                 Manual: No
             L3 Enabled: Yes
    L3 Migration Status: l3
                   Tier: Archive
                  Usage
                Avail Bytes: 390.98G
            Avail SSD Bytes: 0.00
                   Balanced: Yes
                 Free Bytes: 19.41T
             Free SSD Bytes: 0.00
                Total Bytes: 1.41P
            Total SSD Bytes: 0.00
    Virtual Hot Spare Bytes: 19.02T
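
The fill levels quoted earlier can be cross-checked against the Total Bytes and Avail Bytes values in this listing. The following is only a minimal Python sketch, not an Isilon tool: the helper names parse_size and percent_used are made up for illustration, and it assumes the P/T/G suffixes are binary (1024-based) units.

```python
# Sketch: percent used per node pool, computed from the sizes printed by
# `isi storagepool nodepools list -v`.
# Assumption: the unit suffixes are binary (1 K = 1024 bytes).

UNITS = {"K": 1024**1, "M": 1024**2, "G": 1024**3, "T": 1024**4, "P": 1024**5}


def parse_size(text: str) -> float:
    """Convert a string such as '1.35P' or '390.98G' to bytes."""
    text = text.strip()
    if text and text[-1].upper() in UNITS:
        return float(text[:-1]) * UNITS[text[-1].upper()]
    return float(text)  # plain number with no suffix, e.g. '0.00'


def percent_used(total: str, avail: str) -> float:
    """Share of the pool that is no longer available, in percent."""
    return 100.0 * (1.0 - parse_size(avail) / parse_size(total))


# (Total Bytes, Avail Bytes) copied from the listing above.
pools = {
    "a40_200tb_800gb-ssd-sed_16gb": ("2.11P", "1.35P"),
    "a2000_200tb_800gb-ssd-sed_64gb": ("1.41P", "390.98G"),
}

for name, (total, avail) in pools.items():
    print(f"{name}: {percent_used(total, avail):.2f}% used")
```

With the numbers above this comes out at roughly 36% for the a40 pool and over 99% for the a2000 pool, which matches the figures in the post. It uses Avail Bytes rather than Free Bytes, so space reserved for the virtual hot spare counts as used.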

Hope someone here can help, or give me ideas on what I can do.


Joseph DAndrea

Dec 15, 2025, 10:44:12 AM
to isilon-u...@googlegroups.com
If you still have nodes/pools over about 85 percent full, AutoBalance may not work correctly. I have attached a document that I have had to use several times in customer environments when their space utilization gets out of hand.

I'm not sure about the behavior of a SmartPools license expiring. My instinct is that you let it get too far out of hand and the only option is the procedure in the document I attached.


--
Regards
Joseph DAndrea
USMC 02-07
AZANG 07-08

Do not worry about your difficulties in Mathematics. I can assure you mine are still greater
                                                                                                                                 -Albert Einstein
PowerScale_ Using AutoBalanceLin to quickly move data off of a full node pool _ Dell US.pdf

Luc Simard

Dec 15, 2025, 11:46:05 PM
to isilon-u...@googlegroups.com
Use AutoBalanceLin if you possibly can; it's less of a heavy lift. I also recommend you resolve your license expiration with your Dell team.

Cheers.

/ls

Luc Simard
Personal Email:
simard.J...@gmail.com
Mobile Vox: 415-501-0438
Washington State, USA


Joseph DAndrea

Dec 17, 2025, 11:41:59 AM
to isilon-u...@googlegroups.com
When I get home I can shoot you the AutoBalanceLin instructions.


sc...@tacomadata.com

Jan 8, 2026, 5:57:42 PM
to Isilon Technical User Group
I finally got this worked out - thought I'd post an update here to close the circuit:

Our cluster is made up of two similar but different versions of Isilon hardware, so we have 20 nodes in this cluster and 2 nodepools (12 nodes in one and 8 in the other, per the listing in my first post).

On newer versions of Isilon there is a job, 'SetProtectPlus', that balances the cluster across nodepools.
If 'SmartPools' is licensed (even if the license has expired), it prevents 'SetProtectPlus' from running.

With Isilon you cannot remove a license. You can only request a new set of licenses that omits the single license you want to remove.

Early in 2025, a contractor enabled a trial of SmartPools to move some data around and help us smartfail some nodes. In July that license expired, and the node pools started creeping slowly out of balance, with one filling up.

To correct the full nodepool, I started AutoBalance, then AutoBalanceLin. Each takes around 10 days, and this is when I found out these jobs don't balance across nodepools anymore. I ultimately discovered the expired SmartPools license was preventing 'SetProtectPlus' from running.

Here is what happens if I try to start SetProtectPlus:

isi job start SetProtectPlus --policy medium
Job operation failed: SmartPools is licensed, and so SetProtectPlus is disabled. Use SmartPools to control cluster protection.: Operation not permitted
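
If you script job starts, this particular failure can be told apart from other errors by matching the message text. The following is a rough Python sketch under stated assumptions: the helper names are hypothetical, the marker string is copied from the error quoted above and may differ on other OneFS versions, and the only command it runs is the one shown in this post.

```python
import subprocess

# Text copied from the error message quoted above; other OneFS versions
# may word it differently (assumption).
BLOCKED_MARKER = "SmartPools is licensed, and so SetProtectPlus is disabled"


def blocked_by_smartpools(output: str) -> bool:
    """True if the job output shows the SmartPools license blocking the job."""
    return BLOCKED_MARKER in output


def start_setprotectplus() -> str:
    """Hypothetical wrapper: run the command from the post and classify the result."""
    result = subprocess.run(
        ["isi", "job", "start", "SetProtectPlus", "--policy", "medium"],
        capture_output=True,
        text=True,
    )
    output = result.stdout + result.stderr
    if blocked_by_smartpools(output):
        return "blocked-by-smartpools-license"
    return "started" if result.returncode == 0 else "failed"
```

start_setprotectplus() only makes sense on a cluster node where the isi command is available; blocked_by_smartpools() can be tried anywhere against saved output.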


-- 

The fix (don't do this - it's not a normal user command, and while it worked for me it might break things for you):

isi_gconfig -t licensing -R licensing.features.SMARTPOOLS.json

This command removed the expired SmartPools license, and I was able to start SetProtectPlus. It ran for 10 days and crashed; I started it again and it ran for another week. After that I was able to run AutoBalance and return things to normal.

Good luck!
