Isilon IQ12000 issues.


Dowlet Komekow

Mar 4, 2024, 1:05:53 PM
to Isilon Technical User Group
Good day all,
I would appreciate it if somebody could help me resolve some issues with an Isilon IQ12000.
I have a cluster of 3 nodes. Nodes 1 and 2 are 100% utilized; the third node is 83% full. In addition, node 2 has one failed drive. Because node 2 has no free space, I can neither rebuild the drive nor start isi_job_d or any other job on it.
I recently joined the company and don't know what to do.
Please advise.
Appreciate your assistance.
[Attachments: 1.jpg, image (1).png]

Luc Simard

Mar 4, 2024, 1:57:00 PM
to isilon-u...@googlegroups.com
Hi Dowlet, 


On the SmartPools tab, check whether the VHS (virtual hot spare) feature is enabled. If it is, release it (uncheck the box). Plan to re-enable it later with at least 4 drives of redundancy or a 10% reserve.

If you have access to InsightIQ, use it to find the least-used directories.

You will get a better outcome if you stop writes as soon as possible. Install the DiskoverData DiskoveryWeb tool if you don't already have it running, find cold data, and move it ASAP to another Isilon/PowerScale storage repository.
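If neither InsightIQ nor a Diskover install is available, a rough first pass at finding cold data can be sketched in Python. This is only an illustration under stated assumptions: Python 3 is available on a client that has /ifs mounted, modification time is an acceptable stand-in for "cold" (access-time tracking may be turned off), and the function names and the /ifs/data starting point are hypothetical. The walk is read-only, but it can take a long time on a large tree.

```python
import os
import stat
import time


def cold_usage_by_dir(root, min_age_days=365, now=None):
    """Map each top-level entry under root to the bytes held in files
    whose mtime is older than min_age_days."""
    now = time.time() if now is None else now
    cutoff = now - min_age_days * 86400
    totals = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        top = "." if rel == "." else rel.split(os.sep)[0]
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            # Count regular files only (no symlinks, sockets, devices).
            if stat.S_ISREG(st.st_mode) and st.st_mtime < cutoff:
                totals[top] = totals.get(top, 0) + st.st_size
    return totals


def report(root, min_age_days=365):
    """Print the coldest top-level directories, largest first."""
    totals = cold_usage_by_dir(root, min_age_days)
    for top, size in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
        print("%10.1f GiB  %s" % (size / 2.0 ** 30, os.path.join(root, top)))


if __name__ == "__main__":
    # Hypothetical starting point; pick the tree you actually want to inspect.
    report("/ifs/data", min_age_days=365)
```

The output is a candidate list for review, not a decision about what to move; check with the data owners before migrating anything.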

Until the drive can be failed out, you are in failed-state limbo because the cluster is at full capacity.

Run FlexProtect, replace the drive, then run AutoBalanceLin.

I strongly encourage you to look at a tech upgrade; Gen2 and Gen3 gear is well beyond its supported life cycle.

Once the system is whole again, get more drives on hand, then reboot node 1. If no drives fail, reboot node 2; if no drives fail, reboot node 3.

You may have other near-death drives lurking in the nodes; a quarterly reboot is not a bad idea with older gear.

Once you have high confidence, migrate the data off this cluster if you can and retire it.

Cheers.

/ls

Luc Simard
Personal Email:
simard.J...@gmail.com
Mobile Vox: 415-501-0438
Washington State, USA


Dowlet Komekow

Mar 5, 2024, 4:05:08 AM
to Isilon Technical User Group
When I try to uncheck "Reduce amount of available space" and "Deny new data writes", I get an error (see below).
How can I free up some space to proceed with the rebuild?

[Attachment: 2.jpg]



Jerry

Mar 5, 2024, 9:39:57 AM
to isilon-u...@googlegroups.com
Go rooting around in the ifsvar dir for core files or other log files you might be able to truncate?


Luc Simard

Mar 5, 2024, 2:04:42 PM
to isilon-u...@googlegroups.com

That is a good strategy; you may have a number of old core files.


Be mindful of what you do under /ifs/.ifsvar/, as it contains cluster-global resources and configuration.

Take a look at:

/var/crash/*.core
/var/crash/*.hangdumps
/var/crash/*.tcpdump (or related tcpdump files captured recently)

/ifs/data/Isilon_Support/pkg/*.pkg
/ifs/data/Isilon_Support/pkg/*.core
/ifs/data/Isilon_Support/pkg/*.hangdumps
/ifs/data/Isilon_Support/pkg/*.tcpdump (or related files)
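
Before deleting anything, it can help to see how much space those files actually hold. The sketch below only lists matches and sizes; it deletes nothing. It assumes a Python 3 interpreter is available where the paths are visible, and the function name is hypothetical. The patterns are the ones listed above with normalized spelling; adjust them to what really exists. Since /var/crash is local to each node, that part would need to be run on every node.

```python
import glob
import os

# Patterns from the list above; spelling normalized, adjust as needed.
PATTERNS = [
    "/var/crash/*.core",
    "/var/crash/*.hangdumps",
    "/var/crash/*.tcpdump",
    "/ifs/data/Isilon_Support/pkg/*.pkg",
    "/ifs/data/Isilon_Support/pkg/*.core",
    "/ifs/data/Isilon_Support/pkg/*.hangdumps",
    "/ifs/data/Isilon_Support/pkg/*.tcpdump",
]


def list_candidates(patterns):
    """Return (size_bytes, path) tuples for matching regular files, largest first."""
    found = []
    for pattern in patterns:
        for path in glob.glob(pattern):
            if os.path.isfile(path):
                found.append((os.path.getsize(path), path))
    return sorted(found, reverse=True)


if __name__ == "__main__":
    candidates = list_candidates(PATTERNS)
    for size, path in candidates:
        print("%12d  %s" % (size, path))
    print("total bytes: %d" % sum(size for size, _ in candidates))
```

Review the list by hand and confirm that no open support case still needs a file before removing it.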



Cheers.

/ls



Luc Simard

Mar 5, 2024, 2:04:45 PM
to isilon-u...@googlegroups.com


You are running at maximum capacity.

You will have to move data off the cluster first to allow the drive to fail normally. Sorry.

A full cluster brings out a number of behaviors. I was hoping you had something other than the basic VHS configuration; it was worth a try.

I highly recommend you configure the VHS reserve on a percentage basis, such as 8%-10%, to avoid this type of scenario.
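
To make the percentage recommendation concrete, here is a small arithmetic sketch. It is a simplified model, not how OneFS itself accounts for VHS space: it treats the reserve as a flat percentage of total capacity, and the 30 TB total and the function names are hypothetical values chosen only for illustration.

```python
def vhs_reserve_bytes(total_bytes, percent):
    """Bytes held back when `percent` (an integer) of total capacity is reserved."""
    return total_bytes * percent // 100


def writable_bytes_left(total_bytes, used_bytes, percent):
    """Space still open to user writes once the reserve is honored (never negative)."""
    return max(0, total_bytes - vhs_reserve_bytes(total_bytes, percent) - used_bytes)


if __name__ == "__main__":
    TB = 10 ** 12
    total = 30 * TB  # hypothetical cluster capacity, not taken from this thread
    for pct in (8, 10):
        reserve = vhs_reserve_bytes(total, pct)
        left = writable_bytes_left(total, 27 * TB, pct)
        print("reserve %d%%: %d bytes held back, %d bytes writable at 27 TB used"
              % (pct, reserve, left))
```

In this model user writes stop while the reserved share is still free, which is the headroom a drive rebuild needs.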


Cheers.

/ls


