Infiniband switch model information


Jeff Cleverley

Mar 6, 2014, 4:22:55 AM
to isilon-u...@googlegroups.com
Greetings,

Is there a command that will show information about the InfiniBand switches in the cluster?  I'm looking for model numbers, firmware versions, etc.

Thanks,

Jeff

--
Jeff Cleverley
Unix Systems Administrator
4380 Ziegler Road
Fort Collins, Colorado 80525
970-288-4611

Matt Dey

Mar 6, 2014, 6:11:05 AM
to isilon-u...@googlegroups.com
In my opinion, the tools for querying the IB side of things are very weak. It's all based on opensm. I would start by checking whether the info you want is in the log file /var/log/opensm-1.log
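
A minimal sketch of that first step, assuming only that the log is plain text. The keyword list and the function names are hypothetical, chosen for illustration; the opensm log format is not documented in this thread, and none of these keywords are guaranteed to appear in it.

```python
import re
from pathlib import Path

# Hypothetical keywords (an assumption, not taken from any opensm documentation).
KEYWORDS = ("switch", "vendor", "firmware", "qlogic", "mellanox")


def find_switch_lines(text, keywords=KEYWORDS):
    """Return (line_number, line) pairs for lines that mention any keyword."""
    pattern = re.compile("|".join(re.escape(k) for k in keywords), re.IGNORECASE)
    return [
        (number, line.rstrip())
        for number, line in enumerate(text.splitlines(), start=1)
        if pattern.search(line)
    ]


def scan_log(path="/var/log/opensm-1.log"):
    """Scan the log file named in the post; return [] if it does not exist."""
    log = Path(path)
    if not log.is_file():
        return []
    # errors="replace" keeps the scan going if the log contains odd bytes.
    return find_switch_lines(log.read_text(errors="replace"))
```

Calling `scan_log()` on a node and printing the returned pairs gives a quick list of candidate lines to read by hand. It only filters text and cannot surface details that opensm never logged.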

Jason Davis

Mar 6, 2014, 10:33:34 AM
to isilon-u...@googlegroups.com
Yeah, your best bet is to make a trip and physically get access to the IB switches... unless you configured management on these :)



--
You received this message because you are subscribed to the Google Groups "Isilon Technical User Group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to isilon-user-gr...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Jeff Cleverley

Mar 6, 2014, 3:41:03 PM
to isilon-u...@googlegroups.com
Thanks for the replies.  The opensm-1.log and the opensm-1.topo files show the make of the switches, but not model or firmware. 

I don't think we have any access configured directly for the switches.  I assumed more detailed information was available in some of the logs being sent to Isilon periodically.  I didn't see it in any of the isi get collections.

Jeff

Jason Davis

Mar 6, 2014, 4:19:25 PM
to isilon-u...@googlegroups.com
In my experience, Isilon has always treated the IB switches as "black boxes". We have several clusters with the larger, managed QLogic/Intel 12800s, and we're starting to monitor these proactively.

Jeff Cleverley

Mar 6, 2014, 9:30:40 PM
to isilon-u...@googlegroups.com
Jason,

The notification about the QLogic switches is what got me thinking about these.  In case you didn't get it, here is the number and description:

EMC Technical Advisory

This notification is to alert you of the following technical advisory:

ETA 180317: Isilon OneFS: Intel (QLogic) InfiniBand switch models 12300 and 12800 may become unresponsive, which may cause communication between nodes to stop and data to become unavailable



Jeff

Jason Davis

Mar 7, 2014, 10:01:06 AM
to isilon-u...@googlegroups.com
Yeah, we're quite aware of this as a number of our clusters have encountered this in the wild... prompting this advisory :) Our workloads on our clusters are uh... unique.



Peter Serocka

Mar 7, 2014, 10:13:40 AM
to isilon-u...@googlegroups.com
Shocked to see it happens on both switches,
i.e. for redundant configurations…

Curious, what makes your workloads so unique?


— Peter


Jason Davis

Mar 7, 2014, 11:02:39 AM
to isilon-u...@googlegroups.com
Metadata-heavy work, deep directory structures with billions of files smaller than 8k, general-purpose HPC (no luxury of tuning the cluster for a known IO workload), and high storage pool utilization (88% or more is the norm).

Our engineering groups tend to push the Isilon clusters to their very limits. That is why we ran into the IB switch firmware bug on multiple clusters.



Peter Serocka

Mar 7, 2014, 11:25:00 AM
to isilon-u...@googlegroups.com
Thanks! And good luck further on.