isi_hw_status reports a failed power supply, but it is working


pierluigi...@gmail.com

Jul 24, 2020, 11:36:40 AM
to Isilon Technical User Group
Hi all,
I have an X210 on which a failed power supply has been replaced.
After the replacement I cleared every event, and in the WebUI everything is "green".
The only thing that seems strange is that isi_hw_status still reports the power supply as failed, although all the other indications are fine.
I can see the firmware version, voltage, current, and so on.
Only the "Power Supply 1 not providing power" indication is wrong.
I also ran isi healthcheck run ioca_checkHardwareStatus, and it also says that the power supply has failed, but it doesn't raise any event.

Any idea how to resolve this?

Thanks in advance.

Here is the output from isi_hw_status:

Isi_clust-1# isi_hw_status
  SerNo: XXXXXXXXXXX
 Config: 610-XXXXX-06
FamCode: X
ChsCode: 2U
GenCode: 10
Product: X210-2U-Single-48GB-2x1GE-2x10GE SFP+-44TB-800GB SSD
  HWGen: CTO (CTO Hardware)
Chassis: ISI12V3 (Isilon 12-Bay(V3) Chassis)
    CPU: GenuineIntel (2.39GHz, stepping 0x000306e4)
   PROC: Single-proc, Quad-core
    RAM: 51412152320 Bytes
   Mobo: IntelS1400FP (Intel S1400FP Motherboard)
  NVRam: LX4381 (Isilon LOx NVRam Card) (2016MB card) (size 2113798144B)
 DskCtl: LSI2308SAS2 (LSI 2308 SAS Controller) (8 ports)
 DskExp: LSISAS2X24 (LSI SAS2x24 SAS Expander)
PwrSupl: PS1 (type=ACBEL POLYTECH , fw=09.05)
PwrSupl: PS2 (type=ACBEL POLYTECH , fw=09.05)
ChasCnt: 1 (Single-Chassis System)
  NetIF: ib1,ib0,igb0,igb1,bxe0,bxe1
 IBType: MT4099 QDR (Mellanox MT4099 IB QDR Card)
 LCDver: IsiVFD1 (Isilon VFD V1)
    IMB: Board Version 0xffffffff
A Power Supply has FAILED
Power Supply 1 not providing power
Power Supply 2 good
CPU Operation (raw 0x88420000)  = Normal
FAN TAC SENSOR 1                = 12700.000
FAN TAC SENSOR 2                = 12900.000
FAN TAC SENSOR 3                = 12600.000
PS FAN SPEED 1                  = 5000.000
PS FAN SPEED 2                  = 5000.000
BB +12.0V                       = 11.986
BB +5.0V                        = 4.937
BB +3.3V                        = 3.225
BB +5.0V STBY                   = 4.872
BB +3.3V AUX                    = 3.225
BB -12.0V                       = -11.455
BB +1.2V VCCP1                  = 0.875
BB 1.5V P1 MEM                  = 1.498
BB +1.8V AUX                    = 1.781
BB +1.1V STBY                   = 1.090
BB +3.0V Vbat                   = 3.192
BB +1.35 P1LV AB                = na
V1.0                            = 1.000
V1.8                            = 1.800
V3.3                            = 3.300
V5.0                            = 5.000
V5.0_STBY                       = 5.000
V3.3_STBY                       = 3.320
V5.0_NVRAM                      = 4.860
VBATT_1                         = 3.980
VBATT_2                         = 3.980
V5.0_FP_X                       = 4.960
V12.0_BB_A                      = 12.160
V12.0_FAN1                      = 12.000
V12.0_FAN2                      = 12.040
V12.0_FAN3                      = 12.080
V12.0_MB0                       = 12.200
V12.0_MB1                       = 12.200
V12.0                           = 12.180
V3.3_CMD                        = 3.340
PS IN VOLT 1                    = 231.000
PS IN VOLT 2                    = 231.000
PS OUT VOLT 1                   = 12.300
PS OUT VOLT 2                   = 12.200
PS IN CURR 1                    = 0.600
PS IN CURR 2                    = 0.600
PS OUT CURR 1                   = 8.000
PS OUT CURR 2                   = 8.000
Temp Front Panel                = 18.0
P1 Therm Margin                 = -63.000
P1 DTS Thrm Mgn                 = -63.000
P1 DIMM Thrm Mgn                = -61.000
BB EDGE Temp                    = 23.000
BB P1 VR Temp                   = 30.000
BB BMC Temp                     = 31.000
BB MEM VR Temp                  = 26.000
SSB Temp                        = 43.000
LAN NIC Temp                    = 41.000
TEMP SENSOR 1                   = 21.000
PS TEMP 1                       = 44.000
PS TEMP 2                       = 43.000
LSI CORE TEMP                   = 34.000
TEMP SENSOR 2                   = 23.000
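
The mismatch described above (a "not providing power" status line next to healthy-looking sensor values for the same supply) can be checked on saved isi_hw_status output with a small script. This is only a rough sketch: the function name find_ps_contradictions and the 11.0 V threshold are assumptions made for illustration, not anything defined by OneFS, and the sample text is copied from the output above.

```python
import re

def find_ps_contradictions(hw_status_text, min_out_volts=11.0):
    """Return PSU numbers flagged as failed whose output-voltage sensor still looks healthy.

    min_out_volts is an assumed threshold for a nominal 12 V output, not a OneFS value.
    """
    # Status lines such as "Power Supply 1 not providing power"
    failed = {
        int(n)
        for n in re.findall(r"^Power Supply (\d+) not providing power",
                            hw_status_text, re.M)
    }
    # Sensor lines such as "PS OUT VOLT 1                   = 12.300"
    # (a reading of "na" does not match and is simply skipped)
    out_volts = {
        int(n): float(v)
        for n, v in re.findall(r"^PS OUT VOLT (\d+)\s*=\s*(-?\d+(?:\.\d+)?)",
                               hw_status_text, re.M)
    }
    return sorted(n for n in failed if out_volts.get(n, 0.0) >= min_out_volts)

# Lines taken from the isi_hw_status output in this post
sample = """\
A Power Supply has FAILED
Power Supply 1 not providing power
Power Supply 2 good
PS OUT VOLT 1                   = 12.300
PS OUT VOLT 2                   = 12.200
PS OUT CURR 1                   = 8.000
PS OUT CURR 2                   = 8.000
"""

print(find_ps_contradictions(sample))  # prints [1]
```

A non-empty result only says that the status line and the sensor reading disagree; it does not say which of the two is right.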


Pierluigi

Anurag Chandra

Jul 24, 2020, 11:48:47 AM
to isilon-u...@googlegroups.com
One easy way to check is to look at isi_hwmon.log in /var/log on this node to confirm whether this is a real problem or not.
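
If the log is large, a keyword filter can narrow it down to the lines that mention a power supply. This is a minimal sketch, not an official tool: the keyword list is a guess based on the wording isi_hw_status uses ("Power Supply", "PS1", "PS ..."), the real format of isi_hwmon.log may differ, and the names PS_PATTERN, power_supply_lines and scan_log are made up for this example.

```python
import re

# Assumed keywords; adjust them after looking at the actual log wording.
PS_PATTERN = re.compile(r"power supply|psu|\bps\d*\b", re.IGNORECASE)

def power_supply_lines(lines):
    """Return the lines that mention a power supply, without trailing newlines."""
    return [line.rstrip("\n") for line in lines if PS_PATTERN.search(line)]

def scan_log(path="/var/log/isi_hwmon.log"):
    """Read the log at `path` and return only its power-supply-related lines."""
    with open(path, errors="replace") as fh:
        return power_supply_lines(fh)
```

On the node itself, something like `for line in scan_log(): print(line)` would print the matching entries.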

Thanks 
Anurag 

--
You received this message because you are subscribed to the Google Groups "Isilon Technical User Group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to isilon-user-gr...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/isilon-user-group/13220b2f-0e47-47dc-bef4-95025ea72148n%40googlegroups.com.

mandar kolhe

Jul 24, 2020, 3:53:26 PM
to Isilon Technical User Group
Hello Pierluigi,

The X210 is a Gen5 node, so the sensors are handled by the BMC/CMC in these nodes. Can you verify whether the BMC/CMC is responsive on this node? Whether it is or not, I would recommend power cycling the node and reseating both PSUs and their cables. To verify whether there is a sensor issue, you can check /var/log/isi_hwmon.log.

Steps to power cycle:

Shut down the node
Remove the power cables
Wait 5 minutes
Reconnect the power cables and power on the node

Power cycling the node will make the BMC/CMC responsive again. Sometimes, if the BMC/CMC is unresponsive, you can see this kind of behavior.

Thanks,
Mandar 

pierluigi...@gmail.com

Jul 27, 2020, 3:44:36 AM
to Isilon Technical User Group
This procedure did the trick.
Thank you very much!
Pierluigi

mandar kolhe

Jul 27, 2020, 11:11:18 AM
to isilon-u...@googlegroups.com
No problem, you are welcome.
