icecard "local" temperature?


Paul H. Hargrove

Mar 3, 2011, 2:20:17 PM
to lnx...@googlegroups.com
I have LNXI nodes connected to an IceBox v4.
It appears that every 2 days the thermal shutdown is killing one node.
According to the docs, the IceBox reports 4 "remote" temperatures and
1 "local" temperature per port:

# threshget 8
Port 8 power-off temperature thresholds (in degrees C):
R1: 60
R2: 60
R3: 60
R4: 60
L1: 45

# temp 7,8,9
port 7 temperatures: 46, 42, 1, 1, 26
port 8 temperatures: 39, 37, 1, 1, 36
port 9 temperatures: 42, 44, 1, 1, 25

Having been shut down, "port 8" shows R1 and R2 (CPU) temps that are
lower than its neighbors'. However, its "L1" temperature (36 C at the
moment) is already about 10 degrees higher than on the other nodes, even
though the node has been idle.

The two "1"s are there, as I understand it, because only 2 CPUs are
being monitored.
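
For anyone who wants to track this over time, here is a minimal Python sketch that parses `temp` output like the above and reports how many degrees each sensor has left before its power-off threshold. The function names are hypothetical. The sensor order (R1..R4, then L1) is assumed from the `threshget` listing, the port 8 thresholds are assumed to apply to every port, and a reading of 1 is treated as an unused sensor.

```python
import re

# Thresholds copied from the `threshget 8` output; assuming (not confirmed)
# that ports 7 and 9 use the same values.
THRESHOLDS = {"R1": 60, "R2": 60, "R3": 60, "R4": 60, "L1": 45}
# Assumed column order of the `temp` output: four remote sensors, then local.
SENSORS = ["R1", "R2", "R3", "R4", "L1"]

LINE_RE = re.compile(r"^port\s+(\d+)\s+temperatures:\s*(.+)$")


def parse_temp_output(text):
    """Return {port: {sensor: degrees_C}} from pasted `temp` output."""
    readings = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip prompts and anything that is not a reading line
        values = [int(v) for v in match.group(2).split(",")]
        readings[int(match.group(1))] = dict(zip(SENSORS, values))
    return readings


def headroom(readings, unused=1):
    """Degrees C left before each sensor's power-off threshold.

    Readings equal to `unused` are skipped as unconnected sensors (assumption).
    """
    return {
        port: {s: THRESHOLDS[s] - t for s, t in temps.items() if t != unused}
        for port, temps in readings.items()
    }


sample = """\
port 7 temperatures: 46, 42, 1, 1, 26
port 8 temperatures: 39, 37, 1, 1, 36
port 9 temperatures: 42, 44, 1, 1, 25
"""

for port, margins in sorted(headroom(parse_temp_output(sample)).items()):
    print(port, margins)
```

With the readings above, port 8 has only 9 degrees of L1 headroom while idle, versus 19 and 20 on ports 7 and 9.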

I'd like to have some idea of what (if anything) can be done to repair
this node, and having some clue where the over-temp is being measured
seems like a good place to start. So, my question is: what is the "L1"
temperature, and where is it being measured?

-Paul

--
Paul H. Hargrove PHHar...@lbl.gov
Future Technologies Group
HPC Research Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900

Ed Wahl

Mar 3, 2011, 9:13:31 PM
to lnx...@googlegroups.com
Depends on the kind of nodes. Evo1s? Evolocity 2 boxes? Dells? Something else?

On later Evo2s there is a card inside the node, and I think a wire runs to it from the front (it's been a year or two). I recall it was called the 'ice card', and it had a couple of thermistors on it. If it's an Evolocity 2, it should be simple to pop the node open and find the card. It's not the motherboard, nor the power supply. Maybe hit it with a touch of freeze spray to test the range and see if it's faulty. More likely than not, some fans inside the case have died.

Can't speak to the later Dells or the earliest Evo1s.