
Jammed (dead?) M4000 without SLA


Kay-Uwe Loebel

Sep 11, 2019, 1:58:22 AM
Hi,

Our professorship's SPARC Enterprise M4000, heavily used for student labs,
suddenly went down (powered off) with the amber LED lit. :-(((

Here is the related output:


XSCF> poweron -d 0
DomainIDs to power on:00
Continue? [y|n] :y
00 :Not powering on :Poweron canceled due to missing component.

XSCF> showstatus
* MBU_A Status:Faulted;
* CPUM#0-CHIP#0 Status:Deconfigured;
* CPUM#0-CHIP#1 Status:Deconfigured;
...

XSCF> showboards -va
XSB R DID(LSB) Assignment Pwr Conn Conf Test Fault COD
---- - -------- ----------- ---- ---- ---- ------- -------- ----
00-0 * 00(00) Assigned n n n Unknown Faulted n

XSCF> showlogs error
Date: Aug 27 08:37:47 CEST 2019 Code: 60002108-7e010000-0508090810ff1017
Status: Warning Occurred: Aug 27 08:37:43.968 CEST 2019
FRU: /MBU_A
Msg: MBC internal fatal error
Date: Aug 27 08:38:07 CEST 2019 Code: 80006108-7b010000-0508071610ff1005
Status: Alarm Occurred: Aug 27 08:38:06.113 CEST 2019
FRU: /MBU_A
Msg: MBC internal serious error

XSCF> fmdump
TIME UUID MSG-ID
Aug 27 08:37:47.6128 5fa5d831-1d4b-4344-a6c9-5939d01886bb SCF-8003-HA
Aug 27 08:38:07.3002 8852b041-6bc0-4fe9-b972-d373dfd0f10c SCF-8003-LS

XSCF> version -c xcp -v
XSCF#0 (Active )
XCP0 (Current): 1081
OpenBoot PROM : 02.08.0000
XSCF : 01.08.0004
XCP1 (Reserve): 1081
OpenBoot PROM : 02.08.0000
XSCF : 01.08.0004
OpenBoot PROM BACKUP
#0: 02.08.0000
#1: --.--.----
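
For anyone who wants to keep such logs in a searchable form: a minimal Python sketch, assuming the `showlogs error` text has been saved from the console exactly as pasted above. The function name `parse_showlogs` and the record layout are illustrative only and not part of any XSCF tooling; a message that itself contained one of the field labels would be split incorrectly.

```python
import re

# Sample copied from the showlogs output in this post.
SAMPLE = """\
Date: Aug 27 08:37:47 CEST 2019 Code: 60002108-7e010000-0508090810ff1017
Status: Warning Occurred: Aug 27 08:37:43.968 CEST 2019
FRU: /MBU_A
Msg: MBC internal fatal error
Date: Aug 27 08:38:07 CEST 2019 Code: 80006108-7b010000-0508071610ff1005
Status: Alarm Occurred: Aug 27 08:38:06.113 CEST 2019
FRU: /MBU_A
Msg: MBC internal serious error
"""

# Field labels as they appear in the pasted output (assumed to be complete).
FIELD = re.compile(r"(Date|Code|Status|Occurred|FRU|Msg):\s*")


def parse_showlogs(text):
    """Return one dict per log entry; a "Date" label starts a new entry."""
    records = []
    for line in text.splitlines():
        # With one capture group, split yields ['', label, value, label, value, ...]
        parts = FIELD.split(line.strip())
        for label, value in zip(parts[1::2], parts[2::2]):
            if label == "Date":
                records.append({})
            if records:
                records[-1][label] = value.strip()
    return records


for entry in parse_showlogs(SAMPLE):
    print(entry["Status"], entry["FRU"], entry["Msg"])
```

Both entries name the same FRU (/MBU_A), which is easier to see once the fields are separated.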


It seems that the motherboard is broken, but I've read in 3 or 4
sources that these MBC faults are sometimes spurious, i.e.
software-related, so I tried hard to reset the error status.

Unfortunately, re-testing the PSB fails:

XSCF> testsb 0
Initial diagnosis is about to start, Continue?[y|n] :y
The current configuration does not support this operation.


Clearing the error condition manually (clearstatus /MBU_A) fails due to
the lack of a service password:

XSCF> enableservice
Service Password:
...
XSCF> service
Account is not enabled for service mode.


Even a factory reset didn't solve the problem:

XSCF> dumpconfig -v file:///media/usb_msd/config.txt
...
XSCF> restoredefaults -c factory
...
XSCF> restoreconfig -v file:///media/usb_msd/config.txt

The fltlog is now empty (fmdump produces no output), but the amber LED lights
up again when, on the console,

start /scf/sbin/fmd (pid=719)

is executed.


You see the huge problem: What can I do to get the M4000 back in service?

Many thanks in advance for every hint.

Kind regards
Kay-Uwe Loebel

YTC#1

Sep 11, 2019, 3:41:53 AM
On 11/09/2019 06:58, Kay-Uwe Loebel wrote:
> Hi,
>
> our professorship's Sparc Enterprise M4000 heavily used for student labs
> suddenly went down (power off) with lighting amber LED. :-(((
>
> Here the related outputs:
<snip>

>
>
> You see the huge problem: What can I do to get the M4000 back in service?
>

In all honesty, the quickest (and cheapest in terms of time) method would be
to get on eBay and buy another one, then swap the disks etc. (They are quite
cheap.)

Then think about your DR policy :-)

If you have another M series handy, get the disks into that and take flash
archives of the OS and a backup of the data.

Then, assuming other kit is available, re-install.


> Many thanks in advance for every hint.
>
> Kind regards
> Kay-Uwe Loebel



--
Bruce Porter
"The internet is a huge and diverse community but mainly friendly"
http://ytc1.blogspot.co.uk/
There *is* an alternative! http://www.openoffice.org/

Kay-Uwe Loebel

Sep 11, 2019, 6:01:48 AM
Am 11.09.2019 um 09:42 schrieb YTC#1:

> In all honesty, the quickest (and time cheapest) method would be to get
> on Ebay and buy another one, then swap the disks etc. (They are quite
> cheap).

A pragmatic solution of course, but I fear that the university will not
approve the purchase of a second obsolete piece of hardware.
Moreover, the licence server for our CAD software is bound to the current
hostid until the end of the year.

Perhaps it's possible to install a more recent firmware (we have 1081)
that allows the status reset by a simple power off/on with the key in
the service position?
Do you (or does anyone) know why a further test of the PSB fails?

XSCF> testsb 0
Initial diagnosis is about to start, Continue?[y|n] :y
The current configuration does not support this operation.

> Then thing about your DR policy :-)

;-)
I do, and some functions are already ported to a Linux server.
Nevertheless, the M4000 has 2 PSUs and 2 system disks, which cover the main
causes of disasters ...

> If you have another M series handy, get the disks in that and take flash
> archives of the OS and backup of the data.

We have only one Mxxx at the university.

Kind regards
Kay-Uwe Loebel

YTC#1

Sep 11, 2019, 6:26:04 AM
On 11/09/2019 11:01, Kay-Uwe Loebel wrote:
> Am 11.09.2019 um 09:42 schrieb YTC#1:
>
>> In all honesty, the quickest (and time cheapest) method would be to get
>> on Ebay and buy another one, then swap the disks etc. (They are quite
>> cheap).
>
> A pragmatic solution of course, but I fear, that the university will not
> advocate the purchase of a second obsolete piece of hardware.
> Moreover the licence server for our CAD software is bound till the end
> of the year to the current hostid.

The hostid is probably an easy solution. Many ways to skin a cat.
>
> Perhaps it's possible to install a more recent firmware (we have 1081),
> which allows the status reset by a simple power on/off with the key in
> the service position?

And that is an old FW at that.

> Do you /someone know, why a further test of the PSB fails?
>
> XSCF> testsb 0
> Initial diagnosis is about to start, Continue?[y|n] :y
> The current configuration does not support this operation.

This thread may help, as you have been resetting the system
<url:https://support.oracle.com/epmos/faces/SearchDocDisplay?_adf.ctrl-state=12587jyhco_4&_afrLoop=366634817834842>


>
>> Then thing about your DR policy :-)
>
> ;-)
> I do, and some functions are already ported to a linux server.

Booo, hisss !!!!

> Nevertheless the M4000 has 2 PSUs and 2 system disks, the main causes of
> desasters ...
>
>> If you have another M series handy, get the disks in that and take flash
>> archives of the OS and backup of the data.
>
> We have the only one Mxxx in the university.
>

What about any other Sparc boxes ? T series ? V series ?

Kay-Uwe Loebel

Sep 11, 2019, 7:11:20 AM
Am 11.09.2019 um 12:26 schrieb YTC#1:

> The hostid is probably an easy solution. Many ways to skin a cat.

I know the LD_PRELOAD disguise. ;-)

>> Perhaps it's possible to install a more recent firmware (we have 1081),
>> which allows the status reset by a simple power on/off with the key in
>> the service position?

> And that is an old FW at that.

Okay, so the 1050 should enable this feature?
But where can I get it from, and is it possible to install firmware in the
faulted state?

>> Do you /someone know, why a further test of the PSB fails?
>>
>> XSCF> testsb 0
>> Initial diagnosis is about to start, Continue?[y|n] :y
>> The current configuration does not support this operation.
>
> This thread may help, as you have been reseting the system
> <url:https://support.oracle.com/epmos/faces/SearchDocDisplay?_adf.ctrl-state=12587jyhco_4&_afrLoop=366634817834842>

Unfortunately I can't access the thread without a Support Identifier ...

>> I do, and some functions are already ported to a linux server.

> Booo, hisss !!!!

Don't blame me; for at least 5 years I've been the last SPARC / Solaris user
here ... ;-)

>> We have the only one Mxxx in the university.

> What about any other Sparc boxes ? T series ? V series ?

UltraSPARC 45 and SunBlade 2000 are still working excellently.

Kind regards
Kay-Uwe Loebel

YTC#1

Sep 11, 2019, 11:16:58 AM
On 11/09/2019 12:11, Kay-Uwe Loebel wrote:
> Am 11.09.2019 um 12:26 schrieb YTC#1:
>
>> The hostid is probably an easy solution. Many ways to skin a cat.
>
> I know the LD_PRELOAD disguise. ;-)
>
>>> Perhaps it's possible to install a more recent firmware (we have 1081),
>>> which allows the status reset by a simple power on/off with the key in
>>> the service position?
>
>> And that is an old FW at that.
>
> Okay, the 1050 should enable this feature?
> But where get from and is it possible to install firmware in the faulted
> state?

I take it you don't have any Oracle support?

>
>>> Do you /someone know, why a further test of the PSB fails?
>>>
>>> XSCF> testsb 0
>>> Initial diagnosis is about to start, Continue?[y|n] :y
>>> The current configuration does not support this operation.
>>
>> This thread may help, as you have been reseting the system
>> <url:https://support.oracle.com/epmos/faces/SearchDocDisplay?_adf.ctrl-state=12587jyhco_4&_afrLoop=366634817834842>
>>
>
> Unfortunately I can't access the thread without Support Identifier ...

Oh, looks like I attached the wrong link anyway :-(
<url:https://community.oracle.com/community/support/oracle_sun_technologies/sparc_m-series_servers>

It is suggesting no CPU is attached to the domain

>
>>> I do, and some functions are already ported to a linux server.
>
>> Booo, hisss !!!!
>
> Don't blame me, since at least 5 years I'm the last SPARC / Solaris user
> here ... ;-)

Sounds like you really are the last now :-(

>
>>> We have the only one Mxxx in the university.
>
>> What about any other Sparc boxes ? T series ? V series ?
>
> UltraSPARC 45 and SunBlade 2000 are still working excellently.
>

I don't think that will roll.

Keith Thompson

Sep 11, 2019, 5:24:08 PM
Kay-Uwe Loebel <loe...@etit.tu-chemnitz.de> writes:
> Am 11.09.2019 um 09:42 schrieb YTC#1:
>> In all honesty, the quickest (and time cheapest) method would be to get
>> on Ebay and buy another one, then swap the disks etc. (They are quite
>> cheap).
>
> A pragmatic solution of course, but I fear, that the university will not
> advocate the purchase of a second obsolete piece of hardware.
> Moreover the licence server for our CAD software is bound till the end
> of the year to the current hostid.
[...]

Can you talk to your vendor about changing the hostid for the license
server? I would hope they have provisions for dying hardware.

--
Keith Thompson (The_Other_Keith) ks...@mib.org <http://www.ghoti.net/~kst>
Will write code for food.
void Void(void) { Void(); } /* The recursive call of the void */

YTC#1

Sep 12, 2019, 3:20:18 AM
On 11/09/2019 22:24, Keith Thompson wrote:
> Kay-Uwe Loebel <loe...@etit.tu-chemnitz.de> writes:
>> Am 11.09.2019 um 09:42 schrieb YTC#1:
>>> In all honesty, the quickest (and time cheapest) method would be to get
>>> on Ebay and buy another one, then swap the disks etc. (They are quite
>>> cheap).
>>
>> A pragmatic solution of course, but I fear, that the university will not
>> advocate the purchase of a second obsolete piece of hardware.
>> Moreover the licence server for our CAD software is bound till the end
>> of the year to the current hostid.
> [...]
>
> Can you talk to your vendor about changing the hostid for the license
> server? I would hope they have provisions for dying hardware.
>

Bet they have not got support for that either :-)

Kay-Uwe Loebel

Sep 12, 2019, 8:57:22 AM
Am 11.09.2019 um 17:17 schrieb YTC#1:

>> Okay, the 1050 should enable this feature?
>> But where get from and is it possible to install firmware in the faulted
>> state?

> I take it you don't any Oracle support ?

Right, it wouldn't be an easy thing of course.

> Oh, looks like I attached the wrong link anyway :-(
> <url:https://community.oracle.com/community/support/oracle_sun_technologies/sparc_m-series_servers>
>
> It is suggesting no CPU is attached to the domain

Thanks for the hint - I guess you meant the motherboard (XSB) is not
assigned to the domain.
It is, but I de- and re-assigned it just to be sure:

XSCF> showboards -v -d 0
XSB R DID(LSB) Assignment Pwr Conn Conf Test Fault COD
---- - -------- ----------- ---- ---- ---- ------- -------- ----
00-0 * 00(00) Assigned n n n Unknown Faulted n

XSCF> testsb 0
Initial diagnosis is about to start, Continue?[y|n] :y
The current configuration does not support this operation.

XSCF> deleteboard -c unassign 00-0
XSB#00-0 will be unassigned from domain immediately. Continue?[y|n] :y

XSCF> showboards -v -d 0
XSB R DID(LSB) Assignment Pwr Conn Conf Test Fault COD
---- - -------- ----------- ---- ---- ---- ------- -------- ----
00-0 SP Unavailable n n n Unknown Faulted n

XSCF> testsb 0
Initial diagnosis is about to start, Continue?[y|n] :y
The current configuration does not support this operation.

XSCF> addboard -c assign -d 0 00-0
XSB#00-0 will be assigned to DomainID 0. Continue?[y|n] :y

It seems that one condition is still missing to re-test the board.
Perhaps the XSB must be _connected_ to the domain, but there's no
special option / command to perform this. :-(
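
For comparing board states before and after such an unassign/assign cycle, here is a small Python sketch. It assumes the `showboards` rows are available as plain text in the flattened form pasted above, and it parses from the right because the columns between XSB and Assignment (the R flag, DID(LSB), or "SP") vary in number. The names `parse_showboards_row` and `changed_fields` are hypothetical helpers, not XSCF features.

```python
# Column names taken from the showboards header above; the last six columns
# are always present in the rows shown, so they are read from the right.
TRAILING = ["Pwr", "Conn", "Conf", "Test", "Fault", "COD"]


def parse_showboards_row(row):
    """Split one showboards data row into a dict keyed by column name."""
    tokens = row.split()
    if len(tokens) < len(TRAILING) + 2:
        raise ValueError("row too short: %r" % row)
    record = {"XSB": tokens[0]}
    record.update(zip(TRAILING, tokens[-len(TRAILING):]))
    record["Assignment"] = tokens[-len(TRAILING) - 1]
    # Whatever sits between XSB and Assignment is kept as raw tokens.
    record["middle"] = tokens[1:-len(TRAILING) - 1]
    return record


def changed_fields(before, after):
    """Return {column: (old, new)} for every column whose value differs."""
    return {k: (before[k], after[k]) for k in before if before[k] != after[k]}


assigned = parse_showboards_row("00-0 * 00(00) Assigned n n n Unknown Faulted n")
unassigned = parse_showboards_row("00-0 SP Unavailable n n n Unknown Faulted n")
print(changed_fields(assigned, unassigned))
```

On the two rows from this post, only the assignment-related columns differ; Test stays "Unknown" and Fault stays "Faulted" in both states.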


Alternatively, I'm looking for a way to perform a "clearstatus /MBU_A" ...
or to prevent the XSCF from starting the /scf/sbin/fmd daemon ...

There must be a way for _my_ purchased machine. ;-)

Kind regards
Kay-Uwe Loebel

YTC#1

Sep 12, 2019, 3:36:45 PM
It has been a while since I touched an Mx000, never mind troubleshot
one :-(

Essentially you appear to have a major issue. It may be that the reset
has cleared it; however, it may not.

Ok, going back to the fmdumps

SCF-8003-HA
This fault can occur on the MBC chip that resides on the Motherboard.

The fault may affect the entire MBC chip or it may affect just a
single XSB; this can be determined by looking at the FMRI of the fault.

The recommended service action for this event is to schedule the
replacement of the affected FRU.

What does

fmdump -v -u 5fa5d831-1d4b-4344-a6c9-5939d01886bb

Show ?

I still say your best bet is to buy one off ebay.

>
> There must be a way for _my_ purchased machine. ;-)

Yeah, but it is old and broken.

I bought a MacBook in 2013. It is still in use, but I expect it to break
at some point.

Kay-Uwe Loebel

Sep 13, 2019, 1:51:46 AM
Am 12.09.2019 um 21:36 schrieb YTC#1:

> It has been a while since I touched an Mx000, never mind trouble shot
> one :-(

All the more, I appreciate your help!

> What does
>
> fmdump -v -u 5fa5d831-1d4b-4344-a6c9-5939d01886bb
>
> Show ?

Now: nothing, because I carried out a factory reset. ;-)

TIME UUID MSG-ID
fmdump: /var/opt/sun/fm/fmd/fltlog is empty


Of course I saved the outputs before:

XSCF> fmdump -v -u 5fa5d831-1d4b-4344-a6c9-5939d01886bb
TIME UUID MSG-ID
Aug 27 08:37:47.6128 5fa5d831-1d4b-4344-a6c9-5939d01886bb SCF-8003-HA
100% fault.chassis.SPARC-Enterprise.asic.mbc.fe

Problem in: hc:///chassis=0/cmu=0/mbc=0
Affects: hc:///chassis=0/cmu=0/xsb=0
FRU: hc://:product-id=SPARC Enterprise
M4000:chassis-id=BC********:server-id=******:serial=BC********:part=CF00541-0893
06 \541-0893-06:revision=0101/component=/MBU_A
Location: /MBU_A

XSCF> fmdump -v -u 8852b041-6bc0-4fe9-b972-d373dfd0f10c
TIME UUID MSG-ID
Aug 27 08:38:07.3002 8852b041-6bc0-4fe9-b972-d373dfd0f10c SCF-8003-LS
100% fault.chassis.SPARC-Enterprise.asic.mbc.se

Problem in: hc:///chassis=0/cmu=0/mbc=0
Affects: hc:///chassis=0/cmu=0
FRU: hc://:product-id=SPARC Enterprise
M4000:chassis-id=BCF092404K:server-id=******:serial=BC********:part=CF00541-0893
06 \541-0893-06:revision=0101/component=/MBU_A
Location: /MBU_A
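
To compare the "Affects:" lines of the two events, a minimal Python sketch, assuming the FMRI is a plain `hc:///name=instance/...` path as in the output above. The helper names are made up for illustration, and the sketch only splits the string; it makes no claim about how XSCF itself interprets the path.

```python
def parse_hc_fmri(fmri):
    """Split an hc:/// FMRI path into ordered (name, instance) pairs."""
    prefix = "hc:///"
    if not fmri.startswith(prefix):
        raise ValueError("not a plain hc:/// path: %r" % fmri)
    pairs = []
    for part in fmri[len(prefix):].split("/"):
        name, _, instance = part.partition("=")
        pairs.append((name, instance))
    return pairs


def affected_scope(fmri):
    """Return the last path component, i.e. the most specific resource named."""
    return parse_hc_fmri(fmri)[-1]


# "Affects:" values from the two events above.
print(affected_scope("hc:///chassis=0/cmu=0/xsb=0"))  # ('xsb', '0')
print(affected_scope("hc:///chassis=0/cmu=0"))        # ('cmu', '0')
```

The SCF-8003-HA event names a single XSB, while the SCF-8003-LS event stops at the CMU level.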

XSCF> fmdump -m -M
MSG-ID: SCF-8003-LS, TYPE: Fault, VER: 1, SEVERITY: Critical
EVENT-TIME: Tue Aug 27 08:38:07 CEST 2019
PLATFORM: SPARC Enterprise M4000, CSN: BC********, HOSTNAME: ******
SOURCE: sde, REV: 1.16
EVENT-ID: 8852b041-6bc0-4fe9-b972-d373dfd0f10c
DESC: A non-fatal uncorrectable error was detected within a MBC chip.
Refer to http://www.sun.com/msg/SCF-8003-LS for more information.
AUTO-RESPONSE: No immediate action is taken by XSCF software due to this
fault.
Resources associated with the faulty FRU will be deconfigured after the
platform is power cycled or after the domain reboots or after a Dynamic
Reconfiguration operation is performed. This resource deconfiguration
may cause the platform to become unbootable. Please consult the detail
section of the knowledge article for additional information.
IMPACT: The non-fatal uncorrectable error trap may cause the domain to
panic.
REC-ACTION: Schedule a repair action to replace the affected Field
Replaceable Unit (FRU), the identity of which can be determined using
fmdump -v -u EVENT_ID.
Please consult the detail section of the knowledge article for
additional information.
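
The REC-ACTION step (look up each event with fmdump -v -u EVENT_ID) can be prepared from a saved fmdump summary. This is a rough Python sketch, assuming the summary has the three-column TIME / UUID / MSG-ID layout posted at the start of the thread; `followup_commands` is a hypothetical helper that only builds the command strings and does not talk to the XSCF.

```python
import re

# fmdump summary as posted earlier in the thread.
SUMMARY = """\
TIME UUID MSG-ID
Aug 27 08:37:47.6128 5fa5d831-1d4b-4344-a6c9-5939d01886bb SCF-8003-HA
Aug 27 08:38:07.3002 8852b041-6bc0-4fe9-b972-d373dfd0f10c SCF-8003-LS
"""

UUID = re.compile(
    r"\b[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}\b"
)


def followup_commands(fmdump_text):
    """Return (msg_id, command) pairs, one per summary line that has a UUID."""
    commands = []
    for line in fmdump_text.splitlines():
        match = UUID.search(line)
        if match:
            msg_id = line.split()[-1]  # MSG-ID is the last column
            commands.append((msg_id, "fmdump -v -u " + match.group(0)))
    return commands


for msg_id, command in followup_commands(SUMMARY):
    print(msg_id, "->", command)
```

Saving the detailed output of these commands before any reset matters here, since the fltlog was empty after the factory reset.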


One reason I keep pursuing a software-related cause is the "Current Issues
Page" in the "Sun SPARC(R) Enterprise M3000/M4000/M5000/M8000/M9000
(OPL) Servers" from 2012:

M5000 - MBC failures SCF-8003-LS and/or SCF-8003-HA
Specifically looking for cases that have a fatal error immediately
before the serious error. ->
Still under investigation by engineering. Current action is to replace
the faulted MBU.
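
Whether a saved log matches that pattern (a fatal error directly followed by a serious error on the same FRU) can be checked mechanically. The following is only a sketch in Python: the 60-second window is an arbitrary assumption, since the issue text does not define "immediately", and the function names are invented for illustration. The timestamps are the "Occurred" values from the showlogs output at the start of the thread.

```python
from datetime import datetime


def parse_occurred(stamp):
    """Parse e.g. 'Aug 27 08:37:43.968 CEST 2019', ignoring the zone token."""
    month, day, clock, _zone, year = stamp.split()
    return datetime.strptime(
        f"{month} {day} {clock} {year}", "%b %d %H:%M:%S.%f %Y"
    )


def fatal_before_serious(events, window_seconds=60):
    """events: list of (occurred, fru, msg) in log order.

    Returns (fru, gap_in_seconds) for each adjacent fatal -> serious pair
    on the same FRU within the (assumed) time window.
    """
    hits = []
    for (t1, fru1, msg1), (t2, fru2, msg2) in zip(events, events[1:]):
        if fru1 != fru2:
            continue
        if "fatal error" in msg1 and "serious error" in msg2:
            gap = (parse_occurred(t2) - parse_occurred(t1)).total_seconds()
            if 0 <= gap <= window_seconds:
                hits.append((fru1, gap))
    return hits


EVENTS = [
    ("Aug 27 08:37:43.968 CEST 2019", "/MBU_A", "MBC internal fatal error"),
    ("Aug 27 08:38:06.113 CEST 2019", "/MBU_A", "MBC internal serious error"),
]
print(fatal_before_serious(EVENTS))
```

For the two entries from this machine the gap is roughly 22 seconds, so the log at least has the shape described in the issue; that alone does not show the fault is spurious.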


Perhaps the engineers have found a solution in the meantime ...
The above-mentioned web page http://www.sun.com/msg/SCF-8003-LS has
unfortunately been moved behind the paywall. :-(


>> There must be a way for _my_ purchased machine. ;-)
>
> Yeah, but it is old and broken.

I meant that, for a machine without any service contract, I should have access
to all functions / features IMHO (including executing a "clearstatus /MBU_A").

Kind regards
Kay-Uwe Loebel

YTC#1

Sep 13, 2019, 4:28:43 AM
As you can see, it is a H/W fault. The part needs replacing.
Sometimes you have to give up and accept defeat :-(

>
> Perhaps the guys found a solution meanwhile ...

It is broken.

> The above mentioned web page http://www.sun.com/msg/SCF-8003-LS has been
> unfortunately moved behind the pay wall. :-(
>

It says it is broken, contact your support provider and organise a
replacement.

>
>>> There must be a way for _my_ purchased machine. ;-)
>>
>> Yeah, but it is old and broken.
>
> I meant, for a machine without any service contract I should have access
> to all functions / features IMHO (also to execute a "clearstatus /MBU_A").

Why? Even Sun would not have agreed to that. Fujitsu must have had good
reason to hold end users back from that command level, probably because
it can completely fubar the server. To get to escalation mode you need
to get a password ... via a support call.

When you buy a car, does the manufacturer supply you with free fixes for
life? There is a warranty period, and when that is over you pay for an
extended one, or take your chances and pay per breakage.

Chris

Sep 13, 2019, 8:19:29 AM
Don't know if this would help, but I had a similar problem on an M3000 in
April this year. That needed a password to clear the fault, but I searched
around for a bit and found a procedure to reset the service processor by
moving a jumper. Here are the notes I made at the time:

* Remove the service processor and fit a jumper to J505; connect an external
terminal to the serial management port, 9600,n,8,1
* Plug in the power cord
* To interrupt the SP boot process, type xyzzy when you see the line
"Booting linux in n seconds". May take a couple of attempts.
* At the preboot prompt (Preboot >), type: reset all
* The preboot menu exits; the SP restarts, erases flash, sets defaults and
reboots
* At the login prompt: login root, password changeme

Might be worth a try...

Chris





Chris

Sep 13, 2019, 8:37:36 AM
The fault here was reported as: SCF XSCF watchdog timeout

To get into service mode you must have mode privs, which need a password,
so the enableservice command asks for one. But service mode
can also be entered via the keyswitch.

I don't remember more details, but I assume that once defaults have been set
via jumper J505 and the system is rebooted with the keyswitch in diags mode,
you can access service mode without a password.

In my case it was a spurious fault message, which disappeared after the above...

Chris

Kay-Uwe Loebel

Sep 16, 2019, 2:34:25 AM
Am 13.09.2019 um 14:37 schrieb Chris:

>> * remove service processor and fit jumper to J505, external terminal to
>> serial management port, 9600,n,8,1
>> * Plug in power cord
>> * To interrupt the sp boot process, type in xyzzy when you see the line:
>> Booting linux in n seconds. May take a couple of attempts.
>> * At the preboot prompt, Preboot > type in reset all
>> * The preboot menu exits, SP restarts, erases flash, sets defaults and
>> reboots SP
>> * At the login prompt, root, pwd, changeme
>>
>> Might be worth a try...
>>
>> Chris
>>
>>
>
> The fault here was reported as: SCF XSCF watchdog timeout
>
> To get into service mode, must have mode privs, which needs a password.
> so the command, enableservice, asks for one. But, service mode
> can be entered via the keyswitch.
>
> Don't remember more deatils, but assume defaults have been set via the
> jumper J505, when rebooted with keyswitch in diags mode, you can access
> service mode withoiut a password.
>
> In my case, a spurious faultt message, which disappeared after the above...
>
> Chris

Sounds like a very hot tip!
I pulled the XSCF Unit (FFSCFB) but found only a very small 4x jumper
block labelled CN21 (nothing fitted).
One could say 4 jumpers - 4 chances (or 3 to brick the unit). ;-)

Kay

Kay-Uwe Loebel

Sep 16, 2019, 2:57:36 AM
Am 12.09.2019 um 09:20 schrieb YTC#1:

> On 11/09/2019 22:24, Keith Thompson wrote:

>> Can you talk to your vendor about changing the hostid for the license
>> server? I would hope they have provisions for dying hardware.

> Bet they have not got support for that either :-)

We have, but compared to Oracle's hardware support for the M4000 it's
quite cheap. ;-)

Kay

YTC#1

Sep 16, 2019, 4:49:08 AM
It is a brick already.... what could you do to make it worse ? :-)

But bear in mind, his issue was a spurious one. Yours definitely indicates a
H/W issue.

*If* you do manage to clear it (which I doubt is possible without the
escalated password) you may end up corrupting something. After all, the
system has done its job: spotted an issue and stopped things going from
bad to worse.

YTC#1

Sep 16, 2019, 4:50:31 AM
Is there a cost to your department to the service not being available to
users ? (as in they only pay when it is available).

As time drags on, buying a 2nd unit off eBay will look like better and
better value.

Kay-Uwe Loebel

Sep 17, 2019, 3:10:05 AM
Am 16.09.2019 um 10:50 schrieb YTC#1:

> Is there a cost to your department to the service not being available to
> users ? (as in they only pay when it is available).

Of course _not_ at the university. ;-)
I moved the few current users to the Linux server, but soon the term will
start and many students want / must use several programs on the M4000 ...

Meanwhile I found a company that might swap the mainboard.
This raises the question of how the Service Processor notices that
the MBU has been repaired (changed). It's not the replacefru command.
I removed and re-inserted the current board, but the fault status was not
cleared (perhaps due to the unchanged serial number).
Kay

Chris

Sep 17, 2019, 8:12:19 PM
I would do a bit more digging, as some of the jumpers may erase the
flash completely, for example. Or they could be for remote JTAG debug.
I found the info for mine on one of the Oracle forums, but plug any
error message strings directly into Google; that should turn up something
relevant. M4000s are really cheap now; trade time against money on eBay
and something will most likely turn up that you can maybe fund yourself...

Chris


Kay-Uwe Loebel

Sep 19, 2019, 4:13:22 AM
The machine is back in service!

The solution was finally to install a firmware update (amazingly,
possible despite the faulted MBU), execute the clearfault command and
perform a power cycle.
Many thanks for all the help in this seemingly abandoned group!

Kay

YTC#1

Sep 19, 2019, 6:49:21 AM
Good that it is up and working.
But bear in mind, you had a fault. You have cleared that fault. But the
fault may re-occur.

Now would be a good time to hunt out another Sparc server and get a
backup/migration to it. If a nice T series is available, run up some
LDoms :-)

Udo Tödter

Sep 23, 2019, 10:57:26 AM
On 19.09.19 12:49, YTC#1 wrote:
> On 19/09/2019 09:13, Kay-Uwe Loebel wrote:
>> The machine is back in service!
>>
>> The solution was finally to install a firmware update (amazingly
>> possible despite faultet MBU), execute the clearfault command and
>> perform a power cycling.
>> Many thanks for all the help in this seemingly abandoned group!
>>
>
> Good that it is up and working.
> But bear in mind, you had a fault. You have cleared that fault. But the
> fault may re-occur.
>
> Now would be a good time to hunt out another Sparc server and get a
> backup/migration to it. If a nice T series is available, run up some
> LDoms :-)
>
>

Well here is another M4000 still in service, it runs and runs and runs.
And the old gem is still certified to run Solaris 11.

We are currently migrating our application (SAM/FS with 600TB active
data) to a LDOM running on a S7.

Udo

--
+----------------------------------------------------------------------+
|Udo Toedter |FSU Jena |Email: |Phone +493641940532|
|Bereich ZSB |Rechenzentrum|Udo.T...@uni-jena.de|FAX +493641940632|
+----------------------------------------------------------------------+

YTC#1

Sep 23, 2019, 1:54:31 PM
On 23/09/2019 15:57, Udo Tödter wrote:
> On 19.09.19 12:49, YTC#1 wrote:
>> On 19/09/2019 09:13, Kay-Uwe Loebel wrote:
>>> The machine is back in service!
>>>
>>> The solution was finally to install a firmware update (amazingly
>>> possible despite faultet MBU), execute the clearfault command and
>>> perform a power cycling.
>>> Many thanks for all the help in this seemingly abandoned group!
>>>
>>
>> Good that it is up and working.
>> But bear in mind, you had a fault. You have cleared that fault. But the
>> fault may re-occur.
>>
>> Now would be a good time to hunt out another Sparc server and get a
>> backup/migration to it. If a nice T series is available, run up some
>> LDoms :-)
>>
>>
>
> Well here is another M4000 still in service, it runs and runs and runs.
> And the old gem is still certified to run Solaris 11.
>
> We are currently migrating our application (SAM/FS with 600TB active
> data) to a LDOM running on a S7.
>

When you have migrated, I think I know someone who might want the old
server off you :-)

larbob

Dec 16, 2021, 11:27:41 PM
I've got an M3000 with the same issue. Chris, if you still read here, "xyzzy" doesn't seem to work for me as that password: it doesn't acknowledge that I made any input and then keeps going after 5 seconds.

larbob

Dec 16, 2021, 11:34:18 PM
On Thursday, December 16, 2021 at 11:27:41 PM UTC-5, larbob wrote:
> I've got an M3000 with the same issue. Chris, if you still read here, "xyzzy" doesn't seem to want to work for me as that password -- it doesn't really acknowledge that I made any input and then keeps going after 5 seconds.
https://pastebin.com/9xM6Zwbp

I found this but I'm not sure what the root password is either.

chris

Dec 17, 2021, 12:13:00 PM
I bought a T4-1 completely bricked, but after a bit of web searching, found
a page describing a procedure: remove the internal service processor
card, fit a jumper to J505, refit the card, and power up again with a
terminal on the ILOM port.

The SP reboots; type in xyzzy when you see the message:

Booting linux in N seconds (may take several tries)

Reinstall and power up again. At the preboot prompt:

reset all

Then at the login prompt:

Login root
Password changeme

Must be something similar for the M4000

Chris

M3000 for 4 years+ now...