EDS mainframe goes <elided>, crashes RBS cheque system

McKown, John

Dec 17, 2009, 10:40:09 AM

From The Register (Vulture Central).

http://www.theregister.co.uk/2009/12/17/eds_mainframe/

Two z10s crashed in the UK due to lack of microcode maintenance. The first one crashed. This caused a DR rollover to the second one, which then also crashed. I don't know how an application can cause a microcode problem. Likely a misstatement due to lack of knowledge.

----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to list...@bama.ua.edu with the message: GET IBM-MAIN INFO
Search the archives at http://bama.ua.edu/archives/ibm-main.html

Nuttall, Peter

Dec 17, 2009, 12:35:08 PM

Well, it also mentions that the cheque application was based on software
called Connect Direct (a component of the application, maybe) .... So I suspect
the journo is not techy enough to understand what he/she's writing about ...
Would be interested to know what caused it ...

2009/12/17 McKown, John <John....@healthmarkets.com>

--
Peter Nuttall
J&LR AMS Group 2

Ken Porowski

Dec 17, 2009, 12:37:33 PM

<snip>

HP managers are reaping the harvest of their deep cost-cutting at EDS,
in the form of a massive mainframe failure that crippled some very large
clients, including the taxpayer-owned bank RBS.

An IBM Z10 at EDS's Stockley Park site, west of London, fell over this
week after vital microcode fixes had not been applied, because all the
qualified staff had been fired.

Previously the updates would have been applied by the Stockley park
hardware team, who have all been made redundant.

When EDS' disaster recovery plan kicked in, switching processes to
another Z10 at Mitcheldean in Gloucestershire, a similar lack of
maintenance scuppered the stand-in machine.

<snip>

They perform their own microcode updates?
I would have thought there were IBM CE's for that as part of 'normal'
maintenance charges.
Or maybe they just didn't give IBM the machine time?

What sort of microcode fix (if not applied) causes an otherwise working
machine to crash?

The way this was written kind of negates the 'Mainframes never crash'
(from a hardware perspective) idea.


Mark Post

Dec 17, 2009, 1:43:04 PM

>>> On 12/17/2009 at 9:25 AM, "McKown, John" <John....@HEALTHMARKETS.COM>
wrote:

> From The Register (Vulture Central).
>
> http://www.theregister.co.uk/2009/12/17/eds_mainframe/
>
> Two z10s crashed in the UK due to lack of microcode maintenance. The first
> one crashed. This caused a DR roll over to the second one, which then also
> crashed.

I've been expecting something like this for years, having seen first hand all the cuts EDS made to the mainframe support organizations _before_ being acquired by HP. I'm actually surprised it took this long. I guess the difference before was that we actually still had people working like crazy to keep things running, and now there simply aren't any in some locations.

Mark Post

Bob Shannon

Dec 17, 2009, 1:43:51 PM

> From The Register (Vulture Central).

>http://www.theregister.co.uk/2009/12/17/eds_mainframe/

That's a poorly written article. I seriously doubt the "Stockley park hardware team" applies microcode updates. More likely they simply failed to schedule IBM to perform maintenance. The mention of Connect Direct bears no relevance to the hardware problem.

Bob Shannon
Rocket Software

Sam Siegel

Dec 17, 2009, 1:54:25 PM

Connect:Direct (previously known as NDM) can only be loosely related to
check processing as it just transmits files back and forth. An IBM Check
shop uses CPCS (check processing control system) Vector sort (by Sterling),
etc.

These are low-level programs still shipped in source (at least partially)
that run authorized and perform device control and other non-specialized
processes.

The article seems to have a bunch of information assembled in a random
fashion. Either the writer does not understand check processing or the
person who provided the information was not clear on the details.

It is not clear how microcode fits into this unless they are talking about
the 3890 check sorters (unit record devices).

Just not enough information to make any sense of what actually occurred.

Sam

Ed Finnell

Dec 17, 2009, 1:56:06 PM

In a message dated 12/17/2009 11:36:29 A.M. Central Standard Time,
Ken.Po...@CIT.COM writes:

What sort of microcode fix (if not applied) causes an otherwise working
machine to crash?

The way this was written kind of negates the 'Mainframes never crash'
(from a hardware perspective) idea.

>>
It's an evolving species! Stuff mutates. Mixed-vendor environments
are especially challenging. You have to get past the finger-pointing and
designated blame game to pin it down. The z10, like its predecessors,
downloads ECs and fixes as they are discovered and tested. It is
designed to do concurrent maintenance on all but the most critical
aspects. On the software side, SMP/E provides HOLDDATA for EC or
hardware-related actions. Usually these are level sets for future
enhancements or features.

It's up to management to schedule service time and provide a testing
environment for changes to include staffing and training.

Pommier, Rex R.

Dec 17, 2009, 2:00:26 PM

And now watch the guilty parties spin it as "we need to migrate off
these mainframes to HP hardware..."

Rex

Steve Thompson

Dec 17, 2009, 2:05:19 PM

-----Original Message-----
From: IBM Mainframe Discussion List [mailto:IBM-...@bama.ua.edu] On Behalf Of Ken Porowski
Sent: Thursday, December 17, 2009 10:00 AM
To: IBM-...@bama.ua.edu
Subject: Re: EDS mainframe goes <elided>, crashes RBS cheque system

<snip>

They perform their own microcode updates?

I would have thought there were IBM CE's for that as part of 'normal'
maintenance charges.
Or maybe they just didn't give IBM the machine time?

What sort of microcode fix (if not applied) causes an otherwise working
machine to crash?

The way this was written kind of negates the 'Mainframes never crash'
(from a hardware perspective) idea.

<SNIP>

Perhaps there is a TCP/IP microcode patch that needs to be put on their system? The type that if you don't put it on, you wind up with corrupted data or sockets that hang/block?

--Sent from my Dick Tracy Two-Way TV Wrist-Watch --

William Janulin

Dec 17, 2009, 2:07:25 PM

Yes, until they discover the hidden costs of migration, after flailing
away for two years like a major pharmaceutical firm did some years ago,
then reverting back.

Bill Janulin
Mgr Tech Support & Product Dev.
ASPG, Inc.

Hal Merritt

Dec 17, 2009, 2:30:32 PM

From personal experience, I expect more misinformation than information early in a major event (aka 'disaster').

I read that modern processors can be configured to automatically download and apply patches on the fly. Some prefer the patches be staged until someone pulls the trigger in a suitable window. Our CE does that for us, but the process is pretty straightforward and easy enough for mere mortals. And a large shop might want their own team going around pulling triggers. It follows that elimination of the team would leave triggers unpulled.

Wonder if we'll ever know what really happened?

<snip>


Pinnacle

Dec 17, 2009, 6:28:53 PM

----- Original Message -----
From: "Mark Post" <mp...@NOVELL.COM>
Newsgroups: bit.listserv.ibm-main
Sent: Thursday, December 17, 2009 1:43 PM
Subject: Re: EDS mainframe goes <elided>, crashes RBS cheque system

>>>> On 12/17/2009 at 9:25 AM, "McKown, John"
>>>> <John....@HEALTHMARKETS.COM>
> wrote:
>> From The Register (Vulture Central).
>>
>> http://www.theregister.co.uk/2009/12/17/eds_mainframe/
>>
>> Two z10s crashed in the UK due to lack of microcode maintenance. The
>> first
>> one crashed. This caused a DR roll over to the second one, which then
>> also
>> crashed.
>
> I've been expecting something like this for years, having seen first hand
> all the cuts EDS made to the mainframe support organizations _before_
> being acquired by HP. I'm actually surprised it took this long. I guess
> the difference before was that we actually still had people working like
> crazy to keep things running, and now there simply aren't any in some
> locations.
>
>

After Ross Perot left, EDS went straight downhill. In 1990, EDS owned
outsourcing. By 2000 IGS had eaten EDS for breakfast and spit it out. Sad,
really, to watch a once-great company that invented the business become an
also-ran.

Regards,
Tom Conley

Hunkeler Peter , KIUP 4

Dec 18, 2009, 2:19:11 AM

>And now watch the guilty parties spin it as "we need to
>migrate off these mainframes to HP hardware..."

.. and in support of their ignorance they surely will refer to
some glamorous consultant reports like those that recently
recommended "... now being the time to get ..." (to somewhere
else :-)

<rant>
With an ass at the right place shit is bound to come out. (With
apologies to the animals)
</rant>

--
Peter Hunkeler
Credit Suisse

Ted MacNEIL

Dec 18, 2009, 2:41:57 AM

>An IBM Check shop uses CPCS (check processing control system) Vector sort (by Sterling)

Since when is it by Sterling?
When I worked at a Canadian bank, it was an IBM product using 3890 cheque processors.
All the presentations and support were by IBM.
Unless what I loosely call my mind has failed (again).
-
Too busy driving to stop for gas!

Sam Siegel

Dec 18, 2009, 3:13:46 AM

Sorry about that ... CPCS is indeed (and always has been) an IBM product.
VECTOR:sort is by Sterling.

Maarten Slegtenhorst

Dec 21, 2009, 5:27:48 AM

The article may be lacking some detail info, but it seems pretty clear:

- One mainframe crashes because a critical MCL ( hiper? ) was not
applied/activated
- Naturally all LPAR's on this mainframe are then unavailable
- Connect Direct doesn't run because it ran in one of the unavailable
lpars
- If the cheque clearing system runs in a lpar, then it's unavailable
too
- If the cheque clearing system runs elsewhere, using a Connect Direct
connection to one of the lpar's, it doesn't receive input/can't send
output, so it doesn't function anymore

- Everything switches to the DR-site

- This mainframe also crashes because the critical MCL ( hiper? ) was
not applied/activated
- Naturally all LPAR's on this mainframe are then unavailable
- Connect Direct doesn't run because it ran in one of the unavailable
lpars
- If the cheque clearing system runs in a lpar, then it's unavailable
too
- If the cheque clearing system runs elsewhere, using a Connect Direct
connection to one of the lpar's, it doesn't receive input/can't send
output, so it doesn't function anymore

So the crash was not caused by the software, but the unavailability of
the check clearing system was a result of the crash.

Or am I missing something?
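
The chain of reasoning above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the component names, the dependency map, and the helper functions `unavailable_after` and `first_surviving_site` are all hypothetical, and nothing in the article confirms the real topology at either site.

```python
from collections import deque

# Hypothetical dependency map: each component lists what it directly needs.
# These names are assumptions for illustration, not the real EDS configuration.
DEPENDS_ON = {
    "lpar": ["mainframe"],
    "connect_direct": ["lpar"],
    "cheque_clearing": ["connect_direct"],
}


def unavailable_after(failed, depends_on):
    """Return every component that is down once `failed` has crashed."""
    # Invert the map so we can walk from a component to its dependents.
    dependents = {}
    for component, needs in depends_on.items():
        for need in needs:
            dependents.setdefault(need, []).append(component)

    down = {failed}
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for child in dependents.get(current, []):
            if child not in down:
                down.add(child)
                queue.append(child)
    return down


def first_surviving_site(sites):
    """Given (site, has_critical_mcl) pairs in failover order, return the
    first site whose machine has the fix applied, or None if none has."""
    for name, has_critical_mcl in sites:
        if has_critical_mcl:
            return name
    return None


if __name__ == "__main__":
    print(sorted(unavailable_after("mainframe", DEPENDS_ON)))
    # Both machines missing the same fix: the DR switch has nowhere to land.
    print(first_surviving_site([("Stockley Park", False), ("Mitcheldean", False)]))
```

The point of the second function is that a DR site only helps if it does not share the fault of the primary; two machines missing the same fix behave like a single point of failure.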

----------------------------------------------------------------------------

This reminds me of SEGMENTATIONOFFLOAD, which crashed our OSA's with a
domino-effect.
All LPAR's that used those OSA's were unavailable through the network.

P.S. We also have a hardware team that activates the disruptive MCLs
(OSA MCLs amongst others).
P.P.S. Nice horror scenario to show all those people who try to reject
MCL changes because somewhere in prehistory an update went wrong.

--
Maarten

-----Original Message-----
From: IBM Mainframe Discussion List [mailto:IBM-...@BAMA.UA.EDU] On Behalf Of
McKown, John
Sent: Thursday, 17 December 2009 15:26
To: IBM-...@BAMA.UA.EDU
Subject: EDS mainframe goes <elided>, crashes RBS cheque system

http://www.theregister.co.uk/2009/12/17/eds_mainframe/
