dump reading made simple??

Kurt LeBesco

Jul 29, 2013, 4:50:36 PM
to ASSEMBL...@listserv.uga.edu
Just about every dump gives you the registers and the PSW.
Together with the abend code, that should provide enough information to resolve
the issue. Of course, many 0C4s are the result of unresolved linkage or wild
branches. These can be frustrating! It is good to sprinkle eye catchers
throughout your code. Then, of course, you can always resort to desk checking
your code and making sure the generated code matches your intent.
Everyone has their own style of finding the gremlins hidden in their code.
The best advice I can provide is: don't give up!
Kurt LeBesco, NYS Courts, retired.
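
The eye-catcher advice can be illustrated with a small sketch. It is written in Python rather than assembler, purely to show the idea: scan a raw storage image for known EBCDIC eye catchers and report the last one at or before the failing offset. The eye-catcher names, the storage contents and the failing offset are all invented for illustration, and cp037 is assumed as the EBCDIC code page.

```python
# Sketch: locate EBCDIC eye catchers in a raw storage image.
# All names and storage contents below are invented for illustration.

def find_eye_catchers(storage: bytes, names: list[str]) -> list[tuple[int, str]]:
    """Return (offset, name) for every occurrence of each eye catcher, sorted by offset."""
    hits = []
    for name in names:
        pattern = name.encode("cp037")  # assumption: cp037 is the EBCDIC code page in use
        start = storage.find(pattern)
        while start != -1:
            hits.append((start, name))
            start = storage.find(pattern, start + 1)
    return sorted(hits)


def nearest_before(hits, offset):
    """Return the last eye catcher at or before the given offset, or None."""
    candidates = [hit for hit in hits if hit[0] <= offset]
    return candidates[-1] if candidates else None


# Hypothetical module image: two routines, each starting with an eye catcher.
storage = (
    "*INIT***".encode("cp037") + bytes(24)
    + "*PARSE**".encode("cp037") + bytes(40)
)
hits = find_eye_catchers(storage, ["*INIT***", "*PARSE**"])

# Hypothetical failing offset: PSW instruction address minus the module load point.
failing_offset = 0x3A
print(nearest_before(hits, failing_offset))
```

In a real dump the same search is done by eye or with a dump tool's find function; the point is only that a marker near the failing address tells you which routine you are in.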
On Jul 29, 2013 2:43 PM, "Tom Marchant" <m42tom-...@yahoo.com> wrote:

T'Dell Sparks

Jul 29, 2013, 5:33:02 PM
to ASSEMBL...@listserv.uga.edu
(First set of registers are usually the calling program) Be careful.

That can be misleading, as RTM2 will put things in the dump as well. That makes IPCS indispensable when dealing with dumps. The PSW might initially be pointing to something else in the dump, so you have to scan down through the listing to find your program. That's why, as some have said here, mentorship is a good way to skip some otherwise painful lessons. I still have to call on a 50-year veteran even after I've done this for over 30. I don't write WAIT/POST code.
Good point though.

Bernd Oppolzer

Jul 30, 2013, 4:16:26 AM
to ASSEMBL...@listserv.uga.edu
I'd like to enter this thread once again.

90 percent of the dumps - if not more - that we see at our site are dumps from
normal application code, that is, 0Cx exceptions and other easily resolvable
errors in application code. Sometimes the cause is the same but the error is
detected further down, for example in an LE routine that runs after a call to a
PL/1 or C runtime routine. Then the caller of this runtime routine is normally
to blame.

We automated dump reading as much as possible - making it almost unnecessary
most of the time - by providing an LE exit which runs in all our environments.
In case of an error, it catches the error and provides enough information from
the save area back trace that the application developer normally only has to
look at this information and doesn't need to refer to the SYSUDUMP that follows.
For example, we print every DSA for every procedure call, together with the
name of the function, the parameter address lists of every call, the complete
call hierarchy, the registers at every call level, the offset of the call etc.
If the error is indeed in an LE routine below the application code, we recognize
this, go up to the application code and identify the error position there. The
same goes for DB2 errors, that is, when the error position is in the routine
that handles the DB2 "SQLCODE not handled" condition. And once we have found
the name of the module that caused the error, we send an alarm mail to the
department responsible for the module - we get this information from a
repository.
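
The "go up to the application code" step might look roughly like the sketch below. It is a Python illustration of the idea only, not the exit itself. The module names, the owner table and the rule "anything starting with CEE, IBM or EDC is runtime code" are assumptions made for the example.

```python
from dataclasses import dataclass

# Assumption for this sketch: runtime modules are recognized by name prefix.
RUNTIME_PREFIXES = ("CEE", "IBM", "EDC")


@dataclass
class Frame:
    module: str   # name found at the entry point of the routine
    offset: int   # offset of the call or of the failing instruction


def blame(frames_newest_first, owners):
    """Return the first non-runtime frame and the department that owns it."""
    for frame in frames_newest_first:
        if not frame.module.startswith(RUNTIME_PREFIXES):
            return frame, owners.get(frame.module, "unknown")
    return None, "unknown"


# Hypothetical back trace: the error surfaced two levels below the application.
trace = [
    Frame("CEERTN1", 0x1A2),
    Frame("IBMRTN2", 0x040),
    Frame("PAYCALC", 0x3C6),
    Frame("PAYMAIN", 0x120),
]
owners = {"PAYCALC": "payroll-team"}   # stand-in for the repository lookup

frame, department = blame(trace, owners)
print(frame, department)
```

The department name returned here is what would drive the alarm mail; how the real repository is queried is not described in the post.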

The information provided this way is much easier to read for our people
than SYSUDUMP and even easier than CEEDUMP (it has more information,
has a somewhat better structure in our opinion, and - important for some
of our co-workers - it's in German).

Furthermore, we teach the developers how to cope with this.

This was necessary (we did it in 2005) because we had noticed some problems:

- the dumps looked different in the different environments (batch, test,
DB dialog aka IMS), but we wanted the same look and feel in every environment

- dump reading skills degraded

- we didn't want to buy an expensive tool and customize it for the different
environments; instead we wanted one of our own, to which we could easily add
further functions (see above)

From today's viewpoint, it looks like a success story.

Even in cases when a save area is destroyed (overwritten), the LE exit does a
very good job by providing at least what remains of the save area trace. It
tries to find the save areas first from the bottom (register 13), then from
above (TCBFSA), and in the normal case the two chains fit together. If not,
there is a gap, and this gap is documented.
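
The two-directional walk can be sketched as follows. This is a Python simulation, not the exit's code: storage is modeled as a byte array whose offsets stand in for addresses, and the standard 72-byte save area layout is assumed (back chain at offset 4, forward chain at offset 8). The addresses and the overlay are invented, and `None` is used here as the gap marker.

```python
import struct

BACK_CHAIN = 4   # offset of the caller's save area address
FWD_CHAIN = 8    # offset of the called routine's save area address


def word(storage, addr):
    """Read a big-endian fullword, or None if the address is outside storage."""
    if addr < 0 or addr + 4 > len(storage):
        return None
    return struct.unpack_from(">I", storage, addr)[0]


def walk(storage, start, link_offset, limit=100):
    """Follow one chain until a zero, unreadable or repeated pointer is found."""
    chain, seen, addr = [], set(), start
    while addr and addr not in seen and len(chain) < limit:
        link = word(storage, addr + link_offset)
        if link is None:
            break                      # pointer leads outside the storage image
        chain.append(addr)
        seen.add(addr)
        addr = link
    return chain


def back_trace(storage, r13, first_save_area):
    """Merge both walks, oldest first; None marks a gap between the two parts."""
    from_top = walk(storage, first_save_area, FWD_CHAIN)
    from_bottom = list(reversed(walk(storage, r13, BACK_CHAIN)))
    if set(from_top) & set(from_bottom):
        return from_top + [a for a in from_bottom if a not in from_top]
    return from_top + [None] + from_bottom


def build(areas, size=0x600):
    """Create simulated storage with correctly chained save areas."""
    storage = bytearray(size)
    for prev, cur, nxt in zip([0] + areas[:-1], areas, areas[1:] + [0]):
        struct.pack_into(">II", storage, cur + BACK_CHAIN, prev, nxt)
    return storage


areas = [0x100, 0x200, 0x300, 0x400, 0x500]
intact = build(areas)
damaged = build(areas)
damaged[0x208:0x408] = bytes(0x200)    # simulated overlay wiping out the middle

print(back_trace(intact, 0x500, 0x100))
print(back_trace(damaged, 0x500, 0x100))
```

In a real dump the forward pointer of the newest save area may be left over from an earlier call, so a production version would have to validate each area more carefully than this sketch does.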

The save area trace and the back chain are very important for us, because
at our site we typically have many small modules calling each other, and it is
not uncommon to see some 50 levels of calling hierarchy.

BTW: the method works regardless of the programming language; we have C,
PL/1 and ASSEMBLER (and, at a neighbor site, the exit also works with C++
functions - in fact the method to get the function name from the entry point
is the same for all LE languages, so I believe it will work for COBOL, too,
although there is no COBOL around).

Kind regards

Bernd

Shane

Jul 30, 2013, 4:43:48 AM
to ASSEMBL...@listserv.uga.edu
On Tue, 30 Jul 2013 10:16:26 +0200 Bernd Oppolzer wrote:

> 90 percent of the dumps - if not more - that we see at our site are
> dumps from
> normal application code, that is 0Cx etc. exceptions or other "easy"
> resolvable reasons from application code.

Lucky fella. At the site I am currently with, almost all abends are in
(OCO) vendor code. *If* we get an abend at all - high importance address
spaces looping gets ugly real quick. BTGTS.
And I'd reckon I get to look at a dump for the above at least weekly.
All in one small site.

Simple be damned.

Shane ...

David de Jongh

Jul 30, 2013, 10:47:03 AM
to ASSEMBL...@listserv.uga.edu
This was "déjà vu all over again" for me. We have an in-house abend
analysis routine driven by an LE handle condition routine, and also by the
CICS XPCTA exit. I've been maintaining it for about 20 years now, through
multiple releases of CICS, COBOL and MVS/z/OS. I learned assembler through
a 3-week part time class in 1972, then learned how real programs were
written over the next couple of years in an assembler application
maintenance group. After being shown how to read a SYSUDUMP, I was fixing
production problems at 2a.m. It's not rocket science, just a question of
getting some help at the start, and getting dumps to look at fairly
frequently.
David de Jongh


Chris Craddock

Jul 30, 2013, 11:11:32 AM
to ASSEMBL...@listserv.uga.edu
I have no dog in this discussion - I am long gone from BMC - but to put this back on the rails; what Bill Blair was specifically asking for was people who have deep z/OS development and diagnostic skills.

Dump reading is just one of those skills. In Bill's case they already have shared product infrastructure (*) that diagnoses and recovers from abends in product code, regardless of the state it is running in. That's the thing that vomits up diagnostic messages and captures LOGREC and SVC dumps. All of that is completely automatic and in most cases you can figure out the root cause just by reading the diagnostic messages.

...except when you can't. That's when you need to be able to read and analyze the contents of an SVC dump, or a SAD. None of that has the slightest bit of anything to do with LE dumps, or SYSxDUMPs. To meet Bill's requirements you MUST know your way around z/OS internals sufficiently to understand the control block chains indicating the state of the machine at the time of the error. You get even more brownie points if you can do that for dumps taken out of FRR routines for work that was locked or disabled at the time of the error.

There is a limited number of people out there who have those skills and we probably already know most of them personally through having been in the industry a long time, but he's hopeful that some keen / talented person will step up and say "oh yeah, I can do that".

CC
(*) I know this because I wrote it.

On Jul 30, 2013, at 9:49 AM, "David de Jongh" <davidd...@verizon.net> wrote:

> This was "déjà vu all over again" for me. We have an in-house abend
> analysis routine driven by ...

John Gilmore

Jul 30, 2013, 12:13:20 PM
to ASSEMBL...@listserv.uga.edu
There is no substitute for experience in the company of mentors/more
knowledgeable colleagues. For this reason the very experienced,
sure-footed mules who carry people down into and back up out of the
Grand Canyon of the Colorado should not be consulted about the
Canyon's geology.

To vary the metaphor, books having titles like 'Topology made easy' or
'Neurosurgery for dummies' achieve their objectives, when they do, by
leaving the hard parts out.

John Gilmore, Ashland, MA 01721 - USA

charles hottel

Jul 30, 2013, 1:07:36 PM
to ASSEMBL...@listserv.uga.edu
I am surprised no one has mentioned the book:

Debugging System 360/370 Programs Using OS and VS Storage Dumps
Daniel H. Rindfleisch

I found it sufficient for debugging most application program dumps.

Charles Hottel


Bernd Oppolzer

Jul 30, 2013, 1:43:57 PM
to ASSEMBL...@listserv.uga.edu
The normal application dumps are normal business.

Once you know how to handle this - and you learn it best by teaching or
mentoring others - you can of course go further and solve more advanced
problems. A very hard problem for me was a race condition between two
subtasks in IBM's APL code (the linkage to other languages, processor 11).
It took me weeks to convince IBM that the error was in their code, not in ours,
because they couldn't reproduce it on their machines (our machines were
faster and had more CPUs). They accepted it in the end, when I applied a local
fix to their code and documented in detail what I had done: I simply checked
the ECB, and on finding the condition that would lead to the subsequent ABEND,
I waited for a short time, wrote a message, and then tried again - and so
the error disappeared. But it took me weeks to find the reason for the problems
and to understand the code at the relevant positions - of course, it's all OCO.

IBM's APL code was patched on the fly, after initial load - control was
transferred from the original WAIT / POST routine to one of my own,
which worked a little differently, see above.
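
The shape of that replacement routine (check, pause, write a message, try again) can be sketched generically. The Python below is an illustration only, not the actual patch: the post does not say which ECB state was tested, so `is_unsafe` is a hypothetical predicate, and the attempt count and delay are invented values.

```python
import time


def retry_until_safe(is_unsafe, attempts=5, delay=0.01, log=print):
    """Check up to `attempts` times; return the number of checks that were needed.

    is_unsafe is a hypothetical stand-in for "the ECB is in the state that
    would lead to the ABEND"; the real test is not given in the post.
    """
    for attempt in range(1, attempts + 1):
        if not is_unsafe():
            return attempt             # safe to continue with the normal WAIT
        log(f"unsafe ECB state seen, retrying (check {attempt})")
        time.sleep(delay)              # give the other subtask time to finish
    raise TimeoutError("condition did not clear")


# Simulated race: the condition is present on the first two checks only.
states = iter([True, True, False])
messages = []
checks = retry_until_safe(lambda: next(states), log=messages.append)
print(checks, len(messages))
```

A retry of this kind hides the timing window rather than closing it, which fits its role here as a local fix used to demonstrate where the error was.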

In the end, IBM accepted my proposition for the remedy of this problem.

While working on this episode, it was very important for me that there
was a co-worker who was very experienced with all the z/OS topics
and who encouraged me to continue. He told me, for example, that he had
already experienced problems with APL in this particular area, and he
discussed my different approaches with me. Although he did not support me
with practical work, the discussions with him were very helpful to me.
Some SLIP traces helped too because, as you can imagine, the problem
occurred only once in some millions of transactions, and only during periods
of peak load.

Kind regards

Bernd

David P de Jongh

Jul 30, 2013, 5:07:15 PM
to ASSEMBL...@listserv.uga.edu
You are responding to the wrong thread. This discussion is a spin-off from Bill's original post, and the subject has changed. The spin-off started with Don Nielsen's response, which simply asked "Where might one find good instruction on how to read a dump? This is probably my poorest skill and I should be better at it." Eventually, somebody actually changed the subject line, as Jean Snow had asked after the thread had become sidetracked.
David de Jongh

T'Dell Sparks

Jul 30, 2013, 5:32:04 PM
to ASSEMBL...@listserv.uga.edu
You don't find it. It's an acquired skill. Someone has to lead you through the first few. Then, once you have honed some additional skill, mostly through trial and error, the knowledge becomes obvious and you should be able to find your way through most situations. Others will be more complex, so you really need senior-level people around to handle those. I'd read through the thread again. It was chock-full of good time-saving advice.

BTW
You cannot get this (dump reading) without a good knowledge of assembler. Cursory knowledge won't suffice.

