Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

COBOL inline CPU timer

457 views
Skip to first unread message

Bill Woodger

unread,
Dec 20, 2016, 4:56:52 AM12/20/16
to
How to time COBOL code blocks, prompted by John McKown's "how to: convince programmer something else is better.":


See here: http://ibmmainframes.com/viewtopic.php?p=294906

The main reason for including the link is to show the program nicely formatted. Here's the post, in case the link-to information ever disappears:

With the usual provisos (don't use in production, don't rely on it being available after z/OS upgrade) here is a module which will make the CPU time available in the program which calls it (or wherever the storage happens to be defined).

At the "front" of a program,

CALL "BBPFTC1" USING ADDRESS OF a-doubleword-binary-defined-in-linkage

Then forget about the sub-program, no longer needed. You could, I suppose, even CANCEL it if you've loaded it dynamically.

When you need the CPU time, just use the field from Linkage. Typically, save it before a suspect block of code. Save it somewhere else after the block. Compare. To get the time "big enough", "divide by 4096" to get it to miliseconds.

ID DIVISION.
PROGRAM-ID. BBPFTC1.

DATA DIVISION.
WORKING-STORAGE SECTION.
01 W-PSA-POINTER POINTER.
LINKAGE SECTION.
01 PSA-CONTROL-BLOCK.
05 FILLER PIC X(548).
05 PSA-CB-ASCB-POINTER POINTER.
01 ASCB-CONTROL-BLOCK.
05 FILLER PIC X(64).
05 ASCB-CB-CPUTIME PIC X(8).
01 L-CPU-TIME-PTR USAGE POINTER.
PROCEDURE DIVISION USING L-CPU-TIME-PTR.
SET W-PSA-POINTER TO NULL
SET ADDRESS OF PSA-CONTROL-BLOCK
TO W-PSA-POINTER
SET ADDRESS OF ASCB-CONTROL-BLOCK
TO PSA-CB-ASCB-POINTER
SET L-CPU-TIME-PTR TO ADDRESS OF ASCB-CB-CPUTIME
GOBACK
.

----------------------------

Here's an example of how to use it, prepared for Peter Hunkeler's topic on the "decimal overflow", "How to Identify modules as C/C++ or Cobol, or ASM from dump (SYSMDUMP)." Unfortunately without formatting:

IDENTIFICATION DIVISION.
PROGRAM-ID. 'STAB64'.
DATA DIVISION.
WORKING-STORAGE SECTION.
01 W-WHEN-COMPILED PIC X(4)/XX/XXBXXBXXBXX.
01 W-LARGE-SOURCE-FIELD PIC 9.
01 W-LARGE-TARGET-FIELD PIC 9.
01 W-LARGE-ANSWER PIC 9.
01 W-SMALL-SOURCE-FIELD PIC 9.
01 W-SMALL-TARGET-FIELD PIC 9.
01 W-SMALL-ANSWER PIC 9.
01 W-NUMBER-OF-LOOPS BINARY PIC 9(8).
01 W-ZONED-TO-BIN PIC 9(8).
01 CALC-ITERATION-LENGTH BINARY PIC 9(1).
01 W-EMPTY-CPU COMP-5 PIC 9(18).
01 W-GROSS-CPU COMP-5 PIC 9(18).
01 W-NETT-CPU COMP-5 PIC 9(18).
01 W-ORIG-CPU COMP-5 PIC 9(18).
01 W-CPU-TIME-FOR-DISPLAY PIC Z(11)9,99,9999.
01 W-CPU-BEFORE-BLK1 COMP-5 PIC 9(18).
01 W-CPU-BEFORE-BLK1X
REDEFINES W-CPU-BEFORE-BLK1.
05 FILLER PIC X(8).
01 W-CPU-AFTER-BLK1 COMP-5 PIC 9(18).
01 W-CPU-AFTER-BLK1X
REDEFINES W-CPU-AFTER-BLK1.
05 FILLER PIC X(8).
01 W-CPU-BEFORE-BLK2 COMP-5 PIC 9(18).
01 W-CPU-BEFORE-BLK2X
REDEFINES W-CPU-BEFORE-BLK2.
05 FILLER PIC X(8).
01 W-CPU-AFTER-BLK2 COMP-5 PIC 9(18).
01 W-CPU-AFTER-BLK2X
REDEFINES W-CPU-AFTER-BLK2.
05 FILLER PIC X(8).
01 W-CEE3SPM-FUNCTION BINARY PIC 9(8).
88 W-CEE3SPM-QUERY VALUE 2.
88 W-CEE3SPM-SET VALUE 1.
01 W-CEE3SPM-FUNCTION-VALUES PIC X(80).
01 W-CEE3SPM-FB-IGNORED-HERE PIC X(12).
LINKAGE SECTION.
01 PARM-DATA.
05 PARM-LENGTH COMP PIC 9(4).
05 PARM-VALUE.
10 FILLER OCCURS 0 TO 100 TIMES
DEPENDING ON PARM-LENGTH.
15 FILLER PIC X.
01 PARM-WITH-CONTEXT.
05 PARM-IND PIC X.
88 SET-OVERFLOW-OFF VALUE "1".
05 PARM-NUMBER-OF-ITERATIONS.
10 FILLER OCCURS 0 TO 8 TIMES
DEPENDING ON CALC-ITERATION-LENGTH.
15 FILLER PIC X.
01 L-CPU-TIME PIC X(8).
PROCEDURE DIVISION USING
PARM-DATA
.
PERFORM INITIAL-PROCESSING
PERFORM DO-THE-SMALL-ADD
PERFORM DO-THE-LARGE-ADD
PERFORM FINAL-PROCESSING
GOBACK
.
INITIAL-PROCESSING.
PERFORM SAY-WHO-WE-ARE
PERFORM PROCESS-PARM-FOR-LOOP-COUNT
PERFORM QUERY-AND-DISPLAY-MASK
IF SET-OVERFLOW-OFF
PERFORM SET-DECIMAL-IN-MASK-OFF
END-IF
PERFORM SAVE-CPU-START-TIME
.
SAY-WHO-WE-ARE.
MOVE FUNCTION WHEN-COMPILED TO W-WHEN-COMPILED
DISPLAY
"STAB64 COMPILED ON: "
W-WHEN-COMPILED
.
PROCESS-PARM-FOR-LOOP-COUNT.
PERFORM REDEFINES-TO-ALLOW-ODO
SUBTRACT LENGTH OF PARM-IND FROM PARM-LENGTH
GIVING CALC-ITERATION-LENGTH
MOVE PARM-NUMBER-OF-ITERATIONS
TO W-ZONED-TO-BIN
MOVE W-ZONED-TO-BIN TO W-NUMBER-OF-LOOPS
DISPLAY
">"
PARM-NUMBER-OF-ITERATIONS
"<"

.
REDEFINES-TO-ALLOW-ODO.
SET ADDRESS OF PARM-WITH-CONTEXT
TO ADDRESS OF
PARM-VALUE
.
QUERY-AND-DISPLAY-MASK.
SET W-CEE3SPM-QUERY TO TRUE
CALL 'CEE3SPM' USING W-CEE3SPM-FUNCTION
W-CEE3SPM-FUNCTION-VALUES
W-CEE3SPM-FB-IGNORED-HERE
DISPLAY
W-CEE3SPM-FUNCTION-VALUES
.
SET-DECIMAL-IN-MASK-OFF.
SET W-CEE3SPM-SET TO TRUE
MOVE 'NOF NOD NOU NOS' TO W-CEE3SPM-FUNCTION-VALUES
CALL 'CEE3SPM' USING
W-CEE3SPM-FUNCTION
W-CEE3SPM-FUNCTION-VALUES
W-CEE3SPM-FB-IGNORED-HERE
.
SAVE-CPU-START-TIME.
CALL "BBPFTC1" USING
ADDRESS OF
L-CPU-TIME
.
DO-THE-SMALL-ADD.

ACCEPT W-SMALL-SOURCE-FIELD
ACCEPT W-SMALL-TARGET-FIELD
MOVE L-CPU-TIME TO W-CPU-BEFORE-BLK1X
PERFORM
W-NUMBER-OF-LOOPS TIMES
ADD W-SMALL-SOURCE-FIELD TO W-SMALL-TARGET-FIELD
GIVING W-SMALL-ANSWER
END-PERFORM
MOVE L-CPU-TIME TO W-CPU-AFTER-BLK1X
DISPLAY "THE ADD SEEMED TO WORK"
DISPLAY W-SMALL-ANSWER
.
DO-THE-LARGE-ADD.

ACCEPT W-LARGE-SOURCE-FIELD
ACCEPT W-LARGE-TARGET-FIELD
MOVE L-CPU-TIME TO W-CPU-BEFORE-BLK2X
PERFORM W-NUMBER-OF-LOOPS TIMES
ADD W-LARGE-SOURCE-FIELD TO W-LARGE-TARGET-FIELD
GIVING W-LARGE-ANSWER
END-PERFORM
MOVE L-CPU-TIME TO W-CPU-AFTER-BLK2X
DISPLAY "THE LARGE ADD SEEMED TO WORK"
DISPLAY W-LARGE-ANSWER
.
FINAL-PROCESSING.
PERFORM SHOW-CPU-USED-BLOCK-1
PERFORM SHOW-CPU-USED-BLOCK-2
.
SHOW-CPU-USED-BLOCK-1.
COMPUTE W-GROSS-CPU = ( W-CPU-AFTER-BLK1
- W-CPU-BEFORE-BLK1 )
/ 4096
MOVE W-GROSS-CPU TO W-CPU-TIME-FOR-DISPLAY
DISPLAY
"CODE BLOCK 1 GROSS CPU"
">"
W-CPU-TIME-FOR-DISPLAY
"<"
.
SHOW-CPU-USED-BLOCK-2.
COMPUTE W-GROSS-CPU = ( W-CPU-AFTER-BLK2
- W-CPU-BEFORE-BLK2 )
/ 4096
MOVE W-GROSS-CPU TO W-CPU-TIME-FOR-DISPLAY
DISPLAY
"CODE BLOCK 2 GROSS CPU"
">"
W-CPU-TIME-FOR-DISPLAY
"<"
.

JCL to run it:

//RUNIT EXEC PGM=STAB64,TIME=(,reasonableforyoursystem),PARM='G1'
//STEPLIB DD DISP=SHR,DSN=yourload
// DD DISP=SHR,DSN=yourLEruntime IBM-provided is CEE.SCEELKED, I think
//SYSOUT DD SYSOUT=*
//CEEOPTS DD *
TRAP(OFF)
RPTOPTS(ON)
//SYSIN DD *
1
1
9
9

To get the overflow bit "on", use compiler option PGMNAME(LONGUPPER) or LONGMIXED.

See the CPU counts with TRAP(OFF), see the S0CA with TRAP(ON). Then change the G in the PARM to F (all values other than F are equal) to get the program to turn the overflow bit "off", and compare and contrast.

This program demonstrates the point made by Tom Ross in a presentation, which was referred to in Peter Hunkeler's topic.

1 and 1, and 9 and 9, are the two one-digit values which will be added together giving a one-digit result. 1 + 1 will not cause overflow, 9 + 9 (or indeed 9 + anything other than zero) will cause overflow.

In the example (and in Tom Ross's) the hit is huge, but hopefully even when overflow occurs, it is not the regular result, although people have been affected by this.

----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to list...@listserv.ua.edu with the message: INFO IBM-MAIN

Bill Woodger

unread,
Dec 20, 2016, 6:58:52 AM12/20/16
to
In my rush to exchange an Electronic Arts (EA) PS4 game (don't, anyone, ever, mention their idiotic policy on their product for sale in Poland (the "international version" has Polish, amongst many languages, including the two my son would be interested in, English and Portuguese, the Polish version has... Polish), I missed explaining the second part of the PARM.

It is a zero-to-eight-digit value representing the number of iterations. The example posted shows for one iteration, which may not show the disparity (but the disparity is huge) :-)

Jim Mulder

unread,
Dec 20, 2016, 1:00:13 PM12/20/16
to
> With the usual provisos (don't use in production, don't rely on it
> being available after z/OS upgrade) here is a module which will make
> the CPU time available in the program which calls it (or wherever
> the storage happens to be defined).
>
> At the "front" of a program,
>
> CALL "BBPFTC1" USING ADDRESS OF a-doubleword-binary-defined-in-linkage
>
> Then forget about the sub-program, no longer needed. You could, I
> suppose, even CANCEL it if you've loaded it dynamically.
>
> When you need the CPU time, just use the field from Linkage.
> Typically, save it before a suspect block of code. Save it somewhere
> else after the block. Compare. To get the time "big enough", "divide
> by 4096" to get it to miliseconds.

ASCBEJST is in timer units. Dividing it by 4096 yields
microseconds (not milliseconds).

ASCBEJST contains accumulated CPU time for all TCBs and
SRBs in the address space.

ASCBEJST does not include the CPU time spent in the current
dispatch.

Jim Mulder z/OS Diagnosis, Design, Development, Test IBM Corp.
Poughkeepsie NY

Paul Gilmartin

unread,
Dec 20, 2016, 1:17:00 PM12/20/16
to
On Tue, 20 Dec 2016 12:59:51 -0500, Jim Mulder wrote:
>
> ASCBEJST does not include the CPU time spent in the current
>dispatch.
>
I knew someone who dealt with this by issuing an (almost) no-op SVC
to force re-dispatch. IIRC, he used CHAP.

(Almost) every language needs a no-op. But does the programmer
get to select the side effects. POSIX shell has ":" which has the
(sometimes desirable) side effect of setting RC to 0, and it evaluates
its arguments then ignores them.

-- gil

Bill Woodger

unread,
Dec 20, 2016, 6:58:59 PM12/20/16
to
Thanks Jim Mulder.

Perhaps you can provide guidance if I explain what I thought I was doing.

At times there's a bit of code which "runs like a dog with no legs". There will usually be several options to deal with it, but to actually land on which is "better" you need to know how they perform against each other.

Rather than tossing five different versions of the program onto the machine, collating usage, and not being able to compare with the "overhead" of the rest of the program, I was hoping to get "a starting point" of the CPU used, an "ending point" of the CPU used, thus being able to arrive at "the CPU used by this bit of code" (repeated lots of times).

I'm not concerned with IO of any type, including DBs, I am concerned with explicit CALL and implicit (COBOL run-time routines). Nothing fancy, although it can include LE callable services. I'm not interested in System Services. (In summary, I'm not interested in things (at that particular stage) that I can have no influence on, if I have to read a KSDS there's only one way to do it in COBOL, so I "abstract it out" of the loop).

Using it so far does (apparently) give fair indicative results. "Yes, that's significantly better" or "yes, that's faster, but not enough to chose it in this case over something "less exotic".

If there's some way to make it more "accurate" for what I want, I'd be pleased.

I'm "applications", so when you say "despatch", I think "mail room" or "loading bay". As plain as possible is fine with me.

Perhaps-related question. "interruptible" instructions (like MVCL, I was told), are they "doing me in"?

Charles Mills

unread,
Dec 21, 2016, 9:50:47 AM12/21/16
to
I'm not Jim Mulder but

> Rather than tossing five different versions of the program onto the machine

Yes, your idea of measurement is superior, at least as a starting point.

> when you say "despatch", I think "mail room" or "loading bay". As plain as possible is fine with me.

Perhaps not quite the official explanation but dispatch = your task getting control/usage of a CPU: the last time your program "woke up." I would say that if you are measuring the CPU time used in some large span of code including I/O then the CPU time figure will be pretty good. If you are measuring how long it took you to compute pi to ten digits, then you could conceivably end up with zero, because it was all computed since you "got" the CPU most recently.

Perhaps others here have a strategy for forcing a new dispatch. Does the TIMEUSED macro force a dispatch?

> "interruptible" instructions (like MVCL, I was told), are they "doing me in"?

I would say not per se. They are interruptible precisely because they might take a long time, but the fact that they might be interrupted is not the problem. If you are doing lots of 200K moves, for example, then the problem is the moves, not the interruptions.

Charles
-----Original Message-----
From: IBM Mainframe Discussion List [mailto:IBM-...@LISTSERV.UA.EDU] On Behalf Of Bill Woodger
Sent: Tuesday, December 20, 2016 6:59 PM
To: IBM-...@LISTSERV.UA.EDU
Subject: Re: COBOL inline CPU timer

Thanks Jim Mulder.

<snip>

Paul Gilmartin

unread,
Dec 21, 2016, 10:14:41 AM12/21/16
to
On Wed, 21 Dec 2016 09:50:18 -0500, Charles Mills wrote:
>
>> when you say "despatch", I think "mail room" or "loading bay". As plain as possible is fine with me.
>
>Perhaps not quite the official explanation but dispatch = your task getting control/usage of a CPU: the last time your program "woke up." I would say that if you are measuring the CPU time used in some large span of code including I/O then the CPU time figure will be pretty good. If you are measuring how long it took you to compute pi to ten digits, then you could conceivably end up with zero, because it was all computed since you "got" the CPU most recently.
>
I had pondered this issue in the past and decided that the most useful, though
not the most simply usable, value to store in the CVT is CPU time used minus
TOD clock value at time of dispatch. Then the programmer could STCK(F),
add that magic value in the CVT, and get instantaneous CPU time used. Yes,
I know there's a timing window and that it can be shut with more code.

>Perhaps others here have a strategy for forcing a new dispatch. Does the TIMEUSED macro force a dispatch?
>
I mentioned, perhaps not in time for Bill to read, a colleague who used a
neutral Change Priority SVC to force dispatch.

--gil

Charles Mills

unread,
Dec 21, 2016, 10:31:31 AM12/21/16
to
> the most simply usable, value to store in the CVT is CPU time used minus TOD clock value at time of dispatch

Which does not currently exist, right? You are suggesting that IBM might add this?

> perhaps not in time for Bill to read

Or me to assimilate, obviously. <g>

Are we re-inventing the wheel here? What's wrong with TIMEUSED, other than (for my purposes, if not the OP) that it requires a call out to assembler while ASCB or TCB arithmetic may be done directly in C (and perhaps even in COBOL)?

Charles

-----Original Message-----
From: IBM Mainframe Discussion List [mailto:IBM-...@LISTSERV.UA.EDU] On Behalf Of Paul Gilmartin
Sent: Wednesday, December 21, 2016 10:15 AM
To: IBM-...@LISTSERV.UA.EDU
Subject: Re: COBOL inline CPU timer

On Wed, 21 Dec 2016 09:50:18 -0500, Charles Mills wrote:
>
>> when you say "despatch", I think "mail room" or "loading bay". As plain as possible is fine with me.
>
>Perhaps not quite the official explanation but dispatch = your task getting control/usage of a CPU: the last time your program "woke up." I would say that if you are measuring the CPU time used in some large span of code including I/O then the CPU time figure will be pretty good. If you are measuring how long it took you to compute pi to ten digits, then you could conceivably end up with zero, because it was all computed since you "got" the CPU most recently.
>
I had pondered this issue in the past and decided that the most useful, though not the most simply usable, value to store in the CVT is CPU time used minus TOD clock value at time of dispatch. Then the programmer could STCK(F), add that magic value in the CVT, and get instantaneous CPU time used. Yes, I know there's a timing window and that it can be shut with more code.

>Perhaps others here have a strategy for forcing a new dispatch. Does the TIMEUSED macro force a dispatch?
>
I mentioned, perhaps not in time for Bill to read, a colleague who used a neutral Change Priority SVC to force dispatch.

Greg Dyck

unread,
Dec 21, 2016, 11:20:56 AM12/21/16
to
On 12/20/2016 5:58 PM, Bill Woodger wrote:
> If there's some way to make it more "accurate" for what I want, I'd be pleased.

Use the assembler macro-

TIMEUSED STORADR=xxx,ECT=YES

This will call a *short* system routine (no PC, no SVC) which will use
the ECTG instruction to calculate the current CPU time consumed,
including CPU time consumed since the last dispatch, for the current task.

Regards,
Greg

Paul Gilmartin

unread,
Dec 21, 2016, 11:47:25 AM12/21/16
to
On Wed, 21 Dec 2016 10:20:42 -0600, Greg Dyck wrote:
>
>Use the assembler macro-
>
> TIMEUSED STORADR=xxx,ECT=YES
>
>This will call a *short* system routine (no PC, no SVC) which will use
>the ECTG instruction to calculate the current CPU time consumed,
>including CPU time consumed since the last dispatch, for the current task.
>
How does ECTG do that? Is there information in control registers supporting it?

Does it work likewise on systems other than z/OS, such as Linux? VM guests?

-- gil

Greg Dyck

unread,
Dec 21, 2016, 12:19:02 PM12/21/16
to
On 12/21/2016 10:46 AM, Paul Gilmartin wrote:
> How does ECTG do that? Is there information in control registers supporting it?
>
> Does it work likewise on systems other than z/OS, such as Linux? VM guests?

The high level answer is that ECTG atomically fetches 8 bytes of storage
from one location, PSADTSAV for z/OS, and subtracts the current CPU
timer value from it and loads 8 bytes from a second location, TCBTTIME
for z/OS, and loads a third value, which is the type of processor (GCP,
zIIP, zAAP) for z/OS. Using these atomically obtained values the
current CPU time can then be calculated. The calculations are *not*
something to be attempted in application code as proper adjustments have
to be made if all CP types do not run at the same speed.

If appropriate information was saved at dispatch time by other operating
systems then ECTG could be used there as well.

Regards,
Greg

Charles Mills

unread,
Dec 21, 2016, 4:00:00 PM12/21/16
to
When does ECTG come along? I've got high-level notes on the various z architecture levels but the "extract-CPU-time facility" is not in my notes.

Charles

-----Original Message-----
From: IBM Mainframe Discussion List [mailto:IBM-...@LISTSERV.UA.EDU] On Behalf Of Greg Dyck
Sent: Wednesday, December 21, 2016 12:19 PM
To: IBM-...@LISTSERV.UA.EDU
Subject: Re: COBOL inline CPU timer

On 12/21/2016 10:46 AM, Paul Gilmartin wrote:
> How does ECTG do that? Is there information in control registers supporting it?
>
> Does it work likewise on systems other than z/OS, such as Linux? VM guests?

The high level answer is that ECTG atomically fetches 8 bytes of storage from one location, PSADTSAV for z/OS, and subtracts the current CPU timer value from it and loads 8 bytes from a second location, TCBTTIME for z/OS, and loads a third value, which is the type of processor (GCP, zIIP, zAAP) for z/OS. Using these atomically obtained values the current CPU time can then be calculated. The calculations are *not* something to be attempted in application code as proper adjustments have to be made if all CP types do not run at the same speed.

If appropriate information was saved at dispatch time by other operating systems then ECTG could be used there as well.

Greg Dyck

unread,
Dec 21, 2016, 6:09:37 PM12/21/16
to
On 12/21/2016 2:59 PM, Charles Mills wrote:
> When does ECTG come along? I've got high-level notes on the various z architecture levels but the "extract-CPU-time facility" is not in my notes.

For those who still remember S360 architecture ;-), it is a recent
addition. z9-109, circa September 2005.

Regards,
Greg

Charles Mills

unread,
Dec 22, 2016, 9:10:47 AM12/22/16
to
Thanks. FWIW, that is our floor architecture level. I'm not sure if we have any customers on a z9 but I know we have customers on a z10.

And yes, I remember S/360 architecture. I still marvel every time I write an ICM.

Charles

-----Original Message-----
From: IBM Mainframe Discussion List [mailto:IBM-...@LISTSERV.UA.EDU] On Behalf Of Greg Dyck
Sent: Wednesday, December 21, 2016 6:09 PM
To: IBM-...@LISTSERV.UA.EDU
Subject: Re: COBOL inline CPU timer

On 12/21/2016 2:59 PM, Charles Mills wrote:
> When does ECTG come along? I've got high-level notes on the various z architecture levels but the "extract-CPU-time facility" is not in my notes.

For those who still remember S360 architecture ;-), it is a recent addition. z9-109, circa September 2005.

0 new messages