Cobol Myth Busters

Robert

Aug 31, 2007, 10:22:35 PM
In the Micro Focus manual Server Express (2.2 & 4.0):Program Development, chapter 1 part 1
is titled Writing Efficient Programs. Its top billing tells us they think speed is a Very
Important Topic we should know about. For fun, I put their advice to the test.

The machine I used is a high-end HP Superdome with 64 PA (RISC) processors. Of course,
the Cobol test program was only using one of them. For general reference, other timing
tests showed mid-range Sun SPARC CPUs to be 3 times faster than the PA, and HP Superdomes
with Itaniums to be 6-10 times faster. Despite that, customer demand forced HP to rescind
its decision to obsolete the PA. These tests were run on a 'new generation' PA.

I added a few comparisons that are not from the MF manual, but are widely believed in the
Cobol community. They are styled "Legacy:". Execution times are in microseconds (us), with
a resolution of plus or minus 5. I'll describe the timing methodology toward the end; for
now, take my word that the speeds are accurate.

Proposition: Use simple two-operand arithmetic statements wherever possible.

Test:
05 binary-number binary pic s9(09) sync.

add 1 to binary-number *> 1 us
compute binary-number = binary-number + 1 *> 1 us

add 1 to binary-number
multiply 5 by binary-number
divide 5 into binary-number *> 50 us

compute binary-number = ((binary-number + 1) * 5) / 5 *> 445 us

Finding: busted for simple cases, confirmed for cases with more than one operation.

Proposition: "Do not use the REMAINDER, ROUNDED, ON SIZE ERROR or CORRESPONDING phrases if
you want the fastest performance. No optimization is done on arithmetic statements if the
ON SIZE ERROR phrase is used. For this reason, we recommend you do not use this phrase if
high performance is required. The ROUNDED phrase impacts performance, but it is generally
faster to use ROUNDED than try to round the result using your own routine. "

Test:
compute binary-number rounded = binary-number + 1 *> 1 us (no penalty)
add 1 to binary-number *> 15 us
on size error display 'overflow'
end-add

Finding: busted for rounded, confirmed for size error.

Legacy belief: indexes are faster than subscripts

Test:
05 s-subscript binary pic s9(09) sync.
01 misaligned-area sync.
    05 array-element occurs 4096 indexed x-index.
        10 misaligned-number comp-5 pic s9(09).
        10 to-cause-misalignment pic x(01).
move array-element (s-subscript) to test-byte *> 3 us
move array-element (x-index) to test-byte *> 6 us

Finding: BUSTED. Index is actually slower.

Proposition: When incrementing or decrementing a counter, terminate it with a literal
value rather than a value held in a data item. For example, to execute a loop n times, set
the counter to n and then decrement the counter until it becomes zero, rather than
incrementing the counter from zero to n.

Test:
perform varying binary-number from 10 by -1 until binary-number = 0 *> 150 us
perform varying binary-number from 1 by 1 until binary-number > 10 *> 154 us

Finding: BUSTED

Proposition: Access to tables defined with OCCURS ... DEPENDING is less efficient than
access to tables of fixed size, and so should be avoided where high performance is needed.

Test:

01 depending-area.
    05 depending-element occurs 1 to 4096 depending on binary-number.
        10 comp-5 pic s9(09).
        10 pic x(01).
move array-element (s-subscript) to test-byte *> 3 us
move depending-element (s-subscript) to test-byte *> 3 us

Finding: BUSTED

Proposition: Arithmetic on COMP-3 data items is performed in packed decimal and is much
slower than arithmetic on COMP items. It should be avoided.

Test:
05 display-number pic 9(09).
05 packed-number comp-3 pic s9(09).

add 1 to display-number *> 174 us
add 1 to packed-number *> 160 us

Finding: CONFIRMED. Packed is almost as slow as display. It was fast on 1970-era
mainframes. There is no longer any reason to use it. If you want to save space, look at
space-filled strings and filler-padding.

To be continued with the most unexpected and interesting case: does aligning numbers on
memory boundaries matter?

Roger While

Sep 1, 2007, 5:56:43 AM
Absolute rubbish.
You need to do an inline PERFORM of
at least a million iterations to determine this.
In fact, BINARY (aka COMP) is big endian (generally).
So anyway your tests are invalid.
(They force an endian swap on little endian)

You should be using BINARY-LONG (aka COMP-5)

Alignment DOES matter on machines where
this is not tolerated.

I have done all this machine stuff with OC.

Roger

"Robert" <n...@e.mail> wrote in message
news:dldhd39vccjgdgs3g...@4ax.com...

Robert

Sep 1, 2007, 8:25:52 AM
On Sat, 1 Sep 2007 11:56:43 +0200, "Roger While" <si...@sim-basis.de> wrote:

>Absolute rubbish.

Thanks for the erudite rebuttal.

>You need to do an inline PERFORM of
>at least a million iterations to determine this.

I did 100 million.

>In fact, BINARY (aka COMP) is big endian (generally).
>So anyway your tests are invalid.
>(They force an endian swap on little endian)

The PA processor is little endian, so there's no difference between BINARY and COMP-5.
Try again when someone posts timing tests on an Intel or Alpha.

>You should be using BINARY-LONG (aka COMP-5)

I didn't post a comparison because most people find no difference boring.

>Alignment DOES matter on machines where
>this is not tolerated.

Modern machines have two or three levels of cache between the CPU and memory. There are no
alignment issues in a cache. But compilers that THINK alignment is important shoot
themselves in the foot by generating extra instructions to align the number to speed things
up. The extra instructions are counterproductive; they actually slow things down.
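
For readers following the alignment argument, here is a minimal C sketch of what "extra instructions" for an unaligned number can look like. It is not taken from any compiler discussed in the thread; the record layout and the function names are made up for illustration. The pad byte plays the role of to-cause-misalignment in the COBOL test data, and the byte-wise copy is the portable way to touch a 32-bit number at an odd address. Whether that copy costs anything on a given CPU is exactly what the posted timings are trying to measure.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record: one pad byte pushes the number to an odd offset. */
enum { NUMBER_OFFSET = 1, RECORD_SIZE = 5 };

/* Read a 32-bit value from any byte address without assuming alignment.
   memcpy is the portable idiom; compilers commonly reduce it to a plain
   load on CPUs that tolerate unaligned access. */
int32_t load_unaligned_i32(const unsigned char *p)
{
    int32_t value;
    assert(p != NULL);
    memcpy(&value, p, sizeof value);
    return value;
}

/* Store a 32-bit value at any byte address, again without assuming alignment. */
void store_unaligned_i32(unsigned char *p, int32_t value)
{
    assert(p != NULL);
    memcpy(p, &value, sizeof value);
}
```

A caller would declare `unsigned char record[RECORD_SIZE]` and pass `record + NUMBER_OFFSET` to both functions.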

>I have done all this machine stuff with OC.

What's OC?

Roger While

Sep 1, 2007, 8:40:01 AM
"Robert" <n...@e.mail> wrote in message
news:0ukid3h951nksjv34...@4ax.com...

> On Sat, 1 Sep 2007 11:56:43 +0200, "Roger While" <si...@sim-basis.de>
> wrote:
>
>>Absolute rubbish.
>
> Thanks for the erudite rebuttal.
>
>>You need to do an inline PERFORM of
>>at least a million iterations to determine this.
>
> I did 100 million.

Super, post the program.
Do not do calculations in your head:-)

>
>>In fact, BINARY (aka COMP) is big endian (generally).
>>So anyway your tests are invalid.
>>(They force an endian swap on little endian)
>
> The PA processor is little endian, so there's no difference between BINARY
> and COMP-5.
> Try again when someone posts timing tests on an Intel or Alpha.
>
>>You should be using BINARY-LONG (aka COMP-5)
>
> I didn't post a comparison because most people find no difference boring.
>

Really, this IS a major issue when doing big/little-endian.

>>Alignment DOES matter on machines where
>>this is not tolerated.
>
> Modern machines have two or three levels of cache between the CPU and
> memory. There are no
> alignment issues in a cache. But compilers that THINK alignment is
> important shoot
> themselves in the foot by generating extra instructions to align the number
> to speed things
> up. The extra instructions are counterproductive, they actually slow
> things down.
>
>>I have done all this machine stuff with OC.
>
> What's OC?
>

Follow the links here :-)

Roger


Pete Dashwood

Sep 1, 2007, 9:15:28 AM

"Robert" <n...@e.mail> wrote in message
news:0ukid3h951nksjv34...@4ax.com...

It is Open COBOL. Roger is one of the people working on it.

I have no axe to grind, but I did find some of your results eyebrow-raising.
Have you thought carefully about exactly how "unbiased" your tests are?

Pete.
--
"I used to write COBOL...now I can do anything."


Robert

Sep 1, 2007, 3:35:55 PM
On Sat, 1 Sep 2007 14:40:01 +0200, "Roger While" <si...@sim-basis.de> wrote:

>"Robert" <n...@e.mail> wrote in message
>news:0ukid3h951nksjv34...@4ax.com...
>> On Sat, 1 Sep 2007 11:56:43 +0200, "Roger While" <si...@sim-basis.de>
>> wrote:

>>>In fact, BINARY (aka COMP) is big endian (generally).
>>>So anyway your tests are invalid.
>>>(They force an endian swap on little endian)
>>
>> The PA processor is little endian, so there's no difference between BINARY
>> and COMP-5.
>> Try again when someone posts timing tests on an Intel or Alpha.
>>
>>>You should be using BINARY-LONG (aka COMP-5)
>>
>> I didn't post a comparison because most people find no difference boring.
>>
>
>Really, this IS a major issue when doing big/little-endian.

Conversion is a MINOR issue. It takes one instruction -- xchg al,ah -- for 16 bit and
three instructions -- xchg ah,al; ror eax,16; xchg ah,al -- for 32 bit.
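
For readers who do not read x86 assembler, here is a minimal C sketch of the conversion being described. The function names are illustrative and not taken from any compiler's runtime. The 32-bit version deliberately follows the same three steps as the instruction sequence quoted above: swap the low two bytes, rotate by 16 bits, swap the low two bytes again.

```c
#include <assert.h>
#include <stdint.h>

/* Swap the two bytes of a 16-bit value (the job of xchg al,ah). */
uint16_t swap16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}

/* Rotate a 32-bit value by 16 bits (the job of ror eax,16). */
static uint32_t rotate16(uint32_t v)
{
    return (v << 16) | (v >> 16);
}

/* Swap only the low two bytes and leave the high half untouched. */
static uint32_t swap_low_bytes(uint32_t v)
{
    return (v & 0xFFFF0000u) | swap16((uint16_t)(v & 0xFFFFu));
}

/* Reverse all four bytes using the three steps named in the post. */
uint32_t swap32(uint32_t v)
{
    uint32_t r = swap_low_bytes(v);
    r = rotate16(r);
    r = swap_low_bytes(r);
    assert((r >> 24) == (v & 0xFFu)); /* sanity check: low byte ended up on top */
    return r;
}
```

Applying `swap32` twice returns the original value, which is a quick way to check a conversion routine of this kind.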

How do you handle bi-endian machines such as Itanium and PowerPC? The compiler doesn't
know the machine's state at execution time. A compiler running under Linux thinks the
Itanium is big endian. An LPAR running HP-UX on the same machine sees the world as little
endian. Conversions are handled by an emulator, not the compiler.

>>>Alignment DOES matter on machines where
>>>this is not tolerated.
>>
>> Modern machines have two or three levels of cache between the CPU and
>> memory. There are no
>> alignment issues in a cache. But compilers that THINK alignment is
>> important shoot
>> themselves in the foot by generating extra instructions to align the number
>> to speed things
>> up. The extra instructions are counterproductive, they actually slow
>> things down.
>>
>>>I have done all this machine stuff with OC.

Things change. The PA alignment instructions speeded things up in the late '80s. Now they
slow things down, especially on machines with an L2 cache such as the PA-7300 and 8800.

As religious wars rage, the poor compiler is forever playing catchup. This is a good
reason to separate code generation from the compiler, as done by GCC and Mercury.
I see OC does that by using the GCC C compiler as its back end. The problem with that
approach is you can't generate inline code for Cobol things that have no corresponding C
syntax. For instance, a SEARCH or STRING looking for a one byte delimiter on Intel SHOULD
generate an inline REPNE SCASB. There's no way to say that in C; you have to call a
function.
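
A minimal sketch of the "call a function" route in C, using a hypothetical helper name (find_delimiter). The standard library's memchr does the single-byte scan. Whether a particular compiler or C library expands that call inline, or uses a scan instruction underneath, is an implementation detail this thread does not establish.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return the number of bytes before the first occurrence of delim in
   buf[0..len), or len if the delimiter does not occur. */
size_t find_delimiter(const char *buf, size_t len, char delim)
{
    const char *hit;
    assert(buf != NULL);
    hit = memchr(buf, (unsigned char)delim, len);
    return hit != NULL ? (size_t)(hit - buf) : len;
}
```

For example, scanning the ten bytes of "SMITH,JOHN" for a comma yields 5, the length of the field before the delimiter.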

Doug Miller

Sep 1, 2007, 5:26:20 PM
In article <dldhd39vccjgdgs3g...@4ax.com>, Robert <n...@e.mail> wrote:
[snip]

>Proposition: Use simple two-operand arithmetic statements wherever possible.
>
>Test:
>05 binary-number binary pic s9(09) sync.
>
>add 1 to binary-number *> 1 us
>compute binary-number = binary-number + 1 *> 1 us
>
>add 1 to binary-number
>multiply 5 by binary-number
>divide 5 into binary-number
> *> 50 us
>
>compute binary-number = ((binary-number + 1) * 5) / 5 *> 445 us
>
>Finding: busted for simple cases, confirmed for cases with more than one
> operation.

Correct interpretation of findings:

Unconfirmed for a single case consisting of a single operation.

Confirmed for a *single*case* (not cases, plural, as incorrectly stated) with
one operation involving integer arithmetic.

The testing conducted was insufficient, in terms both of types and of
numbers of cases, to permit any valid conclusions to be drawn. Further testing
with larger numbers of simple, complex, and intermediate cases involving both
integers and decimal fractions, with varying USAGEs, needed in order to draw
any valid conclusions.

>Proposition: "Do not use the REMAINDER, ROUNDED, ON SIZE ERROR or CORRESPONDING
> phrases if
>you want the fastest performance. No optimization is done on arithmetic statements if the
>ON SIZE ERROR phrase is used. For this reason, we recommend you do not use this phrase if
>high performance is required. The ROUNDED phrase impacts performance, but it is generally
>faster to use ROUNDED than try to round the result using your own routine. "
>
>Test:
>compute binary-number rounded = binary-number + 1 *> 1 us (no penalty)
>add 1 to binary-number *> 15 us
> on size error display 'overflow'
>end-add
>
>Finding: busted for rounded, confirmed for size error.

Correct interpretation of findings:

As with the previous "test", the testing conducted was insufficient to permit
any valid conclusions to be drawn.

Proposition is confirmed with respect to SIZE ERROR in one simple case.
Additional tests needed, using a variety of PICtures and USAGEs, to determine
whether this case is the general rule, or a fortuitous exception.

Valid test needed to determine effect with ROUNDED. _Of_course_ there's no
penalty for using ROUNDED on an *integer* operation. Why would you expect
otherwise, and why would you expect this test to tell you anything at all?

Effects of REMAINDER and CORRESPONDING not tested.

>Legacy belief: indexes are faster than subscripts
>
>Test:
>05 s-subscript binary pic s9(09) sync.
>01 misaligned-area sync.
> 05 array-element occurs 4096 indexed x-index.
> 10 misaligned-number comp-5 pic s9(09).
> 10 to-cause-misalignment pic x(01).
>move array-element (s-subscript) to test-byte *> 3 us
>move array-element (x-index) to test-byte *> 6 us
>
>Finding: BUSTED. Index is actually slower.

Correct interpretation of findings:

Too many variable conditions are present to allow valid conclusions
to be drawn. Additional tests needed to determine outcome, specifically (but
not necessarily limited to):

a) second test should be conducted with properly aligned data items, to
eliminate misalignment as a contributing factor;

b) third test should be conducted with USAGE DISPLAY data items, to eliminate
all alignment issues as contributing factors;

c) fourth test should be conducted using separate arrays for the subscripted
and indexed accesses, to eliminate INDEXED BY in the definition of the array
as a possible factor in speeding up subscripted access.

>
>Proposition: When incrementing or decrementing a counter, terminate it with a literal
>value rather than a value held in a data item. For example, to execute a loop n times, set
>the counter to n and then decrement the counter until it becomes zero, rather than
>incrementing the counter from zero to n.
>
>Test:
>perform varying binary-number from 10 by -1 until binary-number = 0 *> 150 us
>perform varying binary-number from 1 by 1 until binary-number > 10 *> 154 us
>
>Finding: BUSTED

Correct interpretation of findings:

Baloney. The proposition was not tested at all. Each test case compared the
counter to a literal, not to a data item, so it's hardly surprising that the
difference is so small.


>
>Proposition: Access to tables defined with OCCURS ... DEPENDING is less efficient than
>access to tables of fixed size, and so should be avoided where high performance is needed.
>
>Test:
>
>01 depending-area.
> 05 depending-element occurs 1 to 4096 depending on binary-number.
> 10 comp-5 pic s9(09).
> 10 pic x(01).
>move array-element (s-subscript) to test-byte *> 3 us
>move depending-element (s-subscript) to test-byte *> 3 us
>
>Finding: BUSTED

Correct interpretation of findings:

The testing conducted is grossly insufficient to permit any valid conclusions
to be drawn.

The *reporting* of what little testing was done is *also* grossly insufficient
to permit assessing the validity of that minimal testing. Specifically, it is
necessary to see the definitions of array-element, s-subscript, and test-byte.

Additional testing is needed, including (but not necessarily limited to):

a) Examine the results of comparing
OCCURS 10 vs. OCCURS 1 TO 10
OCCURS 100 vs. OCCURS 1 TO 100
OCCURS 1000 vs. OCCURS 1 TO 1000
etc. to determine if array size has any effect. Make sure that at least one
of these tests uses the largest array size permitted by the compiler.

b) Examine the results of comparing OCCURS 1000 vs OCCURS 1 TO 1000 vs OCCURS
500 TO 1000, e.g, and other similar tests, to determine if the *lower* bound
has any effect. Again, make sure that at least one of these tests uses the
largest array size permitted by the compiler.

c) Compare the results of
OCCURS 4000 vs. OCCURS 1 TO 4000
with the results of
OCCURS 4096 vs. OCCURS 1 TO 4096
to eliminate the [admittedly unlikely] possibility that the array size being
an exact power of two has anything to do with the results.

d) Repeat the one test conducted, changing the USAGEs of all data elements to
DISPLAY, to eliminate alignment issues as a contributing factor.

e) Repeat the tests described in a) and b) above, varying the USAGE of the
DEPENDING item to determine what, if any, difference this makes.

>Proposition: Arithmetic on COMP-3 data items is performed in packed decimal and is much
>slower than arithmetic on COMP items. It should be avoided.
>
>Test:
>05 display-number pic 9(09).
>05 packed-number comp-3 pic s9(09).
>
>add 1 to display-number *> 174 us
>add 1 to packed-number *> 160 us
>
>Finding: CONFIRMED. Packed is almost as slow as display. It was fast on 1970-era
>mainframes. There is no longer any reason to use it. If you want to save space, look at
>space-filled strings and filler-padding.

Correct interpretation of finding: Baloney. The proposition was not tested at
all, and no valid finding is possible.

Valid test needs to be conducted, comparing the execution speed of
instructions involving COMP-3 vs COMP data, rather than COMP-3 vs DISPLAY, and
using data items with identical PICtures.

>To be continued with the most unexpected and interesting case: does aligning numbers on
>memory boundaries matter?

Perhaps this time you can manage to devise some valid, *comprehensive* test
cases, conduct them properly, report them completely, and interpret the
results correctly.

--
Regards,
Doug Miller (alphageek at milmac dot com)

It's time to throw all their damned tea in the harbor again.

Alistair

Sep 1, 2007, 6:04:32 PM
I have no wish to criticise your findings but I do have two points to
make:

Robert wrote:
>
> Proposition: Use simple two-operand arithmetic statements wherever possible.
>
> Test:
> 05 binary-number binary pic s9(09) sync.
>
> add 1 to binary-number *> 1 us
> compute binary-number = binary-number + 1 *> 1 us
>
> add 1 to binary-number
> multiply 5 by binary-number
> divide 5 into binary-number *> 50 us
>
> compute binary-number = ((binary-number + 1) * 5) / 5 *> 445 us
>
> Finding: busted for simple cases, confirmed for cases with more than one operation.

This proposition, I believe, derived from the early days when (perhaps
DD can cast his mind back that far and confirm it for us?) the COMPUTE
verb was shown to be less efficient than a multitude of other verbs
that accomplished the same task. Times move on and the COMPUTE verb is
no longer as inefficient as it was once. Personally, I would prefer to
use a complex COMPUTE rather than a series of simple verbs as I
believe that the COMPUTE, because it most closely resembles the
equation being represented, is a better form of self-documentation.

>
> Proposition: "Do not use the REMAINDER, ROUNDED, ON SIZE ERROR or CORRESPONDING phrases if
> you want the fastest performance. No optimization is done on arithmetic statements if the
> ON SIZE ERROR phrase is used. For this reason, we recommend you do not use this phrase if
> high performance is required. The ROUNDED phrase impacts performance, but it is generally
> faster to use ROUNDED than try to round the result using your own routine. "
>
>

> Finding: BUSTED. Index is actually slower.

I'm not really fussed by this but I think that making the algorithm
more obvious has some benefits so I am happy using these clauses (and
would be even where proven to be inefficient).

>
> Proposition: When incrementing or decrementing a counter, terminate it with a literal
> value rather than a value held in a data item. For example, to execute a loop n times, set
> the counter to n and then decrement the counter until it becomes zero, rather than
> incrementing the counter from zero to n.
>
> Test:
> perform varying binary-number from 10 by -1 until binary-number = 0 *> 150 us
> perform varying binary-number from 1 by 1 until binary-number > 10 *> 154 us
>
> Finding: BUSTED

The difference here is going to be so minuscule as to be hardly worth
mentioning. The reason you use a literal is because, if you use a data-
item, the format and location of the data-item has to be recalculated
on each execution of the loop. However, you are talking about a few
cpu cycles per loop. So no real problem. If you are bothered about cpu
cycles, then you should avoid ODO and anything else that requires the
machine to calculate displacements.

>
> Proposition: Access to tables defined with OCCURS ... DEPENDING is less efficient than
> access to tables of fixed size, and so should be avoided where high performance is needed.
>
>

> Finding: BUSTED

I don't think that there is much difference in calculating the
location of a data-item located in an ODO table as compared to a fixed
table size. You would be better off using specific data items (eg
data-1, data-2, data-3....) if those few cpu cycles bother you that
much.

>
> Proposition: Arithmetic on COMP-3 data items is performed in packed decimal and is much
> slower than arithmetic on COMP items. It should be avoided.
>
>

> Finding: CONFIRMED. Packed is almost as slow as display. It was fast on 1970-era
> mainframes. There is no longer any reason to use it. If you want to save space, look at
> space-filled strings and filler-padding.
>

This one I find very interesting, as it is actually machine dependent.
What I mean is that on an IBM mainframe, where PACKED is one of the
machine implementations then PACKED DECIMAL operations will, probably,
be faster than ZONED DECIMAL but not as fast as BINARY operations.
However, where PACKED is not a NATIVE mode then you will find that
the additional requirement to convert between data formats will
severely impact any operation using the non-native PACKED mode. I can
confirm this because I tried this experiment using BINARY and PACKED
data on Natural programs (Yes, I know it isn't Cobol but....) running
on a dedicated PC and the PACKED code ran slowly. No surprise.

> To be continued with the most unexpected and interesting case: does aligning numbers on
> memory boundaries matter?

YES!!!!! If you use Assembler, IT DOES.

Alistair

Sep 1, 2007, 6:10:27 PM

Robert wrote:
> In the Micro Focus manual Server Express (2.2 & 4.0):Program Development, chapter 1 part 1
> is titled Writing Efficient Programs. Its top billing tells us they think speed is a Very
> Important Topic we should know about. For fun, I put their advice to the test.
>

< BIG SNIP >


I should have added that, in my not-very-humble opinion, the machine
cycles used are much less important to me than the ease of maintenance
of the code. It costs pennies (alright, it can cost a fortune if the
code is really shite) to run inefficient code but it costs pounds (or
2*dollars) to maintain it.

Efficient coding should be encouraged and rewarded. Coding according
to outdated and mis-understood standards should be discouraged. In-
line documentation should be encouraged. Those three will save pounds
(or 2*dollars) and headaches.

Jeff Campbell

Sep 1, 2007, 6:15:22 PM
Robert wrote:
> On Sat, 1 Sep 2007 11:56:43 +0200, "Roger While" <si...@sim-basis.de> wrote:
>
>> Absolute rubbish.
>
> Thanks for the erudite rebuttal.
>
>> You need to do an inline PERFORM of
>> at least a million iterations to determine this.
>
> I did 100 million.
>
>> In fact, BINARY (aka COMP) is big endian (generally).
>> So anyway your tests are invalid.
>> (They force an endian swap on little endian)
>
> The PA processor is little endian, so there's no difference between BINARY and COMP-5.
> Try again when someone posts timing tests on an Intel or Alpha.

Alphas are bi-endian. That is, the chip supports running in either mode.
The DEC OSs running on Alphas, OpenVMS, Tru64 UNIX and the port of WNT4,
run the CPU(s) little endian. The Linux distributions I have used on
Alphas, Red Hat, Debian and SuSE, are also little endian.

If you can post your test code I'll post the results I get on my
PWS 600au running VMS.

>
>> You should be using BINARY-LONG (aka COMP-5)
>
> I didn't post a comparison because most people find no difference boring.
>
>> Alignment DOES matter on machines where
>> this is not tolerated.
>
> Modern machines have two or three levels of cache between the CPU and memory. There are no
> alignment issues in a cache. But compilers that THINK alignment is important shoot
> themselves in the foot by generating extra instructions to align the number to speed things
> up. The extra instructions are counterproductive, they actually slow things down.
>
>> I have done all this machine stuff with OC.
>
> What's OC?
>


Jeff


Robert

Sep 1, 2007, 8:42:09 PM
On Sat, 01 Sep 2007 16:15:22 -0600, Jeff Campbell <n8...@arrl.net> wrote:


>If you can post your test code I'll post the results I get on my
>PWS 600au running VMS.

Here it is:


* ---------------------------------------------------------------------
* Findings
* Aligned                1
* Unaligned             15
* Misaligned (1)         5
* Misaligned (2)         4
* Binary                 1
* Linkage               30
* Compute n=n+1          1
* Rounded                1
* size error            18
* Display              174
* Packed               160
* Arithmetic            50
* Compute              445
* Index                  6
* Subscript              3
* Depending              3
* Evaluate true          2
* Evaluate expression    3
* Go to depending        7
* Evaluate case         11
* Initialize           346
* Move zeros           339
* Dec to zero          149
* Inc to 10            154

$SET SOURCEFORMAT"FREE"
$SET NOBOUND
$SET OPT"2"
$SET NOTRUNC
$SET IBMCOMP
$SET NOCHECK
$SET ALIGN"8"
identification division.
program-id. Speed1.
author. Robert Wxagner.

data division.
working-storage section.
01 test-data.
    05 comp5-number comp-5 pic s9(09) sync.
    05 test-byte pic x(01).
    05 unaligned-number comp-5 pic s9(09).
    05 pic x(03).

    05 binary-number binary pic s9(09) sync.

    05 display-number pic 9(09).
    05 packed-number comp-3 pic s9(09).

    05 s-subscript binary pic s9(09) sync.

01 depending-area.
    05 depending-element occurs 1 to 4096 depending on binary-number.
        10 comp-5 pic s9(09).
        10 pic x(01).

01 misaligned-area sync.
    05 array-element occurs 4096 indexed x-index.
        10 misaligned-number comp-5 pic s9(09).
        10 to-cause-misalignment pic x(01).

01 timer-variables.
    05 test-name pic x(30).
    05 repeat-factor value 100000000 binary pic s9(09).
    05 current-date-structure.
        10 pic x(08).
        10 time-now-hhmmsshh.
            15 hours pic 9(02).
            15 minutes pic 9(02).
            15 seconds pic 9(02).
            15 hundredths pic 9(02).
        10 pic x(05).
    05 time-now pic 9(06)v99.
    05 time-start pic 9(06)v99.
    05 timer-overhead value zero pic 9(06)v99.
    05 elapsed-time pic s9(06)v99.
    05 elapsed-time-display.
        10 elapsed-time-edited pic z(05).

linkage section.
01 linkage-number binary pic s9(09) sync.

procedure division.

initialize test-data, misaligned-area

move 'Null test' to test-name
perform timer-on
perform timer-on
perform repeat-factor times
exit perform cycle
end-perform
perform timer-off
compute timer-overhead = (time-now - time-start)

move 'Aligned' to test-name
perform timer-on
perform repeat-factor times
add 1 to comp5-number
exit perform cycle
end-perform
perform timer-off

move 'Unaligned' to test-name
perform timer-on
perform repeat-factor times
add 1 to unaligned-number
exit perform cycle
end-perform
perform timer-off

move 'Misaligned (1)' to test-name
move 1 to s-subscript
perform timer-on
perform repeat-factor times
add 1 to misaligned-number (s-subscript)
exit perform cycle
end-perform
perform timer-off

move 'Misaligned (2)' to test-name
*> if this is faster than Unaligned above,
*> compiler generated alignment code is slowing things down
move 2 to s-subscript
perform timer-on
perform repeat-factor times
add 1 to misaligned-number (s-subscript)
exit perform cycle
end-perform
perform timer-off

move 'Binary' to test-name
move zero to binary-number
perform timer-on
perform repeat-factor times
add 1 to binary-number
exit perform cycle
end-perform
perform timer-off

move 'Linkage' to test-name
set address of linkage-number to address of binary-number
move zero to linkage-number
perform timer-on
perform repeat-factor times
add 1 to linkage-number
exit perform cycle
end-perform
perform timer-off

move 'Compute n=n+1' to test-name
move zero to binary-number
perform timer-on
perform repeat-factor times

compute binary-number = binary-number + 1

exit perform cycle
end-perform
perform timer-off

move 'Rounded' to test-name
move zero to binary-number
perform timer-on
perform repeat-factor times

compute binary-number rounded = binary-number + 1

exit perform cycle
end-perform
perform timer-off

move 'size error' to test-name
move zero to binary-number
perform timer-on
perform repeat-factor times
add 1 to binary-number

on size error display 'overflow'
end-add

exit perform cycle
end-perform
perform timer-off

move 'Display' to test-name
perform timer-on
perform repeat-factor times
*> add 1 to display-number
exit perform cycle
end-perform
perform timer-off

move 'Packed' to test-name
perform timer-on
perform repeat-factor times
*> add 1 to packed-number
exit perform cycle
end-perform
perform timer-off

move 'Arithmetic' to test-name
move zero to binary-number
perform timer-on
perform repeat-factor times


add 1 to binary-number
multiply 5 by binary-number
divide 5 into binary-number

exit perform cycle
end-perform
perform timer-off

move 'Compute' to test-name
move zero to binary-number
divide 10 into repeat-factor
perform timer-on
perform repeat-factor times

compute binary-number = ((binary-number + 1) * 5) / 5

exit perform cycle
end-perform
perform timer-off
multiply 10 by repeat-factor

move 'Index' to test-name
set x-index to 1000
perform timer-on
perform repeat-factor times

move array-element (x-index) to test-byte

exit perform cycle
end-perform
perform timer-off

move 'Subscript' to test-name
move 1000 to s-subscript
perform timer-on
perform repeat-factor times

move array-element (s-subscript) to test-byte

exit perform cycle
end-perform
perform timer-off

move 'Depending' to test-name
move 2000 to binary-number
move 1000 to s-subscript
perform timer-on
perform repeat-factor times

move depending-element (s-subscript) to test-byte

exit perform cycle
end-perform
perform timer-off

move 'Evaluate true' to test-name
move zero to binary-number
perform timer-on
perform repeat-factor times
evaluate true
when binary-number equal to zero
exit perform cycle
when other
display 'error'
end-evaluate
end-perform
perform timer-off

move 'Evaluate expression' to test-name
move zero to binary-number
perform timer-on
perform repeat-factor times
evaluate binary-number
when zero
exit perform cycle
when other
display 'error'
end-evaluate
end-perform
perform timer-off

move 'Go to depending' to test-name
move 2 to binary-number
perform timer-on
perform go-depending-test repeat-factor times
perform timer-off

move 'Evaluate case' to test-name
move 2 to binary-number
perform timer-on
perform evaluate-case-test repeat-factor times
perform timer-off

move 'Initialize' to test-name
perform timer-on
perform repeat-factor times
initialize test-data
exit perform cycle
end-perform
perform timer-off

move 'Move zeros' to test-name
perform timer-on
perform repeat-factor times
move zeros to
comp5-number
test-byte
unaligned-number
binary-number
display-number
packed-number
s-subscript
exit perform cycle
end-perform
perform timer-off


move 'Dec to zero' to test-name
perform timer-on
perform repeat-factor times

perform varying binary-number from 10 by -1 until binary-number = 0

end-perform
exit perform cycle
end-perform
perform timer-off

move 'Inc to 10' to test-name
perform timer-on
perform repeat-factor times

perform varying binary-number from 1 by 1 until binary-number > 10

end-perform
exit perform cycle
end-perform
perform timer-off

goback

. go-depending-test section.
go to p1 p2 p3 depending on binary-number
display 'error'
. p1. display 'error'
. p2. exit section
. p3. display 'error'
. evaluate-case-test section.
evaluate binary-number
when 1
display 'error'
when 2
exit section
when other
display 'error'
end-evaluate

. end-of-previous section
. timer-on.
perform read-the-time
move time-now to time-start
. timer-off.
perform read-the-time
compute elapsed-time rounded = ((time-now - time-start)
* 100000000 / repeat-factor)
- timer-overhead

if elapsed-time not greater than zero
move 'error' to elapsed-time-display
else
compute elapsed-time-edited rounded = elapsed-time * 10
end-if
display test-name elapsed-time-display
. read-the-time.
accept time-now-hhmmsshh from time
*> move function current-date to current-date-structure
compute time-now =
((((hours * 60) +
minutes) * 60) +
seconds) +
(hundredths / 100)
.

Pete Dashwood

Sep 1, 2007, 9:24:00 PM
A very professional appraisal, Doug.

I look forward to Robert's further tests, if he has time or inclination to
run them.

Pete.
--
"I used to write COBOL...now I can do anything."

TOP POST - nothing new below...

"Doug Miller" <spam...@milmac.com> wrote in message
news:0OkCi.4660$JD....@newssvr21.news.prodigy.net...

docd...@panix.com

Sep 1, 2007, 9:32:16 PM
In article <1188684272.6...@22g2000hsm.googlegroups.com>,

Alistair <alis...@ld50macca.demon.co.uk> wrote:
>I have no wish to criticise your findings but I do have two points to
>make:
>
>Robert wrote:

[snip]

>> Finding: busted for simple cases, confirmed for cases with more than
>> one operation.
>
>This proposition, I believe, derived from the early days when (perhaps
>DD can cast his mind back that far and confirm it for us?) the COMPUTE
>verb was shown to be less efficient than a multitude of other verbs
>that accomplished the same task.

Mr Maclean, I recall being taught something like that lo, those many moons
ago... but I never tested it and I don't recall ever seeing a PMAP where a
COMPUTE was shown to be of lesser efficiency than simpler instructions.

My experiences are limited, of course, and my memory is, admittedly,
porous.

DD

Charles Hottel

Sep 1, 2007, 9:40:06 PM

"Alistair" <alis...@ld50macca.demon.co.uk> wrote in message
news:1188684627....@k79g2000hse.googlegroups.com...
>

<snip>

> code is really shite)

<snip>

Which meaning of the word are you using?
The word shite may refer to various things:

a.. A variant of the word shit
b.. A shi'ite, a person who practices the Shi'a Islam faith
c.. The shite, the principal character in a Japanese Noh play
d.. Shite, the person who performs the technique in aikido


Robert

Sep 1, 2007, 10:00:23 PM
On Sat, 01 Sep 2007 21:26:20 GMT, spam...@milmac.com (Doug Miller) wrote:

>In article <dldhd39vccjgdgs3g...@4ax.com>, Robert <n...@e.mail> wrote:
>[snip]
>>Proposition: Use simple two-operand arithmetic statements wherever possible.
>>
>>Test:
>>05 binary-number binary pic s9(09) sync.
>>
>>add 1 to binary-number *> 1 us
>>compute binary-number = binary-number + 1 *> 1 us
>>
>>add 1 to binary-number
>>multiply 5 by binary-number
>>divide 5 into binary-number
>> *> 50 us
>>
>>compute binary-number = ((binary-number + 1) * 5) / 5 *> 445 us
>>
>>Finding: busted for simple cases, confirmed for cases with more than one
>> operation.
>
>Correct interpretation of findings:
>
>Unconfirmed for a single case consisting of a single operation.
>
>Confirmed for a *single*case* (not cases, plural, as incorrectly stated) with
>one operation involving integer arithmetic.
>
>The testing conducted was insufficient, in terms both of types and of
>numbers of cases, to permit any valid conclusions to be drawn. Further testing
>with larger numbers of simple, complex, and intermediate cases involving both
>integers and decimal fractions, with varying USAGEs, needed in order to draw
>any valid conclusions.

I expected the compiler to compute 5/5=1 at compile time. Other compilers do. If it can't
do that, it's unlikely to do better with more complex cases.

>>Proposition: "Do not use the REMAINDER, ROUNDED, ON SIZE ERROR or CORRESPONDING
>> phrases if
>>you want the fastest performance. No optimization is done on arithmetic statements if the
>>ON SIZE ERROR phrase is used. For this reason, we recommend you do not use this phrase if
>>high performance is required. The ROUNDED phrase impacts performance, but it is generally
>>faster to use ROUNDED than try to round the result using your own routine. "
>>
>>Test:
>>compute binary-number rounded = binary-number + 1 *> 1 us (no penalty)
>>add 1 to binary-number *> 15 us
>> on size error display 'overflow'
>>end-add
>>
>>Finding: busted for rounded, confirmed for size error.
>
>Correct interpretation of findings:
>
>As with the previous "test", the testing conducted was insufficient to permit
>any valid conclusions to be drawn.
>
>Proposition is confirmed with respect to SIZE ERROR in one simple case.
>Additional tests needed, using a variety of PICtures and USAGEs, to determine
>whether this case is the general rule, or a fortuitous exception.
>
>Valid test needed to determine effect with ROUNDED. _Of_course_ there's no
>penalty for using ROUNDED on an *integer* operation. Why would you expect
>otherwise, and why would you expect this test to tell you anything at all?

It might have gone through the motions of unnecessary rounding. I'll add a case:
compute binary-number rounded = binary-number + .5
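A minimal sketch of how that extra case could sit beside the integer one. Everything except binary-number and repeat-factor is an assumed name, the clock only resolves hundredths of a second, a run that spans midnight is not handled, and free-format source is assumed (for example GnuCOBOL with cobc -x -free). It is an illustration, not the program used for the posted timings.

```cobol
identification division.
program-id. rounded-case.
data division.
working-storage section.
01 binary-number  binary pic s9(09) sync.
01 repeat-factor  binary pic s9(09) value 1000000.
01 test-name      pic x(20).
01 time-raw.
   05 t-hh        pic 9(02).
   05 t-mm        pic 9(02).
   05 t-ss        pic 9(02).
   05 t-cc        pic 9(02).
01 centis-now     binary pic s9(09).
01 centis-start   binary pic s9(09).
01 centis-used    pic z(8)9.
procedure division.
main-line.
    *> integer operand: there is no fraction to round
    move 'Rounded + 1' to test-name
    move zero to binary-number
    perform timer-on
    perform repeat-factor times
        compute binary-number rounded = binary-number + 1
    end-perform
    perform timer-off

    *> fractional operand: the result has to be rounded to an integer
    move 'Rounded + .5' to test-name
    move zero to binary-number
    perform timer-on
    perform repeat-factor times
        compute binary-number rounded = binary-number + .5
    end-perform
    perform timer-off
    goback
    .
timer-on.
    perform read-clock
    move centis-now to centis-start
    .
timer-off.
    *> total time for the whole loop, in hundredths of a second
    perform read-clock
    compute centis-used = centis-now - centis-start
    display test-name centis-used ' hundredths of a second'
    .
read-clock.
    accept time-raw from time
    compute centis-now = ((t-hh * 60 + t-mm) * 60 + t-ss) * 100 + t-cc
    .
```

If the two totals match, ROUNDED costs nothing even when there is a fraction to round; if the second is larger, the penalty only shows up when rounding has real work to do.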

>Effects of REMAINDER and CORRESPONDING not tested.

I would expect CORRESPONDING to be free. Why should it cost more than individual ADDs?
I did test INITIALIZE, another no-no according to the manual. As expected, it ran exactly
the same speed as individual MOVE ZERO and MOVE SPACEs.

REMAINDER should be cheap. The machine's divide instruction gives it for no additional
cost.
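Since the expectation about REMAINDER is stated without a measurement, here is a hedged sketch of how it could be checked. The data names, the divisor 7 and the dividend value are all assumptions chosen for illustration; the timer is the same coarse hundredths-of-a-second clock idea as in the test program, and free-format source is assumed.

```cobol
identification division.
program-id. remainder-case.
data division.
working-storage section.
01 dividend-number   binary pic s9(09) sync value 1000003.
01 quotient-number   binary pic s9(09) sync.
01 remainder-number  binary pic s9(09) sync.
01 repeat-factor     binary pic s9(09) value 1000000.
01 test-name         pic x(20).
01 time-raw.
   05 t-hh           pic 9(02).
   05 t-mm           pic 9(02).
   05 t-ss           pic 9(02).
   05 t-cc           pic 9(02).
01 centis-now        binary pic s9(09).
01 centis-start      binary pic s9(09).
01 centis-used       pic z(8)9.
procedure division.
main-line.
    *> baseline: quotient only
    move 'Divide' to test-name
    perform timer-on
    perform repeat-factor times
        divide dividend-number by 7 giving quotient-number
    end-perform
    perform timer-off

    *> same division, also keeping the remainder
    move 'Divide remainder' to test-name
    perform timer-on
    perform repeat-factor times
        divide dividend-number by 7 giving quotient-number
            remainder remainder-number
    end-perform
    perform timer-off
    goback
    .
timer-on.
    perform read-clock
    move centis-now to centis-start
    .
timer-off.
    *> total time for the whole loop, in hundredths of a second
    perform read-clock
    compute centis-used = centis-now - centis-start
    display test-name centis-used ' hundredths of a second'
    .
read-clock.
    accept time-raw from time
    compute centis-now = ((t-hh * 60 + t-mm) * 60 + t-ss) * 100 + t-cc
    .
```

Whether the remainder really comes for free depends on the code the compiler generates, so the two totals are the thing to compare.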

>>Legacy belief: indexes are faster than subscripts
>>
>>Test:
>>05 s-subscript binary pic s9(09) sync.
>>01 misaligned-area sync.
>> 05 array-element occurs 4096 indexed x-index.
>> 10 misaligned-number comp-5 pic s9(09).
>> 10 to-cause-misalignment pic x(01).
>>move array-element (s-subscript) to test-byte *> 3 us
>>move array-element (x-index) to test-byte *> 6 us
>>
>>Finding: BUSTED. Index is actually slower.
>
>Correct interpretation of findings:
>
>Too many variable conditions are present to allow valid conclusions
>to be drawn. Additional tests needed to determine outcome, specifically (but
>not necessarily limited to):
>
>a) second test should be conducted with properly aligned data items, to
>eliminate misalignment as a contributing factor;

It is moving one byte (test-byte pic x). Alignment is not an issue. The only variable is
index versus subscript.

>b) third test should be conducted with USAGE DISPLAY data items, to eliminate
>all alignment issues as contributing factors;
>
>c) fourth test should be conducted using separate arrays for the subscripted
>and indexed accesses, to eliminate INDEXED BY in the definition of the array
>as a possible factor in speeding up subscripted access.

Huh? How could INDEXED BY speed up a subscript?

>>Proposition: When incrementing or decrementing a counter, terminate it with a literal
>>value rather than a value held in a data item. For example, to execute a loop n times, set
>>the counter to n and then decrement the counter until it becomes zero, rather than
>>incrementing the counter from zero to n.
>>
>>Test:
>>perform varying binary-number from 10 by -1 until binary-number = 0 *> 150 us
>>perform varying binary-number from 1 by 1 until binary-number > 10 *> 154 us
>>
>>Finding: BUSTED
>
>Correct interpretation of findings:
>
>Baloney. The proposition was not tested at all. Each test case compared the
>counter to a literal, not to a data item, so it's hardly surprising that the
>difference is so small.

You're right. I'll change one of the limits to a variable.

To execute a loop n times, the code should read PERFORM n TIMES, not PERFORM VARYING. I'll
add that comparison as well.
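A sketch of what the revised comparison might look like: both PERFORM VARYING loops count upward so that the only difference is literal versus data-item limit, and PERFORM n TIMES is timed alongside them. loop-limit and the timer names are assumptions, the inner loops do nothing but CONTINUE, and free-format source is assumed.

```cobol
identification division.
program-id. loop-limit-case.
data division.
working-storage section.
01 binary-number  binary pic s9(09) sync.
01 loop-limit     binary pic s9(09) sync value 10.
01 repeat-factor  binary pic s9(09) value 1000000.
01 test-name      pic x(20).
01 time-raw.
   05 t-hh        pic 9(02).
   05 t-mm        pic 9(02).
   05 t-ss        pic 9(02).
   05 t-cc        pic 9(02).
01 centis-now     binary pic s9(09).
01 centis-start   binary pic s9(09).
01 centis-used    pic z(8)9.
procedure division.
main-line.
    *> limit is a literal
    move 'Limit literal' to test-name
    perform timer-on
    perform repeat-factor times
        perform varying binary-number from 1 by 1
                until binary-number > 10
            continue
        end-perform
    end-perform
    perform timer-off

    *> same loop, same direction, limit held in a data item
    move 'Limit data item' to test-name
    perform timer-on
    perform repeat-factor times
        perform varying binary-number from 1 by 1
                until binary-number > loop-limit
            continue
        end-perform
    end-perform
    perform timer-off

    *> no visible counter at all
    move 'Perform n times' to test-name
    perform timer-on
    perform repeat-factor times
        perform loop-limit times
            continue
        end-perform
    end-perform
    perform timer-off
    goback
    .
timer-on.
    perform read-clock
    move centis-now to centis-start
    .
timer-off.
    *> total time for the whole loop, in hundredths of a second
    perform read-clock
    compute centis-used = centis-now - centis-start
    display test-name centis-used ' hundredths of a second'
    .
read-clock.
    accept time-raw from time
    compute centis-now = ((t-hh * 60 + t-mm) * 60 + t-ss) * 100 + t-cc
    .
```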

>>Proposition: Access to tables defined with OCCURS ... DEPENDING is less efficient than
>>access to tables of fixed size, and so should be avoided where high performance is needed.
>>
>>Test:
>>
>>01 depending-area.
>> 05 depending-element occurs 1 to 4096 depending on binary-number.
>> 10 comp-5 pic s9(09).
>> 10 pic x(01).
>>move array-element (s-subscript) to test-byte *> 3 us
>>move depending-element (s-subscript) to test-byte *> 3 us
>>
>>Finding: BUSTED
>
>Correct interpretation of findings:
>
>The testing conducted is grossly insufficient to permit any valid conclusions
>to be drawn.

Is GROSSLY insufficient 144 times worse than simply insufficient?

>The *reporting* of what little testing was done is *also* grossly insufficient
>to permit assessing the validity of that minimal testing. Specifically, it is
>necessary to see the definitions of array-element, s-subscript, and test-byte.
>
>Additional testing is needed, including (but not necessarily limited to):
>
>a) Examine the results of comparing
>OCCURS 10 vs. OCCURS 1 TO 10
>OCCURS 100 vs. OCCURS 1 TO 100
>OCCURS 1000 vs. OCCURS 1 TO 1000
>etc. to determine if array size has any effect. Make sure that at least one
>of these tests uses the largest array size permitted by the compiler.

32 bit programs can address 4GB. There's no difference between OCCURS 10 and OCCURS 10000.
You seem to be thinking of 16 bit compilers that bust a gut allowing arrays bigger than
64K.

>b) Examine the results of comparing OCCURS 1000 vs OCCURS 1 TO 1000 vs OCCURS
>500 TO 1000, e.g, and other similar tests, to determine if the *lower* bound
>has any effect. Again, make sure that at least one of these tests uses the
>largest array size permitted by the compiler.

That's an issue for bounds checking, not for access. Bounds checking was turned off for
this speed test.

FWIW, if bounds checking had been on, Micro Focus doesn't use the DEPENDING item to do it.
It only checks whether the subscript/index is greater than the maximum, which is a
literal.

>c) Compare the results of
>OCCURS 4000 vs. OCCURS 1 TO 4000
>with the results of
>OCCURS 4096 vs. OCCURS 1 TO 4096
>to eliminate the [admittedly unlikely] possibility that the array size being
>an exact power of two has anything to do with the results.
>
>d) Repeat the one test conducted, changing the USAGEs of all data elements to
>DISPLAY, to eliminate alignment issues as a contributing factor.

The usage is one byte pic x. Don't complicate it.

>e) Repeat the tests described in a) and b) above, varying the USAGE of the
>DEPENDING item to determine what, if any, difference this makes.

Given that access speed is identical, it is obviously not using the DEPENDING item. Why
would it?

The point being made here is that avoiding DEPENDING is a false Cobol myth.

>>Proposition: Arithmetic on COMP-3 data items is performed in packed decimal and is much
>>slower than arithmetic on COMP items. It should be avoided.
>>
>>Test:
>>05 display-number pic 9(09).
>>05 packed-number comp-3 pic s9(09).
>>
>>add 1 to display-number *> 174 us
>>add 1 to packed-number *> 160 us
>>
>>Finding: CONFIRMED. Packed is almost as slow as display. It was fast on 1970-era
>>mainframes. There is no longer any reason to use it. If you want to save space, look at
>>space-filled strings and filler-padding.
>
>Correct interpretation of finding: Baloney. The proposition was not tested at
>all, and no valid finding is possible.
>
>Valid test needs to be conducted, comparing the execution speed of
>instructions involving COMP-3 vs COMP data, rather than COMP-3 vs DISPLAY, and
>using data items with identical PICtures.

The speed for COMP with identical picture was given above. It is 1.

The point is that COMP-3 (PACKED-DECIMAL) is an anachronism. It's very slow on any machine
except a mainframe, not to mention non-portable when stored in a file. I laugh at files
that contain packed numbers "to save space", then pad the record with a few hundred bytes
of filler.
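For anyone who wants the COMP and COMP-3 figures side by side from one run, a minimal sketch follows. It reuses the two pictures quoted above; the program name and timer items are assumptions, the clock resolves only hundredths of a second, and free-format source is assumed.

```cobol
identification division.
program-id. packed-case.
data division.
working-storage section.
01 binary-number  binary pic s9(09) sync.
01 packed-number  comp-3 pic s9(09).
01 repeat-factor  binary pic s9(09) value 1000000.
01 test-name      pic x(20).
01 time-raw.
   05 t-hh        pic 9(02).
   05 t-mm        pic 9(02).
   05 t-ss        pic 9(02).
   05 t-cc        pic 9(02).
01 centis-now     binary pic s9(09).
01 centis-start   binary pic s9(09).
01 centis-used    pic z(8)9.
procedure division.
main-line.
    *> binary item, picture s9(09)
    move 'Add to binary' to test-name
    move zero to binary-number
    perform timer-on
    perform repeat-factor times
        add 1 to binary-number
    end-perform
    perform timer-off

    *> packed-decimal item, identical picture
    move 'Add to packed' to test-name
    move zero to packed-number
    perform timer-on
    perform repeat-factor times
        add 1 to packed-number
    end-perform
    perform timer-off
    goback
    .
timer-on.
    perform read-clock
    move centis-now to centis-start
    .
timer-off.
    *> total time for the whole loop, in hundredths of a second
    perform read-clock
    compute centis-used = centis-now - centis-start
    display test-name centis-used ' hundredths of a second'
    .
read-clock.
    accept time-raw from time
    compute centis-now = ((t-hh * 60 + t-mm) * 60 + t-ss) * 100 + t-cc
    .
```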

>>To be continued with the most unexpected and interesting case: does aligning numbers on
>>memory boundaries matter?
>
>Perhaps this time you can manage to devise some valid, *comprehensive* test
>cases, conduct them properly, report them completely, and interpret the
>results correctly.

I said I did it for fun. Compiler companies run more comprehensive tests, but seldom
publish the results. They seem to think it's 'competitive information' and don't like
speed tests making them look bad.

Robert

Sep 1, 2007, 10:16:56 PM
On Sat, 01 Sep 2007 15:04:32 -0700, Alistair <alis...@ld50macca.demon.co.uk> wrote:

>I have no wish to criticise your findings but I do have two points to
>make:
>
>Robert wrote:
>>
>> Proposition: Use simple two-operand arithmetic statements wherever possible.
>>
>> Test:
>> 05 binary-number binary pic s9(09) sync.
>>
>> add 1 to binary-number *> 1 us
>> compute binary-number = binary-number + 1 *> 1 us
>>
>> add 1 to binary-number
>> multiply 5 by binary-number
>> divide 5 into binary-number *> 50 us
>>
>> compute binary-number = ((binary-number + 1) * 5) / 5 *> 445 us
>>
>> Finding: busted for simple cases, confirmed for cases with more than one operation.
>
>This proposition, I believe, derived from the early days when (perhaps
>DD can cast his mind back that far and confirm it for us?) the COMPUTE
>verb was shown to be less efficient than a multitude of other verbs
>that accomplished the same task. Times move on and the COMPUTE verb is
>no longer as inefficient as it was once.

That's what I thought. COMPUTE *is* efficient on most compilers.

>Personally, I would prefer to
>use a complex COMPUTE rather than a series of simple verbs as I
>believe that the COMPUTE, because it most closely resembles the
>equation being represented, is a better form of self-documentation.

I agree.

> If you are bothered about cpu
>cycles, then you should avoid ODO and anything else that requires the
>machine to calculate displacements.

There you go, repeating a myth about ODO being slow.

>> To be continued with the most unexpected and interesting case: does aligning numbers on
>> memory boundaries matter?
>
>YES!!!!! If you use Assembler, IT DOES.

Not any longer. It used to be before memory caches. It still is if the compiler generates
extra instructions intended to save time. Then you have to find ways to blind the compiler
so it will stop.

Robert

Sep 1, 2007, 10:52:45 PM
On Sat, 01 Sep 2007 15:10:27 -0700, Alistair <alis...@ld50macca.demon.co.uk> wrote:

>
>Robert wrote:
>> In the Micro Focus manual Server Express (2.2 & 4.0):Program Development, chapter 1 part 1
>> is titled Writing Efficient Programs. Its top billing tells us they think speed is a Very
>> Important Topic we should know about. For fun, I put their advice to the test.
>>
>
>< BIG SNIP >
>
>
>I should have added that, in my not-very-humble opinion, the machine
>cycles used are much less important to me than the ease of maintenance
>of the code.

According to Pete Dashwood, program maintenance is obsolete.

>It costs pennies (alright, it can cost a fortune if the
>code is really shite) to run inefficient code but it costs pounds (or
>2*dollars) to maintain it.

You wouldn't say that if your program was running 20,000 transactions PER SECOND, in real
time.

>Efficient coding should be encouraged and rewarded. Coding according
>to outdated and mis-understood standards should be discouraged.

You'll never make it in the world of contract programming.

Standards are a misnomer because they are different in every company and even between
departments within the same company. De facto standards are usually different from the
published ones. Their purpose is not to simplify maintenance (the people who wrote the
standard no longer do maintenance), it's to keep out competition from programmers who are
better than management. One team lead had the candor to say "Management thinks time
stopped in 1974. They'd have a stroke if they saw this EXIT PERFORM CYCLE. You can't do
that! It's not in the standard because they never saw it."

I walked out of that place after one week. It's pretty typical of non-mainframe Cobol
shops (mainframe shops are ALL like that). The ones that aren't like that say "Yeah, we
maintain it. When we have to add significant code, we rewrite the thing in C."

Doug Miller

Sep 1, 2007, 11:00:38 PM

First, given the manner in which the expression is parenthesized, that is not
a reasonable expectation.

Second, this expectation, reasonable or not, is not relevant to the test
supposedly being conducted -- which was nominally to determine if single
arithmetic operations were more, or less, efficiently executed than complex
ones, not to determine the compiler's ability to optimize.

Yes, it might have -- but you didn't test to see if that was the case, and
reached a conclusion that was unjustified by available evidence.


>
>>Effects of REMAINDER and CORRESPONDING not tested.
>
>I would expect CORRESPONDING to be free. Why should it cost more than
> individual ADDs?

The point is that the stated proposition included CORRESPONDING, and your
testing did not.

>I did test INITIALIZE, another no-no according to the manual. As expected, it
> ran exactly
>the same speed as individual MOVE ZERO and MOVE SPACEs.
>
>REMAINDER should be cheap. The machine's divide instruction gives it for no
> additional
>cost.

Again, no testing was conducted to determine whether this is or is not the
case.


>
>>>Legacy belief: indexes are faster than subscripts
>>>
>>>Test:
>>>05 s-subscript binary pic s9(09) sync.
>>>01 misaligned-area sync.
>>> 05 array-element occurs 4096 indexed x-index.
>>> 10 misaligned-number comp-5 pic s9(09).
>>> 10 to-cause-misalignment pic x(01).
>>>move array-element (s-subscript) to test-byte *> 3 us
>>>move array-element (x-index) to test-byte *> 6 us
>>>
>>>Finding: BUSTED. Index is actually slower.
>>
>>Correct interpretation of findings:
>>
>>Too many variable conditions are present to allow valid conclusions
>>to be drawn. Additional tests needed to determine outcome, specifically (but
>>not necessarily limited to):
>>
>>a) second test should be conducted with properly aligned data items, to
>>eliminate misalignment as a contributing factor;
>
>It is moving one byte (test-byte pic x).

The definition of 'test-byte' was not provided.

>Alignment is not an issue.

That being the case, one wonders why you constructed for your testing an array
that deliberately misaligns COMP items.

>The only
> variable is index versus subscript.

Incorrect, for reasons already described.

>>b) third test should be conducted with USAGE DISPLAY data items, to eliminate
>>all alignment issues as contributing factors;
>>
>>c) fourth test should be conducted using separate arrays for the subscripted
>>and indexed accesses, to eliminate INDEXED BY in the definition of the array
>>as a possible factor in speeding up subscripted access.
>
>Huh? How could INDEXED BY speed up a subscript?

Quite possibly, when the array definition includes INDEXED BY, the compiler
emits the same instructions for all array accesses whether they use an index
or a subscript. By testing references by both subscript and index against the
*same* array, you failed to exclude this possibility. Valid conclusions can be
drawn *only* by comparing execution speeds against arrays defined identically
*except* for the presence or absence of INDEXED BY, and accessed by index or
subscript respectively.
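A sketch of the kind of test meant here, under stated assumptions: two tables with identical element layouts, one declared without INDEXED BY and accessed only by subscript, the other declared with INDEXED BY and accessed only by index. All names other than s-subscript, x-index and test-byte are invented for illustration, test-byte is taken to be pic x(01), and free-format source is assumed.

```cobol
identification division.
program-id. index-case.
data division.
working-storage section.
01 s-subscript    binary pic s9(09) sync.
01 test-byte      pic x(01).
01 subscripted-area.
   05 plain-element occurs 4096.
      10 plain-number    comp-5 pic s9(09).
      10 plain-pad       pic x(01).
01 indexed-area.
   05 indexed-element occurs 4096 indexed by x-index.
      10 indexed-number  comp-5 pic s9(09).
      10 indexed-pad     pic x(01).
01 repeat-factor  binary pic s9(09) value 1000000.
01 test-name      pic x(20).
01 time-raw.
   05 t-hh        pic 9(02).
   05 t-mm        pic 9(02).
   05 t-ss        pic 9(02).
   05 t-cc        pic 9(02).
01 centis-now     binary pic s9(09).
01 centis-start   binary pic s9(09).
01 centis-used    pic z(8)9.
procedure division.
main-line.
    *> table with no INDEXED BY, reached by subscript only
    move 'Subscript' to test-name
    move 1000 to s-subscript
    perform timer-on
    perform repeat-factor times
        move plain-element (s-subscript) to test-byte
    end-perform
    perform timer-off

    *> identically laid out table with INDEXED BY, reached by index only
    move 'Index' to test-name
    set x-index to 1000
    perform timer-on
    perform repeat-factor times
        move indexed-element (x-index) to test-byte
    end-perform
    perform timer-off
    goback
    .
timer-on.
    perform read-clock
    move centis-now to centis-start
    .
timer-off.
    *> total time for the whole loop, in hundredths of a second
    perform read-clock
    compute centis-used = centis-now - centis-start
    display test-name centis-used ' hundredths of a second'
    .
read-clock.
    accept time-raw from time
    compute centis-now = ((t-hh * 60 + t-mm) * 60 + t-ss) * 100 + t-cc
    .
```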


>
>>>Proposition: When incrementing or decrementing a counter, terminate it with a
> literal
>>>value rather than a value held in a data item. For example, to execute a loop
> n times, set
>>>the counter to n and then decrement the counter until it becomes zero, rather
> than
>>>incrementing the counter from zero to n.
>>>
>>>Test:
>>>perform varying binary-number from 10 by -1 until binary-number = 0 *> 150
> us
>>>perform varying binary-number from 1 by 1 until binary-number > 10 *> 154
> us
>>>
>>>Finding: BUSTED
>>
>>Correct interpretation of findings:
>>
>>Baloney. The proposition was not tested at all. Each test case compared the
>>counter to a literal, not to a data item, so it's hardly surprising that the
>>difference is so small.
>
>You're right. I'll change one of the limits to a variable.

You need to do more than that: you need to run both the loops in the same
direction, too, either incrementing both or decrementing both.


>
>To execute a loop n times, the code should read PERFORM n TIMES, not PERFORM
> VARYING. I'll
>add that comparison as well.

Not relevant to the stated proposition.


>
>>>Proposition: Access to tables defined with OCCURS ... DEPENDING is less
> efficient than
>>>access to tables of fixed size, and so should be avoided where high
> performance is needed.
>>>
>>>Test:
>>>
>>>01 depending-area.
>>> 05 depending-element occurs 1 to 4096 depending on binary-number.
>>> 10 comp-5 pic s9(09).
>>> 10 pic x(01).
>>>move array-element (s-subscript) to test-byte *> 3 us
>>>move depending-element (s-subscript) to test-byte *> 3 us
>>>
>>>Finding: BUSTED
>>
>>Correct interpretation of findings:
>>
>>The testing conducted is grossly insufficient to permit any valid conclusions
>>to be drawn.
>
>Is GROSSLY insufficient 144 times worse than simply insufficient?

Yep.


>
>>The *reporting* of what little testing was done is *also* grossly insufficient
>>to permit assessing the validity of that minimal testing. Specifically, it is
>>necessary to see the definitions of array-element, s-subscript, and test-byte.
>>
>>Additional testing is needed, including (but not necessarily limited to):
>>
>>a) Examine the results of comparing
>>OCCURS 10 vs. OCCURS 1 TO 10
>>OCCURS 100 vs. OCCURS 1 TO 100
>>OCCURS 1000 vs. OCCURS 1 TO 1000
>>etc. to determine if array size has any effect. Make sure that at least one
>>of these tests uses the largest array size permitted by the compiler.
>
>32 bit programs can address 4GB. There's no difference between OCCURS 10 and
> OCCURS 10000.

Did you actually test to make sure this is the case? Or is this just an
unsupported assumption?

>You seem to be thinking of 16 bit compilers that bust a gut allowing arrays
> bigger than 64K.

Point is, you tested on *one* instance, and assumed that your partial result
indicated a general principle.


>
>>b) Examine the results of comparing OCCURS 1000 vs OCCURS 1 TO 1000 vs OCCURS
>>500 TO 1000, e.g, and other similar tests, to determine if the *lower* bound
>>has any effect. Again, make sure that at least one of these tests uses the
>>largest array size permitted by the compiler.
>
>That's an issue for bounds checking, not for access. Bounds checking was turned
> off for this speed test.

.. thus invalidating it.


>
>FWIW, if bounds checking had been on, Micro Focus doesn't use the DEPENDING
> item to do it.
>It only checks whether the subscript/index is greater than the maximum, which
> is a literal.
>
>>c) Compare the results of
>>OCCURS 4000 vs. OCCURS 1 TO 4000
>>with the results of
>>OCCURS 4096 vs. OCCURS 1 TO 4096
>>to eliminate the [admittedly unlikely] possibility that the array size being
>>an exact power of two has anything to do with the results.
>>
>>d) Repeat the one test conducted, changing the USAGEs of all data elements to
>>DISPLAY, to eliminate alignment issues as a contributing factor.
>
>The usage is one byte pic x. Don't complicate it.

Not the usage of the array item. "Don't complicate it" means using the
simplest data structure possible -- which means USAGE DISPLAY -- to ensure
that structure and alignment issues are kept to a minimum.


>
>>e) Repeat the tests described in a) and b) above, varying the USAGE of the
>>DEPENDING item to determine what, if any, difference this makes.
>
>Given that access speed is identical, it is obviously not using the DEPENDING
> item. Why would it?

Your testing is insufficient to enable you to draw that conclusion.


>
>The point being made here is that avoiding DEPENDING is a false Cobol myth.

And *my* point is that while you *may* be right, the testing you have
conducted up to this point is *very* far from demonstrating that.


>
>>>Proposition: Arithmetic on COMP-3 data items is performed in packed decimal
> and is much
>>>slower than arithmetic on COMP items. It should be avoided.
>>>
>>>Test:
>>>05 display-number pic 9(09).
>>>05 packed-number comp-3 pic s9(09).
>>>
>>>add 1 to display-number *> 174 us
>>>add 1 to packed-number *> 160 us
>>>
>>>Finding: CONFIRMED. Packed is almost as slow as display. It was fast on
> 1970-era
>>>mainframes. There is no longer any reason to use it. If you want to save
> space, look at
>>>space-filled strings and filler-padding.
>>
>>Correct interpretation of finding: Baloney. The proposition was not tested at
>>all, and no valid finding is possible.
>>
>>Valid test needs to be conducted, comparing the execution speed of
>>instructions involving COMP-3 vs COMP data, rather than COMP-3 vs DISPLAY, and
>>using data items with identical PICtures.
>
>The speed for COMP with identical picture was given above. It is 1.

Not true.

>
>The point is that COMP-3 (PACKED-DECIMAL) is an anachronism. It's very slow on
> any machine except a mainframe,

Your testing does not demonstrate that.

>not to mention non-portable when stored in a file. I laugh
> at files that contain packed numbers "to save space", then pad the record with a few
> hundred bytes
>of filler.
>
>>>To be continued with the most unexpected and interesting case: does aligning
> numbers on
>>>memory boundaries matter?
>>
>>Perhaps this time you can manage to devise some valid, *comprehensive* test
>>cases, conduct them properly, report them completely, and interpret the
>>results correctly.
>
>I said I did it for fun. Compiler companies run more comprehensive tests, but
> seldom
>publish the results. They seem to think it's 'competitive information' and
> don't like
>speed tests making them look bad.

If you did it only for fun, why publish the results here? Especially, why
label a proposition "BUSTED" when you've conducted only one incomplete test on
it?

Pete Dashwood

Sep 1, 2007, 11:07:01 PM

"Robert" <n...@e.mail> wrote in message

news:2d7kd39f1h14n43q5...@4ax.com...


> On Sat, 01 Sep 2007 15:10:27 -0700, Alistair
> <alis...@ld50macca.demon.co.uk> wrote:
>
>>
>>Robert wrote:
>>> In the Micro Focus manual Server Express (2.2 & 4.0):Program
>>> Development, chapter 1 part 1
>>> is titled Writing Efficient Programs. Its top billing tells us they
>>> think speed is a Very
>>> Important Topic we should know about. For fun, I put their advice to the
>>> test.
>>>
>>
>>< BIG SNIP >
>>
>>
>>I should have added that, in my not-very-humble opinion, the machine
>>cycles used are much less important to me than the ease of maintenance
>>of the code.
>
> According to Pete Dashwood, program maintenance is obsolete.

Hmmm... hardly fair, Robert.

It is obsolete for people using component based OO development. And that
includes me. I have never stated it is obsolete as a blanket statement, and
I don't believe that to be the case.

I DO believe that the continuing cost of maintaining procedural code will
render it a non-viable option for most companies, and THAT will make it
obsolete, but I am under no illusions as to the current state of affairs.

I don't mind being quoted, but please try and ensure you do so accurately
and in context. (It is no less a courtesy than I would afford you :-))

Robert

Sep 2, 2007, 12:49:52 AM
On Sun, 2 Sep 2007 15:07:01 +1200, "Pete Dashwood" <dash...@removethis.enternet.co.nz>
wrote:

>"Robert" <n...@e.mail> wrote in message
>news:2d7kd39f1h14n43q5...@4ax.com...
>> On Sat, 01 Sep 2007 15:10:27 -0700, Alistair
>> <alis...@ld50macca.demon.co.uk> wrote:
>>
>>>
>>>Robert wrote:
>>>> In the Micro Focus manual Server Express (2.2 & 4.0):Program
>>>> Development, chapter 1 part 1
>>>> is titled Writing Efficient Programs. Its top billing tells us they
>>>> think speed is a Very
>>>> Important Topic we should know about. For fun, I put their advice to the
>>>> test.
>>>>
>>>
>>>< BIG SNIP >
>>>
>>>
>>>I should have added that, in my not-very-humble opinion, the machine
>>>cycles used are much less important to me than the ease of maintenance
>>>of the code.
>>
>> According to Pete Dashwood, program maintenance is obsolete.
>
>Hmmm... hardly fair, Robert.
>
>It is obsolete for people using component based OO development. And that
>includes me. I have never stated it is obsolete as a blanket statement, and
>I don't believe that to be the case.
>
>I DO believe that the continuing cost of maintaining procedural code will
>render it a non-viable option for most companies, and THAT will make it
>obsolete, but I am under no illusions as to the current state of affairs.

It has been 28 years since Bjarne Stroustrup invented C++. Other OO languages such as
Smalltalk and Simula 67 have been around longer. How long will it take? I'm becoming
impatient.

I was recently IN THE BUILDING where Bjarne invented C++ -- AT&T Labs in Florham Park NJ.
It seemed like a holy place, like the Sistine Chapel or something. I thought maybe I
should cross myself out of respect. But no, Bjarne was long gone to Texas A&M, the
building was full of the usual computer geeks I work with everywhere, half of them from
India, Russia, China and Israel.

I now live across the street from Motorola HQ in Schaumburg IL. Nearly everyone in this
750 unit apartment complex is from India. Geek Central. This evening in a nearby
supermarket parking lot I encountered a guy wearing an MIT sweat shirt. I was wearing a
Caltech sweat shirt. The scene was set for a physics Smack Down .. right there in the
Dominick's parking lot. I let it pass. Thought you might find the imagery amusing.

My attention was distracted by an average looking American guy with a stunningly beautiful
Asian wife. I thought about the pain he would endure when she inevitably dumped him.

>I don't mind being quoted, but please try and ensure you do so accurately
>and in context. (It is no less a courtesy than I would afford you :-))

No disrespect intended. It was just a flippant comment.


Alistair

Sep 2, 2007, 6:42:32 AM
On 2 Sep, 02:40, "Charles Hottel" <chot...@earthlink.net> wrote:
> "Alistair" <alist...@ld50macca.demon.co.uk> wrote in message

The poo related version.

Alistair

Sep 2, 2007, 6:46:28 AM
On 2 Sep, 03:16, Robert <n...@e.mail> wrote:

No myth. It takes more cpu cycles to calculate and then use a
displacement for an ODO data-item referenced by subscript or index
than it does to refer to a fixed-position data-item.

> >> To be continued with the most unexpected and interesting case: does aligning numbers on
> >> memory boundaries matter?
>
> >YES!!!!! If you use Assembler, IT DOES.
>
> Not any longer. It used to be before memory caches. It still is if the compiler generates
> extra instructions intended to save time. Then you have to find ways to blind the compiler

> so it will stop.


Alistair

Sep 2, 2007, 6:53:49 AM
On 2 Sep, 03:52, Robert <n...@e.mail> wrote:

> On Sat, 01 Sep 2007 15:10:27 -0700, Alistair <alist...@ld50macca.demon.co.uk> wrote:
>
> >Robert wrote:
> >> In the Micro Focus manual Server Express (2.2 & 4.0):Program Development, chapter 1 part 1
> >> is titled Writing Efficient Programs. Its top billing tells us they think speed is a Very
> >> Important Topic we should know about. For fun, I put their advice to the test.
>
> >< BIG SNIP >
>
> >I should have added that, in my not-very-humble opinion, the machine
> >cycles used are much less important to me than the ease of maintenance
> >of the code.
>
> According to Pete Dashwood, program maintenance is obsolete.

Nice observation. I surrender.

>
> >It costs pennies (alright, it can cost a fortune if the
> >code is really shite) to run inefficient code but it costs pounds (or
> >2*dollars) to maintain it.
>
> You wouldn't say that if your program was running 20,000 transactions PER SECOND, in real
> time.
>

I worked in a shop where the original programmer had written a program
where the code obfuscated the function. After a morning's
consideration, a colleague re-wrote the code and cut the run-time down
from 100 cpu seconds to 2 cpu seconds. In the same shop, programs
which ran several times daily and processed hundreds of thousands of
records each day were re-written and saved 75% of the cpu time in the
process.

> >Efficient coding should be encouraged and rewarded. Coding according
> >to outdated and mis-understood standards should be discouraged.
>
> You'll never make it in the world of contract programming.
>

I did make it as a contractor. I followed in-house standards and in
one shop, re-wrote them. I pride myself on trying to make the new code
blend in to the program.

> Standards are a misnomer because they are different in every company and even between
> departments within the same company. De facto standards are usually different from the
> published ones. Their purpose is not to simplify maintenance (the people who wrote the
> standard no longer do maintenance), it's to keep out competition from programmers who are
> better than management. One team lead had the candor to say "Management thinks time
> stopped in 1974. They'd have a stroke if they saw this EXIT PERFORM CYCLE. You can't do
> that! It's not in the standard because they never saw it."
>
> I walked out of that place after one week. It's pretty typical of non-mainframe Cobol
> shops (mainframe shops are ALL like that). The ones that aren't like that say "Yeah, we
> maintain it. When we have to add significant code, we rewrite the thing in C."

Interestingly, the reason for each standard dictat is rarely
documented so old dictats are improperly retained when the compiler
moves on (see the thou shalt not use COMPUTE debate).

Pete Dashwood

Sep 2, 2007, 6:53:49 AM

"Alistair" <alis...@ld50macca.demon.co.uk> wrote in message

news:1188729683.0...@y42g2000hsy.googlegroups.com...
> On 2 Sep, 02:32, docdw...@panix.com () wrote:
>> In article <1188684272.693113.252...@22g2000hsm.googlegroups.com>,

> Just like Pete's toolbox?

Alistair, now you are simply confused.

My toolbox is far from porous... :-)

Charles Hottel

Sep 2, 2007, 8:28:08 AM

"Alistair" <alis...@ld50macca.demon.co.uk> wrote in message
news:1188729752.1...@50g2000hsm.googlegroups.com...
Good, I first thought you meant religious COBOL. Actually I did not
recognize that it was a proper, though ambiguous word.


Robert

Sep 2, 2007, 10:00:04 AM
Here's the alignment test:

05 comp5-number comp-5 pic s9(09) sync.
05 test-byte pic x(01).

05 unaligned-number comp-5 pic s9(09).

add 1 to comp5-number *> time - 1
add 1 to unaligned-number *> time - 15

Wow, it's hard to believe an extra memory cycle makes it run 15 times slower. At worst, it
should be 4 times slower -- 2x for the load and 2x for the store. It appears the compiler
is generating extra code for the unaligned case. Let's blind the compiler so it can't
tell.

01 misaligned-area sync.
05 array-element occurs 4096 indexed x-index.
10 misaligned-number comp-5 pic s9(09).
10 to-cause-misalignment pic x(01).

move 1 to s-subscript
add 1 to misaligned-number (s-subscript) *> time - 5
move 2 to s-subscript
add 1 to misaligned-number (s-subscript) *> time - 4

Times are almost the same. We know from another test that the subscript costs time 2, so
the add times are 3 and 2. The second case here is identical to the second case above.
Eliminating the extra code made it run 7 times faster. Let's verify that.

add 1 to misaligned-number (1) *> time - 1
add 1 to misaligned-number (2) *> time - 15

Results are the same as the first pair above. When the compiler KNOWS whether the word is
aligned or not, it makes the unaligned case 7 times slower than when it DOESN'T know.

This makes the alignment myth a self-fulfilling prophecy. Alignment is important, not
because the machine cares (any longer) but because the compiler mistakenly THINKS it
matters.

Robert

Sep 2, 2007, 10:24:14 AM
On Sun, 02 Sep 2007 03:46:28 -0700, Alistair <alis...@ld50macca.demon.co.uk> wrote:

>> > If you are bothered about cpu
>> >cycles, then you should avoid ODO and anything else that requires the
>> >machine to calculate displacements.
>>
>> There you go, repeating a myth about ODO being slow.
>>
>
>No myth. It takes more cpu cycles to calculate and then use a
>displacement for an ODO data-item referenced by subscript or index
>than it does to refer to a fixed-position data-item.

Computing the offset of a subscript is exactly the same, whether the table has ODO or not.
I tested that and posted the times.

What you say would be true for items FOLLOWING the ODO, but Cobol doesn't allow you to do
that.

Robert

Sep 2, 2007, 10:31:31 AM
On Sun, 02 Sep 2007 03:53:49 -0700, Alistair <alis...@ld50macca.demon.co.uk> wrote:

>Interestingly, the reason for each standard dictat is rarely
>documented so old dictats are improperly retained when the compiler
>moves on (see the thou shalt not use COMPUTE debate).

Or the one about numbering paragraphs, so you can find them in a 200 page listing. That
should have been dropped when we started using text editors.

docd...@panix.com

Sep 2, 2007, 10:31:16 AM
In article <1188729683.0...@y42g2000hsy.googlegroups.com>,

Alistair <alis...@ld50macca.demon.co.uk> wrote:
>On 2 Sep, 02:32, docdw...@panix.com () wrote:

[snip]

>> My experiences are limited, of course, and my memory is, admittedly,
>> porous.
>

>Just like Pete's toolbox?

I'm completely unfamiliar with what the box contains, Mr Maclean, let
alone the container's quality... but I'm sure that Mr Dashwood would
consider assuring you that both are not porous...

... their quality is the fines'.

(note to non-native English speakers: 'porous', in some dialects of
English, is almost homonymous to 'poorest'; likewise, in some dialects,
the final 't' of some words is dropped.)

DD

docd...@panix.com

Sep 2, 2007, 10:41:52 AM
In article <2d7kd39f1h14n43q5...@4ax.com>,
Robert <n...@e.mail> wrote:

[snip]

>One team lead had the candor to say "Management thinks time
>stopped in 1974. They'd have a stroke if they saw this EXIT PERFORM
>CYCLE. You can't do that! It's not in the standard because they
>never saw it."

A similar situation was described to this newsgroup a mere
eight-and-a-half years ago or so:

<http://groups.google.com/group/comp.lang.rexx/msg/29d1b77320d7bfc7?output=gplain>

Search for 'bedrool' (a mis-typing of 'bedroll') (no ').

DD

Robert

Sep 2, 2007, 10:43:55 AM
On Sun, 02 Sep 2007 03:53:49 -0700, Alistair <alis...@ld50macca.demon.co.uk> wrote:

> I followed in-house standards and in one shop, re-wrote them.

Standards I've written told people what TO do, rather than what NOT to do.

>I pride myself on trying to make the new code blend in to the program.

I write the code well, get it working, finally edit it to follow standards.

Michael Mattias

Sep 2, 2007, 11:00:48 AM
"Robert" <n...@e.mail> wrote in message
news:rneld31af20op9hl2...@4ax.com...

>
> Wow, it's hard to believe an extra memory cycle makes it run 15 times
> slower. At worst, it
> should be 4 times slower -- 2x for the load and 2x for the store. It
> appears the compiler
> is generating extra code for the unaligned case. Let's blind the compiler
> so it can't
> tell.

This is why some compilers are better - and more expensive - than others.

They transparently handle things like alignment, or using a 'decrement'
instead of an 'increment' to control loop counters, or combine common
literals dispersed throughout the source code into a non-redundant literal
pool.

This is why COBOL, FORTRAN and BASIC compilers are called 'high level'
language products. You - the applications programmer - tell the compiler
what you want to happen, and the compiler assumes responsibility for making
it happen efficiently when it tells the hardware what to do.

MCM

Robert

Sep 2, 2007, 11:23:48 AM
On Sun, 02 Sep 2007 03:00:38 GMT, spam...@milmac.com (Doug Miller) wrote:

>If you did it only for fun, why publish the results here? Especially, why
>label a proposition "BUSTED" when you've conducted only one incomplete test on
>it?

Because the manual contains bad advice. It even says:

-- quotation --
Other suggestions (to help prevent inefficient coding)

* REMOVE "ROUNDED"
* REMOVE "ERROR"
* REMOVE "INITIALIZE"
* REMOVE "CORRESPONDING"
* REMOVE "THRU"
* REMOVE "THROUGH"

By removing these reserved words you prevent the possibility that code using these
inefficient constructs will be added to the program.

Robert

Sep 2, 2007, 12:29:42 PM
On Sun, 02 Sep 2007 03:46:28 -0700, Alistair <alis...@ld50macca.demon.co.uk> wrote:


>> There you go, repeating a myth about ODO being slow.
>>
>
>No myth. It takes more cpu cycles to calculate and then use a
>displacement for an ODO data-item referenced by subscript or index
>than it does to refer to a fixed-position data-item.

A good use for ODO is on tables that will be SEARCHed ALL. With ODO, the search takes
about log2(n) comparisons, where n is the current number of entries. Without ODO, with the
unused entries padded with high values, it takes about log2(max). On average, assuming the
table is half full, the ODO search will run 10% faster.

Most programmers use the slower method because they believe the myth that ODO is slow.
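
To put a number on that, here is the arithmetic as a rough sketch. It assumes SEARCH ALL is a plain binary search costing about log2 of the searched size in comparisons, and that n = max/2. The symbol S, for the fraction of comparisons saved, is introduced here only for illustration.

```latex
\[
S \;=\; \frac{\log_2(\mathit{max}) - \log_2(n)}{\log_2(\mathit{max})}
  \;=\; \frac{\log_2(\mathit{max}) - \log_2(\mathit{max}/2)}{\log_2(\mathit{max})}
  \;=\; \frac{1}{\log_2(\mathit{max})}
\]
\[
\mathit{max} = 1024:\; S = \tfrac{1}{10} = 10\%
\qquad
\mathit{max} = 4096:\; S = \tfrac{1}{12} \approx 8.3\%
\]
```

So the half-full case saves one comparison per search, and the 10% figure corresponds to a table declared with roughly a thousand entries.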

Doug Miller

Sep 2, 2007, 2:44:23 PM
In article <rneld31af20op9hl2...@4ax.com>, Robert <n...@e.mail> wrote:

>01 misaligned-area sync.
> 05 array-element occurs 4096 indexed x-index.
> 10 misaligned-number comp-5 pic s9(09).
> 10 to-cause-misalignment pic x(01).
>

And have you examined a load map to see what the addresses of these items are,
specifically 'misaligned-number(2)' ?

I may be mistaken... but it is my belief that the compiler will *force*
alignment by emitting a slack byte, making the length of 'array-element' one
byte longer than you expect ...

>move 1 to s-subscript
>add 1 to misaligned-number (s-subscript) *> time - 5
>move 2 to s-subscript
>add 1 to misaligned-number (s-subscript) *> time - 4
>

.. with this entirely predictable result:

>Times are almost the same.

--

Doug Miller

Sep 2, 2007, 2:45:46 PM
In article <0jjld3hiedrd5ifsj...@4ax.com>, Robert <n...@e.mail> wrote:
>On Sun, 02 Sep 2007 03:00:38 GMT, spam...@milmac.com (Doug Miller) wrote:
>
>>If you did it only for fun, why publish the results here? Especially, why
>>label a proposition "BUSTED" when you've conducted only one incomplete test on
>>it?
>
>Because the manual contains bad advice.

In my opinion, you have not yet adequately demonstrated that the advice was
bad.

Robert

Sep 2, 2007, 3:09:15 PM
On Sun, 02 Sep 2007 18:44:23 GMT, spam...@milmac.com (Doug Miller) wrote:

>In article <rneld31af20op9hl2...@4ax.com>, Robert <n...@e.mail> wrote:
>
>>01 misaligned-area sync.
>> 05 array-element occurs 4096 indexed x-index.
>> 10 misaligned-number comp-5 pic s9(09).
>> 10 to-cause-misalignment pic x(01).
>>
>And have you examined a load map to see what the addresses of these items are,
>specifically 'misaligned-number(2)' ?
>
>I may be mistaken... but it is my belief that the compiler will *force*
>alignment by emitting a slack byte, making the length of 'array-element' one
>byte longer than you expect ...

Whoops. The manual says "If the SYNCHRONIZED clause is specified with a non-elementary
item, then the clause applies to all the items subordinate to that non-elementary item."
I need to rerun the test without SYNC on the 01 level.
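
A sketch of what the rerun layout could look like, with SYNC dropped from the 01 level. The program name AlignCheck and the length check are illustrative additions, not part of the posted test. It assumes free-format source (as in the posted test program) and that comp-5 pic s9(09) occupies 4 bytes.

```cobol
identification division.
program-id. AlignCheck.
*> Hypothetical check, not part of the original Speed1 test.
data division.
working-storage section.
*> Same table as before, but with no SYNC on the 01 level,
*> so SYNC no longer applies to the subordinate items.
01  misaligned-area.
    05  array-element occurs 4096 indexed x-index.
        10  misaligned-number      comp-5 pic s9(09).
        10  to-cause-misalignment  pic x(01).
procedure division.
    *> Assuming a 4-byte comp-5 item, a length of 5 means no slack
    *> bytes were added; a larger value means the element is padded.
    display 'array-element length: ' function length (array-element (1))
    goback
    .
```

A load map, as suggested above, remains the direct way to confirm the address of misaligned-number (2).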

Alistair

Sep 2, 2007, 3:56:11 PM

Charles Hottel wrote:

> "Alistair" <alis...@ld50macca.demon.co.uk> wrote in message
> news:1188729752.1...@50g2000hsm.googlegroups.com...
> >>

> >> > code is really shite)
> >>
> >> <snip>
> >>
> >> Which meaning of the word are you using?
> >> The word shite may refer to various things:
> >>
> >> a.. A variant of the word shit
> >> b.. A shi'ite, a person who practices the Shi'a Islam faith
> >> c.. The shite, the principal character in a Japanese Noh play
> >> d.. Shite, the person who performs the technique in aikido
> >
> > The poo related version.
> >
> Good, I first thought you meant religious COBOL. Actually I did not
> recognize that it was a proper, though ambiguous word.

When I first came across the word shi'ite I was mildly amused in
observing the closeness to shite in spelling. I have avoided being
crass enough to insult shi'ite Muslims by referring to them in the
shorter form. I think such an insult is beneath me.

However, judging by the posts to this group some people take their
variant of Cobol very religiously.

Doug Miller

Sep 2, 2007, 3:58:52 PM
Yes, *and* verify by examination of a load map that the COMP item at the 10
level is in fact misaligned, too.

Alistair

Sep 2, 2007, 4:00:58 PM

Robert wrote:

> Here's the alignment test:
>
>

> This makes the alignment myth a self-fulfilling prophecy. Alignment is important, not
> because the machine cares (any longer) but because the compiler mistakenly THINKS it
> matters.

Alignment on what hardware? a pc? I don't recall what reason alignment
existed on mainframes but as far as I recall the use of aligned data
only resulted in undocumented fillers between data items to nudge
aligned data up to the boundaries. It may be that, if you are running
on a pc, the extra overhead with aligned data is an artefact that
would not appear on a mainframe where alignment is/was important.

Alistair

Sep 2, 2007, 4:05:23 PM

Robert wrote:

> On Sun, 02 Sep 2007 03:46:28 -0700, Alistair <alis...@ld50macca.demon.co.uk> wrote:
>
> >> > If you are bothered about cpu
> >> >cycles, then you should avoid ODO and anything else that requires the
> >> >machine to calculate displacements.
> >>
> >> There you go, repeating a myth about ODO being slow.
> >>
> >
> >No myth. It takes more cpu cycles to calculate and then use a
> >displacement for an ODO data-item referenced by subscript or index
> >than it does to refer to a fixed-position data-item.
>
> Computing the offset of a subscript is exactly the same, whether the table has ODO or not.

It is the data-item that has its displacement calculated, not the
subscript.

> I tested that and posted the times.
>
> What you say would be true for items FOLLOWING the ODO, but Cobol doesn't allow you to do
> that.

If what you say is true (and I call into question the veracity of it)
then it would be more efficient for us to code all records as ODO and
not as fixed position. But we don't. If ODO is as efficient on your
computer as you make out then I would question the competency of your
compiler.

Robert

Sep 2, 2007, 6:30:17 PM
On Sun, 02 Sep 2007 13:05:23 -0700, Alistair <alis...@ld50macca.demon.co.uk> wrote:

>
>Robert wrote:
>
>> On Sun, 02 Sep 2007 03:46:28 -0700, Alistair <alis...@ld50macca.demon.co.uk> wrote:
>>
>> >> > If you are bothered about cpu
>> >> >cycles, then you should avoid ODO and anything else that requires the
>> >> >machine to calculate displacements.
>> >>
>> >> There you go, repeating a myth about ODO being slow.
>> >>
>> >
>> >No myth. It takes more cpu cycles to calculate and then use a
>> >displacement for an ODO data-item referenced by subscript or index
>> >than it does to refer to a fixed-position data-item.
>>
>> Computing the offset of a subscript is exactly the same, whether the table has ODO or not.
>
>It is the data-item that has its displacement calculated, not the
>subscript.

You are right.

>> I tested that and posted the times.
>>
>> What you say would be true for items FOLLOWING the ODO, but Cobol doesn't allow you to do
>> that.
>
>If what you say is true (and I call into question the veracity of it)
>then it would be more efficient for us to code all records as ODO and
>not as fixed position.

It would be. Fixed length records are not used outside the mainframe world. The norm is a
delimiter at the end rather than a length at the front (record sequential), so ODO is not
useful in that case.

> But we don't. If ODO is as efficient on your
>computer as you make out then I would question the competency of your
>compiler.

Here's a challenge: post a program that demonstrates the slowness of ODO.

William M. Klein

Sep 3, 2007, 12:02:25 AM
"Robert" <n...@e.mail> wrote in message
news:uchld3petcfs3qe7j...@4ax.com...

> On Sun, 02 Sep 2007 03:46:28 -0700, Alistair <alis...@ld50macca.demon.co.uk>
> wrote:
>
<snip>

> Computing the offset of a subscript is exactly the same, whether the table has
> ODO or not.
> I tested that and posted the times.
>
> What you say would be true for items FOLLOWING the ODO, but Cobol doesn't
> allow you to do
> that.
>

Micro Focus - the ONLY compiler that you claim to be testing - does allow it.
Furthermore, the timing will be different depending on whether you use ODOSLIDE
or NOODOSLIDE.

P.S. As you are using NOTRUNC as your compiler option, you can't even claim to
be testing for "conforming" COBOL source code. (I suspect several of your
results MIGHT be different with TRUNC - per standard COBOL - and NO-IBM-COMP
might also make a difference).


Robert

Sep 3, 2007, 1:00:45 AM
On Mon, 03 Sep 2007 04:02:25 GMT, "William M. Klein" <wmk...@nospam.netcom.com> wrote:

>"Robert" <n...@e.mail> wrote in message
>news:uchld3petcfs3qe7j...@4ax.com...
>> On Sun, 02 Sep 2007 03:46:28 -0700, Alistair <alis...@ld50macca.demon.co.uk>
>> wrote:
>>
><snip>
>> Computing the offset of a subscript is exactly the same, whether the table has
>> ODO or not.
>> I tested that and posted the times.
>>
>> What you say would be true for items FOLLOWING the ODO, but Cobol doesn't
>> allow you to do
>> that.
>>
>
>Micro Focus - the ONLY compiler that you claim to be testing - does allow it.
>Furthermore, the timing will be different depending on whether you use ODOSLIDE
>or NOODOSLIDE.

I've seen that done in PL/I, but NEVER in Cobol.

>P.S. As you are using NOTRUNC as your compiler option, you can't even claim to
>be testing for "conforming" COBOL source code. (I suspect several of your
>results MIGHT be different with TRUNC - per standard COBOL - and NO-IBM-COMP
>might also make a difference).

For this speed test, I tried to eliminate overhead unrelated to the test topic. NOTRUNC
makes "add 1" a pure machine language ADD (or INC), without the overhead of testing for
decimal overflow.

NOIBMCOMP would cause SYNC to be ignored. I wouldn't be able to test alignment.

Clark F Morris

Sep 3, 2007, 12:26:53 PM
On Fri, 31 Aug 2007 21:22:35 -0500, Robert <n...@e.mail> wrote:

>In the Micro Focus manual Server Express (2.2 & 4.0):Program Development, chapter 1 part 1
>is titled Writing Efficient Programs. Its top billing tells us they think speed is a Very
>Important Topic we should know about. For fun, I put their advice to the test.
>

>The machine I used is a high-end HP Superdome with 64 PA (RISC) processors. Of course,
>the Cobol test program was only using one of them. For general reference, other timing
>tests showed mid-range Sun SPARC CPUs to be 3 times faster than the PA, and HP Superdomes
>with Itaniums to be 6-10 times faster. Despite that, customer demand forced HP to rescind
>its decision to obsolete the PA. These tests were run on a 'new generation' PA.
>
>I added a few comparisons that are not from the MF manual, but are widely believed in the
>Cobol community. They are styled "Legacy:". Execution times are in microseconds (us), with
>a resolution of plus or minus 5. I'll describe the timing methodology toward the end; for
>now, take my word that the speeds are accurate.

>
>Proposition: Use simple two-operand arithmetic statements wherever possible.
>
>Test:
>05 binary-number binary pic s9(09) sync.
>
>add 1 to binary-number *> 1 us
>compute binary-number = binary-number + 1 *> 1 us
>
>add 1 to binary-number
>multiply 5 by binary-number
>divide 5 into binary-number *> 50 us
>
>compute binary-number = ((binary-number + 1) * 5) / 5 *> 445 us
>

>Finding: busted for simple cases, confirmed for cases with more than one operation.
>

>Proposition: "Do not use the REMAINDER, ROUNDED, ON SIZE ERROR or CORRESPONDING phrases if
>you want the fastest performance. No optimization is done on arithmetic statements if the
>ON SIZE ERROR phrase is used. For this reason, we recommend you do not use this phrase if
>high performance is required. The ROUNDED phrase impacts performance, but it is generally
>faster to use ROUNDED than try to round the result using your own routine. "
>
>Test:
>compute binary-number rounded = binary-number + 1 *> 1 us (no penalty)
>add 1 to binary-number *> 15 us
> on size error display 'overflow'
>end-add
>
>Finding: busted for rounded, confirmed for size error.
>
>Legacy belief: indexes are faster than subscripts
>
>Test:
>05 s-subscript binary pic s9(09) sync.


>01 misaligned-area sync.
> 05 array-element occurs 4096 indexed x-index.
> 10 misaligned-number comp-5 pic s9(09).
> 10 to-cause-misalignment pic x(01).

>move array-element (s-subscript) to test-byte *> 3 us
>move array-element (x-index) to test-byte *> 6 us
>
>Finding: BUSTED. Index is actually slower.
>
>Proposition: When incrementing or decrementing a counter, terminate it with a literal
>value rather than a value held in a data item. For example, to execute a loop n times, set
>the counter to n and then decrement the counter until it becomes zero, rather than
>incrementing the counter from zero to n.
>
>Test:
>perform varying binary-number from 10 by -1 until binary-number = 0 *> 150 us
>perform varying binary-number from 1 by 1 until binary-number > 10 *> 154 us
>
>Finding: BUSTED
>
>Proposition: Access to tables defined with OCCURS ... DEPENDING is less efficient than
>access to tables of fixed size, and so should be avoided where high performance is needed.
>
>Test:
>
>01 depending-area.
> 05 depending-element occurs 1 to 4096 depending on binary-number.
> 10 comp-5 pic s9(09).
> 10 pic x(01).
>move array-element (s-subscript) to test-byte *> 3 us
>move depending-element (s-subscript) to test-byte *> 3 us
>
>Finding: BUSTED
>
>Proposition: Arithmetic on COMP-3 data items is performed in packed decimal and is much
>slower than arithmetic on COMP items. It should be avoided.
>
>Test:
>05 display-number pic 9(09).
>05 packed-number comp-3 pic s9(09).
>
>add 1 to display-number *> 174 us
>add 1 to packed-number *> 160 us
>
>Finding: CONFIRMED. Packed is almost as slow as display. It was fast on 1970-era
>mainframes. There is no longer any reason to use it. If you want to save space, look at
>space-filled strings and filler-padding.

There may be other good reasons to go to display but if you are using
a z series computer (latest evolution of the IBM 360), packed decimal
is still faster than display. Most results depend on the computer
architecture. I suspect in answer to your next question that
alignment still matters on some currently sold computers with an
architecture different from the ones tested on.

Jeff Campbell

Sep 3, 2007, 8:12:02 PM
Robert wrote:
> On Sat, 01 Sep 2007 16:15:22 -0600, Jeff Campbell <n8...@arrl.net> wrote:
>
>
>> If you can post your test code I'll post the results I get on my
>> PWS 600au running VMS.
>
> Here it is:
>
>
> * ---------------------------------------------------------------------
> * Findings
> * Aligned 1
> * Unaligned 15
> * Misaligned (1) 5
> * Misaligned (2) 4
> * Binary 1
> * Linkage 30
> * Compute n=n+1 1
> * Rounded 1
> * size error 18
> * Display 174
> * Packed 160
> * Arithmetic 50
> * Compute 445
> * Index 6
> * Subscript 3
> * Depending 3
> * Evaluate true 2
> * Evaluate expression 3
> * Go to depending 7
> * Evaluate case 11
> * Initialize 346
> * Move zeros 339
> * Dec to zero 149
> * Inc to 10 154
>
> $SET SOURCEFORMAT"FREE"
> $SET NOBOUND
> $SET OPT"2"
> $SET NOTRUNC
> $SET IBMCOMP
> $SET NOCHECK
> $SET ALIGN"8"
> identification division.
> program-id. Speed1.
> author. Robert Wxagner.
>
> data division.
> working-storage section.
> 01 test-data.
> 05 comp5-number comp-5 pic s9(09) sync.
> 05 test-byte pic x(01).
> 05 unaligned-number comp-5 pic s9(09).
> 05 pic x(03).
> 05 binary-number binary pic s9(09) sync.
> 05 display-number pic 9(09).
> 05 packed-number comp-3 pic s9(09).
> 05 s-subscript binary pic s9(09) sync.
>
> 01 depending-area.
> 05 depending-element occurs 1 to 4096 depending on binary-number.
> 10 comp-5 pic s9(09).
> 10 pic x(01).
> 01 misaligned-area sync.
> 05 array-element occurs 4096 indexed x-index.
> 10 misaligned-number comp-5 pic s9(09).
> 10 to-cause-misalignment pic x(01).
>
> 01 timer-variables.
> 05 test-name pic x(30).
> 05 repeat-factor value 100000000 binary pic s9(09).
> 05 current-date-structure.
> 10 pic x(08).
> 10 time-now-hhmmsshh.
> 15 hours pic 9(02).
> 15 minutes pic 9(02).
> 15 seconds pic 9(02).
> 15 hundredths pic 9(02).
> 10 pic x(05).
> 05 time-now pic 9(06)v99.
> 05 time-start pic 9(06)v99.
> 05 timer-overhead value zero pic 9(06)v99.
> 05 elapsed-time pic s9(06)v99.
> 05 elapsed-time-display.
> 10 elapsed-time-edited pic z(05).
>
> linkage section.
> 01 linkage-number binary pic s9(09) sync.
>
> procedure division.
>
> initialize test-data, misaligned-area
>
> move 'Null test' to test-name
> perform timer-on
> perform timer-on
> perform repeat-factor times
> exit perform cycle
> end-perform
> perform timer-off
> compute timer-overhead = (time-now - time-start)
>
> move 'Aligned' to test-name
> perform timer-on
> perform repeat-factor times
> add 1 to comp5-number
> exit perform cycle
> end-perform
> perform timer-off
>
> move 'Unaligned' to test-name
> perform timer-on
> perform repeat-factor times
> add 1 to unaligned-number
> exit perform cycle
> end-perform
> perform timer-off
>
> move 'Misaligned (1)' to test-name
> move 1 to s-subscript
> perform timer-on
> perform repeat-factor times
> add 1 to misaligned-number (s-subscript)
> exit perform cycle
> end-perform
> perform timer-off
>
> move 'Misaligned (2)' to test-name
> *> if this is faster than Unaligned above,
> *> compiler generated alignment code is slowing things down
> move 2 to s-subscript
> perform timer-on
> perform repeat-factor times
> add 1 to misaligned-number (s-subscript)
> exit perform cycle
> end-perform
> perform timer-off
>
> move 'Binary' to test-name
> move zero to binary-number
> perform timer-on
> perform repeat-factor times
> add 1 to binary-number
> exit perform cycle
> end-perform
> perform timer-off
>
> move 'Linkage' to test-name
> set address of linkage-number to address of binary-number
> move zero to linkage-number
> perform timer-on
> perform repeat-factor times
> add 1 to linkage-number
> exit perform cycle
> end-perform
> perform timer-off
>
> move 'Compute n=n+1' to test-name
> move zero to binary-number
> perform timer-on
> perform repeat-factor times
> compute binary-number = binary-number + 1
> exit perform cycle
> end-perform
> perform timer-off
>
> move 'Rounded' to test-name
> move zero to binary-number
> perform timer-on
> perform repeat-factor times
> compute binary-number rounded = binary-number + 1
> exit perform cycle
> end-perform
> perform timer-off
>
> move 'size error' to test-name
> move zero to binary-number
> perform timer-on
> perform repeat-factor times
> add 1 to binary-number
> on size error display 'overflow'
> end-add
> exit perform cycle
> end-perform
> perform timer-off
>
> move 'Display' to test-name
> perform timer-on
> perform repeat-factor times
> *> add 1 to display-number
> exit perform cycle
> end-perform
> perform timer-off
>
> move 'Packed' to test-name
> perform timer-on
> perform repeat-factor times
> *> add 1 to packed-number
> exit perform cycle
> end-perform
> perform timer-off
>
> move 'Arithmetic' to test-name
> move zero to binary-number
> perform timer-on
> perform repeat-factor times

> add 1 to binary-number
> multiply 5 by binary-number
> divide 5 into binary-number
> exit perform cycle
> end-perform
> perform timer-off
>
> move 'Compute' to test-name
> move zero to binary-number
> divide 10 into repeat-factor
> perform timer-on
> perform repeat-factor times
> compute binary-number = ((binary-number + 1) * 5) / 5
> exit perform cycle
> end-perform
> perform timer-off
> multiply 10 by repeat-factor
>
> move 'Index' to test-name
> set x-index to 1000
> perform timer-on
> perform repeat-factor times
> move array-element (x-index) to test-byte
> exit perform cycle
> end-perform
> perform timer-off
>
> move 'Subscript' to test-name
> move 1000 to s-subscript
> perform timer-on
> perform repeat-factor times
> move array-element (s-subscript) to test-byte
> exit perform cycle
> end-perform
> perform timer-off
>
> move 'Depending' to test-name
> move 2000 to binary-number
> move 1000 to s-subscript
> perform timer-on
> perform repeat-factor times
> move depending-element (s-subscript) to test-byte
> exit perform cycle
> end-perform
> perform timer-off
>
> move 'Evaluate true' to test-name
> move zero to binary-number
> perform timer-on
> perform repeat-factor times
> evaluate true
> when binary-number equal to zero
> exit perform cycle
> when other
> display 'error'
> end-evaluate
> end-perform
> perform timer-off
>
> move 'Evaluate expression' to test-name
> move zero to binary-number
> perform timer-on
> perform repeat-factor times
> evaluate binary-number
> when zero
> exit perform cycle
> when other
> display 'error'
> end-evaluate
> end-perform
> perform timer-off
>
> move 'Go to depending' to test-name
> move 2 to binary-number
> perform timer-on
> perform go-depending-test repeat-factor times
> perform timer-off
>
> move 'Evaluate case' to test-name
> move 2 to binary-number
> perform timer-on
> perform evaluate-case-test repeat-factor times
> perform timer-off
>
> move 'Initialize' to test-name
> perform timer-on
> perform repeat-factor times
> initialize test-data
> exit perform cycle
> end-perform
> perform timer-off
>
> move 'Move zeros' to test-name
> perform timer-on
> perform repeat-factor times
> move zeros to
> comp5-number
> test-byte
> unaligned-number
> binary-number
> display-number
> packed-number
> s-subscript
> exit perform cycle
> end-perform
> perform timer-off
>
>
> move 'Dec to zero' to test-name
> perform timer-on
> perform repeat-factor times
> perform varying binary-number from 10 by -1 until binary-number
> = 0
> end-perform
> exit perform cycle
> end-perform
> perform timer-off
>
> move 'Inc to 10' to test-name
> perform timer-on
> perform repeat-factor times
> perform varying binary-number from 1 by 1 until binary-number >
> 10
> end-perform
> exit perform cycle
> end-perform
> perform timer-off
>
> goback
>
>
> . go-depending-test section.
> go to p1 p2 p3 depending on binary-number
> display 'error'
> . p1. display 'error'
> . p2. exit section
> . p3. display 'error'
> . evaluate-case-test section.
> evaluate binary-number
> when 1
> display 'error'
> when 2
> exit section
> when other
> display 'error'
> end-evaluate
>
> . end-of-previous section
> . timer-on.
> perform read-the-time
> move time-now to time-start
> . timer-off.
> perform read-the-time
> compute elapsed-time rounded = ((time-now - time-start)
> * 100000000 / repeat-factor)
> - timer-overhead
>
> if elapsed-time not greater than zero
> move 'error' to elapsed-time-display
> else
> compute elapsed-time-edited rounded = elapsed-time * 10
> end-if
> display test-name elapsed-time-display
> . read-the-time.
> accept time-now-hhmmsshh from time
> *> move function current-date to current-date-structure
> compute time-now =
> ((((hours * 60) +
> minutes) * 60) +
> seconds) +
> (hundredths / 100)
> .

Both the HP compiler on my Alpha and the Fujitsu compiler (COBOL97)
I have access to on a Windows PC do not like this code. 8-) 8-)

I've not used the Micro Focus product so am unfamiliar with it.

Question:
What is the advantage of using EXIT PERFORM CYCLE over CONTINUE?


Jeff


Robert

unread,
Sep 3, 2007, 11:56:06 PM9/3/07
to

I would expect binary to be the fastest on a z series, as it is on every other computer.

>Most results depend on the computer architecture.

The program shouldn't be tied to a specific machine, especially when the feature is out of
step with industry norms. Doing so locks users into one manufacturer, which is contrary to
the spirit of high level languages.

> I suspect in answer to your next question that
>alignment still matters on some currently sold computers with an
>architecture different from the ones tested on.

Yes, some CPUs don't require alignment e.g. IBM, Motorola, Intel 16/32 bit. Some throw an
exception and expect the operating system to handle it in software (expensively)
e.g. IA-64, most RISC machines including PA, PowerPC and Alpha. A few operating systems
abort the process when they get a misalignment fault e.g. old Apple.

Micro Focus' use of IBMCOMP for the compiler option that turns on memory boundary
awareness gives the impression that alignment is (or was) a concern in the IBM mainframe world. Not
in my experience. I've never seen a mainframe, IBM or other, throw a fault for
misalignment. It doesn't seem appropriate for ANY machine with an L2 cache to do so. I
wrote the speed program primarily to test whether a modern PA processor (88xx/89xx, which
have L2) is slowed down by misalignment, secondarily to disprove (or not) Cobol myths
such as ODO being slow and demonstrate real inefficiencies such as packed decimal.

FWIW, IBM coined the word cache in the context of memory in 1967. The S/360 model 85 was
probably the first computer to use memory cache.
http://en.wikipedia.org/wiki/Memory_cache

Robert

unread,
Sep 4, 2007, 12:46:56 AM9/4/07
to
On Mon, 03 Sep 2007 18:12:02 -0600, Jeff Campbell <n8...@arrl.net> wrote:

>> move 'Null test' to test-name
>> perform timer-on
>> perform timer-on
>> perform repeat-factor times
>> exit perform cycle
>> end-perform
>> perform timer-off
>> compute timer-overhead = (time-now - time-start)

>Both the HP compiler on my Alpha and the Fujitsu compiler (COBOL97)


>I have access to on a windows PC do not like this code. 8-) 8-)
>
>I've not used the Micro Focus product so am unfamiliar with it.
>
>Question:
> What is the advantage of using EXIT PERFORM CYCLE over CONTINUE?

CONTINUE is the same as nothing, an empty loop. The null test was getting optimized out
when it was an empty loop. I added EXIT PERFORM CYCLE to stop that from happening, then
had to add it to all the others.

Removing the EXIT PERFORM CYCLEs, or replacing them with CONTINUE, would not seriously
affect the results.
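
The null-test idea above (time an empty loop, then subtract it from each timed test) can be sketched in Python. This is only an illustration of the method, not a translation of the COBOL program; the function names and the injectable `clock` parameter are assumptions made for the example.

```python
import time


def time_loop(body, repeat, clock=time.perf_counter):
    """Return the seconds taken to call body() `repeat` times."""
    start = clock()
    for _ in range(repeat):
        body()
    return clock() - start


def net_seconds_per_iteration(body, repeat, clock=time.perf_counter):
    """Time an empty loop first, then subtract that overhead from the test."""
    overhead = time_loop(lambda: None, repeat, clock)  # the "null test"
    total = time_loop(body, repeat, clock)
    return (total - overhead) / repeat
```

Calling `net_seconds_per_iteration(some_function, 1_000_000)` gives a per-iteration figure with loop overhead removed. As in the COBOL version, a result at or below zero means the test is too short for the clock to measure.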

If tests run too slowly, >10 seconds, lower repeat-factor to 10,000,000; if they run too
quickly, < 1 second, raise it to 1,000,000,000.

For a valid test of misalignment, remove SYNC from the 01 level. The compiler options are
Micro Focus; you'll have to replace them with your compiler's. At minimum, you want to
turn off bounds checking. The HP Alpha compiler might be a rebranded Micro Focus.

William M. Klein

unread,
Sep 4, 2007, 12:58:29 AM9/4/07
to
"Clark F Morris" <cfmp...@ns.sympatico.ca> wrote in message
news:p7dod314t207iqabi...@4ax.com...

> On Fri, 31 Aug 2007 21:22:35 -0500, Robert <n...@e.mail> wrote:
<snip>

> There may be other good reasons to go to display but if you are using
> a z series computer (latest evolution of the IBM 360), packed decimal
> is still faster than display. Most results depend on the computer
> architecture. I suspect in answer to your next question that
> alignment still matters on some currently sold computers with an
> architecture different from the ones tested on.
>>

Clark,
Robert was clear (in his first note) that he was quoting efficiency
recommendations FROM a Micro Focus manual and that was the ONLY compiler that he
was talking about.

I wouldn't assume that either any recommendations OR test results would
necessarily be "portable" across compilers or operating systems.

Whether or not I think his tests were "comprehensive", I do think that he
was fair in applying rules from the documentation for a specific compiler and
O/S to that combination and then reporting the results he got.

I could ALMOST guarantee that I could get different results (even with MF on
different platforms - and with different directives), much less on zSeries.


--
Bill Klein
wmklein <at> ix.netcom.com


William M. Klein

unread,
Sep 4, 2007, 1:10:29 AM9/4/07
to
Binary (when working with other Binary) may or may not be faster than PD for
some cases on zSeries. However, there are even MORE options that impact this
than just TRUNC (which has 3 flavors on IBM zSeries). Furthermore, PD is usually
(not always) BEST when working with "combined" usages (such as input from a
"screen" in the same operation as something stored in a Database).

The following is the information on "comparing data types" from the Enterprise
COBOL Performance paper available at:
http://www-1.ibm.com/support/docview.wss?rs=203&q=7001475&uid=swg27001475

(You might want to look at the entire paper to see what a COMPREHENSIVE set of
performance tests covers in the way of "variations". It also has some firm
statistics on indexes vs subscripts with this compiler.)

***

Comparing Data Types

When selecting your data types, it is important to understand the performance
characteristics of them before you use them. Shown below are some performance
considerations of doing several ADDs and SUBTRACTs on the various data types of
the specified precision.

Performance considerations for comparing data types (using ARITH(COMPAT)):

Packed decimal (COMP-3) compared to binary (COMP or COMP-4) with TRUNC(STD)
  using 1 to 9 digits: packed decimal is 30% to 60% slower than binary
  using 10 to 17 digits: packed decimal is 55% to 65% faster than binary
  using 18 digits: packed decimal is 74% faster than binary

Packed decimal (COMP-3) compared to binary (COMP or COMP-4) with TRUNC(OPT)
  using 1 to 8 digits: packed decimal is 160% to 200% slower than binary
  using 9 digits: packed decimal is 60% slower than binary
  using 10 to 17 digits: packed decimal is 150% to 180% slower than binary
  using 18 digits: packed decimal is 74% faster than binary

Packed decimal (COMP-3) compared to binary (COMP or COMP-4) with TRUNC(BIN) or
COMP-5
  using 1 to 8 digits: packed decimal is 130% to 200% slower than binary
  using 9 digits: packed decimal is 85% slower than binary
  using 10 to 18 digits: packed decimal is 88% faster than binary

DISPLAY compared to packed decimal (COMP-3)
  using 1 to 6 digits: DISPLAY is 100% slower than packed decimal
  using 7 to 16 digits: DISPLAY is 40% to 70% slower than packed decimal
  using 17 to 18 digits: DISPLAY is 150% to 200% slower than packed decimal

DISPLAY compared to binary (COMP or COMP-4) with TRUNC(STD)
  using 1 to 8 digits: DISPLAY is 150% slower than binary
  using 9 digits: DISPLAY is 125% slower than binary
  using 10 to 16 digits: DISPLAY is 20% faster than binary
  using 17 digits: DISPLAY is 8% slower than binary
  using 18 digits: DISPLAY is 25% faster than binary

DISPLAY compared to binary (COMP or COMP-4) with TRUNC(OPT)
  using 1 to 8 digits: DISPLAY is 350% slower than binary
  using 9 digits: DISPLAY is 225% slower than binary
  using 10 to 16 digits: DISPLAY is 380% slower than binary
  using 17 digits: DISPLAY is 580% slower than binary
  using 18 digits: DISPLAY is 35% faster than binary

DISPLAY compared to binary (COMP or COMP-4) with TRUNC(BIN) or COMP-5
  using 1 to 4 digits: DISPLAY is 400% to 440% slower than binary
  using 5 to 9 digits: DISPLAY is 240% to 280% slower than binary
  using 10 to 18 digits: DISPLAY is 70% to 80% faster than binary

--
Bill Klein
wmklein <at> ix.netcom.com

"Robert" <n...@e.mail> wrote in message

news:iu8pd39fh6huj8r55...@4ax.com...

Howard Brazee

unread,
Sep 4, 2007, 11:18:29 AM9/4/07
to
On Fri, 31 Aug 2007 21:22:35 -0500, Robert <n...@e.mail> wrote:

>Finding: CONFIRMED. Packed is almost as slow as display. It was fast on 1970-era
>mainframes. There is no longer any reason to use it. If you want to save space, look at
>space-filled strings and filler-padding.

Which illustrates that the thing that counts in this kind of test is
knowing that your tests were for a specific compiler and
hardware (and possibly compiler optimization settings).

The machine I program with still has hardware support for Packed
decimal.

n8...@arrl.net

unread,
Sep 4, 2007, 4:16:27 PM9/4/07
to

No it is not.

Here are the results I obtained. Machine is a 600 MHz Alpha Personal
Workstation running OpenVMS 7.3-1; COBOL compiler is version 2.8-1286.

$ cobol/nocheck/notruncate/alignment/noansi_format/optimize t.cob
$ link t.obj
$ run t.exe
Null test                0
Aligned                  0
Unaligned                0
Misaligned (1)           1
Misaligned (2)           1
Binary                   0
Linkage                  0
Compute n=n+1            0
Rounded                  0
size error              30
Display                  0
Packed                   0
Arithmetic              45
Compute                 43
Index                    2
Subscript                2
Depending                2
Evaluate true            5
Evaluate expression      5
Go to depending         23
Evalaute case           41
Initialize               0
Move zeros               0
Dec to zero             20
Inc to 10               20

Repeat count is 100,000,000.

Jeff

Clark F Morris

unread,
Sep 4, 2007, 9:27:29 PM9/4/07
to

The IBM 360 required the binary data to be appropriately aligned,
half-word, word or double word. The 370 allowed misalignment but
extracted a performance penalty. I haven't kept up with later models.
In regard to packed decimal, if you are running business programs on a
360/370/390/z series machine, then packed decimal makes sense. It
avoids several problems. On other series of machines that don't have
full fixed decimal arithmetic, different rules apply. In regard to
ODO, the operative advice is to look at the generated code for the
operations that are actually affected by the ODO and then decide.

>
>FWIW, IBM coined the word cache in the context of memory in 1967. The S/360 model 85 was
>probably the first computer to use memory cache.
>http://en.wikipedia.org/wiki/Memory_cache


Clark Morris who started on an IBM 650, went to a Honeywell 800, an
IBM 1401, a RCA 301, and various models of IBM 360, 370, 4300, 390 and
z series.

William M. Klein

unread,
Sep 4, 2007, 9:36:46 PM9/4/07
to
The primary purpose of the IBMCOMP Micro Focus compiler directive is to turn on
word-storage mode, i.e.

"In word-storage mode every data item of USAGE COMP or COMP-5 occupies either
two bytes or a multiple of four bytes."

It is true that it also interacts with the SYNC clause and the ALIGN directive,
but the thing that makes it "like" IBM mainframes is that it does not allow
COBOL to have 1-byte or 3-byte (or 5- or 7-byte) binary fields. Other
compilers (for COBOL) and even zArchitecture (and MVS and up) allow for this,
but not COBOL on the IBM mainframe (or OS/400 / iSeries, as I recall).
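
The quoted word-storage rule ("either two bytes or a multiple of four bytes") can be shown as a small rounding function. This is a sketch of the rule as quoted only: the function name is hypothetical, and how a compiler derives the unrounded ("natural") byte length from a PICTURE is deliberately left out.

```python
def word_storage_size(natural_bytes):
    """Round a binary field's natural byte length up to 2 bytes or a multiple of 4."""
    if natural_bytes <= 0:
        raise ValueError("field length must be positive")
    if natural_bytes <= 2:
        return 2
    return -(-natural_bytes // 4) * 4  # ceiling to the next multiple of 4


for n in range(1, 9):
    print(n, "->", word_storage_size(n))
```

Under this rule the 1-, 3-, 5- and 7-byte fields mentioned above can never occur, which is the IBM-like behaviour being described.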

--
Bill Klein
wmklein <at> ix.netcom.com

"Clark F Morris" <cfmp...@ns.sympatico.ca> wrote in message
news:c21sd3hl65n3738n1...@4ax.com...

Robert

unread,
Sep 4, 2007, 10:37:41 PM9/4/07
to
On Sun, 02 Sep 2007 09:00:04 -0500, Robert <n...@e.mail> wrote:

>Here's the alignment test:
>

>05 comp5-number comp-5 pic s9(09) sync.
>05 test-byte pic x(01).

>05 unaligned-number comp-5 pic s9(09).
>
>add 1 to comp5-number *> time - 1
>add 1 to unaligned-number *> time - 15


>
>Wow, it's hard to believe an extra memory cycle makes it run 15 times slower. At worst, it
>should be 4 times slower -- 2x for the load and 2x for the store. It appears the compiler
>is generating extra code for the unaligned case. Let's blind the compiler so it can't
>tell.
>

>01 misaligned-area sync.
> 05 array-element occurs 4096 indexed x-index.
> 10 misaligned-number comp-5 pic s9(09).
> 10 to-cause-misalignment pic x(01).
>

>move 1 to s-subscript
>add 1 to misaligned-number (s-subscript) *> time - 5
>move 2 to s-subscript

>add 1 to misaligned-number (s-subscript) *> time - 4

The last set is wrong, because both were SYNChronized. The correct times, without SYNC,
are:

move 1 to s-subscript
add 1 to misaligned-number (s-subscript) *> time - 19
move 2 to s-subscript
add 1 to misaligned-number (s-subscript) *> time - 19

Since the compiler doesn't know whether the words are synchronized, it is generating extra
code to load and store the four bytes individually. If the machine was causing the
slowdown, subscript=1 would be faster because it is in fact on a word boundary.
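
A quick check of which elements of the unsynchronized table actually land on a word boundary, assuming the table itself starts on one and each element is 5 bytes (a 4-byte comp-5 plus the 1-byte filler). The names are illustrative only.

```python
ELEMENT_SIZE = 4 + 1  # comp-5 pic s9(09) (4 bytes) + pic x(01) filler (1 byte)
WORD = 4              # word size in bytes


def offset_of(subscript):
    """Byte offset of misaligned-number(subscript) from the start of the table."""
    return (subscript - 1) * ELEMENT_SIZE


def is_word_aligned(subscript):
    """True when the element's binary field starts on a 4-byte boundary."""
    return offset_of(subscript) % WORD == 0


aligned = [s for s in range(1, 11) if is_word_aligned(s)]
print(aligned)
```

Only every fourth element (subscripts 1, 5, 9, ...) is aligned, so subscript 1 is an aligned access and subscript 2 is not. Identical timings for both are consistent with the compiler, not the hardware, causing the slowdown.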

I couldn't find a way to trick the compiler into ignoring alignment and just doing it.
NOIBMCOMP gave identical results. Passing a pointer to a misaligned word defined as SYNC
in linkage section didn't fool it.

Conclusion: on an HP PA (RISC) processor, alignment does matter .. a lot.

----------------------------------------------------------
Rounded and size error:

add 1 to binary-number *> time 1
compute binary-number rounded = binary-number + 1 *> time 1 (no rounding)
compute binary-number rounded = binary-number + .5 *> time 841
add 1 to binary-number on size error ... *> time 867

Loop control, time per iteration:

perform varying binary-number from 10 by -1 until binary-number = 0 *> time 15
perform varying binary-number from 10 by -1 until binary-number = comp5 *> time 16
perform 10 times *> time 23

Pretty sad when generated loop control is slower than user written.

Robert

unread,
Sep 4, 2007, 10:37:41 PM9/4/07
to
On Tue, 04 Sep 2007 05:10:29 GMT, "William M. Klein" <wmk...@nospam.netcom.com> wrote:

>Binary (when working with other Binary) may or may not be faster than PD for
>some cases on zSeries. However, there are even MORE options that impact this
>than just TRUNC (which has 3 flavors on IBM zSeries). Furthermore, PD is usually
>(not always) BEST when working with "combined" usages (such as input from a
>"screen" in the same operation as something stored in a Database).
>
>The following is the information on "comparing data types" from the Enterprise
>COBOL Performance paper available at:
> http://www-1.ibm.com/support/docview.wss?rs=203&q=7001475&uid=swg27001475
>
>(You might want to look at the entire paper to see what a COMPREHENSIVE set of
>performance tests covers in the way of "variations". It also has some firm
>statistics on indexes vs subscripts with this compiler.)

It says an index is 30% faster than a binary subscript. Subscripts require a
multiplication. Both require the addition of base + offset, which is usually 'free', i.e.
the referencing operand is base:index. So the difference is multiply versus load. Machines
can do either in one execution frame. The difference is pipelining -- the load might be
done before the instruction executes.
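
The multiply-versus-load distinction can be sketched as follows. It assumes, as is common but not guaranteed for every compiler, that an index holds a precomputed byte displacement while a subscript holds an occurrence number. All names are hypothetical.

```python
ELEMENT_SIZE = 5  # bytes per table element (illustrative value)


def address_from_subscript(base, subscript):
    """Subscript: the multiplication happens at every reference."""
    return base + (subscript - 1) * ELEMENT_SIZE


def set_index(occurrence):
    """Index: the multiplication happens once, when the index is SET."""
    return (occurrence - 1) * ELEMENT_SIZE


def address_from_index(base, index):
    """Index reference: just base + displacement, no multiply."""
    return base + index
```

Both routes produce the same address; the difference is only where the multiply is paid for.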

>Comparing Data Types
>
>When selecting your data types, it is important to understand the performance
>characteristics of them before you use them. Shown below are some performance
>considerations of doing several ADDs and SUBTRACTs on the various data types of
>the specified precision.
>
> Performance considerations for comparing data types (using ARITH(COMPAT)):
>
> Packed decimal (COMP-3) compared to binary (COMP or COMP-4) with TRUNC(STD)
> using 1 to 9 digits: packed decimal is 30% to 60% slower than binary
> using 10 to 17 digits: packed decimal is 55% to 65% faster than binary
> using 18 digits: packed decimal is 74% faster than binary
>
> Packed decimal (COMP-3) compared to binary (COMP or COMP-4) with TRUNC(OPT)
> using 1 to 8 digits: packed decimal is 160% to 200% slower than binary
> using 9 digits: packed decimal is 60% slower than binary
> using 10 to 17 digits: packed decimal is 150% to 180% slower than binary
> using 18 digits: packed decimal is 74% faster than binary

I don't understand why 9 and 18 digits are special cases. A 32 bit number holds 4294967296
values (9.6 digits); a 64 bit number holds 18446744073709551616 (19.3 digits). It appears that
IBM 'reserves' 2 bits out of 32 and 4 bits out of 64.
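
The digit capacities are easy to check. This short sketch computes how many decimal digits a word of a given width can hold, and how many bits the largest n-digit number needs; `bits_needed` is a name made up for the example, and a signed field would need one more bit for the sign.

```python
import math

# Decimal digits representable by an unsigned word of each width.
for bits in (32, 64):
    print(bits, "bits:", round(math.log10(2 ** bits), 2), "decimal digits")


def bits_needed(digits):
    """Bits required to hold the largest unsigned number with `digits` decimal digits."""
    return (10 ** digits - 1).bit_length()


for digits in (9, 10, 18):
    print(digits, "digits need", bits_needed(digits), "bits")
```

One plausible reading of the 9 and 18 digit boundaries is that they are the largest PICTURE sizes for which every value fits in a fullword or doubleword; a 10-digit field no longer fits in 32 bits.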


Robert

unread,
Sep 4, 2007, 11:46:50 PM9/4/07
to
On Tue, 04 Sep 2007 22:27:29 -0300, Clark F Morris <cfmp...@ns.sympatico.ca> wrote:


>In regard to packed decimal, if you are running business programs on a
>360/370/390/z series machine, then packed decimal makes sense.

Only if you plan to stay with IBM forever.

>It avoids several problems.

Modern computers are a *big* problem for IBM marketing.

>On other series of machines that don't have
>full fixed decimal arithmetic, different rules apply.

You don't mean decimal arithmetic, you mean integer arithmetic. Binary numbers work fine
as penny counters.

>Clark Morris who started on an IBM 650, went to a Honeywell 800, an
>IBM 1401, a RCA 301, and various models of IBM 360, 370, 4300, 390 and
>z series.

Another newbie who never worked on an IBM 704.

Robert

unread,
Sep 5, 2007, 12:09:13 AM9/5/07
to

It looks like your clock has a resolution of one second, rather than 1/100 sec. You'll
have to increase the repeat count to 1,000,000,000 and its pic to 9(10).

Some of the times appear 10 times too small. In timer-off, did you omit a zero from
100,000,000 or omit * 10 from calculation of rounded time?
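
Why clock resolution matters here: the smallest per-iteration time the harness can distinguish is the clock resolution divided by the repeat count. A minimal sketch (the function name and the choice of nanoseconds are assumptions for the example):

```python
def per_iteration_granularity_ns(clock_resolution_ns, repeat_count):
    """Smallest per-iteration difference (in ns) that the timer can resolve."""
    return clock_resolution_ns / repeat_count


ONE_SECOND_NS = 1_000_000_000
ONE_HUNDREDTH_NS = 10_000_000

print(per_iteration_granularity_ns(ONE_SECOND_NS, 100_000_000))
print(per_iteration_granularity_ns(ONE_HUNDREDTH_NS, 100_000_000))
print(per_iteration_granularity_ns(ONE_SECOND_NS, 1_000_000_000))
```

With a one-second clock, raising the repeat count tenfold recovers a tenfold finer granularity, which is the reason for the suggested change.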

William M. Klein

unread,
Sep 5, 2007, 12:29:35 AM9/5/07
to

"Robert" <n...@e.mail> wrote in message
news:3m1sd3p062k4dbden...@4ax.com...

The point is that EVERY compiler creates its own "code sequences" (machine
instructions, assembler, p-code, whatever) and that one needs to do a variety of
tests to determine what may (or may not) impact performance.

If one were to get the assembler listing of any of this code and then provide it
to a zSeries "expert" they could explain the performance differences.

As far as (IBM's) TRUNC "std" vs "opt" vs "bin" - they definitely DO produce
different code sequences depending on COBOL picture clauses. "OPT" is an
interesting case as the "optimization" is proprietary and they don't guarantee
for which pictures the optimization does and does not occur (or whether it will
in the next release of the compiler).

William M. Klein

unread,
Sep 5, 2007, 12:31:17 AM9/5/07
to
So want to give us any (current) comparisons on WHERE most COBOL compiled
programs are run today? (IBM mainframes vs the entire rest of the hardware and
software world)?

--
Bill Klein
wmklein <at> ix.netcom.com

"Robert" <n...@e.mail> wrote in message

news:6v6sd35f40hoco7dh...@4ax.com...

robert...@yahoo.com

unread,
Sep 5, 2007, 12:33:36 AM9/5/07
to
On Sep 3, 10:56 pm, Robert <n...@e.mail> wrote:
> > I suspect in answer to your next question that
> >alignment still matters on some currently sold computers with an
> >architecture different from the ones tested on.
>
> Yes, some CPUs don't require alignment e.g. IBM, Motorola, Intel 16/32 bit. Some throw an
> exception and expect the operating system to handle it in software (expensively)
> e.g. IA-64, most RISC machines including PA, PowerPC and Alpha. A few operating systems
> abort the process when they get a misalignment fault e.g. old Apple.


The cost of unaligned accesses is difficult to quantify in a general
sort of way. It often varies based on exactly how unaligned an access
is.

For example, x86, which allows unaligned accesses in most user mode
cases, has some implementations where the penalty is zero or one so
long as an unaligned item is entirely within a single cache line, a
clock or two if it crosses a cache line, and half a dozen or more if
it crosses a page boundary (assuming there are no delays due to cache
or TLB misses).

Alpha, for example, never supported unaligned accesses (although they
did have reasonable multi-instruction sequences to synthesize them -
at least from the 21164 onward), so referencing an unaligned item
would (without special code) always trap (and usually be fixed up in
software). That's usually good for at least a hundred cycles, and thus
if you're expecting more than a percent or two of your accesses to be
unaligned, it's worth generating the longer code to avoid the trap
(which will usually only be two or three times slower than a simple
aligned access). In short, if you're expecting any frequency of
unaligned, it's well worth it to generate the slow code to avoid the
traps.

PPC and IPF are interesting, since they offer an intermediate case.
PPC allows unaligned accesses that don't cross a page, or sometimes,
if the moon phase is right, if the translations for both pages are
available and basically have identical attributes. But in short, PPC
has a small penalty for unaligned accesses, unless they cross a page
boundary, in which case you often get a trap. IPF allows unaligned
accesses within a cache line, but will trap otherwise. As with
Alphas, in both cases there are ways to code the unaligned access as
two memory accesses plus a merge/split of some sort.
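
The "two memory accesses plus a merge" approach can be modelled in Python. This simulates the idea only and is not any particular processor's instruction sequence; it assumes a little-endian byte order and a 4-byte word, and the function names are invented for the example.

```python
def load_aligned(mem, addr):
    """Load a 4-byte little-endian word; refuses misaligned addresses (the 'trap')."""
    if addr % 4:
        raise ValueError("misaligned access")
    return int.from_bytes(mem[addr:addr + 4], "little")


def load_unaligned(mem, addr):
    """Synthesize an unaligned load from two aligned loads and a shift/merge."""
    shift = (addr % 4) * 8
    base = addr - addr % 4
    low = load_aligned(mem, base)
    if shift == 0:
        return low  # already aligned, one access is enough
    high = load_aligned(mem, base + 4)
    return ((low >> shift) | (high << (32 - shift))) & 0xFFFFFFFF


memory = bytes(range(16))
print(hex(load_unaligned(memory, 1)))
```

The longer sequence costs two loads and a few shifts on every access, which is the trade-off described above: a little slower than an aligned load, but far cheaper than taking a trap.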


Robert

unread,
Sep 5, 2007, 1:13:44 AM9/5/07
to
On Wed, 05 Sep 2007 04:31:17 GMT, "William M. Klein" <wmk...@nospam.netcom.com> wrote:

>So want to give us any (current) comparisons on WHERE most COBOL compiled
>programs are run today? (IBM mainframes vs the entire rest of the hardware and
>software world)?

My SWAG estimate is 70% mainframe. If anyone has supported numbers, I've never seen them.

Most Unix Cobol originated in the mainframe world and was ported during the '90s. There is
almost no new development in the Unix world, just maintenance changes. They're slowly
converting the Cobol to Java or C.

Talking about myths, most in the Unix world believe Cobol is inherently slow, like Basic.
When I say it's at least as fast as C, they simply don't believe me.

Arnold Trembley

unread,
Sep 5, 2007, 2:33:30 AM9/5/07
to

In some situations, Unix COBOL is extremely fast:
http://home.att.net/~arnold.trembley/perf001.htm

And in other situations, Unix C is slower than IBM Mainframe C:
http://home.att.net/~arnold.trembley/perf002.txt


--
http://arnold.trembley.home.att.net/

Howard Brazee

unread,
Sep 5, 2007, 1:25:19 PM9/5/07
to
On Tue, 04 Sep 2007 22:46:50 -0500, Robert <n...@e.mail> wrote:

>>In regard to packed decimal, if you are running business programs on a
>>360/370/390/z series machine, then packed decimal makes sense.
>
>Only if you plan to stay with IBM forever.

Nothing is forever. Make decisions about the system you will be
using now, looking at trends that will continue for the life of that
system.

docd...@panix.com

unread,
Sep 5, 2007, 1:53:44 PM9/5/07
to
In article <hiptd3poqp34nms98...@4ax.com>,

Howard Brazee <how...@brazee.net> wrote:
>On Tue, 04 Sep 2007 22:46:50 -0500, Robert <n...@e.mail> wrote:
>
>>>In regard to packed decimal, if you are running business programs on a
>>>360/370/390/z series machine, then packed decimal makes sense.
>>
>>Only if you plan to stay with IBM forever.
>
>Nothing is forever.

Including that statement?

>Make decisions about the system you will be
>using now, looking at trends that will continue for the life of that
>system.

This may be a reason why a bunch of folks made a decent living for a few
years converting 2-digit years to 4-digit ones.

DD

Clark F Morris

unread,
Sep 5, 2007, 8:16:43 PM9/5/07
to
On Tue, 04 Sep 2007 22:46:50 -0500, Robert <n...@e.mail> wrote:

>On Tue, 04 Sep 2007 22:27:29 -0300, Clark F Morris <cfmp...@ns.sympatico.ca> wrote:
>
>
>>In regard to packed decimal, if you are running business programs on a
>>360/370/390/z series machine, then packed decimal makes sense.
>
>Only if you plan to stay with IBM forever.
>
>>It avoids several problems.
>
>Modern computers are a *big* problem for IBM marketing.
>
>>On other series of machines that don't have
>>full fixed decimal arithmetic, different rules apply.
>
>You don't mean decimal arithmetic, you mean integer arithmetic. Binary numbers work fine
>as penny counters.

I MEAN decimal arithmetic. Rounding and division can give different
answers depending on the base (2, 8, 10, 12, 16, etc.). I remember my
8th grade math teacher discussing dozenal (base twelve).


>
>>Clark Morris who started on an IBM 650, went to a Honeywell 800, an
>>IBM 1401, a RCA 301, and various models of IBM 360, 370, 4300, 390 and
>>z series.
>
>Another newbie who never worked on an IBM 704.

But did you work on the 301?

HeyBub

unread,
Sep 5, 2007, 8:30:14 PM9/5/07
to
Clark F Morris wrote:
>
> I MEAN decimal arithmetic. Rounding and division can give different
> answers depending on the base (2, 8, 10, 12, 16, etc.). I remember my
> 8th grade math teacher discussing dozenal (base twelve).
>>
>>> Clark Morris who started on an IBM 650, went to a Honeywell 800, an
>>> IBM 1401, a RCA 301, and various models of IBM 360, 370, 4300, 390
>>> and z series.
>>
>> Another newbie who never worked on an IBM 704.
> But did you work on the 301?

Ah, the "Defecator" and the "Cruncher."

I once shook Roy Rogers' hand.


Judson McClendon

unread,
Sep 5, 2007, 9:37:15 PM9/5/07
to
"Clark F Morris" <cfmp...@ns.sympatico.ca> wrote:
>
> I MEAN decimal arithmetic. Rounding and division can give different
> answers depending on the base (2, 8, 10, 12, 16, etc.). I remember my
> 8th grade math teacher discussing dozenal (base twelve).

The problem between base 10 and other bases is not because of division,
but because only some non-integral values can be expressed in any given
base, and the ones that can vary from base to base. The problem might
crop up in addition (see below) or multiplication, or even a simple move
from a variable in one base to a variable in another base. I ran this simple
BASIC program using QBASIC. The strange output is because decimal
value .01 (or .1) cannot be expressed exactly in binary, and the cumulative
error in the sum increases. Rounding in the PRINT routine mitigates the
result to a small degree, which is why some values appear to be correct.
Any integer can be expressed precisely in any integral base, of course.

10 FOR I =.01 TO 1 STEP .01
20 PRINT I,
30 NEXT I

Output:

.01 .02 .03 .04 .05
5.999999E-02 6.999999E-02 7.999999E-02
8.999999E-02 9.999999E-02 .11
.12 .13 .14 .15 .16
.17 .18 .19 .2 .21
.22 .23 .24 .25 .26
.27 .28 .29 .3 .31
.32 .33 .3399999 .3499999 .3599999
.3699999 .3799999 .3899999 .3999999 .4099999
.4199999 .4299999 .4399998 .4499998 .4599998
.4699998 .4799998 .4899998 .4999998 .5099998
.5199998 .5299998 .5399998 .5499998 .5599998
.5699998 .5799997 .5899997 .5999997 .6099997
.6199997 .6299997 .6399997 .6499997 .6599997
.6699997 .6799996 .6899996 .6999996 .7099996
.7199996 .7299996 .7399996 .7499996 .7599996
.7699996 .7799996 .7899995 .7999995 .8099995
.8199995 .8299995 .8399995 .8499995 .8599995
.8699995 .8799995 .8899994 .8999994 .9099994
.9199994 .9299994 .9399994 .9499994 .9599994
.9699994 .9799994 .9899994 .9999993
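
The same effect can be reproduced in Python, whose `float` is a binary double, and contrasted with the standard `decimal` module. This sketch adds 0.1 ten times rather than rerunning the QBASIC loop, so the digits differ from the output above, but the cause is the same.

```python
from decimal import Decimal

binary_total = 0.0
decimal_total = Decimal("0")
for _ in range(10):
    binary_total += 0.1               # 0.1 has no exact binary representation
    decimal_total += Decimal("0.1")   # exact in base 10

print("binary sum equals 1? ", binary_total == 1.0)
print("decimal sum equals 1?", decimal_total == Decimal("1"))
print("binary sum:", repr(binary_total))
```

The binary total misses 1 by a tiny amount because each addition carries the representation error of 0.1 along with it. The decimal total is exact.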
--
Judson McClendon ju...@sunvaley0.com (remove zero)
Sun Valley Systems http://sunvaley.com
"For God so loved the world that He gave His only begotten Son, that
whoever believes in Him should not perish but have everlasting life."


Robert

unread,
Sep 5, 2007, 9:43:57 PM9/5/07
to

That thinking caused Y2K.

One system created in 1990, with estimated life of 3-5 years, used a one digit year to
conserve space. In 1999, the company paid big bucks to expand the year to four digits.

In another case, mainframers learned the database's date type takes 10 bytes of disk.
They saved the company a ton of money by storing dates as 3 SMALLINTs, which took 4 bytes
less. Date functions were done in Cobol rather than in SQL. Database reporting languages
were not used because 'programmers' couldn't figure out how to CAST three integers into a
date. Who says databases are automatically Y2K compliant? Cobolers found a way to make
them just like a VSAM file.

In yet another, they fixed the Y2K problem by putting the 4 digit years in the filler at
the end of the interface record (flat file), far away from the dates they went with. There
were several years clumped together out there. In 1998, management wanted to advance dates
two years to test Y2K readiness. At the time, there were many commercial tools to do that,
but none of them (that we could find) handled non-contiguous dates. To make it even more
challenging, the month, day and unused two digit year were packed as a single field, the
four digit year was binary. I wrote a tool (in Cobol) that could handle pathological date
formats, and sold it to the company.

All three happened at the same company, which was in the Dow 30 at the time, later
replaced by Microsoft. Shortly after 2000, management scrapped the three systems (along
with dozens more), replacing them with Peoplesoft on Unix. The reason wasn't poor quality;
they did it to save money on hardware and support.

Judson McClendon

unread,
Sep 5, 2007, 9:55:28 PM9/5/07
to
Many of you can do this kind of thing in your sleep, but others
might get something out of this article I posted many moons ago
about number base notation and base conversion. It is formatted
for a mono-spaced font like Courier.
--------
There is a simple underlying principle to all positional based notation
systems, and once you get that, it is fairly straightforward to decode
and convert numbers in any base. Starting with the units position,
each position to the left is valued base times greater. For example,
in base 10, the number 1234 is actually understood like this:

1234 = 1 * 10*10*10 = 1*1000 = 1000
+ 2 * 10*10 = 2* 100 = 200
+ 3 * 10 = 3* 10 = 30
+ 4 * 1 = 4* 1 = 4

Looking at it as powers of the base, the pattern is clearer:

1234 = 1 * 10^3 = 1*1000 = 1000
+ 2 * 10^2 = 2* 100 = 200
+ 3 * 10^1 = 3* 10 = 30
+ 4 * 10^0 = 4* 1 = 4

Any real number 'n' to the 'zeroth' power (n^0) = 1, so the units
position in any base is always valued at 1. The pattern is the same
for every base. Consider 1234 base 8, octal:

1234 = 1 * 8^3 = 1*512 = 512
+ 2 * 8^2 = 2* 64 = 128
+ 3 * 8^1 = 3* 8 = 24
+ 4 * 8^0 = 4* 1 = 4
----
668 decimal

Now consider 1111 base 2, binary:

1111 = 1 * 2^3 = 1*8 = 8
+ 1 * 2^2 = 1*4 = 4
+ 1 * 2^1 = 1*2 = 2
+ 1 * 2^0 = 1*1 = 1
--
15 decimal

Because each position to the left is exactly base times as large as
the position to the right, we do not need a symbol = base. That
value is represented by a 0 with a 1 to the left. For example, in
base 10 we don't need a symbol with a value of 10, because we use a
0 with 1 in the position to the left: 10. The same is true for
every base. In base 2 there is no symbol for 2, we use 10, in base
8 there is no symbol for 8, we use 10, in base 16 we have symbols
for all the values 0 through 15, then we use 10 for the base size.

In any base system, 10 is the way we write the 'base' number. Each
position increases from 0 up to the value of base -1, then we carry.

Base 2: 0 1 10
Base 3: 0 1 2 10
Base 4: 0 1 2 3 10
Base 5: 0 1 2 3 4 10
Base 8: 0 1 2 3 4 5 6 7 10
Base 10: 0 1 2 3 4 5 6 7 8 9 10
Base 16: 0 1 2 3 4 5 6 7 8 9 A B C D E F 10
Base 20: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J 10

Because our decimal system has no numeric digit above 9, we borrow
letters from the Roman alphabet to represent the values > 9. Note
that used in this way, the letters -do not- represent the alphabetic
characters we normally use them for. They are simply familiar
symbols used to represent single-digit values of 10 or more. This
is much easier than having to learn a whole new set of unfamiliar
symbols for bases > 10.

0 = 0 C = 12 O = 24
1 = 1 D = 13 P = 25
2 = 2 E = 14 Q = 26
3 = 3 F = 15 R = 27
4 = 4 G = 16 S = 28
5 = 5 H = 17 T = 29
6 = 6 I = 18 U = 30
7 = 7 J = 19 V = 31
8 = 8 K = 20 W = 32
9 = 9 L = 21 X = 33
A = 10 M = 22 Y = 34
B = 11 N = 23 Z = 35
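
The positional rule and the symbol table above translate directly into a short conversion routine. This is a sketch with made-up names; Python's built-in `int(text, base)` does the same job for bases 2 through 36.

```python
DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"  # symbol -> value is its position


def to_decimal(text, base):
    """Convert a whole number written in `base` (2..36) to a Python int."""
    if not 2 <= base <= len(DIGITS):
        raise ValueError("base must be between 2 and 36")
    value = 0
    for symbol in text.upper():
        digit = DIGITS.index(symbol)  # raises ValueError for an unknown symbol
        if digit >= base:
            raise ValueError(f"{symbol!r} is not a digit in base {base}")
        value = value * base + digit  # shift left one position, add the new digit
    return value


print(to_decimal("1234", 8), to_decimal("1111", 2), to_decimal("10", 16))
```

Multiplying the running value by the base at each step is equivalent to summing digit * base^position, as in the worked tables.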

The pattern above continues on into fractional values. Consider
the number 1234.5678 in base 10, decimal:

1234.5678 = 1 * 10^3 = 1*1000 = 1000
+ 2 * 10^2 = 2* 100 = 200
+ 3 * 10^1 = 3* 10 = 30
+ 4 * 10^0 = 4* 1 = 4
+ 5 * 10^-1 = 5* .1 = .5
+ 6 * 10^-2 = 6* .01 = .06
+ 7 * 10^-3 = 7* .001 = .007
+ 8 * 10^-4 = 8* .0001 = .0008

Or 1111.1111 in base 2, binary:

1111.1111 = 1 * 2^3 = 1* 8 = 8
+ 1 * 2^2 = 1* 4 = 4
+ 1 * 2^1 = 1* 2 = 2
+ 1 * 2^0 = 1* 1 = 1
+ 1 * 2^-1 = 1* 1/2 = .5
+ 1 * 2^-2 = 1* 1/4 = .25
+ 1 * 2^-3 = 1* 1/8 = .125
+ 1 * 2^-4 = 1* 1/16 = .0625
-------
15.9375
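
The expansions above can be checked mechanically. What follows is a minimal Python sketch, not part of the original post; the names `positional_value` and `digit_value` are illustrative assumptions. It uses exact fractions so that no rounding enters the check.

```python
from fractions import Fraction

# Symbols for digit values 0..35, as in the table above.
DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def digit_value(symbol, base):
    """Return the numeric value of one symbol, rejecting symbols >= base."""
    value = DIGITS.index(symbol)  # raises ValueError for unknown symbols
    if value >= base:
        raise ValueError(f"symbol {symbol!r} is not valid in base {base}")
    return value


def positional_value(text, base):
    """Return the exact value of a numeral such as '1111.1111' in `base`."""
    if not 2 <= base <= 36:
        raise ValueError("base must be between 2 and 36")
    whole, _, frac = text.upper().partition(".")
    total = Fraction(0)
    # Integer part: the rightmost digit is multiplied by base^0.
    for power, symbol in enumerate(reversed(whole)):
        total += digit_value(symbol, base) * Fraction(base) ** power
    # Fractional part: the first digit after the point is multiplied by base^-1.
    for power, symbol in enumerate(frac, start=1):
        total += digit_value(symbol, base) * Fraction(1, base ** power)
    return total


print(float(positional_value("1111.1111", 2)))  # 15.9375
print(positional_value("1234", 8))              # 668
```

Using `Fraction` rather than floating point keeps the sum exact, which matters for the discussion of binary fractions that follows.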

Note that the symbol '.' (or ',' for our European friends), which we
call a 'decimal point' in base 10 decimal, is called a 'binary point'
in base 2 binary, a 'hexadecimal point' in base 16 hexadecimal, an
'octal point' in base 8 octal, etc.

Any integer can be represented exactly in any base, but what about
fractional values? Unfortunately, the answer is "no". Many values
which can be represented exactly in one base result in an infinite
series of repeating digits in other bases. For example, the decimal
values .1 and .01 (and many others) cannot be exactly represented in
binary. This has some unpleasant side effects. If you run the BASIC
program below:

FOR I = 0 TO 1 STEP .1
PRINT I;
NEXT

You will probably (not all BASICs use binary) get an output similar
to this:

0 .1 .2 .3 .4 .5 .6 .7 .8000001 .9000001

Imagine your bank using math like that to calculate your bank balance.
What happened is that .1 is rounded to the nearest binary fraction,
which in this case is slightly greater than .1 decimal. The print
function then rounds this back to decimal when printing. This rounding
makes the result look okay for a while, until the cumulative error
reaches the point when it rounds up or down to a different printable
value for that precision. If we multiply imprecise rounded values,
the error is even greater, and if we use them as powers, as in loan or
statistical calculations, the error becomes literally 'exponential'.
This is a major reason why COBOL has historically been so popular
for business programming, where handling decimal fractions accurately
is essential. COBOL supports scaled decimal math, thus avoiding this
nasty little problem. The COBOL programmer must still deal with
normal round off errors, but COBOL's scaled decimal math avoids
the binary/decimal round off issue entirely.
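
The same effect can be reproduced outside BASIC. The sketch below is an illustration in Python, not something from the original post. It assumes Python's `float` (binary double precision) as the stand-in for binary floating point and the standard `decimal` module as the stand-in for scaled decimal arithmetic. The helper name `accumulate` is made up for this example.

```python
from decimal import Decimal


def accumulate(step, count):
    """Add `step` to a running total `count` times, like the FOR loop above."""
    total = step * 0  # a zero of the same numeric type as `step`
    for _ in range(count):
        total += step
    return total


# Binary floating point: 0.1 has no exact binary representation,
# so ten additions do not land exactly on 1.0.
binary_total = accumulate(0.1, 10)

# Scaled decimal: 0.10 is stored exactly, so the total is exactly 1.00.
decimal_total = accumulate(Decimal("0.10"), 10)

print(binary_total == 1.0)               # False
print(decimal_total == Decimal("1.00"))  # True
```

Double precision carries more digits than the single precision BASIC output shown earlier, so the error appears further to the right, but it is still there.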

Below are programs in BASIC and C which convert integers from any base
to any other base, where base = 2-36. The limit of 36 is because we
need 'base' number of symbols to represent a number in a given 'base'.
There are 10 numeric digits plus 26 letters = 36 symbols. The maximum
value we can convert is limited to the largest value the language can
store as an integer variable.

* NOTE * Programs omitted to save space. You can download source
and executable from my website below. Look for BASECONV.ZIP
under C, PowerBASIC Console Compiler, PowerBASIC DOS or
QuickBasic. The QuickBasic version runs with QBASIC or MS PDS.
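
Since the programs themselves are not reproduced here, the following is a minimal Python sketch of the same idea. It is not the BASECONV source, and the function name `convert_base` is an assumption made for illustration. It relies on Python's built-in `int(text, base)`, which accepts bases 2 through 36.

```python
# Symbols for digit values 0..35: ten numeric digits plus 26 letters.
DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def convert_base(numeral, from_base, to_base):
    """Convert an integer numeral written in `from_base` to `to_base`."""
    for base in (from_base, to_base):
        if not 2 <= base <= 36:
            raise ValueError("bases must be between 2 and 36")
    value = int(numeral, from_base)  # parse the input numeral
    if value == 0:
        return "0"
    sign = "-" if value < 0 else ""
    value = abs(value)
    symbols = []
    # Repeated division by the target base yields digits right to left.
    while value:
        value, remainder = divmod(value, to_base)
        symbols.append(DIGITS[remainder])
    return sign + "".join(reversed(symbols))


print(convert_base("1234", 8, 10))  # 668
print(convert_base("1111", 2, 16))  # F
```

Python integers are arbitrary precision, so the integer-size limit described above does not apply to this sketch; it would apply to a BASIC or C version.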

Paul Raulerson

unread,
Sep 5, 2007, 10:00:11 PM9/5/07
to

"Robert" <n...@e.mail> wrote in message
news:0ukid3h951nksjv34...@4ax.com...
>>Alignment DOES matter on machines where
>>this is not tolerated.
>
> Modern machines have two or three levels of cache between the CPU and
> memory. There are no alignment issues in a cache. But compilers that
> THINK alignment is important shoot themselves in the foot by generating
> extra instructions to align the number to speed things up. The extra
> instructions are counterproductive, they actually slow things down.

Oh no- alignment does indeed matter. It is true on Mainframes, Alphas,
and, to my certain but not firsthand knowledge, Itaniums. Trust me, I just
got bitten on that a week or two ago. :)

-Paul


Judson McClendon

unread,
Sep 5, 2007, 10:11:06 PM9/5/07
to
"Robert" <n...@e.mail> wrote:

> Howard Brazee <how...@brazee.net> wrote:
>>Nothing is forever. Make decisions about the system you will be
>>using now, looking at trends that will continue for the life of that
>>system.
>
> That thinking caused Y2K.
>
> One system created in 1990, with estimated life of 3-5 years, used a
> one digit year to conserve space. In 1999, the company paid big bucks
> to expand the year to four digits.

The last system I wrote using 2 digit years was a PC based client server
inventory system in '88-'89. Written in COBOL, running under DOS,
using a Novell Netware server, and Btrieve as a database engine. Who
knew the thing would still be used in 2007, running as a 32 bit application
under WinXP using Windows Server 2003 and Btrieve, with no end in
sight? I'll have to say that, back in the 1970's and early 1980's, we simply
had no baseline to indicate that applications we were writing then would
still be operational in 2000. I remember changing single digit years in
punched card applications to 2 digits in 1969, but things were changing
so fast, and the computer industry was so young, I don't think anyone
realized the longevity some of that software would have. But that should
not be a problem any more. Except for COBOL, probably no language
platform being used today will compile code more than a very few years
old. I have clients running COBOL programs of mine that haven't been
recompiled in 20 years. Only on a mainframe. :-)

Robert

unread,
Sep 5, 2007, 11:11:45 PM9/5/07
to
On Wed, 05 Sep 2007 21:16:43 -0300, Clark F Morris <cfmp...@ns.sympatico.ca> wrote:

>On Tue, 04 Sep 2007 22:46:50 -0500, Robert <n...@e.mail> wrote:
>
>>On Tue, 04 Sep 2007 22:27:29 -0300, Clark F Morris <cfmp...@ns.sympatico.ca> wrote:
>>
>>
>>>In regard to packed decimal, if you are running business programs on a
>>>360/370/390/z series machine, then packed decimal makes sense.
>>
>>Only if you plan to stay with IBM forever.
>>
>>>It avoids several problems.
>>
>>Modern computers are a *big* problem for IBM marketing.
>>
>>>On other series of machines that don't have
>>>full fixed decimal arithmetic, different rules apply.
>>
>>You don't mean decimal arithmetic, you mean integer arithmetic. Binary numbers work fine
>>as pennies counters.
>
>I MEAN decimal arithmetic. Rounding and division can give different
>answers depending on the base (2, 8, 10, 12, 16, etc.). I remember my
>8th grade math teacher discussing dozenal (base twelve).

Imprecision is in the fractional part to the right of the decimal, not the part left of
the decimal. The solution is simple -- scale the numerator up by factors of 10 before a
division, discard remainder, scale the quotient down by the same factors of 10.

For example, to compute a whole number percentage say (N * 100) / D, rather than (N / D) *
100 (always a good practice to avoid lost digits). If you need it rounded, say (((N *
1000) / D) + 5) / 10

There is no imprecision on add, subtract or multiply.
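
The two formulas above can be tried with integer arithmetic. This is a sketch in Python, with hypothetical function names, and it assumes N is non-negative and D is positive. Python's `//` rounds toward negative infinity, so negative inputs would behave differently from a truncating divide.

```python
def percent_truncated(n, d):
    """(N * 100) / D with the remainder discarded."""
    return (n * 100) // d


def percent_rounded(n, d):
    """(((N * 1000) / D) + 5) / 10: carry one extra digit, then round."""
    return ((n * 1000) // d + 5) // 10


def percent_lossy(n, d):
    """(N / D) * 100: dividing first throws the digits away."""
    return (n // d) * 100


print(percent_truncated(2, 3))  # 66
print(percent_rounded(2, 3))    # 67
print(percent_lossy(2, 3))      # 0
```

The third function shows why the order matters: dividing before scaling discards the digits the percentage needs.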

>>>Clark Morris who started on an IBM 650, went to a Honeywell 800, an
>>>IBM 1401, a RCA 301, and various models of IBM 360, 370, 4300, 390 and
>>>z series.
>>
>>Another newbie who never worked on an IBM 704.
>But did you work on the 301?

No retry on IO errors. The machine stopped and a red light came on. The operator fixed the
data in memory (how?), then restarted with the next instruction.

Disk drive made by Wurlitzer that looked and functioned exactly like a juke box. Each disk
held 4K characters.

Three speed card reader that required the program to start downshifting a few hundred
cards from the end (how did it know?) Ran at fixed speed, like a check sorter, here comes
a card, ready or not. The card reader was driving the machine rather than the other way
around.

Tape hook and leader that never worked. It was easier to cut the tape with scissors when
dismounting, splice tape to leader when mounting.

Magnetic card unit that jammed and mutilated cards all the time.

Accidentally printing nulls blew all fuses in the printer.

No operating system.

Arithmetic by table lookup, with table in unprotected low memory. Answers would come out
wrong because another program stomped on the table an hour ago. I called it CADET for
Can't Add, Doesn't Even Try.

Branch to zero made it execute the arithmetic table, the first two characters of which,
01, were the instruction to read a card. You knew your program blew up if it read a card
unexpectedly.

The machine was a huge collection of practical jokes. I pictured engineers giving each
other high fives and saying "let's see you top THIS".

RCA: the most trusted name in electronics.

I rewrote the Spectra/70 file system (Logical IOCS) so native programs could read and
write 301 tapes. It was much faster than running under emulation.

Robert

unread,
Sep 5, 2007, 11:41:49 PM9/5/07
to

The Itanium throws an exception (interrupt) when running in IA-64 (native) mode. It does
not when running in IA-32 mode. The Alpha also throws an exception. Mainframes (z9) do
not.

Ask any mainframe systems programmer whether he or she worries about alignment. The answer
will be 'not 99% of the time'. If you ask an applications programmer, you might get a
wrong answer.

Paul Raulerson

unread,
Sep 5, 2007, 11:55:02 PM9/5/07
to

"Robert" <n...@e.mail> wrote in message
news:0btud3licvbfo1r77...@4ax.com...

Okay- I'll ask myself. Yep! I care about it.

Haven't you ever noticed that assembler code is dotted with DS 0H's?
Programs will fail without proper alignment. That is true on a zArch
machine. Data that is misaligned will cause a runtime error too.

-Paul


Robert

unread,
Sep 6, 2007, 12:21:27 AM9/6/07
to

Yes, but I thought they were stupid. I never used them unless there was a compelling
reason.

>Programs will fail without proper alignment. That is true on a zArch
>machine. Data that is misaligned will cause a runtime error too.

I'll take your word for it. I lost my faith in 1982 when the IBM PC came out. It was an
epiphany. I had a relapse in 1998-2000, saw nothing had changed in 16 years, returned to
real programming.

In the day, I completely rewrote the CICS memory manager. My GETMAIN and LINK were >100
times faster than IBM's. If alignment had been a speed issue, surely I would have
encountered it there.

HeyBub

unread,
Sep 6, 2007, 8:14:17 AM9/6/07
to
Robert wrote:
>
> I'll take your word for it. I lost my faith in 1982 when the IBM PC
> came out. It was an epiphany. I had a relapse in 1998-2000, saw
> nothing had changed in 16 years, returned to real programming.
>
> In the day, I completely rewrote the CICS memory manager. My GETMAIN
> and LINK were >100 times faster than IBM's. If alignment had been a
> speed issue, surely I would have encountered it there.

Think how fast your stuff would have been had you taken alignment into
account.

Five or six orders of magnitude faster instead of two. You'd have gotten a
raise, married your childhood sweetheart, and founded a dot-com empire.

Good thing it didn't matter.


Howard Brazee

unread,
Sep 6, 2007, 11:07:12 AM9/6/07
to
On Wed, 5 Sep 2007 17:53:44 +0000 (UTC), docd...@panix.com () wrote:

>>Make decisions about the system you will be
>>using now, looking at trends that will continue for the life of that
>>system.
>
>This may be a reason why a bunch of folks made a decent living for a few
>years converting 2-digit years to 4-digit ones.

They didn't make decisions for the life of their system, did they?

If IBM decides to drop hardware support for packed decimal, then
programs on the IBM mainframes won't be as efficient. But they will
work, and efficiency won't be as expensive then.

And it doesn't make sense to write your IBM mainframe CoBOL programs
in such a way to be efficient on Intel now that Intel doesn't have
hardware support for packed decimal. The unlikely event of reusing
the same programs and data without reorganizing them before switching
to Intel is not something to base a strategy on.

Sometimes it is cheaper to buy a condo now, and then move when you
have children and different needs.

Robert

unread,
Sep 6, 2007, 10:39:54 PM9/6/07
to
On Thu, 06 Sep 2007 09:07:12 -0600, Howard Brazee <how...@brazee.net> wrote:

>On Wed, 5 Sep 2007 17:53:44 +0000 (UTC), docd...@panix.com () wrote:
>
>>>Make decisions about the system you will be
>>>using now, looking at trends that will continue for the life of that
>>>system.
>>
>>This may be a reason why a bunch of folks made a decent living for a few
>>years converting 2-digit years to 4-digit ones.
>
>They didn't make decisions for the life of their system, did they?
>
>If IBM decides to drop hardware support for packed decimal, then
>programs on the IBM mainframes won't be as efficient. But they will
>work, and efficiency won't be as expensive then.
>
>And it doesn't make sense to write your IBM mainframe CoBOL programs
>in such a way to be efficient on Intel now that Intel doesn't have
>hardware support for packed decimal.

Binary is more efficient than packed on TODAY'S MAINFRAME.

Packed decimal (COMP-3) compared to binary (COMP or COMP-4) with TRUNC(STD)
using 1 to 9 digits: packed decimal is 30% to 60% slower than binary
using 10 to 17 digits: packed decimal is 55% to 65% faster than binary
using 18 digits: packed decimal is 74% faster than binary

Packed decimal (COMP-3) compared to binary (COMP or COMP-4) with TRUNC(OPT)
using 1 to 8 digits: packed decimal is 160% to 200% slower than binary
using 9 digits: packed decimal is 60% slower than binary
using 10 to 17 digits: packed decimal is 150% to 180% slower than binary
using 18 digits: packed decimal is 74% faster than binary

>The unlikely event of reusing
>the same programs and data without reorganizing them before switching
>to Intel is not something to base a strategy on.

You contradict yourself here. If IBM dropped support for packed, programs would continue
to work. If programs were moved to Intel, they would have to be reorganized. That's the
kind of thing an IBM salesman would say.

It is not unlikely; there is lots of former mainframe Cobol running on Unix.
Users don't 'reorganize' it, they just recompile it.

Tim Josling

unread,
Sep 7, 2007, 4:08:11 AM9/7/07
to
On Sep 2, 5:35 am, Robert <n...@e.mail> wrote:
> ...
> I see OC does that by using the GCC C compiler as its back end. The problem with that
> approach is you can't generate inline code for Cobol things that have no corresponding C
> syntax. For instance, a SEARCH or STRING looking for a one byte delimiter on Intel SHOULD
> generate an inline REPNE SCASB. There's no way to say that in C; you have to call a
> function.

Unless you're using GCC:

http://www.ibiblio.org/gferg/ldp/GCC-Inline-Assembly-HOWTO.html
Here is an example:

http://www.gelato.unsw.edu.au/lxr/source/arch/i386/lib/memcpy.c?a=i386

#include <linux/config.h>
#include <linux/string.h>
#include <linux/module.h>

#undef memcpy
#undef memset

void *memcpy(void *to, const void *from, size_t n)
{
#ifdef CONFIG_X86_USE_3DNOW
        return __memcpy3d(to, from, n);
#else
        return __memcpy(to, from, n);
#endif
}
EXPORT_SYMBOL(memcpy);

void *memset(void *s, int c, size_t count)
{
        return __memset(s, c, count);
}
EXPORT_SYMBOL(memset);

void *memmove(void *dest, const void *src, size_t n)
{
        int d0, d1, d2;

        if (dest < src) {
                memcpy(dest,src,n);
        } else {
                /* Copy backwards: set the direction flag, repeat movsb, clear it. */
                __asm__ __volatile__(
                        "std\n\t"
                        "rep\n\t"
                        "movsb\n\t"
                        "cld"
                        : "=&c" (d0), "=&S" (d1), "=&D" (d2)
                        :"0" (n),
                         "1" (n-1+(const char *)src),
                         "2" (n-1+(char *)dest)
                        :"memory");
        }
        return dest;
}
EXPORT_SYMBOL(memmove);

Tim Josling

docd...@panix.com

unread,
Sep 7, 2007, 5:28:59 AM9/7/07
to
In article <7g50e35s5bk6svu62...@4ax.com>,

Howard Brazee <how...@brazee.net> wrote:
>On Wed, 5 Sep 2007 17:53:44 +0000 (UTC), docd...@panix.com () wrote:
>
>>>Make decisions about the system you will be
>>>using now, looking at trends that will continue for the life of that
>>>system.
>>
>>This may be a reason why a bunch of folks made a decent living for a few
>>years converting 2-digit years to 4-digit ones.
>
>They didn't make decisions for the life of their system, did they?

It might be, Mr Brazee, that they certainly *did* make decisions for the
life of their systems... as they saw/understood/intended these systems to
be; I recall being taught something about a system life expectancy of five
to seven years as being reasonable.

That others then said 'this Winton is chugging along right fine, no
need to get one of those fancy new T-model Fords' is not the decision of
the designers.

DD

Robert

unread,
Sep 7, 2007, 11:13:34 PM9/7/07
to

In the Good Old Days (B. PC), many Cobol compilers had ENTER ASSEMBLY LANGUAGE. One wrote
normal looking assembly language (no macros or fancy stuff) referencing Cobol data names
(qualified in the Cobol manner) or (rarely) procedure names. The block terminated with
ENTER COBOL. That was judged to be A Bad Thing and fell out of usage.

You could provide a comparable feature with ENTER C, which would let the programmer
write inline functions you didn't anticipate.

Roger While

unread,
Sep 8, 2007, 6:25:59 AM9/8/07
to
OC uses a C compiler as backend,
not necessarily gcc.
OC has been ported to various incarnations of C.
Assembler code may be good for a particular implementation.
See (in OC) libcob/byteswap.h and/or libcob/codegen.h and/or libcob/move.h.
Note that memset with gcc up to and including
4.1 will never inline (depending on parameters, i.e. whether they are
determinable at compile time).
OC does try to optimize for alignment-tolerant machines.

The OC lib is continually being optimized (with input from
many sources).
A rather interesting port was to an old DEC (pre 21164 processor) that
does not like, e.g., (in C terms) short accesses at anything other than
a 4 byte boundary.

And, yes, OC works on this dinosaur (now).

Thanks to all the people that are using/testing OC :-)

PS.
For OC users, download latest.
Then if you want to generate C code,
simply supply the -C option.
eg.
cobc -C myprog.cob

Note the includes.
When one specifies an optimization option
(-O, -O2, -Os), then the codegen optimizations
take place.

Roger

"Tim Josling" <t...@melbpc.org.au> wrote in message
news:1189152491.9...@k79g2000hse.googlegroups.com...

Roger While

unread,
Sep 8, 2007, 7:41:12 AM9/8/07
to
Regarding the posted source -
This does not compile with current MF and OC.
There is a mismatch of parentheses in the
last compute statement.
Also "SECONDS" is a reserved word.

Roger


Roger While

unread,
Sep 8, 2007, 10:14:17 AM9/8/07
to
Can we also be quite clear that
NOTRUNC negates the use of
OVERFLOW/EXCEPTION,
therefore that cannot occur.
And with an ON clause, it is optimized away.

Roger


Tim Josling

unread,
Sep 9, 2007, 5:04:08 AM9/9/07
to
On Sep 8, 8:25 pm, "Roger While" <si...@sim-basis.de> wrote:
> OC uses a C compiler as backend,
> not necessarily gcc.
> OC has been ported to various incarnations of C.
> Assembler code may be good for a particular implementation.
> See (in OC) libcob/byteswap.h and/or libcob/codegen.h and/or libcob/move.h.
> Note that memset with gcc up to and including
> 4.1 will never inline (depending on parameters, i.e. whether they are
> determinable at compile time).
> OC does try to optimize for alignment-tolerant machines.
>
> ... Roger
>

Never is a big word. The first test I did, it inlined:

#include <stdlib.h>
#include <memory.h>
int imaginary_function(char*);

int main ()
{
    char aa[4];
    memset(aa,31,4);
    imaginary_function(aa);
    return 0;
}

+

gcc -Wall -dgdb -S temp.c

(on
Linux tim-asus 2.6.20-16-generic #2 SMP Thu Jun 7 19:00:28 UTC 2007
x86_64 GNU/Linux
gcc (GCC) 4.1.2 (Ubuntu 4.1.2-0ubuntu4))

=>

.file "temp.c"
.text
.globl main
.type main, @function
main:
.LFB5:
pushq %rbp
.LCFI0:
movq %rsp, %rbp
.LCFI1:
subq $16, %rsp
.LCFI2:
leaq -16(%rbp), %rax
movl $522133279, (%rax)
leaq -16(%rbp), %rdi
call imaginary_function
movl $0, %eax
leave
ret
.LFE5:
.size main, .-main
.section .eh_frame,"a",@progbits
.Lframe1:
.long .LECIE1-.LSCIE1
.LSCIE1:
.long 0x0
.byte 0x1
.string "zR"
.uleb128 0x1
.sleb128 -8
.byte 0x10
.uleb128 0x1
.byte 0x3
.byte 0xc
.uleb128 0x7
.uleb128 0x8
.byte 0x90
.uleb128 0x1
.align 8
.LECIE1:
.LSFDE1:
.long .LEFDE1-.LASFDE1
.LASFDE1:
.long .LASFDE1-.Lframe1
.long .LFB5
.long .LFE5-.LFB5
.uleb128 0x0
.byte 0x4
.long .LCFI0-.LFB5
.byte 0xe
.uleb128 0x10
.byte 0x86
.uleb128 0x2
.byte 0x4
.long .LCFI1-.LCFI0
.byte 0xd
.uleb128 0x6
.align 8
.LEFDE1:
.ident "GCC: (GNU) 4.1.2 (Ubuntu 4.1.2-0ubuntu4)"
.section .note.GNU-stack,"",@progbits
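
The evidence of inlining in the listing is the single `movl $522133279, (%rax)` with no call to memset. A short Python check, added here as an illustration and not part of the original post, shows that this immediate is simply the fill byte 31 repeated four times:

```python
fill_byte = 31  # second argument of memset(aa, 31, 4)
count = 4       # third argument of memset(aa, 31, 4)

# Four bytes of 0x1F, read as one 32-bit little-endian integer (x86_64 order).
pattern = bytes([fill_byte]) * count
immediate = int.from_bytes(pattern, "little")

print(immediate)       # 522133279
print(hex(immediate))  # 0x1f1f1f1f
```

Because all four bytes are identical, the byte order does not change the value in this particular case.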

Clark F Morris

unread,
Sep 9, 2007, 7:59:49 PM9/9/07
to

Instructions have to be aligned on a half word (16 bit) boundary,
hence the use of DS 0H which is primarily used to have a tag
associated with an instruction sequence instead of a specific
instruction. Unless floating point has to be aligned, the binary
numbers don't have to be aligned although I believe there is a
moderate performance penalty. I think some things have to be page
aligned but that doesn't concern anything COBOL programmers normally
worry about.

Richard

unread,
Sep 9, 2007, 10:52:29 PM9/9/07
to
On Sep 1, 2:22 pm, Robert <n...@e.mail> wrote:
> In the Micro Focus manual Server Express (2.2 & 4.0):Program Development, chapter 1 part 1
> is titled Writing Efficient Programs. Its top billing tells us they think speed is a Very
> Important Topic we should know about. For fun, I put their advice to the test.
>
> The machine I used is a high-end HP Superdome with 64 PA (RISC) processors.

The MicroFocus advice is of a general nature and may, or may not,
apply on particular processors, or with options that may make the
items moot.

> I added a few comparisons that are not from the MF manual, but are widely believed in the
> Cobol community. They are styled "Legacy:".

'widely believed' is just Robert's way of saying that he has an
opinion and wants to denigrate 'the Cobol community' by asserting they
are wrong.

> Execution times are in microseconds (us), with
> a resolution of plus or minus 5.

Plus or minus 5 then.

> Proposition: Use simple two-operand arithmetic statements wherever possible.
>
> Test:
> 05 binary-number binary pic s9(09) sync.
>
> add 1 to binary-number *> 1 us
> compute binary-number = binary-number + 1 *> 1 us
>
> add 1 to binary-number
> multiply 5 by binary-number
> divide 5 into binary-number *> 50 us
>
> compute binary-number = ((binary-number + 1) * 5) / 5 *> 445 us
>
> Finding: busted for simple cases, confirmed for cases with more than one operation.

The _reason_ that COMPUTE is slower is that any intermediate results
are required to have a large range and accuracy. Simple assignment has
no 'intermediate result'.

You have 'busted' nothing.

> Proposition: "Do not use the REMAINDER, ROUNDED, ON SIZE ERROR or CORRESPONDING phrases if
> you want the fastest performance. No optimization is done on arithmetic statements if the
> ON SIZE ERROR phrase is used. For this reason, we recommend you do not use this phrase if
> high performance is required. The ROUNDED phrase impacts performance, but it is generally
> faster to use ROUNDED than try to round the result using your own routine. "
>
> Test:
> compute binary-number rounded = binary-number + 1 *> 1 us (no penalty)
> add 1 to binary-number *> 15 us
> on size error display 'overflow'
> end-add
>
> Finding: busted for rounded, confirmed for size error.

You have 'busted' nothing. Rounding will not be needed on an add. You
have merely shown that the compiler is cleverer than you are.

> Legacy belief: indexes are faster than subscripts
>
> Test:
> 05 s-subscript binary pic s9(09) sync.
> 01 misaligned-area sync.
> 05 array-element occurs 4096 indexed x-index.
> 10 misaligned-number comp-5 pic s9(09).
> 10 to-cause-misalignment pic x(01).
> move array-element (s-subscript) to test-byte *> 3 us
> move array-element (x-index) to test-byte *> 6 us
>
> Finding: BUSTED. Index is actually slower.

Were they moving the same element ? In any case this is just one
factor in the use of subscript/index. What about performance of
setting the value using SET, UP/DOWN, or MOVE and then accessing the
table ?


> Proposition: When incrementing or decrementing a counter, terminate it with a literal
> value rather than a value held in a data item. For example, to execute a loop n times, set
> the counter to n and then decrement the counter until it becomes zero, rather than
> incrementing the counter from zero to n.
>
> Test:
> perform varying binary-number from 10 by -1 until binary-number = 0 *> 150 us
> perform varying binary-number from 1 by 1 until binary-number > 10 *> 154 us
>
> Finding: BUSTED

You have busted nothing. You don't seem to know what a 'data item' is.

> Proposition: Access to tables defined with OCCURS ... DEPENDING is less efficient than
> access to tables of fixed size, and so should be avoided where high performance is needed.
>
> Test:
>
> 01 depending-area.
> 05 depending-element occurs 1 to 4096 depending on binary-number.
> 10 comp-5 pic s9(09).
> 10 pic x(01).
> move array-element (s-subscript) to test-byte *> 3 us
> move depending-element (s-subscript) to test-byte *> 3 us
>
> Finding: BUSTED

You have busted nothing. 3us is less than the 5us resolution. Now try
it with bound checking on (which is the default).

You may well claim that turning bound checking off is more effective
than avoiding ODO, but that is a different claim.
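
The resolution argument can be made concrete. The sketch below is an illustration in Python, not something posted in the thread. It assumes the stated resolution of plus or minus 5 us applies independently to each measurement, so two timings are treated as distinguishable only when they differ by more than twice the resolution. That worst-case rule and the function name `distinguishable` are assumptions.

```python
def distinguishable(time_a, time_b, resolution=5):
    """Return True if two timings differ by more than their combined error.

    Each timing is taken to be accurate to +/- `resolution`, so the
    worst-case uncertainty of the difference is 2 * resolution.
    """
    return abs(time_a - time_b) > 2 * resolution


# Timings in microseconds quoted in the thread.
print(distinguishable(3, 3))      # False: fixed table vs. OCCURS DEPENDING
print(distinguishable(3, 6))      # False: subscript vs. index
print(distinguishable(150, 154))  # False: counting down vs. counting up
print(distinguishable(50, 445))   # True: three statements vs. one COMPUTE
```

Under this assumption, only the large differences in the quoted results stand clear of the measurement error.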


> Proposition: Arithmetic on COMP-3 data items is performed in packed decimal and is much
> slower than arithmetic on COMP items. It should be avoided.
>
> Test:
> 05 display-number pic 9(09).
> 05 packed-number comp-3 pic s9(09).
>
> add 1 to display-number *> 174 us
> add 1 to packed-number *> 160 us
>
> Finding: CONFIRMED. Packed is almost as slow as display. It was fast on 1970-era
> mainframes. There is no longer any reason to use it. If you want to save space, look at
> space-filled strings and filler-padding.

That result is highly machine dependent. When I cared about such
things I found that packed was an advantage where there was only a
small number of arithmetic operations (such as adding a column) and a
large amount of display formatting (such as printing the column and
the total). But that was on 2.5 MHz 8085 machines and I haven't cared
since.

> To be continued with the most unexpected and interesting case: does aligning numbers on
> memory boundaries matter?

On some hardware it matters enormously. For example a 680x0 is byte
addressable but gives an exception if an add is done using an odd
address.
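
To see where odd addresses come from in the quoted test layout, here is a small Python sketch, not from the thread. It assumes each array element is 5 bytes (a 4-byte binary number followed by one byte of filler), that the group itself starts on an aligned address, and that no slack bytes are inserted between elements. The names are illustrative.

```python
# Assumed element layout: 4-byte binary number + 1-byte filler.
ELEMENT_SIZE = 4 + 1


def element_offsets(count, element_size=ELEMENT_SIZE):
    """Byte offset of the number at the start of each array element."""
    return [index * element_size for index in range(count)]


def is_aligned(offset, boundary):
    """True if `offset` falls on a multiple of `boundary` bytes."""
    return offset % boundary == 0


offsets = element_offsets(8)
odd_offsets = [o for o in offsets if not is_aligned(o, 2)]
word_aligned = [o for o in offsets if is_aligned(o, 4)]

print(offsets)       # [0, 5, 10, 15, 20, 25, 30, 35]
print(odd_offsets)   # [5, 15, 25, 35]
print(word_aligned)  # [0, 20]
```

With a 5-byte element, every second number starts on an odd address and only every fourth starts on a 4-byte boundary, which is the situation that faults on hardware that requires alignment.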

Your usual highly opinionated rant that doesn't let facts get in the
way.

Richard

unread,
Sep 9, 2007, 10:56:17 PM9/9/07
to
On Sep 2, 2:52 pm, Robert <n...@e.mail> wrote:

> I walked out of that place after one week.

I always wonder when that is said at whose option the walk was done.


Robert

unread,
Sep 10, 2007, 1:54:33 AM9/10/07
to
On Sun, 09 Sep 2007 19:52:29 -0700, Richard <rip...@Azonic.co.nz> wrote:

>On Sep 1, 2:22 pm, Robert <n...@e.mail> wrote:

>> Proposition: "Do not use the REMAINDER, ROUNDED, ON SIZE ERROR or CORRESPONDING phrases if
>> you want the fastest performance. No optimization is done on arithmetic statements if the
>> ON SIZE ERROR phrase is used. For this reason, we recommend you do not use this phrase if
>> high performance is required. The ROUNDED phrase impacts performance, but it is generally
>> faster to use ROUNDED than try to round the result using your own routine. "
>>
>> Test:
>> compute binary-number rounded = binary-number + 1 *> 1 us (no penalty)
>> add 1 to binary-number *> 15 us
>> on size error display 'overflow'
>> end-add
>>
>> Finding: busted for rounded, confirmed for size error.
>
>You have 'busted' nothing. Rounding will not be needed on an add. You
>have merely shown that the compiler is cleverer than you are.

A later test computed binary-number rounded = binary-number + .5, which ran slowly. The
compiler wasn't clever enough to just add 1.

>> Legacy belief: indexes are faster than subscripts
>>
>> Test:
>> 05 s-subscript binary pic s9(09) sync.
>> 01 misaligned-area sync.
>> 05 array-element occurs 4096 indexed x-index.
>> 10 misaligned-number comp-5 pic s9(09).
>> 10 to-cause-misalignment pic x(01).
>> move array-element (s-subscript) to test-byte *> 3 us
>> move array-element (x-index) to test-byte *> 6 us
>>
>> Finding: BUSTED. Index is actually slower.
>
>Were they moving the same element ? In any case this is just one
>factor in the use of subscript/index. What about performance of
>setting the value using SET, UP/DOWN, or MOVE and then accessing the
>table ?

You hit on *why* the subscript was faster. The optimizer figured out it wasn't changed
inside the loop, so used the one it had in a register. The index was reloaded each
iteration.

SET .. UP BY 1 should be the same speed as adding 1 to a binary number.

>> Proposition: When incrementing or decrementing a counter, terminate it with a literal
>> value rather than a value held in a data item. For example, to execute a loop n times, set
>> the counter to n and then decrement the counter until it becomes zero, rather than
>> incrementing the counter from zero to n.
>>
>> Test:
>> perform varying binary-number from 10 by -1 until binary-number = 0 *> 150 us
>> perform varying binary-number from 1 by 1 until binary-number > 10 *> 154 us
>>
>> Finding: BUSTED
>
>You have busted nothing. You don't seem to know what a 'data item' is.

In a later test, one of the limits was a data item. The literal ran only slightly faster.
PERFORM 10 TIMES was significantly slower than either.

>> Proposition: Access to tables defined with OCCURS ... DEPENDING is less efficient than
>> access to tables of fixed size, and so should be avoided where high performance is needed.
>>
>> Test:
>>
>> 01 depending-area.
>> 05 depending-element occurs 1 to 4096 depending on binary-number.
>> 10 comp-5 pic s9(09).
>> 10 pic x(01).
>> move array-element (s-subscript) to test-byte *> 3 us
>> move depending-element (s-subscript) to test-byte *> 3 us
>>
>> Finding: BUSTED
>
>You have busted nothing. 3us is less than the 5us resolution. Now try
>it with bound checking on (which is the default).
>
>You may well claim that turning bound checking off is more effective
>than avoiding ODO, but that is a different claim.

Bounds checking doesn't use the ODO variable on Micro Focus. It checks that the subscript
is below the allocated maximum.

>> To be continued with the most unexpected and interesting case: does aligning numbers on
>> memory boundaries matter?
>
>On some hardware it matters enormously. For example a 680x0 is byte
>addressable but gives an exception if an add is done using an odd
>address.

Same on the HP PA, as I demonstrated.

>Your usual highly opinionated rant that doesn't let facts get in the
>way.

Measured times ARE facts. I didn't choose the propositions to test; they came from a Micro
Focus manual.

Richard

unread,
Sep 10, 2007, 2:22:41 AM9/10/07
to
On Sep 10, 5:54 pm, Robert <n...@e.mail> wrote:
> On Sun, 09 Sep 2007 19:52:29 -0700, Richard <rip...@Azonic.co.nz> wrote:
> >On Sep 1, 2:22 pm, Robert <n...@e.mail> wrote:
> >> Proposition: "Do not use the REMAINDER, ROUNDED, ON SIZE ERROR or CORRESPONDING phrases if
> >> you want the fastest performance. No optimization is done on arithmetic statements if the
> >> ON SIZE ERROR phrase is used. For this reason, we recommend you do not use this phrase if
> >> high performance is required. The ROUNDED phrase impacts performance, but it is generally
> >> faster to use ROUNDED than try to round the result using your own routine. "
>
> >> Test:
> >> compute binary-number rounded = binary-number + 1 *> 1 us (no penalty)
> >> add 1 to binary-number *> 15 us
> >> on size error display 'overflow'
> >> end-add
>
> >> Finding: busted for rounded, confirmed for size error.
>
> >You have 'busted' nothing. Rounding will not be needed on an add. You
> >have merely shown that the compiler is cleverer than you are.
>
> A later test computed binary-number rounded = binary-number + .5, which ran slowly. The
> compiler wasn't clever enough to just add 1.

Do you know of _any_ compiler for any language that would do that ?


> >> Legacy belief: indexes are faster than subscripts
>
> >> Test:
> >> 05 s-subscript binary pic s9(09) sync.
> >> 01 misaligned-area sync.
> >> 05 array-element occurs 4096 indexed x-index.
> >> 10 misaligned-number comp-5 pic s9(09).
> >> 10 to-cause-misalignment pic x(01).
> >> move array-element (s-subscript) to test-byte *> 3 us
> >> move array-element (x-index) to test-byte *> 6 us
>
> >> Finding: BUSTED. Index is actually slower.
>
> >Were they moving the same element ? In any case this is just one
> >factor in the use of subscript/index. What about performance of
> >setting the value using SET, UP/DOWN, or MOVE and then accessing the
> >table ?
>
> You hit on *why* the subscript was faster. The optimizer figured out it wasn't changed
> inside the loop, so used the one it had in a register. The index was reloaded each
> iteration.

So another advice might be: "Don't loop around doing the same
identical piece of code, just do it once".

Once again the compiler was cleverer than you are.


> SET .. UP BY 1 should be the same speed as adding 1 to a binary number.

Assertion rather than evidence. That's par for the course with you.


> >> Proposition: When incrementing or decrementing a counter, terminate it with a literal
> >> value rather than a value held in a data item. For example, to execute a loop n times, set
> >> the counter to n and then decrement the counter until it becomes zero, rather than
> >> incrementing the counter from zero to n.
>
> >> Test:
> >> perform varying binary-number from 10 by -1 until binary-number = 0 *> 150 us
> >> perform varying binary-number from 1 by 1 until binary-number > 10 *> 154 us
>
> >> Finding: BUSTED
>
> >You have busted nothing. You don't seem to know what a 'data item' is.
>
> In a later test, one of the limits was a data item. The literal ran only slightly faster.

But it _was_ faster. You busted nothing yet again.

> PERFORM 10 TIMES was significantly slower than either.

So, you were wrong about that too.

Maybe, but your tests were incompetent and your times were meaningless.

You started out to "prove" that the advice was "bad" but failed. Just
more grandstanding Wagnerisms.


Judson McClendon

Sep 10, 2007, 1:08:14 PM9/10/07
to
"Robert" <n...@e.mail> wrote:
>
> A later test computed binary-number rounded = binary-number + .5,
> which ran slowly. The compiler wasn't clever enough to just add 1.

I wouldn't expect a compiler to be that clever. :-)

Robert

Sep 10, 2007, 11:17:42 PM9/10/07
to
On Sun, 09 Sep 2007 23:22:41 -0700, Richard <rip...@Azonic.co.nz> wrote:

>On Sep 10, 5:54 pm, Robert <n...@e.mail> wrote:

>> SET .. UP BY 1 should be the same speed as adding 1 to a binary number.
>
>Assertion rather than evidence. That's par for the course with you.

Do you think adding 5 is slower than adding 1?

>> >> Proposition: When incrementing or decrementing a counter, terminate it with a literal
>> >> value rather than a value held in a data item. For example, to execute a loop n times, set
>> >> the counter to n and then decrement the counter until it becomes zero, rather than
>> >> incrementing the counter from zero to n.
>>
>> >> Test:
>> >> perform varying binary-number from 10 by -1 until binary-number = 0 *> 150 us
>> >> perform varying binary-number from 1 by 1 until binary-number > 10 *> 154 us
>>
>> >> Finding: BUSTED
>>
>> >You have busted nothing. You don't seem to know what a 'data item' is.
>>
>> In a later test, one of the limits was a data item. The literal ran only slightly faster.
>
>But it _was_ faster. You busted nothing yet again.

The author was thinking of Intel, where almost every arithmetic instruction sets the zero
flag, giving a 'free' test for zero. Or he was thinking of a LOOP instruction. The speed
difference is so slight, it's not worth debating.

>> PERFORM 10 TIMES was significantly slower than either.
>
>So, you were wrong about that too.

I said nothing about that, but it gives an insight into optimization. It seems obvious
that Micro Focus studied frequency of use, then optimized frequently used features while
ignoring the others. That explains why subscripts are optimized more than indexes, because
they're used more often. That explains why VARYING is optimized more than TIMES, because
it's used more. Never mind that TIMES is technically the simplest to optimize.

>You started out to "prove" that the advice was "bad" but failed. Just
>more grandstanding Wagnerisms.

Readers can judge who offers facts and who offers ad homina.

Doug Miller

Sep 11, 2007, 7:36:40 AM9/11/07
to
In article <ds0ce3d60gk2m7apm...@4ax.com>, Robert <n...@e.mail> wrote:
>On Sun, 09 Sep 2007 23:22:41 -0700, Richard <rip...@Azonic.co.nz> wrote:
>
>>On Sep 10, 5:54 pm, Robert <n...@e.mail> wrote:
>
>>> SET .. UP BY 1 should be the same speed as adding 1 to a binary number.
>>
>>Assertion rather than evidence. That's par for the course with you.
>
>Do you think adding 5 is slower than adding 1?

Depending on the hardware and the compiler, it could be -- on IBM 360/370/etc
series, *subtracting* 5 from a register is certainly slower than subtracting 1
(e.g. S R8, =F'5' vs. BCTR R8, 0). I don't know if there are any platforms
with hardware instructions to *add* 1, but at least such can be imagined.

Of course, whether adding 5 is slower, or faster, than adding 1 is completely
irrelevant to the question of whether SET x UP BY 1 is slower, or faster, than
ADD 1 TO x.


>
>>> >> Proposition: When incrementing or decrementing a counter, terminate it with a literal
>>> >> value rather than a value held in a data item. For example, to execute a loop n times, set
>>> >> the counter to n and then decrement the counter until it becomes zero, rather than
>>> >> incrementing the counter from zero to n.
>>>
>>> >> Test:
>>> >> perform varying binary-number from 10 by -1 until binary-number = 0 *> 150 us
>>> >> perform varying binary-number from 1 by 1 until binary-number > 10 *> 154 us
>>>
>>> >> Finding: BUSTED
>>>
>>> >You have busted nothing. You don't seem to know what a 'data item' is.
>>>
>>> In a later test, one of the limits was a data item. The literal ran only slightly faster.
>>
>>But it _was_ faster. You busted nothing yet again.
>
>The author was thinking of Intel, where almost every arithmetic instruction sets the zero
>flag, giving a 'free' test for zero. Or he was thinking of a LOOP instruction. The speed
>difference is so slight, it's not worth debating.

And yet you labeled it as "BUSTED".


>
>>> PERFORM 10 TIMES was significantly slower than either.
>>
>>So, you were wrong about that too.
>
>I said nothing about that, but it gives an insight into optimization. It seems obvious
>that Micro Focus studied frequency of use, then optimized frequently used features while
>ignoring the others. That explains why subscripts are optimized more than indexes, because
>they're used more often. That explains why VARYING is optimized more than TIMES, because
>it's used more. Never mind that TIMES is technically the simplest to optimize.
>
>>You started out to "prove" that the advice was "bad" but failed. Just
>>more grandstanding Wagnerisms.
>
>Readers can judge who offers facts and who offers ad homina.

Facts incorrectly interpreted are less useful than an absence of facts. His ad
hominem is perhaps uncalled for, but his statement that you failed to prove
your points is dead on (as I've pointed out at length in another post).

--
Regards,
Doug Miller (alphageek at milmac dot com)

It's time to throw all their damned tea in the harbor again.

Robert

Sep 12, 2007, 1:41:46 AM9/12/07
to
On Tue, 11 Sep 2007 11:36:40 GMT, spam...@milmac.com (Doug Miller) wrote:

>In article <ds0ce3d60gk2m7apm...@4ax.com>, Robert <n...@e.mail> wrote:
>>On Sun, 09 Sep 2007 23:22:41 -0700, Richard <rip...@Azonic.co.nz> wrote:
>>
>>>On Sep 10, 5:54 pm, Robert <n...@e.mail> wrote:
>>
>>>> SET .. UP BY 1 should be the same speed as adding 1 to a binary number.
>>>
>>>Assertion rather than evidence. That's par for the course with you.
>>
>>Do you think adding 5 is slower than adding 1?
>
>Depending on the hardware and the compiler, it could be -- on IBM 360/370/etc
>series, *subtracting* 5 from a register is certainly slower than subtracting 1
>(e.g. S R8, =F'5' vs. BCTR R8, 0). I don't know if there are any platforms
>with hardware instructions to *add* 1, but at least such can be imagined.

Intel has an INC instruction to add 1.

>Of course, whether adding 5 is slower, or faster, than adding 1 is completely
>irrelevant to the question of whether SET x UP BY 1 is slower, or faster, than
>ADD 1 TO x.

Don't be facile. SET x UP BY 1 can only be an add of the reach to a binary number.

In the Very Old Good Old Days, IBM's mainframe compiler did a bounds check on SET x UP BY
1. Nobody here believes me.

>>>> >> Proposition: When incrementing or decrementing a counter, terminate it with a literal
>>>> >> value rather than a value held in a data item. For example, to execute a loop n times, set
>>>> >> the counter to n and then decrement the counter until it becomes zero, rather than
>>>> >> incrementing the counter from zero to n.

The recommendations are confusing two different things. One is terminating the loop with a
literal, the other is decrementing the counter to zero.

>>>> >> Test:
>>>> >> perform varying binary-number from 10 by -1 until binary-number = 0 *> 150 us
>>>> >> perform varying binary-number from 1 by 1 until binary-number > 10 *> 154 us
>>>>
>>>> >> Finding: BUSTED
>>>>
>>>> >You have busted nothing. You don't seem to know what a 'data item' is.
>>>>
>>>> In a later test, one of the limits was a data item. The literal ran only slightly faster.
>>>
>>>But it _was_ faster. You busted nothing yet again.
>>The author was thinking of Intel, where almost every arithmetic instruction sets the zero
>>flag, giving a 'free' test for zero. Or he was thinking of a LOOP instruction. The speed
>>difference is so slight, it's not worth debating.
>
>And yet you labeled it as "BUSTED".

I busted the proposition that either literal or zero is faster. The difference is smaller
than my margin of error.

>>>> PERFORM 10 TIMES was significantly slower than either.
>>>
>>>So, you were wrong about that too.
>>
>>I said nothing about that, but it gives an insight into optimization. It seems obvious
>>that Micro Focus studied frequency of use, then optimized frequently used features while
>>ignoring the others. That explains why subscripts are optimized more than indexes, because
>>they're used more often. That explains why VARYING is optimized more than TIMES, because
>>it's used more. Never mind that TIMES is technically the simplest to optimize.
>>
>>>You started out to "prove" that the advice was "bad" but failed. Just
>>>more grandstanding Wagnerisms.
>>
>>Readers can judge who offers facts and who offers ad homina.
>
>Facts incorrectly interpreted are less useful than an absence of facts. His ad
>hominem is perhaps uncalled for, but his statement that you failed to prove
>your points is dead on (as I've pointed out at length in another post).

CLC regulars are more interested in recreational pedantry than they are in writing good
Cobol.

Richard

Sep 12, 2007, 2:29:07 AM9/12/07
to
On Sep 12, 5:41 pm, Robert <n...@e.mail> wrote:

> On Tue, 11 Sep 2007 11:36:40 GMT, spamb...@milmac.com (Doug Miller) wrote:
> >In article <ds0ce3d60gk2m7apm3icuag2io65tht...@4ax.com>, Robert <n...@e.mail> wrote:
> >>On Sun, 09 Sep 2007 23:22:41 -0700, Richard <rip...@Azonic.co.nz> wrote:
>
> >>>On Sep 10, 5:54 pm, Robert <n...@e.mail> wrote:
>
> >>>> SET .. UP BY 1 should be the same speed as adding 1 to a binary number.
>
> >>>Assertion rather than evidence. That's par for the course with you.
>
> >>Do you think adding 5 is slower than adding 1?
>
> >Depending on the hardware and the compiler, it could be -- on IBM 360/370/etc
> >series, *subtracting* 5 from a register is certainly slower than subtracting 1
> >(e.g. S R8, =F'5' vs. BCTR R8, 0). I don't know if there are any platforms
> >with hardware instructions to *add* 1, but at least such can be imagined.
>
> Intel has an INC instruction to add 1.
>
> >Of course, whether adding 5 is slower, or faster, than adding 1 is completely
> >irrelevant to the question of whether SET x UP BY 1 is slower, or faster, than
> >ADD 1 TO x.
>
> Don't be facile. SET x UP BY 1 can only be an add of the reach to a binary number.
>
> In the Very Old Good Old Days, IBM's mainframe compiler did a bounds check on SET x UP BY
> 1. Nobody here believes me.

Geez, Robert. First you claim that SET UP BY 1 can only be an add and
then you claim that it also (in at least one implementation) included
a bound check.


> >>>> >> Proposition: When incrementing or decrementing a counter, terminate it with a literal
> >>>> >> value rather than a value held in a data item. For example, to execute a loop n times, set
> >>>> >> the counter to n and then decrement the counter until it becomes zero, rather than
> >>>> >> incrementing the counter from zero to n.
>
> The recommendations are confusing two different things. One is terminating the loop with a
> literal, the other is decrementing the counter to zero.

No, Robert. If it is necessary to process all items in a table one
could (if sequence is unimportant):

PERFORM VARYING X FROM 1 BY 1 UNTIL X > Y
use X here as subscript/index
END-PERFORM
or PERFORM VARYING X FROM Y BY -1 UNTIL X = ZERO
use X here as subscript/index
END-PERFORM

The first requires Y comparisons to a data item. The second does the
comparisons to a literal. The assertion was that the second would be
faster.
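
For anyone who wants to try the two forms side by side, here is a minimal sketch, assuming a compiler that accepts fixed-format source with comment lines marked in column 7 (GnuCOBOL does). The program name and the data names (table-limit, table-idx, visits-up, visits-down) are made up for illustration; they are not from the Micro Focus manual or from Robert's tests. It only shows that both loops visit every entry when the limit is held in a data item. It measures nothing.

```cobol
       identification division.
       program-id. looplimits.
       data division.
       working-storage section.
      *> Hypothetical names; the limit lives in a data item, as the
      *> Micro Focus advice assumes.
       01  table-limit  binary pic s9(09) sync value 10.
       01  table-idx    binary pic s9(09) sync.
       01  visits-up    binary pic s9(09) sync value 0.
       01  visits-down  binary pic s9(09) sync value 0.
       procedure division.
      *> Count up: every test compares the counter with a data item.
           perform varying table-idx from 1 by 1
                   until table-idx > table-limit
               add 1 to visits-up
           end-perform
      *> Count down: every test compares the counter with zero.
           perform varying table-idx from table-limit by -1
                   until table-idx = zero
               add 1 to visits-down
           end-perform
           display "visits up:   " visits-up
           display "visits down: " visits-down
           stop run.
```

Both counters end up equal to table-limit, so the two forms are interchangeable only when processing order does not matter. Any timing comparison would need each loop wrapped in whatever timing harness you trust.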

Sheesh, how hard is this ?

> >>>> >> Test:
> >>>> >> perform varying binary-number from 10 by -1 until binary-number = 0 *> 150 us
> >>>> >> perform varying binary-number from 1 by 1 until binary-number > 10 *> 154 us
>
> >>>> >> Finding: BUSTED
>
> >>>> >You have busted nothing. You don't seem to know what a 'data item' is.
>
> >>>> In a later test, one of the limits was a data item. The literal ran only slightly faster.
>
> >>>But it _was_ faster. You busted nothing yet again.
> >>The author was thinking of Intel, where almost every arithmetic instruction sets the zero
> >>flag, giving a 'free' test for zero. Or he was thinking of a LOOP instruction. The speed
> >>difference is so slight, it's not worth debating.
>
> >And yet you labeled it as "BUSTED".
>
> I busted the proposition that either literal or zero is faster. The difference is smaller
> than my margin of error.

Which wasn't a proposition made by the claim. What part of 'data item'
do you not understand ?


> >>>> PERFORM 10 TIMES was significantly slower than either.
>
> >>>So, you were wrong about that too.
>
> >>I said nothing about that, but it gives an insight into optimization. It seems obvious
> >>that Micro Focus studied frequency of use, then optimized frequently used features while
> >>ignoring the others. That explains why subscripts are optimized more than indexes, because
> >>they're used more often. That explains why VARYING is optimized more than TIMES, because
> >>it's used more. Never mind that TIMES is technically the simplest to optimize.
>
> >>>You started out to "prove" that the advice was "bad" but failed. Just
> >>>more grandstanding Wagnerisms.
>
> >>Readers can judge who offers facts and who offers ad homina.
>
> >Facts incorrectly interpreted are less useful than an absence of facts. His ad
> >hominem is perhaps uncalled for, but his statement that you failed to prove
> >your points is dead on (as I've pointed out at length in another post).
>
> CLC regulars are more interested in recreational pedantry than they are in writing good
> Cobol.

Ad hominem noted.

Well you certainly were doing "recreational pedantry" by attempting to
decry the advice on performance. The fact is that you failed to
understand the advice and/or failed to show that it was wrong.

Note that the MF advice was about improving performance not
necessarily making it 'good Cobol'. Regardless of that you haven't
attempted to make any 'good Cobol' in this thread.


Judson McClendon

Sep 12, 2007, 6:06:26 AM9/12/07
to
"Robert" <n...@e.mail> wrote:
>
> Intel has an INC instruction to add 1.

And DEC to subtract 1. At least in that, the x86 is symmetrical. :-)

Pete Dashwood

Sep 12, 2007, 6:55:04 AM9/12/07
to

"Robert" <n...@e.mail> wrote in message

news:attee3hhoje0r9r6e...@4ax.com...


> On Tue, 11 Sep 2007 11:36:40 GMT, spam...@milmac.com (Doug Miller) wrote:
>
>>In article <ds0ce3d60gk2m7apm...@4ax.com>, Robert <n...@e.mail> wrote:
>>>On Sun, 09 Sep 2007 23:22:41 -0700, Richard <rip...@Azonic.co.nz> wrote:
>>>
>>>>On Sep 10, 5:54 pm, Robert <n...@e.mail> wrote:
>>>

<snip>


>>
>>Facts incorrectly interpreted are less useful than an absence of facts. His ad
>>hominem is perhaps uncalled for, but his statement that you failed to prove
>>your points is dead on (as I've pointed out at length in another post).
>
> CLC regulars are more interested in recreational pedantry than they are in writing good
> Cobol.

You really must watch these sweeping statements, Robert, if you wish to
attain any credibility here :-)

The lack of qualification in your last sentence means it includes me (a
"regular" here), and I'm certainly not going to let you (or anyone) get away
with that :-)

Sometimes it certainly seems as you describe, and I have felt the same way.
But I would never suggest that everyone here is guilty. Even those who
sometimes are can, on other occasions, offer very useful information, and
there is never any doubt as to their ability. It shouldn't get personal
(even though it inevitably does...)

The fact is that there is value here.

Today you came up with an excellent solution to the "detecting changed
fields" problem, but it was further improved by points made about it by both
Bill and Richard. As Michael remarked recently: "No man is an island". We
are all better when we work together.

And most of the regulars (even me... who is no longer actively writing
COBOL) DO care about maintaining high standards of COBOL. In my case I base
it on the fact that if you're going to do anything, you should do it as well
as you possibly can.

I think you have been out there in the real world, working on sites where
you have seen a lot of bad COBOL. This has engendered a lack of respect for
COBOL people in general (you KNOW you can do better and you KNOW you have
deeper levels of knowledge than the average shop floor COBOL guy), but you
should not bring that attitude to this forum where there are people who are
NOT the average shop floor COBOL guy. (I am not meaning to demean "average
shop floor COBOL guys" here; we all have to learn...)

I therefore contend that your last statement above is wrong on two counts:

1. As written, it is a sweeping statement (while some regulars may sometimes
get sidetracked into pedantry, it is by no means the case for everybody, or
even for anybody, all of the time.) Sweeping statements suffer from the flaw
that if there is one single solitary case that does not comply, then the
whole proposition is rendered invalid. There are definitely regulars here
who are not, and have never been pedantic, so your case is lost on that
alone.

2. As long as you believe you have the monopoly on writing "good" COBOL and
only what YOU consider "good" IS good, then your statement is suffering from
an inadequate definition of what is "good" COBOL.

Leaving aside the rhetoric and logic, as a statement, it doesn't help much,
does it?

If you qualified it as an observation: "I note that (many/some/but not
all) CLC regulars..."
you wouldn't hear a peep out of me... :-) It is then a matter of opinion and
you are as much entitled to yours as anyone else here.

Pete.
--
"I used to write COBOL...now I can do anything."

