Cobol sort Algorithm?

Michael Urquiola

unread,

Sep 16, 1996, 3:00:00 AM9/16/96

to

Just more for curiousity than anything else, I was wondering if anyone
knows what the the lower level implimentation of the Cobol sort which is
built into the language. Just from reading the code it looks like alot of
I/O and therefore slow, but I'm sure there must be more to it.

The reason I ask is I'm new to programming Cobol, but have a good
knowledge of C, where I got use to writing my own sort routines, so I'm
curious as to what type of Algorithm is used. Any educated guesses
welcome.

later,

--
Mike Urquiola ,,,
(o o)
--o0O--(_)--O0o-- urqu...@ronin.tiac.net
Some see the world as it is and ask why. I see the world as it never was and ask why not? - Robert Kennedy

mor...@nbnet.nb.ca

unread,

Sep 17, 1996, 3:00:00 AM9/17/96

to

On IBM mainframes the SORT verb interfaces with the sorting utility chosen by the
installation: IBM's DFsort, SYNCSORT by the company of the same name, or CA sort
from computer associates. Each of these does various gyrations to determine the
best algorithm depending on available virtual and real memory, disk contention
multi-programming level, and if told to, the time of day. I would expect that in
most other non-PC environments similar statements could be made with changes
of vendor name and sort utility name. All products will try for in memory sort
if the input data will fit.

Clark F. Morris, Jr.
on assignment in St. John, New Brunswick, Canada - mor...@nbnet.nb.ca
506-657-9331
at home in Bridgetown, Nova Scotia, Canada - cmo...@fox.nstn.ca
902-665-4006

Glenn Grotzinger

unread,

Sep 17, 1996, 3:00:00 AM9/17/96

to

urqu...@ronin.tiac.net (Michael Urquiola) wrote:

>The reason I ask is I'm new to programming Cobol, but have a good
>knowledge of C, where I got use to writing my own sort routines, so I'm
>curious as to what type of Algorithm is used. Any educated guesses
>welcome.

I'm not entirely familiar with the sort method of all COBOL
compilers...but I'm familiar with what the standard IBM COBOL
mainframe compiler uses (probably what they all use, since all the
statements in COBOL and implementation are pretty well standardized).

What the standard IBM COBOL mainframe compiler uses is what I would
describe as a sliding merge sort....it sorts in groups by powers of
2...16,8,4,2......

Glenn A. Mitchell

unread,

Sep 17, 1996, 3:00:00 AM9/17/96

to Michael Urquiola

Michael Urquiola wrote:
>
> Just more for curiousity than anything else, I was wondering if anyone
> knows what the the lower level implimentation of the Cobol sort which is
> built into the language.

<SNIP>
Implementation of the SORT verb is an exercise left to the compiler
implementor. All of the compilers I've used (mostly mainframe and mini)
implement this by invoking the standard system sort. As for slow, my
experience is that it can be quite fast. Many COBOL programmers can
implement a simple bubble sort, but frequently not in an optimal
fashion. Recursive sort algorithms are pretty much out of the question
in COBOL (at least with older versions of the language). The system
sort products are typically highly optimized: sort in memory only if
possible, use special I/O techniques to take best advantage of the
hardware architecture, etc. Not to mention the maintenance impact of a
programmer-written sort routine.

I'd suggest you limit programmer sorts to simple bubble sorts or
insertion sorts of relatively small COBOL tables where sorting
performance isn't critical. For other sorts, rely on the SORT verb.
--
Glenn A. Mitchell mailto:mit...@poa.mmc.org
Director, Computer Technology http://www.mmc.org
Maine Medical Center
Portland, ME 04101

rschna...@internetmci.com

unread,

Sep 17, 1996, 3:00:00 AM9/17/96

to

On a IBM mainframe computer COBOL calls the system utility sort,
Syncsort, CA-sort, etc. I don't know what it does on a PC.
Good question!

---------------------------------------------------------------------

In <urquiola-160...@ronin.tiac.net , urqu...@ronin.tiac.net (Michael Urquiola) writes:

Just more for curiousity than anything else, I was wondering if anyone
knows what the the lower level implimentation of the Cobol sort which is

built into the language. Just from reading the code it looks like alot of
I/O and therefore slow, but I'm sure there must be more to it.

The reason I ask is I'm new to programming Cobol, but have a good

knowledge of C, where I got use to writing my own sort routines, so I'm
curious as to what type of Algorithm is used. Any educated guesses
welcome.

later,

Jeff Raben

unread,

Sep 17, 1996, 3:00:00 AM9/17/96

to

urqu...@ronin.tiac.net (Michael Urquiola) wrote:

> Just more for curiousity than anything else, I was wondering if anyone
>knows what the the lower level implimentation of the Cobol sort which is
>built into the language. Just from reading the code it looks like alot of
>I/O and therefore slow, but I'm sure there must be more to it.

IBM and others do NOT use the old merge-to-double-string-length.
Depending on media and capacity, both ram and intermediate storage,
there are half a dozen different string building and merge phase
iplementations around around.

Depending upon the operating environment the SORT is either a system,
dare I say, object or a cobol function which is passed file(s) and
procedures.

So don't bother writing one unless it's for fun.

Bubble sorts are easy to write but SLOW. Heap sorts can be
implemented in cobol but takes some work.

Go get any intro programming or Dykstra's (spelling?) book on sorting
etc (I think vol 2 of his series)

JR

and stir with a Runcible spoon...

UDS...@dsibm.okladot.state.ok.us

unread,

Sep 17, 1996, 3:00:00 AM9/17/96

to

Hi all,

Is anyone familiar with CORBA? I have seen this acronym quite a few time
Does this have anything to do with COBOL? Any help appreciated.

Thank you
Suman

Richard Plinston

unread,

Sep 17, 1996, 3:00:00 AM9/17/96

to

In message <<323EC9...@poa.mmc.org>> "Glenn A. Mitchell" <mit...@poa.mmc.org> writes:
> experience is that it can be quite fast. Many COBOL programmers can
> implement a simple bubble sort, but frequently not in an optimal
> fashion. Recursive sort algorithms are pretty much out of the question
> in COBOL (at least with older versions of the language). The system

Iteritive versions of recursive algorithms are viable in Cobol, and
are oftem faster than the recursive version. I have a quicksort in
Cobol that is in a copy file, just define the actual data and it
works.

I did this on a micro system many years ago and haven't changed the
system as it has grown to latest compilers.

> I'd suggest you limit programmer sorts to simple bubble sorts or
> insertion sorts of relatively small COBOL tables where sorting
> performance isn't critical. For other sorts, rely on the SORT verb.

I have changed SORTs into writing and reading an ISAM file in suitable
cases. The ISAM run-time is already loaded, the SORT runtime isn't.
When the memory space is tight the SORT slows down and can lock-up
(in my experience). Replacing it with an ISAM file has made the
system more reliable.

Arnold J. Trembley

unread,

Sep 18, 1996, 3:00:00 AM9/18/96

to

UDS...@DSIBM.OKLADOT.STATE.OK.US wrote:

>Hi all,

>Thank you
>Suman

I think it stands for Common Object Request Broker Architecture, or
something like that. I believe it's a standard for interlanguage
object-oriented programming supported by multiple vendors, including
IBM, but not supported by MicroSoft.

That's all I know about it, and it comes from reading PR's in the
trade papers. Your mileage may vary.

Doug & Rose Miller

unread,

Sep 18, 1996, 3:00:00 AM9/18/96

to

rip...@kcbbs.gen.nz (Richard Plinston) wrote:
+In message <<323EC9...@poa.mmc.org>> "Glenn A. Mitchell" <mit...@poa.mmc.org> writes:
+> experience is that it can be quite fast. Many COBOL programmers can
+> implement a simple bubble sort, but frequently not in an optimal
+> fashion. Recursive sort algorithms are pretty much out of the question
+> in COBOL (at least with older versions of the language). The system
+
+Iteritive versions of recursive algorithms are viable in Cobol, and
+are oftem faster than the recursive version. I have a quicksort in
+Cobol that is in a copy file, just define the actual data and it
+works.
+
+I did this on a micro system many years ago and haven't changed the
+system as it has grown to latest compilers.
+
+> I'd suggest you limit programmer sorts to simple bubble sorts or
+> insertion sorts of relatively small COBOL tables where sorting
+> performance isn't critical. For other sorts, rely on the SORT verb.
+
+I have changed SORTs into writing and reading an ISAM file in suitable
+cases. The ISAM run-time is already loaded, the SORT runtime isn't.
+When the memory space is tight the SORT slows down and can lock-up
+(in my experience). Replacing it with an ISAM file has made the
+system more reliable.
+
Yes, but what about your I/O overhead caused by repeated block splits
as records are inserted in the file? In my experience, this causes a performance
degradation on the order of a factor of four or worse, when compared to using
the system sort routines -- which makes the extra time required to load the
sort run-time pale by comparison.

Jeff Raben

unread,

Sep 18, 1996, 3:00:00 AM9/18/96

to

Optimal sort systems increase in run time at N.logN times. ISAM can
increase at N*N without index splits interval/block splits and minor
tree rebalancing etc. UNLESS the file is built in descending
order... but that's a hell of a way to reverse sort a data file.

Bubble uses the least amount of memory but POSSIBLE maximum CPU time
unless binary insertions are done on the fly.

There are many ways... has it been discussed more the sum of the
runtimes from now until the specs change?

Jeff Raben

unread,

Sep 18, 1996, 3:00:00 AM9/18/96

to

>>Go get any intro programming or Dykstra's (spelling?) book on sorting
>>etc (I think vol 2 of his series)
>>

>Don't you mean Knuth (see above)?

well I got the book number right.

btw. 3 books are out waiting for all 10.

Tim Oxler

unread,

Sep 18, 1996, 3:00:00 AM9/18/96

to

UDS...@DSIBM.OKLADOT.STATE.OK.US wrote:

>Hi all,

>Is anyone familiar with CORBA? I have seen this acronym quite a few time

CORBA is a distributed computing framework or specification that is
being proposed and/or backed by Netscape, IBM, Sun, Oracle and others.

The competing spec is DCOM which is being pushed by Microsoft.

Both are trying to become the standard that will allow developers to
write to comminication services rather than building connectivity
directly into the applications each time.

The CORBA group is pushing Internet Inter-ORB Protocol (IIOP) that
will replace static HTTP pages with live, object-based components as
the internet std. (ORB stands for Object Request Broker).

Distributed Component Object Model (DCOM) spec will supersede WinSock
on desktop clients.

>Does this have anything to do with COBOL?

No.

Tim Oxler
TEO Computer Technologies Inc.
tro...@i1.net
http://www.i1.net/~troxler
http://users.aol.com/TEOcorp

Chris Westbury

unread,

Sep 18, 1996, 3:00:00 AM9/18/96

to

In article <51mck6$g...@condor.ic.net>,

jra...@ic.net (Jeff Raben) wrote:
>
> Go get any intro programming or Dykstra's (spelling?) book on sorting
> etc (I think vol 2 of his series)

It's spelled "Donald E. Knuth's".

--
Christopher Westbury, Midtown Associates, 15 Fallon Place, Cambridge, MA 02138

Kent Bloom

unread,

Sep 18, 1996, 3:00:00 AM9/18/96

to

In <51mck6$g...@condor.ic.net>, jra...@ic.net (Jeff Raben) writes:
>urqu...@ronin.tiac.net (Michael Urquiola) wrote:
>
>> Just more for curiousity than anything else, I was wondering if anyone
>>knows what the the lower level implimentation of the Cobol sort which is
>>built into the language. Just from reading the code it looks like alot of
>>I/O and therefore slow, but I'm sure there must be more to it.
>

<snip>

>
>Bubble sorts are easy to write but SLOW. Heap sorts can be
>implemented in cobol but takes some work.
>

They're not just slow, they're abominable resource-wasters.
IMHO, they should not be allowed in any programming text. A
simple insertion sort is enormously faster, and not harder.
And there are a lot of really efficient sorts documented in
"The Art of Computer Programming, Volume 3: Sorting and
Searching", by Donald E. Knuth, Addison-Wesley, 1973. It's
on my small shelf of frequently-referred-to books. I've
translated from MIX to COBOL, FORTRAN, PL/I, C, and probably
others I've forgotten, and often seen savings up to 95% in
run-time and CPU usage. It always amazes me how someone
can use a bubble sort on a table with 10,000 entries.
Of course, with only 100 in the test run, problems don't
usually show up until it's in production.

>Go get any intro programming or Dykstra's (spelling?) book on sorting
>etc (I think vol 2 of his series)
>

Don't you mean Knuth (see above)?

>JR
>

Kent Bloom
Software Performance Engineer

Bill Curran

unread,

Sep 19, 1996, 3:00:00 AM9/19/96

to

UDS...@DSIBM.OKLADOT.STATE.OK.US wrote:
: Hi all,

: Is anyone familiar with CORBA? I have seen this acronym quite a few time

: Does this have anything to do with COBOL? Any help appreciated.

Common Object Request Broker Architecture or something like that. The
rest of the world is touting this this as opposed to OLE and activex
from Micro$oft. Cobol will be CORBA enabled, but CORBA is an architecture
for all languages that have OO capabilities. Go to www.software.ibm.com
and do a search on CORBA - there's lots to read.

--
Bill Curran Sysprog - MVS OS/2 xNIX WINx DOS
email : bcu...@stelco.e-mail.com Rexx on every one of them!
http://www.freenet.hamilton.on.ca/~aa131 All Standard Disclaimers Apply

Doug Miller

unread,

Sep 19, 1996, 3:00:00 AM9/19/96

to

jra...@ic.net (Jeff Raben) wrote:
>
>>>Go get any intro programming or Dykstra's (spelling?) book on sorting
>>>etc (I think vol 2 of his series)
>>>
>
>>Don't you mean Knuth (see above)?
>
>

>well I got the book number right.
>

No, you didn't. It's volume 3.

Doug Miller
dlmi...@inetdirect.net ('from' field rigged to foil e-mail spammers)
views expressed are mine and not those of
Hospital Health Plan Corp. "all health care is local"

lsv...@ibm.net

unread,

Sep 20, 1996, 3:00:00 AM9/20/96

to

D. Sorting Algorithms

Sorting is one of the fundamental devices in information
processing. It is also one of the most subtle and least
understood. The difference between a good and a mediocre
sorting algorithm is hard to see when dealing with small
amounts of data. However, with large amounts, the
difference is so massive that it can totally destroy the
effectiveness of a program.

To see why this difference is so important, look at the
way that sorting works. We compare the data items with each
other, and re-arrange them in order. Simply comparing two
values takes a certain amount of time. Moving data around
takes significantly longer. The best sorting algorithm does
as few compares and moves as possible.

In the worst algorithm (simple bubble sort), each data
item is compared to the remaining unsorted items, so that
the maximum number of comparisons needed for n items is n(n-
1). For ten items, this means around 100 comparisons. For
1000 items, 1,000,000 comparisons. At ten thousand items,
the system is no longer responding. The fastest sorting
methods approach kn(log2n) comparisons, where k is a small
constant depending on the method. If k is 2, this means 46
comparisons for 10 items, 14 thousand comparisons for 1000
items, and 28 million comparisons for 1 million items.

As well as reducing the number of comparisons, good
sorting algorithms try to reduce the number of moves.
Typically, the data part of an item is much larger than the
key part, so moving data around is significantly slower then
comparing. A useful way to reduce the costs of moving data
is to use indirect tables (see page (?)).

Note that the code required to implement a really
efficient quicksort is large and complex compared to bubble
sort. The sort algorithm you use depends on the amount and
type of data you may need to sort.

In all examples of code below, we assume that the data
being sorted is defined as follows:

01 TABLE-TO-SORT.
02 TABLE-SIZE PIC S9(5) COMP.
02 TABLE-MAX PIC S9(5) COMP VALUE +1600.
02 TABLE-ITEM OCCURS 1600 TIMES
03 TABLE-KEY PIC X(9).
03 TABLE-DATA PIC X(11).

The size of the key and data simply reflect the data used
in the test programs. The table size, 1600 items, makes the
table 32000 bytes large. COBOL does not allow tables
greater than 32767 bytes (except as a non-portable extension
on some systems). A useful device for working with larger
tables (memory permitting) is to split the table:

01 LARGE-TABLE-ITEM1.
02 TABLE-KEY PIC X(9) OCCURS 2900 TIMES.

01 LARGE-TABLE-ITEM2.
02 TABLE-DATA PIC X(11) OCCURS 2900 TIMES.

1. Bubble Sort

As the name suggests, bubble sort moves items up a table
like bubbles in a tube. The algorithm can be explained as
follows: pass over the data, comparing and exchanging items
so that the largest item ends up at the end of the table.
Repeat for the remaining items until the table is sorted,
that is, no exchanges were made during the latest pass.

The following code does a bubble sort:

01 VARIOUS-INDICES.
02 ITEM-NBR PIC S9(5) COMP.
02 SWAP-NBR PIC S9(5) COMP.
02 JUMP-SIZE PIC S9(5) COMP.
02 UPPER-LIMIT PIC S9(5) COMP.

01 VARIOUS-VALUES.
02 SWAP-ITEM PIC X(20).
02 SWAP-INDICATOR PIC X(1).
88 NO-MORE-SWAPS VALUE IS SPACE.

BUBBLE-SORT-THE-ARRAY.
MOVE "SWAP" TO SWAP-INDICATOR
MOVE 1 TO JUMP-SIZE
PERFORM BUBBLE-SORT
UNTIL NO-MORE-SWAPS
.

BUBBLE-SORT.
MOVE SPACE TO SWAP-INDICATOR
COMPUTE UPPER-LIMIT = TABLE-SIZE - JUMP-SIZE

PERFORM COMPARE-AND-SWAP-KEYS
VARYING ITEM-NBR FROM 1 BY 1
UNTIL ITEM-NBR > UPPER-LIMIT
.

COMPARE-AND-SWAP-KEYS.
COMPUTE SWAP-NBR = ITEM-NBR + JUMP-SIZE
IF TABLE-KEY (ITEM-NBR) > TABLE-KEY (SWAP-NBR)
MOVE TABLE-ITEM (ITEM-NBR) TO SWAP-ITEM
MOVE TABLE-ITEM (SWAP-NBR) TO TABLE-ITEM (ITEM-NBR)
MOVE SWAP-ITEM TO TABLE-ITEM (SWAP-NBR)
MOVE "SWAP" TO SWAP-INDICATOR
.

2. Heapsort

Heapsort uses an intermediate structure, a heap, which is
rather like a binary tree, with each son less than or equal
to its father. Once the data is partially ordered into a
heap, it can be sorted very quickly. The heap is built up
within the actual table being sorted, so does not need any
extra storage. The detailed workings of heapsort are quite
complex, though very efficient, and the reader is referred
to Wirth[10].

The advantage of heapsort over bubble sort is simply that
it is a lot faster for large amounts of data; 2n(log2n)
comparisons are needed on average. Additionally, heapsort
is quite compact compared to the quicksort algorithm
presented later:

01 VARIOUS-INDICES.
02 ITEM-NBR PIC S9(5) COMP.
02 COPY-PTR PIC S9(5) COMP.
02 LEFT-PTR PIC S9(5) COMP.
02 RIGHT-PTR PIC S9(5) COMP.
02 TO-PTR PIC S9(5) COMP.
02 FROM-PTR PIC S9(5) COMP.

01 VARIOUS-VALUES.
02 EXCHANGE-ITEM.
03 EXCHANGE-KEY PIC X(9).
03 EXCHANGE-DATA PIC X(11).

HEAP-SORT-THE-TABLE.
COMPUTE LEFT-PTR = TABLE-SIZE / 2 + 1
COMPUTE RIGHT-PTR = TABLE-SIZE

PERFORM SIFT-THE-HEAP-BY-LEFT
UNTIL LEFT-PTR < 2

PERFORM SIFT-THE-HEAP-BY-RIGHT
UNTIL RIGHT-PTR < 2
.

SIFT-THE-HEAP-BY-LEFT.
SUBTRACT 1 FROM LEFT-PTR
PERFORM SIFT-THE-HEAP
.

SIFT-THE-HEAP.
COMPUTE TO-PTR = LEFT-PTR
COMPUTE FROM-PTR = LEFT-PTR + LEFT-PTR
MOVE TABLE-ITEM (TO-PTR) TO EXCHANGE-ITEM
PERFORM BUILD-ITEMS-INTO-HEAP
UNTIL FROM-PTR > RIGHT-PTR

MOVE EXCHANGE-ITEM TO TABLE-ITEM (TO-PTR)
.

BUILD-ITEMS-INTO-HEAP.
IF FROM-PTR < RIGHT-PTR
COMPUTE COPY-PTR = FROM-PTR + 1
IF TABLE-KEY (FROM-PTR) < TABLE-KEY (COPY-PTR)
MOVE COPY-PTR TO FROM-PTR
.
IF EXCHANGE-KEY NOT < TABLE-KEY (FROM-PTR)
COMPUTE FROM-PTR = RIGHT-PTR + 1
ELSE
MOVE TABLE-ITEM (FROM-PTR) TO TABLE-ITEM (TO-PTR)
MOVE FROM-PTR TO TO-PTR
COMPUTE FROM-PTR = TO-PTR + TO-PTR
.

SIFT-THE-HEAP-BY-RIGHT.
MOVE TABLE-ITEM (1) TO EXCHANGE-ITEM
MOVE TABLE-ITEM (RIGHT-PTR) TO TABLE-ITEM (1)
MOVE EXCHANGE-ITEM TO TABLE-ITEM (RIGHT-PTR)
SUBTRACT 1 FROM RIGHT-PTR
PERFORM SIFT-THE-HEAP
.

3. Quicksort

Sorting is a problem which has been solved. The fastest
sort algorithm is quicksort. If you ever need to sort
significant amounts of data, use this method. There are
some notable aspects of quicksort that you should be aware
of, even if the detailed mechanism is not important:

+ Quicksort is a recursive procedure. This is easiest
to implement in a language such as Pascal or C, but
quite possible in COBOL, as we shall see.

+ Quicksort is unstable. This means that two records
with the same key may not end up in the same order
after sorting. Bubble sort and heapsort are, in
contrast, stable. You can add stability by
extending the key to include a sequence number, so
that there are in fact no duplicate keys.

+ Quicksort performs badly once the amounts of data
become small due to the overhead of recursion. This
can be avoided by using a secondary sort when the
partition size is less than some magic figure.

+ Quicksort performs badly with certain types of data;
this can be improved by judicious choice of a pivot
point at each sort, f. ex. the middle point.

The following highly-optimised implementation is a good
example of pseudo-recursive programming in COBOL. It uses
an insertion sort to handle small partitions:

01 SORT-HANDLING.
02 CUR-LOWER-LIMIT PIC S9(5) COMP.
02 CUR-UPPER-LIMIT PIC S9(5) COMP.
02 PARTITION-SIZE PIC S9(5) COMP.
02 MINIMUM-PARTITION PIC S9(5) COMP VALUE +13.
02 LOWER-PTR PIC S9(5) COMP.
02 MIDDLE-PTR PIC S9(5) COMP.
02 UPPER-PTR PIC S9(5) COMP.
02 PREV-PTR PIC S9(5) COMP.

02 STACK-TOP PIC S9(5) COMP VALUE +20.
02 STACK-PTR PIC S9(5) COMP.
02 STACK-NXT PIC S9(5) COMP.
02 QUICKSORT-STACK OCCURS 20.
03 LOWER-LIMIT PIC S9(5) COMP.
03 UPPER-LIMIT PIC S9(5) COMP.

02 PIV-ITEM.
03 PIV-KEY PIC X(9).
03 PIV-DATA PIC X(11).
02 COMP-ITEM.
03 COMP-KEY PIC X(9).
03 COMP-DATA PIC X(11).

02 EXCHG-ITEM PIC X(20).

02 COMPARE-RESULT PIC X.
88 COMPARE-GREATER-THAN VALUE "G".
88 COMPARE-LESS-THAN VALUE "L".

QUICKSORT-THE-TABLE.
MOVE 1 TO STACK-PTR, LOWER-LIMIT (STACK-PTR)
MOVE TABLE-SIZE TO UPPER-LIMIT (STACK-PTR)

PERFORM QUICKSORT-THE-ENTRIES
UNTIL STACK-PTR = ZERO

PERFORM INSERTION-SORT
VARYING UPPER-PTR FROM 2 BY 1
UNTIL UPPER-PTR > TABLE-SIZE
.

QUICKSORT-THE-ENTRIES.
MOVE LOWER-LIMIT (STACK-PTR) TO CUR-LOWER-LIMIT
MOVE UPPER-LIMIT (STACK-PTR) TO CUR-UPPER-LIMIT
IF CUR-LOWER-LIMIT < CUR-UPPER-LIMIT
MOVE CUR-LOWER-LIMIT TO LOWER-PTR
MOVE CUR-UPPER-LIMIT TO UPPER-PTR

COMPUTE PARTITION-SIZE = UPPER-PTR - LOWER-PTR + 1
IF PARTITION-SIZE > MINIMUM-PARTITION
MOVE CUR-LOWER-LIMIT TO LOWER-PTR
MOVE CUR-UPPER-LIMIT TO UPPER-PTR

PERFORM FIND-A-PIVOT-ENTRY
PERFORM PARTITION-ENTRIES
UNTIL UPPER-PTR < LOWER-PTR

PERFORM QUICKSORT-THE-PARTITIONS
PERFORM INCREASE-AND-CHECK-STACK-PTR
ELSE
SUBTRACT 1 FROM STACK-PTR
ELSE
SUBTRACT 1 FROM STACK-PTR
.

INCREASE-AND-CHECK-STACK-PTR.
IF STACK-PTR < STACK-TOP
ADD 1 TO STACK-PTR
ELSE
MOVE ZERO TO STACK-PTR
.

FIND-A-PIVOT-ENTRY.
COMPUTE MIDDLE-PTR = (LOWER-PTR + UPPER-PTR) / 2
MOVE TABLE-ITEM (MIDDLE-PTR) TO PIV-ITEM
.

PARTITION-ENTRIES.
MOVE "LT" TO COMPARE-RESULT
PERFORM INCREASE-LOWER
UNTIL COMPARE-GREATER-THAN

MOVE "GT" TO COMPARE-RESULT
PERFORM DECREASE-UPPER
UNTIL COMPARE-LESS-THAN

IF LOWER-PTR NOT > UPPER-PTR
PERFORM EXCHANGE-UPPER-AND-LOWER
ADD 1 TO LOWER-PTR
SUBTRACT 1 FROM UPPER-PTR
.

INCREASE-LOWER.
MOVE TABLE-ITEM (LOWER-PTR) TO COMP-ITEM
IF COMP-KEY < PIV-KEY
ADD 1 TO LOWER-PTR
ELSE
MOVE "GT" TO COMPARE-RESULT
.

DECREASE-UPPER.
MOVE TABLE-ITEM (UPPER-PTR) TO COMP-ITEM
IF COMP-KEY > PIV-KEY
SUBTRACT 1 FROM UPPER-PTR
ELSE
MOVE "LT" TO COMPARE-RESULT
.

EXCHANGE-UPPER-AND-LOWER.
MOVE TABLE-ITEM (LOWER-PTR) TO EXCHG-ITEM
MOVE TABLE-ITEM (UPPER-PTR) TO TABLE-ITEM (LOWER-PTR)
MOVE EXCHG-ITEM TO TABLE-ITEM (UPPER-PTR)
.

QUICKSORT-THE-PARTITIONS.
COMPUTE STACK-NXT = STACK-PTR + 1
IF UPPER-PTR - CUR-LOWER-LIMIT < CUR-UPPER-LIMIT - LOWER-PTR
MOVE CUR-LOWER-LIMIT TO LOWER-LIMIT (STACK-NXT)
MOVE UPPER-PTR TO UPPER-LIMIT (STACK-NXT)

MOVE LOWER-PTR TO LOWER-LIMIT (STACK-PTR)
MOVE CUR-UPPER-LIMIT TO UPPER-LIMIT (STACK-PTR)
ELSE
MOVE LOWER-PTR TO LOWER-LIMIT (STACK-NXT)
MOVE CUR-UPPER-LIMIT TO UPPER-LIMIT (STACK-NXT)

MOVE CUR-LOWER-LIMIT TO LOWER-LIMIT (STACK-PTR)
MOVE UPPER-PTR TO UPPER-LIMIT (STACK-PTR)
.

INSERTION-SORT.
MOVE "GT" TO COMPARE-RESULT
PERFORM INSERT-IN-PLACE
VARYING LOWER-PTR FROM UPPER-PTR BY -1
UNTIL NOT COMPARE-GREATER-THAN
.

INSERT-IN-PLACE.
COMPUTE PREV-PTR = LOWER-PTR - 1
IF PREV-PTR < 1
MOVE "LT" TO COMPARE-RESULT
ELSE
MOVE TABLE-ITEM (PREV-PTR) TO COMP-ITEM
MOVE TABLE-ITEM (LOWER-PTR) TO PIV-ITEM
IF COMP-KEY > PIV-KEY
MOVE PIV-ITEM TO TABLE-ITEM (PREV-PTR)
MOVE COMP-ITEM TO TABLE-ITEM (LOWER-PTR)
ELSE
MOVE "LT" TO COMPARE-RESULT
.

4. Combsort

Just as we thought that the last word had been said about
sorting, a breakthrough comes along and spoils everything.
In the April 1991 issue of BYTE magazine, Stephen Lacey and
Richard Box show that a simple modification to bubble sort
makes it a fast and efficient sort method on par with
heapsort and quicksort.

In a bubble sort, each item is compared to the next; if
the two are out of order, they are swapped. This method is
slow because it is susceptible to the appearance of what Box
and Lacey call turtles. A turtle is a relatively low value
located near the end of the table. During a bubble sort,
this element moves only one position for each pass, so a
single turtle can cause maximal slowing. Almost every long
table of items contains a turtle.

Their simple modification of bubble sort which they call
`combsort' eliminates turtles quickly by allowing the
distance between compared items to be greater than one.
This distance - the JUMP-SIZE - is initially set to the
TABLE-SIZE. Before each pass, the JUMP-SIZE is divided by
1.3 (the shrink factor). If this causes it to become less
than 1, it is simply set to 1, collapsing combsort into
bubble sort. An exchange of items moves items by JUMP-SIZE
positions rather than only one position, causing turtles to
jump rather than crawl. As with any sort method where the
displacement of an element can be larger than one position,
combsort is not stable - like elements do not keep their
relative positions. This is rarely a problem in practice.

Successively shrinking the JUMP-SIZE is analogous to
combing long, tangled hair - stroking first with your
fingers alone, then with a pick comb that has widely spaced
teeth, followed by finer combs with progressively closer
teeth - hence the name comb sort. Lacey and Box came up
with a shrink factor of 1.3 empirically by testing combsort
on over 200,000 random tables. There is at present no
theoretical justification for this particular value; it just
works...

Here is then the magic code (re-using declarations and a
code paragraph from bubble sort). It is clearly correct, as
it (unless the table is empty) ends with JUMP-SIZE = 1
(ensured by the `+3') and therefore degenerates into bubble
sort:

COMB-SORT-THE-ARRAY.
MOVE SPACE TO SWAP-INDICATOR
MOVE TABLE-SIZE TO JUMP-SIZE
PERFORM COMB-SORT
UNTIL NO-MORE-SWAPS
AND JUMP-SIZE NOT > 1
.

COMB-SORT.
COMPUTE JUMP-SIZE = (10 * JUMP-SIZE + 3) / 13
COMPUTE UPPER-LIMIT = TABLE-SIZE - JUMP-SIZE
MOVE SPACE TO SWAP-INDICATOR
PERFORM COMPARE-AND-SWAP-KEYS
VARYING ITEM-NBR FROM 1 BY 1
UNTIL ITEM-NBR > UPPER-LIMIT
.

The careful termination test (JUMP-SIZE NOT > 1) also
caters for the case where the table is empty.

5. Comparison of Table Sorting Methods

The above implementations of bubble sort, heapsort,
quicksort, and combsort were tested on from 200 to 1600
items, each consisting of a 9-character key and 11
characters of data. The times below are measured in
1/1000ths of a second and are the averages of sorting four
different random sequences. The test program was run on a
PC (16 MHz 386SX) and on an AS/400 (B35). After the tables
had been sorted, they were sorted again to measure the time
for sorting already sorted data.

a. Results for Bubble Sort

Items: Order: Time PC: Time AS/400:

200 random 1072 5490
sorted 9 36

400 random 4372 24126
sorted 16 62

800 random 17875 100725
sorted 28 113

1600 random 71390 401785
sorted 55 213

Bubble sort is very fast on already sorted data, but very
slow on random data. As the number of data elements grows,
bubble sort slows to a crawl and becomes useless.

b. Results for Heapsort

Items: Order: Time PC: Time AS/400:

200 random 82 342
sorted 68 362

400 random 151 770
sorted 165 813

800 random 330 1734
sorted 357 1829

1600 random 783 3857
sorted 811 4046

Heapsort is of medium complexity and does a very good job.
On already sorted data it slows down a trifle.

c. Results for Quicksort

Items: Order: Time PC: Time AS/400:

200 random 68 275
sorted 20 127

400 random 165 650
sorted 68 273

800 random 330 1410
sorted 151 613

1600 random 715 2974
sorted 316 1374

Quicksort is of high complexity and owes its performance
on sorted data to the built-in insertion-sort.

d. Results for Combsort

Items: Order: Time PC: Time AS/400:

200 random 110 453
sorted 55 326

400 random 192 1103
sorted 151 814

800 random 467 2621
sorted 316 1826

1600 random 1076 6299
sorted 715 4254

Combsort is just a single, easy line more than bubble
sort, but performs spectacularly; in fact, almost as well as
the grotesquely more complex quicksort. Because of its
magic simplicity, this is the method we recommend - except
for the most demanding applications where the 50 percent
improvement quicksort offers may be important.

6. Pseudo-sorting

Sometimes it is not necessary to do a full sort of the
array. The classical case is that of finding the median,
ie. that value for which half of the array items are greater
and the other half smaller.

The following very fast algorithm is due to C.A.R. Hoare
(CACM, Jan. 1971). The program finds the element of the
array whose value is fth in order or magnitude; and
rearranges the array such that this element is placed at
index f; and furthermore that all elements with subscripts
lower than f have lesser values, and all elements with
subscripts greater than f have greater values.

To find the median, set f = (1 + ARRAY-SIZE)/2; to find
the first quartile, set f = (1 + ARRAY-SIZE)/4, etc.:

FIND-THE-MEDIAN.
MOVE 1 TO LOWER-BOUND
MOVE ARRAY-SIZE TO UPPER-BOUND
COMPUTE MEDIAN-INDEX = (1 + ARRAY-SIZE) / 2

PERFORM REDUCE-MIDDLE-PART
UNTIL LOWER-BOUND NOT < UPPER-BOUND
.

REDUCE-MIDDLE-PART.
MOVE ARRAY-KEY (MEDIAN-INDEX) TO MEDIAN-KEY
MOVE LOWER-BOUND TO LOWER-INDEX
MOVE UPPER-BOUND TO UPPER-INDEX
PERFORM NARROW-THE-RANGE
UNTIL LOWER-INDEX > UPPER-INDEX

IF MEDIAN-INDEX NOT > UPPER-INDEX
MOVE UPPER-INDEX TO UPPER-BOUND
ELSE
IF MEDIAN-INDEX NOT < LOWER-INDEX
MOVE LOWER-INDEX TO LOWER-BOUND
ELSE
MOVE LOWER-BOUND TO UPPER-BOUND
.

NARROW-THE-RANGE.
PERFORM INCREASE-LOWER-INDEX
UNTIL ARRAY-KEY (LOWER-INDEX) NOT < MEDIAN-KEY

PERFORM DECREASE-UPPER-INDEX
UNTIL ARRAY-KEY (UPPER-INDEX) NOT > MEDIAN-KEY

IF LOWER-INDEX NOT > UPPER-INDEX
MOVE ARRAY-ITEM (LOWER-INDEX) TO SWAP-ITEM
MOVE ARRAY-ITEM (UPPER-INDEX) TO ARRAY-ITEM (LOWER-INDEX)
MOVE SWAP-ITEM TO ARRAY-ITEM (UPPER-INDEX)
PERFORM INCREASE-LOWER-INDEX
PERFORM DECREASE-UPPER-INDEX
.

INCREASE-LOWER-INDEX.
ADD 1 TO LOWER-INDEX
.

DECREASE-UPPER-INDEX.
SUBTRACT 1 FROM UPPER-INDEX
.

The algorithm uses the standard array declarations of this
section with the following additional data declarations:

01 FIND-HANDLING.
02 LOWER-BOUND PIC S9(5) COMP.
02 UPPER-BOUND PIC S9(5) COMP.
02 MEDIAN-INDEX PIC S9(5) COMP.
02 LOWER-INDEX PIC S9(5) COMP.
02 UPPER-INDEX PIC S9(5) COMP.

E. Pseudo-recursive Programming

The quicksort algorithm is a perfect example of an
algorithm which can best be expressed in a recursive manner.
This means that when one procedure calls another (or itself)
two things happen:

+ instruction pointers are stacked so that at the end
of the called procedure, execution can continue in
the calling procedure.

+ local data is also stacked, so that when the calling
procedure resumes, its local data is reinstated.

COBOL saves only one instruction pointer per paragraph,
and has no concept of local data. We adapt to the
limitations of COBOL by using a stack to store `requests'
and a loop which takes the next request and processes it. A
`recursive' call is in the form of new requests, which are
stacked. When the stack is empty, all recursive calls have
been processed. Any local data is also stacked.

Take an (unrealistic) Pascal procedure which does a
recursive call:

procedure doit (number: integer);
begin
if number > 1 then doit (number - 1)
end;

In COBOL, we would implement this as follows:

MOVE 1 TO STACK-PTR
MOVE NUMBER TO STACKED-NUMBER (STACK-PTR)
PERFORM DOIT
UNTIL STACK-PTR = ZERO
.

DOIT.
MOVE STACKED-NUMBER (STACK-PTR) TO NUMBER
SUBTRACT 1 FROM STACK-PTR
IF NUMBER > 1
SUBTRACT 1 FROM NUMBER
ADD 1 TO STACK-PTR
MOVE NUMBER TO STACKED-NUMBER (STACK-PTR)
.

A good way to understand this technique is to take a
recursive procedure such as a tree-printing algorithm, and
to implement it in COBOL. We show below two more realistic
examples. First the famous `Towers of Hanoi' problem and
then a program to print the call tree of a program (for
reasons of space not showN here - email me for more info,
Leif: le...@ibm.net)

1. Towers of Hanoi

In a monastery in Hanoi there are three diamond rods. God
placed 64 gold disks of different sizes - each with a
central hole- on rod number 1 such that the largest disk is
at the bottom of the stack and the smallest disk is on top.
The munks mission is to move all the disks onto rod number
2. Only one disk may be moved at a time and never may a
larger disk be placed on a smaller one. If needed, disks
may temporarily be placed on the third rod, again in such a
manner that no larger disk rests on a smaller one.

This classical problem can be solved by a recursive
procedure that first moves all disks except the last one
from the source rod to the spare rod, then moves the last -
and largest - disk from the source rod to its final position
on the destination rod, and finally moves all disks from the
spare rod to the destination rod. Moving N-1 disks is then
performed in the same manner recursively until all disks
have been moved:

BEGIN INTEGER M;

PROCEDURE MOVE (N, SOURCE, DEST, SPARE);
INTEGER N; STRING SOURCE, DEST, SPARE;
BEGIN
IF N > 1 THEN MOVE (N-1, SOURCE, SPARE, DEST);
WRITE (OUT, "MOVE DISK ", N,
" FROM ", SOURCE,
" TO " , DEST);
IF N > 1 THEN MOVE (N-1, SPARE, DEST, SOURCE)
END;

WRITE (OUT, "NBR OF DISKS = "); READ (IN, M);
IF M > 0 THEN MOVE (M, "SOURCE", "DEST", "SPARE")
END

The COBOL version of the program follows below. As we
wish to illustrate pseudo-recursion rather than
input/output, we exceptionally tolerate a non-portable
program that uses ACCEPT and DISPLAY. The program counter
is explicitly managed with the variable WHAT-TO-DO and
parameters (which are in a sense local to each invocation of
the procedure) are stacked along with the program counter.
The call of the procedure in the main program is simulated
by stacking the first set of parameters.

000100 IDENTIFICATION DIVISION.
000200 PROGRAM-ID. HANOI.
000300
000400 AUTHOR. LEIF SVALGAARD.
000500 DATE-WRITTEN. 91/04/22
000600 -REVISED: 91/05/01.
000700
000800 ENVIRONMENT DIVISION.
000900
001000 CONFIGURATION SECTION.
001100 SOURCE-COMPUTER. ALMOST-PORTABLE.
001200 OBJECT-COMPUTER. ALMOST-PORTABLE.
001300
001400 DATA DIVISION.
001500
001600 WORKING-STORAGE SECTION.
001700
002000 01 STACK-SPACE.
002100 02 STACK-PTR PIC S9(3) COMP.
002200 02 STACK-ITEM OCCURS 64.
002300 03 DISK-NBR PIC 9(2).
002400 03 SOURCE-ROD PIC X(6).
002500 03 DEST-ROD PIC X(6).
002600 03 SPARE-ROD PIC X(6).
002700 03 WHAT PIC 9(1).
002800
002900 01 LOCAL-VARIABLES.
003000 02 THE-DISK-NBR PIC 9(2).
003100 02 THE-SOURCE-ROD PIC X(6) VALUE "SOURCE".
003200 02 THE-DEST-ROD PIC X(6) VALUE "DEST".
003300 02 THE-SPARE-ROD PIC X(6) VALUE "SPARE".
003400 02 THE-WHAT PIC 9(1).
003500
003600 01 GLOBAL-VARIABLES.
003700 02 SWAP-ROD PIC X(6).
003800 02 WHAT-TO-DO PIC 9(1).
003900
004000 PROCEDURE DIVISION.
004100 BEGIN-PROGRAM.
004200 DISPLAY "NBR OF DISKS = " NO ADVANCING
004300 ACCEPT THE-DISK-NBR
004400
004500 IF THE-DISK-NBR > ZERO
004600 MOVE 1 TO STACK-PTR, WHAT-TO-DO
004700 MOVE LOCAL-VARIABLES TO STACK-ITEM (1)
004800 PERFORM MOVE-DISK
004900 UNTIL STACK-PTR = ZERO
005000 .
005100 STOP RUN
005200 .
005300
005400 MOVE-DISK.
005500 MOVE STACK-ITEM (STACK-PTR) TO LOCAL-VARIABLES
005600 IF WHAT-TO-DO = 1
005700 PERFORM MOVE-DISKS-AWAY
005800 ELSE
005900 IF WHAT-TO-DO = 2
006000 PERFORM SHOW-DISK-MOVED
006100 ELSE
006200 IF WHAT-TO-DO = 3
006300 PERFORM MOVE-DISKS-BACK
006400 ELSE
006500 MOVE WHAT (STACK-PTR) TO WHAT-TO-DO
006600 SUBTRACT 1 FROM STACK-PTR
006700 .
006800
006900 MOVE-DISKS-AWAY.
007000 MOVE THE-SPARE-ROD TO SWAP-ROD
007100 MOVE THE-DEST-ROD TO THE-SPARE-ROD
007200 MOVE SWAP-ROD TO THE-DEST-ROD
007300 PERFORM MOVE-THE-DISKS
007400 .
007500
007600 MOVE-THE-DISKS.
007700 ADD 1 TO WHAT-TO-DO
007800 IF THE-DISK-NBR > 1
007900 SUBTRACT 1 FROM THE-DISK-NBR
008000 MOVE WHAT-TO-DO TO THE-WHAT
008100 ADD 1 TO STACK-PTR
008200 MOVE LOCAL-VARIABLES TO STACK-ITEM (STACK-PTR) 008300 MOVE 1 TO WHAT-TO-DO
008400 .
008500
008600 SHOW-DISK-MOVED.
008700 ADD 1 TO WHAT-TO-DO
008800 DISPLAY "MOVE DISK " THE-DISK-NBR
008900 " FROM " THE-SOURCE-ROD
009000 " TO " THE-DEST-ROD
009100 .
009200
009300 MOVE-DISKS-BACK.
009400 MOVE THE-SPARE-ROD TO SWAP-ROD
009500 MOVE THE-SOURCE-ROD TO THE-SPARE-ROD
009600 MOVE SWAP-ROD TO THE-SOURCE-ROD
009700 PERFORM MOVE-THE-DISKS
009800 .

In a small program like this, the extra machinery needed
to handle the recursion looms large. In realistic programs
- like the Structure Analyser in the following example -
the extra code to implement recursion is negligible. If a
problem is inherently recursive - and many are - it pays to
think recursively and use the pseudo-recursive techniques
shown here. It is no excuse to say: COBOL does not support
recursion directly, therefore I cannot solve my problem.
The programming language used is very rarely the true
blocking factor.

It is often said that recursion is too expensive in terms
of overhead. This is only true for trivial, small programs
that use recursion where simple iteration should have been
used. In realistic programs, recursion is not in the inner
loop anyway and thus carries little or no run-time penalty.
It is also not true that recursion is difficult or esoteric
to use.

Jason

unread,

Sep 20, 1996, 3:00:00 AM9/20/96

to

Corba uses object request brokers therefore enabling GUI client/server
based applications to communicate with legacy systems (normally written
in PL1 or COBOL) via an API.

Jeff Raben

unread,

Sep 21, 1996, 3:00:00 AM9/21/96

to

spam.fr...@no-spam.com (Doug Miller) wrote:

OK so i was never good at taking ordinals.........

jr

Jeff Raben

unread,

Sep 21, 1996, 3:00:00 AM9/21/96

to

Reposting article removed by rogue canceller.

StevenLN

unread,

Sep 23, 1996, 3:00:00 AM9/23/96

to

In general, this is an interesting discussion, but...
In the "get the payroll checks out on time" mode (standard business
programming), invoke the SORT verb and let the routine written by the
wonks do the work. You have (or should have) more critical and immediate
problems to solve. In the mainframe world, usually, in my experience (10+
years), the best answer is to have two modules with a system sort between.
Not only does this provide good performance (usually), but, it provides
an intermediate check point, and, some degree of modularization on the
system.
I have seen a few very specific instances where an internal sort was
preferable to a system sort, but can not recall any at the moment. As
always, the criteria is (or should be) what will be most maintainable.
And, it is usually easier, quicker, and less error prone to change sort
control cards then recompile a program.

Steven L. Newton
Milwaukee, WI
Asimov, Heinlein, and Zappa are still the best

Jeff Raben

unread,

Sep 23, 1996, 3:00:00 AM9/23/96

to

stev...@aol.com (StevenLN) wrote:

Interesting point. Memory management...(resource management)
a standalone sort will have more memory available for string building
and merging and MAY, because of that, sort faster.

However, ignoring that, the sort-in/sort-out each would have to be
passed an additional time.
the standalone feeder output file re-read by the sort
the sort output file re-read by the standalone 3rd program.
all the efficiencies depend upon the memory available, the channel
capacity, the pre-order of the raw data.

OR BE 4th / 5th generation and just build the best request you can and
let the system implementer take the blame. disks and mips are
getting cheaper all the time.
(JOKE and very unprofessional... )

JR

William Clerico

unread,

Sep 25, 1996, 3:00:00 AM9/25/96

to

OMG's (Object Management Group) CORBA (Common Object Request Broker
Architecture) is "one" of the industry standards for describing object
interaction. OMG is a consortium of vendors and users (last time I
looked there were over 300 participants) that are collaborating on
standards for Object Technology.

IBM's SOM (System Object Model) is an implementation of CORBA. CORBA
used to define the interfaces of language independent classes. This
means that classes defined in one CORBA compliant language can be used
extended, inherited, etc. by classes defined in a different CORBA
compliant language.

My experience is with SOM and the IBM products. The latest COBOL
compilers from IBM are CORBA compliant (via SOM) and feature a
"direct-to-SOM" compiler. Direct-to-SOM allows you to define your
classes and their interfaces using the native language and NOT CORBA's
IDL (Interface Definition Language). Using IBM's VisualAge for COBOL,
can create and object based COBOL application on the workstation and
rehost it to, say, MVS. Recompile and I'm back in business! Pretty
cool, huh?

The other VisualAge products (C++, SmallTalk, Basic, and (soon) JAVA)
are also SOM based and CORBA compliant.

More info? See: http://www.software.ibm.com/ad/cobol/cobol.htm.

--
William J. Clerico, Director of COBOL Services
QED, Inc. 172 Madison Ave, Suite 403, NY NY 10016
(212) 481-6868

LawrenceC9

unread,

Oct 3, 1996, 3:00:00 AM10/3/96

to

I find that there are many uses for internal sorts in mainframe code. I
use a small
set of copybooks to tailor the code for each particular problem. Sorts are
done by pointer only, no data is moved around by the sort routines
themselves.
For example, CICS screens often display records in a different order
than they are stored on the file(s). They might be listed by some kind of
classification number, or whatever. Or records from one file might be
displayed, while the updates are applied (in some other order) to another
file. If ENQUEUES are used, then updates MUST be ordered to avoid deadly
embraces.
However, these are small "side sorts", they are not the typical "order
the records for a report" or "order the invoices by amount-outstanding"
kind of thing.