I've done quite a bit of research on how a COMP field is stored and quite
frankly, I'm a bit stumped and I'm hoping that someone can help. I realize
that the storage structure is compiler-specific and unfortunately I don't
know what the compiler is. From what I've read it may be irrelevant since
all my COMP fields are signed.
What I'm trying to do is parse, on a Windows platform, a file that is
output from a UNISYS system. In my examinations of the file I've had no
problem identifying the fields that are not marked with USAGE COMP. Like
most other people, my problem is with reading these COMP fields.
At first I believed these fields were stored as packed decimal. I've
calculated the number of bytes present in the file and the number of bytes
that I think the fields should take, and I'm off by quite a bit. Below is a
sample of the structure that I'm reading as well as a sample record. By my
calculations there should be 64 bytes of data. The data available only
has 46.
Here is the record structure that I'm trying to read and the number of
bytes that I think the data should take.
05  SOME-FIELDS.                             MY CALCULATED FIELD SIZE
    07 FIELD-1   PIC S9(05)V99  USAGE COMP.             4
    07 FIELD-2   PIC S9(05)V99  USAGE COMP.             4
    07 FIELD-3   PIC S9(05)V99  USAGE COMP.             4
    07 FIELD-4   PIC S9(03)V99  USAGE COMP.             3
    07 FIELD-5   PIC S9(02)V99  USAGE COMP.             3
    07 FIELD-6   PIC S9(02)V99  USAGE COMP.             3
    07 FIELD-7   PIC SV999      USAGE COMP.             2
    07 FIELD-8   PIC S9(05)V99  USAGE COMP.             4
    07 FIELD-9   PIC S99V99     USAGE COMP.             3
    07 FIELD-10  PIC S9(05)V99  USAGE COMP.             4
    07 FIELD-11  PIC S9(03)V99  USAGE COMP.             3
    07 FIELD-12  PIC S9(05)V99  USAGE COMP.             4
    07 FIELD-13  PIC S9(03)V99  USAGE COMP.             3
    07 FIELD-14  PIC S9(05)V99  USAGE COMP.             4
    07 FIELD-15  PIC S9(02)V99  USAGE COMP.             3
    07 FIELD-16  PIC S9(03)V99  USAGE COMP.             3
    07 FIELD-17  PIC S9(03)V99  USAGE COMP.             3
    07 FIELD-18  PIC S9(03)V99  USAGE COMP.             3
    07 FIELD-19  PIC S9(03)V99  USAGE COMP.             3
    07 FIELD-20  PIC X(01).                             1
                                             TOTAL:    64 bytes by my calculations
Here is a sample byte stream in Hex. Notice there are 46 bytes
00 00 00 00 00 00 00 00 00 00 00 0B 70 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 01 E8 00 00 00 00 20
Anyone have any ideas why my calculations are off? If my fields are not
stored in packed decimal then what format could they be stored in?
Thanks for your help in advance,
Shawn
How and when was the "byte stream" displayed? Is this the byte stream as it
existed on the Unisys computer, or is it the byte stream as it showed up on
your Windows machine (and, perhaps, after it had been translated from EBCDIC
to ASCII, as the last byte might be taken to indicate)?
-Chuck Stevens
"Tom Levesque" <tlev...@hotmail.com> wrote in message
news:WZr2b.8714$qA3.7...@ursa-nb00s0.nbnet.nb.ca...
> Anyone have any ideas why my calculations are off? If my fields are not
> stored in packed decimal then what format could they be stored in?
They could be stored in whatever format the authors of the particular
compiler used to compile the program that produced the file decided was
appropriate for the particular machine architecture on which programs
compiled with that compiler were expected to run.
That's why the first question that needs to be answered is "Which of the
many Unisys systems that have COBOL compilers available for them wrote this
file?".
For the same reason, the second question that needs to be answered is "If
there is or was more than one COBOL compiler available for this particular
Unisys system at any point in its history, which version (and which standard
dialect) of COBOL was used to compile the program that created it?".
-Chuck Stevens
Assuming 2200 because I don't know the other one, COMP is Binary with
the ACOB compiler and can be set to binary with the UCOB compiler if
called with one of several COMPAT options (COMPAT = ACOB compatible).
The two are functionally identical and this is obviously what you
have.
The 2200 has 9-bit bytes and if you download a file using FTP in
Ascii, you will lose the most significant bit in every 9-bit byte.
Downloading in binary would be almost as bad because you would then
have to understand 2200 file-formats which do not map directly into
the windows world.
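A minimal Python sketch of what that loss looks like (an illustration only, assuming that ASCII-mode FTP keeps just the low 8 bits of each 9-bit OS2200 byte, as described above):

```python
def strip_ninth_bit(nine_bit_bytes):
    """Model a lossy ASCII-mode transfer of 9-bit OS2200 bytes:
    only the low 8 bits of each byte survive."""
    return [b & 0xFF for b in nine_bit_bytes]

# A 9-bit byte holding 0x147 (327) arrives as 0x47 (71): the top bit,
# which may be data or the sign, is gone for good.
damaged = strip_ninth_bit([0x147, 0x1B])
```

Once the bit is stripped there is no way to tell 0x147 and 0x047 apart on the Windows side, which is why the transfer is unrecoverable after the fact.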
You have 2 possibilities that make sense:
1 - get the programmer to produce the files in Ascii, with leading
spaces or leading zeros, just agree on a format
2 - get someone to write a Cobol program to convert the beast.
Both are trivial exercises for someone who knows Cobol and has the
file description. Examples were posted here earlier but this is so
trivial for a Cobol programmer, who needs examples?
What it comes down to is: you need a 2200 if you want to do this. The
other alternative would be to send someone the file-description in
Ascii and the file in Binary and get them to convert it.
If your Unisys is A-Series/NX, forget all this. They have 8-bit bytes
and emulation on PC's is the way they are going (I believe).
btw, the E-Mail address in the header is a red herring
"Tom Levesque" <tlev...@hotmail.com> wrote in message news:<WZr2b.8714$qA3.7...@ursa-nb00s0.nbnet.nb.ca>...
> Hello All,
>
snip
> If your Unisys is A-Series/NX, forget all this. They have 8-bit bytes
> and emulation on PC's is the way they are going (I believe).
Smaller MCP/AS systems are indeed emulated, and even the larger ones have a
"Windows side" to them; the same data is not necessarily immediately
transparent from both sides of the machine, so the fact that the MCP/AS
system has 8-BIT bytes, stores DISPLAY data as EBCDIC, and treats COMP data
as being packed-decimal with a leading 4-bit sign is immaterial as to how
that same data would be viewed when retrieved *directly* from the MCP/AS
media by Windows. That some of these systems are emulated (and some are
not) doesn't relate to how the two "sides" of the machine view or access the
data. The presence or absence of emulation is a "red herring" in this
discussion.
The length (46 bytes) of the group item SOME-FIELDS doesn't match any A
Series COBOL format that I'm aware of.
Were this fragment compiled by the now-no-longer-supported A Series
COBOL(68) compiler, I would expect the length of SOME-FIELDS to reflect
nineteen 48-bit words for the numeric fields plus one byte for FIELD-20,
for a total of 115 bytes.
For A Series COBOL74 *and* COBOL85, I see the length as 122 digits for the
numeric fields by default (nineteen allocated for 4-bit leading signs and
the remainder for digits) plus one byte for FIELD-20, for a total of 62
bytes.
If the MCP/AS COBOL74/COBOL85 programmer had included $SET BINARYCOMP for
compatibility with COBOL(68), I'd expect the COBOL(68)-format 115 bytes.
Thus, it doesn't strike me as likely that this 46-byte group "came from" a
COBOL program running on a Unisys A Series (or its ancestors or its
descendants) system.
-Chuck Stevens
# digits    # bytes
  1-2          1
  3-5          2
  6-7          3
  8-10         4
  11-13        5
  14-15        6
  16-18        7
This could probably account for the byte count that you are receiving. The
sign would be stored as the most significant bit of the byte string. Most of
the fields that you have would require two or three bytes.
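The table above can be turned into a small sizing function. As a sketch (not UCOB's authoritative allocation rule), applying it to the digit counts of the record layout posted at the top of the thread does indeed land on the observed 46 bytes:

```python
def comp_bytes(ndigits):
    """Bytes occupied by a signed COMP field of ndigits digits,
    per the digits-to-bytes table above (9-bit bytes, sign in the MSB)."""
    for max_digits, nbytes in [(2, 1), (5, 2), (7, 3), (10, 4),
                               (13, 5), (15, 6), (18, 7)]:
        if ndigits <= max_digits:
            return nbytes
    raise ValueError("more than 18 digits")

# Digit counts of FIELD-1 .. FIELD-19 from the posted layout
# (S9(05)V99 = 7 digits, S9(03)V99 = 5, S9(02)V99 = 4, SV999 = 3, S99V99 = 4).
digit_counts = [7, 7, 7, 5, 4, 4, 3, 7, 4, 7, 5, 7, 5, 7, 4, 5, 5, 5, 5]
total = sum(comp_bytes(d) for d in digit_counts) + 1  # +1 for FIELD-20 PIC X(01)
```

That the sum comes to exactly 46 supports the theory that these are binary (ACOB-style) COMP fields, not packed decimal.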
Now comes the fun stuff. Since the OS2200 systems have a nine bit byte, if
you receive the file via FTP, the most significant bit of each byte will be
stripped unless you receive the file via a binary transfer (in which case,
the file would appear to be garbage to you). Since the most significant bit
is stripped, you would lose the sign bit and data bits.
We developed a simple set of rules when we got our OS2200 system over 20
years ago. Any data leaving the OS2200 environment has only "USAGE DISPLAY"
and "SIGN LEADING SEPARATE" if the field is signed. Make whoever is giving
you this file follow those rules and you will have no problems. It has
worked well for us.
The UCOB compiler on OS2200 systems normally stores "USAGE COMP" the same as
"USAGE DISPLAY" but this can be changed with a compatibility option to be
the same as ACOB.
George Ewins
On 8/25/03 1:48 PM, in article
WZr2b.8714$qA3.7...@ursa-nb00s0.nbnet.nb.ca, "Tom Levesque"
>64 bytes by my calculations
I count 60 bytes (BICBW).
In any case, the only way you are going to successfully read this data is
for it to be written in DISPLAY format before it passes through any
interface which corrupts non-DISPLAY data.
--
RB |\ © Randall Bart
aa |/ ad...@RandallBart.spam.com Bart...@att.spam.net
nr |\ Please reply without spam I LOVE YOU 1-917-715-0831
dt ||\ http://RandallBart.com/ Ånåheim Ångels 2002 World Chåmps!
a |/ Multiple sclerosis: http://www.cbc.ca/webone/alison/
l |\ DOT-HS-808-065 The Church Of The Unauthorized Truth:
l |/ MS^7=6/28/107 http://yg.cotut.com mailto:s...@cotut.com
Firstly, it is now clear that the data is not A-Series so the chances
of it *not* being ACOB COMP are virtually zero. I agree with the rest
of your post but with one caveat:
- In binary (= ACOB's COMP), negative numbers are represented by
negating *all* bits - not by setting a 'sign bit'. This is actually
very unusual - most other architectures use 'negate all bits and add
1' instead. They don't have the concept of 'negative zero', we have
to live with it.
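The negate-all-bits rule is easy to express in code. A hedged Python sketch, assuming the raw value arrives right-aligned in an integer of known bit width:

```python
def from_ones_complement(raw, nbits):
    """Interpret an nbits-wide ones' complement value.
    Negative numbers have the top bit set; negate ALL bits to get
    the magnitude -- no 'add 1' step as in two's complement."""
    mask = (1 << nbits) - 1
    if raw >> (nbits - 1):           # sign bit set: negative
        return -((~raw) & mask)
    return raw

# All bits set is 'negative zero', a value two's complement machines lack.
```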
I had always thought that UCOB COMP was packed-decimal unless modified
by one of the COMPAT option-variations (that is rather irrelevant here
because the data are clearly not packed-decimal) but you appear to be
right - it is 'display'.
George Ewins <geo...@ewins.org> wrote in message news:<BB700970.20B7%geo...@ewins.org>...
All this leaves me with one more question. I just want to verify that i
understand what's going on...
Assuming that i have the field
07 FIELD-2 PIC S9(05)V99
This would take 3 bytes of data if stored as a binary integer. If my data
was
00 1B 47 then my resulting value should be 69.83
Is this correct???? If not what's the value and why?
Thanks for the information!!!!
Shawn
"andrew williams" <andrew....@t-online.de> wrote in message
news:995f2314.03082...@posting.google.com...
Thanks a lot for your help!
Shawn
"Shawn Hogan" <sho...@NOSPAMwhitehilltech.com> wrote in message
news:BzJ2b.85$Ej.1...@ursa-nb00s0.nbnet.nb.ca...
Thanks for clearing this stuff up for me. You've been a great help. Judging
from your post, it seems you have experience moving files from the 9-bit to
the 8-bit world. I believe that the problem that I'm seeing is exactly as
you've described. The most significant bit of each byte is missing.
Does anyone know of any way to "prepare" the data for a transfer from the
9-bit to the 8-bit world? It seems to me that if you knew the structure of
the records, you could write a program, on the mainframe side, to move the
most significant bit from one byte to the next(sometimes creating a new
byte) and not lose any data. Is there any other way?
Shawn
"George Ewins" <geo...@ewins.org> wrote in message
news:BB700970.20B7%geo...@ewins.org...
As some people have posted here before, it is trivial if you can get someone
to write a program for you on the mainframe. Just write the data in Usage
is Display instead of Comp. Then you get the data in ascii characters with
no problem.
--
- Stephen Fuld
e-mail address disguised to prevent spam
As others have posted before as well, SIGN [LEADING/TRAILING] SEPARATE is a
good idea too. Otherwise you've got to worry about the encoding of the
character that contains the sign.
-Chuck Stevens
If you are dealing with an existing file, just write a conversion program
for the mainframe that uses the same name for the input and output record
descriptions. The input record description has the USAGE COMP clause and the
output record description omits the USAGE COMP clause and adds the SIGN
LEADING SEPARATE clause. You then just do a READ, MOVE CORRESPONDING and a
WRITE. It is a lot easier than the bit shifting necessary to translate the
data into 8-bit binary form.
In the COBOL world you do not send COMP data to a foreign system because the
format of the data is dependent upon the compiler's implementation. For the
same reason, you use the SIGN LEADING SEPARATE clause to eliminate ambiguity
with the sign embedded in the least significant byte as an "over punch". I
can specify the ASCII characters that are used to represent the various
digits with positive or negative signs but what happens when you send that
to a system that translates the data to EBCDIC?
On 8/26/03 1:06 PM, in article isM2b.177$Ej.2...@ursa-nb00s0.nbnet.nb.ca,
Not quite. According to the standard, in the ABSENCE of a SIGN clause on a
DISPLAY item that has an "S" in the PICTURE clause, both the position and
the representation of the sign are up to the implementor. And specifying
SIGN LEADING [or TRAILING] only states which character position has some
sort of sign representation associated with it, and that that representation
isn't counted in the size of the item in terms of standard characters.
I grant that probably most implementors have followed the convention
(introduced I think by IBM System/360 COBOL for DOS, probably in the
early/mid 1960's) in having the sign be in the zone position of the trailing
character, but such a convention is by no means universal, required by the
standard, or applicable to all character encodings.
When SIGN LEADING [/TRAILING] SEPARATE applies to an item, the standard
REQUIRES that the sign be represented by the characters "+" for positive and
"-" for negative in the leading[/trailing] character position.
Bottom line: It's not just the translation of an overpunched low-order
character that's an ambiguity issue when SIGN isn't specified on a signed
item; standard COBOL offers no guarantee that the default sign position is
in the low-order (or any specific) character position of the item, that it's
in the "zone" bits of that character position, or even whether the character
encoding used for DISPLAY includes the ability to "overpunch" the sign into
the character position in the first place. The ONLY way to ensure
absolutely unambiguous representation of such an item is with explicit and
complete specification of a leading or trailing sign as a separate
character; that requires only the characters "+", "-", and "0" through "9"
to represent an unedited signed numeric datum clearly and unambiguously.
-Chuck Stevens
I have not really investigated this, but believe that you should
consider a PIC S9(5)V99 COMP field to be PIC S9(7) COMP and divide the
result by 100 when you want to use it.
Also, do not forget that the 2200 has 9-bit bytes. I set the leftmost
bit of each byte to zero for the test, which is what FTP in Ascii-Mode
would do.
That is also why FTP is useless for your purposes - it drops bits.
"Shawn Hogan" <sho...@NOSPAMwhitehilltech.com> wrote in message news:<BzJ2b.85$Ej.1...@ursa-nb00s0.nbnet.nb.ca>...
--
---------
MVH/Regards
Leif johansson
- Experience is a wonderful thing.
It enables you to recognize a mistake when you make it again -
"Tom Levesque" <tlev...@hotmail.com> skrev i meddelandet
news:WZr2b.8714$qA3.7...@ursa-nb00s0.nbnet.nb.ca...
But also that the ninth bit up is indeterminate after the conversion.
Playing around with the Scientific view in Windows Calculator, it works out
in detail as follows:
If you are seeing hex 00 1B 47, you have hex 47 in the bottom 8 bits
(decimal 71), then the next bit up is invisible because it got chopped out
during the FTP, so you might or might not have 256 added in; then finally
the 1B (decimal 27) in the next byte up needs to be multiplied by 512, not
by 256, because of that missing bit. And finally, there could be another 1
bit in the top of the second byte; that would add a further 131072 to the
value.
So if there's only the missing bit from the rightmost byte to consider, it's
(27*512)+71 = 13895 (Andrew's right there as well), possibly plus 256 to
make 14151.
Seen as S9(5)V99, that is either 138.95 or 141.51. Either is equally
possible, because of the lost bit.
Then, to make it even more complex, you may also have to add 1310.72 to
either of those values. :-(
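Colin's arithmetic generalizes: each transferred byte has one unknown high bit, so n significant bytes yield up to 2^n candidate values. A small Python sketch of the enumeration (assuming each original byte was 9 bits wide and the low-order byte arrives last, as in the hex dump):

```python
from itertools import product

def candidate_values(stripped_bytes):
    """All original values consistent with 8-bit bytes whose 9th
    (high) bits were lost in an ASCII-mode transfer."""
    out = set()
    for high_bits in product((0, 1), repeat=len(stripped_bytes)):
        value = 0
        for byte, hi in zip(stripped_bytes, high_bits):
            value = value * 512 + (byte | (hi << 8))  # 9-bit positions
        out.add(value)
    return sorted(out)

# For the two significant bytes 1B 47 discussed above, the candidates are
# 13895, 14151, 144967 and 145223, i.e. 138.95, 141.51, 1449.67 or 1452.23
# once the implied V99 scaling (divide by 100) is applied.
```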
As Andrew says, it is literally impossible to resolve without fixing it
before the convert.
Cheers
Colin
"andrew williams" <andrew....@t-online.de> wrote in message
news:995f2314.03082...@posting.google.com...
This might be a stupid question, but is there any way for me to move the file
from OS2200 to Windows without losing the 9th bit? My restriction for doing
this is that I cannot change the file structure on the mainframe side. I can
do anything I want to the file once it's on Windows. If I got the file to
Windows with the 9th bit intact I could potentially do some bit manipulation
to fix things.
Thanks for the input,
Shawn
"andrew williams" <andrew....@t-online.de> wrote in message
news:995f2314.03082...@posting.google.com...
Yes, you can FTP in binary, but I don't think you will like the results.
(see below)
> My restriction for doing this is that I cannot change the file structure
> on the mainframe side. I can do anything I want to the file once it's on
> Windows. If I got the file to Windows with the 9th bit intact I could
> potentially do some bit manipulation to fix things.
You will have to do an awful lot of very messy bit manipulation after
figuring out what you have to do. I can understand people not wanting to
change the format of that file, as it might be used in other places and the
changes would cascade, but a far better solution is to add another program
on the mainframe that takes that file as input and outputs the same file
with the data in display format, with appropriate signs expression and FTP
that file to Windows.
You also have interesting issues with signed COMP fields (the fact that
it's one's complement rather than 2's complement has been explained), and
with any COMP fields greater than 9(10) - a single 2200 word - which may,
depending on the age of the compiler that created the data, introduce
another bit-shift that nobody has yet mentioned.
The amount of work involved in trying to decode this file on a non-2200
is astronomical. The amount to do it on a 2200 is truly trivial. If the
people supplying the file to you won't co-operate, what sort of business
deal is this? Are they trying to make it hard for you deliberately?
From where you sit, it would almost certainly be cheaper to enlist the
help of someone else with OS2200 time to assist you, rather than trying to
decode the data. I think I already saw someone make the offer; if not,
assuming that you can demonstrate that you are bona-fide entitled to access
this data, I could quote for doing it from Unisys for you as a service if
you can send me a binary FTP of the file.
Regards
Colin
"Shawn Hogan" <sho...@NOSPAMwhitehilltech.com> wrote in message
news:ji63b.619$Ej.7...@ursa-nb00s0.nbnet.nb.ca...
Shawn
"Stephen Fuld" <s.f...@PleaseRemove.att.net> wrote in message
news:9v63b.115972$0v4.8...@bgtnsc04-news.ops.worldnet.att.net...
You should be able to cause the compiler to do exactly that by declaring the
destination item "PIC S9(9)V99 USAGE DISPLAY SIGN LEADING SEPARATE". If you
do that, the compiler should take care of the "+" or "-" appropriate to the
value of the COMP source. The SIGN clause has been standard COBOL since
'74, though some implementations (notably IBM, probably among others)
included it in their '68 implementations as an extension. SIGN LEADING does
the equivalent of having the implicit PIC X *before* the item, SIGN TRAILING
has it *after* the item.
-Chuck Stevens
That should have been SIGN LEADING SEPARATE and SIGN TRAILING SEPARATE
respectively.
-Chuck Stevens
The first 12 bits of the record header indicate whether the record is a
control record and the length of the record that follows in words. If the
most significant bit is set, it is a control record and the first six bits
determine what kind of control record it is and the second six bits the
length of the record. If the most significant bit is not set, the next
eleven bits are the length of a data record. The largest SDF record is 2047
words for a data record and 63 words for a control record. The remaining 24
bits of the word depend upon the flavor of SDF file.
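Read literally, that header description decodes like this. A sketch against a 36-bit word held in a Python int; the assumption that the "first six bits" of a control record include the set MSB is mine, so check the Data Structures manual mentioned below before trusting the exact field positions:

```python
def parse_sdf_header(word36):
    """Split the first 12 bits of a 36-bit SDF record-header word,
    per the description above.
    Returns (kind, control_type_or_None, length_in_words)."""
    top12 = (word36 >> 24) & 0o7777
    if top12 & 0o4000:                    # MSB set: control record
        return ("control", (top12 >> 6) & 0o77, top12 & 0o77)  # 6 + 6 bits
    return ("data", None, top12 & 0o3777)  # next 11 bits are the length

# Largest cases from the text: 63-word control record, 2047-word data record.
```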
If you really want to do this, I suggest that you go to
http://public.support.unisys.com and click on the documentation link. After
you accept Unisys' terms with a mouse click you can search out and download
some manuals. I suggest select "Title containing" and searching on the
following:
DATA STRUCTURES - This manual will give you a description of the various
SDF data and control records.
PCIOS - This is the routine used by the compiler languages to
do I/O. While the maximum record size is 2047 words, PCIOS will segment
large records.
CPFTP - This is the currently favored FTP server.
The -nnn at the end of the manual number is the revision level.
Another thing to watch out for: The Unisys is a ones complement machine
rather than a twos complement machine. This basically means that we have a
negative zero and the negative numbers are off by one (i.e., all bits set
indicate a negative zero rather than a minus one).
On 8/27/03 2:33 PM, in article DP63b.632$Ej.8...@ursa-nb00s0.nbnet.nb.ca,
Shawn
"George Ewins" <geo...@ewins.org> wrote in message
news:BB72AD7E.217C%geo...@ewins.org...
|Does anyone know of any way to "prepare" the data for a transfer from the
|9-bit to the 8-bit world? It seems to me that if you knew the structure of
|the records, you could write a program, on the mainframe side, to move the
|most significant bit from one byte to the next(sometimes creating a new
|byte) and not lose any data. Is there any other way?
If you're going to the trouble to do that, it would be simpler and more
productive to write a program to expand the COMP field into text
representations and just copy it over in ASCII as a print file. Surely?
--
Marc Wilson
___________________________________________________________________
Cleopatra Consultants Limited - IT Consultants - CoolICE Partner - MAPPER Associate
Tel: (44/0) 70-500-15051 Fax: (44/0) 870-164-0054
Mail: in...@cleopatra.co.uk Web: http://www.cleopatra.co.uk
___________________________________________________________________
MAPPER User Group mailing list: send *SUBSCRIBE to M...@cleopatra.co.uk
What is wrong with doing the conversion on the host? The conversion
would take a couple of hours, including writing any programs necessary.
Converting on a PC will take (at a guess) weeks.
Quite honestly, the only reason not to do it there is if it is not
'your' data and you can't expect anyone on the host to support you.
Your E-Mail address leads me to believe that this is not the case and
that the two of you are involved in a project of some kind. If this
is the case, you are learning how to bash your heads up against a
brick wall. You might even succeed in battering a hole in the wall,
but using the door would be far easier.
"Shawn Hogan" <sho...@NOSPAMwhitehilltech.com> wrote in message news:<Z1m3b.970$Ej.1...@ursa-nb00s0.nbnet.nb.ca>...
> Thanks George, You've been a great help!
>
> Shawn
>
> "George Ewins" <geo...@ewins.org> wrote in message
> news:BB72AD7E.217C%geo...@ewins.org...
snip
Thanks for your help,
Shawn
"andrew williams" <andrew....@t-online.de> wrote in message
news:995f2314.03082...@posting.google.com...
You could write the program, the client would just have to run it.
However, if you want to try, by all means go ahead; I think we've given
you pretty near enough info to work with.
Regards
Colin
"Shawn Hogan" <sho...@NOSPAMwhitehilltech.com> wrote in message
news:F3L3b.1839$Ej.2...@ursa-nb00s0.nbnet.nb.ca...
I don't understand the constant cries of how horribly messy and
hopeless decoding this file would be once it's transferred in binary
mode. Bit string manipulation is a SMOC. Converting between ones
complement and twos complement (and even signed magnitude, as MCP
systems use) is a SMOC. We're talking about lines of code that you
could count on your fingers. Well, fingers and toes. Sure, those few
lines must be carefully designed and written, but it's only rocket
science, not something really hard. You can't write a COBOL layout for
it, but you can extract the fields without difficulty. (Hint: write a
subroutine that returns the next n bits from the input as a full word
and changes from ones to twos complement.)
I once decoded MCP printer backup files under OS/360 (running on an
Amdahl V8). This requires bit field manipulation on the control words.
I wrote and debugged the program in PL/I on the B6700, then moved it to
the Amdahl with only one minor thing to fix (and that had nothing to do
with the bit manipulation). Only PL/I program I ever wrote. Yes, the
bit manipulation discussed here is more complex, but not a whole lot
more.
I totally agree that by far the simplest way to handle this is to
convert it with a trivial COBOL program before transfer. Not only is it
easier, but if this has to be done for many files, this will avoid a
wretched continued synchronization problem as the file formats change.
And of course a binary file transfer is essential. But it sounds like
some 2200 people are too wrapped up in the "difficult format" issue,
instead of just looking at the problem as a bit string manipulation
issue.
Here's an untested (not even desk-checked!) Algol procedure to get the
next nbytes 9-bit bytes and convert from ones complement to signed
magnitude. Obviously this is not directly useful to the OP, who is
running on a Windows system and not an MCP system; I simply present it
to show that the problem isn't that hard. A mixed record would require
a similar "getdisplay" routine in addition to this routine. Fields with
more than 11 digits would need a slightly more complex routine to return
a double-word integer. It would also be easy to add a conversion from
digits to bytes according to the table that George Ewins posted, or
presumably by calculation, but for this example I've done that by hand.
If one wants to get really fancy, one could parse the COBOL record
definition. Now *that's* messy.
real array therecord [0:whatever];
integer nextbit; % first bit in record is number 0
integer procedure getcomp (nbytes);
value nbytes;
integer nbytes;
% getcomp will fault if nbytes > 5.
% getcomp will return incorrect results, and possibly fault, if the
% actual number of bits in the signed magnitude value (excluding
% sign) is > 39.
% A production routine would check for these errors.
begin
integer currentword, currentbit, bitsneeded;
% first bit in record is word 0, bit 47
real returnvalue; % unscaled -- caller must do any needed scaling
currentword := nextbit div 48;
currentbit := 47 - (nextbit - currentword * 48);
bitsneeded := nbytes * 9;
if bitsneeded <= currentbit + 1 then
returnvalue := therecord [currentword].[currentbit:bitsneeded]
else begin
% assume it's no more than 48 bits, since we can't handle more ;-)
returnvalue := 0
& therecord [currentword]
[currentbit:bitsneeded-1:currentbit+1]
& therecord [currentword+1]
[47:bitsneeded-currentbit-1:bitsneeded-currentbit];
end;
% if negative, convert from ones complement to signed magnitude
if returnvalue.[bitsneeded-1:1] = 1 then
returnvalue :=
- real(not(boolean(returnvalue.[bitsneeded-2:bitsneeded-1])));
getcomp := returnvalue;
end getcomp;
integer field1, field2, field3; % etc
<<read a record>>
nextbit := 0;
% extract fields. digits have been manually converted to bytes
% using George Ewins' table.
field1 := getcomp (3);
field2 := getcomp (3);
field3 := getcomp (2);
% etc etc ad nauseum
Edward Reid
>On Wed, 27 Aug 2003 14:11:17 -0400, Stephen Fuld wrote
>> You will have to do an awful lot of very messy bit manipulation after
>> figuring out what you have to do.
>
>I don't understand the constant cries of how horribly messy and
>hopeless decoding this file would be once it's transferred in binary
>mode. Bit string manipulation is a SMOC. Converting between ones
>complement and twos complement (and even signed magnitude, as MCP
>systems use) is a SMOC.
But if it's been spuriously translated from "EBCDIC" to ASCII?
Sometimes more than one EBCDIC character is mapped to the same ASCII
character, so you couldn't even reliably translate it back.
We have established what it is, and EBCDIC is not part of the equation.
>On Wed, 27 Aug 2003 14:11:17 -0400, Stephen Fuld wrote
>> You will have to do an awful lot of very messy bit manipulation after
>> figuring out what you have to do.
>
>I don't understand the constant cries of how horribly messy and
>hopeless decoding this file would be once it's transferred in binary
>mode. Bit string manipulation is a SMOC. Converting between ones
>complement and twos complement (and even signed magnitude, as MCP
>systems use) is a SMOC. ...
Being from the A-series side, you don't understand what the other
poster is saying. Your problem is that you assume that after a
binary file transfer, all you get is a bit-shifted but otherwise
un-adulterated data. Not so. Bits of the OS2200 file system get
transferred as well, more specifically, you will need to decode
the SDF control and data structures. True it can still be done,
but it's not as simple as you think it is.
/Leif
"Chuck Stevens" <charles...@unisys.com> wrote in message news:<bij16b$26uo$1...@si05.rsvl.unisys.com>...
As I said, the transfer must be done in binary -- no one disagrees on
that point. If it's not done in binary, then the task is impossible
rather than difficult. Those are qualitatively, not quantitatively,
different situations.
On Mon, 1 Sep 2003 21:28:23 -0400, Paul wrote
> Being from the A-series side, you don't understand what the other
> poster is saying. Your problem is that you assume that after a
> binary file transfer, all you get is a bit-shifted but otherwise
> un-adulterated data. Not so. Bits of the OS2200 file system get
> transferred as well, more specifically, you will need to decode
> the SDF control and data structures. True it can still be done,
> but it's not as simple as you think it is.
I read George Ewins' posting on this topic. It still sounds to me like
a SMOC. Yes, a few dozen more lines, but still just code. The OP would
only need to decode one type of file. (I also assumed the simplest byte
ordering, but no one has mentioned byte ordering being an issue.)
I will say that if binary FTP is adding system control information to
the data that a program reading the file directly would see, then it's
broken. The whole point of binary FTP is to transfer unadulterated data
WITHOUT additional information. For example, on MCP systems, doing a
binary FTP preserves all the bits but loses all the attribute
information, including information about record size. This is a PITA
because it means that if you send an MCP file somewhere by binary FTP
planning to bring it back to an MCP system, you need to package it
first (called "wrap"), but that's tough because this is how binary FTP
works! It sounds like you are saying that the 2200 sends the file
system information without being asked (and sends it only on binary and
not on text FTP), and that's wrong. That would be like an MCP system
sending the file header as part of the file.
You may be accustomed to thinking in terms of files being stored in
this format. But the point is that if a program writes a file and then
I FTP it in binary mode, I should get exactly the bits that the program
saw itself writing. Any other bits are part of the file system. Or to
put it another way, binary FTP is not a raw dump of the disk sectors,
it's an untranslated bit stream of the data that a program sees.
Edward Reid
I think you are still missing the point. Everyone agrees that it is a
"MOC". The disagreement here is over how "S" it is.
The control information that was referred to earlier is not something
added by FTP; it is part of the file format. There are control words
"embedded" in the file at various places, including the start of a record,
the start of a block (if a record spans a block), the start of the label at
the beginning of the file, etc. While each of these words is a single 36
bit word, the formats vary slightly depending on what type they are and in
general, are not "byte" oriented, but are based on the old sixths of words,
so multiples of 6 bits. In normal host usage, these images are stripped by
various pieces of code so the user doesn't have to concern himself with
them, but in a binary FTP, they will be transferred because they are part of
the file. Thus the code on the Windows side would have to decode these
control words as well as the actual user data. Certainly doable, but more
work than might be apparent to someone not knowing about such things.
I think the point you're missing is that the program isn't writing "bits"
that go to the file as is. It's talking to a file handler that's
adding file labels, control words, etc, in addition to translating and
packaging the data. The FTP process is sending that AS IS, which is
what you'd expect, but the recipient doesn't have the advantage of
having the handler to unpack the data. That means the recipient must
know about file labels, control words, etc, and in addition must be
cognisant of the file format since you unpack the data differently
based on field type. Binary data comes through as large integers, meaning
slack bits must be maintained. You need to know the defined field size since the sign
bit will have a different location if it's quarter, half, or a full
word. ASCII or packed decimal fields need to have the slack bits
dropped and repacked into byte alignment. Then you have fun stuff to
deal with like decimal alignment and god forbid unpacking the mantissa
on floating fields.
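That sign-position point can be sketched in Python (a sketch only, assuming the field is stored as a 1's-complement integer in a 9-, 18-, or 36-bit cell, which matches the ACOB COMP description later in the thread):

```python
def decode_ones_complement(raw: int, width: int) -> int:
    """Decode a width-bit 1's-complement integer; width would be 9, 18,
    or 36 for a quarter-, half-, or full-word field."""
    mask = (1 << width) - 1
    raw &= mask
    if raw & (1 << (width - 1)):      # sign bit set: negative value
        return -((~raw) & mask)       # complement all width bits
    return raw
```

For example, decode_ones_complement(0b111111010, 9) gives -5; an all-ones cell decodes to (negative) zero, a 1's-complement quirk a decoder has to accept.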
Sure, it's just lines of code, but it's a lot of lines that will have
to be done custom for each file unless you write an FD parser. You're
not going to be able to recognize field types on the fly since there's
not going to be any markers to tell you the difference between two
integers, two characters, one big integer that doesn't use bits in the
middle, etc.
The original poster seems to want a generic PC solution to work with
many files. That will be a tremendous development effort compared to
the rather trivial effort needed to ship a COBOL program to the host
that pulls in an existing COPY book and compiles it.
"The file" is what a user program sees when it opens the file and reads
it. You are telling me that the user program does not see these bits,
and yet binary FTP moves them. This is wrong. Binary FTP does not mean
send a different set of data; it simply means to send the SAME data
with no translation, simply as a string of bits. (RFC959 does not state
this explicitly, but even a quick reading of the document makes it
clear that ASCII vs binary is merely a difference of encoding.)
Sending this structural data in binary FTP makes no more sense than,
say, sending the unused bits of a sector at the end of a block (an MCP
phenomenon; these bits are not sent), or the ECC codes from a disk
sector, or the (more voluminous) ECC codes from a CD. All of these are
quite properly not sent.
Again, binary vs ASCII (or EBCDIC) mode is simply a matter of
transformation. Neither a client nor a server should send a different
file depending on the transfer mode. To put it another way, if a user
program uses a handler to read a file, then the FTP client or server
should use the same handler to read the file, and then send what it
reads either in binary or translated depending on the transfer mode.
On Tue, 2 Sep 2003 16:49:27 -0400, Keith Stone wrote
> Sure, it's just lines of code, but it's a lot of lines that will have
> to be done custom for each file unless you write an FD parser. You're
> not going to be able to recognize field types on the fly since there's
> not going to be any markers to tell you the difference between two
> integers, two characters, one big integer that doesn't use bits in the
> middle, etc.
True, but also true even if the file is written in DISPLAY as it should
be. In this respect the 2200 is no different from any other source
system. The advantage of a properly written file is that the content is
well defined by the COBOL standards and thus can be decoded without
reference to the specifics of the creating system, and that it can be
read by another COBOL program using the same record layout. COBOL
records are not intrinsically self-describing. If you need
self-describing, you will write your output as XML or some other tagged
format.
> The original poster seems to want a generic PC solution to work with
> many files. That will be a tremendous development effort compared to
> the rather trivial effort needed to ship a COBOL program to the host
> that pulls in an existing COPY book and compiles it.
I agree completely that a properly written file -- all fields DISPLAY,
numeric fields SIGN LEADING SEPARATE, and yes definitely no floating
point! -- is going to be a great deal easier, and I join all others
here in recommending this approach. As I said before, no argument on
this point.
If the OP has COBOL available on the receiving system, then it really
will be enormously easier to use the properly formatted files, since
the identical record layouts can be used. But if the target system is
to read the files in any other language, then even properly written
files are going to be quite a chore, and a really general solution will
require a large amount of work anyway to make the COBOL layout
automatically accessible to the code in another language.
But I didn't see the OP asking for a generic PC solution. He said "Here
is the record structure that i'm trying to read and the number of bytes
that i think the data should take." I took that as an indication that
the project was to read a specific file (or a very few files), not just
any file coming from the 2200.
Edward Reid
It seems clear to me that you don't understand the issue with different file
formats on the 2200. Let me give a simplified example. (I am assuming here
that the file is in SDF format, which was not explicitly stated, but seems
to be assumed by others). The records are variable length. For example, in
a file containing text lines, trailing blanks are automatically trimmed by
the file handler. There is no CR or LF or something like that at the end of
each record. The length of the record is one of the things in the control
word in front of each record. If you deleted the control words, you could
not tell from the stream of bits coming at you where one record ended and
the next started.
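For illustration only, here is what walking such a stream looks like, with a *hypothetical* control-word layout (the low 18 bits are assumed to hold the record length in words; that layout is invented purely to show why the control words are needed to find record boundaries -- the real SDF control-word formats vary, as described above):

```python
def walk_records(words):
    """Split a stream of 36-bit words into records. Each record is
    preceded by one control word whose low 18 bits are ASSUMED here to
    hold the record length in words (a hypothetical layout)."""
    records, i = [], 0
    while i < len(words):
        reclen = words[i] & 0x3FFFF   # hypothetical length field
        records.append(words[i + 1 : i + 1 + reclen])
        i += 1 + reclen
    return records
```

Strip the control words and the loop has no way to know where one record ends and the next begins.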
Let me try explaining from another perspective to see if that helps. There
are two ways for a program to read a file on the 2200. One is by doing
direct I/O to the disk, which retrieves all the data bits, perhaps modulo
any trailing stuff at the end of a block (but this method has no
understanding of what a data block is, so you probably read those too). No
ECC bits are sent. The other is via use of what, on some other systems is
called an "access method". For example, IBM has several of these, one
purpose of which is to strip out any control words giving the data length,
etc. If you use the access method, it is assumed that you know what you are
doing with regard to handling variable length records, etc. There are
several file formats on the 2200 and, in general, there is nothing in the
directory entry to tell you which one is being used. You can frequently
tell by reading the first few words. However, most programs that don't want
to deal with the complexity of worrying about what the file format is and
which access method to use, etc. just use the direct I/O to get the bits of
the file. I don't know, but I suspect that the FTP reader does this. If it
did, then it would send the control words because it is not going through
the code that would strip them out.
I hope this clarifies things.
BTW, on A series, how are true variable length records (such as from a COBOL
Occurs Depending On clause, or Fortran writes with different I/O lists)
handled on disk?
No, Ed, I think you're mistaken, even from a MCP/AS perspective. Here's
what I think is a parallel example in that context:
Physically a KEYEDIOII file (used for COBOL74 and COBOL85 indexed and
relative files) has a bunch of extra "stuff" in it. A COBOL74 or COBOL85
program is completely ignorant of that extra stuff, and does "normal" I/O
just as for a sequential (or even old-form indexed -- AKA KEYEDIO -- file).
Decoding of this information is actually delegated by the MCP to the
KEYEDIOII library which passes the actual user information back through the
MCP *as if* it had been a "normal" I/O.
It is, I believe, possible to set file attributes in the application program
to read the "raw data" from the file, but I would not expect the results in
the application program to be meaningful if the application program was
expecting only the *user* data.
Accessing a "raw" KEYEDIOII file with a user program is fraught with all the
same pitfalls and cautions as accessing a "raw" DMSII data set in a user
program. The expectation that a KEYEDIOII *physical file* record would
match the 01-level record describing the user data is as unreasonable as the
expectation that a physical record in a DMSII data set would match the
logical record description of that record as passed by DMINTERFACE to a user
program, or the record description of the data set as it is passed by
DMINTERFACE into the compilers. It's very much like attempting to read a
DMSII data set as if it were a file.
I have no significant 2200 experience (last time I worked on something like
this was on an 1108 in something like 1971), but by all the discussion it
appears to me that much if not all 2200 file handling is more like the A
Series KEYEDIOII mechanism than it is MCP/AS "direct file" handling.
-Chuck Stevens
As Keith and others stated, you need to know the file structure. To make it
even more interesting, for binary file transfers from OS 2200 you need to
know what software was used as the "ftp server" and which options were
selected.
- There are at least three products that act as "ftp servers",
  and they are NOT compatible:
  - Unisys TCP/IP Application Services (TAS)
  - Unisys cpFTP (ClearPath FTP)
  - Attachmate's ftp product for OS 2200 series systems
    (I don't have any documentation available for this product.)
- Options control how the 9th bit of each quarter word is handled.
The following quotes are from the Unisys FTP Services for ClearPath OS 2200
User's Guide, 7847 5753-007.
"3.3.10. BINARY, IMAGE, TENEX, L8-Set the Data Type to IMAGE
The BINARY, IMAGE, TENEX and L8 commands set the data types to image.
The AFMT, CFMT, and TASC commands determine how the ninth bit is handled
when any of the IMAGE types are selected. AFMT is the default."
"3.3.46. TASC-Set the Binary Subtype to TAS Compatible
The TASC command sets the binary subtype to be compatible with TAS. This
command communicates with TAS software. This command applies to binary
transfer mode only. For more information, see 3.7."
"3.5.2. Binary Transfer of a Data File
The transfer of binary data between the client and server is called binary
or image transfer. The data is transferred without record separator codes
such as CR/LF, so no data is added. Data is transferred only during the
binary transfer of an SDF file. All SDF control words are omitted.
FTP Services provides six subtypes for binary transfer (see Table 3-4 and
Table 3-5). When AFMT, SDF, PCIOS, or IOW is specified, the ninth bit is
not sent. When CFMT or TASC is specified, all bits are sent.
..."
So the default binary file transfer using cpFTP (AFMT) does NOT preserve the
ninth bit. The default binary transfer for TAS DOES preserve the ninth
bit.
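A sketch of what that ninth-bit difference means for a decoder (assuming an AFMT-style transfer emits each 9-bit quarter-word as one 8-bit byte with the ninth bit dropped, while a TASC-style transfer preserves all 36 bits; the framing and names here are illustrative, not the actual wire format):

```python
def afmt_bytes(word36: int) -> bytes:
    """AFMT-style sketch: each 9-bit quarter-word goes out as one 8-bit
    byte with its ninth (high) bit dropped -- data in that bit is lost."""
    quarters = [(word36 >> shift) & 0x1FF for shift in (27, 18, 9, 0)]
    return bytes(q & 0xFF for q in quarters)

def tasc_bits(word36: int) -> str:
    """TASC-style sketch: all 36 bits are preserved (shown as a bit
    string here for clarity)."""
    return format(word36 & ((1 << 36) - 1), '036b')
```

Any quarter-word with its ninth bit set comes out of the AFMT-style path changed, which is why a format that preserves the 9th bit is mandatory for COMP data.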
The task is not impossible, but it is NOT simple. In addition to specifying
a binary transfer they MUST specify a format that preserves the 9th bit. If
the owner of the data is not willing to run a simple COBOL program to build
a pure text file, will they be receptive to changing the format used to FTP
the file?
cheers,
Mike
>"The file" is what a user program sees when it opens the file and reads
>it. You are telling me that the user program does not see these bits,
>and yet binary FTP moves them. This is wrong. Binary FTP does not mean
>send a different set of data; it simply means to send the SAME data
>with no translation, simply as a string of bits. (RFC959 does not state
>this explicitly, but even a quick reading of the document makes it
>clear that ASCII vs binary is merely a difference of encoding.)
I think you're missing some of the complexities. A different encoding
could include control strings.
--
RB |\ © Randall Bart
aa |/ ad...@RandallBart.spam.com Bart...@att.spam.net
nr |\ Please reply without spam I LOVE YOU 1-917-715-0831
dt ||\ http://RandallBart.com/ Ånåheim Ångels 2002 World Chåmps!
a |/ Multiple sclerosis: http://www.cbc.ca/webone/alison/
l |\ DOT-HS-808-065 The Church Of The Unauthorized Truth:
l |/ MS^7=6/28/107 http://yg.cotut.com mailto:s...@cotut.com
From the sounds of it, I think we have a match. The COBOL (and PL/I or
any PCIOS) file handlers attempt to leave data in its native format
and handle all the buffered I/O by using control words, slack records,
etc. to optimize the I/O processing independent of the higher-level
program. I've written some lower-level routines to access PCIOS (which
are in SA Utilities if you look real close, think MSAM from PLUS) and
always thought it was an excellent I/O handler.
This interface does lead to problems when moving the raw files
generated through binary FTP to a foreign platform. If we look at an
example ACOB structure:
01 MYREC.
03 MYSTRING  PIC X(04).        (36 bits, 9-bit byte aligned)
03 MYBINARY  PIC 9(10) COMP.   (36 bits, high bit sign)
03 MYPACKED  PIC 9(08) COMP-3. (72 bits, 9-bit byte aligned, signed packed)
03 MYFLOAT   PIC 9(10) COMP-1. (36-bit float, 6-bit mantissa)
03 MYBIT     PIC 1(18) PIC-1.  (18-bit binary)
If memory serves (and I'm too lazy to break out the manual) this record
will take up 198 bits. On disk it'll take up 252 bits because there'll
be one control word (36 bits) at the front and the final 18 bits will be
padded to an even word. In raw binary the end of the record (or the
front) will contain 4 bits from the preceding or following record on
the foreign system running 8-bit bytes.
Now this record size works out well because it's 7 words long and fits
nicely into a 28-word sector. There won't be any padding due to
blocking. Say we add a PIC X(12) field to make it an even 10 words and
block at 1792, an even track; then we'll have a two-word record at the
end of each track consisting of a control word and a null word before
the next block.
If you decoded 32256 8-bit bytes at a shot you'd get 4 tracks, and I've
rarely seen files blocked higher than that. The sub-increment blocks
would mostly be contained as a subset of that boundary. You'd still have
to know the record layout, since the byte alignment can change
throughout the record depending on the native type.
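The arithmetic above checks out; a quick sanity check of the track and record sizes:

```python
# Track size: 1792 36-bit words = 8064 8-bit bytes, so 32256 bytes
# is exactly four tracks.
words_per_track = 1792
bits_per_word = 36
bytes_per_track = words_per_track * bits_per_word // 8

# The five fields of MYREC above: 36 + 36 + 72 + 36 + 18 data bits,
# plus one 36-bit control word, padded up to a whole (even) word.
data_bits = 36 + 36 + 72 + 36 + 18
on_disk = data_bits + 36          # add the control word
on_disk += (-on_disk) % 36        # pad to a word boundary
```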
I just wanted to thank the members of the group for enlightening me about
some things related to this question.
Having done COBOL only on IBM mainframes and 1100 series systems with older
compilers, I assumed that all compilers did what both of these did - store
COMP fields in binary. I was surprised to find out that apparently A series
uses packed decimal (which would have been COMP-3 on IBM and older 1100
series compilers) and that apparently the compilers on the newer 1100
series successors store them as display. Neither of those would have
occurred to me and I am grateful for being disabused of my ignorance. It
is truly a case where knowing multiple systems broadens your conception
of what is possible.
I was also unaware of the "text" method (using text on the data item
description) of specifying where the sign is put on display items. Does
this method yield the same results as what I would, given my particular set
of experiences, have done, specifically putting a plus sign in the
appropriate place in the picture clause?
In a word, Stephen - Yes.
Colin
Thanks Colin. I just keep learning more and more. Good stuff.
The ACOB manual is at
http://public.support.unisys.com/2200/docs/ix61/PDF/78307709.PDF, and the
stuff about the SIGN clause is on page 144.
An 'S' at the start of the PIC clause, in all COBOL standards, tells the
compiler to apply whatever operational signing convention it considers
appropriate. It says nothing about how the sign is represented, merely that
the field must behave as signed. How the compiler chooses to express that
sign is an implementor option.
On COMP (or COMP-n) fields in the '74 standard, the nature of the sign
depends on the hardware; for ACOB, as you know, COMP uses 1's complement
binary, COMP-3 is packed decimal with the sign in the trailing half-byte,
COMP-1 and COMP-2 are floating point, etc. If you put an S in a PIC for a
character numeric (DISPLAY) in ACOB, then as I recall you get what we know
as "trailing sign overpunch", with the trailing character showing the value
that would result from putting a + sign (row 12) or a - sign (row 11) onto
the same column on the 80-column card as the numeric value in that position.
It was possible to use the SIGN clause in ACOB as well, with exactly the
same interpretation as for UCOB.
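A sketch of decoding that trailing overpunch on a receiving system, assuming the common card-code-derived ASCII mapping ('{' and 'A'..'I' for +0..+9, '}' and 'J'..'R' for -0..-9); the exact characters vary by system and character set, so treat the tables as assumptions:

```python
# Assumed overpunch tables (card row 12 -> positive, row 11 -> negative).
POSITIVE = {'{': 0, **{chr(ord('A') + d - 1): d for d in range(1, 10)}}
NEGATIVE = {'}': 0, **{chr(ord('J') + d - 1): d for d in range(1, 10)}}

def decode_overpunched(s: str) -> int:
    """Decode a trailing-sign-overpunch numeric, e.g. '012C' -> 123."""
    last = s[-1]
    if last in POSITIVE:
        sign, digit = 1, POSITIVE[last]
    elif last in NEGATIVE:
        sign, digit = -1, NEGATIVE[last]
    else:                          # plain digit: treat as positive
        sign, digit = 1, int(last)
    return sign * (int(s[:-1] or '0') * 10 + digit)
```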
With COBOL '85, the usage of COMP has been made very explicitly
vendor-dependent, but words such as PACKED-DECIMAL and BINARY have been
introduced to allow the user to force the radix more explicitly. The UCOB
implementation (assuming you're not using COMPAT options to force '74
compatibility) renders COMP as character numeric with trailing overpunch,
pretty much the same as signed DISPLAY was in ACOB; I assume this is because
UCOB only runs on modern 2200's, with character arithmetic hardware, and
they decided that that would be efficient enough.
The "SIGN" clause variants are for numeric DISPLAY fields only, and
explicitly state the position (LEADING / TRAILING) and style (SEPARATE
CHARACTER or not) of the sign character. If it's not SEPARATE CHARACTER,
then (for 2200) you get the "overpunch" code again.
Just to add to the fun, the MCP folks do it totally differently. I think I
have this right ... but no doubt someone will correct me if I don't!
N.B. This next paragraph is for real bit-nerd types like me. Skip if you
find it boring!
For MCP COBOL '74, COMP gives binary like 2200, but it's word-sized rather
than byte-sized, so PIC 9 COMP gives you a whole (48-bit) word, just like
all PICs up to PIC 9(11) COMP. And unlike the 2200, that word is always
word-aligned as well. More than 9(11) COMP goes to double-word. That seems
odd, given 48 available bits; but it's because the A Series doesn't actually
have integer arithmetic - it's a floating-point-only machine. The
difference is that if the exponent (top 9 bits) is zero, the alignment point
of the value is at the right-hand end of the 39-bit mantissa, unlike the
left-aligned mantissa of 2200 floating point values. So there's actually
only 39 bits to represent the mantissa, which means they have to go to 2
words at 9(11); and a single precision floating point value up to 39 bits in
size can be correctly represented at the hardware level with a zero
exponent - so it looks like an integer, even though actually it's floating
point. And then, to make it even more complex, an A Series double precision
floating point number is held in such a fashion that if you look at the
first word as a single-precision FP value, you get the right value, just
with less precision - they put the most significant exponent bits in with
the least significant mantissa bits, or something like that, so as to
achieve this. Oh, and to round it off, the exponent is in powers of 8 rather
than powers of 2 - so that floating-point field has an absolutely awesome
range and precision, with 18 or so bits of power-of-8 exponent and about 78
bits of mantissa.
Then, just to be different, A Series packed decimal is optionally signed;
unlike the IBM mechanism that ACOB emulates (and UCOB implements in 2200
packed decimal hardware), the trailing half-byte doesn't necessarily have to
be a sign; and if it isn't, the space isn't used. Furthermore, if you have
an odd number of 9's in an unsigned PIC (or an even number in a signed PIC),
then if I recall correctly the next numeric, if any, starts on the half-byte
boundary.
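For the receiving side, a sketch of decoding a packed-decimal field with an optional trailing sign nibble (the 0xD-means-negative code is the IBM-style convention that ACOB emulates, an assumption here; whether a given A Series field carries the sign nibble at all is exactly the option described above):

```python
def decode_packed(data: bytes, digits: int, signed: bool) -> int:
    """Decode packed decimal: one digit per 4-bit nibble. If signed,
    the nibble after the last digit is a sign code (IBM-style: 0xD
    negative, anything else positive). Unsigned fields simply have no
    sign nibble, mirroring the A Series option discussed above."""
    nibbles = []
    for b in data:
        nibbles += [b >> 4, b & 0x0F]
    value = 0
    for n in nibbles[:digits]:
        value = value * 10 + n
    if signed and nibbles[digits] == 0x0D:
        value = -value
    return value
```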
And the other oddity is that even in '74 COBOL, the MCP folks did proper
decimal truncation in binary COMP fields; if you moved 13 to a PIC 9 COMP
field, you'd get a value of 3 in that big word, whereas on the 2200 with
ACOB you'd still get 13 (binary 1101) in the 6- or 9-bit byte. I'm glad to
say that UCOB has corrected that non-standard implementation if you don't
use the COMPAT option.
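The difference is one line each way (decimal truncation per the standard, versus the old ACOB behaviour of keeping the binary value because it still fits in the byte):

```python
# Moving 13 to a PIC 9 COMP field:
value = 13
standard = value % 10          # decimal truncation: keeps only 3
acob_like = value & 0x1FF      # binary 1101 kept whole in a 9-bit byte
```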
I hope that hasn't bored everyone to distraction ...
Colin
"Stephen Fuld" <s.f...@PleaseRemove.att.net> wrote in message
news:IkR6b.130387$0v4.9...@bgtnsc04-news.ops.worldnet.att.net...
> On COMP (or COMP-n) fields in the '74 standard, the nature of the sign
> depends on the hardware; ...
Umm... not quite. The only USAGEs specified in the '74 standard are
DISPLAY, COMP, and (if the implementor has chosen to provide the "table
handling" module in the language), INDEX.
Basically, the '74 standard says COMP is capable of representing a value
used in computations, must be numeric, must have a PICTURE clause in which
the only allowed characters are S, 9, V and P, and not much else.
Everything about the layout of data is up to the implementor. COMP-3 (or
any other hyphenated-COMP, or indeed any other usage like MCP/AS COBOL74
REAL, DOUBLE, BINARY, EVENT, LOCK, TASK, etc.) is itself an implementor
extension, not standard COBOL.
> With COBOL '85, the usage of COMP has been made very explicitly
> vendor-dependent,
That's not new; it's always been true.
> but words such as PACKED-DECIMAL and BINARY have been
> introduced to allow the user to force the radix more explicitly.
True; BINARY requires that the implementor use a radix of two for its
representation. PACKED-DECIMAL has two requirements imposed by the
standard: 1) that it use a radix of ten; 2) that each digit position must
occupy the minimum possible configuration in computer storage (which,
presuming a binary system, pretty much requires four bits as I see it).
Everything else about both USAGEs is up to the implementor, including the
question as to whether USAGE BINARY is a single format covering all values
or in fact varies according to the number of digits in the PICTURE clause
("Sufficient computer storage must be allocated by the implementor to
contain the maximum range of values implied by the associated decimal
PICTURE character-string.").
> The "SIGN" clause variants are for numeric DISPLAY fields only, and
> explicitly state the position (LEADING / TRAILING) and style (SEPARATE
> CHARACTER or not) of the sign character. If it's not SEPARATE CHARACTER,
> then (for 2200) you get the "overpunch" code again.
> Just to add to the fun, the MCP folks do it totally differently. I think I
> have this right ... but no doubt someone will correct me if I don't!
MCP COBOL74 allows specification of SIGN LEADING and SIGN TRAILING on COMP
fields in addition to DISPLAY; COBOL85 applies these to PACKED-DECIMAL as
well. They also allow the specification of DEFAULT COMP SIGN and DEFAULT
DISPLAY SIGN in SPECIAL-NAMES, which I think is a really useful extension.
The COMP sign applies to PACKED-DECIMAL in COBOL85 since they're the same in
that dialect.
> For MCP COBOL '74, COMP gives binary like 2200, but it's word-sized rather
> than byte-sized, so PIC 9 COMP gives you a whole (48-bit) word, just like
> all PICs up to PIC 9(11) COMP.
Yep.
> And unlike the 2200, that word is always
> word-aligned as well.
Nope.
> More than 9(11) COMP goes to double-word. ...
Yep.
> Then, just to be different, A Series packed decimal is optionally signed;
> unlike the IBM mechanism that ACOB emulates (and UCOB implements in 2200
> packed decimal hardware), the trailing half-byte doesn't necessarily have to
> be a sign; and if it isn't, the space isn't used. Furthermore, if you have
> an odd number of 9's in an unsigned PIC (or an even number in a signed PIC),
> then if I recall correctly the next numeric, if any, starts on the half-byte
> boundary.
Yes, which means that the four bits associated with a COMP sign aren't
wasted UNLESS they're followed by something that requires alignment at
something coarser than digit boundaries.
> And the other oddity is that even in '74 COBOL, the MCP folks did proper
> decimal truncation in binary COMP fields; if you moved 13 to a PIC 9 COMP
> field, you'd get a value of 3 in that big word, whereas on the 2200 with
> ACOB you'd still get 13 (binary 1101) in the 6- or 9-bit byte. I'm glad to
> say that UCOB has corrected that non-standard implementation if you don't
> use the COMPAT option.
I'm not convinced I'd characterize "compliance with the requirements of
standard COBOL" as an oddity! ;-)
-Chuck Stevens
Thanks for the follow-up. But I think we may be talking past each other
here. See below.
>
> The ACOB manual is at
> http://public.support.unisys.com/2200/docs/ix61/PDF/78307709.PDF, and the
> stuff about the SIGN clause is on page 144.
>
> An 'S' at the start of the PIC clause, in all COBOL standards, tells the
> compiler to apply whatever operational signing convention it considers
> appropriate. It says nothing about how the sign is represented, merely that
> the field must behave as signed. How the compiler chooses to express that
> sign is an implementor option.
>
> On COMP (or COMP-n) fields in the '74 standard, the nature of the sign
> depends on the hardware;
I understand about how COMP fields are represented on 2200s and based on the
comments in this thread, I am learning more about how they are done on A
series, but my question was not about signs on USAGE COMP fields, but on
DISPLAY fields.
As I understand it, in the PICTURE clause of a USAGE DISPLAY item, placing a
"+" (a plus sign) in the picture string will cause the system to replace
that position with a graphic "+" if the value is positive and a graphic "-"
if the value is negative when the data item is written to a file.
Thus if the clause is PIC +BBB9, and the value is say 123, then what is
displayed will be "+123", etc.
I believe this is the same as what would be achieved if you put the text
SIGN clause with Leading or Trailing corresponding to different placements
of the "+" in the PICTURE string. My question was to confirm that belief.
> As I understand it, in the PICTURE clause of a USAGE DISPLAY item, placing a
> "+" (a plus sign) in the picture string will cause the system to replace
> that position with a graphic "+" if the value is positive and a graphic "-"
> if the value is negative when the data item is written to a file.
So far, so good, and true in *standard* COBOL, for output.
> Thus if the clause is PIC +BBB9, and the value is say 123, then what is
> displayed will be "+123", etc.
No, what will be displayed will be "+   3" (note three blanks). "B" means
"insert a blank". Did you mean "PIC +9999"? If that's true, what you'd end
up with would be "+0009". If you meant "+ZZZ9", you'd end up with "+ 123"
(note one blank). If you meant "++++9", you'd end up with " +123" (note
leading single blank). All of these are *EDITED* PICTURE strings. For
purposes of argument, let's say the two declarations we are talking about
are "PIC S9999 SIGN LEADING SEPARATE" and "PIC +9999". If you MOVE +123
into either item, you will end up with "+0123".
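Mimicking both declarations in Python (an analogy only, not COBOL semantics; Python's sign-aware zero-fill happens to produce the same picture):

```python
# MOVE +123 into "PIC +9999" vs "PIC S9999 SIGN LEADING SEPARATE":
value = 123
edited = f"{value:+05d}"                                   # like PIC +9999
sign_separate = ('+' if value >= 0 else '-') + f"{abs(value):04d}"
```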
> I believe this is the same as what would be achieved if you put the text
> SIGN clause with Leading or Trailing corresponding to different placements
> of the "+" in the PICTURE string. My question was to confirm that belief.
Yes, the data does indeed end up looking the same. The question concerns
flexibility: how about understanding the data item as a *numeric* value?
COBOL85 and subsequent dialects do allow "de-editing" MOVEs, but not
generally de-editing retrievals elsewhere, and the object code for
"de-editing" MOVEs is likely to be generalized, and thus costly to run.
COBOL74 and earlier dialects do not; an item described "PIC +9999" can only
be a numeric destination, not a numeric source. As a source it's
alphanumeric.
The SIGN SEPARATE clause used with a numeric data item e.g. "PIC S9999 SIGN
SEPARATE" is of very long standing, is likely to be much more efficient
(because it is very specific to non-edited numeric items), and has broader
application than de-editing PICTUREs like "+9999", even though the contents
of the item would be the same when used as a destination, because the data
item remains NUMERIC even though the "form" is more presentable than its
"overpunch" cousin.
-Chuck Stevens
snip
>
> I hope that hasn't bored everyone to distraction ...
>
> Colin
>
no, it shocked me :-)
Well, there was an obvious alternative which I had not bothered
suggesting and no-one else has either:
for a PIC S9(5)V99 COMP field, move it to a PIC -----9.99 field.
Something I find rather weird is that defining such a field as PIC
-(5)9.99 produced a MINOR error under UCOB when I tried it 2 weeks
ago. Looks like a hole in the Cobol specifications was faithfully
implemented, or there is some internal error in the compiler triggered
by some other SERIOUS error the compilation threw up.
> No, what will be displayed will be "+   3" (note three blanks). "B" means
> "insert a blank". Did you mean "PIC +9999"? If that's true, what you'd end
> up with would be "+0009". ...
What I should have written was:
> No, what will be displayed will be "+   3" (note three blanks). "B" means
> "insert a blank". Did you mean "PIC +9999"? If that's true, what you'd end
> up with would be "+0123". ...
Apologies to the group.
-Chuck Stevens
Right. It's been too long and I messed up. I meant +ZZZ9, which you covered
below.
snip
> Yes, the data does indeed end up looking the same. The question concerns
> flexibility: how about understanding the data item as a *numeric* value?
You make a good point about being able to "de-edit" a field, so yes, there
is a, perhaps minor, advantage to the "text version", but it comes not in
the results written in the file, which are the same, but in perhaps more
flexibility in the program.
Thanks.
>for a PIC S9(5)V99 COMP field, move it to a PIC -----9.99 field.
>Something I find rather weird is that defining such a field as PIC
>-(5)9.99 produced a MINOR error under UCOB when I tried it 2 weeks
>ago.
PIC -----9.99 and PIC -(5)9.99 should be treated identically (unless COPY
REPLACING or REPLACE is affecting one of them).
"Chuck Stevens" <charles...@unisys.com> wrote in message news:<bjiker$2odq$1...@si05.rsvl.unisys.com>...
snip
I just changed -----9 back to -(5)9 and recompiled the program (which
no longer has any 'serious' errors). 'PIC -(5)9.' is accepted. Looks
like one of the other errors caused the compiler to misbehave. CP IX
7.1. It is probably non-reproducible without knowing which other
error triggered it. Thanks.
-Chuck Stevens
"andrew williams" <andrew....@t-online.de> wrote in message
news:995f2314.03090...@posting.google.com...