Re: Disassembly of old Turbo Pascal (V3) code - how to create data


Robert Prins

May 7, 2021, 3:23:14 PM
On 2021-04-17 13:48, Robert Prins wrote:
> Hi all,
>
> I would like to disassemble the final version of a self-written Turbo Pascal
> V3 program, i.e. a simple .COM file, and to that effect I've dug out my old (AD
> 2004) registered copy of IDA Pro (V4.7.0.831). Not having used it for more than
> 10 years, and no longer having access to their forum, I'm now stuck. The .COM
> file loads, IDA happily disassembles it, but it just creates one single segment,
> and I no longer have a clue how to create the data segment. There's a bit
> of info in the TP3 Manual, and using David Lindauer's GRDB in DOSBox-X allows me
> to single-step through the RTL initialisation code, which shows me that it sets
> up DS and SS, but that doesn't help me in setting up these segments in IDA.
>
> I've tried the "Create Segment" option, but I'm lost entering the required
> values for start address, end address and base; "class" is probably "DATA". The
> ones for the single "seg000" that IDA creates are CODE, start @ 0x0100, end @
> 0xD623, which leads me to assume that a to-be-created "seg001" should start at
> 0x0000, end at 0xffff, and have a base of 0xd63 (paragraphs), but that results
> in a "Bad segment base: segment would have bytes with a negative offset" pop-up.
>
> Trying start @ 0xd630, end @ 0x1d630, with a base 0x0000 creates a segment,
> but it looks like
>
> seg000:D622
> seg001:C8C00 ; ---------------------------------------------------------------------------
> seg001:C8C00
> seg001:C8C00 ; Segment type: Regular
> seg001:C8C00 seg001 segment byte public '' use16
> seg001:C8C00 assume cs:seg001
> seg001:C8C00 ;org 0C8C00h
> seg001:C8C00 assume es:nothing, ss:nothing, ds:nothing, fs:nothing, gs:nothing
>
> Which may be correct, but the "org 0c8c00" makes absolutely no sense to me.
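
For reference, the arithmetic behind that "negative offset" pop-up can be sketched as below. This is only an illustration and rests on an assumption not confirmed in this thread: that the dialog takes start and end as linear addresses and the base as a paragraph number, so that an offset inside the segment is the linear address minus base * 16. The helper name segment_offsets is made up for this sketch.

```python
def segment_offsets(start, end, base_paragraph):
    """Return (first_offset, end_offset) relative to the segment base.

    Assumption: start/end are linear addresses and the base is given in
    16-byte paragraphs, so offset = linear address - base * 16.
    """
    base_linear = base_paragraph << 4  # one paragraph = 16 bytes
    return start - base_linear, end - base_linear


# The first attempt from the quoted post: start 0x0000, base 0xd63.
first, _ = segment_offsets(0x0000, 0xFFFF, 0xD63)
print(first)  # below zero, i.e. bytes in front of the segment base

# Starting at the linear address that matches the base instead.
print([hex(v) for v in segment_offsets(0xD630, 0x1D630, 0xD63)])
```

Under that assumption, a start below base * 16 is what would put bytes at negative offsets, while start 0xd630 with base 0xd63 would begin at offset 0.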

I've had a bit, or rather a huge amount, of help from Hex-Rays' Ilfak
Guilfanov, and using the names in "scg.zip" (found @
<https://www.pcengines.ch/tp3.htm>), I've got a complete disassembly of the
compiler. I cut down the IDA-generated .IDC file to include just the info about
the RTL, manually changed some data (which at some stage should be done with
built-in IDC functions), wrote a bit of REXX to add identifiers to every Pascal
procedure (basically inline statements that jump over upper-cased procedure
names in Pascal-string format) and got myself a nice assembly listing, with code
that's obviously working, but very "simple" (let's just leave it at that...).

I could let IDA generate an assembler listing and hack that to pieces, most
likely in some automated way, as there are dozens of procedures that look like

cseg:4E77 proc day_ptr_is_td_top near
cseg:4E77
cseg:4E77 push bp
cseg:4E78 mov bp, sp
cseg:4E7A push bp
cseg:4E7B jmp $+3
cseg:4E7E ; ------------------------------------------------------------
cseg:4E7E
cseg:4E7E @01:
cseg:4E7E jmp short @02
cseg:4E7E ; ------------------------------------------------------------
cseg:4E80 db 17,'DAY_PTR_IS_TD_TOP'
cseg:4E92 ; ------------------------------------------------------------
cseg:4E92
cseg:4E92 @02:
cseg:4E92 mov eax, [td_top]
cseg:4E96 mov [day_ptr], eax
cseg:4E9A mov [winday_top], eax
cseg:4E9E call _day_list_is_day_ptr
cseg:4EA1 jmp $+3
cseg:4EA4 ; ------------------------------------------------------------
cseg:4EA4
cseg:4EA4 @03:
cseg:4EA4 mov sp, bp
cseg:4EA6 pop bp
cseg:4EA7 retn
cseg:4EA7 endp day_ptr_is_td_top

where the stack frame isn't required, and neither are the "jmp $+3"s.
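
The marker in the listing above (a "jmp short" over a length-prefixed, upper-cased name) is regular enough to be found mechanically. What follows is a rough sketch in Python, not the REXX I actually used; the function name find_procedure_names and the load_offset parameter are made up for the illustration, and it assumes the names contain only A-Z, 0-9 and "_". A raw byte scan like this can also hit data that happens to look like a marker, so the results would need checking against the disassembly.

```python
import string

# Bytes allowed in an upper-cased Pascal identifier (assumption of this sketch).
NAME_CHARS = set((string.ascii_uppercase + string.digits + "_").encode("ascii"))


def find_procedure_names(image, load_offset=0x100):
    """Yield (offset_of_jmp, name) for each 'jmp short' over a Pascal string."""
    for i in range(len(image) - 2):
        if image[i] != 0xEB:            # opcode of JMP SHORT rel8
            continue
        disp, length = image[i + 1], image[i + 2]
        if length == 0 or disp != length + 1:
            continue                    # jump must skip length byte + name
        name = image[i + 3 : i + 3 + length]
        if len(name) == length and all(b in NAME_CHARS for b in name):
            yield load_offset + i, name.decode("ascii")


# The bytes at cseg:4E7E in the listing: EB 12, then 17,'DAY_PTR_IS_TD_TOP'.
sample = b"\x90\x90\x90" + bytes([0xEB, 0x12, 17]) + b"DAY_PTR_IS_TD_TOP" + b"\x90"
for offset, name in find_procedure_names(sample, load_offset=0x4E7B):
    print(f"{offset:04X} {name}")
```

The load_offset default of 0x100 matches where a .COM image starts; the sample passes a different value only so that the printed offset lines up with the listing.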

However, right now I've started to think about something else: making a few
tweaks to the compiler itself. IDA Pro has a built-in assemble command, and can
save a changed .COM file, but that would result in an output file with a lot of
leftover NOP instructions, like 20+ in the random number generator:

x(n+1) = (x(n) * 129 + 907633385) mod 2^32

32-bit multiplication and addition are easier on a 32-bit CPU than on a 16-bit
one...
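
To make the difference concrete, here is a small Python sketch of one step of that recurrence, once as a single 32-bit operation and once split into 16-bit halves with an explicit carry. It only illustrates the arithmetic; it is not taken from the RTL, and the function names are made up.

```python
MASK32 = 0xFFFFFFFF
MASK16 = 0xFFFF
MULTIPLIER = 129
INCREMENT = 907633385


def next_random_32(x):
    """One step done the 32-bit way: a single multiply and add."""
    return (x * MULTIPLIER + INCREMENT) & MASK32


def next_random_16(x):
    """The same step using only 16-bit halves and a carry between them."""
    lo, hi = x & MASK16, x >> 16
    c_lo, c_hi = INCREMENT & MASK16, INCREMENT >> 16
    t = lo * MULTIPLIER + c_lo          # low half; the excess is the carry
    new_lo = t & MASK16
    carry = t >> 16
    new_hi = (hi * MULTIPLIER + c_hi + carry) & MASK16
    return (new_hi << 16) | new_lo


x = 0
for _ in range(5):
    assert next_random_32(x) == next_random_16(x)
    x = next_random_32(x)
```

The 16-bit version needs two multiplies, two adds and the carry handling where the 32-bit version needs one of each, which is roughly where the bytes saved (and hence the NOPs) would come from.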

But of course it would be more interesting to see if it's possible to retrofit
Norbert Juffa's enhanced 6-byte-real IEEE-compliant (as far as that's possible
in this format) arithmetic to the RTL. That, however, would not realistically be
possible via the assemble command, but would require a real reassembly. IDA Pro
provides two options for generating source, "generic" (aka MASM?) or TASM
"Ideal" mode.

Now I can probably figure out what to change where to let Turbo set up its
segmentation magic, but my disassembly contains a data segment with the
uninitialised RTL variables, and I don't want/need that in a .COM file. The
assembler listing generated by the program in the above-mentioned scg.zip, to
be assembled with "AS" from the same archive, just has a series of "var = value"
lines to set up these variables. So is there a way to create in TASM/MASM some
kind of "dummy" data segment just to set up variable names/offsets? Googling on
dummy/virtual segment doesn't come up with anything helpful, but I'm sure that
this is not an uncommon situation.

Robert
--
Robert AH Prins
robert(a)prino(d)org
The hitchhiking grandfather - https://prino.neocities.org/indez.html
Some REXX code for use on z/OS - https://prino.neocities.org/zOS/zOS-Tools.html
