Wondering if the push/pop bp in the quoted Open Watcom C/C++ 1.9 inline assembly example is necessary

S. Sokolow

May 10, 2019, 6:07:58 PM
I've been slowly picking away at the beginnings of a DOS analogue to open-source installer creators like InnoSetup and NSIS and, since I want this to be suitable for use on floppy disks, I've been aiming for the smallest code size that I can tolerate working toward using free tooling that can be legally redistributed. (ie. Open Watcom C/C++ with small bits of inline assembly, not NASM)

While preparing to write my own wrappers around int 10h functions to avoid the weight of graph.h, I ran across the following example from page 255 of the Open Watcom C/C++ User's Guide
( ftp://ftp.openwatcom.org/manuals/current/cguide.pdf#page=267 ):

extern void BIOSSetCurPos( unsigned short __rowcol,
                           unsigned char __page );
#pragma aux BIOSSetCurPos = \
    "push bp" \
    "mov ah,2" \
    "int 10h" \
    "pop bp" \
    parm [dx] [bh] \
    modify [ah];

ousob.com's reference lists SP, BP, BI, SI, and DI as being clobbered by everything, but ctyme.com's reference doesn't mention that and none of Open Watcom's examples save and restore the others, so I have to assume that to be a low-level detail that's taken care of internally by the Watcom inline assembly support.

ctyme.com's reference (which appears to be an HTML rendering of Ralf Brown's Interrupt List) mentions some functions as having a BP-clobbering bug on some BIOSes (eg. http://www.ctyme.com/intr/rb-0096.htm ) but I couldn't find any mention of needing to handle BP specially for AH=02h in any of the INT 10h references I checked. (ousob.com, ctyme.com, The Undocumented PC, The MS-DOS Encyclopedia, DOS Power Tools 2e)

I know next to nothing about assembly language, so I've taken to saving and restoring BP in ALL my INT 10h wrappers as a mysterious incantation to guard against hypothetical problems on hardware I don't have available for testing.

Does it serve some purpose that I'm too ignorant to grasp? Is it necessary in a situation ctyme.com doesn't mention? Am I misunderstanding how to search up this sort of information? Did the authors of the manual just forget which functions did and didn't have a BP-clobbering bug?

As an assembly language newbie, I'm not particularly bothered by having one more thing I don't yet understand, but the fact that it could just be wasted instructions is nagging at me when I enjoy finding ways to micro-optimize the functions in this project for size.

(Currently, I've got a fancy Hello World! to test my graph.h alternatives which I've optimized down to 1630 bytes when compiling as a .com file, with 996 bytes of that being the portion of the Watcom C runtime which doesn't get dead-code eliminated from a `void main(void) {}`)
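
For anyone who would rather measure the trade-off than guess at it, here is a minimal sketch of the manual's wrapper in two flavours, one with the push/pop pair and one without. PUSH BP and POP BP are one-byte instructions (55h and 5Dh), so the guard costs two bytes at each place the pragma is expanded. Only the guarded pragma body comes from the manual. The names set_cur_pos_guarded, set_cur_pos_bare, goto_rc and the TRUST_BIOS_WITH_BP switch are made up for illustration, a 16-bit DOS target is assumed, and the #else branch is just a host-side stand-in so the file also builds with a non-Watcom compiler.

```c
#include <assert.h>

#ifdef __WATCOMC__
/* Assumption: 16-bit DOS target. Same body as the manual's example. */
extern void set_cur_pos_guarded(unsigned short rowcol, unsigned char page);
#pragma aux set_cur_pos_guarded = \
    "push bp"  \
    "mov ah,2" \
    "int 10h"  \
    "pop bp"   \
    parm [dx] [bh] \
    modify [ah];

/* Hypothetical variant: identical, minus the BP save/restore. */
extern void set_cur_pos_bare(unsigned short rowcol, unsigned char page);
#pragma aux set_cur_pos_bare = \
    "mov ah,2" \
    "int 10h"  \
    parm [dx] [bh] \
    modify [ah];
#else
/* Host stand-ins (hypothetical): record what would have gone into DX and BH. */
unsigned short host_dx;
unsigned char host_bh;

void set_cur_pos_guarded(unsigned short rowcol, unsigned char page)
{
    host_dx = rowcol;
    host_bh = page;
}

void set_cur_pos_bare(unsigned short rowcol, unsigned char page)
{
    host_dx = rowcol;
    host_bh = page;
}
#endif

/* One switch selects the flavour for every caller. */
#ifdef TRUST_BIOS_WITH_BP
#define set_cur_pos set_cur_pos_bare
#else
#define set_cur_pos set_cur_pos_guarded
#endif

/* INT 10h/AH=02h expects the row in DH and the column in DL. */
void goto_rc(unsigned row, unsigned col)
{
    assert(row <= 0xFF && col <= 0xFF); /* each must fit one half of DX */
    set_cur_pos((unsigned short)((row << 8) | col), 0);
}
```

Building once with and once without -DTRUST_BIOS_WITH_BP and comparing the two output sizes would show the real cost for a given program. Whether the unguarded flavour is safe on every BIOS is exactly the open question, so the guarded one stays the default in this sketch.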

T. Ment

May 10, 2019, 6:32:12 PM
On Fri, 10 May 2019 15:07:57 -0700 (PDT), S. Sokolow wrote:

> ousob.com's reference lists SP, BP, BI, SI, and DI as being clobbered
> by everything, but ctyme.com's reference doesn't mention that and none
> of Open Watcom's examples save and restore the others, so I have to
> assume that to be a low-level detail that's taken care of internally by
> the Watcom inline assembly support.

Don't assume anything with inline assembly support.

I don't know about Watcom, but Borland C++ 3.1 doesn't help you with
inline assembly at all. If you clobber a register, it's your own fault
for not knowing what registers Borland used in the C code.

bcc -S produces an assembly listing of your C code, so you can see
exactly what registers Borland used. Then you know what must be saved,
or not.



S. Sokolow

May 10, 2019, 7:28:56 PM
Fair enough, but you seem to have an incomplete understanding of my question because a BP-clobbering bug in a BIOS routine wouldn't show up in a disassembly of my own code.

I'm wondering why the example would be saving BP before calling INT 10h/AH=02h and restoring it immediately after when I haven't been able to find any evidence that the INT 10h/AH=02h call in question will modify BP... even on BIOSes that have BP-clobbering bugs in *other* INT 10h functions.

T. Ment

May 10, 2019, 7:50:51 PM
On Fri, 10 May 2019 16:28:55 -0700 (PDT), S. Sokolow wrote:

> you seem to have an incomplete understanding
> of my question because a BP-clobbering bug in
> a BIOS routine wouldn't show up in a disassembly
> of my own code.

Inline assembly is not safe for a beginner. It's like using firearms
without any training. BIOS BP bugs are not the worst problem. Shooting
yourself is.


S. Sokolow

May 10, 2019, 8:47:26 PM
On Friday, May 10, 2019 at 11:50:51 PM UTC, T. Ment wrote:
> Inline assembly is not safe for a beginner. It's like using firearms
> without any training. BIOS BP bugs are not the worst problem. Shooting
> yourself is.

I'm generally a very cautious person, so I'd normally agree with you and stay away from it.

However, when a survey of my floppy collection showed that period DOS installers average between 15K and 50K and the base overhead for using graph.h to position a "Hello World!" (*with* dead code elimination) is around 46K, I have no choice but to call INT 10h functions directly.

That said...

* I don't use inline assembly for anything other than making `int 10h` and
`int 21h` calls. (Similar to how I only use `unsafe` in Rust for FFI calls to
C functions with simple type signatures and only when I can't find a safe
alternative to use instead.)
* To minimize the chance of hiding something, I don't try to do anything fancy
on the assembly side and I use only MOV, PUSH, POP, and INT.
(eg. Every bit of inline assembly gets a C wrapper function which does any
necessary argument preparation and asserts every invariant I can think to
assert.)
* I always err on the side of caution and research further when I'm uncertain of
something. (Hence my coming here when all my research pointed to that push/pop
in the example code being unnecessary.)
* My debug builds are so full of asserts on the C side that it causes noticeable
screen-drawing slowdown when run in an emulator set to perform no faster than
an IBM PC XT.
* I don't intend to expose any of this to a situation where a bug could cause
harm until after I've done everything I can think of to get one of my tests or
asserts to fire under one of several DOS emulators and multiple different
DOSes.
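
The "thin inline assembly, checks on the C side" pattern from the list above might look roughly like the following sketch for INT 21h/AH=09h, which prints a '$'-terminated string addressed by DS:DX and returns with AL=24h. The names dos_print_raw, dos_print, dollar_index and the DOS_PRINT_MAX cap are hypothetical. The Watcom branch assumes a tiny or small memory model, so that DS already addresses the string and a near pointer fits in DX. The #else branch is only a host-side stand-in for building with other compilers.

```c
#include <assert.h>
#include <stddef.h>

#define DOS_PRINT_MAX 256u  /* arbitrary cap chosen for this sketch */

#ifdef __WATCOMC__
/* Assumption: tiny/small model, near data pointer, DS = data segment. */
extern void dos_print_raw(const char *s);
#pragma aux dos_print_raw = \
    "mov ah,9" \
    "int 21h"  \
    parm [dx] \
    modify [ax];
#else
/* Host stand-in (hypothetical): just counts the characters before '$'. */
size_t host_printed_len;

void dos_print_raw(const char *s)
{
    size_t n = 0;
    while (s[n] != '$')
        n++;
    host_printed_len = n;
}
#endif

/* Index of the '$' terminator, or DOS_PRINT_MAX if none is found before
   the cap or before the end of the C string. */
size_t dollar_index(const char *s)
{
    size_t i;
    for (i = 0; i < DOS_PRINT_MAX; i++) {
        if (s[i] == '$')
            return i;
        if (s[i] == '\0')
            break;
    }
    return DOS_PRINT_MAX;
}

/* All checking lives in C; the assembly side stays at MOV + INT. */
void dos_print(const char *s)
{
    assert(s != NULL);
    assert(dollar_index(s) < DOS_PRINT_MAX); /* DOS would run on without '$' */
    dos_print_raw(s);
}
```

Because these are standard assert() calls, defining NDEBUG for a release build removes them entirely, which keeps the size and speed cost confined to debug builds.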

(I'm gearing up to implement an external test runner which will orchestrate the
full lifecycle of the emulators using debug logs from my project sent to COM1
to monitor when to inject keystrokes or take screenshots for automated comparison in addition to the more obvious checking of the debug messages and
examining modifications made to the emulated disk drives.)

In the end, the problem is that you come across as having a viewpoint that learning to use inline assembly safely is such a chicken-and-egg problem that, like driving a car, it should require training from a licensed instructor. (And I'm not going back to university for a hobby... though I AM willing to read any
pages you recommend, price out any books you think are particularly recommended and pick them up if I can find new or used copies at a decent price.)

In short, despite my agreeing that it IS the programming equivalent of a loaded firearm, I doubt your estimation of me, your input strikes me as "not useful", and my response is "If you don't have anything to say beyond 'the solution to your problem is to give up on your project', is there anyone else who has any insight or should I try another venue?"

T. Ment

May 10, 2019, 11:51:01 PM
On Fri, 10 May 2019 17:47:25 -0700 (PDT), S. Sokolow wrote:

> In the end, the problem is that you come across as having a
> viewpoint that learning to use inline assembly safely is such
> a chicken-and-egg problem that, like driving a car, it should
> require training from a licensed instructor. (And I'm not
> going back to university for a hobby

That wasn't the point. Learning enough to know what you're doing, is.


> ... though I AM willing to read any pages you recommend, price out
> any books you think are particularly recommended and pick them up if
> I can find new or used copies at a decent price.)

Books by William Jones and Robert Lafore are where I started.


> In short, despite my agreeing that it IS the programming equivalent
> of a loaded firearm, I doubt your estimation of me, your input strikes
> me as "not useful"

That's Usenet. Not always useful.


> and my response is "If you don't have anything to say beyond 'the
> solution to your problem is to give up on your project;

I didn't say quit. I'm just not interested in helping people who don't
take the time to acquire requisite skills. It's your project. Why should
I spend my time on it.


> is there anyone else who has any insight

Not many people in this ghost town. But some like questions. Maybe you
will get lucky.



S. Sokolow

May 11, 2019, 12:57:33 AM
On Saturday, May 11, 2019 at 3:51:01 AM UTC, T. Ment wrote:
> On Fri, 10 May 2019 17:47:25 -0700 (PDT), S. Sokolow wrote:
>
> > In the end, the problem is that you come across as having a
> > viewpoint that learning to use inline assembly safely is such
> > a chicken-and-egg problem that, like driving a car, it should
> > require training from a licensed instructor. (And I'm not
> > going back to university for a hobby
>
> That wasn't the point. Learning enough to know what you're doing, is.

I certainly want to learn. My problem with your approach is that it's as if I'd just finished reading a full textbook and asked for clarification on one point and your response is "The fact that you have to ask that indicates that you misunderstood something. Re-read the whole book until your misconception magically leaps out at you." ...or like telling an author "Your manuscript has typos, but I'm not going to tell you where or even what kind".

I've *read* the Watcom manuals.

Unless I missed something so minor or subtle that I'll almost certainly miss it again during a re-read, everything I read about its calling convention and how it interacts with inline assembly says that I shouldn't need to manually preserve BP there unless the INT 10h call is going to clobber it ...and all of the documentation I've read on INT 10h/AH=02h says that it's not intended to clobber BP and no known BIOSes have bugs which do so either.

...so my question all along has been "Is there something special about this case that is considered to be such an *everybody knows* thing that it slipped through the cracks of all the documentation I've read on Watcom calling conventions, assembly language programming, and INT 10h or is it just some Watcom manual editor making a habit of saving and restoring BP to avoid having to memorize which specific INT 10h functions are buggy?"

> Books by William Jones and Robert Lafore are where I started.

Thanks. I'll look into those when I have a moment.

> > In short, despite my agreeing that it IS the programming equivalent
> > of a loaded firearm, I doubt your estimation of me, your input strikes
> > me as "not useful"
>
> That's Usenet. Not always useful.

True, but you've been coming across not unlike a girlfriend saying "If you don't know why I'm mad at you, I'm certainly not going to tell you". Not very helpful to someone trying to learn.

> Not many people in this ghost town. But some like questions. Maybe you
> will get lucky.

Again, fair enough.

T. Ment

May 11, 2019, 11:07:32 AM
On Fri, 10 May 2019 21:57:32 -0700 (PDT), S. Sokolow wrote:

> Unless I missed something so minor or subtle that I'll almost
> certainly miss it again during a re-read, everything I read about
> its calling convention and how it interacts with inline assembly
> says that I shouldn't need to manually preserve BP

I only know about Borland, but other C compilers may work the same.
Every C function call saves BP upon entry and restores it upon exit.

But in the middle of a long sequence of C code, Borland inline assembly
does nothing special to protect registers. BP provides addressability to
stack and automatic variables, if your function has any.


> there unless the INT 10h call is going to clobber it ...and all of
> the documentation I've read on INT 10h/AH=02h says that it's not
> intended to clobber BP and no known BIOSes have bugs which do so
> either.

Ralf Brown's hardcopy book says the same. But with BIOS you never know
for sure. Sometimes you have to try it and see.


> ...so my question all along has been "Is there something special about
> this case that is considered to be such an *everybody knows* thing that
> it slipped through the cracks of all the documentation I've read on
> Watcom calling conventions, assembly language programming, and INT 10h
> or is it just some Watcom manual editor making a habit of saving and
> restoring BP to avoid having to memorize which specific INT 10h
> functions are buggy?

Maybe they were mimicking the BP save/restore convention of normal C
functions. You could wrap your inline assembly in a C function call to
get the same effect.


> True, but you've been coming across not unlike a girlfriend saying
> "If you don't know why I'm mad at you, I'm certainly not going to
> tell you". Not very helpful to someone trying to learn.

Most Usenet newsgroups are unmoderated, so there is no enforcement of
rules, political correctness, or social etiquette. To endure Usenet, you
need a newsreader and killfile, to silence people who annoy you. Google
groups is not the tool for that.

If you like web forums, try https://stackoverflow.com/

They have people lined up to answer questions. I don't like web forums,
so you won't meet me there.


S. Sokolow

May 11, 2019, 3:07:47 PM
On Saturday, May 11, 2019 at 3:07:32 PM UTC, T. Ment wrote:
> On Fri, 10 May 2019 21:57:32 -0700 (PDT), S. Sokolow wrote:
>
> I only know about Borland, but other C compilers may work the same.
> Every C function call saves BP upon entry and restores it upon exit.

Watcom's single biggest performance advantage back in the day came from using an optimized calling convention more like what modern C and C++ compilers use.

Even in pure C and without asking for any optimizations, Open Watcom 1.9's code generator won't emit code to manipulate BP unless absolutely necessary.

In my experience, that means that BP only gets manipulated if one of the following is true:

1. You asked for compliance with a standard calling convention
for interoperability.
2. The function takes more arguments than can be passed using only registers.
3. You've specified -d2 (full symbolic debugging info) or higher.

>
> But in the middle of a long sequence of C code, Borland inline assembly
> does nothing special to protect registers. BP provides addressability to
> stack and automatic variables, if your function has any.

I did consider that, but the example that has me wondering makes no use of the stack, its only arguments are passed in registers, it properly announces to the compiler that it modifies AH, and it has no return value.

(INT 10h/AH=2 is for setting the cursor position)

>
> Ralf Brown's hardcopy book says the same. But with BIOS you never know
> for sure. Sometimes you have to try it and see.
>

Isn't that a bit backwards? I do have period machines to test on, but none of them are the particular machines that are listed as having BP-clobbering bugs, so I'd mistakenly conclude that preserving BP is unnecessary in those buggy functions if I limited myself to "try it and see".
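
For what "try it and see" could look like in practice, here is a rough sketch of a probe that reports whether BP came back unchanged from INT 10h/AH=02h. It keeps the original BP on the stack, copies whatever the BIOS left in BP into CX, restores BP, and returns the difference, so zero means "preserved". The names bp_drift_int10_02, bios_kept_bp and assert_bios_kept_bp are hypothetical, the pragma has not been assembled or run under Open Watcom here, and the #else branch is a host stand-in with no BIOS behind it. As noted above, a pass only describes the machine it ran on.

```c
#include <assert.h>

#ifdef __WATCOMC__
/* Assumption: 16-bit DOS target. Returns (BP after INT) - (BP before). */
extern unsigned short bp_drift_int10_02(unsigned short rowcol,
                                        unsigned char page);
#pragma aux bp_drift_int10_02 = \
    "push bp"   \
    "mov ah,2"  \
    "int 10h"   \
    "mov cx,bp" \
    "pop bp"    \
    "sub cx,bp" \
    parm [dx] [bh] \
    value [cx] \
    modify [ax];
#else
/* Host stand-in (hypothetical): nothing can clobber BP here. */
unsigned short bp_drift_int10_02(unsigned short rowcol, unsigned char page)
{
    (void)rowcol;
    (void)page;
    return 0;
}
#endif

/* Returns 1 if this BIOS left BP alone. Side effect: the cursor on
   page 0 moves to row 0, column 0. */
int bios_kept_bp(void)
{
    return bp_drift_int10_02(0, 0) == 0;
}

/* Debug-build tripwire for a test run on real or emulated hardware. */
void assert_bios_kept_bp(void)
{
    assert(bios_kept_bp());
}
```

The same shape could be reused for the functions that are documented as buggy by swapping the "mov ah,2" line and the parameter list, which would at least confirm the bug on hardware that has it.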

>
> Maybe they were mimicking the BP save/restore convention of normal C
> functions. You could wrap your inline assembly in a C function call to
> get the same effect.
>

Always a possibility, but, given that they didn't do it in their other examples, that doesn't seem likely... especially when Watcom's default calling convention only saves and restores BP as necessary.

Unfortunately, their only other example which uses INT 10h builds on that one, so I don't have enough data points to confirm or deny my hypothesis that they're just habitually saving and restoring BP around INT 10h calls.

>
> Most Usenet newsgroups are unmoderated, so there is no enforcement of
> rules, political correctness, or social etiquette. To endure Usenet, you
> need a newsreader and killfile, to silence people who annoy you. Google
> groups is not the tool for that.
>
> If you like web forums, try https://stackoverflow.com/
>
> They have people lined up to answer questions. I don't like web forums,
> so you won't meet me there.

I understand that people on Usenet can be abrasive and I never said you annoyed me.

I've just been assuming that, since you seem like a rational person, you agree that the purpose of communication is to reach some form of common understanding.

T. Ment

May 11, 2019, 3:52:33 PM
On Sat, 11 May 2019 12:07:46 -0700 (PDT), S. Sokolow wrote:

> I've just been assuming that, since you seem like a rational person,
> you agree that the purpose of communication is to reach some form of
> common understanding

I tried Watcom once, and all I learned was, it's a big memory hog. I
didn't see the point. Borland C++ 3.1 and Microsoft C 6.0 provide all I
need. License and intellectual property concerns are not my job.

What you say about Watcom BP optimization, is interesting, but it's a
micro optimization that won't amount to much, compared to the overhead
of C startup code and library routines. If I need to micro optimize, I
won't fight a C compiler. Pure assembly language is best for that.

So I can't help you with Watcom. I think Mateusz uses it. Maybe he will
have something to say.


S. Sokolow

May 11, 2019, 4:43:37 PM
On Saturday, May 11, 2019 at 7:52:33 PM UTC, T. Ment wrote:
> On Sat, 11 May 2019 12:07:46 -0700 (PDT), S. Sokolow wrote:
>
> > I've just been assuming that, since you seem like a rational person,
> > you agree that the purpose of communication is to reach some form of
> > common understanding
>
> I tried Watcom once, and all I learned was, it's a big memory hog. I
> didn't see the point. Borland C++ 3.1 and Microsoft C 6.0 provide all I
> need. License and intellectual property concerns are not my job.

That's where our motivations differ.

I've got various Borland and Microsoft compilers in my collection but, for my hobby projects, I insist on using a compiler that I can legally redistribute to anyone who might want to play around with my code at some point in the future.

On the technical side, I like that:

1. It's so easy to set up an Open Watcom cross-compiler (One installer, for DOS, Win32, or Linux, which installs support for compiling to all targets) so I can develop on my Linux PC without having it crammed inside a virtual machine.

I do still design and test my wmake Makefiles to ensure they're usable on a DOS install of Open Watcom, but it's been so long since I used DOS as a development host that I just don't have the tolerance to do significant work in period text editors anymore.

2. It still includes that nostalgic DOS/4GW DPMI extender as an option, including a license to use it, plus three others. (CauseWay, DOS32/A, and PMODE/W)

(Sadly, you have to choose between either a binary-only release of the newest version of DOS/4GW or a slightly older version with source, because the author of DOS/4GW passed away less than a year ago while in the process of trying to dig up the floppies containing the latest source.)

Now if only Embarcadero would release those iconic OWL common dialog icons under a comparable license and Flexera would do the same for a completely obsolete version of InstallShield. Childhood nostalgia could all be legally freely redistributable.

3. It includes Win386, which is like a DPMI extender for Windows 3.1 apps, providing a convenient way to write Win16 applications with a flat memory model which predates Win32s and gets bundled into your EXE. (FoxPro used it, as did the Windows 3.x version of Sierra's SCI engine.)

>
> What you say about Watcom BP optimization, is interesting, but it's a
> micro optimization that won't amount to much, compared to the overhead
> of C startup code and library routines. If I need to micro optimize, I
> won't fight a C compiler. Pure assembly language is best for that.
>

As I remember, it's implemented as a custom calling convention named __watcall, so I'd imagine it applies to the library routines too.

That aside, I'll never turn my nose up at a free performance boost. That's the whole point of having an optimizing compiler.

> So I can't help you with Watcom. I think Mateusz uses it. Maybe he will
> have something to say.

Here's hoping.

T. Ment

May 11, 2019, 6:15:04 PM
On Sat, 11 May 2019 13:43:36 -0700 (PDT), S. Sokolow wrote:

> I've got various Borland and Microsoft compilers in my collection but,
> for my hobby projects, I insist on using a compiler that I can legally
> redistribute to anyone who might want to play around with my code

If they lack the skill to port your code to another DOS compiler, you're
wasting your time on them.


> As I remember, it's implemented as a custom calling convention
> named __watcall, so I'd imagine it applies to the library routines
> too.

That misses the point.

Maybe your question was more of an opportunity to show off, than a real
question. That's not uncommon on Usenet.

Have fun.


S. Sokolow

May 11, 2019, 6:31:14 PM
On Saturday, May 11, 2019 at 10:15:04 PM UTC, T. Ment wrote:
> On Sat, 11 May 2019 13:43:36 -0700 (PDT), S. Sokolow wrote:
>
> > I've got various Borland and Microsoft compilers in my collection but,
> > for my hobby projects, I insist on using a compiler that I can legally
> > redistribute to anyone who might want to play around with my code
>
> If they lack the skill to port your code to another DOS compiler, you're
> wasting your time on them.

It's not about some kind of test of their skill.

It's about making it as easy as possible for others to enjoy what I've created in the way *they* want to.

>
>
> > As I remember, it's implemented as a custom calling convention
> > named __watcall, so I'd imagine it applies to the library routines
> > too.
>
> That misses the point.
>
> Maybe your question was more of an opportunity to show off, than a real
> question. That's not uncommon on Usenet.
>
> Have fun.

OK, now that's a bit of a trollish response.

I did follow the line you quoted with "That aside, I'll never turn my nose up at a free performance boost. That's the whole point of having an optimizing compiler." which addresses the point I believe you were intending to make.

Nonetheless, if you feel the productive conversation should end here, I'm willing to let you have the last word.

T. Ment

May 11, 2019, 8:03:50 PM
On Sat, 11 May 2019 15:31:13 -0700 (PDT), S. Sokolow wrote:

>> If they lack the skill to port your code to another DOS compiler, you're
>> wasting your time on them.

> It's not about some kind of test of their skill.
>
> It's about making it as easy as possible for others to enjoy
> what I've created in the way *they* want to.

It's no wonder you've not learned assembly language. You waste too much
time blabbing about how cool you are.


> OK, now that's a bit of a trollish response.

You asked for it, showboat.


> I did follow the line you quoted with "That aside, I'll never turn my
> nose up at a free performance boost. That's the whole point of having
> an optimizing compiler." which addresses the point I believe you were
> intending to make.

Using C library code will overwhelm a micro BP optimization. That's the
point, which you ignore.


> Nonetheless, if you feel the productive conversation should end here,
> I'm willing to let you have the last word.

If you can't keep that promise, I have a newsreader and killfile.


rug...@gmail.com

May 12, 2019, 5:26:08 AM
Hi, pal!

I've seen you on other forums (briefly). I'm no expert, but I'll
chime in with a few tidbits anyways.


On Friday, May 10, 2019 at 7:47:26 PM UTC-5, S. Sokolow wrote:
>
> However, when a survey of my floppy collection showed that period
> DOS installers average between 15K and 50K and the base overhead
> for using graph.h to position a "Hello World!" (*with* dead code
> elimination) is around 46K, I have no choice but to call INT 10h
> functions directly.

Does Watcom do dead code elimination? AFAIK, no. There is a switch
about something something segments, but that's not the same thing.

Honestly, I'd rather suggest Turbo Pascal (or preferably Free
Pascal, although that's not as small, for good reason). Those
have actual smartlinkers (and i8086-msdos is a valid cross-target
these days since 3.0.0). Granted, those use different calling
conventions!

> though I AM willing to read any pages you recommend, price out
> any books you think are particularly recommended and pick them
> up if I can find new or used copies at a decent price.)

The 8086 turned 40 last year, and the 8088 [sic] turns 40 this year.
You can read Steve Morse's book online for free. (Granted, you
may be more interested in newer cpus. I've heard good things from
Ray Seyfarth's x64 books. And there's supposedly a good one
using FASM written by Alexey Lyashko.) In fact, just check out
the FASM messageboard, it's extremely helpful!

* https://stevemorse.org/8086/index.html

* http://www.rayseyfarth.com/asm/
* https://board.flatassembler.net/topic.php?t=20183

rug...@gmail.com

May 12, 2019, 5:39:12 AM
Hi,

On Friday, May 10, 2019 at 11:57:33 PM UTC-5, S. Sokolow wrote:
>
> Unless I missed something so minor or subtle that I'll almost
> certainly miss it again during a re-read, everything I read
> about its calling convention and how it interacts with inline
> assembly says that I shouldn't need to manually preserve BP
> there unless the INT 10h call is going to clobber it ...and
> all of the documentation I've read on INT 10h/AH=02h says
> that it's not intended to clobber BP and no known BIOSes have
> bugs which do so either.

It's wise to be cautious. But even if you're technically right,
from a standard point of view, there are still bugs and
incompatibilities. It's like trying to make code compile
across C compilers. (Other languages are less tested and
usually worse, even.)

There are many IBM BIOSes (and clones), so you're bound
to notice some differences eventually.

Don't worry about it. If it doesn't bite you, then you're
okay. You'll know if it's a problem. Reasonable testing
should iron out most obvious flaws.

But you can also be defensive. Have two binaries, or at
least leave some padding for binary patching.

> ...so my question all along has been "Is there something special
> about this case that is considered to be such an *everybody knows*
> thing that it slipped through the cracks of all the documentation

Nope. BP use is fairly rare in the BIOS (at least from the view of
a user calling it).

> I've read on Watcom calling conventions, assembly language
> programming,

IIRC, OpenWatcom supports register by default and cdecl is
also optionally available. I'm far from expert, but IIRC,
cdecl requires saving BP, BX, SI, DI across functions.
(These are the same registers used in extended addressing
in 8086.)

FreePascal uses its own "register" (compatible with Delphi),
which is faster. Well, the i8086-msdos target is still "pascal"
only, for now. At least the main FPC (e.g. Go32v2) can support
various others too, e.g. cdecl.

rug...@gmail.com

May 12, 2019, 5:45:51 AM
Hi,

Again, I'm no expert, so take this with a grain of salt.
Corrections welcome.


On Saturday, May 11, 2019 at 2:07:47 PM UTC-5, S. Sokolow wrote:
>
> Even in pure C and without asking for any optimizations,
> Open Watcom 1.9's code generator won't emit code to
> manipulate BP unless absolutely necessary.

You can't directly use [SP] in 16-bit 8086 mode. So you're
forced to use BP. The 386 was more flexible, so we have
compilers like GCC (DJGPP) that have -momit-leaf-frame-pointer
and -fomit-frame-pointer , the latter of which is enabled
by default for some targets (when it doesn't interfere with
debugging). Sometimes that frees up an extra register (since
386 is register starved), but it also bloats up the code a bit
more.
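
To make the [SP] point concrete: in 16-bit code, the 3-bit r/m field of the ModR/M byte can only select the eight register combinations listed in the sketch below, and SP is not among them. The BP-based forms also default to the SS segment, which is part of why BP ends up as the frame pointer. The table contents are the standard 8086 encodings. The names rm16_forms and rm16_can_address_with are hypothetical helpers for illustration only.

```c
#include <assert.h>
#include <string.h>

/* Memory operand forms selected by r/m = 0..7 in a 16-bit ModR/M byte.
   Note: r/m = 6 with mod = 00 means a bare 16-bit displacement instead
   of [BP], so plain [BP] is encoded as [BP+0]. */
static const char *const rm16_forms[8] = {
    "BX+SI", "BX+DI", "BP+SI", "BP+DI", "SI", "DI", "BP", "BX"
};

/* Returns 1 if the named register (e.g. "BP") appears in any of the
   eight addressing forms, 0 otherwise. */
int rm16_can_address_with(const char *reg)
{
    int rm;

    assert(reg != NULL);
    for (rm = 0; rm < 8; rm++) {
        if (strstr(rm16_forms[rm], reg) != NULL)
            return 1;
    }
    return 0;
}
```

Asking the helper about "SP" gives 0 while "BP", "BX", "SI" and "DI" all give 1, which is the whole restriction in one line: to read a stack slot on an 8086, the offset has to be in one of those four registers first.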

I'm no compiler author, but I think the only reason to use BP
at all is for nested functions, local data, and recursion.
So yes, you can avoid it in many cases.

Again, back to FPC, "Stack frame omission":

"
Under specific conditions, the stack frame will be omitted,
and the variable will directly be accessed via the stack pointer.
"

* https://www.freepascal.org/docs-html/prog/progsu187.html

rug...@gmail.com

May 12, 2019, 5:58:22 AM
Hi,

On Saturday, May 11, 2019 at 3:43:37 PM UTC-5, S. Sokolow wrote:
>
> On the technical side, I like that:
>
> 1. It's so easy to set up an Open Watcom cross-compiler
> (One installer, for DOS, Win32, or Linux, which installs
> support for compiling to all targets) so I can develop
> on my Linux PC without having it crammed inside a virtual machine.

Cross-compiling is very very useful, yes. But so are virtual machines.
I agree, though, that you don't want to be tied exclusively to a
virtual machine (although they work great and aren't very painful
these days).

* https://www.lazybrowndog.net/freedos/virtualbox/

> I do still design and test my wmake Makefiles to ensure they're
> usable on a DOS install of Open Watcom, but it's been so long
> since I used DOS as a development host that I just don't have
> the tolerance to do significant work in period text editors
> anymore.

There are a billion text editors for DOS, and most of the good
ones are compiled by DJGPP. Of course, OpenWatcom comes with
its own excellent vi clone. Again, I'm no expert, but I've
dabbled in a ton of them, all have various strengths. JED,
VILE, GNU Emacs, TDE, FTE, FED, THE ... the list goes on.
I think JED even has Watcom or Borland compiler error catching.
(DJGPP got a new GNU Emacs build recently, after four years,
but I haven't tried it yet.)

Of course, the problem then is to be careful about mixing
extenders since most don't play well together. I think Causeway
and DOS4GW will get along with CWSDPMI okay. So will HX's HDPMI32.

> 2. It still includes that nostalgic DOS/4GW DPMI extender
> as an option, including a license to use it, plus three others.
> (CauseWay, DOS32/A, and PMODE/W)

WDOSX and D3X also work, but they have some caveats (don't they all??).
Naively, I'd recommend Causeway unless there's a good reason otherwise.
But DOS/32A is quite good, too, obviously.

> > So I can't help you with Watcom. I think Mateusz uses it. Maybe he will
> > have something to say.
>
> Here's hoping.

He's very smart but always busy. I haven't seen him around lately.
Maybe try asking on freedos-user if you don't see him around soon.

T. Ment

May 12, 2019, 11:17:37 AM
On Sun, 12 May 2019 02:45:50 -0700 (PDT), rug...@gmail.com wrote:

> I'm no compiler author, but I think the only reason to use BP
> at all is for nested functions, local data, and recursion.

You forgot parameters. Addressing parameters is the primary use of BP.


> Under specific conditions, the stack frame will be omitted,
> and the variable will directly be accessed via the stack pointer.

Dubious micro optimization.


> I'm no expert

Experts say:

Also, accessing parameters can get more expensive since they are
far away from the top of the stack and may require more expensive
addressing modes. Raymond Chen

https://stackoverflow.com/questions/14666665/trying-to-understand-gcc-option-fomit-frame-pointer


It's not safe for you in a place where you can't ban people. You might
get your feelings hurt.


S. Sokolow

unread,
May 12, 2019, 4:07:37 PM5/12/19
to
> Does Watcom do dead code elimination? AFAIK, no. There is a switch
> about something something segments, but that's not the same thing.

Yeah. I was referring to the combination of the "something something segments" compiler option and the "ELIMINATE" linker option because, primitive and limited as it is, the Watcom manuals chose to call it a form of dead code elimination, so that's the easiest way to track down mentions of it.

> Honestly, I'd rather suggest Turbo Pascal (or preferably Free
> Pascal, although that's not as small, for good reason). Those
> have actual smartlinkers (and i8086-msdos is a valid cross-target
> these days since 3.0.0). Granted, those use different calling
> conventions!

I actually started out experimenting with Free Pascal on the idea that I could prototype the design with units available on the GO32v2 DPMI target, like Free Vision and Unzipper, then migrate to more custom code once it stabilized. (Plus, it'd be nice to have Pascal's stronger type system.)

Unfortunately, I discovered that the Free Pascal runtime imposed too high a base overhead on the file size to satisfy me, even after switching to the i8086 target. (Initially, DPMI with the stub swapped for the embeddable extender was an option I investigated to see how much of my prototyping might make it into the final version.)

In my measurements, Watcom's base overhead was just 996 bytes. (Apples-to-apples comparison in that I didn't try following Free Pascal's guide to making a custom build of the standard library and I didn't attempt it for Open Watcom either.)

After I switched away from Free Vision, I also discovered that "the CRT unit is un-optimized because we've had more pressing stuff to work on" is a known problem which makes it prohibitively slow on early DOS machines.

I'd still use and recommend Free Pascal for application development ("Batteries included" standard library for the DPMI target aside, it performs comparably to Java according to The Benchmarks Game) but, for utilities which need to sit on distribution floppies without crowding out the actual content and be compatible with the widest range of machines possible, it's not there yet.

As for Turbo Pascal, the fact that only "most" of the Turbo Pascal versions Borland freeware'd are still available on Embarcadero's website puts a bad taste in my mouth, knowing the kind of "no redistribution" terms Embarcadero puts on their downloads. I don't like my code depending on tooling which might only be available as "abandonware" later.

> The 8086 turned 40 last year, and the 8088 [sic] turns 40 this year.
> You can read Steve Morse's book online for free. (Granted, you
> may be more interested in newer cpus. I've heard good things from
> Ray Seyfarth's x64 books. And there's supposedly a good one
> using FASM written by Alexey Lyashko.) In fact, just check out
> the FASM messageboard, it's extremely helpful!
>
> * https://stevemorse.org/8086/index.html
>
> * http://www.rayseyfarth.com/asm/
> * https://board.flatassembler.net/topic.php?t=20183

Thanks. :)

I actually have two somewhat disjoint reasons for being interested in assembly, so your comments on x64 are appreciated:

1. Retro-hobby work where I want to write bits of assembly meant to support the oldest machines feasible. (eg. My INT 10h wrappers are intended to support all systems that provide the routines I'm calling, but some hardware combinations get prohibitively onerous to support before whichever revision of the PC XT is referred to as the New XT by RBIL.)

2. Modern work where I don't want to opt out of maximum compile-time safety enforcement, but I do want to inspect the assembly dumps for my hot loops to see whether the benchmark times I'm getting can be improved by massaging the code into something more optimizer-friendly.

> But you can also be defensive. Have two binaries, or at
> least leave some padding for binary patching.

Padding would defeat the whole purpose of not using "push bp" and "pop bp" Just In Case™, so my plan was to just make the EXE tiny and easy to swap out.

(eg. When I get to the point where I'm supporting single-file installers, I intend to implement it as a Zip self-extractor stub which reads its control scripting from inside the appended Zip file rather than using a custom packfile format like InnoSetup does.)
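
A minimal sketch of how a stub might locate a Zip archive appended to its own EXE image: scan backwards for the End of Central Directory record (signature bytes "PK\5\6", 22 bytes plus an optional comment). The function name find_eocd and the in-memory buffer are assumptions for illustration; a real stub would more likely read only the tail of the file from disk.

```c
#include <assert.h>
#include <stddef.h>

#define ZIP_EOCD_MIN_SIZE 22  /* fixed part of the End of Central Directory record */

/* Hypothetical helper: returns the offset of the EOCD record inside buf,
 * or -1 if no plausible record is found. A candidate is accepted only if
 * its comment-length field accounts for exactly the remaining bytes. */
static long find_eocd(const unsigned char *buf, size_t len)
{
    size_t pos;

    assert(buf != NULL);
    if (len < ZIP_EOCD_MIN_SIZE)
        return -1;

    pos = len - ZIP_EOCD_MIN_SIZE;
    for (;;) {
        if (buf[pos] == 0x50 && buf[pos + 1] == 0x4b &&
            buf[pos + 2] == 0x05 && buf[pos + 3] == 0x06) {
            /* Comment length is a little-endian 16-bit field at offset 20. */
            size_t comment_len = (size_t)buf[pos + 20] |
                                 ((size_t)buf[pos + 21] << 8);
            if (pos + ZIP_EOCD_MIN_SIZE + comment_len == len)
                return (long)pos;
        }
        if (pos == 0)
            break;
        pos--;
    }
    return -1;
}
```

Everything before the first local file header would then be the stub itself, so swapping the EXE or repacking the archive leaves the other half untouched.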

Given that Watcom doesn't support conditional compilation directives *within* a block of inline assembly, two different versions of the assembly could easily drift out of sync and introduce bugs. However, I've gone assert-crazy with this code, I intend to support verbose debug logging, and I *will* be offering up both "development" and "distribution" builds in my release archives so anyone can easily test against the asserts-enabled version.
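
To illustrate the drift risk: since the conditional cannot go inside the assembly block, the whole #pragma aux has to be written twice and selected from outside. The sketch below reuses the BIOSSetCurPos example from the User's Guide; SAVE_BP_AROUND_INT10 is a hypothetical macro name, and the non-Watcom branch is only a stand-in (with made-up host_last_* variables) so the file also builds on a host compiler.

```c
#include <assert.h>

#ifdef __WATCOMC__
extern void BIOSSetCurPos( unsigned short __rowcol,
                           unsigned char __page );
#ifdef SAVE_BP_AROUND_INT10     /* hypothetical "defensive" build switch */
#pragma aux BIOSSetCurPos = \
    "push bp"   \
    "mov ah,2"  \
    "int 10h"   \
    "pop bp"    \
    parm [dx] [bh] \
    modify [ah];
#else                           /* smaller variant, no BP save/restore */
#pragma aux BIOSSetCurPos = \
    "mov ah,2"  \
    "int 10h"   \
    parm [dx] [bh] \
    modify [ah];
#endif
#else
/* Host stand-in (assumption): records the arguments instead of calling
 * the BIOS, so callers can be exercised outside DOS. */
static unsigned short host_last_rowcol;
static unsigned char host_last_page;

static void BIOSSetCurPos(unsigned short rowcol, unsigned char page)
{
    host_last_rowcol = rowcol;
    host_last_page = page;
}
#endif
```

Any later change to the instruction sequence or the parm/modify lists has to be made in both copies by hand, which is where the two variants could diverge.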

> IIRC, OpenWatcom supports register by default and cdecl is
> also optionally available. I'm far from expert, but IIRC,
> cdecl requires saving BP, BX, SI, DI across functions.
> (These are the same registers used in extended addressing
> in 8086.)

Yeah. It's actually very satisfying to see how clean and concise Open Watcom's default __watcall calling convention looks in the assembly dumps.

As far as other calling conventions go, I haven't needed to remember the details, but the -? output for changing the default calling convention suggests that it also supports stdcall, fastcall, pascal, fortran, and syscall conventions.

> Again, back to FPC, "Stack frame omission":
>
> "
> Under specific conditions, the stack frame will be omitted,
> and the variable will directly be accessed via the stack pointer.
> "
>
> * https://www.freepascal.org/docs-html/prog/progsu187.html

Interesting. Thanks. :)

> * https://www.lazybrowndog.net/freedos/virtualbox/

I'll still want to set up the rest of the test matrix I have planned, but that'll save me a little time. :)

> There are a billion text editors for DOS, and most of the good
> ones are compiled by DJGPP. Of course, OpenWatcom comes with
> its own excellent vi clone. Again, I'm no expert, but I've
> dabbled in a ton of them, all have various strengths. JED,
> VILE, GNU Emacs, TDE, FTE, FED, THE ... the list goes on.
> I think JED even has Watcom or Borland compiler error catching.
> (DJGPP got a new GNU Emacs build recently, after four years,
> but I haven't tried it yet.)

It's more that I'm too spoiled by resolutions and tooling that you can't get in DOS text mode.

For example, `ifndef __MSDOS__`, my Makefile runs splint on the code to warn me about things in my codebase that Watcom just lets past with maximum warnings enabled and to provide "converting between different typedefs of the same type must be done explicitly" warnings so I get notified if I do something like mixing up rows and columns.
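
A small sketch of the kind of code that benefits from such a check. ScreenRow, ScreenCol, and pack_rowcol are hypothetical names; the packing follows INT 10h AH=02h, which takes the row in DH and the column in DL. A plain C compiler accepts the arguments in either order because both typedefs are unsigned char, so catching a swap depends on an external checker configured to treat them as distinct.

```c
#include <assert.h>

/* Hypothetical distinct names for the same underlying type. */
typedef unsigned char ScreenRow;
typedef unsigned char ScreenCol;

/* Packs a cursor position into the 16-bit value passed in DX:
 * high byte (DH) = row, low byte (DL) = column. */
static unsigned short pack_rowcol(ScreenRow row, ScreenCol col)
{
    return (unsigned short)(((unsigned short)row << 8) | col);
}
```

For instance, pack_rowcol(24, 79) yields 0x184F, the bottom-right cell of an 80x25 screen, while the swapped call compiles just as cleanly and silently targets a different cell.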

> Of course, the problem then is to be careful about mixing
> extenders since most don't play well together. I think Causeway
> and DOS4GW will get along with CWSDPMI okay. So will HX's HDPMI32.

I actually just read a blog post about that a few days ago.

The guy's site (blarg.ca) is having problems at the moment, but he wound up making a DOS/4GW build of FTE which was also patched to support 80x50 text mode so he could comfortably use it for Watcom development. (Not Open Watcom. The boxed copy of Watcom he found.)

https://github.com/gered/fte

> Naively, I'd recommend Causeway unless there's a good reason otherwise.

But that prompts the question of whether it bodes well or ill that CauseWay is the only one to get its own volume in the bundle of Open Watcom manuals. (ie. Is it a customized version of the official Devore documentation or did they think the official documentation was so lacking that they wrote their own?)

> He's very smart but always busy. I haven't seen him around lately.
> Maybe try asking on freedos-user if you don't see him around soon.

Noted, but, if he's that busy, I may hold back on pursuing his attention as a courtesy.

Kerr-Mudd,John

unread,
May 13, 2019, 4:21:45 AM5/13/19
to
On Sun, 12 May 2019 00:03:49 GMT, T. Ment <t.m...@protocol.invalid>
wrote:

> On Sat, 11 May 2019 15:31:13 -0700 (PDT), S. Sokolow wrote:
>
[]
>
>> Nonetheless, if you feel the productive conversation should end here,
>> I'm willing to let you have the last word.
>
> If you can't keep that promise, I have a newsreader and killfile.
>
>
>
Usenet is a great place to share experiences and bicker.



--
Bah, and indeed, Humbug.

rug...@gmail.com

unread,
May 14, 2019, 8:05:47 PM5/14/19
to
Hi,

On Sunday, May 12, 2019 at 3:07:37 PM UTC-5, S. Sokolow wrote:
> >
> > Honestly, I'd rather suggest Turbo Pascal (or preferably Free
> > Pascal, although that's not as small, for good reason). Those
> > have actual smartlinkers
>
> Unfortunately, I discovered that the Free Pascal runtime imposed
> too high a base overhead on the file size to satisfy me, even
> after switching to the i8086 target.

Try something like this (half guessing):

-CX -XXs -O3 -Mtp -Si -Cppentium -Oppentium4

> (Initially, DPMI with the stub swapped for the embeddable extender
> was an option I investigated to see how much of my prototyping might
> make it into the final version.)

Newer versions of FPC (Go32v2) don't UPX properly due to internal
linker, but who cares, it's small enough output, IMHO. You could
also use WDOSX if you really wanted (though I don't "normally"
recommend it, it has some caveats).

> As for Turbo Pascal, the fact that only "most" of the Turbo Pascal
> versions Borland freeware'd are still available on Embarcadero's
> website puts a bad taste in my mouth, knowing the kind of
> "no redistribution" terms Embarcadero puts on their downloads.
> I don't like my code depending on tooling which might only be
> available as "abandonware" later.

I know, and I agree. My point was severalfold:

* TP 5.5 is freeware and small (and can run atop 8086, unlike FPC).
That version just turned 30 recently. It's a classic.
* Its output is buggier and more limited, but at least it's smaller
than FPC. (i8086-msdos supports more Delphi features, plus it supports
LFNs, *nix LF-only files, etc).

Obviously FPC is preferred, but TP is still good in a pinch.

> It's more that I'm too spoiled by resolutions and tooling
> that you can't get in DOS text mode.

Haven't tried, I usually stick to EGA's 80x43. But I think VESA will
go up to 132x60 or such. There are several programs for DOS for that
(SETMxx, SETLINES/ATILINES ... not necessarily for 132x60 VESA,
just in general) and even a port of SVGAtextMode (haven't tried,
a bit rough around the edges, but that's probably what you want).

I'm not sure, but I assume some (most?) of these editors can handle such,
but you might have to enable it manually (GNU Emacs?).

> > He's very smart but always busy. I haven't seen him around lately.
> > Maybe try asking on freedos-user if you don't see him around soon.
>
> Noted, but, if he's that busy, I may hold back on pursuing his
> attention as a courtesy.

I don't know any details, but I wouldn't be overly cautious. If he's
too busy, he'll say so.

S. Sokolow

unread,
May 15, 2019, 10:43:31 AM5/15/19
to
On Wednesday, May 15, 2019 at 12:05:47 AM UTC, rug...@gmail.com wrote:
> Hi,
>
> On Sunday, May 12, 2019 at 3:07:37 PM UTC-5, S. Sokolow wrote:
> > >
> > > Honestly, I'd rather suggest Turbo Pascal (or preferably Free
> > > Pascal, although that's not as small, for good reason). Those
> > > have actual smartlinkers
> >
> > Unfortunately, I discovered that the Free Pascal runtime imposed
> > too high a base overhead on the file size to satisfy me, even
> > after switching to the i8086 target.
>
> Try something like this (half guessing):
>
> -CX -XXs -O3 -Mtp -Si -Cppentium -Oppentium4

This is what my Makefile was using before I switched to Open Watcom:

-Os -CX -XX -Xs -v0ewnh

I tried everything listed on the wiki for reducing program size except using a custom build of the standard library.

As for the last two you suggested, optimizing for Pentium 4 would defeat the whole purpose. My main retro-hobby machine is an AST Adventure! 210 (133MHz Pentium) and, even when I was considering DPMI, my goal with this particular project was to support all the way back to the 386.

Once I decided to go real mode, it switched to supporting all the way back to whichever version of the IBM PC XT is the first to support the requisite INT 10h calls.

(Hence my obsession with hitting a size target of 10-50KiB. I intend this installer toolkit to be usable on a 720K floppy disk without crowding out the actual content.)

With Open Watcom C/C++, I've managed to keep things so compact that UPX refuses to compress the demonstration of the graphics routines that I've produced so far (https://imgur.com/a/ynn7SoJ) because the savings would be drowned out by the size of the UPX stub.

>
> > (Initially, DPMI with the stub swapped for the embeddable extender
> > was an option I investigated to see how much of my prototyping might
> > make it into the final version.)
>
> Newer versions of FPC (Go32v2) don't UPX properly due to internal
> linker, but who cares, it's small enough output, IMHO. You could
> also use WDOSX if you really wanted (though I don't "normally"
> recommend it, it has some caveats).

Normally, it'd be small enough, but not in this particular project.

My goal is to fit TUI-drawing routines, helpers for things like checking free disk space and listing available drives, a parser for a simple batch file-esque scripting language, and a Zip decompressor into 50KiB or less and, ideally, into 15KiB or less.

(I don't anticipate managing it, but it'd make my day if I managed to beat out the smallest period installer in my collection... one of my CD-ROM releases of Lemmings at 10.9KiB including compiled-in control "scripting".)

> I know, and I agree. My point was severalfold:
>
> * TP 5.5 is freeware and small (and can run atop 8086, unlike FPC).
> That version just turned 30 recently. It's a classic.
> * Its output is buggier and more limited, but at least it's smaller
> than FPC. (i8086-msdos supports more Delphi features, plus it supports
> LFNs, *nix LF-only files, etc).
>
> Obviously FPC is preferred, but TP is still good in a pinch.

Fair enough.

>
> > It's more that I'm too spoiled by resolutions and tooling
> > that you can't get in DOS text mode.
>
> Haven't tried, I usually stick to EGA's 80x43. But I think VESA will
> go up to 132x60 or such. There are several programs for DOS for that
> (SETMxx, SETLINES/ATILINES ... not necessarily for 132x60 VESA,
> just in general) and even a port of SVGAtextMode (haven't tried,
> a bit rough around the edges, but that's probably what you want).
>
> I'm not sure, but I assume some (most?) of these editors can handle such,
> but you might have to enable it manually (GNU Emacs?).

I'd forgotten about the VESA options, and I'll definitely have to experiment with that, but it's not just the dimensions. It's also the font, how DOSBox and DOS-in-VirtualBox only support "capture all input until you hit a release key" mode when something starts listening to the mouse (making it more hassle to switch out of DOS to run things like splint or test automation which launches a bunch of different DOS VMs) and, to be honest, various aspects of my .vimrc which I don't want to give up.

The papercuts just form too big a pile to make it worthwhile.

>
> > > He's very smart but always busy. I haven't seen him around lately.
> > > Maybe try asking on freedos-user if you don't see him around soon.
> >
> > Noted, but, if he's that busy, I may hold back on pursuing his
> > attention as a courtesy.
>
> I don't know any details, but I wouldn't be overly cautious. If he's
> too busy, he'll say so.

Yesterday, I started an effort to finally fix a sleep-related problem I have once and for all, so I'm currently exhausted and probably won't be starting any new conversations for a little while, but fair enough.

rug...@gmail.com

unread,
May 21, 2019, 12:07:14 PM5/21/19
to
Hi again,

On Wednesday, May 15, 2019 at 9:43:31 AM UTC-5, S. Sokolow wrote:
> On Wednesday, May 15, 2019 at 12:05:47 AM UTC, rug...@gmail.com wrote:
> >
> > On Sunday, May 12, 2019 at 3:07:37 PM UTC-5, S. Sokolow wrote:
> > > >
> > > Unfortunately, I discovered that the Free Pascal runtime imposed
> > > too high a base overhead on the file size to satisfy me, even
> > > after switching to the i8086 target.
> >
> > Try something like this (half guessing):
> >
> > -CX -XXs -O3 -Mtp -Si -Cppentium -Oppentium4
>
> This is what my Makefile was using before I switched to Open Watcom:
>
> -Os -CX -XX -Xs -v0ewnh

I'm not sure -Os makes much of a difference. It might sometimes actually
be smaller with -O3.

-XX -Xs is the same as -XXs.

-v0 I thought meant shut everything up. The other stuff is notes, hints,
whatever, so mostly just diagnostic warnings.

I was just weakly giving an example. Some switches make a difference,
and often you can also enable them with source directives.

> I tried everything listed on the wiki for reducing program size
> except using a custom build of the standard library.

Besides enabling smartlinking and using a compressor (and WDOSX does
have a compressor), there isn't much you can do. It's not worth
worrying too too much about (in normal circumstances).

> As for the last two you suggested, optimizing for Pentium 4
> would defeat the whole purpose. My main retro-hobby machine
> is an AST Adventure! 210 (133MHz Pentium) and, even when I
> was considering DPMI, my goal with this particular project
> was to support all the way back to the 386.

I understand. I was just trying to give an example of potential
options. I'm not sure how effective some of them are, anyways.

> Once I decided to go real mode, it switched to supporting all
> the way back to whichever version of the IBM PC XT is the first
> to support the requisite INT 10h calls.

I guess you know you can have runtime cpu detection. Although for
size only, that may not help. But for speed or other reasons
(extra functionality?), it's cool.

> (Hence my obsession with hitting a size target of 10-50KiB.
> I intend this installer toolkit to be usable on a 720K floppy
> disk without crowding out the actual content.)

Understandable. But features and stability must always come first.

> With Open Watcom C/C++, I've managed to keep things so compact
> that UPX refuses to compress the demonstration of the graphics
> routines that I've produced so far because the savings would be
> drowned out by the size of the UPX stub.

There are other compressors. Effectiveness varies.

> My goal is to fit TUI-drawing routines, helpers for things
> like checking free disk space and listing available drives,
> a parser for a simple batch file-esque scripting language,
> and a Zip decompressor into 50KiB or less and, ideally,
> into 15KiB or less.

DiskFree (unit DOS) works with FAT32 (int 21h, 7303h):
* http://www.delorie.com/djgpp/doc/rbinter/id/40/32.html

Listing available drives? Dunno, but check DJGPP's mntent.c
Here's an old thread on BTTR for it:
* http://www.bttr-software.de/forum/board_entry.php?id=6721#p6723

ZIP decompressor? You mean Deflate only? Try not to bite off more
than you can chew!

> (I don't anticipate managing it, but it'd make my day if I managed
> to beat out the smallest period installer in my collection...
> one of my CD-ROM releases of Lemmings at 10.9KiB including
> compiled-in control "scripting".)

IIRC, Lemmings 3D was written in assembly. You're not going to beat
that for size. But I wouldn't worry too too hard about it (although
I'm sympathetic!).

> I'd forgotten about the VESA options, and I'll definitely have
> to experiment with that, but it's not just the dimensions.
> It's also the font,

* https://www.ibiblio.org/pub/micro/pc-stuff/freedos/files/util/system/fonts/

> how DOSBox and DOS-in-VirtualBox only support "capture all input
> until you hit a release key" mode when something starts listening
> to the mouse (making it more hassle to switch out of DOS to run

* https://www.ibiblio.org/pub/micro/pc-stuff/freedos/files/distributions/unofficial/metados/

> things like splint or test automation which launches a bunch of
> different DOS VMs)

There's a 2008 port of Splint 3.1.2 to DJGPP (although I've never
used it):

* http://na.mirror.garr.it/mirrors/djgpp/current/v2tk/spl312br4.zip

> and, to be honest, various aspects of my .vimrc
> which I don't want to give up.

VIM dropped DJGPP in 7.3 or such. I sometimes prefer VILE anyways.
(Like I said, there's dozens of other DOS editors.)

> The papercuts just form too big a pile to make it worthwhile.

Whatever works for you. I'm just offering a few obvious hints.

> Yesterday, I started an effort to finally fix a sleep-related
> problem I have once and for all, so I'm currently exhausted
> and probably won't be starting any new conversations for a little
> while, but fair enough.

Understandable. Take care, and good luck!

S. Sokolow

unread,
May 21, 2019, 6:18:36 PM5/21/19
to
On Tuesday, May 21, 2019 at 4:07:14 PM UTC, rug...@gmail.com wrote:
> Hi again,
>
> On Wednesday, May 15, 2019 at 9:43:31 AM UTC-5, S. Sokolow wrote:
> > On Wednesday, May 15, 2019 at 12:05:47 AM UTC, rug...@gmail.com wrote:
> > >
> > > Try something like this (half guessing):
> > >
> > > -CX -XXs -O3 -Mtp -Si -Cppentium -Oppentium4
> >
> > This is what my Makefile was using before I switched to Open Watcom:
> >
> > -Os -CX -XX -Xs -v0ewnh
>
> I'm not sure -Os makes much of a difference. It might sometimes actually
> be smaller with -O3.

*nod* Next time I do a Free Pascal project, I'll leverage the "exhaustively test combinations of build flags to minimize output size" helper that I plan to write for this project.
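
The core of such a helper could be sketched as a bitmask walk over the candidate flags, where each of the 2^n masks selects one combination. This is only an illustration under stated assumptions: the names kFlags, build_flag_set, and flag_set_count are made up, the flag list is just the one quoted above, and actually invoking the compiler and comparing output sizes is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Candidate flags taken from the Makefile line quoted above. */
static const char *const kFlags[] = { "-Os", "-CX", "-XX", "-Xs" };
#define FLAG_COUNT (sizeof kFlags / sizeof kFlags[0])

/* Number of distinct combinations, including the empty one. */
static unsigned flag_set_count(void)
{
    return 1u << FLAG_COUNT;
}

/* Writes the flags whose bits are set in mask into out, separated by
 * single spaces. Returns the string length, or -1 if out is too small. */
static int build_flag_set(unsigned mask, char *out, size_t out_size)
{
    size_t used = 0;
    size_t i;

    assert(out != NULL);
    if (out_size == 0)
        return -1;
    out[0] = '\0';

    for (i = 0; i < FLAG_COUNT; i++) {
        size_t len;
        if (!(mask & (1u << i)))
            continue;
        len = strlen(kFlags[i]);
        if (used + len + 2 > out_size)  /* room for separator and NUL */
            return -1;
        if (used > 0)
            out[used++] = ' ';
        memcpy(out + used, kFlags[i], len);
        used += len;
        out[used] = '\0';
    }
    return (int)used;
}
```

A driver would loop mask from 0 to flag_set_count() - 1, build with each resulting string, and keep the combination that produces the smallest binary.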

>
> -XX -Xs is the same as -XXs.

Yeah. It didn't occur to me that the parser might allow condensing "sub-flags" the way it allows condensing argument-less flags at the top level.

>
> -v0 I thought meant shut everything up. The other stuff is notes, hints,
> whatever, so mostly just diagnostic warnings.

For all I know, it might. I don't remember what I was intending at the time.

That's just what was in the final commit before I scrapped the Free Pascal prototype and started over with Open Watcom C/C++.

>
> I was just weakly giving an example. Some switches make a difference,
> and often you can also enable them with source directives.

Yeah. I was using source directives for various things that I felt should be specified next to the source they're intended to primarily affect.

>
> Besides enabling smartlinking and using a compressor (and WDOSX does
> have a compressor), there isn't much you can do. It's not worth
> worrying too too much about (in normal circumstances).

There *are* other things you can do, they're just much more of a trade-off.

(eg. The "Size Matters" wiki page itself points out that the RTL plus the sysutils unit is liable to impart a 100-125kb overhead compared to about 25kb for the RTL alone... but, of course, that means you have to reinvent the bits of sysutils that you actually need.)

>
> I guess you know you can have runtime cpu detection. Although for
> size only, that may not help. But for speed or other reasons
> (extra functionality?), it's cool.

Definitely cool. I just don't see a need for it in this case.

Aside from ensuring my file copying isn't a bottleneck and coalescing my drawing routines into as few BIOS calls as possible, this is a completely I/O-bound project, running on a non-multitasking operating system, and its performance when I'm emulating a system that either matches or underperforms a PC XT indicates I have plenty of CPU cycles to burn in order to shave bytes off the binary image size.

>
> > (Hence my obsession with hitting a size target of 10-50KiB.
> > I intend this installer toolkit to be usable on a 720K floppy
> > disk without crowding out the actual content.)
>
> Understandable. But features and stability must always come first.

Certainly, but I don't anticipate having any trouble with either of those.

...though, in the name of reconciling features and size, I do plan to compromise on the "let end-users inspect and play around with installers they receive" goal by making the more declarative InnoSetup-esque project definitions compile down to an NSIS-esque imperative syntax so developers don't have to distribute the parser for the more complex syntax with their installers.

All of the parsers I've seen for things like JSON are gigantic by "floppy installer" standards and I don't feel like trying to write one of my own.

>
> There are other compressors. Effectiveness varies.

True. I was more using it as an example of how compact I've managed to keep things so far.

That said, see also my preference for open-source build dependencies. That cuts against a lot of compressors.

>
> DiskFree (unit DOS) works with FAT32 (int 21h, 7303h):
> * http://www.delorie.com/djgpp/doc/rbinter/id/40/32.html
>
> Listing available drives? Dunno, but check DJGPP's mntent.c
> Here's an old thread on BTTR for it:
> * http://www.bttr-software.de/forum/board_entry.php?id=6721#p6723

Thanks. I'll take a look at those, though I'm pretty sure I have some resources for those tasks already bookmarked.

>
> ZIP decompressor? You mean Deflate only? Try not to bite off more
> than you can chew!

The Minimum Viable Product won't use compression at all and, when I add support for Zip, I'll start with a Store-only implementation since the first goal of that is to make single-file installers.

I do intend to eventually support Deflate as well because, if nothing else, I'm trying to build a reasonable approximation of "pack/repack your installer using any standard archive tool and it'll work".
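
As a rough sketch of what the Store-only first step involves: a Zip local file header is 30 fixed bytes (signature "PK\3\4", with the compression method at offset 8 and the sizes at offsets 18 and 22), followed by the file name, the extra field, and then the data. For method 0 the data can be copied out as-is. The names below (StoredEntry, zip_stored_entry) are hypothetical, and entries that use a trailing data descriptor are simply reported as unhandled.

```c
#include <assert.h>
#include <stddef.h>

#define ZIP_LOCAL_HEADER_SIZE 30
#define ZIP_METHOD_STORE 0
#define ZIP_FLAG_DATA_DESCRIPTOR 0x0008u

struct StoredEntry {
    size_t data_offset;   /* where the raw file bytes start in buf */
    unsigned long size;   /* number of bytes to copy out */
};

/* Little-endian field readers. */
static unsigned rd16(const unsigned char *p)
{
    return (unsigned)p[0] | ((unsigned)p[1] << 8);
}

static unsigned long rd32(const unsigned char *p)
{
    return (unsigned long)p[0] | ((unsigned long)p[1] << 8) |
           ((unsigned long)p[2] << 16) | ((unsigned long)p[3] << 24);
}

/* Returns 1 and fills *out for a stored entry, 0 for a well-formed header
 * this sketch does not handle (e.g. Deflate), and -1 for malformed input. */
static int zip_stored_entry(const unsigned char *buf, size_t len,
                            struct StoredEntry *out)
{
    unsigned flags, method;
    unsigned long csize, usize;
    size_t offset;

    assert(buf != NULL && out != NULL);
    if (len < ZIP_LOCAL_HEADER_SIZE || rd32(buf) != 0x04034b50UL)
        return -1;

    flags = rd16(buf + 6);
    method = rd16(buf + 8);
    csize = rd32(buf + 18);
    usize = rd32(buf + 22);
    if (method != ZIP_METHOD_STORE || (flags & ZIP_FLAG_DATA_DESCRIPTOR))
        return 0;

    /* Skip the variable-length file name and extra field. */
    offset = (size_t)ZIP_LOCAL_HEADER_SIZE + rd16(buf + 26) + rd16(buf + 28);
    if (offset > len || csize != usize || csize > len - offset)
        return -1;

    out->data_offset = offset;
    out->size = csize;
    return 1;
}
```

Adding Deflate later would mean handling method 8 at the point where this sketch returns 0, leaving the header parsing unchanged.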

Either way, I'm willing to implement from scratch, but I'm no fool.

I plan to drop into the Info-ZIP mailing lists and ask for clarification on the rules for how their DOS self-extractor stub's attribution message must be displayed to avoid the need to distribute it under more normal terms. (eg. Bundling a copy of the full license with it, etc.)

Looking at the makefiles, the real-mode versions of Info-Zip build under Open Watcom C/C++ and, on a technical level, I should just be able to pare that code down to what I need.

>
> IIRC, Lemmings 3D was written in assembly. You're not going to beat
> that for size. But I wouldn't worry too too hard about it (although
> I'm sympathetic!).

The game itself? Sure. ...but I think the installer was written by the publisher (or at least someone else in the company)... and it's definitely too big to be size-optimized assembly.

PKUNZJR.COM managed to pack a Deflate-capable Zip decompressor into 2KiB. THAT is the target I can't match.

I'm certain I could beat the installer I'm thinking of for size while replicating the look if I hard-coded the control scripting and made use of globs rather than hard-coding a list of files to install.

I've already got more drawing routines than would be needed in 1.6KiB and I'd be VERY surprised if it took me another 9.3KiB to do the rest of what they did.

>
> > I'd forgotten about the VESA options, and I'll definitely have
> > to experiment with that, but it's not just the dimensions.
> > It's also the font,
>
> * https://www.ibiblio.org/pub/micro/pc-stuff/freedos/files/util/system/fonts/

Nice. Thanks. :)

>
> > how DOSBox and DOS-in-VirtualBox only support "capture all input
> > until you hit a release key" mode when something starts listening
> > to the mouse (making it more hassle to switch out of DOS to run
>
> * https://www.ibiblio.org/pub/micro/pc-stuff/freedos/files/distributions/unofficial/metados/

That's interesting and I'm glad to now know where to find a convenient starting point for boot floppies, but the stuff I found doesn't mention anything about enabling the "seamless mouse" behaviour that I was referring to or, even better, exempting Alt+Tab from the set of events passed to the emulated environment.

That aspect of my problem with using a DOS-native editor/IDE has to do with efficiently switching focus between the editor and other tools.

>
> > things like splint or test automation which launches a bunch of
> > different DOS VMs)
>
> There's a 2008 port of Splint 3.1.2 to DJGPP (although I've never
> used it):
>
> * http://na.mirror.garr.it/mirrors/djgpp/current/v2tk/spl312br4.zip

Huh. Usually my Google Fu doesn't fail like that. I'll try to incorporate that into the DOS cases of my makefile.

(I've already gone so far as to ensure that all my lines hard-wrap at 78 columns so the code is comfortably editable in Open Watcom's TUI-based editor for DOS.)

>
> > and, to be honest, various aspects of my .vimrc
> > which I don't want to give up.
>
> VIM dropped DJGPP in 7.3 or such. I sometimes prefer VILE anyways.
> (Like I said, there's dozens of other DOS editors.)
>
> > The papercuts just form too big a pile to make it worthwhile.
>
> Whatever works for you. I'm just offering a few obvious hints.

Noted. I may still explore the idea once I don't have other things dragging down my productivity.

>
> > Yesterday, I started an effort to finally fix a sleep-related
> > problem I have once and for all, so I'm currently exhausted
> > and probably won't be starting any new conversations for a little
> > while, but fair enough.
>
> Understandable. Take care, and good luck!

Thanks. :)

rug...@gmail.com

unread,
May 22, 2019, 1:58:02 AM5/22/19
to
Hi,

On Tuesday, May 21, 2019 at 5:18:36 PM UTC-5, S. Sokolow wrote:
> On Tuesday, May 21, 2019 at 4:07:14 PM UTC, rug...@gmail.com wrote:
> >
> > On Wednesday, May 15, 2019 at 9:43:31 AM UTC-5, S. Sokolow wrote:
> > > On Wednesday, May 15, 2019 at 12:05:47 AM UTC, rug...@gmail.com wrote:
> > > >
> > > > Try something like this (half guessing):
> > > >
> > > > -CX -XXs -O3 -Mtp -Si -Cppentium -Oppentium4
> > >
> > > This is what my Makefile was using before I switched to Open Watcom:
> > >
> > > -Os -CX -XX -Xs -v0ewnh
> >
> > I'm not sure -Os makes much of a difference. It might sometimes actually
> > be smaller with -O3.
>
> *nod* Next time I do a Free Pascal project, I'll leverage the
> "exhaustively test combinations of build flags to minimize output
> size" helper that I plan to write for this project.

Too much effort for too little gain. Seriously, in my (limited)
experience, it makes extremely little difference. Just stick to -O3.

> > -XX -Xs is the same as -XXs.
>
> Yeah. It didn't occur to me that the parser might allow condensing
> "sub-flags" the way it allows condensing argument-less flags
> at the top level.

Probably not always ... but, sometimes, yes.
-Ctrio is valid, I think. Maybe obscure ones like -Sgo too, can't
quite remember.

> > -v0 I thought meant shut everything up. The other stuff is notes, hints,
> > whatever, so mostly just diagnostic warnings.
>
> For all I know, it might. I don't remember what I was intending at the time.

fpc.cfg has some defaults worth checking, too.

(The TUI ide, fp.exe, can use .chm for online help. Not perfect but
better than nothing. Plain text versions are also good for reference.)

> > I was just weakly giving an example. Some switches make a difference,
> > and often you can also enable them with source directives.
>
> Yeah. I was using source directives for various things that I felt
> should be specified next to the source they're intended to primarily
> affect.

Sometimes it's better to specify in source rather than accidentally
relying on someone knowing correct switches to manually use.
Flexibility can be both a blessing and a curse.

> > Besides enabling smartlinking and using a compressor (and WDOSX does
> > have a compressor), there isn't much you can do. It's not worth
> > worrying too too much about (in normal circumstances).
>
> There *are* other things you can do, they're just much more of a trade-off.

I know. Even using WDOSX is a tradeoff because it has some (indirect?)
flaws. But it does mostly work (esp. if you direly need compression).

> (eg. The "Size Matters" wiki page itself points out

Written by a guy (nice and smart ... but very critical) who abhors
the whole idea of size savings. Granted, he likes smartlinking,
but beyond that, he couldn't care less.

> that the RTL plus the sysutils unit is liable to impart
> a 100-125kb overhead compared to about 25kb for the RTL
> alone... but, of course, that means you have to reinvent
> the bits of sysutils that you actually need.)

Disclaimer: I know nothing of Delphi dialect. But sysutils
is mostly for that alone. So you can (and should!) do without.
It's also mandatory for exceptions, IIRC, which do add a
small size and speed penalty (but most Delphi users always
use it anyways). Dynamic arrays also use exceptions behind
the scenes.

Do keep in mind that FPC supports several dialects (e.g. "tp"),
and you can use different dialects for different modules / units.
(IIRC, default "fpc" dialect allows function overloading and
structured function returns, unlike "tp".)

> > I guess you know you can have runtime cpu detection. Although for
> > size only, that may not help. But for speed or other reasons
> > (extra functionality?), it's cool.
>
> Definitely cool. I just don't see a need for it in this case.

Not really, no. But it can be cool, in theory, to have 186 or 286
routines (or even 386 or 686). If it makes a noticeable difference,
of course.

> Aside from ensuring my file copying isn't a bottleneck
> and coalescing my drawing routines into as few BIOS calls
> as possible, this is a completely I/O-bound project,
> running on a non-multitasking operating system, and its
> performance when I'm emulating a system that either matches
> or underperforms a PC XT indicates I have plenty of CPU cycles
> to burn in order to shave bytes off the binary image size.

But the 8088 is slower than the 8086, and even those are way
slower than an actual 286. Something about effective addressing
and ALU, prefetch queue, clock speed, and more complications
(jumps are somewhat costly). Honestly, I wouldn't worry about
it *at all* until you've done everything else (if even then).
But I'm far from an expert in this.

> > There are other compressors. Effectiveness varies.
>
> True. I was more using it as an example of how compact I've managed
> to keep things so far.
>
> That said, see also my preference for open-source build dependencies.
> That cuts against a lot of compressors.

I totally sympathize here. It does make more sense, IMHO.
It's quite annoying not being able to reproduce builds.
UPX is pretty good overall.

> > ZIP decompressor? You mean Deflate only? Try not to bite off more
> > than you can chew!
>
> The Minimum Viable Product won't use compression at all and,
> when I add support for Zip, I'll start with a Store-only
> implementation since the first goal of that is to make
> single-file installers.

Would this help? (Probably not, but ....)

* ftp://ftp.freepascal.org/pub/fpc/contrib/zfs210.zip

Actually, ZLIB might have some sample code (miniunz?).

> PKUNZJR.COM managed to pack a Deflate-capable Zip decompressor
> into 2KiB. THAT is the target I can't match.

IIRC, it doesn't do subdirs. It's probably just the sfx modified
to be standalone / runnable. There are other small tools, but
I don't recall many with a good (free/libre) license.

Take a look at UNTAR (also does .tar.gz), it's fairly small:

* http://www.ibiblio.org/pub/micro/pc-stuff/freedos/files/util/unix/tar/

I guess BOOZ (ZOO) might also be interesting?

* https://www.sac.sk/download/pack/booz20.zip

Kerr-Mudd,John
May 22, 2019, 4:14:28 AM
a FREEDOS program, TUNZ is 2.5k; it's on sourceforge.
Might be worth a look. (I can't get to SF ATM;"sourceforge.net uses an
unsupported protocol.")

> IIRC, it doesn't do subdirs. It's probably just the sfx modified
> to be standalone / runnable. There are other small tools, but
> I don't recall many with a good (free/libre) license.
>
> Take a look at UNTAR (also does .tar.gz), it's fairly small:
>
> * http://www.ibiblio.org/pub/micro/pc-stuff/freedos/files/util/unix/tar/
>
> I guess BOOZ (ZOO) might also be interesting?
>
> * https://www.sac.sk/download/pack/booz20.zip
>



Kerr-Mudd,John
May 22, 2019, 4:18:28 AM
On Wed, 22 May 2019 08:14:27 GMT, "Kerr-Mudd,John"
<nots...@invalid.org> wrote:

> On Wed, 22 May 2019 05:58:01 GMT, rug...@gmail.com wrote:
>
>> Hi,
>>
>> On Tuesday, May 21, 2019 at 5:18:36 PM UTC-5, S. Sokolow wrote:
>>> On Tuesday, May 21, 2019 at 4:07:14 PM UTC, rug...@gmail.com wrote:

[unzip etc]

>>> The Minimum Viable Product won't use compression at all and,
>>> when I add support for Zip, I'll start with a Store-only
>>> implementation since the first goal of that is to make
>>> single-file installers.
>>
>> Would this help? (Probably not, but ....)
>>
>> * ftp://ftp.freepascal.org/pub/fpc/contrib/zfs210.zip
>>
>> Actually, ZLIB might have some sample code (miniunz?).
>>
>>> PKUNZJR.COM managed to pack a Deflate-capable Zip decompressor
>>> into 2KiB. THAT is the target I can't match.
>>
> a FREEDOS program, TUNZ is 2.5k; it's on sourceforge.
> Might be worth a look. (I can't get to SF ATM;"sourceforge.net uses an
> unsupported protocol.")
>
>> IIRC, it doesn't do subdirs. It's probably just the sfx modified
>> to be standalone / runnable. There are other small tools, but
>> I don't recall many with a good (free/libre) license.
>>
>> Take a look at UNTAR (also does .tar.gz), it's fairly small:
>>
>> * http://www.ibiblio.org/pub/micro/pc-stuff/freedos/files/util/unix/tar/
>>
>> I guess BOOZ (ZOO) might also be interesting?
>>
>> * https://www.sac.sk/download/pack/booz20.zip
>>
>
>
>
Sorry for not trimming.

S. Sokolow
May 23, 2019, 6:07:48 AM
On Wednesday, May 22, 2019 at 5:58:02 AM UTC, rug...@gmail.com wrote:
> Hi,
>
> On Tuesday, May 21, 2019 at 5:18:36 PM UTC-5, S. Sokolow wrote:
> > On Tuesday, May 21, 2019 at 4:07:14 PM UTC, rug...@gmail.com wrote:
> >
> > *nod* Next time I do a Free Pascal project, I'll leverage the
> > "exhaustively test combinations of build flags to minimize output
> > size" helper that I plan to write for this project.
>
> Too much effort for too little gain. Seriously, in my (limited)
> experience, it makes extremely little difference. Just stick to -O3.

You overestimate how much effort it is.

To prove that, here's an implementation of it that took me 5-10 minutes.

...admittedly, untested to the point where I haven't even checked for syntax errors. Just a quick copy-paste of a helper function I had in another project and the rest from memory.


The only reason I hadn't written it already is that, with `wmake test` still on my TODO list, it wasn't useful yet and this is the first project I've done where I'm optimizing for size.

#!/usr/bin/env python3

import os, subprocess
from itertools import chain, combinations

# Command used to build the binary to test for size
BUILD_CMD = ['wmake', 'all']

# Compiler command and flags which should always be used
# (eg. warning level, target CPU, memory model, etc.)
REQUIRED_FLAGS = ['-q', '-0', '-ms', '-zpw', '-wx', '@$(src)hello.lnk']

# Command to be evaluated for size
BINARY_NAME = 'install.exe'

# Command to be run to verify that the flag combo doesn't cause breakages
TEST_CMD = ['wmake', 'test']

# Optimizer settings which should be tested
OPTIONAL_FLAGS = ['-ei', '-ob', '-oe', '-oh', '-oi', '-ok', '-ol', '-on',
                  '-or', '-os', '-wx', '-zm']

# Already-written function from another project of mine
def powerset(iterable):  # type: (Iterable[Any]) -> Iterator[Sequence[Any]]
    """C{powerset([1,2,3])} --> C{() (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)}

    @rtype: Iterable
    """
    i = list(iterable)
    return chain.from_iterable(combinations(i, j) for j in range(len(i) + 1))

smallest_size, smallest_flags = None, None

for flag_set in powerset(OPTIONAL_FLAGS):
    flag_str = 'WCLFLAGS=' + ' '.join(flag_set)
    try:
        try:
            subprocess.check_call(BUILD_CMD + [flag_str])
        except subprocess.CalledProcessError:
            continue

        # Don't bother to test files unless they're smaller than the best known
        size = os.stat(BINARY_NAME).st_size
        if smallest_size and size >= smallest_size:
            continue

        # Run the test suite to verify that these flags don't break the build
        # in some way
        try:
            subprocess.check_call(TEST_CMD + [flag_str])
        except subprocess.CalledProcessError:
            continue

        smallest_size = size
        smallest_flags = flag_set

    finally:
        subprocess.call(['wmake', 'clean'])

print("Smallest binary size achieved: %s" % smallest_size)
print("...using compiler flags: %r" % smallest_flags)
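For a sense of scale, here is a small sketch of how many builds that exhaustive search implies. It assumes the twelve-entry OPTIONAL_FLAGS list from the script; the SECONDS_PER_BUILD figure is purely hypothetical and only there to show the arithmetic.

```python
from itertools import chain, combinations

# Same candidate flags as in the script above.
OPTIONAL_FLAGS = ['-ei', '-ob', '-oe', '-oh', '-oi', '-ok', '-ol', '-on',
                  '-or', '-os', '-wx', '-zm']

def powerset(iterable):
    """Yield every subset of the input, from the empty tuple to the full set."""
    items = list(iterable)
    return chain.from_iterable(
        combinations(items, n) for n in range(len(items) + 1))

# Every subset is one build, so the count doubles with each extra flag.
build_count = sum(1 for _ in powerset(OPTIONAL_FLAGS))

SECONDS_PER_BUILD = 5  # hypothetical figure; measure your own build
hours = build_count * SECONDS_PER_BUILD / 3600.0
print("%d builds, roughly %.1f hours at %d s each"
      % (build_count, hours, SECONDS_PER_BUILD))
```

Dropping flags that are known to be irrelevant before running the search halves the run time per flag removed.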

> -Ctrio is valid, I think. Maybe obscure ones like -Sgo too, can't
> quite remember.

> fpc.cfg has some defaults worth checking, too.

Thanks. I'll make a note to check those out.

> (The TUI ide, fp.exe, can use .chm for online help. Not perfect but
> better than nothing. Plain text versions are also good for reference.)

Oh, yeah. I remembered seeing a CHM unit but it never really hit me that it could be used to write a DPMI-based DOS CHM viewer.

> Sometimes it's better to specify in source rather than accidentally
> relying on someone knowing correct switches to manually use.
> Flexibility can be both a blessing and a curse.

*nod* My policy is to put the stuff which alters the behaviour of the code in the source and the stuff which merely alters optimization settings in the Makefile.

> > (eg. The "Size Matters" wiki page itself points out
>
> Written by a guy (nice and smart ... but very critical) who abhors
> the whole idea of size savings. Granted, he likes smartlinking,
> but beyond that, he couldn't care less.

Interesting to know, but I only gave it as an example. The flags I chose were the result of reading every bit of info I could find on how to size-optimize Free Pascal output, including looking through the list of compiler flags and directives for any that looked promising.

> Disclaimer: I know nothing of Delphi dialect. But sysutils
> is mostly for that alone. So you can (and should!) do without.
> It's also mandatory for exceptions, IIRC, which do add a
> small size and speed penalty (but most Delphi users always
> use it anyways). Dynamic arrays also use exceptions behind
> the scenes.
>
> Do keep in mind that FPC supports several dialects (e.g. "tp"),
> and you can use different dialects for different modules / units.
> (IIRC, default "fpc" dialect allows function overloading and
> structured function returns, unlike "tp".)

I think the unzip code I'd been planning to use required exceptions but I never got that far before I decided to switch to Open Watcom C/C++, so I can't be certain.

> Not really, no. But it can be cool, in theory, to have 186 or 286
> routines (or even 386 or 686). If it makes a noticeable difference,
> of course.

No argument there.

> But the 8088 is slower than the 8086, and even those are way
> slower than an actual 286. Something about effective addressing
> and ALU, prefetch queue, clock speed, and more complications
> (jumps are somewhat costly). Honestly, I wouldn't worry about
> it *at all* until you've done everything else (if even then).
> But I'm far from an expert in this.

Hey, I just grabbed the MIPS.COM benchmark suggested on the DOSBox Wiki for figuring out how to match DOSBox to a specific performance point and turned the cycles down until the fastest benchmark matched the slowest reference example it offered.

I may reconsider my target later if need be, but it's currently an enjoyable challenge to hold myself to appealing performance under those conditions, when I spend most of my coding time writing stuff in Python that remains solidly I/O-bound and "good enough" without me even trying as long as I pick a sane algorithm.

> > The Minimum Viable Product won't use compression at all and,
> > when I add support for Zip, I'll start with a Store-only
> > implementation since the first goal of that is to make
> > single-file installers.
>
> Would this help? (Probably not, but ....)
>
> * ftp://ftp.freepascal.org/pub/fpc/contrib/zfs210.zip

The attributions say it ported the same LGPLed Info-ZIP decompression code that I was planning to use.

Aside from that, it's technically illegal to use it for *anything* because the author added additional requirements beyond the terms of the LGPL license that came with the Info-ZIP routines, which means you have to satisfy both those additional requirements and the clause in the LGPL which says you are forbidden from doing so.

(Modern versions of the GPL and LGPL resolve this by including a clause which specifically says that, if the author is ignorant enough to impose conditions beyond those specified in the license text, they don't have legal force and you can ignore them.)

> Actually, ZLIB might have some sample code (miniunz?).

The unzip.c from miniunz which handles the archive structure is based on Info-ZIP code, so this clause still applies:

Redistributions in binary form (compiled executables and libraries) must reproduce the above copyright notice, definition, disclaimer, and this list of conditions in documentation and/or other materials provided with the distribution. Additional documentation is not needed for executables where a command line license option provides these and a note regarding this option is in the executable's startup banner. The sole exception to this condition is redistribution of a standard UnZipSFX binary (including SFXWiz) as part of a self-extracting archive; that is permitted without inclusion of this license, as long as the normal SFX banner has not been removed from the binary or disabled.

You can see why I'm planning to try to contact them about that. It's not really very suited to something like an installer creation kit, where I don't want to force a startup banner on users. It'd feel like a shareware nag screen.

I suspect, however, that I'll have to prototype with a binary that uses Info-ZIP code and then un-cripple my creation by writing my own from-scratch equivalent to miniunz if I feel that it's making the program too big.

>
> Take a look at UNTAR (also does .tar.gz), it's fairly small:
>
> * http://www.ibiblio.org/pub/micro/pc-stuff/freedos/files/util/unix/tar/

That'd be perfect except that tar is a streaming archival format and I need one where it's easy to unpack arbitrary files without unpacking the whole thing.

My clever plan to avoid having to implement my own structured data store for the automation scripting is "I already need filesystem and/or archive manipulation routines, and both of those are hierarchical key-value stores when you think about it."

(ie. Similar to Mojosetup for Linux, I want to have a "scripts" folder in the self-extractor. However, here, to save on startup time, memory, and/or temp space on disk, the self-extractor stub will decompress the individual pseudo-batch files on demand.)

To get the kind of random access I want, I'd need to individually gzip each file I wanted to be able to extract individually, put them inside an UNcompressed tar file, and then build a seeking index comparable to a Zip file's central directory on startup.

That's not a standard or familiar way to do it and the whole point of using Zip is to make it easy to build an installer using minimal custom tooling.
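To make the tar workaround concrete, here is a minimal Python sketch using only the standard tarfile and gzip modules. It is an illustration of the idea, not code from this project: the helper names (build_demo_tar, build_index, extract_one) and the ".gz" member naming are assumptions invented for the example.

```python
import gzip
import io
import tarfile

def build_demo_tar(files):
    """Pack {name: bytes} into an UNcompressed tar of gzipped members."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode='w') as tar:
        for name, payload in files.items():
            packed = gzip.compress(payload)
            info = tarfile.TarInfo(name + '.gz')
            info.size = len(packed)
            tar.addfile(info, io.BytesIO(packed))
    return buf.getvalue()

def build_index(tar_bytes):
    """One linear pass at startup: member name -> (data offset, size)."""
    index = {}
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode='r:') as tar:
        for info in tar:
            index[info.name] = (info.offset_data, info.size)
    return index

def extract_one(tar_bytes, index, name):
    """Seek straight to one member and gunzip only that member."""
    offset, size = index[name]
    return gzip.decompress(tar_bytes[offset:offset + size])

tar_bytes = build_demo_tar({
    'scripts/a.bat': b'echo a\r\n',
    'scripts/b.bat': b'echo b\r\n',
})
index = build_index(tar_bytes)
print(extract_one(tar_bytes, index, 'scripts/b.bat.gz'))
```

The index has to be rebuilt by walking every header on each startup, which is exactly the work a Zip central directory already does for you.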

(With my current concept, all you need is to write your scripting using any text editor with batch file syntax highlighting, zip everything up using the default "maximum compatibility" options, and either specify that the installer stub be used as a custom SFX stub or, if that's not an option, concatenate them together and then use `zip -A` from Info-ZIP to fix up the Zip file's offsets.)
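As a rough sketch of what a Store-only reader for such a stub-plus-Zip image might look like, here is a Python version that parses the records by hand. The record layouts follow PKWARE's APPNOTE; the function names, the fake "MZ" stub, and the file names are assumptions made up for the example, and the standard zipfile module is used only to build the demo archive.

```python
import io
import struct
import zipfile  # used only to build the demo archive

# Fixed-size record layouts from PKWARE's APPNOTE (all little-endian).
EOCD = struct.Struct('<4s4H2LH')     # end of central directory, 22 bytes
CDFH = struct.Struct('<4s6H3L5H2L')  # central directory file header, 46 bytes
LFH = struct.Struct('<4s5H3L2H')     # local file header, 30 bytes

def read_stored_member(image, wanted):
    """Return the bytes of one Store-method member of a stub+Zip image."""
    eocd_pos = image.rfind(b'PK\x05\x06')
    if eocd_pos < 0:
        raise ValueError('no end-of-central-directory record found')
    fields = EOCD.unpack_from(image, eocd_pos)
    total, cd_size, cd_offset = fields[4], fields[5], fields[6]
    # If the stub was prepended without fixing offsets, everything is
    # shifted by the stub length; with fixed offsets this works out to 0.
    base = eocd_pos - cd_size - cd_offset
    pos = base + cd_offset
    for _ in range(total):
        entry = CDFH.unpack_from(image, pos)
        method, csize = entry[4], entry[8]
        name_len, extra_len, comment_len = entry[10], entry[11], entry[12]
        lfh_offset = entry[16]
        name = image[pos + CDFH.size:pos + CDFH.size + name_len]
        if name == wanted:
            if method != 0:
                raise ValueError('member is not Store-compressed')
            # The local header has its own name/extra lengths to skip.
            local = LFH.unpack_from(image, base + lfh_offset)
            start = base + lfh_offset + LFH.size + local[9] + local[10]
            return image[start:start + csize]
        pos += CDFH.size + name_len + extra_len + comment_len
    raise KeyError(wanted)

def make_image(stub, files):
    """Concatenate a fake stub and a Store-only Zip built in memory."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, 'w', zipfile.ZIP_STORED) as zf:
        for name, payload in files.items():
            zf.writestr(name, payload)
    return stub + buf.getvalue()

image = make_image(b'MZ' + b'\x00' * 510, {
    'scripts/main.bat': b'echo hello\r\n',
    'data/readme.txt': b'hi',
})
print(read_stored_member(image, b'scripts/main.bat'))
```

Searching backwards for the end-of-central-directory signature is a simplification; it would be fooled by an archive comment that happens to contain the same four bytes.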

> I guess BOOZ (ZOO) might also be interesting?
>
> * https://www.sac.sk/download/pack/booz20.zip

Also a nice option, but ZOO is an even more esoteric format these days so, again, I'd still like to try for Zip first.

S. Sokolow
May 23, 2019, 6:30:27 AM
On Wednesday, May 22, 2019 at 8:14:28 AM UTC, Kerr-Mudd,John wrote:
> a FREEDOS program, TUNZ is 2.5k; it's on sourceforge.
> Might be worth a look. (I can't get to SF ATM;"sourceforge.net uses an
> unsupported protocol.")

I found mentions of it on SourceForge-hosted mailing list archives, but I had to pull TUNZ itself out of the Wayback Machine.

However, according to discussion here...

https://sourceforge.net/p/freedos/mailman/freedos-user/?viewmonth=201101

...it may only support Zip 1.0 and/or not like extra metadata fields written by Info-ZIP, and the source was never released, which means I might as well just use Info-ZIP's SFX stub as-is.

This nabble view of some of those messages includes the same quoted passages:

http://freedos.10956.n7.nabble.com/Re-tunz-problems-in-freedos-td7448.html

Also, there's apparently some machine code shared between it and PKUNZJR which calls into question whether it was written from scratch or not.

(I say "calls into question" because, when you're writing something that compact, I could believe they might independently hand-optimize their assembly language instructions to the same thing.)

Still, thanks for the effort.

rug...@gmail.com
May 27, 2019, 9:59:55 PM
Hi,

On Thursday, May 23, 2019 at 5:07:48 AM UTC-5, S. Sokolow wrote:
> On Wednesday, May 22, 2019 at 5:58:02 AM UTC, rug...@gmail.com wrote:
> >
> > Would this help? (Probably not, but ....)
> >
> > * ftp://ftp.freepascal.org/pub/fpc/contrib/zfs210.zip
>
> The attributions say it ported the same LGPLed Info-ZIP decompression
> code that I was planning to use.
>
> Aside from that, it's technically illegal to use it for *anything*
> because the author added additional requirements beyond the terms
> of the LGPL license that came with the Info-ZIP routines, which
> means you have to satisfy both those additional requirements and
> the clause in the LGPL which says you are forbidden from doing so.

Eek. Normally I'm careful about licensing (which is always a mess,
and most people are very sloppy), sorry!

Here's an interesting Ada library, if that somehow helps you:

* https://unzip-ada.sourceforge.io/

But I haven't looked closely, so I only vaguely know about it.
(That guy did use it to read from .ZIP demo data for his 3D engine.)

> > Take a look at UNTAR (also does .tar.gz), it's fairly small:
> >
> > * http://www.ibiblio.org/pub/micro/pc-stuff/freedos/files/util/unix/tar/
>
> That'd be perfect except that tar is a streaming archival format
> and I need one where it's easy to unpack arbitrary files without
> unpacking the whole thing.
>
> My clever plan to avoid having to implement my own structured data
> store for the automation scripting is "I already need filesystem
> and/or archive manipulation routines, and both of those are
> hierarchical key-value stores when you think about it."
>
> (ie. Similar to Mojosetup for Linux, I want to have a "scripts"
> folder in the self-extractor. However, here, to save on startup
> time, memory, and/or temp space on disk, the self-extractor stub
> will decompress the individual pseudo-batch files on demand.)
>
> To get the kind of random access I want, I'd need to individually
> gzip each file I wanted to be able to extract individually, put them
> inside an UNcompressed tar file, and then build a seeking index
> comparable to a Zip file's central directory on startup.

Sounds overly complex. Reminds me of this (bfs), although again,
I've not looked closely:

* http://www.jasspa.com/zeroinst.html

> That's not a standard or familiar way to do it and the whole point
> of using Zip is to make it easy to build an installer using minimal
> custom tooling.

I don't think ZIP is as universal as you imply. There is "appnote"
for detailing the format, but overall you're probably still relying
on many incompatible subsets from different vendors (Windows Explorer,
Total Commander, Info-Zip, 7-Zip, etc). It's not THAT bad, hopefully,
but I'd be very selective and test before assuming much of anything.

* https://support.pkware.com/display/PKZIP/APPNOTE
* https://pkware.cachefly.net/webdocs/casestudies/APPNOTE.TXT

> > I guess BOOZ (ZOO) might also be interesting?
> >
> > * https://www.sac.sk/download/pack/booz20.zip
>
> Also a nice option, but ZOO is an even more esoteric format
> these days so, again, I'd still like to try for Zip first.

Well, it's public domain, IIRC. Even Red Hat (Fedora?) used to have
ZOO available somewhere, a few years ago, but I forget exactly.
Yeah, not as popular as ZIP, but it's still "something", if everything
else fails you.

Hmmm, a quick search only shows "unzoo":

* http://archives.math.utk.edu/software/multi-platform/gap/util/

Debian Stretch (9) seems to have "zoo" and also points to iBiblio:

* https://packages.debian.org/stretch/zoo
* http://www.ibiblio.org/pub/packages/ccic/software/unix/utils/

Hmmm, that's old stuff, too. You can find lots of archiver stuff
(e.g. UnARJ with C sources) at Sac.sk:

* https://www.sac.sk/download/pack/unarj265.exe

I don't think AR002 is free/libre, but IIRC, that's the origin of
some of these old archivers (ZOO, ARJ, LHA), but I could be
wrong (vaguely):

* https://www.sac.sk/download/pack/ar002.zip

Just FYI.

rug...@gmail.com
May 27, 2019, 10:11:35 PM
Hi again,

I'm somewhat confused on how TUNZ or PKUNZJR would help you.
Yes, some of the old messages you refer to were by me, but I
am no expert. Still, I do remember some bits about various old
DOS archivers. But again, I don't know if any of that will help
you much (if at all)!

On Thursday, May 23, 2019 at 5:30:27 AM UTC-5, S. Sokolow wrote:
>
> Still, thanks for the effort.

For small extractors, I can vaguely remember LHE and ARCE:

* https://www.sac.sk/download/pack/lhe101.exe
* https://www.sac.sk/download/pack/arce41a.zip

Neither has sources (IIRC) and probably aren't freeware anyways
(maybe shareware). Probably not super useful.

There was LHA for *nix a few years ago, in Debian, that one guy
compiled for DJGPP, but ....

* http://www.ibiblio.org/pub/micro/pc-stuff/freedos/files/distributions/1.2/repos/pkg-html/group-archiver.html

Actually, "7zdec" might be useful to you, too. Ah, never mind,
that's 386 pmode only (C for DJGPP).

Again, I'm no expert! But I also remember the obscure ESP archiver
for DOS (and the C sources of UnESP for *nix):

* https://www.sac.sk/download/pack/unesp.tgz

At this point, I guess you can figure it out for yourself. Sorry
I'm not much more help. It's just lots of little pieces. Sometimes
it's useful to reuse others' code, but often you just gotta cobble
together your own (as I'm sure you know).

rug...@gmail.com
May 27, 2019, 10:18:57 PM
Hi,

On Monday, May 27, 2019 at 9:11:35 PM UTC-5, rug...@gmail.com wrote:
>
> Neither has sources (IIRC) and probably aren't freeware anyways
> (maybe shareware). Probably not super useful.

Well, there's also DEARC (Turbo Pascal), MDCD (TP + ASM), and a
tiny compressor-only called Kaboom! (MASM v6):

* http://www.vectorbd.com/bfd/pascal/dearc40.lzh
* https://www.sac.sk/download/pack/mdcd10.zip

* https://www.sac.sk/download/pack/kboom11.zip
* https://board.flatassembler.net/topic.php?t=21062

(The latter is just the fasmg guy porting Kaboom! to that.
You could also just use JWasm + a linker. Probably not ideal
code, no archiving, but it's fairly small, pure asm.)

Again, I'm grasping at straws here, trying to point you in
some right direction (barely). Only you know what you're
looking for exactly.

S. Sokolow
May 27, 2019, 11:56:49 PM
On Tuesday, May 28, 2019 at 1:59:55 AM UTC, rug...@gmail.com wrote:
> Here's an interesting Ada library, if that somehow helps you:
>
> * https://unzip-ada.sourceforge.io/
>
> But I haven't looked closely, so I only vaguely know about it.
> (That guy did use it to read from .ZIP demo data for his 3D engine.)

Well, it claims to be MIT licensed and has a substantial amount of code, so it's a good candidate for something that could be ported to C or, at the very least, used as an alternative explanation of what the Zip spec is actually saying.

Of course, it's always possible that, when I look more closely, I'll discover that it's a mechanical translation of LGPLed Info-ZIP code and the author thought that was sufficient to make something "not a derived work" in the eyes of the law.

> Sounds overly complex. Reminds me of this (bfs), although again,
> I've not looked closely:

Implementing it in a streaming archive format like TAR is overly complex. For Zip, it'd just be a matter of using the Zip reading routines in place of fopen()/fread().

I'd basically be reimplementing the Python zipimport mechanism used by tools such as py2exe to load source files from a zip file concatenated onto the interpreter on demand.
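As a rough illustration of that substitution, here is a minimal Python sketch using the standard zipfile module rather than any C routines from this project. The helper names make_installer and script_lines, the 64-byte fake stub, and the scripts/main.bat path are all assumptions made up for the example.

```python
import io
import zipfile

def make_installer(stub, files):
    """Concatenate a fake executable stub and a Zip built in memory."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, 'w', zipfile.ZIP_DEFLATED) as zf:
        for name, payload in files.items():
            zf.writestr(name, payload)
    return stub + buf.getvalue()

def script_lines(installer_bytes, name):
    """Read one script on demand, the way open()/readline() would on disk."""
    with zipfile.ZipFile(io.BytesIO(installer_bytes)) as zf:
        # ZipFile.open() returns a file-like object for a single member;
        # nothing else in the archive gets decompressed.
        with zf.open(name) as script:
            return [line.rstrip(b'\r\n') for line in script]

installer = make_installer(b'MZ' + b'\x00' * 62, {
    'scripts/main.bat': b'echo one\r\necho two\r\n',
    'data/big.bin': b'\x00' * 4096,
})
for line in script_lines(installer, 'scripts/main.bat'):
    print(line.decode('ascii'))
```

Python's zipfile tolerates data prepended to the archive, which is the same property that lets a Zip file sit behind an interpreter or SFX stub.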

bfs was interesting to read about but, as described, it sounds like they did the same thing as zipimport, but cooked up a custom archive format because they didn't think Zip performed well enough... something that leaves me wondering how long ago it was developed, since that opinion reminds me of the original motivation for the .C/.H split in C.

> I don't think ZIP is as universal as you imply. There is "appnote"
> for detailing the format, but overall you're probably still relying
> on many incompatible subsets from different vendors (Windows Explorer,
> Total Commander, Info-Zip, 7-Zip, etc). It's not THAT bad, hopefully,
> but I'd be very selective and test before assuming much of anything.

It's true that Zip does have a lot of extensions and optional features, but they're not necessary in this case.

Zip compression tools generally default to a combination of settings designed to be extractable with a specific 2.x version of PKZip for DOS from the 1990s, because Zip's big competitive advantage against things like ARJ, ARC, ZOO, RAR, 7-Zip, etc. is that everyone has a working unpacker for it.

For example:

* Compressed Folders for Windows breaks if you try to use it to extract files from a Zip file bigger than 4GiB, like the archives from the Fanfiction.net dump over on Archive.org.

* Info-ZIP supports ZIP64 and big zip files, but still doesn't support non-Deflate compression methods like bzip2. You need to use 7-Zip or p7zip for those.

* If I'm remembering correctly, Info-ZIP intentionally doesn't implement support for creating archives with PKZIP's archaic, weak DOS-era password-protection.

While I'll still definitely want to test against archives created by various tools, my goal is just to support that default compatibility profile in the form that makes sense for DOS installers created by a user zipping up a folder full of files with a non-standard but not custom-compiled SFX stub.

(So no large file support, no password-protection, only supporting Store or Deflate compression, etc.)
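A profile like that is easy to check mechanically. The following is a minimal Python sketch using the standard zipfile module; the profile_problems helper and its messages are assumptions invented for illustration, not part of any existing tool.

```python
import io
import zipfile

ALLOWED_METHODS = (zipfile.ZIP_STORED, zipfile.ZIP_DEFLATED)
ENCRYPTED_FLAG = 0x1      # general-purpose bit 0 marks an encrypted member
ZIP32_LIMIT = 0xFFFFFFFF  # sizes at or above this need ZIP64 fields

def profile_problems(zip_bytes):
    """List (member name, reason) pairs that fall outside the profile."""
    problems = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        for info in zf.infolist():
            if info.compress_type not in ALLOWED_METHODS:
                problems.append((info.filename, 'not Store or Deflate'))
            if info.flag_bits & ENCRYPTED_FLAG:
                problems.append((info.filename, 'password-protected'))
            if (info.file_size >= ZIP32_LIMIT
                    or info.compress_size >= ZIP32_LIMIT):
                problems.append((info.filename, 'needs ZIP64'))
    return problems

def make_zip(method):
    """Build a one-member archive in memory with the given method."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, 'w', method) as zf:
        zf.writestr('scripts/main.bat', b'echo hello\r\n' * 50)
    return buf.getvalue()

print(profile_problems(make_zip(zipfile.ZIP_DEFLATED)))
print(profile_problems(make_zip(zipfile.ZIP_BZIP2)))
```

Running a check like this over archives produced by each Zip tool of interest is one cheap way to see which defaults stay inside the profile.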

> Well, it's public domain, IIRC. Even Red Hat (Fedora?) used to have
> ZOO available somewhere, a few years ago, but I forget exactly.
> Yeah, not as popular as ZIP, but it's still "something", if everything
> else fails you.

I've got /usr/bin/zoo installed on my Kubuntu machine. It's more the Windows users that I'm worried about. Neither Compressed Folders nor 7-zip nor WinRAR can unpack ZOO archives, let alone create them.


> Hmmm, that's old stuff, too. You can find lots of archiver stuff
> (e.g. UnARJ with C sources) at Sac.sk:

It's also where I found one of the two barebones, freeware, not-single-file DOS installer TUIs that I eventually managed to track down, both of which expect to be sat in front of a self-extractor or copy of a tool like PKZIP.

https://twitter.com/deitarion/status/1116836145197912064

The other came from http://www.retroarchive.org/cdrom/index.html

> * https://www.sac.sk/download/pack/unarj265.exe

Hmm. The terms given in UNARJ.TXT would seem to set up an ad-hoc "do what you want with the source, but leave our names intact in the code and any documentation you retain" license.

That's another option if Zip doesn't work out and it's certainly better-known than ZOO... especially in the DOS retro-hobby world.

> I don't think AR002 is free/libre, but IIRC, that's the origin of
> some of these old archivers (ZOO, ARJ, LHA), but I could be
> wrong (vaguely):
>
> * https://www.sac.sk/download/pack/ar002.zip
>

Not sure about the chain of heredity, but AR002 includes sources and is under "You may copy, distribute, and rewrite this program freely." terms according to the usage message it prints.

It's also an interesting option as a last resort, since it's not only under those terms, but explicitly targets Turbo C with the compact memory model.

On Tuesday, May 28, 2019 at 2:11:35 AM UTC, rug...@gmail.com wrote:
> Hi again,
>
> I'm somewhat confused on how TUNZ or PKUNZJR would help you.
> Yes, some of the old messages you refer to were by me, but I
> am no expert. Still, I do remember some bits about various old
> DOS archivers. But again, I don't know if any of that will help
> you much (if at all)!

While it's true that I want the versatility and support for single-file operation that come from interfacing directly with the extraction code, it'd still be possible to build a worthwhile multi-file DOS installer TUI by shelling out to a tool like TUNZ, PKUNZJR, or a self-extractor.

(Like the two freeware ones I mentioned... though they left a lot of room for more features.)

> For small extractors, I can vaguely remember LHE and ARCE:
>
> * https://www.sac.sk/download/pack/lhe101.exe
> * https://www.sac.sk/download/pack/arce41a.zip
>
> Neither has sources (IIRC) and probably aren't freeware anyways
> (maybe shareware). Probably not super useful.

Yeah. They're really not suited to the task.

ARC-E is freeware, but the freeware license requires that you distribute the additional 4.9KiB documentation file along with the 4.0KiB executable.

LHE is freeware but must be distributed in the form of that lhe101.exe distribution archive unless permission is received from the authors.

> At this point, I guess you can figure it out for yourself. Sorry
> I'm not much more help. It's just lots of little pieces. Sometimes
> it's useful to reuse others' code, but often you just gotta cobble
> together your own (as I'm sure you know).

Still, I appreciate you mentioning them. Given how hit-or-miss it can be to search for DOS-era stuff, every new mention of what these files are is worthwhile.

On Tuesday, May 28, 2019 at 2:18:57 AM UTC, rug...@gmail.com wrote:
> Again, I'm grasping at straws here, trying to point you in
> some right direction (barely). Only you know what you're
> looking for exactly.

Even if I don't use them in this project, the things you linked to are going in my list of resources for future projects.

Kerr-Mudd,John
May 28, 2019, 6:46:58 AM
On Tue, 28 May 2019 03:56:47 GMT, "S. Sokolow"
<stephan...@gmail.com> wrote:

> On Tuesday, May 28, 2019 at 1:59:55 AM UTC, rug...@gmail.com wrote:
[]

>> ZOO available somewhere, a few years ago, but I forget exactly.
>> Yeah, not as popular as ZIP, but it's still "something", if
>> everything else fails you.
>
> I've got /usr/bin/zoo installed on my Kubuntu machine. It's more the
> Windows users that I'm worried about. Neither Compressed Folders nor
> 7-zip nor WinRAR can unpack ZOO archives, let alone create them.
>

https://en.wikipedia.org/wiki/Zoo_%28file_format%29
seems to be a link to src code to create a DOS "unzoo";
but sadly the exe is 46k!
http://archives.math.utk.edu/software/multi-platform/gap/util/unzoo.exe
[]

rug...@gmail.com
May 28, 2019, 7:08:13 PM
Hi,

On Tuesday, May 28, 2019 at 5:46:58 AM UTC-5, Kerr-Mudd,John wrote:
> On Tue, 28 May 2019 03:56:47 GMT, "S. Sokolow"
> <stephan...@hates.spam> wrote:
>
> > On Tuesday, May 28, 2019 at 1:59:55 AM UTC, rug...@gmail.com wrote:
>
> >> ZOO available somewhere, a few years ago, but I forget exactly.
> >> Yeah, not as popular as ZIP, but it's still "something", if
> >> everything else fails you.
> >
> > I've got /usr/bin/zoo installed on my Kubuntu machine. It's more the
> > Windows users that I'm worried about. Neither Compressed Folders nor
> > 7-zip nor WinRAR can unpack ZOO archives, let alone create them.
> >
>
> https://en.wikipedia.org/wiki/Zoo_%28file_format%29
> seems to be a link to src code to create a DOS "unzoo";

The source code is old (circa 2000, apparently), a much cleaned-up
version of the BOOZ code (with a few extra features). It still refers to
DJGPP v1 (go32.exe) !! Haven't tried compiling it, but it'd
be cool if it worked with DJGPP. But that's 386+ only, not what
he wants here.
There are two .EXEs, both are Win32/PE, old one for Cygwin and newer
for MinGW. 46 kb is on the smaller side for decompressors. The DJGPP
build will almost definitely be bigger (esp. with libc 2.04 or 2.05
with symlink support). Though there is partial, experimental COFF
support for --gc-sections , which does sometimes work, but that's
not saving too much. DJGPP (esp. libc) was never optimized well
for small size. But we love it anyways. (UPX helps with DJGPP but
not too much. You'd have to check the linker map and manually
work around it accordingly.)

No idea if the code is 16-bit clean, but probably not. Lots of stuff
was never designed to work with 16-bit cpus (e.g. Bzip2 or LZMA).
Maybe some genius could use EMS or swap to disk, but overall you're
out of luck. Deflate is good because it only needs a 32 KB dictionary.
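
For a sense of scale, here is a minimal sketch of where that 32 KB figure comes from, using Python's standard zlib module purely for illustration (the installer itself would be C). The helper names and the sample payload are made up for the example; only the zlib calls are real.

```python
import zlib

# Deflate back-references reach at most 2**15 bytes back, so a decoder
# needs a 32 KiB history window no matter how large the file is.
WINDOW_BYTES = 1 << zlib.MAX_WBITS

def raw_deflate(data: bytes) -> bytes:
    """Compress to a raw Deflate stream (no zlib header), as stored in ZIP."""
    comp = zlib.compressobj(level=9, method=zlib.DEFLATED,
                            wbits=-zlib.MAX_WBITS)
    return comp.compress(data) + comp.flush()

def raw_inflate(blob: bytes) -> bytes:
    """Decompress a raw Deflate stream using the maximum 32 KiB window."""
    return zlib.decompress(blob, wbits=-zlib.MAX_WBITS)

sample = b"INSTALL.EXE " * 10000   # hypothetical test payload
packed = raw_deflate(sample)
restored = raw_inflate(packed)
```

The negative wbits value asks zlib for a headerless stream, which is the form ZIP members use.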

rug...@gmail.com

May 28, 2019, 7:43:32 PM
Hi,

On Monday, May 27, 2019 at 10:56:49 PM UTC-5, S. Sokolow wrote:
> On Tuesday, May 28, 2019 at 1:59:55 AM UTC, rug...@gmail.com wrote:
>
> > I don't think ZIP is as universal as you imply. There is "appnote"
> > for detailing the format, but overall you're probably still relying
> > on many incompatible subsets from different vendors (Windows Explorer,
> > Total Commander, Info-Zip, 7-Zip, etc). It's not THAT bad, hopefully,
> > but I'd be very selective and test before assuming much of anything.
>
> It's true that Zip does have a lot of extensions and optional features,
> but they're not necessary in this case.

There were two competing encryption formats at one point. I think
somebody even created ZIPX. It's quite a mess, maybe too complicated
at this point. Just saying "ZIP is universal" ignores a lot of corner
cases.

Also, a lot of "modern" extensions came with, say, Vista's Explorer.
So XP wasn't as good. (I think Windows support first debuted in ME?)

7-Zip ignores EOS (end of stream) markers. KZIP had some rare, weird
quirks. Obviously Info-Zip behaves slightly differently to PKZIP
proper (which??). Gotta watch out for extra fields and various
compression methods, filename case or Unicode, path, timestamps,
whatever.

> Zip compression tools generally default to a combination of settings
> which is designed to be extractable with a specific PKZip for DOS 2.x
> version from the 1990s because Zip's big competitive advantage against
> things like ARJ, ARC, ZOO, RAR, Z-Zip, etc. is that everyone has a
> working unpacker for it.

I hate to pretend to know everything or pretend that you really have
to be super careful because of a thousand reasons, but it really is
a minefield.

Unpackers were usually free (or at least shareware), even PKUNZIP.
No, I'm not sure exactly why ZIP replaced everything else. Others
were relatively good, too.

PKZIP 2.04g was the popular one, but 2.50 was also very popular.
The latter had (undocumented, unpacking) Deflate64 support, some
rare extra features ("-exx"), and LFN support for Win9x. (I want
to say it was last updated in 1999. Newer versions weren't for
DOS.)

> For example:
>
> * Compressed Folders for Windows breaks if you try to use it to
> extract files from a Zip file bigger than 4GiB, like the archives
> from the Fanfiction.net dump over on Archive.org.

Which Windows? BTW, I hear NTVDM is still broken on Win10 (32-bit).

> * Info-ZIP supports ZIP64 and big zip files, but still doesn't
> support non-Deflate compression methods like bzip2. You need to
> use 7-Zip or p7zip for those.

Unzip 6.00 had Bzip2 support.

7-Zip's 7z [sic] had plugins for various formats beyond the normal
7za tool, but some of those (e.g. UnRAR) aren't free/libre. Yes,
one guy I know is redistributing some .ZIPs using LZMA method
(made by 7-Zip) since it compresses better. (FDNPKG supports that
but usually avoids recommending it due to higher memory requirements.)

Actually, Unrarlib 2.x was free/libre, so maybe you could adapt that.
RAR 2.50 (1999?) was the last 16-bit DOS version, and RARX 3.93 (2010)
was the last 32-bit (dual DOS + OS/2) EMX version.

* https://www.ibiblio.org/pub/micro/pc-stuff/freedos/files/util/file/unrar/v2/

> * If I'm remembering correctly, Info-ZIP intentionally doesn't
> implement support for creating archives with PKZIP's archaic,
> weak DOS-era password-protection.

Info-Zip's Zip "-e" is encrypt. Yes, it's pointless and can be broken.
No, I'm no hacker/cracker, so who cares. But yes, there are several
tools for that (not to mention GPUs, supercomputers, SMP, or whatever).

Here's pretty much all I know about that (one-time curiosity for
innocent reasons ... yes, people use passwords to avoid false
heuristics from bad/broken antiviruses, ugh):

* http://www.bttr-software.de/forum/board_entry.php?id=7492#p11871

> While I'll still definitely want to test against archives created
> by various tools, my goal is just to support that default
> compatibility profile in the form that makes sense for DOS
> installers created by a user zipping up a folder full of files
> with a non-standard but not custom-compiled SFX stub.
>
> (So no large file support, no password-protection, only supporting
> Store or Deflate compression, etc.)

I don't mean you have to (or should) support everything. Just pick
your tools carefully. Info-Zip is reasonable, so is 7-Zip. Things
that are widely available and used (and free/libre) should be preferred.

rug...@gmail.com

May 28, 2019, 7:59:50 PM
Hi again,

On Tuesday, May 28, 2019 at 6:43:32 PM UTC-5, rug...@gmail.com wrote:
> On Monday, May 27, 2019 at 10:56:49 PM UTC-5, S. Sokolow wrote:
>
> > * Info-ZIP supports ZIP64 and big zip files, but still doesn't
> > support non-Deflate compression methods like bzip2. You need to
> > use 7-Zip or p7zip for those.
>
> Unzip 6.00 had Bzip2 support.

Maybe your Linux build doesn't have it, but the default Windows
.EXE supports it (slightly trimmed useless other info):

UnZip 6.00 of 20 April 2009, by Info-ZIP.

Compiled with Microsoft C 13.10 (Visual C++ 7.1) for
Windows 9x / Windows NT/2K/XP/2K3 (32-bit) on Apr 20 2009.

UnZip special compilation options:
USE_BZIP2 (PKZIP 4.6+, using bzip2 lib version 1.0.5, 10-Dec-2007)

IIRC, Bzip2 uses a block size of up to 900 kB ("-9"), its rough
equivalent of a dictionary, even though it's only a compressor, not an
archiver. One of my pet peeves was that it would not auto-detect and use
a smaller block size for smaller files.
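
A small sketch of that behaviour, assuming Python's standard bz2 module behaves like the command-line tool here (it wraps the same library). The fourth byte of a bzip2 stream is the ASCII digit of the level, in units of 100 kB; `block_size_digit` is a hypothetical helper name.

```python
import bz2

def block_size_digit(stream: bytes) -> int:
    """Return N from the 'BZhN' header; the block size is N * 100 kB."""
    if stream[:3] != b"BZh":
        raise ValueError("not a bzip2 stream")
    return stream[3] - ord("0")

tiny = b"hello"                               # far smaller than any block
fast = bz2.compress(tiny, compresslevel=1)    # header says 100 kB blocks
best = bz2.compress(tiny, compresslevel=9)    # header still says 900 kB
```

Even for a five-byte input the level 9 stream advertises the largest block size, and as far as I know that header value is what a decompressor sizes its buffers from.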

Anyways, if you really want to know more about compressors, try these
sites. I've never studied any theory nor written any compressors
from scratch, so I know basically nothing. But some of them are
geniuses:

* http://www.mattmahoney.net/dc/
* https://encode.ru/forum.php/

S. Sokolow

May 29, 2019, 12:43:11 AM
On Tuesday, May 28, 2019 at 11:43:32 PM UTC, rug...@gmail.com wrote:
> Hi,
>
> There were two competing encryption formats at one point. I think
> somebody even created ZIPX. It's quite a mess, maybe too complicated
> at this point. Just saying "ZIP is universal" ignores a lot of corner
> cases.

True, but I'm not saying "ZIP is universal". I'm saying that one can define a subset of ZIP which can be trusted to be easy for anyone to extract with the tools they already have.

Goal #1 is for people to be able to unpack these single-file installers and test their CRCs without running them, for software preservation reasons.

It's Goal #2 that is "Make it as easy as possible for people to build new installers" and that's where making my extractor sufficiently permissive comes in.

For example, since this is a DOS installer project, I'd probably start from what PKZIP 2.04g understands and then extend it with whatever is necessary to make it get along with test files produced by popular archival tools.

If I can figure out the set of characteristics which everyone can extract, then make my own installer only support that (with easily understandable error messages for what it doesn't support), it becomes reasonably easy for people to build archives for it, and any archive it supports is guaranteed to satisfy the requirements for everyone to be able to unpack it easily.
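
As a rough illustration of what checking an archive against such a subset could look like, here is a minimal sketch using the zipfile module from Python's standard library. The profile itself (Store or Deflate only, no encryption, no ZIP64, version needed to extract no higher than 2.0) is an assumption pieced together from this thread, not a tested definition, and `profile_violations` is a made-up name.

```python
import zipfile

ALLOWED_METHODS = {zipfile.ZIP_STORED, zipfile.ZIP_DEFLATED}
MAX_VERSION_NEEDED = 20    # "2.0", assumed to be the PKZIP 2.04g level
FLAG_ENCRYPTED = 0x0001    # general-purpose flag bit 0
ZIP32_LIMIT = 0xFFFFFFFF   # sizes at or above this require ZIP64

def profile_violations(archive) -> list:
    """Return human-readable reasons the archive falls outside the subset."""
    problems = []
    with zipfile.ZipFile(archive) as zf:
        for info in zf.infolist():
            name = info.filename
            if info.compress_type not in ALLOWED_METHODS:
                problems.append(
                    f"{name}: unsupported method {info.compress_type}")
            if info.flag_bits & FLAG_ENCRYPTED:
                problems.append(f"{name}: password-protected")
            if info.extract_version > MAX_VERSION_NEEDED:
                problems.append(
                    f"{name}: needs version {info.extract_version / 10:.1f}")
            if max(info.file_size, info.compress_size) >= ZIP32_LIMIT:
                problems.append(f"{name}: too large without ZIP64")
    return problems
```

An empty list means every member fits the assumed profile; anything else doubles as the kind of understandable error message the installer would want to show.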

>
> Also, a lot of "modern" extensions came with, say, Vista's Explorer.
> So XP wasn't as good. (I think Windows support first debuted in ME?)

Compressed Folders was introduced in the Windows 98 Plus! extension pack. I remember having it on Windows 98 SE.

>
> 7-Zip ignores EOS (end of stream) markers. KZIP had some rare, weird
> quirks. Obviously Info-Zip behaves slightly differently to PKZIP
> proper (which??). Gotta watch out for extra fields and various
> compression methods, filename case or Unicode, path, timestamps,
> whatever.
>
> > Zip compression tools generally default to a combination of settings
> > which is designed to be extractable with a specific PKZip for DOS 2.x
> > version from the 1990s because Zip's big competitive advantage against
> > things like ARJ, ARC, ZOO, RAR, Z-Zip, etc. is that everyone has a
> > working unpacker for it.
>
> I hate to pretend to know everything or pretend that you really have
> to be super careful because of a thousand reasons, but it really is
> a minefield.

*nod* Hence the need to generate a bunch of test files from various programs and with various combinations of options.

For example, I'm in the middle of generating a bunch of test files for a recursive corruption check helper script I'm putting together to aid in finally getting some really old archives off my hard drives and onto dvdisaster-augmented DVD+Rs. Of course, in this case, it's a matter of generating compact files that still contain all the boilerplate for the format, and then making tweaked copies in a hex editor to simulate cosmic-ray bit-flips and other potential forms of corruption.

(Once I finish with the basic functional tests to ensure that my wrapper reports success and simple CRC errors properly, I think I might set up a fuzzer to try to generate corrupt files that still pass the checks.)
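
A minimal sketch of that kind of functional test, assuming Python's zipfile module stands in for the checker being wrapped. The file name, payload, and helper names are invented for the example. It builds a tiny stored archive, flips one bit inside the member data, and confirms that the CRC check notices.

```python
import io
import zipfile

def make_stored_zip(name: str, payload: bytes) -> bytes:
    """Build a one-member, uncompressed ZIP archive in memory."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_STORED) as zf:
        zf.writestr(name, payload)
    return buf.getvalue()

def flip_bit(blob: bytes, byte_offset: int, bit: int = 0) -> bytes:
    """Return a copy of blob with a single bit inverted."""
    damaged = bytearray(blob)
    damaged[byte_offset] ^= 1 << bit
    return bytes(damaged)

def first_bad_member(blob: bytes):
    """Name of the first member failing its CRC check, or None if all pass."""
    with zipfile.ZipFile(io.BytesIO(blob)) as zf:
        return zf.testzip()

payload = b"README contents for the corruption test\n"
clean = make_stored_zip("README.TXT", payload)
# The member is stored uncompressed, so the payload can be located directly.
damaged = flip_bit(clean, clean.index(payload) + 5)
```

Flipping bits in the headers or the central directory instead of the data exercises different failure paths, which is where a fuzzer would earn its keep.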

> Unpackers were usually free (or at least shareware), even PKUNZIP.
> No, I'm not sure exactly why ZIP replaced everything else. Others
> were relatively good, too.

*nod* As I remember, LHA had much less primitive support for split archives, which is why id Software used it for their installers.

It also became so dominant in Japan that the Japanese version of Compressed Folders supports it. (For Windows 7 Enterprise and Ultimate, you can gain .lzh support by installing the Japanese localization pack.)

> Which Windows? BTW, I hear NTVDM is still broken on Win10 (32-bit).

I'm not sure.

Windows 7 at least, because I run an archive of fanfiction that's lost its home and, during the Windows 7 era, I told a lot of people who asked me for help finding stories that the "corruption" they encountered was because Microsoft still hadn't fixed Compressed Folders to work with archives that large.

> Unzip 6.00 had Bzip2 support.

Huh. You're right. My November 9th, 2015 build of UnZip 6.00 does list Bzip2 support. I must've mis-identified the feature in that Zip file that p7zip supported but Info-ZIP didn't.

> Actually, Unrarlib 2.x was free/libre, so maybe you could adapt that.
> RAR 2.50 (1999?) was the last 16-bit DOS version, and RARX 3.93 (2010)
> was last 32-bit (dual DOS + OS/2) EMX version.

*nod* It's another possibility if Zip doesn't pan out.

>
> Info-Zip's Zip "-e" is encrypt. Yes, it's pointless and can be broken.
> No, I'm no hacker/cracker, so who cares. But yes, there are several
> tools for that (not to mention GPUs, supercomputers, SMP, or whatever).

Huh. It seems that the documentation on the Info-ZIP site is internally inconsistent. The main FAQ mentions it supporting encryption but, a couple of weeks ago, I stumbled across a page which was explaining why they didn't consider it worth their time to implement PKZIP-compatible encryption and it didn't appear to be an archived copy of outdated information.

> I don't mean you have to (or should) support everything. Just pick
> your tools carefully. Info-Zip is reasonable, so is 7-Zip. Things
> that are widely available and used (and free/libre) should be preferred.

*nod* At minimum, I plan to support Zip files created by the default settings on the current versions of Info-ZIP, p7zip, 7-Zip, WinRAR, and PeaZip, the Plus! 98 and Windows XP versions of Compressed Folders, and, if the nuances of the "for testing only" license on the modern.ie VMs allow it, the Windows 7 and Windows 10 versions of Compressed Folders.

I'll also want to generate some test Zips with the zipfile module in Python's standard library under various versions of Python, and to see if anyone I know has a Mac so I can get a 100% legally clean test file generated by whatever's popular on macOS.

On Tuesday, May 28, 2019 at 11:59:50 PM UTC, rug...@gmail.com wrote:
> Anyways, if you really want to know more about compressors, try these
> sites. I've never studied any theory nor written any compressors
> from scratch, so I know basically nothing. But some of them are
> geniuses:
>
> * http://www.mattmahoney.net/dc/
> * https://encode.ru/forum.php/

Thanks. :)

Johann 'Myrkraverk' Oskarsson

Oct 18, 2019, 2:01:56 PM
On 11/05/2019 6:07 am, S. Sokolow wrote:
> I've been slowly picking away at the beginnings of a DOS analogue to open-source installer creators like InnoSetup and NSIS and, since I want this to be suitable for use on floppy disks, I've been aiming for the smallest code size that I can tolerate working toward using free tooling that can be legally redistributed. (ie. Open Watcom C/C++ with small bits of inline assembly, not NASM)

Did you get anywhere with the installer? Do you have something you're
ready to share?

--
Johann | email: invalid -> com | www.myrkraverk.com/blog/
I'm not from the Internet, I just work there. | twitter: @myrkraverk

S. Sokolow

Jan 5, 2020, 8:40:27 PM
On Friday, October 18, 2019 at 6:01:56 PM UTC, Johann 'Myrkraverk' Oskarsson wrote:
> Did you get anywhere with the installer? Do you have something you're
> ready to share?
>
> --
> Johann | email: invalid -> com | www.myrkraverk.com/blog/
> I'm not from the Internet, I just work there. | twitter: @myrkraverk

I just realized that I'd only replied privately and forgot to post a copy of my reply publicly.

For anyone else who's wondering, I've had to put it on hold to work on some projects that have more need of my attention, but I do intend to come back to it.

Since my previous messages, I made sense of the splint documentation and got the annotations all fixed up, got part-way through refactoring the Watcom-compatible public-domain unzipping code I found so it'll function reliably if built for real mode, and completed the DOS shell-quoting/escaping parser (which will form the basis of the command-line argument parser and control-scripting runtime) and some basic automated tests for it.