news:70bcdacb-8024-408b...@googlegroups.com...
> On Tuesday, July 9, 2013 10:11:08 PM UTC+1, Rod Pemberton wrote:
> > "James Harris" <james.h...@gmail.com> wrote in message
> > news:911de716-4485-493c...@googlegroups.com...
> > > [snip]
> > [snip]
> The key thing is the formats of IVT and 32-bit IDT are completely
> different and so would need some different source code to manage
> them.
Are you trying to manage them using the exact same code? #if-def's
do work...
There is no IDT until you set it up. So, you need code to install
it and manage it. The IVT is set up by the BIOS. Do you need code
to manage the IVT? How many RM interrupts does a PM OS need to
"fix" ... ? I'm assuming that for security, which you mentioned
somewhere, if you have a multi-mode 16-bit/32-bit/64-bit OS, then
all three modes will be PM, including the 16-bit mode. I.e., no
v86, no 16-bit RM. In which case, you have an IDT for all three,
but the sizes of data are likely different. I don't know what
you'd do about needing v86 or 16-bit RM for BIOS calls. We need to
find out if 16-bit PM can be used for BIOS calls.
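To make the format difference concrete, here is a minimal sketch of the two entry layouts as they are usually documented: a 4-byte real-mode IVT entry and an 8-byte 32-bit IDT gate. The struct names, the helper functions ivt_make() and idt_make(), and the GATE_INT32_PRESENT constant are made up for illustration; they are not from anyone's OS in this thread.

```c
#include <stdint.h>

/* Real-mode IVT entry: a 4-byte far pointer, offset first, then segment. */
struct ivt_entry {
    uint16_t offset;
    uint16_t segment;
};

/* 32-bit protected-mode IDT gate: 8 bytes, handler offset split in two. */
struct idt_gate32 {
    uint16_t offset_low;   /* handler offset bits 0..15 */
    uint16_t selector;     /* code segment selector */
    uint8_t  reserved;     /* must be zero */
    uint8_t  type_attr;    /* present bit, DPL, gate type */
    uint16_t offset_high;  /* handler offset bits 16..31 */
};

/* Hypothetical name: present, DPL 0, 32-bit interrupt gate. */
#define GATE_INT32_PRESENT 0x8E

struct ivt_entry ivt_make(uint16_t segment, uint16_t offset)
{
    struct ivt_entry e;
    e.offset = offset;
    e.segment = segment;
    return e;
}

struct idt_gate32 idt_make(uint16_t selector, uint32_t handler,
                           uint8_t type_attr)
{
    struct idt_gate32 g;
    g.offset_low = (uint16_t)(handler & 0xFFFFu);
    g.selector = selector;
    g.reserved = 0;
    g.type_attr = type_attr;
    g.offset_high = (uint16_t)(handler >> 16);
    return g;
}
```

The two constructors take different arguments (segment:offset versus selector plus a 32-bit offset and attributes), which is roughly why one piece of source can't fill both tables without some mode-specific code.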
> To try to illustrate what I had in mind the following could be
> the code for reading the CMOS/RTC chip.
>
> int cmos_read(int addr)
> {
>     if (addr > ADDR_MAX)
>         return -1;
>     wait_if_needed(addr);
>     addr_write(addr);
>     return port_8_in(DATA_PORT);
> }
>
I'd expect "you", i.e., the OS you're creating or planning, to be
the _only_ caller of cmos_read(). I wouldn't expect a user to be
able to call it since it accesses a port. You can't be as
successful as MS without angering your users. Access denied! So,
we'll say that's for security reasons... ;-) So, is there any
reason at all to check for "addr>ADDR_MAX"? That seems to be
saying you can't trust your own OS which you coded to pass in the
correct value to cmos_read()... :-) If hackers do manage to call
it somehow, the 'if' would block bad port numbers. The question is
what happens if you attempt to write to a non-existent port number?
Does it get masked or wrapped to be within the valid address range?
I want to say it does... But, I'd need to pull out the manuals.
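As a rough sketch of the range-check question, the routine can be exercised against a simulated chip instead of real ports. Everything here is an assumption for illustration: the port_8_out()/port_8_in() stand-ins, the sim_cmos array, the 0x7F limit, and the masking done inside the simulation. The 0x70/0x71 index/data pair is the conventional PC assignment, and on the classic PC design bit 7 of the index port is usually described as the NMI mask rather than an address bit, so check the manuals before relying on any of this. The wait_if_needed() step from the quoted code is left out.

```c
#include <stdint.h>

#define CMOS_INDEX_PORT 0x70  /* conventional PC index port */
#define CMOS_DATA_PORT  0x71  /* conventional PC data port */
#define CMOS_ADDR_MAX   0x7F  /* assumed: 128 registers */

/* Simulated chip so the sketch runs in user space (not real hardware). */
uint8_t sim_cmos[128];
uint8_t sim_index;

/* Stand-in for an 8-bit port write. */
void port_8_out(uint16_t port, uint8_t value)
{
    if (port == CMOS_INDEX_PORT)
        sim_index = value & 0x7F;  /* simulation keeps only the low 7 bits */
}

/* Stand-in for an 8-bit port read. */
uint8_t port_8_in(uint16_t port)
{
    return (port == CMOS_DATA_PORT) ? sim_cmos[sim_index] : 0xFF;
}

/* Returns the register value (0..255), or -1 for an out-of-range address. */
int cmos_read(int addr)
{
    if (addr < 0 || addr > CMOS_ADDR_MAX)
        return -1;
    port_8_out(CMOS_INDEX_PORT, (uint8_t)addr);
    return port_8_in(CMOS_DATA_PORT);
}
```

Since addr is a signed int, the sketch also rejects negative values, which the "addr > ADDR_MAX" test alone would let through.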
> It's untested but the idea is that the same source code should
> compile to three different outputs: 16-bit, 32-bit and 64-bit
> object code and work in the same way on each of them. The
> int sizes would differ but not what the code does.
8-bits, AFAIK, are supported in all three mode sizes on x86 as
integers and for ports. Of course, I'm not so familiar with
64-bits... As long as the int size is sufficiently large for all
modes for the returned data, that seems correct to me. I'm
assuming the data size is 8-bits from "port_8_in". I.e., returning
8-bits "promoted" or "implicitly" cast by the compiler to the
default integer size, most likely larger in size, should work for
16-bit, 32-bit, 64-bit without issue. Now, if there was a 32-bit
value that needs to be used in all three modes, it'll result in
splitting or joining operations for 16-bit mode, if you're using C.
If using assembly, you could use a size override. Then, the
question becomes, if you're using C, do you want to use the lowest
common denominator by splitting or joining for all three modes, or
do you want to use mode based #if-def code?
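The "lowest common denominator" option might look something like this sketch, which uses the <stdint.h> fixed-width types so the same source means the same thing whatever the native int size is. The function names are invented for the example.

```c
#include <stdint.h>

/* Join two 16-bit halves into one 32-bit value. The cast comes first so
   the shift is done in 32 bits even where int is only 16 bits wide. */
uint32_t join_u16(uint16_t high, uint16_t low)
{
    return ((uint32_t)high << 16) | low;
}

/* Split a 32-bit value back into its halves. */
uint16_t high_u16(uint32_t v)
{
    return (uint16_t)(v >> 16);
}

uint16_t low_u16(uint32_t v)
{
    return (uint16_t)(v & 0xFFFFu);
}

/* An 8-bit port value widened to int: the value range 0..255 fits in
   an int of any size, so nothing is lost in any mode. */
int widen_port_byte(uint8_t b)
{
    return b;
}
```

A 32-bit or 64-bit compiler will normally turn the join/split helpers into a couple of plain register operations, so the cost of writing it this way falls mostly on the 16-bit build.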
> The thing is that even with segments and whether paging is turned
> on or not, on x86-32 all the PM addresses that programs use are
> ultimately 32-bits wide. RM addresses are 16-bit or 20-bit.
v86, (and probably) 16-bit PM, are not 32-bits... For 32-bits, the
offset is large enough to address typical maximum hardware memory.
IIRC, for 64-bits, the offset is limited also, perhaps to 32-bits?
I.e., 64-bits has same issue as RM addresses: you can't address
*all* physical memory with a single offset, you may need to change
the selector or segment.
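For the real-mode case, the usual segment:offset arithmetic can be sketched as follows. The function name rm_linear() is made up; the calculation itself (segment times 16 plus offset) is the standard real-mode rule.

```c
#include <stdint.h>

/* Real-mode linear address: segment shifted left by 4, plus the offset.
   The result needs up to 21 bits (0x10FFEF at most). */
uint32_t rm_linear(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}
```

Two different pairs can name the same byte, e.g., B800:0000 and B000:8000 both give 0xB8000, and a single 16-bit offset only reaches 64KB from wherever the segment starts.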
> [x86 interpreter]
Although, there definitely are merits to a full 16-bit x86 binary
translator... It's just nobody really wants to code one for their
OS. A 32-bit PM monitor with v86 mode support and with a built-in
16-bit to 32-bit binary translator would progressively convert all,
or nearly all, 16-bit RM/v86 code from the BIOS, video BIOS, option
roms, DOS, DOS apps, DOS TSRs, to 32-bit. I.e., you'd have a
32-bit DOS OS in no time from an old 16-bit OS. However, you'd
still be limited to whatever drivers are available to the DOS
community... I.e., if there is no 16-bit DOS SATA driver, there is
no SATA support.
> > Of course, that wouldn't be powerful enough, or correct, to
> > execute BIOS in 64-bit mode... The big issue is do you need
> > BIOS support in the OS or not? I.e., can BIOS usage be limited
> > to the MBR/VBR and bootloader code? I think it can. I.e.,
> > 16-bit code can be used until transferring execution to the OS.
>
> Two answers to that:
>
> 1. Yes, I think I might need to be able to execute the BIOS to
> adjust video settings. It seems impossible in a homebrew OS
> to do that portably otherwise.
You can do the VGA and lower stuff, because that hardware register
set is defined. I'm thinking you're probably aware of this, but
it's the non-standardized SVGA interfaces and VESA modes that are
an issue. First, their hardware interfaces aren't defined, i.e.,
you must call the BIOS to program the hardware. Second, the newest hi-res
modes have different mode numbers for each video card.
A while back I mentioned to Alexei that one could possibly use v86
mode to trap the ports and values for each required video BIOS
call. Then, you could program the card. He pointed out that there
could be timing issues or unknown wait loops etc. But, I think
it's still a valid concept, although it may need tweaking to
actually work correctly all the time, on every machine, because of
such issues. It's possible such an idea could work on very many or
very few machines, but if it worked on 50% to 80%, that might be
good enough. Obviously, an incompatibility disclaimer or DOS test
program should be offered so an interested user could verify
compliance of his machine with your OS, etc.
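A very rough sketch of the record-and-replay idea, under heavy assumptions: a v86 monitor (not shown, and hypothetical) would call port_log_record() for each trapped OUT, and the OS would later call port_log_replay() with its own port-write routine. All names here are invented, only 8-bit writes are handled, and the timing and wait-loop issues mentioned above are ignored entirely. sim_out() is just a stand-in so the sketch can run without hardware.

```c
#include <stddef.h>
#include <stdint.h>

#define PORT_LOG_MAX 256

struct port_write {
    uint16_t port;
    uint8_t  value;
};

struct port_log {
    struct port_write ops[PORT_LOG_MAX];
    size_t count;
};

typedef void (*port_out_fn)(uint16_t port, uint8_t value);

void port_log_init(struct port_log *log)
{
    log->count = 0;
}

/* Record one trapped write. Returns 0 on success, -1 if the log is full. */
int port_log_record(struct port_log *log, uint16_t port, uint8_t value)
{
    if (log->count >= PORT_LOG_MAX)
        return -1;
    log->ops[log->count].port = port;
    log->ops[log->count].value = value;
    log->count++;
    return 0;
}

/* Replay the recorded writes, in order, through the supplied routine. */
void port_log_replay(const struct port_log *log, port_out_fn out)
{
    size_t i;
    for (i = 0; i < log->count; i++)
        out(log->ops[i].port, log->ops[i].value);
}

/* Stand-in for real port output: remembers the last write and a count. */
uint16_t sim_last_port;
uint8_t  sim_last_value;
size_t   sim_write_count;

void sim_out(uint16_t port, uint8_t value)
{
    sim_last_port = port;
    sim_last_value = value;
    sim_write_count++;
}
```

A real version would also have to deal with port reads whose results steer the BIOS code, which is where a blind replay like this is most likely to fall over.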
> 2. I had an eye on a distant 64-bit OS being able to support very
> old 16-bit real-mode apps. So an emulator would still be needed.
...
> If I follow then as above I had a different approach in mind,
> i.e. that the source code would be compiled to three different
> versions of the object code, one for each bit width of the OS.
>
> So there would be
>
>                        / --> 16-bit obj
> Portable C Source ---> | --> 32-bit obj
>                        \ --> 64-bit obj
>
> 16-bit C and asm ---> 16-bit obj
> 32-bit C and asm ---> 32-bit obj
> 64-bit C and asm ---> 64-bit obj
>
> Then the obj files would be combined to form three
> working OS images.
Um, what I understand you to be saying, and what the diagram tells
me are two different things. This is what I understand you to be
saying:
                       / --> 16-bit obj
Portable C Source ---> | --> 32-bit obj
                       \ --> 64-bit obj
16-bit asm and Port. C Src. 16-bit obj ---> 16-bit OS image
32-bit asm and Port. C Src. 32-bit obj ---> 32-bit OS image
64-bit asm and Port. C Src. 64-bit obj ---> 64-bit OS image
Where the obj from the Portable C Source is taken and combined with
additional asm to produce the image. Is that correct?
Or, is your diagram correct? I take your diagram to mean you have
an obj from Portable C Source plus *another* obj from a mix of
non-portable 16-bit C and non-portable 16-bit asm. That seemed odd
to me. Why is there more C that's not in the Portable C Source?
I.e., avoid non-portable C. So, I would think all or most
non-portable code would be in asm (or could be...) or is inlined
assembly in C.
BTW, by "non-portable" I've been assuming this is _just_ across x86
cpu modes, i.e., non-portable from 16-bit to 32-bit to 64-bit x86.
If you meant "non-portable" in the C sense of portability across
different platforms, e.g., from x86 to ARM to Cray to Vax to 6502
etc, I misunderstood.
> My intention is that none of the above source files should
> contain conditionally included code. For a number of reasons
> ISTM better to write separate modules than have a given module
> with conditional code, if the modules can be kept small and their
> interfaces are well defined.
Creator's choice! Bonus.
> Could you say some more about why two compilers are more
> work than one? You might save me going down a bad road.
>
The two compilers I used have some very different ways to do
things. Of course, it's very possible I wasn't doing some things
the correct way. So, searching my code for my OS (long list):
-They call DPMI functions differently. That's not important to
you.
-The inline assembly syntax is different for each compiler, i.e.,
#if-defs. You'll need inline assembly for various instructions and
registers, like: lgdt, sgdt, lidt, interrupt vectors, cr0, lar, lsl,
cs, ds, es, fs, gs, ss, sp, wbinvd, rdtsc, hlt, int, smsw, etc.
-The inline assembly of one compiler doesn't support forward
references! (nightmare...)
-The two compilers have different naming conventions, i.e., those
underscores on names.
-They address registers in register structures differently. You
can fix this with conditional code or defines.
-They need different include files to load register structures. Of
course, you could define your own.
-They both have different naming for their inport and outport C
routines. They are in different include files. Of course, you
could define your own, but each has different inline assembly
syntaxes... I.e., no way to make uniform.
-They use a different method to address C objects. One compiler uses
physical addressing. The addresses correspond to the physical
memory locations. This is true of memory below 1MB and the
application space. The other compiler uses relative addressing.
So, the application space starts at offset zero, no matter where
it's loaded. This means you have to adjust any address to memory
below 1MB by the load offset.
-The code they generate is for DOS DPMI, i.e., 32-bit code. Each
DPMI host sets up its own GDT. However, when using these
compilers for a custom OS that won't have DPMI available, the
compiled C code is dependent on the GDT entries each compiler needs
for its compiled code. I.e., one compiler needs 4 GDT entries
while the other needs 5.
-One compiler packs structs by single bytes. The other needs a
#pragma to do so.
-I'm not using them, but I found out the format of the C jmpbuf
structure was different for each.
-Each compiler has their own cli and sti _routines_, not
instruction... You need to use the compiler's routine, not roll
your own. Each of these is in a different include file too.
-I had to write some code to replace the initialization of various
data items that were set as part of the executable startup which
were no longer present once I abused the compiler to produce code
for my OS...
-The routines that will move and copy to memory below 1MB are
different.
-The compiler methods to inline routines are different for each.
One uses a declaration with an attribute and the other uses a
compiler keyword.
-One compiler had a way to make "naked" C routines, i.e., no
prolog(ue), no epilog(ue), no stack frame code, no code clean,
i.e., "clean" assembly. The other compiler would only support
"void func(void)" which leaves a minimal stack frame. This
eliminated the prolog and epilog but keeps the stackframe code. It
required use of the 'leave' instruction to remove the stack frame.
I suspect I'm overlooking or unaware of some compiler feature...
You're likely going to need a naked function for an interrupt
wrapper.
-One compiler could generate both near and far returns
appropriately for procedures while the other would only generate
near. That prevented adjusting the "naked" function's return code
via a far ret. Optimization also prevented adjustment of the
procedure return instruction for that compiler. Additional
instructions were required to manually adjust the IP.
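To illustrate just the addressing item from the list (physical versus relative), here is a small sketch of the adjustment. The function names and the idea of passing the load base as a parameter are assumptions for the example; a real compiler or DPMI host supplies its own way to find that base.

```c
#include <stdint.h>

/* Physical-style addressing: the offset the program uses is the
   physical address itself. */
uint32_t flat_offset(uint32_t phys)
{
    return phys;
}

/* Relative-style addressing: the program's segment starts at load_base,
   so a physical address has to be adjusted by that base. For memory
   below the load address the subtraction wraps modulo 2^32, which only
   works if the segment limit allows such offsets. */
uint32_t relative_offset(uint32_t phys, uint32_t load_base)
{
    return (uint32_t)(phys - load_base);
}
```

With an assumed load base of 0x100000, text video memory at 0xB8000 becomes offset 0xFFFB8000, and adding the base back gives 0xB8000 again.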
> > > Both could be built from the same HLL source.
>
> > That's great, where possible. It's never 100%.
>
> Would having some modules be mode-specific allow
> the bulk of code to be truly portable?
30% of C is portable.
30% of C appears portable, i.e., close equivalent.
40% of C is not portable and never will be.
:-)
You're asking if, say the BIOS video routines are always 16-bit RM,
can the rest of the OS be portable between modes? I've not
compiled my OS for 16-bit. I'm assuming all the C code could be
made to work for 16-bits fairly easily, but not without some
changes. Off hand, I can't think of anything that's 32-bit
specific, but I haven't worked on it in some years now... Of
course, there are the addressing limitations for 16-bit code that
have to be fixed by small #if-defs. The few inline assembly
_routines_ would need new routines coded as 16-bit, e.g., interrupt
wrappers, instead of 32-bit. The standalone inline assembly
_instructions_ should probably compile as-is.
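One of those small #if-defs might look like this sketch, which picks a type of at least 32 bits from <limits.h> so the rest of the code doesn't care how wide int is. The names os_u32 and os_bytes_from_kib() are invented for illustration; with a C99 compiler, uint32_t from <stdint.h> would do the same job without the conditional.

```c
#include <limits.h>

/* Pick an unsigned type with at least 32 bits for the current build. */
#if UINT_MAX >= 0xFFFFFFFFUL
typedef unsigned int  os_u32;   /* int is already 32 bits or wider */
#else
typedef unsigned long os_u32;   /* 16-bit int: long is at least 32 bits */
#endif

/* Example use: a size in KB converted to bytes. With a plain 16-bit
   int this would overflow for anything of 64KB or more. */
os_u32 os_bytes_from_kib(os_u32 kib)
{
    return (os_u32)(kib * 1024UL);
}
```

The conditional sits in one place, near the top of a header, rather than being scattered through the routines that use the type.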
> > > (They would need to be linked with other object files which
> > > were mode specific in order to generate the complete OS.)
>
> > Is this binary mixed-mode code? Both 16-bit and 32-bit in one
> > binary? Yeah, I just haven't considered that. I'd prefer
> > entirely 16-bit, or entirely 32-bit, etc.
>
> No, I don't mean the binaries to run in any more than one mode.
> I thought you did based on your comment:
>
> "That's dual mode 16-bit and 32-bit without any operand and
> address size overrides. I was hoping to find a way to do
> 16-bit and 32-bit and 64-bit. But, I'm not sure that that is
> possible because of the REX prefix... You'd have to find a
> way to nullify the effects of its operation in 16-bit and
> 32-bit code."
>
> Sorry if I misunderstood it.
Sorry, taken together that's not especially clear, is it?
The part I hadn't considered was differently sized code segments in
the same file. This would be like an OS that supports both 32-bit
and 16-bit code segments at the same time, instead of a pure 32-bit
OS, etc. The dual mode idea (multi-mode now...) was intended for
minimal assembly startup, bootloader, bootstrap interpreter, etc.
There is no way I'd want to code anything of sufficient size in
that. There is no way to get a compiler to emit such code either.
I.e., develop the technique, assemble code for 16-bit, disassemble
for 16-bit, 32-bit, and 64-bit, confirm code functions
equivalently. PIA.
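The REX problem can be shown with a tiny classifier: the byte values 0x40 through 0x4F are one-byte INC/DEC register instructions in 16-bit and 32-bit code, but are REX prefixes in 64-bit code. The enum and function names below are made up for the sketch, and it looks at a single byte only; it is nowhere near a disassembler.

```c
#include <stdint.h>

enum cpu_mode  { MODE_16 = 16, MODE_32 = 32, MODE_64 = 64 };
enum byte_kind { KIND_OTHER, KIND_INC_REG, KIND_DEC_REG, KIND_REX_PREFIX };

/* Classify a byte in the 0x40..0x4F range for a given code size. */
enum byte_kind classify_40_4f(uint8_t byte, enum cpu_mode mode)
{
    if (byte < 0x40 || byte > 0x4F)
        return KIND_OTHER;
    if (mode == MODE_64)
        return KIND_REX_PREFIX;          /* prefix: modifies what follows */
    return (byte <= 0x47) ? KIND_INC_REG /* 0x40..0x47: inc reg */
                          : KIND_DEC_REG;/* 0x48..0x4F: dec reg */
}
```

So a byte sequence containing, say, 0x48 changes a register in 16-bit and 32-bit code but changes the meaning of the next instruction in 64-bit code, which is why multi-mode code has to either avoid those bytes or arrange for both readings to be harmless.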
> I have made a start with gcc and bcc on Linux but don't want the
> C code to use any pragmas or specific features of either of them.
You may have a hard time without using a pragma. GCC needs one to
pack structures correctly, or an attribute... Some of the x86
system structures need to be aligned correctly too. That requires
a pragma, or directive, or attribute too.
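As an example of where packing matters, the 6-byte operand for lidt/lgdt in 32-bit mode is a 16-bit limit followed directly by a 32-bit base. This sketch uses "#pragma pack(push, 1)", which GCC and Clang accept; other compilers may want a different pragma, keyword, or attribute, so treat the spelling as an assumption. The struct and function names are invented.

```c
#include <stdint.h>

/* Without packing, most compilers insert 2 bytes of padding after
   'limit' so that 'base' is 4-byte aligned. */
struct idtr32_unpacked {
    uint16_t limit;
    uint32_t base;
};

#pragma pack(push, 1)
/* Packed: exactly 6 bytes, the layout the lidt operand is expected
   to have in 32-bit mode. */
struct idtr32 {
    uint16_t limit;  /* size of the table in bytes, minus one */
    uint32_t base;   /* linear address of the table */
};
#pragma pack(pop)

/* Build the operand for a table of 'entries' 8-byte gates.
   Assumes entries is at least 1. */
struct idtr32 idtr_make(uint32_t base, uint16_t entries)
{
    struct idtr32 r;
    r.limit = (uint16_t)(entries * 8 - 1);
    r.base = base;
    return r;
}
```

An alternative that needs no pragma is to build such structures byte by byte in an unsigned char array, at the cost of some readability.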
> I don't want to get locked into any specific compiler.
I'd say "good luck"! But, that could just be my experience or
incorrect path.
> If you wanted to would it be practical to change the C source
> so it would compile under any compiler?
Any compiler, any C compiler, or any compiler or assembler?
There is a bit of inline assembly. That syntax is custom to the
compiler. E.g., OpenWatcom uses WASM which is MASM compatible
syntax, DJGPP (GCC) uses GNU AS (GAS) which is AT&T syntax. If I
move the assembly out of the C code into assembly files, I could
convert it to say NASM. Then, the C code would be purely C code,
but it'll still need compiler specific tweaks because of the x86
assembly produced, different include files, differently named
functions, functions which work differently, etc.
Of course, I'm not up on name demangling, so I might be overlooking
some feature that solves a few more of these issues. IIRC, Chris
Giese in his code used many #defines to make the main body of C
code in each file identical. However, there was a block of
compiler specific defines at the top of each file. My code has
compiler specific includes at the top and a bunch of #if-defs
throughout to adjust for each compiler.
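A block of compiler-specific defines at the top of a file might be sketched like this. It is not taken from Chris Giese's code or from mine; the macro names OS_INLINE and OS_COMPILER are invented, and the Watcom branch is from memory and untested. __GNUC__ and __WATCOMC__ are the predefined macros those compilers are documented to set.

```c
/* Compiler-specific block: everything below this block is meant to be
   identical for every compiler. */
#if defined(__GNUC__)
#define OS_INLINE   static __inline__ __attribute__((always_inline))
#define OS_COMPILER "gcc-like"
#elif defined(__WATCOMC__)
#define OS_INLINE   static __inline
#define OS_COMPILER "watcom"
#else
#define OS_INLINE   static
#define OS_COMPILER "unknown"
#endif

/* Common code: uses only the macros defined above. */
OS_INLINE unsigned os_low_byte(unsigned v)
{
    return v & 0xFFu;
}
```

This handles the attribute-versus-keyword difference for inlining, but it does nothing for inline assembly syntax, which still needs either #if-defs or separate assembly files.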
> Could the stuff that makes some code compiler-specific
> including possibly inline assembly be written in separate
> modules and linked to the C?
Yes. But, you can't eliminate all of it, AFAIK. E.g., 32-bit code
produced by DJGPP needs 5 descriptors, whereas OW needs 4. The
setup is compiler specific. Of course, I'm not using "standalone"
or "freestanding" compiler options, or compiler flags to select
independence from the C library, or linker scripts. I'm using a
standard DOS DPMI executable intended to be executed under DOS.
This would be like using a standard ELF executable on Linux or
standard PE on Win32 for your OS. Most people don't do that. :-)
So, those things may be a big factor that helps you eliminate some
of the issues I have.
Rod Pemberton