
Reset via null idt


Rod Pemberton

Nov 7, 2011, 2:32:13 AM

As you know or should, you can reset a machine via a triple-fault when an
int instruction is executed and the idt has zero length, i.e., no available
vectors to call for the int instruction. AIUI, this is supposed to work on
all machines.

So, have you guys ever had problems using that method to reboot?

This works for most of my machines, even one that is no longer alive.
However, I just noticed that one of my machines is giving me fits when I try
to reset it that way. Unfortunately, this is the primary method
that I prefer for my OS and other utils.

In the app where I noticed the problem, this method is part of a slightly
larger routine which I haven't yet eliminated as a possible cause ... I'll
cut down the routine later to do some tests. I don't currently
have any assemblers installed on that machine. A keyboard reset works fine,
so no comments about using that instead, please. :-) The problem is that it's
not triple-faulting. Interrupts and NMI are disabled. I decided to write
0xCC (Int03) to a block of low memory, e.g., zero up to 600h. (Yeah, don't
ask how I came up with that ...) When I do, it ends up in the Int3 handler,
which means one of those 0xCC instructions gets executed in low mem ...
!?!?! <--roughly WTF? That implies to me, that it's executing code other
than what's in my reset routine - although interrupts and NMI are disabled.
How does it escape? Any speculation on how it's going "off into the weeds"?
Single- or Double-fault instead of triple and being trapped? Hardware
interrupt? SMM? Does the IDT area need to be null'd too, i.e., not just the
IDT descriptor used by LIDT?


Rod Pemberton



CN

Nov 7, 2011, 3:23:24 AM
What do you do after you load IDTR with a zero address and size? You
need to trigger some exception for it to work. Are you sure you
successfully set IDTR to zero/zero? If there's a bug in loading the
IDTR, it would point to some random memory resulting in the varying
behavior you see. Try storing IDTR with SIDT and check the value.
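
The SIDT check suggested above can be done offline once the six bytes written by SIDT have been dumped. A minimal sketch in Python, assuming the dump is available as raw bytes; the function names and the sample dumps are hypothetical and are not taken from Rod's routine:

```python
import struct

def decode_idtr(image: bytes):
    """Split the 6-byte image stored by SIDT into (limit, base)."""
    if len(image) != 6:
        raise ValueError("SIDT stores exactly 6 bytes")
    # Little-endian: 16-bit limit first, then 32-bit base, no padding.
    limit, base = struct.unpack("<HI", image)
    return limit, base

def is_null_idtr(image: bytes) -> bool:
    """True when both limit and base are zero, i.e. the 'null IDT' case."""
    return decode_idtr(image) == (0, 0)

# Hypothetical dumps: an all-zero IDTR and the real-mode reset default.
print(decode_idtr(bytes(6)))                        # (0, 0)
print(decode_idtr(bytes.fromhex("ff0300000000")))   # (1023, 0)
print(is_null_idtr(bytes(6)))                       # True
```

If the decoded pair is anything other than (0, 0) right before the INT, the load itself is the first suspect.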

wolfgang kern

Nov 7, 2011, 5:18:19 AM

Rod Pemberton wrote:

> As you know or should, you can reset a machine via a triple-fault when an
> int instruction is executed and the idt has zero length, i.e., no available
> vectors to call for the int instruction. AIUI, this is supposed to work on
> all machines.

> So, have you guys ever had problems using that method to reboot?

:) sure not by intention!
but I remember that the CPU just fell asleep because its RESET-pin is an
input, so it may depend on the mainboard/chipset to assert RESET based on the
status of the various CPU error pins; this may be standard by now.

my preferred reboot method is a jump to ffff:0000 (000f_fff0).

....
> The problem is that it's not triple-faulting.
> Interrupts and NMI are disabled. I decided to write 0xCC (Int03)
> to a block of low memory, e.g., zero up to 600h. (Yeah, don't
> ask how I came up with that ...) When I do, it ends up in the Int3
> handler, which means one of those 0xCC instructions gets executed in
> low mem ...!?!?! <--roughly WTF?

did you set all 6 IDTR bytes to null ;size=0 pointer=0 ?
a zero size should just do what you want.

> That implies to me, that it's executing code other
> than what's in my reset routine - although interrupts and NMI are
> disabled.
> How does it escape?
> Any speculation on how it's going "off into the weeds"?
> Single- or Double-fault instead of triple and being trapped? Hardware
> interrupt? SMM? Does the IDT area need to be null'd too, i.e., not
> just the IDT descriptor used by LIDT?

before I had my own tools ready I often stumbled over the hidden behaviour
of some smart tools which had their own ideas about exceptions ...

I haven't tried recently, but it could well be that the BIOS together with
the chipset can detect and respond to ERR-pins with an int3 like debug-info.
__
wolfgang


James Harris

Nov 7, 2011, 2:06:54 PM
On Nov 7, 7:32 am, "Rod Pemberton" <do_not_h...@noavailemail.cmm>
wrote:
> As you know or should, you can reset a machine via a triple-fault when an
> int instruction is executed and the idt has zero length, i.e., no available
> vectors to call for the int instruction.  AIUI, this is supposed to work on
> all machines.
>
> So, have you guys ever had problems using that method to reboot?
>
> This works for most of my machines, even one that is no longer alive.

Why is it no longer alive? Did you triple fault it too many times? ;-)

> However, I just noticed that one of my machines is giving me fits when I try
> to attempt to reset it that way.  Unfortunately, this is the primary method
> that I prefer for my OS and other utils.
>
> In the app where I noticed the problem, this method is part of a slightly
> larger routine which I haven't eliminated as a possible cause, yet ...  I'll
> cut down the routine later to do some tests eventually.  I don't currently
> have any assemblers installed on that machine.  A keyboard reset works fine,
> so no comments for using that instead, please. :-)

Why don't you use the.... Oh, OK.

>  The problem is that it's
> not triple-faulting.  Interrupts and NMI are disabled.  I decided to write
> 0xCC (Int03) to a block of low memory, e.g., zero up to 600h.  (Yeah, don't
> ask how I came up with that ...)  When I do, it ends up in the Int3 handler,
> which means one of those 0xCC instructions gets executed in low mem ...

You can tell the address of the 0xCC, no? Maybe that will give a clue.
Are you running in real or protected mode? I presume protected but
that something is directing EIP to low memory - possibly 0x0000_0000.

> ... That implies to me, that it's executing code other
> than what's in my reset routine - although interrupts and NMI are disabled.
> How does it escape?  Any speculation on how it's going "off into the weeds"?
> Single- or Double-fault instead of triple and being trapped?  Hardware
> interrupt?  SMM? Does the IDT area need to be null'd too, i.e., not just the
> IDT descriptor used by LIDT?

According to one manual I have (PPro Developer's) if it gets an
exception while trying to call the double fault handler the processor
enters shutdown mode. Not sure if that's the same thing you are
thinking about but shutdown != reset.

James

James Harris

Nov 7, 2011, 2:29:34 PM
On Nov 7, 10:18 am, "wolfgang kern" <nowh...@never.at> wrote:

...

> my preferred reboot method is a jump to ffff:0000 (000f_fff0).

You mean to 0xffff_fff0? I remembered Frank van Gilluwe made some
comments on reset options. He mentions jumping to the reset start
address and says

* this is *the* way to reset an 8088 PC
* after the 8088 it may not work on all machines.

Why the latter? It seems it's mainly to do with register values at
reset unless you carefully set them up first. Some BIOSes may expect
certain CPU identification values in the data regs. Non-data registers
may need suitable values: flags, debug regs, etc., possibly MTRRs.
Also, IIRC the CPU holds the upper part of the address bus at high
logic levels from reset until after the first jump. If 0xffff_fff0
holds a relative jump you are probably OK but if it had some code in
those 16 bytes prior to the jump that code might operate with
unexpected addressing.

On a 32-bit system why use anything other than KBC or SCPA?

James

Rod Pemberton

Nov 7, 2011, 10:31:13 PM
"wolfgang kern" <now...@never.at> wrote in message
news:j98hbr$uag$1...@newsreader2.utanet.at...
> > Rod Pemberton wrote:
...

> my preferred reboot method is a jump to ffff:0000 (000f_fff0).
>

Why? And, that requires RM ... Yes? Will that work for v86?

How do you handle PM? Switch back to RM? Why? The keyboard controller
and null IDT should work for PM ...

> > The problem is that it's not triple-faulting.
> > Interrupts and NMI are disabled.
>
> did you set all 6 IDTR bytes to null ;size=0 pointer=0 ?
> a zero size should just do what you want.
>

I set the 6 bytes loaded by LIDT - whatever they're called - to null, yes.
Do I have an o32 on LIDT? No. Is LIDT one of those instructions that
reads 8 bytes but uses only 6 ... ? I'll have to check that.


Rod Pemberton



Rod Pemberton

Nov 7, 2011, 10:33:46 PM
"James Harris" <james.h...@googlemail.com> wrote in message
news:e6e5c4eb-cb63-4d3c...@p20g2000prm.googlegroups.com...
> On Nov 7, 7:32 am, "Rod Pemberton" <do_not_h...@noavailemail.cmm>
> wrote:
...

> > [...] even one [PC] that is no longer alive.
>
> Why is it no longer alive? Did you triple fault it too many times? ;-)

:) No.

It had electrical problems. It was a compact machine that had part of the
power supply circuitry on the motherboard instead of in the power supply.
E.g., instead of a power supply with cables for every device, it had one
cable to the motherboard and other cables coming off of the motherboard
running to the devices. I suspect either the power supply or that something
in the motherboard circuitry was failing. A small coil or transformer on
the board appears to be slightly burnt or heat damaged. I never got back
around to attempting to fix it, so I recently re-used the parts in another
machine.

> > The problem is that it's
> > not triple-faulting. Interrupts and NMI are disabled. I decided
> > to write 0xCC (Int03) to a block of low memory, e.g., zero up to
> > 600h. (Yeah, don't ask how I came up with that ...) When I do, it
> > ends up in the Int3 handler, which means one of those 0xCC
> > instructions gets executed in low mem ...
>
> You can tell the address of the 0xCC, no? Maybe that will give
> a clue. Are you running in real or protected mode?

This is in RM or v86 ... v86 could be an issue. I'll have to check that.


Rod Pemberton


Rod Pemberton

Nov 7, 2011, 10:35:41 PM
"CN" <qmbmn...@pacbell.net> wrote in message
news:j984gi$vbl$1...@dont-email.me...
> On 11/6/2011 11:32 PM, Rod Pemberton wrote:
> > As you know or should, you can reset a machine via a triple-fault when
> > an int instruction is executed and the idt has zero length, i.e., no
> > available vectors to call for the int instruction. AIUI, this is supposed
> > to work on all machines.
> >
> > So, have you guys ever had problems using that method to reboot?
> > [snip]
>
> What do you do after you load IDTR with a zero address and size? You
> need to trigger some exception for it to work.

I call int 0h using an 'int' instruction. CLI has interrupts disabled. NMI
is disabled too.

> Are you sure you
> successfully set IDTR to zero/zero?

It works on other machines ...

> If there's a bug in loading the IDTR, it would point to some
> random memory resulting in the varying behavior you see. Try
> storing IDTR with SIDT and check the value.

Since it rebooted other machines, I've not stored out IDTR with SIDT as
you've suggested.


Rod Pemberton


Antoine Leca

Nov 8, 2011, 10:33:08 AM
Rod Pemberton wrote:
>> my preferred reboot method is a jump to ffff:0000 (000f_fff0).
>
> Why? [...] Will that work for v86?

No, but neither will the null IDT method. ;-)


>> did you set all 6 IDTR bytes to null ;size=0 pointer=0 ?
>> a zero size should just do what you want.

Yes (according to Intel arch manuals), providing the "pointer" points to
a valid 64KB area of memory: the upper limit of the IDT is the first
check after determining that we are dealing with a protected-mode
interruption or exception.

> Do I have an o32 on LIDT?

For the same reason as above, it should not matter either.

Also, your tests with 0CCh filling show that whatever the chip is doing,
it does NOT have the "normal" behaviour: if you load IDTR with 0/0 then
issue INT 0, it should fetch the entry at 0000_0000, encounter an access
byte of 1_2_01100 (present DPL2 32-bit call gate) which would be illegal
in IDT, hence #GP(0), should fetch the entry at 0000_0068, encounter an
access byte of 1_2_01100 which would be illegal, so #DF and it ends
being a triple fault anyway...

> Is LIDT one of those instructions that reads 8 bytes but uses
> only 6 ... ?

I believe no. If you activate alignment check, the SIDT/SGDT
instructions are successful only when the operand is aligned on an even
address NOT divisible by 4: the only coherent explanation is that the
16-bit size is written separately from the 32-bit address.

Of course this is not directly applicable to LIDT/LGDT since they are
always used at CPL0; and chip designs are also free to use optimized
paths and a 64-bit load cycle here; or even a 128-bit load to share
circuits with the logic in charge of segment loading (bytes 0-4 are the
same for IDTR/GDTR as they are for a segment descriptor...)


Antoine

wolfgang kern

Nov 8, 2011, 1:31:37 PM

"James Harris" replied:

I wrote:

...

> my preferred reboot method is a jump to ffff:0000 (000f_fff0).

|You mean to 0xffff_fff0? I remembered Frank van Gilluwe made some
|comments on reset options. He mentions jumping to the reset start
|address and says
|* this is *the* way to reset an 8088 PC
|* after the 8088 it may not work on all machines.

No, I just jump to the standardised (long since, I think) BIOS-RESET,
and there (at ffff:0000) is a real mode far jump into the BIOS itself.

|Why the latter? It seems it's mainly to do with register values at
|reset unless you carefully set them up first. Some BIOSes may expect
|certain cpu identification values in the data regs. Non-data registers
|may need suitable values: flags, debug regs, etc., possibly mtrrs.
|Also, IIRC the CPU holds the upper part of the address bus at high
|logic levels from reset until after the first jump. If 0xffff_fff0
|holds a relative jump you are probably OK but if it had some code in
|those 16 bytes prior to the jump that code might operate with
|unexpected addressing.

Yes, a hardware reset (power-cycle or reset-button) starts at linear
ffff_fff0, but there is a 16-bit far jump instruction there (it initialises
CS), and the mainboard has to relocate (and-mask) ffff_fff0 to 000f_fff0 so
that the BIOS is physically addressed during the several hundred cycles
after a RESET occurs.

The jump I mentioned causes the BIOS to do the same as after power-on.
Of course it assumes that this address isn't paged away or overwritten :)
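
The two addresses being discussed (ffff:0000 in real mode versus linear ffff_fff0 after a hardware reset) can be related with a little arithmetic. A rough sketch in Python; the 20-bit alias mask is an assumption standing in for the "and-mask" described above, and the names are made up for illustration:

```python
def real_mode_linear(segment: int, offset: int) -> int:
    """Linear address of a real-mode segment:offset pair (A20 wrap not modelled)."""
    return (segment << 4) + offset

RESET_VECTOR = 0xFFFFFFF0     # first fetch after a hardware reset on a 32-bit CPU
LOW_ALIAS_MASK = 0x000FFFFF   # assumption: BIOS top is also visible just below 1MB

jump_target = real_mode_linear(0xFFFF, 0x0000)
low_alias = RESET_VECTOR & LOW_ALIAS_MASK

print(hex(jump_target))          # 0xffff0
print(hex(low_alias))            # 0xffff0
print(jump_target == low_alias)  # True
```

So a real-mode far jump to ffff:0000 lands on the low alias of the reset entry point, provided the chipset actually maps the BIOS there.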

|On a 32-bit system why use anything other than KBC or SCPA?

sure, both above may work as well, but with my way I don't have to
check if the legacy keybd-controller is enabled and emulated or not.

I haven't seen any PC-XT/AT++ BIOS which hasn't got this far jump
at this address during the last three decades.
Looks like this olde hardware standard survived all new ideas :)

__
wolfgang


wolfgang kern

Nov 8, 2011, 1:43:58 PM

"Rod Pemberton" <do_no...@noavailemail.cmm> schrieb im Newsbeitrag
news:j9a7oe$9if$1...@speranza.aioe.org...
> "wolfgang kern" <now...@never.at> wrote in message
> news:j98hbr$uag$1...@newsreader2.utanet.at...
>> > Rod Pemberton wrote:
> ...
>
>> my preferred reboot method is a jump to ffff:0000 (000f_fff0).
>>
>
> Why? And, that requires RM ... Yes?

Yes.

> Will that work for v86?

Haven't checked that.
May not work if the BIOS-ROM-space is paged away ...

> How do you handle PM? Switch back to RM?

Yes.

> Why?

The function (at least in my OS) is there, so why not use it?

> The keyboard controller and null IDT should work for PM ...

Keyboard method may not work if emulation is off.
I like to see all my code working or not.
So a zero-sized IDT will disable all exception-reporting.

>> > The problem is that it's not triple-faulting.
>> > Interrupts and NMI are disabled.

>> did you set all 6 IDTR bytes to null ;size=0 pointer=0 ?
>> a zero size should just do what you want.

> I set the 6 bytes loaded by LIDT - whatever they're called - to null, yes.
> Do I have an o32 on LIDT? No. Is LIDT one of those instructions that
> reads 8 bytes but uses only 6 ... ? I'll have to check that.

LIDT reads only 48 bits from memory, valid for RM,PM16,PM32.
An operand size prefix will/should be ignored.

__
wolfgang


Rod Pemberton

Nov 9, 2011, 2:39:10 AM
"Antoine Leca" <ro...@localhost.invalid> wrote in message
news:4eb94b9e$0$3190$426a...@news.free.fr...
> Rod Pemberton wrote:
...
> Also, your tests with 0CCh filling show that whatever the chip is doing,
> it does NOT have the "normal" behaviour: if you load IDTR with 0/0 then
> issue INT 0,

"... and the non-normal behavior occurs, and the IDT has been filled with
0xCC, and the processor is in PM, then ...."

> it should fetch the [IDT] entry at 0000_0000, encounter an access
> byte of 1_2_01100 (present DPL2 32-bit call gate) which would be illegal
> in IDT, hence #GP(0), should fetch the entry at 0000_0068, encounter an
> access byte of 1_2_01100 which would be illegal, so #DF and it ends
> being a triple fault anyway...

It's been a few years now since I set up a call gate for PM ... What is
"illegal in [an] IDT" for the call gate? DPL of 2? My notes say DPL must
match caller's PL ...

In this case, the reset is intended for RM, not PM. Yes, LIDT works for RM.
So, IVTs, not IDTs, apply. The IVT isn't filled with 0xCC by default.
After not triple-faulting, I enabled that code just in case, and was
surprised when it trapped. It just happens that the IVT is in the region
filled with 0xCC, except for INT 3 which I reset. The issue exists without
a 0xCC'd IVT, i.e., with whatever 16-bit RM IVT address is for vector zero.
If INT 0 for an IDT of size zero was loading a 0xCC'd IVT for INT 0, it'd
end up at RM address of CCCC:CCCCh or D998Ch. That's an optional ROM
BIOS region or network card. If code execution transferred to D998Ch, it
likely wouldn't execute my 0xCC's in low memory as an instruction.
Therefore, it wouldn't end up in INT 3's interrupt routine. Somehow,
something below 600h is being executed. That's my best guess ... I have to
recheck to see if v86 is active or not. I don't believe so, but it could
be. If v86 was causing a problem with LIDT, I'd think I'd have seen the
issue on other machines.
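
The CCCC:CCCCh to D998Ch step above can be reproduced by reading a vector out of a 0xCC-filled block. A small sketch in Python, assuming a flat byte buffer standing in for low memory and a base of zero; the helper name is hypothetical, and the re-set INT 3 entry is not modelled:

```python
import struct

IVT_ENTRY_SIZE = 4  # real-mode vector: 16-bit offset followed by 16-bit segment

def ivt_target(memory: bytes, vector: int) -> int:
    """Linear address a real-mode INT would jump to for the given vector."""
    offset, segment = struct.unpack_from("<HH", memory, vector * IVT_ENTRY_SIZE)
    return (segment << 4) + offset

# Low memory from 0 up to 600h filled with 0xCC, as described in the post.
low_memory = bytes([0xCC]) * 0x600

print(hex(ivt_target(low_memory, 0)))  # 0xd998c
```

Since D998Ch is well above 600h, a jump through a 0xCC'd vector would not by itself explain a 0xCC in low memory being executed.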


Rod Pemberton







Rod Pemberton

Nov 9, 2011, 2:41:27 AM
"wolfgang kern" <now...@never.at> wrote in message
news:j9bv9v$ph$2...@newsreader2.utanet.at...
> "Rod Pemberton" <do_no...@noavailemail.cmm> schrieb im Newsbeitrag
> news:j9a7oe$9if$1...@speranza.aioe.org...
> > "wolfgang kern" <now...@never.at> wrote in message
> > news:j98hbr$uag$1...@newsreader2.utanet.at...
> >> > Rod Pemberton wrote:
> > ...
> >
...

> The function (at least in my OS) is there, so why not use it?
>

I don't recall. Is your OS in RM? Surprised, if it is ... Wait, no, you
said you switched to RM from PM ... RM and PM? v86? PM but
switch back to RM for BIOS hardware and video BIOS ... ?

> > The keyboard controller and null IDT should work for PM ...
>
> Keyboard method may not work if emulation is off.
>

That's not something I considered, but that's also something I had no
intention of supporting either. (USB keyboard without BIOS emulation)

USB is good. It'd be nice if I could ever get to the point of USB support ...
But, I've got lots to go on other stuff, and the project has been stalled
for a few years.

> >> > The problem is that it's not triple-faulting.
> >> > Interrupts and NMI are disabled.
>
> >> did you set all 6 IDTR bytes to null ;size=0 pointer=0 ?
> >> a zero size should just do what you want.
>
> > I set the 6 bytes loaded by LIDT - whatever they're called - to null,
> > yes. Do I have an o32 on LIDT? No. Is LIDT one of those instructions
> > that reads 8 bytes but uses only 6 ... ? I'll have to check that.
>
> LIDT reads only 48 bits from memory, valid for RM,PM16,PM32.
> An operand size prefix will/should be ignored.
>

Lookup LIDT description, yet again ...

How do you specify a 32-bit base address in RM for LIDT, instead of 24-bit,
without using an operand size prefix? E.g., for a 32-bit PM IDT inited from
RM. Most people would init from PM, but that should be legal from RM, yes?


Rod Pemberton



James Harris

Nov 9, 2011, 3:05:42 AM
On Nov 9, 7:41 am, "Rod Pemberton" <do_not_h...@noavailemail.cmm>
wrote:
> "wolfgang kern" <nowh...@never.at> wrote in message
> news:j9bv9v$ph$2...@newsreader2.utanet.at...
> > "Rod Pemberton" <do_not_h...@noavailemail.cmm> wrote in message
> > news:j9a7oe$9if$1...@speranza.aioe.org...
> > > "wolfgang kern" <nowh...@never.at> wrote in message
Are you saying you run LIDT in real mode before switching to PM (or
run LIDT in PM and switch back to RM)?

Whether you are or not I wondered if that would work and found some
references as follows.

80386 Programmer's Reference Manual
"10.4.1 Interrupt Descriptor Table
The IDTR may be loaded in either real-address or protected mode.
However, the format of the interrupt table for protected mode is
different than that for real-address mode."

Which doesn't state but it does imply that the load needs to be in the
mode that the IDT is for, and, to focus on the key phrase in the text
(but not intending to hide the context - q.v.): "...a valid IDT has
been **created in protected mode**." (Asterisks added.)

Pentium Pro Family Developer's Manual Vol 3: Operating System Writer's
Guide
"8.8.1. Switching to Protected Mode
...The 32-bit Intel Architecture processors have slightly different
requirements for switching to protected mode. To insure upwards and
downwards code compatibility with all 32-bit Intel Architecture
processors, it is recommended that the following steps be performed:"

and **after** step 3 (set PM in CR0) step 8 is
"8. Execute the LIDT instruction to load the IDTR register with the
address and limit of the protected-mode IDT."

In other words, even if LIDT with an operand-size prefix works across modes
on some CPUs, it is not documented to work that way on all of them, whereas
running it in the mode concerned (PM or RM) should always work.

James

wolfgang kern

Nov 9, 2011, 4:51:11 AM

Rod Pemberton wrote:

>> The function (at least in my OS) is there, so why not use it?

> I don't recall. Is your OS in RM? Surprised, if it is ... Wait, no, you
> said you switched to RM from PM ... RM and PM? v86? PM but
> switch back to RM for BIOS hardware and video BIOS ... ?

Yeah :) I got back- and forward links for both without VM86.

>>> The keyboard controller and null IDT should work for PM ...

>> Keyboard method may not work if emulation is off.

> That's not something I considered, but that's also something I had no
> intention of supporting either. (USB keyboard without BIOS emulation)

> USB is good. It'd be nice if I could ever get to the point of USB support ...
> But, I've got lots to go on other stuff, and the project has been stalled
> for a few years.

Looks like we're both waiting for Ben's book :)

>>>>> The problem is that it's not triple-faulting.
>>>>> Interrupts and NMI are disabled.
>>>> did you set all 6 IDTR bytes to null ;size=0 pointer=0 ?
>>>> a zero size should just do what you want.

>>> I set the 6 bytes loaded by LIDT - whatever they're called - to null,
>>> yes. Do I have an o32 on LIDT? No. Is LIDT one of those instructions
>>> that reads 8 bytes but uses only 6 ... ? I'll have to check that.

>> LIDT reads only 48 bits from memory, valid for RM,PM16,PM32.
>> An operand size prefix will/should be ignored.

> Lookup LIDT description, yet again ...

I just checked my code, neither of my LIDTs has a 66h in front of it.

> How do you specify a 32-bit base address in RM for LIDT, instead of 24-bit,
> without using an operand size prefix? E.g., for a 32-bit PM IDT inited
> from RM.

LIDT base for RM should reside below 1MB of course, but because it may
be loaded while still in PM32 it's best to always define all 48 bits.
I vaguely remember once reading that just a 24-bit base is loaded when in RM(?).

> Most people would init from PM, but that should be legal from RM,
> yes?

the RM-IDT BIOS default is at RAM-bottom (0), but after I switched to PM,
I need to reload IDTR with full 48-bit for a backlink to RM.

__
wolfgang


Antoine Leca

Nov 9, 2011, 6:52:56 AM
Rod Pemberton wrote:
> It's been a few years now since I setup a call gate for PM ... What is
> "illegal in [an] IDT" for the call gate?

IDT should only have interrupt gates, trap gates, or task gates if you
really feel brave ;-) ; call gates are supposed to be in GDT or LDT; the
number of parameters which is the biggest difference with trap gates
won't make much sense for an interrupt switch anyway.

> DPL of 2? My notes say DPL must matches caller's PL ...

It is purely theoretical analysis: DPL of 2 is what results when you
analyze an attribute byte set to 0CCh (1100_1100 or 1_10_01100 when
grouped accordingly.)
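
The bit grouping above (1_10_01100) can be checked mechanically. A minimal sketch in Python, assuming the usual protected-mode descriptor layout (P in bit 7, DPL in bits 6-5, S in bit 4, type in bits 3-0); the helper and the table of gate names are illustrative only:

```python
# Gate types that may legally appear in a protected-mode IDT.
IDT_GATE_TYPES = {
    0x5: "task gate",
    0x6: "16-bit interrupt gate",
    0x7: "16-bit trap gate",
    0xE: "32-bit interrupt gate",
    0xF: "32-bit trap gate",
}

def decode_access_byte(value: int) -> dict:
    """Split a descriptor access byte into its fields."""
    return {
        "present": bool(value & 0x80),
        "dpl": (value >> 5) & 0x3,
        "system": not (value & 0x10),  # S bit clear means a system descriptor
        "type": value & 0x0F,
    }

fields = decode_access_byte(0xCC)
print(fields)                            # present, DPL 2, system, type 12 (0xC)
print(fields["type"] in IDT_GATE_TYPES)  # False: 0xC is a 32-bit call gate
```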


> In this case, the reset is intended for RM, not PM.

I was considering the V86 case you mentioned, and also considering a
case where something had gone wrong (SMM or some strange thing like
that) which would have left the CPU in some strange mode. But your detailed
reasoning shows me now that indeed RM is at hand, because the INT03
handler is at a distinct position in an IVT (000C) or in an IDT (0018).
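
The two positions quoted for INT03 follow directly from the entry sizes of the two table formats. A quick sketch in Python, assuming a table base of zero; the constant and function names are made up for illustration:

```python
IVT_ENTRY_SIZE = 4  # real mode: offset:segment pointer
IDT_ENTRY_SIZE = 8  # protected mode: gate descriptor

def entry_offset(vector: int, entry_size: int) -> int:
    """Byte offset of a vector's entry from the table base."""
    return vector * entry_size

print(hex(entry_offset(3, IVT_ENTRY_SIZE)))   # 0xc,  INT 3 in a real-mode table
print(hex(entry_offset(3, IDT_ENTRY_SIZE)))   # 0x18, INT 3 in a protected-mode IDT
print(hex(entry_offset(13, IDT_ENTRY_SIZE)))  # 0x68, the #GP entry mentioned earlier
```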

Sorry I cannot help further.


Antoine

Antoine Leca

Nov 9, 2011, 6:53:03 AM
wolfgang kern wrote:
>> The keyboard controller and null IDT should work for PM ...
>
> Keyboard method may not work if emulation is off.

Hmmm... the i8042 KBC is hardware, I do not believe there is much
emulation to turn on or off here; unless you are speaking about hardware
where the 8042 controller is not emulated, but then I guess you are
going to have more trouble than just KBC-reset-by-clearing-bit-0-of-port-64
not working...


> LIDT reads only 48 bits from memory, valid for RM,PM16,PM32.

Probably yes (16+32), as I wrote yesterday.

> An operand size prefix will/should be ignored.

It is not, for compatibility with the 286 processor, where the SIDT
instruction wrote 0FFh in the top byte of the base, while it should
logically have been 00h. So the 16-bit LIDT masks off the upper 8 bits
of the 32-bit base address it just reads (effectively restricting the
IVT/IDT to be in the lower 16MB), while the 32-bit version allows the
table to be in any part of the logical address space.
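
The effect of the operand size on the loaded base can be modelled in a few lines. A sketch in Python of the behaviour described above, assuming the 6-byte operand is available as raw bytes; `lidt_load` is a hypothetical model, not a real API, and the sample values are invented:

```python
import struct

def lidt_load(image: bytes, operand_size: int):
    """Model of what ends up in IDTR for a 16-bit or 32-bit LIDT."""
    if operand_size not in (16, 32):
        raise ValueError("operand size must be 16 or 32")
    limit, base = struct.unpack("<HI", image)
    if operand_size == 16:
        base &= 0x00FFFFFF  # upper 8 bits of the base are dropped
    return limit, base

image = struct.pack("<HI", 0x07FF, 0x12345678)  # invented limit and base
print([hex(x) for x in lidt_load(image, 16)])   # ['0x7ff', '0x345678']
print([hex(x) for x in lidt_load(image, 32)])   # ['0x7ff', '0x12345678']
```

For a null IDT the distinction does not matter, since all six bytes are zero either way.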


Antoine

Rod Pemberton

Nov 10, 2011, 8:30:50 AM
"James Harris" <james.h...@googlemail.com> wrote in message
news:19d561d7-a11a-460f...@f29g2000yqa.googlegroups.com...
> On Nov 9, 7:41 am, "Rod Pemberton" <do_not_h...@noavailemail.cmm>
> wrote:
...

> > > LIDT reads only 48 bits from memory, valid for RM,PM16,PM32.
> > > An operand size prefix will/should be ignored.
>
> > Lookup LIDT description, yet again ...
>
> > How do you specify a 32-bit base address in RM for LIDT, instead of 24-bit,
> > without using an operand size prefix? E.g., for a 32-bit PM IDT inited
> > from RM. Most people would init from PM, but that should be legal from
> > RM, yes?
>
> Are you saying you run LIDT in real mode before switching to PM
> (or run LIDT in PM and switch back to RM)?

1) Me, personally? No, I don't, or don't believe so. I do the following
for various programs. I reset the RM IVT location to default RM IVT
location (zero) for some code to ensure it wasn't changed. I relocate the
RM IVT to detect or ignore direct writes to the default region of the RM
IVT. I set null IVT/IDT for resets in RM and PM. In PM, I load or reload
an IDT for PM.

2) It's my understanding that both of those uses are valid though. AIR, the
reason LIDT is not a privileged instruction is to allow that. I seem to
recall that loading a PM IDT in RM prior to the PM switch was needed in
some situations.

> 80386 Programmer's Reference Manual
> "10.4.1 Interrupt Descriptor Table
> The IDTR may be loaded in either real-address or protected mode.
> However, the format of the interrupt table for protected mode is
> different than that for real-address mode."
>

Let's quote 10.4.1 in full. There is other stuff there I'll be mentioning
later:

"The IDTR may be loaded in either real-address or protected mode. However,
the format of the interrupt table for protected mode is different than that
for real-address mode. It is not possible to change to protected mode and
change interrupt table formats at the same time; therefore, it is inevitable
that, if IDTR selects an interrupt table, it will have the wrong format at
some time. An interrupt or exception that occurs at this time will have
unpredictable results. To avoid this unpredictability, interrupts should
remain disabled until interrupt handlers are in place and a valid IDT has
been created in protected mode."

> Which doesn't state but it does imply that the load needs to be in the
> mode that the IDT is for, and, to focus on the key phrase in the text
> (but not intending to hide the context - q.v.): "...a valid IDT has
> been **created in protected mode**." (Asterisks added.)

10.4.1 is clearly accurate until the final "protected mode" which would
only apply to PM ... It seems like the 386 manual only explains PM. I
don't see much RM explanation in the manual. I think, maybe, you need the
86 or 286 manual for RM explanations.

AFAICT, the 386 manual usage doesn't use IVT at all, and only mentions
"interrupt vector table" twice. It explains the PM IDT in 9.4, but doesn't
explain the RM IVT. The 386 manual's INT instruction description uses "IDT"
and "interrupt descriptor table" for both PM *and* RM:

"The INT instruction generates via software a call to an interrupt handler.
The immediate operand, from 0 to 255, gives the index number into the
Interrupt Descriptor Table (IDT) of the interrupt routine to be
called. In Protected Mode, the IDT consists of an array of eight-byte
descriptors; the descriptor for the interrupt invoked must indicate an
interrupt, trap, or task gate. In Real Address Mode, the IDT is an array of
four byte-long pointers. In Protected and Real Address Modes, the base
linear address of the IDT is defined by the contents of the IDTR."

That's the only explanation of the RM IVT in the 386 manual that I could
find ... I.e., post 386, the RM IVT is really a variant form of an IDT, and
no longer an IVT.

> Pentium Pro Family Developer's Manual Vol 3: Operating System Writer's
> Guide
> "8.8.1. Switching to Protected Mode
> ...The 32-bit Intel Architecture processors have slightly different
> requirements for switching to protected mode. To insure upwards and
> downwards code compatibility with all 32-bit Intel Architecture
> processors, it is recommended that the following steps be performed:"
>
> and **after** step 3 (set PM in CR0) step 8 is
> "8. Execute the LIDT instruction to load the IDTR register with the
> address and limit of the protected-mode IDT."

I accept that as typical or standard or preferred usage. I don't believe
it's the only usage.

> In other words, even if LIDT works with an opsiz between modes in some
> CPUs it is not documented to work on all of them that way but running
> it in the mode concerned: PM or RM should always work.

I don't agree with that conclusion.

First, 10.4.1 above implies you can load the IDTR before or after a mode
switch. Obviously, you shouldn't trigger an interrupt when loading a PM IDT
while in RM, or vice versa.

Second, I don't see anything that would indicate that LIDT won't work with
an operand-size prefix. Table 17-1 of the 386 manual should work for all
instructions of the 386. LLDT, LMSW, LTR, SLDT and STR have a
note that the operand-size has "no effect". These instructions only accept
16-bit operands. LIDT has no such note, and uses both 16-bit and 32-bit
operands. I.e., it should be legal to operand-size override LIDT, yes?

Third, 14.3 (below) clearly indicates that LIDT is valid in RM, and
relocating the IVT in RM is valid too. Under normal situations, why would
RM code need to relocate the IVT from its default location? It wouldn't.
It's unneeded. So, why provide the ability to relocate the IVT while in RM?
I.e., setup for PM.

14.3
"The primary difference in the interrupt handling of the 80386 compared to
the 8086 is that the location and size of the interrupt table depend on the
contents of the IDTR (IDT register). Ordinarily, this fact is not apparent
to programmers, because, after RESET, the IDTR contains a base address of 0
and a limit of 3FFH, which is compatible with the 8086. However, the LIDT
instruction can be used in real-address mode to change the base and limit
values in the IDTR."


Rod Pemberton







Marven Lee

Nov 11, 2011, 8:58:51 AM
Rod Pemberton wrote:
> As you know or should, you can reset a machine via a triple-fault when an
> int instruction is executed and the idt has zero length, i.e., no
> available
> vectors to call for the int instruction. AIUI, this is supposed to work
> on
> all machines.

I never bothered trying to reset my machine through triple faulting
intentionally. I just let my PC BSOD on a crash and used to put up
a "safe to turn off computer" message when the OS shut down.

I started tinkering with Intel's ACPI-CA code and linked it into my
kernel so that I could turn my computer off by sending it to sleep.

I don't fully understand ACPI or ACPI-CA, but I've seen references to
RESET_REG and RESET_VALUE which I believe might be in the
ACPI FADT (Fixed ACPI Description Table). Maybe you write the
value to the register and the machine resets?
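
For what it's worth, that guess matches how the ACPI specification (2.0 and later) describes the FADT: it carries a RESET_REG address and a RESET_VALUE byte, plus a flag bit saying whether the mechanism is supported. Below is a minimal C sketch of pulling those fields out of a raw FADT image. It assumes the field offsets from the spec; the struct and function names are made up for illustration, and actually writing the value to the register is left to the kernel.

```c
#include <stddef.h>
#include <stdint.h>

/* Byte offsets into the FADT, as laid out in the ACPI specification (2.0+). */
#define FADT_FLAGS_OFFSET        112   /* 32-bit feature flags              */
#define FADT_RESET_REG_OFFSET    116   /* 12-byte Generic Address Structure */
#define FADT_RESET_VALUE_OFFSET  128   /* byte to write to RESET_REG        */
#define FADT_FLAG_RESET_REG_SUP  (1u << 10)

/* Hypothetical container for the result; not part of any real API. */
struct acpi_reset_info {
    uint8_t  address_space;  /* 0 = system memory, 1 = system I/O, 2 = PCI config */
    uint64_t address;        /* where to write                                   */
    uint8_t  value;          /* what to write                                    */
};

/* Returns 1 and fills *out if the table advertises reset support, else 0. */
int fadt_get_reset(const uint8_t *fadt, size_t len, struct acpi_reset_info *out)
{
    const uint8_t *gas;
    uint32_t flags;
    int i;

    if (len < FADT_RESET_VALUE_OFFSET + 1)
        return 0;                       /* table too short to hold the fields */

    /* Assemble the little-endian flags field byte by byte. */
    flags = (uint32_t)fadt[FADT_FLAGS_OFFSET]
          | ((uint32_t)fadt[FADT_FLAGS_OFFSET + 1] << 8)
          | ((uint32_t)fadt[FADT_FLAGS_OFFSET + 2] << 16)
          | ((uint32_t)fadt[FADT_FLAGS_OFFSET + 3] << 24);
    if (!(flags & FADT_FLAG_RESET_REG_SUP))
        return 0;                       /* firmware does not support it */

    gas = fadt + FADT_RESET_REG_OFFSET;
    out->address_space = gas[0];        /* bytes 1..3: bit width, bit offset, access size */
    out->address = 0;
    for (i = 0; i < 8; i++)             /* 64-bit little-endian address at bytes 4..11 */
        out->address |= (uint64_t)gas[4 + i] << (8 * i);
    out->value = fadt[FADT_RESET_VALUE_OFFSET];
    return 1;
}
```

If the address space turns out to be system I/O, the kernel would then do a single byte-wide port write of `value` to `address`.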


--
Marv


James Harris

Nov 11, 2011, 3:29:26 PM
On Nov 10, 1:30 pm, "Rod Pemberton" <do_not_h...@noavailemail.cmm>
wrote:
> "James Harris" <james.harri...@googlemail.com> wrote in message
7:41 am, "Rod Pemberton" <do_not_h...@noavailemail.cmm>

...

> > > Lookup LIDT description, yet again ...
>
> > > How do you specify a 32-bit base address in RM for LIDT, instead of 24-bit,
> > > without using an operand size prefix? E.g., for a 32-bit PM IDT inited
> > > from RM. Most people would init from PM, but that should be legal from
> > > RM, yes?

This is fascinating. I've been reading up some more and trying out
some code generation options in Nasm and while I find I can't agree
with your reasoning and inferences I think I'm coming to a similar
conclusion as you. I'll try to explain why. See if you agree with any
of my comments.

> > Are you saying you run LIDT in real mode before switching to PM
> > (or run LIDT in PM and switch back to RM)?
>
> 1) Me, personally?  No, I don't, or don't believe so.

OK.

...

> 2) It's my understanding that both of those uses are valid though.  AIR, the
> reason LIDT is not a privileged instruction is to allow that.

LIDT not privileged?

> I seem to
> recall that loading a PM IDT in RM prior to the PM switch was needed in
> some situations.

I can't think of any occasion where LIDT would be needed prior to
switching to PM but LGDT is different. According to the Intel
instructions for compatibility with CPUs past and future LGDT should
be executed just prior to changing CR0.

The two instructions - LGDT and LIDT - are almost identical. This
suggests that what works for LGDT will also work for LIDT - though it
doesn't guarantee it.

...
Interesting research. Could it be nothing more than that Intel has
been a bit sloppy with their terminology?

> > Pentium Pro Family Developer's Manual Vol 3: Operating System Writer's
> > Guide
> > "8.8.1. Switching to Protected Mode
> > ...The 32-bit Intel Architecture processors have slightly different
> > requirements for switching to protected mode. To insure upwards and
> > downwards code compatibility with all 32-bit Intel Architecture
> > processors, it is recommended that the following steps be performed:"
>
> > and **after** step 3 (set PM in CR0) step 8 is
> > "8. Execute the LIDT instruction to load the IDTR register with the
> > address and limit of the protected-mode IDT."
>
> I accept that as typical or standard or preferred usage.  I don't believe
> it's the only usage.

I'm coming round to your view. Since Intel specify the sequence
there's no assurance that LIDT in RM will correctly set up the PM IDT
descriptor on *all* x86-32 CPUs. In practice I expect it would
generally work. Still, I can't think of a reason not to follow Intel's
prescribed sequence and load the PM IDTR in PM.

> > In other words, even if LIDT works with an opsiz between modes in some
> > CPUs it is not documented to work on all of them that way but running
> > it in the mode concerned: PM or RM should always work.
>
> I don't agree with that conclusion.
>
> First, 10.4.1 above implies you can load the IDTR before or after a mode
> switch.  Obviously, you shouldn't trigger an interrupt when loading a PM IDT
> while in RM, or vice versa.

Yes, LIDT works in either mode but isn't the query whether the IDTR
value loaded in one mode is valid in the other? In fact, to my mind
the question is more demanding: will the IDTR loaded by LIDT in one
mode correctly work in the other mode in all cases on all x86-32 CPUs?

It might.

> Second, I don't see anything that would indicate that LIDT won't work with
> an operand-size prefix.

Agreed. I hadn't realised until checking this out but LGDT can
apparently have an opsiz prefix. In fact it seems it *should*! Here's
my take. Can you tell me if I have it right or wrong? Nasm syntax but
I'll include the object code so that it is relevant in other
assemblers.

0F0116[0501] lgdt [mem]

This, in real mode will, I think, load a 16-bit limit and only a 24-
bit base. It will load only five bytes setting the top byte of the
base to zero. Therefore it will work only when the GDT base is in the
range 0 up to 2**24-1. If the base of the GDT is at 2**24 or above the
lgdt command will be unsuccessful. On the other hand

660F0116[0501] o32 lgdt [mem]

will, I think, load a 16-bit limit and a 32-bit base.

How does that look? Anywhere close?
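
To make the five-byte versus six-byte distinction concrete, here is a small C model of the load as the 386 LGDT/LIDT description states it: a 16-bit operand size takes the limit plus a 24-bit base, a 32-bit operand size takes the limit plus the full 32-bit base. This is only a sketch of the documented behaviour, not a claim about what any particular CPU does, and the names `struct dtr` and `load_dtr` are invented for the example.

```c
#include <stdint.h>

/* Model of GDTR/IDTR contents: a 16-bit limit and a 32-bit linear base. */
struct dtr {
    uint16_t limit;
    uint32_t base;
};

/*
 * Model of LGDT/LIDT reading its 6-byte memory operand.
 * operand_size is the effective operand size, 16 or 32.
 * With a 16-bit operand size only three base bytes are used and the
 * top byte of the base is taken as zero.
 */
struct dtr load_dtr(const uint8_t mem[6], int operand_size)
{
    struct dtr r;

    r.limit = (uint16_t)(mem[0] | (mem[1] << 8));
    r.base  = (uint32_t)mem[2]
            | ((uint32_t)mem[3] << 8)
            | ((uint32_t)mem[4] << 16);
    if (operand_size == 32)
        r.base |= (uint32_t)mem[5] << 24;   /* sixth byte only counts here */
    return r;
}
```

With a table at 0x01000000 (16MB), the 16-bit form in this model ends up with a base of zero, which is the failure case described above.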

Nasm-specific question: Based on other tests it seems there should be
a way to specify to Nasm that the operand at [mem] is six bytes
instead of five but I can't find it. Ideal would be

lgdt dword [mem]

as that is consistent with other instructions but the version of Nasm
I have does not accept that form. I know dword in this case applies to
the Base field and not to the whole six bytes. Is there a qualifying
keyword to specify a six-byte piece of data - and a five-byte piece of
data for going the other way? I doubt it. Keywords word and dword seem
far more suitable as the 80386 manual refers to the operand size as
being 16 for 24-bit values.

http://pdos.csail.mit.edu/6.828/2007/readings/i386/LGDT.htm

Notably, where many assemblers leave the programmer to hard-code the
bytes for scenarios such as the above at least Nasm provides ways to
code things with mnemonics. In this case the o32 or o16 keywords do
the job. It's just that with other instructions Nasm allows data-size
keywords which add the prefixes automatically. For example, with bits
16,

FF36[0501] push word [mem]
66FF36[0501] push dword [mem]
66FF36[0501] o32 push word [mem]

The second example is equivalent to the third. It's just that push
dword has added the o32 automatically. The same syntax does not seem
to be available for the lgdt instruction.

>  Table 17-1 of the 386 manual should work for all
> instructions of the 386.  LLDT, LMSW, LTR, SLDT and STR have a
> note that the operand-size has "no effect".  These instructions only accept
> 16-bit operands.  LIDT has no such note, and uses both 16-bit and 32-bit
> operands.  I.e., it should be legal to operand-size override LIDT, yes?

LIDT/SIDT and LGDT/SGDT are fundamentally different from all the
others. AFAICT they are the only instructions that work with a 5-byte
or 6-byte limit and base. So while LLDT, LTR, SLDT and STR have a 16-
bit operand they are different in that their operands are fixed as 16-
bit segment ids.

I'd accept comparison between LIDT and LGDT, though. In that case,
yes, they do seem to accept operand-size overrides.

> Third, 14.3 (below) clearly indicates that LIDT is valid in RM, and
> relocating the IVT in RM is valid too.  Under normal situations, why would
> RM code need to relocate the IVT from its default location?  It wouldn't.
> It's unneeded.  So, why provide the ability to relocate the IVT while in RM?
> I.e., setup for PM.

Do you really infer this? AIUI the IVT can be relocated and should
work in real mode wherever it has been relocated to.

> 14.3
> "The primary difference in the interrupt handling of the 80386 compared to
> the 8086 is that the location and size of the interrupt table depend on the
> contents of the IDTR (IDT register). Ordinarily, this fact is not apparent
> to programmers, because, after RESET, the IDTR contains a base address of 0
> and a limit of 3FFH, which is compatible with the 8086. However, the LIDT
> instruction can be used in real-address mode to change the base and limit
> values in the IDTR."

This just says that in RM it can be relocated and resized. I don't see
this as suggesting that the reason for the ability to relocate/resize
it is to set it up for PM.

I do (now) think that the IDT probably can be defined in real mode for
use in PM - at least on most CPUs. My reason is the similarities with
LGDT, though, and not the same reasoning as yourself.

There still seems greater safety in loading the PM IDTR in PM because
it is specified by Intel to work that way. I expect other
manufacturers would strive to follow Intel's lead. It would be
interesting to hear of a manufacturer whose chips did not work in that
case but it seems unlikely given that moving the IDT while running in
PM is a reasonable thing to do. I can't think of a reason to load the
IDTR in real mode rather than loading it just once, in PM.

Any chimes of agreement in the above?

James

James Harris

Nov 11, 2011, 3:49:31 PM
On Nov 8, 6:31 pm, "wolfgang kern" <nowh...@never.at> wrote:

...

> The jump I mentioned causes the BIOS to do the same as after power-on.
> Of course it assumes that this address isn't paged away or overwritten :)
>
> |On a 32-bit system why use anything other than KBC or SCPA?
>
> sure, both above may work as well, but with my way I haven't to
> check if the legacy keybd-controller is enabled and emulated or not.

I can see this point. Is it possible that there is a machine which has
neither KBC (real or emulated) nor SCPA? And if the KBC per se is not
emulated would the following not work? It is all that should be
required to reset a machine.

mov al, 0xfe    ; KBC command 0xFE: pulse the CPU reset line
out 0x64, al    ; keyboard controller command port

with possibly a delay and then, just in case the delay returns,

mov al, 1       ; bit 0 = fast reset (this value also clears bit 1, the A20 gate)
out 0x92, al    ; System Control Port A

Maybe another delay is needed then your method as final fallback?
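
The fallback order above can be written down as data, which keeps the policy separate from the port I/O itself. The following is a rough C sketch under that assumption: `struct port_write` and `plan_reset_writes` are hypothetical names, the port and value pairs are just the two from the snippets above, and the kernel's own port-output routine and delay are assumed to exist elsewhere.

```c
#include <stddef.h>
#include <stdint.h>

/* One byte-wide port write in the reset sequence. */
struct port_write {
    uint16_t port;
    uint8_t  value;
};

/*
 * Copies the reset attempts, in the order they should be tried, into out[]
 * (at most cap entries) and returns how many were written.  The caller is
 * expected to perform each write, wait a little, and move on to the next
 * entry if the machine is still running.
 */
size_t plan_reset_writes(struct port_write *out, size_t cap)
{
    static const struct port_write steps[] = {
        { 0x64, 0xFE },  /* keyboard controller: pulse the reset line */
        { 0x92, 0x01 },  /* System Control Port A: fast reset bit     */
    };
    size_t n = sizeof steps / sizeof steps[0];
    size_t i;

    if (n > cap)
        n = cap;
    for (i = 0; i < n; i++)
        out[i] = steps[i];
    return n;
}
```

If every entry has been tried and control still returns, that is the point at which a last-resort method such as the jump to the init location would come in.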

> I haven't seen any PC-XT/AT++ BIOS which haven't got this far jump
> at this address during the last three decades.
> Looks like this olde hardware standard survived all new ideas :)

Over those thirty years you have restricted yourself to specific
machines, haven't you? :-)

Here's another potential reason for not jumping to the init location.
The bootstrap code may not, but is entitled to, expect certain registers
to have certain values following reset. For example, take a look at
the 242692 Pentium Pro Family Developer’s Manual Volume 3 Operating
System Writer’s Guide table 8-1. It lists a raft of register values
that a Pentium Pro should start up with. To jump to the init location
properly one should set these registers correctly first. Otherwise,
although the machine might reboot, it may store incorrect information
in certain RAM fields leading to obscure reports or even subtle bugs.

Quite apart from the completeness of the approach some of the
registers seem quite important in themselves. In the table mentioned
above, there are all the CRs, EDX, DR6 and DR7 and the TSC, for
example.

Worst of all, to set up the registers properly requires different code
for each type of CPU that jumps to the init location. Not an appealing
thought!

James

wolfgang kern

Nov 11, 2011, 4:12:06 PM

Rod Pemberton said:

> ...

>>>> LIDT reads only 48 bits from memory, valid for RM,PM16,PM32.
>>>> An operand size prefix will/should be ignored.

>> Are you saying you run LIDT in real mode before switching to PM
>> (or run LIDT in PM and switch back to RM)?

Yes, I have to switch it with every mode change.

> 1) Me, personally? No, I don't, or don't believe so. I do the following
> for various programs. I reset the RM IVT location to default RM IVT
> location (zero) for some code to ensure it wasn't changed. I relocate the
> RM IVT to detect or ignore direct writes to the default region of the RM
> IVT. I set null IVT/IDT for resets in RM and PM. In PM, I load or reload
> an IDT for PM.

Ok for RM bootup.

> 2) It's my understanding that both of those uses are valid though. AIR,
> the
> reason LIDT is not a privileged instruction is to allow that. I seem to
> recall that loading a PM IDT in RM prior to the PM switch was needed in
> some situations.

LIDT 'is' a privileged instruction (needs CPL 0 in PM; in RM it is always allowed).
Never mind what they're called; the fact is that an RM-IDT works on 16:16 vectors
and a PM-IDT uses a 64-bit struct to define type and pointer to a routine.

>> Pentium Pro Family Developer's Manual Vol 3: Operating System Writer's
>> Guide
>> "8.8.1. Switching to Protected Mode
>> ...The 32-bit Intel Architecture processors have slightly different
>> requirements for switching to protected mode. To insure upwards and
>> downwards code compatibility with all 32-bit Intel Architecture
>> processors, it is recommended that the following steps be performed:"

>> and **after** step 3 (set PM in CR0) step 8 is
>> "8. Execute the LIDT instruction to load the IDTR register with the
>> address and limit of the protected-mode IDT."

> I accept that as typical or standard or preferred usage. I don't believe
> it's the only usage.

Me think a mode change (PM16/32 <-> RM) will always need an IDT update.

>> In other words, even if LIDT works with an opsiz between modes in some
>> CPUs it is not documented to work on all of them that way but running
>> it in the mode concerned: PM or RM should always work.

> I don't agree with that conclusion.

what's the question about this ?

> First, 10.4.1 above implies you can load the IDTR before or after a mode
> switch. Obviously, you shouldn't trigger an interrupt when loading a PM
> IDT
> while in RM, or vice versa.

Of course not a good idea to enable IRQs while switching CPU-modes :)
But my OS has identically working IRQ handlers for both RM and PM16/32, so
I can keep the IRQ-disable cycle quite short and won't ever miss an event.

> Second, I don't see anything that would indicate that LIDT won't work with
> an operand-size prefix. Table 17-1 of the 386 manual should work for all
> instructions of the 386. LLDT, LMSW, LTR, SLDT and STR have a
> note that the operand-size has "no effect". These instructions only
> accept
> 16-bit operands. LIDT has no such note, and uses both 16-bit and 32-bit
> operands. I.e., it should be legal to operand-size override LIDT, yes?

No, LIDT doesn't need any operand-size prefix because it will always work
in any mode as expected on a 6-byte struct (memory base and size).
Ancient 286 and 386 CPUs may have clipped the high byte of the base to 0 or FFh
while in 16-bit RM,
but I see this as a matter for the museum rather than an issue today.

> Third, 14.3 (below) clearly indicates that LIDT is valid in RM, and
> relocating the IVT in RM is valid too. Under normal situations, why would
> RM code need to relocate the IVT from its default location? It wouldn't.
> It's unneeded. So, why provide the ability to relocate the IVT while in
> RM?
> I.e., setup for PM.

I could assign the REAL-mode IDT to somewhere else than 0:0
(ie: to the HMA ffff:0010h aka 0010_0000h)

> 14.3
> "The primary difference in the interrupt handling of the 80386 compared to
> the 8086 is that the location and size of the interrupt table depend on
> the
> contents of the IDTR (IDT register). Ordinarily, this fact is not apparent
> to programmers, because, after RESET, the IDTR contains a base address of
> 0
> and a limit of 3FFH, which is compatible with the 8086. However, the LIDT
> instruction can be used in real-address mode to change the base and limit
> values in the IDTR."

Yes.
There are no features like virtual interrupts designed for true RM.
Beside all the historic information from 286/386/early 486...
we now got a standard x86-platform which got two different IDTs,
one for RM16 and the other one for PM (either 16 or 32 bit gates).
VM86 may behave different depending on the OS which deployed it.

RM IDT is usually found and kept at 0000_0000 but who can hold us
from changing this ? It's just a matter of design.

V86-IVT (the so called virtual IDT) doesn't exist in true RM.
It's an abstraction for HLL-coders convenience and actually got
not much sense in my world :)
__
wolfgang


wolfgang kern

Nov 11, 2011, 5:18:32 PM

James Harris mentioned:
[LIDT]
....
|Yes, LIDT works in either mode but isn't the query whether the IDTR
|value loaded in one mode is valid in the other? In fact, to my mind
|the question is more demanding: will the IDTR loaded by LIDT in one
|mode correctly work in the other mode in all cases on all x86-32 CPUs?
|
|It might.

Except that a RM-IDT holds 16:16 (CS:IP) vectors and a PM-IDT holds
64-bit entries (int-gate/int-trap descriptors).
....

|Any chimes of agreement in the above?

We may not agree in the interpretation of the (old dubious) manuals here :)
What I figured is that my machines don't need a o32-prefix on LIDT/LGDT
regardless of RM16/PM32 [looks like any operand-size prefix is ignored].

Ok, I'll check just for the record if I can set the IDT to above 16MB
when I'm using Big-Real- aka Unreal-mode. Give me a few days...

__
wolfgang





James Harris

Nov 11, 2011, 6:00:45 PM
On Nov 11, 10:18 pm, "wolfgang kern" <nowh...@never.at> wrote:
> James Harris mentioned:
> [LIDT]
> ....
> |Yes, LIDT works in either mode but isn't the query whether the IDTR
> |value loaded in one mode is valid in the other? In fact, to my mind
> |the question is more demanding: will the IDTR loaded by LIDT in one
> |mode correctly work in the other mode in all cases on all x86-32 CPUs?
> |
> |It might.
>
> Except that a RM-IDT holds 16:16 (CS:IP) vectors and a PM-IDT holds
> 64-bit entries (int-gate/int-trap descriptors).

True but the IDTR (the subject of the LIDT instruction) holds
basically a limit and a base doesn't it? It might be ignorant of the
format of the memory between Base and Base + Limit.
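
That view of the IDTR can be sketched in C: the register is modelled as nothing but a limit and a base, and the CPU mode alone decides how big an entry is (4 bytes for a real-mode offset:segment vector, 8 bytes for a protected-mode gate). This is an illustrative model only; `struct idtr`, `enum cpu_mode` and `idt_entry_address` are names invented here.

```c
#include <stdint.h>

enum cpu_mode { REAL_MODE, PROTECTED_MODE };

/* The IDTR as modelled here: just a limit and a linear base address. */
struct idtr {
    uint16_t limit;   /* offset of the last valid byte in the table */
    uint32_t base;
};

/* Entry size depends on the mode, not on anything stored in the IDTR. */
static uint32_t idt_entry_size(enum cpu_mode mode)
{
    return mode == REAL_MODE ? 4u : 8u;
}

/*
 * Returns 1 and stores the linear address of the entry for 'vector' if the
 * whole entry lies within the limit, otherwise returns 0 (the case in which
 * a real CPU would raise a fault instead of dispatching the interrupt).
 */
int idt_entry_address(struct idtr r, enum cpu_mode mode,
                      uint8_t vector, uint32_t *addr)
{
    uint32_t size = idt_entry_size(mode);
    uint32_t last = (uint32_t)vector * size + (size - 1);

    if (last > r.limit)
        return 0;
    *addr = r.base + (uint32_t)vector * size;
    return 1;
}
```

With the reset values (base 0, limit 3FFh) this model covers all 256 real-mode vectors but only 128 protected-mode gates, and with a limit of zero no vector fits in either mode, which is the situation the null-IDT reset relies on.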

> ....
>
> |Any chimes of agreement in the above?
>
> We may not agree in the interpretation of the (old dubious) manuals here :)
> What I figured is that my machines don't need a o32-prefix on LIDT/LGDT
> regardless of RM16/PM32 [looks like any operand-size prefix is ignored].
>
> Ok, I'll check just for the record if I can set the IDT to above 16MB
> when I'm using Big-Real- aka Unreal-mode. Give me a few days...

Cool! It would be good to hear. As someone mentioned we might be able
to find out by loading something above 16Mby and then using SIDT to
see the value of the MSByte.

Incidentally, I wonder if we get away with a 5-byte LGDT because we
don't place an initial GDT high enough to make a difference. The Intel
instruction descriptions seem to bear this out as seen at

http://pdos.csail.mit.edu/6.828/2007/readings/i386/LGDT.htm

where in contrast with the load instructions it says, "The SGDT and
SIDT instructions always store into all 48 bits."

James

Rod Pemberton

Nov 11, 2011, 10:22:55 PM
"James Harris" <james.h...@googlemail.com> wrote in message
news:fcef42b6-10da-49d6...@p5g2000vbm.googlegroups.com...
> On Nov 10, 1:30 pm, "Rod Pemberton" <do_not_h...@noavailemail.cmm>
> wrote:
> > "James Harris" <james.harri...@googlemail.com> wrote in message
> > > 7:41 am, "Rod Pemberton" <do_not_h...@noavailemail.cmm>
...

> > > Are you saying you run LIDT in real mode before switching
> > > to PM (or run LIDT in PM and switch back to RM)?
>
> > 1) Me, personally? No, I don't, or don't believe so.
>
> OK.

Correction. I found one program of mine that does ... :-) It was never
completed. I was trying to set up some PM stuff from RM. It looks like the
overrides aren't necessary.

> > 2) It's my understanding that both of those uses are valid though. AIR,
> > the reason LIDT is not a privileged instruction is to allow that.
>
> LIDT not privileged?

Sorry, I mean that it works in RM too ...

> > [quote]
>
> > That's the only explanation of the RM IVT in the 386 manual that I could
> > find ... I.e., post 386, the RM IVT is really a variant form of an IDT,
> > and no longer an IVT.
>
> Interesting research. Could it be nothing more than that Intel has
> been a bit sloppy with their terminology?

It could be. The older manuals are an "easier read," IMO. Maybe, they are
less precise too? But, hey, try to find something on 16-bit PM in a modern
manual, or a 286 TSS ...

> > I accept that as typical or standard or preferred usage.
> > I don't believe it's the only usage.
>
> I'm coming round to your view.

I think the underlying design is very "generic", at least originally.
However, usage in certain "standard" ways results in specific code
sequences.

> Since Intel specify the sequence there's no assurance that LIDT in RM
> will correctly set up the PM IDT descriptor on *all* x86-32 CPUs.

AIR, I attempted to follow both old and new manuals, and there are a number
of conflicting differences.

It would seem obvious that they found some type of incompatibility with the
switch to 32-bit PM. The "more precise" method is only in recent manuals, yes?
However, they don't mention what cpu had a problem or what their rationale
is. E.g., it might not be that one of their cpu's had an issue. It could
be Cyrix or Transmeta or NEC or AMD or ...

Since they don't mention the reason, it's also possible they intend to merge
instructions or perhaps eliminate some instructions in the future. E.g.,
maybe there is some problem with design changes for future cpu's. In which
case, the startup sequence may be more similar to what they suggest and
easier to fix. Or, it could just be a slight timing issue. Who knows?

> Yes, LIDT works in either mode but isn't the query whether the
> IDTR value loaded in one mode is valid in the other?

If the IDTR is loaded in one mode for use in another and the IDT is not used
until the correct mode for it is activated, what is the issue? Unreal mode
works by setting parts of registers not available in RM.

> In fact, to my mind the question is more demanding: will the IDTR
> loaded by LIDT in one mode correctly work in the other mode in
> all cases on all x86-32 CPUs?
>
> It might.

LIDT only sets the base and size, yes? The cpu mode selects how data
at that location is interpreted, yes? That's how I understand it. Maybe
that's wrong.

> I hadn't realised until checking this out but LGDT can
> apparently have an opsiz prefix. In fact it seems it *should*!

I have 66h (i.e., o32 or o16 for NASM) on some and not on others. I don't
think any of them are critical, i.e., none are above 1MB or 16MB ...
They're all probably fine without it. IIRC, I only did so to make sure that
the upper bits were loaded or cleared.

> my take. Can you tell me if I have it right or wrong? Nasm syntax but
> I'll include the object code so that it is relevant in other
> assemblers.
>
> 0F0116[0501] lgdt [mem]
>
> This, in real mode will, I think, load a 16-bit limit and only a 24-
> bit base.

Yes.

> It will load only five bytes setting the top byte of the
> base to zero.

(Loads six. Uses five. ?)

> Therefore it will work only when the GDT base is in the
> range 0 up to 2**24-1.

Yes.

> If the base of the GDT is at 2**24 or above the
> lgdt command will be unsuccessful.

Not sure. Does it just wrap? I.e., upper address bits are effectively
truncated since they aren't loaded or are cleared? That's my guess.

> On the other hand
>
> 660F0116[0501] o32 lgdt [mem]
>
> will, I think, load a 16-bit limit and a 32-bit base.
>

Yes.

> How does that look? Anywhere close?

It looks correct to me ... ;-)

I take it the brackets on the table location are yours. I.e., address
0x105, DOS .com file org 0x100.

> Nasm-specific question: Based on other tests it seems there should
> be a way to specify to Nasm that the operand at [mem] is six bytes
> instead of five but I can't find it. Ideal would be
>
> lgdt dword [mem]

o32

> as that is consistent with other instructions but the version of Nasm
> I have does not accept that form. I know dword in this case applies to
> the Base field and not to the whole six bytes. Is there a qualifying
> keyword to specify a six-byte piece of data - and a five-byte piece of
> data for going the other way? I doubt it.

For NASM:

Some instructions support multiple data sizes in the same processor mode.
That's when you need to select between 'byte' and 'word', or 'byte' and
'dword', etc. If you specify 'word' when only 'byte' and 'dword' are an
option, then NASM inserts an override prefix. Ditto for 'dword' when 'byte'
and 'word' are the mode-correct options.

Other instructions only support one form in a specific mode. They select
data size purely via the size of the code or data segment. That's when you
explicitly use o16 or o32 to select the instruction for the other mode.
NASM inserts an override prefix because of o16 or o32.

So, in 16-bit RM code, LIDT specifies 16-bit form (5 bytes). In 16-bit RM
code, o32 LIDT specifies 32-bit form (6 bytes). In 32-bit PM code, LIDT
specifies 32-bit form (6 bytes). In 32-bit PM code, o16 LIDT specifies
16-bit form (5 bytes). o32 or o16 is used to select form from the other
mode. They don't support multiple forms in the same mode, without the
override.
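
The rule in the paragraph above boils down to "the 66h prefix flips the default". A tiny C sketch of that rule, with function names invented for the example and the 5-byte/6-byte figures taken from the description above:

```c
/*
 * Effective operand size in bits.
 * default_bits is 16 or 32, as set by the current code segment.
 * has_66_prefix is nonzero when an operand-size prefix (o16/o32 in NASM)
 * precedes the instruction; the prefix selects the other size.
 */
int effective_operand_size(int default_bits, int has_66_prefix)
{
    if (has_66_prefix)
        return default_bits == 16 ? 32 : 16;
    return default_bits;
}

/* Bytes of the LGDT/LIDT memory operand that are actually used. */
int dtr_bytes_used(int default_bits, int has_66_prefix)
{
    /* 16-bit form: 2-byte limit + 3-byte base; 32-bit form: 2 + 4. */
    return effective_operand_size(default_bits, has_66_prefix) == 32 ? 6 : 5;
}
```

So `dtr_bytes_used(16, 1)` corresponds to `o32 lidt` in 16-bit code and `dtr_bytes_used(32, 1)` to `o16 lidt` in 32-bit code.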

> Keywords word and dword seem far more suitable as the 80386
> manual refers to the operand size as being 16 for 24-bit values.

You're a MASM user aren't you? I.e., I think that's the MASM solution to
the issue: 'byte', 'sbyte', 'tbyte', 'word', 'sword', 'dword', 'sdword',
'fword', 'tword', 'oword', 'qword', 'sqword', 'dqword', 'mmword', 'xmmword',
'ymmword', and all repeated with 'ptr' ... ;-)

IMO, NASM has the more clean solution by using what the instruction defines
and what the segment size is by default. I.e., far fewer 'keywords' to get
wrong. If you don't know already, try determining if you need 'dword ptr',
'fword ptr' or 'tword ptr' for LIDT or LGDT. Try to find it in the MASM
manual first. Then, try to find it with the assembler. You may need
NDISASM to find out what MASM actually compiles ...

> Notably, where many assemblers leave the programmer to hard-code the
> bytes for scenarios such as the above at least Nasm provides ways to
> code things with mnemonics. In this case the o32 or o16 keywords do
> the job. It's just that with other instructions Nasm allows data-size
> keywords which add the prefixes automatically.
...

> For example, with bits 16,
>
> FF36[0501] push word [mem]

'word' is not needed. It's the default code/data size for 'bits 16'.

> 66FF36[0501] push dword [mem]

'dword' is needed because you're pushing a value different from the default
code size. Since it's a memory operand, it only "knows" the default size:
16-bits. 'dword' is used to change the default size. If you did 'push
eax', it "knows" the size is 32-bits and inserts an override.

> 66FF36[0501] o32 push word [mem]

Apparently valid, but you're going around in circles ...

> The second example is equivalent to the third. It's just that push
> dword has added the o32 automatically. The same syntax does
> not seem to be available for the lgdt instruction.

This is all mixing of 16-bit and 32-bit code. IIRC, there is a similarly
titled section in one (or both) of the AMD and Intel manuals.

Some prefer a syntax or understanding where 'word' and 'dword' aren't needed
for LGDT. I do. I "see" 5 bytes as default for one mode and 6 bytes as
default for another. And, maybe the NASM designers felt that dword should
be 32-bit and word should be 16-bit, not 32-bit/24-bit ...

> > Third, 14.3 (below) clearly indicates that LIDT is valid in RM, and
> > relocating the IVT in RM is valid too. Under normal situations, why
> > would RM code need to relocate the IVT from its default location?
> > It wouldn't. It's unneeded. So, why provide the ability to relocate the
> > IVT while in RM? I.e., setup for PM.
>
> Do you really infer this? AIUI the IVT can be relocated and should
> work in real mode wherever it has been relocated to.

I understand that it can be relocated. In fact, I stated above that I do
use a relocated IVT in RM for a few programs. But, what "normal" RM code
needs to relocate the IVT? Why would you ever relocate it? The IVT was
established as being at location zero for many years prior to 386. No
"normal" RM code should need to relocate an IVT, shouldn't relocate it for
safety or compatibility with BIOS or DOS or CP/M, and shouldn't expect that
it's been moved. AFAICT, relocating the IVT is only needed for blocking or
prohibiting direct use of the IVT, i.e., forcing code to use a system call,
e.g., like DOS' calls to get/set IVT vectors, or monitoring if other code
changes the IVT. How often is that needed or used? I.e., it may have been
useful to develop v86 mode, but otherwise is only used, if ever, in software
that "protects" the PC, e.g., anti-virus. It seems unlikely they would
intentionally provide support for anti-virus software long before it was a
problem. It could also be used by viruses and malware to prevent normal
operation. Can that be intentional design? I think this goes back to
generic design.

> > 14.3
> > [quote]
>
> This just says that in RM it can be relocated and resized. I don't see
> this as suggesting that the reason for the ability to relocate/resize
> it is to set it up for PM.
>

Why is it needed in RM?

Well, I do know of one other reason, other than those I mentioned above.
Supposedly (unconfirmed), Intel designed their chips so that they could be
used in embedded environments that placed startup code at zero instead of at
the initial processor startup location. It's rumored that 0xFF 0xFF (i.e.,
bus pull-up read of location without memory or ROM) is a NOP on Intel's so
the processor execution can wrap to zero and begin executing code there
(zero). AMDs supposedly page fault on wrap-around. It's possible that if
it works that way, it's only a temporary instruction, until an actual opcode
is executed. If true, relocating the RM IVT from zero would be required for
such a setup.

> There still seems greater safety in loading the PM IDTR in PM
> because it is specified by Intel to work that way.

"specified" vs. "firmly suggested"

I agree that it is probably "safer" to only setup IDTR in the mode in which
it'll be used, but the idea that it's somehow safer is precautionary,
speculative, or possibly specious on my part. I think it's probably
perfectly safe either way. It's not like most of the x86 instructions have
strict interlocking rulesets which prohibit all but a few "valid" uses. For
the most part, they are generic. Some have very specific, mode specific
rules, e.g., ret.


Rod Pemberton



Rod Pemberton

Nov 11, 2011, 10:44:30 PM
"Rod Pemberton" <do_no...@noavailemail.cmm> wrote in message
news:j9kooo$8sc$1...@speranza.aioe.org...
> "James Harris" <james.h...@googlemail.com> wrote in message
> news:fcef42b6-10da-49d6...@p5g2000vbm.googlegroups.com...
...

> > Since Intel specify the sequence there's no assurance that LIDT in RM
> > will correctly set up the PM IDT descriptor on *all* x86-32 CPUs.
>
> AIR, I attempted to follow both old and new manuals, and there are a
> number of conflicting differences.
>

E.g., I don't have a list of what the issues were, but I made a note in one
of my more convoluted 16-bit RM to 32-bit PM switches that there are three
contradictory sections in the various Intel manuals. For "safety," this
"required" setting and clearing bits in CR0 such as CD and NW, wbinvd, then
clearing bits in CR4 such as PGE, PAE and v86, then clearing bits in CR3
such as the PAE mask, which also flushes the TLB, then back to CR0 again to
enable the PM bit and set up the co-processor, etc., and then the far jump.
It also supposedly needs MTRRs set for Pentium Pros, which I did not do.
Normally, most people just enable CR0.PE and far jump. They assume that all
those other bits and registers are basically in the CPU's default state,
even if the BIOS may have used them a bit.

> It would seem obvious that they found some type of incompatibility with
> the switch to 32-bit PM. The "more precise" method is only recent
> manuals, yes? However, they don't mention what cpu had a problem
> or what their rationale is. E.g., it might not be one of their cpu's had
> an issue. It could be Cyrix or Transmeta or NEC or AMD or ...
>
> Since they don't mention the reason, it's also possible they intend to
> merge instructions or perhaps eliminate some instructions in the future.
> E.g., maybe there is some problem with design changes for future cpu's.
> In which case, the startup sequence may be more similar to what they
> suggest and easier to fix. Or, it could just be a slight timing issue.
> Who knows?

E.g., it could be something like SMM interrupting the mode switch.
Everything is speculation.


Rod Pemberton



wolfgang kern

Nov 12, 2011, 3:57:21 AM

James Harris wrote:

> [LIDT]
> ....
>|Yes, LIDT works in either mode but isn't the query whether the IDTR
>|value loaded in one mode is valid in the other? In fact, to my mind
>|the question is more demanding: will the IDTR loaded by LIDT in one
>|mode correctly work in the other mode in all cases on all x86-32 CPUs?

>|It might.

> Except that a RM-IDT holds 16:16 (CS:IP) vectors and a PM-IDT holds
> 64-bit entries (int-gate/int-trap descriptors).

|True but the IDTR (the subject of the LIDT instruction) holds
|basically a limit and a base doesn't it? It might be ignorant of the
|format of the memory between Base and Base + Limit.

Yes.
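
To make the "IDTR is just a base and a limit" point concrete, here is a
rough sketch in C. It is only a model, not taken from any manual: the
struct and function names are made up for illustration. The register
contents are the same in both modes; only the entry size used to index
the table differs (4 bytes for an RM 16:16 vector, 8 bytes for a PM gate).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the IDTR: a 16-bit limit and a 32-bit base. */
struct idtr {
    uint16_t limit;  /* offset of the last valid byte in the table */
    uint32_t base;   /* linear address of the table */
};

/* Bytes per entry: 4 for a real-mode 16:16 vector, 8 for a PM gate. */
enum { RM_ENTRY_SIZE = 4, PM_ENTRY_SIZE = 8 };

/* Linear address of entry `vector` for the given entry size. */
static uint32_t idt_entry_address(struct idtr r, uint8_t vector,
                                  uint32_t entry_size)
{
    return r.base + (uint32_t)vector * entry_size;
}

/* True if the whole entry lies within the limit held in the register. */
static bool idt_entry_in_limit(struct idtr r, uint8_t vector,
                               uint32_t entry_size)
{
    uint32_t last_byte = (uint32_t)vector * entry_size + (entry_size - 1);
    return last_byte <= r.limit;
}
```

With the usual RM setup (base 0, limit 3FFh) every one of the 256 vectors
passes the check at 4 bytes per entry, but the same register contents
only cover vectors 0 to 7Fh at 8 bytes per entry. With a limit of zero,
no vector passes in either mode.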

> ....
>|Any chimes of agreement in the above?

> We may not agree in the interpretation of the (old dubious) manuals here
> :)
> What I figured is that my machines don't need an o32 prefix on LIDT/LGDT
> regardless of RM16/PM32 [looks like any operand-size prefix is ignored].

> Ok, I'll check just for the record if I can set the IDT to above 16MB
> when I'm using Big-Real- aka Unreal-mode. Give me a few days...

|Cool! It would be good to hear. As someone mentioned we might be able
|to find out by loading something above 16Mby and then using SIDT to
|see the value of the MSByte.

I come back to this ...

|Incidentally, I wonder if we get away with a 5-byte LGDT because we
|don't place an initial GDT high enough to make a difference. The Intel
|instruction descriptions seem to bear this out as seen at

http://pdos.csail.mit.edu/6.828/2007/readings/i386/LGDT.htm

Mmh..., where could a five-byte GDT make sense? A relic from the 16MB 286 AT?
GDT doesn't play any role while in RM and this 2 KB can reside anywhere
in the 4GB range, and LGDT is executed only once in my whole code.

|where in contrast with the load instructions it says, "The SGDT and
|SIDT instructions always store into all 48 bits."

Yeah, I wouldn't see much sense if they stored less :)

__
wolfgang


wolfgang kern

Nov 14, 2011, 3:18:45 AM

I promised to check:

... and had to change my opinion about LIDT and LGDT in Real-Mode :)

checked while in Big-REAL-Mode on PhenomII-4x940:

Both load only 16+24 bits without a 66h prefix and zero out the high byte,
while SIDT/SGDT always store all 48 bits regardless of size-override.

But even though I can move the RM-IDT to above 16MB (checked at 0xc8ff_f800)
when using Big-real-mode, these 16:16 vectors can't point above 1 MB :)
So I'll keep my RM-vectors and routines within the low+HMA space.
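
The reach of a 16:16 vector can be worked out directly. A tiny sketch in
C (the function name is made up, and it assumes A20 is enabled so that
addresses past 1MB don't wrap):

```c
#include <stdint.h>

/* Linear address reached by a real-mode 16:16 far pointer,
   assuming the A20 line is enabled (no wrap at 1MB). */
static uint32_t rm_linear(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}
```

The largest value, FFFF:FFFF, comes out as 10FFEFh, i.e., the top of the
HMA. So wherever the table itself lives, the handlers it points to have
to sit below that.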

Haven't seen this 24-bit limit explicitly mentioned in newer CPU manuals,
perhaps because a 24-bit base doesn't make much sense for real mode?
__
wolfgang


Antoine Leca

Nov 14, 2011, 7:25:44 AM
Rod Pemberton wrote:
>> 0F0116[0501] lgdt [mem]
[ without the o32 prefix]

>> Therefore it will work only when the GDT base is in the
>> range 0 up to 2**24-1.
>
> Yes.

+1

>> If the base of the GDT is at 2**24 or above the
>> lgdt command will be unsuccessful.

It will load the GDTR base with the 24-bit masked value, i.e., some wrong
value. Yet the CPU will perform the "wrong" operation blindly; then,
some time later, when a selector is examined (perhaps through a far
jump), the (probably bad) consequences will occur.

> Not sure. Does it just wrap? I.e., upper address bits are effectively
> truncated since they aren't loaded or are cleared? That's my guess.

OTOH, if the GDTR base is set near 00FF_FFFF, there will be no
wrap: the CPU will add the selector offset (in the 0-FFF8 range), after
having checked it against the limit of course, to the base; and if this
results in something above the 16MB mark 0100_0000, well, nothing special
will occur, it will work OK!
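
The address arithmetic described here can be sketched in C as follows.
This is only a model with made-up function names; the 24-bit variant
stands in for a CPU with 24 address lines and is not meant as an exact
description of any particular chip.

```c
#include <stdint.h>

/* Byte offset of a descriptor within its table: the selector with the
   TI and RPL bits (the low three bits) cleared. */
static uint32_t selector_offset(uint16_t selector)
{
    return selector & 0xFFF8u;
}

/* 32-bit linear address: base plus offset, no wrap at 16MB. */
static uint32_t descriptor_address_386(uint32_t gdt_base, uint16_t selector)
{
    return gdt_base + selector_offset(selector);
}

/* Same sum truncated to 24 address bits, so it wraps at 16MB. */
static uint32_t descriptor_address_286(uint32_t gdt_base, uint16_t selector)
{
    return (gdt_base + selector_offset(selector)) & 0x00FFFFFFu;
}
```

For a base of 00FF_FFF0 and selector 18h, the 32-bit sum is 0100_0008,
while the 24-bit version lands at 0000_0008.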

By the way, this is just a special case of the general difference
between an iAPX286 and an 80386, where the former wraps around the 16MB
mark while the latter does not (13.3.1 of 80386 reference manual.)
Putting the GDT (or the IDT, or a LDT, or a TSS, or any table) near the
top of memory to achieve wrap effects does not seem a wonderful idea
whatever the processor anyway; the special case of A20 gate is still
with us only because some more-than-smart engineers of important
software vendor(s) used such a trick to pack executables...


Antoine

Rod Pemberton

Nov 15, 2011, 8:01:50 AM
"Antoine Leca" <ro...@localhost.invalid> wrote in message
news:4ec108c9$0$3190$426a...@news.free.fr...
> Rod Pemberton wrote:
> > > 0F0116[0501] lgdt [mem]
> [ without the o32 prefix]
>
> >> Therefore it will work only when the GDT base is in the
> >> range 0 up to 2**24-1.
> >
> > Yes.
>
> +1
>

Wouldn't that be 25 bits ... ? Or, were you taking issue with the use of
'up to'? I.e., 'up to' can mean including 2**24-1 or not including 2**24-1
... I took it to mean inclusive, not exclusive.

A binary range is always 0 through 2**n-1, yes? I.e., 2**24-1=00FFFFFFh and
2**24=01000000h. Each hex digit is four bits ... BTW, I prefer ^ to ** or
'exp'.

> >> If the base of the GDT is at 2**24 or above the
> >> lgdt command will be unsuccessful.
>
> [snip]
>
> > Not sure. Does it just wrap? I.e., upper address bits are effectively
> > truncated since they aren't loaded or are cleared? That's my guess.
>
> OTOH, if the GDTR base is set near 00FF_FFFF, there will be no
> wrap: [...]

The snipped response above about invalid address applied, but the offset
response here didn't seem to ...

James was asking about what happens when the base is larger than 24-bits,
but the lgdt loads only 24-bits. Those 8 extra bits are likely cleared or
truncated, i.e., the base address (possibly) wraps around to zero at 16MB.
Although, it's possible the base address picks up 8-bits of garbage from the
internal cpu base register, or it could be all bits set: 0FFh ... I haven't
read LGDT in a while.

> By the way, this is just a special case of the general difference
> between an iAPX286 and an 80386, where the former wraps around the 16MB
> mark while the latter does not (13.3.1 of 80386 reference manual.)
> Putting the GDT (or the IDT, or a LDT, or a TSS, or any table) near the
> top of memory to achieve wrap effects does not seem a wonderful idea
> whatever the processor anyway; the special case of A20 gate is still
> with us only because some more-than-smart engineers of important
> software vendor(s) used such a trick to pack executables...

AIR, the 68000 series microprocessor went through similar issues. They
didn't end up with a "permanent" A20 gate:
http://en.wikipedia.org/wiki/32-bit_clean#32-bit_clean


Rod Pemberton


Antoine Leca

Nov 15, 2011, 9:09:54 AM
Rod Pemberton wrote:
> "Antoine Leca" <ro...@localhost.invalid> wrote in message
> news:4ec108c9$0$3190$426a...@news.free.fr...
>> Rod Pemberton wrote:
>>>> 0F0116[0501] lgdt [mem]
>> [ without the o32 prefix]
>>
>>>> Therefore it will work only when the GDT base is in the
>>>> range 0 up to 2**24-1.
>>>
>>> Yes.
>>
>> +1
>
> Wouldn't that be 25 bits ... ?

No, I meant I was in agreement with you. :-)

Sorry for being so terse that I become obscure.


>>> Not sure. Does it just wrap? I.e., upper address bits are effectively
>>> truncated since they aren't loaded or are cleared? That's my guess.
>>
>> OTOH, if the GDTR base is set near 00FF_FFFF, there will be no
>> wrap: [...]
>
> The snipped response above about invalid address applied, but the offset
> response here didn't seem to ...
>
> James was asking about what happens when the base is larger than 24-bits,
> but the lgdt loads only 24-bits. Those 8 extra bits are likely cleared or
> truncated, i.e., the base address (possibly) wraps around to zero at 16MB.

Okay, I did not read James' post this way, but in this particular case I
agree with you, the upper 8 bits are cleared when one uses the o16
variant of lgdt or lidt; among other reasons, this is done this way to
be compatible with the 286, on which a former sgdt/sidt would have
written 0FFh in the top byte, while lgdt/lidt cleared or discarded that
(garbage) top byte after having loaded it.
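
As a model of that behaviour, here is a short C sketch of a table
register load from the 6-byte memory operand. The struct and function
names are hypothetical, and the masking follows what was reported in
this thread rather than any vendor's internal implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of GDTR/IDTR contents. */
struct table_reg {
    uint16_t limit;
    uint32_t base;
};

/* Load from a 6-byte little-endian operand: 2 bytes of limit followed
   by 4 bytes of base. With a 16-bit operand size (operand32 == false)
   only the low 24 bits of the base are kept; the top byte is zeroed. */
static struct table_reg load_table_reg(const uint8_t mem[6], bool operand32)
{
    struct table_reg r;
    r.limit = (uint16_t)(mem[0] | (mem[1] << 8));
    r.base = (uint32_t)mem[2]
           | ((uint32_t)mem[3] << 8)
           | ((uint32_t)mem[4] << 16)
           | ((uint32_t)mem[5] << 24);
    if (!operand32)
        r.base &= 0x00FFFFFFu;
    return r;
}
```

Feeding it a base of C8FF_F800 gives C8FF_F800 back for the 32-bit form
and 00FF_F800 for the 16-bit form, which is the silent "wrong value"
case discussed above.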


> AIR, the 68000 series microprocessor went through similar issues. They
> didn't end up with a "permanent" A20 gate:
> http://en.wikipedia.org/wiki/32-bit_clean#32-bit_clean

Interesting information, much appreciated.


Antoine

Benjamin David Lunt

Nov 18, 2011, 5:56:47 PM
<Off Topic>

Rod,

You previously mentioned some free newsgroup
servers in a posting a while back. I can't
seem to find that posting.

Would you please point me to that posting
or post it here again?

Thanks,
Ben

</off topic>


Rod Pemberton

Nov 19, 2011, 11:29:18 AM
"Benjamin David Lunt" <zf...@fysnet.net> wrote in message
news:ToBxq.42458$Gr6....@newsfe09.iad...
I've posted a bunch of posts like that over the years. An A.O.D. post?

Recent A.O.D. post with a few free Usenet servers and other software:
http://groups.google.com/group/alt.os.development/msg/a795cc87353d5757


Recent C.L.F. post with a list of free Usenet servers:
http://groups.google.com/group/comp.lang.forth/msg/4d43b392163f4d3c

Recent C.L.M. posts with free webhosting and other services:
http://groups.google.com/group/comp.lang.misc/msg/c6c9495ae438b65a
http://groups.google.com/group/comp.lang.misc/msg/c4686f6b1341db53

In the C.L.M. thread, people have posted other sites and services. Google
didn't archive James Harris' initial posts for the C.L.M. thread. It has
some links he found. You can read it here or here:
http://www.rhinocerus.net/forum/lang-misc/659758-choosing-hosting-service-new-site-wiki.html
http://www.44342.com/misc-f400-t1910-p1.htm


Rod Pemberton


James Harris

Dec 10, 2011, 6:27:50 AM
On Nov 12, 3:22 am, "Rod Pemberton" <do_not_h...@noavailemail.cmm>
wrote:
> "James Harris" <james.harri...@googlemail.com> wrote in message

...

> > Since Intel specify the sequence there's no assurance that LIDT in RM
> > will correctly set up the PM IDT descriptor on *all* x86-32 CPUs.
>
> AIR, I attempted to follow both old and new manuals, and there are a number
> of conflicting differences.

Like yourself I am going from memory but I don't recall seeing them as
conflicting. Maybe that's because I took the 386 manuals as showing
nothing more specific than the general steps.

For example, I think it says that a jump (near or far) should follow
the move to CR0. Presumably this applies to 386 and 486 and possibly
the Pentium 1 - i.e. CPUs where a jump flushes the prefetch queue. And
later manuals say to do a far jump at this point.

Well, a far jump will work on any CPU from the 386 onward so I don't
see it as a conflict. The far jump is needed in order to set CS and
has to be done anyway, even on a 386.

> It would seem obvious that they found some type of incompatibility with the
> switch to 32-bit PM.  The "more precise" method is only recent manuals, yes?

I think so.

> However, they don't mention what cpu had a problem or what their rationale
> is.  E.g., it might not be one of their cpu's had an issue.  It could
> be Cyrix or Transmeta or NEC or AMD or ...

Call me a cynic but I would have thought Intel would be more likely to
specify a sequence that some of their competitors had problems with!

> Since they don't mention the reason, it's also possible they intend to merge
> instructions or perhaps eliminate some instructions in the future.  E.g.,
> maybe there is some problem with design changes for future cpu's.  In which
> case, the startup sequence may be more similar to what they suggest and
> easier to fix.  Or, it could just be a slight timing issue.  Who knows?

Maybe. They could have just decided it would help their future
hardware designs if they had a standard transition sequence to
support.

The only significant change I can think of, though, was the migration
from Pentium 1 to Pentium Pro, i.e. P6 family. My guess is that at
that point the move to CR0 became a synchronising instruction and
removed the need to flush the prefetch queue. Didn't they begin out-of-
order execution there too? If so that might have suggested to them
that a standard sequence would be desirable.

> > Yes, LIDT works in either mode but isn't the query whether the
> > IDTR value loaded in one mode is valid in the other?
>
> If the IDTR is loaded in one mode for use in another and the IDT is not used
> until the correct mode for it is activated, what is the issue?

If the IDTR is implemented in hardware as literally a 16-bit segment
and a 32-bit offset and nothing else, and loading it sets no other
state then for sure I would agree. I just don't know that it is
implemented that way on all CPUs.

Loading the IDTR may also cache some info which is not valid in the
other mode such as protection bits. Of course, even that would not be
an issue if changing mode flushed such cached info but we don't know
that that would be done. Sure, I am being extra cautious and I think
you are probably right that loading the Pmode IDTR in real mode is
probably safe on all CPUs. But as we can't be sure I would rather
avoid it.

>  Unreal mode
> works by setting parts of registers not available in RM.
>
> > In fact, to my mind the question is more demanding: will the IDTR
> > loaded by LIDT in one mode correctly work in the other mode in
> > all cases on all x86-32 CPUs?
>
> > It might.
>
> LIDT only sets the base and size, yes?  The cpu mode selects how data
> at that location is interpreted, yes?  That's how I understand it.  Maybe
> that's wrong.

That's the programmer's model, yes. And all CPUs *probably* get the
hardware implementation so that this works between modes. It's the
fact that Intel's later documented sequence doesn't say it will work
on all CPUs that suggests to me not to rely on it. After all, running
LIDT in Pmode is no hardship. But, that said, I don't know of a CPU on
which it won't work.

...

> > > Third, 14.3 (below) clearly indicates that LIDT is valid in RM, and
> > > relocating the IVT in RM is valid too. Under normal situations, why
> > > would RM code need to relocate the IVT from its default location?
> > > It wouldn't. It's unneeded. So, why provide the ability to relocate the
> > > IVT while in RM?  I.e., setup for PM.
>
> > Do you really infer this? AIUI the IVT can be relocated and should
> > work in real mode wherever it has been relocated to.
>
> I understand that it can be relocated.  In fact, I stated above that I do
> use a relocated IVT in RM for a few programs.  But, what "normal" RM code
> needs to relocate the IVT?  Why would you ever relocate it?  The IVT was
> established as being at location zero for many years prior to 386.  No
> "normal" RM code should need to relocate an IVT, shouldn't relocate it for
> safety or compatibility with BIOS or DOS or CP/M, and shouldn't expect that
> it's been moved.  AFAICT, relocating the IVT is only needed for blocking or
> prohibiting direct use of the IVT, i.e., forcing code to use a system call,
> e.g., like DOS' calls to get/set IVT vectors, or monitoring if other code
> changes the IVT.  How often is that needed or used?  I.e., it may have been
> useful to develop v86 mode, but otherwise is only used, if ever, in software
> that "protects" the PC, e.g., anti-virus.  It seems unlikely they would
> intentionally provide support for anti virus software long before it was a
> problem. It could also be used by viruses and malware to prevent normal
> operation.  Can that be intentional design?  I think this goes back to
> generic design.

I was querying whether your belief was that LIDT was provided in real
mode *solely* so that the CPU could load the IDTR for protected mode.

>
> > > 14.3
> > > [quote]
>
> > This just says that in RM it can be relocated and resized. I don't see
> > this as suggesting that the reason for the ability to relocate/resize
> > it is to set it up for PM.
>
> Why is it needed in RM?

I doubt the designers sat down and decided they needed the option to
relocate the real-mode IDT. Rather, I think the point is that the CPU
hardware that runs real mode is, in large part, the same as that which
runs protected mode. In protected mode a register which points to the
IDT was needed. With appropriate contents that same register can be
used in real mode.

> Well, I do know of one other reason, other than those I mentioned above.
> Supposedly (unconfirmed), Intel designed their chips so that they could be
> used in embedded environments that placed startup code at zero instead of at
> the initial processor startup location.  It's rumored that 0xFF 0xFF (i.e.,
> bus pull-up read of location without memory or ROM) is a NOP on Intel's so
> the processor execution can wrap to zero and begin executing code there
> (zero).  AMDs supposedly page fault on wrap-around.  It's possible that if
> it works that way, it's only a temporary instruction, until an actual opcode
> is executed.  If true, relocating the RM IVT from zero would be required for
> such a setup.
>
> > There still seems greater safety in loading the PM IDTR in PM
> > because it is specified by Intel to work that way.
>
> "specified" vs. "firmly suggested"

OK. How about "recommended"? :-)

> I agree that it is probably "safer" to only set up IDTR in the mode in which
> it'll be used, but the idea that it's somehow safer is precautionary,
> speculative, or possibly specious on my part.

Kind of you to say on your part! For me, yes it seems a better option.
Only slightly better, it's true.

>  I think it's probably
> perfectly safe either way.  It's not like most of the x86 instructions have
> strict interlocking rulesets which prohibit all but a few "valid" uses.  For
> the most part, they are generic.  Some have very specific, mode specific
> rules, e.g., ret.

Sure.

James

James Harris

Dec 10, 2011, 7:02:57 AM
On Nov 12, 3:22 am, "Rod Pemberton" <do_not_h...@noavailemail.cmm>
wrote:
> "James Harris" <james.harri...@googlemail.com> wrote in message

...

> > I hadn't realised until checking this out but LGDT can
> > apparently have an opsiz prefix. In fact it seems it *should*!
>
> I have 66h (i.e., o32 or o16 for NASM) on some and not on others. I don't
> think any of them are critical, i.e., none are above 1MB or 16MB ...
> They're all probably fine without it. IIRC, I only did so to make sure that
> the upper bits were loaded or cleared.

This has become to me a more interesting point than when to run LIDT.
As I realised when working up a reply to Frank on clax86, if I wanted
to enable Pmode and paging at the same time it would be *very*
possible that the (permanent or initial) GDT would be above 16Mby. And
I've seen other people reserve upper address space for the kernel. In
either case the 32-bit form of LGDT would be required (again, if
enabling paging at the same time). The default 24-bit form would fail.

It's a pity Frank didn't reply further. I would have liked to have
seen the issue wrt Nasm at least discussed.

> > my take. Can you tell me if I have it right or wrong? Nasm syntax but
> > I'll include the object code so that it is relevant in other
> > assemblers.
>
> > 0F0116[0501] lgdt [mem]
>
> > This, in real mode will, I think, load a 16-bit limit and only a 24-
> > bit base.
>
> Yes.
>
> > It will load only five bytes setting the top byte of the
> > base to zero.
>
> (Loads six. Uses five. ?)

Possibly a moot point. However many it loads from memory it only loads
five into the register, the top byte probably being zeroed.

> > Therefore it will work only when the GDT base is in the
> > range 0 up to 2**24-1.
>
> Yes.
>
> > If the base of the GDT is at 2**24 or above the
> > lgdt command will be unsuccessful.
>
> Not sure. Does it just wrap? I.e., upper address bits are effectively
> truncated since they aren't loaded or are cleared? That's my guess.

I think I see your point. Per a 386 manual, the high order byte is
"not used." Perhaps this refers to what comes from memory. That byte
is probably set to zero in the register. I suppose a separate bit
could be set to say that when placing the GDTR value onto the bus only
output 24 bits of the offset and zero the top 8 bits. Sounds like hard
work for the hardware designer but I suppose it could be done that
way, perhaps on an early design taken from the 286.

> > On the other hand
>
> > 660F0116[0501] o32 lgdt [mem]
>
> > will, I think, load a 16-bit limit and a 32-bit base.
>
> Yes.
>
> > How does that look? Anywhere close?
>
> It looks correct to me ... ;-)
>
> I take it the brackets on the table location are yours. I.e., address
> 0x105, DOS .com file org 0x100.

It's the Nasm listing file output so the brackets just indicate a
memory reference. Yes, 0501 just happens to be the offset of mem.

> > Nasm-specific question: Based on other tests it seems there should
> > be a way to specify to Nasm that the operand at [mem] is six bytes
> > instead of five but I can't find it. Ideal would be
>
> > lgdt dword [mem]
>
> o32

o32 works for the example instruction I was using, too. In real mode

o32 push word [mem]
push dword [mem]

generate exactly the same code. Which form do you prefer?

> > as that is consistent with other instructions but the version of Nasm
> > I have does not accept that form. I know dword in this case applies to
> > the Base field and not to the whole six bytes. Is there a qualifying
> > keyword to specify a six-byte piece of data - and a five-byte piece of
> > data for going the other way? I doubt it.
>
> For NASM:
>
> Some instructions support multiple data sizes in the same processor mode.
> That's when you need to select between 'byte' and 'word', or 'byte' and
> 'dword', etc. If you specify 'word' when only 'byte' and 'dword' are an
> option, then NASM inserts an override prefix. Ditto for 'dword' when 'byte'
> and 'word' are the mode correct options.
>
> Other instructions only support one form in a specific mode. They select
> data size purely via the size of the code or data segment. That's when you
> explicitly use o16 or o32 to select the instruction for the other mode.
> NASM inserts an override prefix because of o16 or o32.
>
> So, in 16-bit RM code, LIDT specifies 16-bit form (5 bytes). In 16-bit RM
> code, o32 LIDT specifies 32-bit form (6 bytes). In 32-bit PM code, LIDT
> specifies 32-bit form (6 bytes). In 32-bit PM code, o16 LIDT specifies
> 16-bit form (5 bytes). o32 or o16 is used to select form from the other
> mode. They don't support multiple forms in the same mode, without the
> override.
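
The four cases listed above reduce to one rule: the 66h prefix flips the
segment's default operand size. A sketch in C, with made-up function
names, plus an encoder for the 16-bit-addressing form shown in the
listing earlier (0F 01 16 disp16); the displacement passed in is
arbitrary.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Effective operand size: the 66h prefix toggles the segment default. */
static int effective_operand_bits(int default_bits, bool has_66_prefix)
{
    if (has_66_prefix)
        return default_bits == 16 ? 32 : 16;
    return default_bits;
}

/* Bytes of the memory operand that end up in the register:
   6 for the 32-bit form, 5 for the 16-bit form. */
static int lgdt_bytes_used(int default_bits, bool has_66_prefix)
{
    return effective_operand_bits(default_bits, has_66_prefix) == 32 ? 6 : 5;
}

/* Encode "lgdt [disp16]" with 16-bit addressing: [66] 0F 01 16 lo hi.
   Returns the number of bytes written to out (5 or 6). */
static size_t encode_lgdt_disp16(uint8_t out[6], uint16_t disp,
                                 bool has_66_prefix)
{
    size_t n = 0;
    if (has_66_prefix)
        out[n++] = 0x66;
    out[n++] = 0x0F;
    out[n++] = 0x01;
    out[n++] = 0x16;                    /* mod=00, reg=/2, r/m=110 (disp16) */
    out[n++] = (uint8_t)(disp & 0xFF);  /* displacement, low byte first */
    out[n++] = (uint8_t)(disp >> 8);
    return n;
}
```

Note that the instruction length and the operand length are unrelated:
the prefixed instruction is one byte longer in a 16-bit segment, and it
is the prefix, not the assembler keyword, that selects the 6-byte load.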

I was saying it would be good if Nasm supported LGDT (or LIDT) in the
same form as for other instructions such as push that was used as an
example, above. So,

lgdt dword [mem]

would produce the full load for which we currently have to use

o32 lgdt [mem]

The former may be more consistent but it is good that Nasm at least
allows the latter.

> > Keywords word and dword seem far more suitable as the 80386
> > manual refers to the operand size as being 16 for 24-bit values.
>
> You're a MASM user aren't you?

No, usually Nasm.

> I.e., I think that's the MASM solution to
> the issue: 'byte', 'sbyte', 'tbyte', 'word', 'sword', 'dword', 'sdword',
> 'fword', 'tword', 'oword', 'qword', 'sqword', 'dqword', 'mmword', 'xmmword',
> 'ymmword', and all repeated with 'ptr' ... ;-)
>
> IMO, NASM has the more clean solution by using what the instruction defines
> and what the segment size is by default. I.e., far fewer 'keywords' to get
> wrong. If you don't know already, try determining if you need 'dword ptr',
> 'fword ptr' or 'tword ptr' for LIDT or LGDT. Try to find it in the MASM
> manual first. Then, try to find it with the assembler. You may need
> NDISASM to find out what MASM actually compiles ...

No, my comments are all about Nasm. It has keywords such as word and
dword.

> > Notably, where many assemblers leave the programmer to hard-code the
> > bytes for scenarios such as the above at least Nasm provides ways to
> > code things with mnemonics. In this case the o32 or o16 keywords do
> > the job. It's just that with other instructions Nasm allows data-size
> > keywords which add the prefixes automatically.
>
> ...
>
> > For example, with bits 16,
>
> > FF36[0501] push word [mem]
>
> 'word' is not needed. It's the default code/data size for 'bits 16'.
>
> > 66FF36[0501] push dword [mem]
>
> 'dword' is needed because you're pushing a value different from the default
> code size. Since it's a memory operand, it only "knows" the default size:
> 16-bits. 'dword' is used to change the default size. If you did 'push
> eax', it "knows" the size is 32-bits and inserts an override.

I know. :-)

> > 66FF36[0501] o32 push word [mem]
>
> Apparently valid, but you're going around in circles ...

Not at all. I'm trying to generate a comparison between push (as an
example instruction that allows data size qualifiers) and lgdt (which
doesn't).

> > The second example is equivalent to the third. It's just that push
> > dword has added the o32 automatically. The same syntax does
> > not seem to be available for the lgdt instruction.
>
> This is all mixing of 16-bit and 32-bit code. IIRC, there is a similarly
> titled section in one (or both) of the AMD and Intel manuals.
>
> Some prefer a syntax or understanding where 'word' and 'dword' aren't needed
> for LGDT. I do. I "see" 5 bytes as default for one mode and 6 bytes as
> default for another. And, maybe the NASM designers felt that dword should
> be 32-bit and word should be 16-bits, not 32-bit/24-bit ...

Yes, I accept that the end result of

lgdt [x]

in real mode is a five byte load. It seems a step too far to ask that
Nasm make another keyword for five bytes and another for six bytes
such as

lgdt fivw [x]
lgdt sixw [x]

when there are already keywords for 2-byte and 4-byte operands. After
all, the 386 manual does say, "if a 16-bit operand is used," and, "if
a 32-bit operand is used." Note the use of *operand* here. I was
hoping Nasm would allow

lgdt word [x]
lgdt dword [x]

where the "word" or "dword" qualifiers relate to the 'operand size'
mentioned in the manual.

The icing on the cake would be if new versions of Nasm were to issue a
warning when the LGDT & LIDT operand size was not specified. That
would be a useful indicator to the programmer to point out that the
default form in 16-bit mode does NOT load a 32-bit offset. Until this
discussion I had been unaware that there was a 5-byte form of LGDT and
that that form was the default. I can't be the only programmer not to
have known that.

James

James Harris

Dec 10, 2011, 7:24:16 AM
On Nov 12, 3:44 am, "Rod Pemberton" <do_not_h...@noavailemail.cmm>
wrote:
> "Rod Pemberton" <do_not_h...@noavailemail.cmm> wrote in message

...

> > AIR, I attempted to follow both old and new manuals, and there are a
> > number of conflicting differences.
>
> E.g., I don't have a list of what the issues were, but I made a note in one
> of my more convoluted 16-bit RM to 32-bit PM switches that there are three
> contradictory sections in the various Intel manuals. For "safety," this
> "required" setting and clearing bits in CR0 such as CD and NW, wbinvd, then
> clearing bits in CR4 such as PGE, PAE and v86, then clearing bits in CR3
> such as the PAE mask, which also flushes the TLB, then back to CR0 again to
> enable the PM bit and set up the co-processor, etc., and then the far jump.
> It also supposedly needs MTRRs set for Pentium Pros, which I did not do.
> Normally, most people just enable CR0.PE and far jump. They assume that all
> those other bits and registers are basically in the CPU's default state,
> even if the BIOS may have used them a bit.

An interesting list and I think you might be right that the BIOS could
have tampered with the bits. I wonder if any other OS goes to the
trouble you did. Certainly, the extra paging bits would be irrelevant
until paging was enabled and then, if not at defaults, they would
certainly jump up and bite the Pmode code.

Under the topic of conflicts between Intel manuals these bits don't
apply do they? I guess that none of the manuals specify to set or
clear bits other than those for Pmode and paging in CR0. So are the
manuals contradictory?

Whether the manuals contradict or not, I think you have raised a great
point that even though the CPU manufacturer's docs say to do things a
certain way a cautious OS writer will make allowances for the state
that could have been left by the BIOS.

Of course, some of the bits you mention do not apply to earlier CPUs.
Without going to the trouble of detecting the CPU type and responding
accordingly there's possibly an appropriate set of bits to zero and
another set of bits to leave alone.
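
One way to picture the "set of bits to zero, set of bits to leave alone"
idea is as a pair of masks. The sketch below is in C and only computes
values; the bit positions are the architectural ones, but which bits to
clear is a policy choice assumed here for illustration, not something
taken from Rod's code or the manuals. Mapping "v86" to CR4.VME is also a
guess. CR4 does not exist on older CPUs, so a real loader would still
need to detect it before touching that register.

```c
#include <stdint.h>

/* Architectural bit positions in CR0. */
#define CR0_PE (1u << 0)   /* protection enable */
#define CR0_NW (1u << 29)  /* not write-through */
#define CR0_CD (1u << 30)  /* cache disable */
#define CR0_PG (1u << 31)  /* paging */

/* Architectural bit positions in CR4 (Pentium and later). */
#define CR4_VME (1u << 0)  /* virtual-8086 mode extensions */
#define CR4_PAE (1u << 5)  /* physical address extension */
#define CR4_PGE (1u << 7)  /* page global enable */

/* Assumed policy: caches on, paging off, protection on;
   every other CR0 bit is left as found. */
static uint32_t cr0_for_pm_entry(uint32_t cr0)
{
    cr0 &= ~(CR0_PG | CR0_CD | CR0_NW);
    return cr0 | CR0_PE;
}

/* Assumed policy: clear the paging-related extensions and VME;
   every other CR4 bit is left as found. */
static uint32_t cr4_for_pm_entry(uint32_t cr4)
{
    return cr4 & ~(CR4_VME | CR4_PAE | CR4_PGE);
}
```

Starting from the power-on CR0 value 60000010h, the first function gives
00000011h. The actual MOV to CR0/CR4, the wbinvd and the far jump are
deliberately left out, since those can't be expressed in portable C.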

Thanks for bringing this up.

James

James Harris

Dec 10, 2011, 7:29:17 AM
On Nov 14, 8:18 am, "wolfgang kern" <nowh...@never.at> wrote:
> I promised to check:
>
> ... and had to change my opinion about LIDT and LGDT in Real-Mode :)
>
> checked while in Big-REAL-Mode on PhenomII-4x940:
>
> Both load only 16+24 bits without a 66h prefix and zero out the high byte,
> while SIDT/SGDT always store all 48 bits regardless of size-override.

Thanks for checking. It was a surprise to me too. IMO it would be good
if whatever assembler was being used generated a warning for where
there would be a 5-byte load. Otherwise this is a trap to catch us
out.

James

Rod Pemberton

Dec 10, 2011, 8:22:23 PM
"James Harris" <james.h...@googlemail.com> wrote in message
news:e38ebf2b-dd12-405b...@q11g2000vbq.googlegroups.com...
> On Nov 12, 3:22 am, "Rod Pemberton" <do_not_h...@noavailemail.cmm>
> wrote:
> > "James Harris" <james.harri...@googlemail.com> wrote in message
...

> > > I hadn't realised until checking this out but LGDT can
> > > apparently have an opsiz prefix. In fact it seems it *should*!
>
> > I have 66h (i.e., o32 or o16 for NASM) on some and not on others.
> > I don't think any of them are critical, i.e., none are above 1MB or
> > 16MB ... They're all probably fine without it. IIRC, I only did so
> > to make sure that the upper bits were loaded or cleared.
>
> This has become to me a more interesting point than when to run LIDT.

Er, Ok ...

> As I realised when working up a reply to Frank on clax86, if I wanted
> to enable Pmode and paging at the same time it would be *very*
> possible that the (permanent or initial) GDT would be above 16Mby.
> And I've seen other people reserve upper address space for the kernel.
> In either case the 32-bit form of LGDT would be required (again, if
> enabling paging at the same time). The default 24-bit form would fail.

Well, good point ...

I was thinking in terms of *my code* setting and *my code* using the GDT.
In that case, it's known where the GDT will be prior to programming. If *my
code* sets the GDT and *someone else's* code expects the GDT elsewhere -
like outside that range, it'd be important to not make the assumption that
24-bits is enough ... E.g., it's probably smart to clear all bits for a
bootloader that switches to PM.
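
A sketch of that precaution for a real-mode loader, assuming NASM, a flat binary, and DS pointing at the loader's own data. The labels load_gdt, gdt and gdt_desc are hypothetical. It computes the linear base at run time and uses o32 so the top byte of the base is loaded rather than zeroed:

```nasm
bits 16

load_gdt:
        xor eax, eax
        mov ax, ds
        shl eax, 4              ; linear address of the start of DS
        add eax, gdt            ; plus the table's offset within DS
        mov [gdt_desc + 2], eax ; store the full 32-bit base
    o32 lgdt [gdt_desc]         ; load limit and all 32 base bits
        ret

gdt:    dq 0                    ; mandatory null descriptor
        dq 0x00CF9A000000FFFF   ; flat 32-bit code, base 0, 4 GB limit
        dq 0x00CF92000000FFFF   ; flat 32-bit data, base 0, 4 GB limit
gdt_end:

gdt_desc:
        dw gdt_end - gdt - 1    ; limit in bytes, minus 1
        dd 0                    ; base, filled in by load_gdt
```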

> o32 push word [mem]
> push dword [mem]
>
> generate exactly the same code. Which form do you prefer?

I would prefer this not valid for NASM code:

o32 push [mem]

I.e., technically and at least for 16-bit and 32-bit code, the stack size is
the same as the code segment size, so it's "known" or should be the
default case that a stack item is 16-bits for RM code in BITS 16
section, just as AX indicates 16-bit register. Ditto for 32-bits. I'm not
familiar enough with 64-bit NASM mode, and am not currently sure of the
stack size for it, and am not going to go look it up ... ;)

> I was saying it would be good if Nasm supported LGDT (or LIDT) in
> the same form as for other instructions such as push that was used as
> an example, above. So,
>
> lgdt dword [mem]
>
> would produce the full load for which we currently have to use
>
> o32 lgdt [mem]
>
> The former may be more consistent but it is good that Nasm at least
> allows the latter.
...

> No, my comments are all about Nasm. It has keywords such as word
> and dword.

Yes, but their use is minimal and usually indicated via a warning message.
You shouldn't ever need to consciously and preemptively place a keyword
such as 'word' or 'dword' with NASM. You place one only when NASM complains
and then it's usually the default segment size.

> > > 66FF36[0501] o32 push word [mem]
>
> > Apparently valid, but you're going around in circles ...
>
> Not at all. I'm trying to generate a comparison between push
> (as an example instruction that allows data size qualifiers) and
> lgdt (which doesn't).

You should just say so:

Why does (or doesn't) NASM support this syntax "..." for LGDT, and NASM
doesn't (or does) support this syntax for PUSH "..."?

(Are Americans too direct? ;-)

I don't know, but my guess is that they were probably back filling syntax
with 'word' and 'dword' only when they couldn't do without it easily.

> I was hoping Nasm would allow
>
> lgdt word [x]
> lgdt dword [x]
>
> where the "word" or "dword" qualifiers relate to the
> 'operand size' mentioned in the manual.

If o32 works, why?

FYI, I'm fairly sure H.P. Anvin would likely agree with you. He added
similar, unnecessary I believe, MASM-like keywords to NASM for
64-bit mode despite my complaints of "destroying" the NASM
philosophy with MASM keywords. Add a request to NASM's bugfix page.

> The icing on the cake would be if new versions of Nasm were to
> issue a warning when the LGDT & LIDT operand size was not
> specified.

Huh?

It is specified: BITS 16, BITS 32, ...

> That would be a useful indicator to the programmer to point out
> that the default form in 16-bit mode does NOT load a 32-bit offset.
> Until this discussion I had been unaware that there was a 5-byte form
> of LGDT and that that form was the default. I can't be the only
> programmer not to have known that.

Oh ... Ok.

Well, you could put that issue, and the syntax symmetry issue, and maybe ask
for a "programmer safe" option for LGDT/LIDT and other etc instructions by
default or as an option, in their bugfix requests.

And, why would you expect a 32-bit offset in 16-bit code?

Is this an assembler issue or a programmer issue?

Yes, there are a few things one wouldn't expect that x86 instructions do.
Ever looked at DIV or IDIV? Ever tried to use DIV/IDIV? Ever
unexpectedly fail when using DIV/IDIV?


Rod Pemberton


Rod Pemberton

Dec 10, 2011, 8:37:12 PM
to
"James Harris" <james.h...@googlemail.com> wrote in message
news:0c96df7c-3afe-444e...@l24g2000yqm.googlegroups.com...
> On Nov 12, 3:44 am, "Rod Pemberton" <do_not_h...@noavailemail.cmm>
> wrote:
> > "Rod Pemberton" <do_not_h...@noavailemail.cmm> wrote in message
...

> > [manuals contradictory on entering PM]
>
> An interesting list and I think you might be right that the BIOS
> could have tampered with the bits. I wonder if any other OS
> goes to the trouble you did.

I doubt it. I'm not sure why I did either. I was being thorough at the
time, I guess, and reading the manuals at the time too ... I really see no
need for it. CR3, CR4 should be left alone, until you need paging. I've
not heard of anyone using PAE. PE is the only bit in CR0 that is really
needed. For some OSes, the coprocessor bits in CR0 may need to be set.
BIOS is usually good about cleanup.

If the SMM mode is used on a specific motherboard, maybe it messes with
things more, e.g., if SMM is activated between setting CR0.PE and the far
jump. That could be a "serious" issue. I thought I mentioned that
somewhere ...

Have you ever found mention of a simple set CR0.PE, lgdt, and far jump
failing? Or, heard rumors of it failing? (Yeah, I know there is one on
clax right now that hasn't been entirely worked out ...)

> Certainly, the extra paging bits would be irrelevant
> until paging was enabled and then, if not at defaults, they
> would certainly jump up and bite the Pmode code.

BIOS can enable and disable paging.

> Under the topic of conflicts between Intel manuals these bits don't
> apply do they? I guess that none of the manuals specify to set or
> clear bits other than those for Pmode and paging in CR0.
...

> So are the manuals contradictory?

Well, I thought so at the time! :-)

> Whether the manuals contradict or not, I think you have raised a great
> point that even though the CPU manufacturer's docs say to do things a
> certain way a cautious OS writer will make allowances for the state
> that could have been left by the BIOS.

One of the things I was doing back then was to not assume or use anything
from the BIOS in my OS and related code. I wanted it to be independent.
That may have been part of my reasoning then, but I don't recall now.
Obviously, you've been testing a lot of stuff to check what a variety of
machines actually do. That's good, IMO.

> Of course, some of the bits you mention do not apply to earlier CPUs.
> Without going to the trouble of detecting the CPU type and responding
> accordingly there's possibly an appropriate set of bits to zero and
> another set of bits to leave alone.

Yes. That's very true. At the time, circa '05-'07 ... '08, I only had the
current manuals. So, that definitely didn't include older CPUs. I also
didn't know what future CPUs would do either, but I assumed most of the
major bits were defined. So, therefore, the major set of bits was safe to
set and clear up to a certain generation of cpu, while a few other bits were
to be "not touched", i.e., reserved, and expected to act in a safe manner
when read and rewritten. If you modify and write other bits in those
registers, the reserved bits get rewritten too, so the cpu should respond
safely with writes to those bits, i.e., block, accept same value, whatever.
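
A minimal sketch of that read-modify-write pattern, assuming NASM; CR0_PE is a name introduced here purely for illustration:

```nasm
bits 16

CR0_PE  equ 1                   ; bit 0 of CR0: protection enable

        mov eax, cr0            ; read everything, reserved bits included
        or  eax, CR0_PE         ; change only the bit of interest
        mov cr0, eax            ; reserved bits go back exactly as they were read
        ; a far jump to a protected-mode code selector would follow here
```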


Rod Pemberton



Rod Pemberton

Dec 10, 2011, 9:07:57 PM
to
"James Harris" <james.h...@googlemail.com> wrote in message
news:fbb769f2-77a2-40da...@n10g2000vbg.googlegroups.com...
> On Nov 12, 3:22 am, "Rod Pemberton" <do_not_h...@noavailemail.cmm>
> wrote:
> > "James Harris" <james.harri...@googlemail.com> wrote in message
...

> > > Since Intel specify the sequence there's no assurance that LIDT in RM
> > > will correctly set up the PM IDT descriptor on *all* x86-32 CPUs.
>
> > AIR, I attempted to follow both old and new manuals, and there are a
> > number of conflicting differences.
>
> Like yourself I am going from memory but I don't recall seeing them as
> conflicting. Maybe that's because I took the 386 manuals as showing
> nothing more specific than the general steps.

I didn't have the 386 or 486 or "586" manuals at the time, just current
manuals circa '05 to '07. I've got a bunch more of them now. I suspect the
386 manual might clear up a few things, at least for me. I could still use
a 286 manual. I'm still searching for that .pdf (authorized or not ...).

> For example, I think it says that a jump (near or far) should follow
> the move to CR0.

AIR, the "new" manuals want the jump immediately after.

> Presumably this applies to 386 and 486 and possibly
> the Pentium 1 - i.e. CPUs where a jump flushes the prefetch
> queue. And later manuals say to do a far jump at this point.

Are you saying they changed the type of jump, or the early manuals
weren't clear?

> Well, a far jump will work on any CPU from the 386 onward
> so I don't see it as a conflict. The far jump is needed in order
> to set CS and has to be done anyway, even on a 386.

Note to self: I've got to read, er... skim, the 386 manual someday ...

> > Since they don't mention the reason, it's also possible they intend to
> > merge instructions or perhaps eliminate some instructions in the future.
> > E.g., maybe there is some problem with design changes for future cpu's.
> > In which case, the startup sequence may be more similar to what they
> > suggest and easier to fix. Or, it could just be a slight timing issue.
> > Who knows?
>
> Maybe. They could have just decided it would help their future
> hardware designs if they had a standard transition sequence to
> support.

In the other post, I mentioned the possibility of SMM being activated between
the two instructions. I don't know if that would affect anything or not,
but it could be why they want them together. E.g., it might be a problem
switching to SMM in-between.

> > > Yes, LIDT works in either mode but isn't the query whether
> > > the IDTR value loaded in one mode is valid in the other?
>
> > If the IDTR is loaded in one mode for use in another and the IDT is
> > not used until the correct mode for it is activated, what is the issue?
>
> If the IDTR is implemented in hardware as literally a 16-bit segment
> and a 32-bit offset and nothing else, and loading it sets no other
> state then for sure I would agree. I just don't know that it is
> implemented that way on all CPUs.

If it's not in the manuals, what do we know about it ... ?

I must assume the instructions work as described, and they are using
standard electronic latches, etc. I probably infer way too much based on my
obsolete electronics knowledge and ancient microprocessor experiences.
I have a certain expectation about how things should work, and they seem
to do so at the time I program them! ;-)

> Loading the IDTR may also cache some info which is not valid in the
> other mode such as protection bits. Of course, even that would not be
> an issue if changing mode flushed such cached info but we don't know
> that that would be done. Sure, I am being extra cautious and I think
> you are probably right that loading the Pmode IDTR in real mode is
> probably safe on all CPUs. But as we can't be sure I would rather
> avoid it.
...

> It's the fact that Intel's later documented [activate PM] sequence
> doesn't say it will work on all CPUs that suggests to me not to rely
> on it.

I actually can't read the current manuals. Newer .pdf's are unreadable on
this OS, so far. I might locate another reader. Do the current manuals say
the same thing about far jumping immediately afterwards? (Yes.)

> > [apparently answered a different question]
>
> I was querying whether your belief was that LIDT was provided in
> real mode *solely* so that the CPU could load the IDTR for
> protected mode.

No, I stated that most will load IDTR in PM if switching in or out of PM.
However, the other uses for LIDT in RM, that I'm aware of as being
legitimate, seem to be so rare as to imply that LIDT's purpose for being
supported in RM is to load a PM IDTR. Unless there is a good use for LIDT
in RM, why is it supported? (already asked).

> Why is [LIDT] needed in RM?
> > [...]
> Well, I do know of one other reason, other than those I mentioned above.
> Supposedly (unconfirmed), Intel designed their chips so that they could be
> used in embedded environments that placed startup code at zero instead of
> at the initial processor startup location. It's rumored that 0xFF 0xFF
> (i.e., bus pull-up read of location without memory or ROM) is a NOP on
> Intel's so the processor execution can wrap to zero and begin executing
> code there (zero). AMDs supposedly page fault on wrap-around. It's
> possible that if it works that way, it's only a temporary instruction,
> until an actual opcode is executed. If true, relocating the RM IVT from
> zero would be required for such a setup.
>

Well, actually, I do know of one more possible reason too. Supposedly, the
x86 supports a true 32-bit RM mode. In which case, with the larger address
range available, someone may want to relocate the IVT. I've just not been
fully able to confirm whether it's true from someone in the know. The few
guys who mention it aren't exactly mentally sound, IMO, or can't describe it
in standard terminology.
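
If someone did want to try relocating the IVT in ordinary real mode, a sketch might look like this. It assumes NASM, a 386 or later, and a free 1 KB at the hypothetical address 1000:0000. It is untested here, and BIOS or SMM code that assumes the table sits at zero could still break:

```nasm
bits 16

relocate_ivt:
        pushf                   ; remember the caller's interrupt flag
        push ds
        push es
        cli
        cld
        xor si, si
        mov ds, si              ; DS:SI = 0000:0000, the power-on IVT
        mov ax, 0x1000
        mov es, ax
        xor di, di              ; ES:DI = 1000:0000, hypothetical new home
        mov cx, 512             ; 256 vectors * 4 bytes = 512 words
        rep movsw               ; copy the existing vectors
        lidt [cs:new_ivt]       ; base is below 16 MB, so no o32 is needed
        pop es
        pop ds
        popf
        ret

new_ivt:
        dw 0x03FF               ; limit: 1 KB - 1
        dd 0x00010000           ; linear base of 1000:0000
```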

Of course, the "truth" about LIDT in RM is probably somewhere in-between, or
perhaps both of, the "generic instruction set" argument and the "needed for
something" argument.


Rod Pemberton



James Harris

Dec 11, 2011, 4:20:24 AM
to
On Dec 11, 1:22 am, "Rod Pemberton" <do_not_h...@noavailemail.cmm>
wrote:
> "James Harris" <james.harri...@googlemail.com> wrote in message
>
> news:e38ebf2b-dd12-405b...@q11g2000vbq.googlegroups.com...
> > On Nov 12, 3:22 am, "Rod Pemberton" <do_not_h...@noavailemail.cmm>

...

> >  o32 push word [mem]
> >  push dword [mem]
>
> > generate exactly the same code. Which form do you prefer?
>
> I would prefer this not valid for NASM code:
>
> o32 push [mem]

The thing is, o16 and o32 are in Nasm to provide the optional
generation of operand size overrides. The CPU has no prefix for 8-bit
data so where an 8-bit operand was intended the o prefix option would
not be valid. (See the examples below.)

> I.e., technically and at least for 16-bit and 32-bit code, the stack size is
> the same as the code segment size, so it's "known" or should be the
> default case that a stack item is 16-bits for RM code in BITS 16
> section, just as AX indicates 16-bit register.  Ditto for 32-bits.  I'm not
> familiar enough with 64-bit NASM mode, and am not currently sure of the
> stack size for it, and am not going to go look it up ... ;)

Push was chosen to illustrate the point but the same issue exists
elsewhere. Say we choose neg as the example instruction. Then

neg [mem]

is not valid in Nasm because the assembler has no way to tell the
width of the memory that is to be negated. So Nasm provides data-size
qualifiers and we can write

neg dword [mem]
neg word [mem]
neg byte [mem]

Nice and simple! The alternative, if you prefer the above o32 prefix
arrangement would become

o32 neg [mem]
o16 neg [mem]
neg byte [mem]

which is pretty inconsistent - both in terms of clarity and syntax.

> > I was saying it would be good if Nasm supported LGDT (or LIDT) in
> > the same form as for other instructions such as push that was used as
> > an example, above. So,
>
> >  lgdt dword [mem]
>
> > would produce the full load for which we currently have to use
>
> >  o32 lgdt [mem]
>
> > The former may be more consistent but it is good that Nasm at least
> > allows the latter.
>
> ...
>
> > No, my comments are all about Nasm. It has keywords such as word
> > and dword.
>
> Yes, but their use is minimal and usually indicated via a warning message.
> You shouldn't ever need to consciously and preemptively place a keyword
> such as 'word' or 'dword' with NASM.  You place one only when NASM complains
> and then it's usually the default segment size.

Is it possible you are under a misapprehension? Data size specifiers
have to be put in where they are needed. They apply any time the
operand size is not known from the operands. For example,

neg word [mem]
not byte [esi]
push dword [eax + 4 * ebp]
bt dword [mem], 1
bt word [mem], 1

All of these require the data size specification and we don't need to
wait for the assembler to tell us so.

> > > >   66FF36[0501]       o32 push word [mem]
>
> > > Apparently valid, but you're going around in circles ...
>
> > Not at all. I'm trying to generate a comparison between push
> > (as an example instruction that allows data size qualifiers) and
> > lgdt (which doesn't).
>
> You should just say so:
>
> Why does (or doesn't) NASM support this syntax "..." for LGDT, and NASM
> doesn't (or does) support this syntax for PUSH "..."?
>
> (Are Americans too direct? ;-)

Haha - it's more likely that the English are too, er, circumlocutory!

> I don't know, but my guess is that they were probably back filling syntax
> with 'word' and 'dword' only when they couldn't do without it easily.

This can't be right. The keywords byte, word and dword had to be in
from early days.

> > I was hoping Nasm would allow
>
> >  lgdt word [x]
> >  lgdt dword [x]
>
> > where the "word" or "dword" qualifiers relate to the
> > 'operand size' mentioned in the manual.
>
> If o32 works, why?

For consistency with other operand size specifications.

> FYI, I'm fairly sure H.P. Anvin would likely agree with you.  He added
> similar, unnecessary I believe, MASM-like keywords to NASM for
> 64-bit mode despite my complaints of "destroying" the NASM
> philosophy with MASM keywords.  Add a request to NASM's bugfix page.
>
> > The icing on the cake would be if new versions of Nasm were to
> > issue a warning when the LGDT & LIDT operand size was not
> > specified.
>
> Huh?
>
> It is specified: BITS 16, BITS 32, ...

No, I mean if the operand size was not specified in the instruction. I
know the semantics are slightly different but compare the following
two for syntax and appearance. Just as Nasm will tell us that it does
not know the operand size if we code

neg [mem]

I would like the assembler to say the same thing if we code

lgdt [mem]

and for the same reason. The operand size was not specified.

> > That would be a useful indicator to the programmer to point out
> > that the default form in 16-bit mode does NOT load a 32-bit offset.
> > Until this discussion I had been unaware that there was a 5-byte form
> > of LGDT and that that form was the default. I can't be the only
> > programmer not to have known that.
>
> Oh ...  Ok.
>
> Well, you could put that issue, and the syntax symmetry issue, and maybe ask
> for a "programmer safe" option for LGDT/LIDT and other etc instructions by
> default or as an option, in their bugfix requests.

I try to avoid having yet more web-server accounts but that may be
worth doing.

> And, why would you expect a 32-bit offset in 16-bit code?

I didn't know there was a form of lgdt that didn't load a full
pointer. Nor did I know it was the real-mode default! A nasty gotcha,
IMHO.

> Is this an assembler issue or a programmer issue?

Haha - partly a programmer one - me. But I have seen other people's
code for lgdt and I'm sure I'm not the only one who was ignorant of
the need for an o32 if the GDT is above 16Mby.

Also, I've seen printed code (from Intel) that makes no
acknowledgement whatever of the need to specify 32-bit in certain
circumstances.... In fact I've just checked two Intel sources and
their code makes no mention of it.

All in all, it would be better if the assembler warned that a size was
needed - as it does for other instructions.

> Yes, there are a few things one wouldn't expect that x86 instructions do.
> Ever looked at DIV or IDIV?  Ever tried to use DIV/IDIV?  Ever
> unexpectedly fail when using DIV/IDIV?

Yes, this can be tricky. I don't remember it and look it up every
time.
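
For reference, a sketch of the usual trap, assuming NASM and 16-bit operands. DIV r/m16 divides the 32-bit value in DX:AX, so stale data in DX changes the result or raises a divide error when the quotient does not fit in AX:

```nasm
bits 16

        mov ax, 1000            ; low word of the dividend
        xor dx, dx              ; high word: clear it, DIV uses DX:AX
        mov bx, 7
        div bx                  ; AX = 142 (quotient), DX = 6 (remainder)

        ; With DX >= BX on entry the quotient would exceed 16 bits
        ; and the CPU would raise #DE instead of returning a result.
        ; For signed IDIV, sign-extend AX into DX with CWD instead.
```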

James

James Harris

Dec 11, 2011, 4:52:16 AM
to
On Dec 11, 2:07 am, "Rod Pemberton" <do_not_h...@noavailemail.cmm>
wrote:
> "James Harris" <james.harri...@googlemail.com> wrote in message
>
> news:fbb769f2-77a2-40da...@n10g2000vbg.googlegroups.com...
> > On Nov 12, 3:22 am, "Rod Pemberton" <do_not_h...@noavailemail.cmm>

...

> I didn't have the 386 or 486 or "586" manuals at the time, just current
> manuals circa '05 to '07.  I've got a bunch more of them now.  I suspect the
> 386 manual might clear up a few things, at least for me.  I could still use
> a 286 manual.  I'm still searching for that .pdf (authorized or not ...).

I've never found a 286 pdf but I have the manual as a text file. Do
you have that? ... In fact I'd forgotten but since the DOS text file
contained DOS graphics characters which didn't display on Windows I
made a new version as an rtf with block graphics that do display on
windows. I'm not sure how this will appear in Usenet but instead of

ÉÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍ»
º LEVEL 3 ÄÄÄÄÄÄÄÄ×ÄÄLEAST TRUSTED
º ÉÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍ» º

the converted version has

┌───────────────────────────┐
│ LEVEL 3 ────────┼──LEAST TRUSTED
│ ┌─────────────────────┐ │

except that in the rtf the RHS vertical line matches up.

> > For example, I think it says that a jump (near or far) should follow
> > the move to CR0.
>
> AIR, the "new" manuals want the jump immediately after.

The jump had to be immediately after even on the 386.

> > Presumably this applies to 386 and 486 and possibly
> > the Pentium 1 - i.e. CPUs where a jump flushes the prefetch
> > queue. And later manuals say to do a far jump at this point.
>
> Are you saying they changed the type of jump, or the early manuals
> weren't clear?

They originally specified that there had to be a jump. Later manuals
recommended, for forward and backward compatibility, to use a far
jump. It makes sense to me as CS has to be loaded at some point. Why
not do it straight away.
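
A sketch of the sequence with the far jump placed straight after the CR0 write. It assumes NASM, a flat binary loaded at 0000:7C00, and a hypothetical GDT whose second entry (selector 0x08) is flat 32-bit code and whose third (0x10) is flat data:

```nasm
bits 16
org 0x7C00

start:
        cli
        xor ax, ax
        mov ds, ax              ; so [gdt_desc] resolves against segment 0
    o32 lgdt [gdt_desc]
        mov eax, cr0
        or  al, 1               ; set CR0.PE
        mov cr0, eax
        jmp dword 0x08:pm_entry ; far jump: discards prefetched code, loads CS

bits 32
pm_entry:
        mov ax, 0x10            ; hypothetical flat data selector
        mov ds, ax
        mov es, ax
        mov ss, ax
        mov esp, 0x7C00         ; stack below the loaded code
.hang:  hlt
        jmp .hang

gdt:    dq 0                    ; null descriptor
        dq 0x00CF9A000000FFFF   ; flat 32-bit code
        dq 0x00CF92000000FFFF   ; flat 32-bit data
gdt_end:

gdt_desc:
        dw gdt_end - gdt - 1
        dd gdt
```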

...

> In the other post, I mentioned the possibility of SMM being activated between
> the two instructions.  I don't know if that would affect anything or not,
> but it could be why they want them together.  E.g., it might be a problem
> switching to SMM in-between.

I saw it. Yes, horrible, nasty, yucky SMM. Not sure it would cause a
problem here. I just don't like it.

...

> > If the IDTR is implemented in hardware as literally a 16-bit segment
> > and a 32-bit offset and nothing else, and loading it sets no other
> > state then for sure I would agree. I just don't know that it is
> > implemented that way on all CPUs.
>
> If it's not in the manuals, what do we know about it ... ?

Not a lot. It's what we *don't* know that makes me cautious.

...

> > It's the fact that Intel's later documented [activate PM] sequence
> > doesn't say it will work on all CPUs that suggests to me not to rely
> > on it.
>
> I actually can't read the current manuals.  Newer .pdf's are unreadable on
> this OS, so far.  I might locate another reader.  Do the current manuals say
> the same thing about far jumping immediately afterwards? (Yes.)

Yes, they say to carry out a far jump immediately afterwards. (I
suspect it's not a big issue on a newer CPU as the move to CR0 flushes
the prefetch queue but that it's a sequence that works on old and new
processors.)

> > > [apparently answered a different question]
>
> > I was querying whether your belief was that LIDT was provided in
> > real mode *solely* so that the CPU could load the IDTR for
> > protected mode.
>
> No, I stated that most will load IDTR in PM if switching in or out of PM.
> However, the other uses for LIDT in RM, that I'm aware of as being
> legitimate, seem to be so rare as to imply that LIDT's purpose for being
> supported in RM is to load a PM IDTR.  Unless there is a good use for LIDT
> in RM, why is it supported? (already asked).

OK

James

Rod Pemberton

Dec 12, 2011, 8:23:59 AM
to
"James Harris" <james.h...@googlemail.com> wrote in message
news:c4d5d1c4-30f4-4eac...@a17g2000yqj.googlegroups.com...
> On Dec 11, 1:22 am, "Rod Pemberton" <do_not_h...@noavailemail.cmm>
> wrote:
> > "James Harris" <james.harri...@googlemail.com> wrote in message
> >
> > news:e38ebf2b-dd12-405b...@q11g2000vbq.googlegroups.com...
> > > On Nov 12, 3:22 am, "Rod Pemberton" <do_not_h...@noavailemail.cmm>
...

> The thing is, o16 and o32 are in Nasm to provide the optional
> generation of operand size overrides.

False. I originally responded with "True." Upon rereading, I realized that
you said "optional" ... To program mixed-mode code, they are required.
I.e., if you want a 32-bit form of an instruction in 16-bit mode, you need
the instruction as it would compile for 32-bit mode, e.g., larger offset,
extra bytes if encoding an SIB mode, etc, plus an override since it's in the
wrong mode. Using a BITS 32 directive will not put the override onto the
instruction.

> The CPU has no prefix for 8-bit data
>

True. Why would that be needed? Encodings for 8-bit instructions are 8-bits
in both modes. I.e., the binary sequence means the same thing, unlike the
identical encodings for 16-bit/32-bit instructions, which mean 16-bit in
16-bit mode and 32-bit in 32-bit mode, unless there is an override prefix,
in which case, their meaning is reversed.
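
A short listing makes the reversal concrete. This is a sketch assuming NASM, with the expected bytes in the comments derived from the opcode tables (F7 /3 and F6 /3 for NEG) rather than copied from a listing file:

```nasm
bits 16
        neg ax                  ; F7 D8      default size in 16-bit code
        neg eax                 ; 66 F7 D8   prefix selects 32 bits
        neg al                  ; F6 D8      8-bit form, never prefixed

bits 32
        neg eax                 ; F7 D8      same bytes as "neg ax" above
        neg ax                  ; 66 F7 D8   same bytes as "neg eax" above
        neg al                  ; F6 D8      unchanged between modes
```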

> so where an 8-bit operand was intended the o prefix option
> would not be valid. (See the examples below.)

True. The oNN prefix is for 16-bit or 32-bit operands. If placed on an
8-bit instruction, it should do nothing since 8-bit instructions use the
same encodings in both modes (16-bit, 32-bit). I don't believe there
are any exceptions to that. The register modes are the same. The
addressing modes are different. So, the 8-bit instructions will be decoded
as different instructions if a memory operand is specified.

> > I.e., technically and at least for 16-bit and 32-bit code, the stack
> > size is the same as the code segment size, so it's "known" or should
> > be the default case that a stack item is 16-bits for RM code in BITS
> > 16 section, just as AX indicates 16-bit register. Ditto for 32-bits. I'm
> > not familiar enough with 64-bit NASM mode, and am not currently
> > sure of the stack size for it, and am not going to go look it up ... ;)
>
> Push was chosen to illustrate the point but the same issue exists
> elsewhere. Say we choose neg as the example instruction. Then
>
> neg [mem]
>

Push is different from neg. It doesn't have the same forms. The issue with
neg does not exist with push. The stack is never 8-bits. The stack is
16-bits for 16-bit mode and 32-bits for 32-bit mode. The size of pushed
data can be changed from 16-bits to 32-bits or vice-versa with an override.
There is no syntax conflict between "r16" and "r8", or "r32" and "r8", or
"m16" and "m8", or "m32" and "m8" since there is no "r8" and no "m8" form
for push. The same is true for pop and 21 other instructions without "r8"
and "m8" forms. There is an imm8 form for push which gets extended to the
stack size. For push mem, there is no "push r/m8". So, there is only one
push mem form for each mode, i.e., push is "push r/m16" for 16-bit mode or
push is "push r/m32" for 32-bit mode.
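
As a sketch of those forms in 16-bit code, assuming NASM and a hypothetical label mem (expected bytes in the comments follow from the FF /6 and 6A encodings):

```nasm
bits 16

        push word  [mem]        ; FF 36 lo hi      default size in 16-bit code
        push dword [mem]        ; 66 FF 36 lo hi   overridden to 32 bits
        push byte  5            ; 6A 05            imm8, sign-extended to 16 bits
;       push byte  [mem]        ; not encodable: PUSH has no r/m8 form

mem:    dd 0
```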

> neg [mem] is not valid in Nasm because the assembler has no way to
> tell the width of the memory that is to be negated.

No. It's not valid because NASM doesn't use BITS 16 or BITS 32 to do so.
It could. It should. Not doing so implies that mixed-mode code, 16-bit
data in 32-bit code or 32-bit data in 16-bit code, is more prevalent than
16-bit data in 16-bit code or 32-bit data in 32-bit code. It's not. The
default code segment size is what 95% to 100% of the code in that section
will be. E.g., if I specify the following, I know or expect data at 'mem'
to be 16-bits:

BITS 16
neg [mem]

> So Nasm provides data-size
> qualifiers and we can write
>
> neg dword [mem]
> neg word [mem]
> neg byte [mem]
>
> Nice and simple!

Is it? In BITS 16, dword inserts an o32. In BITS 32, word inserts an o16.
In BITS 16, mem is 2-bytes. In BITS 32, mem is 4-bytes. Quick! Without
rereading, tell me which has an override in BITS 32?

> The alternative, if you prefer the above o32 prefix
> arrangement would become
>
> o32 neg [mem]
> o16 neg [mem]
> neg byte [mem]
>
> which is pretty inconsistent - both in terms of clarity and syntax.

Did you compile and disassemble those? For which mode(s) did you compile?
Did you decompile for the same mode(s)? Theoretical ... ?

If that is real NASM syntax, it's invalid. If it's theoretical syntax, it's
still "incorrect" (from my perspective).

For BITS 16 (theoretical syntax):
neg [mem] ; 16-bit data-size
o32 neg [mem] ; 32-bit data-size

For BITS 32 (theoretical syntax):
neg [mem] ; 32-bit data-size
o16 neg [mem] ; 16-bit data-size

I've ignored the 'byte' case since that confuses things. 'byte' syntax in
x86 doesn't need overrides since they are the same encodings in both modes.
What NASM needs is a decent way to specify the 8-bit form when syntax
conflicts with 16/32-bit forms.

The override switches the compile mode from the default for the specific
instruction. This was real weird for me to grasp at first. I like it now.
I think of x86 code as purely 16-bit when in 16-bit mode or as purely 32-bit
in 32-bit, with the option to use 8-bit instructions or selectively override
an instruction to a different address or operand size. I.e., if I specify
BITS 16, I want registers, stack, and memory data to all be 16-bits in size.
Now, NASM doesn't do that in its entirety and the x86 instruction set has a
few limitations too, but that is what I desire.

Since you didn't specify BITS 16 or BITS 32 for either of your examples, it
appears to me that you've specified mixed-mode instructions when they aren't
needed. For the default case of "neg [mem]," o32 and o16 should not be
needed as I've demonstrated above due to BITS 16 or BITS 32.

FYI:
; o16/o32 are the same so these are valid too.
; They look odd and are confusing with an
; override of the same size as the code
; segment size.
For BITS 16 (theoretical syntax):
o16 neg [mem] ; 32-bit data-size
For BITS 32 (theoretical syntax):
o32 neg [mem] ; 16-bit data-size

> Is it possible you are under a misapprehension?

Did you mean misconception or miscomprehension? It's possible, but
unlikely, IMO. ;-)

> Data size specifiers have to be put in where they are
> needed. They apply any time the operand size is not
> known from the operands.

True, but NASM complains for those situations ... This is not necessarily
because there is insufficient information. It just doesn't use all
available info.

> For example,
>
> neg word [mem]
> not byte [esi]
> push dword [eax + 4 * ebp]
> bt dword [mem], 1
> bt word [mem], 1
>
> All of these require the data size specification and we
> don't need to wait for the assembler to tell us so.

You need to do so when NASM complains. I checked. NASM complains for all
those cases for both BITS 16 and BITS 32.

AIUI (which may not be completely accurate), NASM complains for:
1) memory operands, because NASM isn't "smart" enough. This is not
necessarily because there is insufficient information. A default size is
specified by BITS 16 or BITS 32. That could be used. NASM just doesn't use
it. 8-bit instructions could be the non-standard choice requiring a
keyword.
2) syntax conflict between 8-bit and 16/32-bit operand, which only occurs
for memory operands. Register naming provides sufficient information to
select the correct size for register operands.

But, if NASM used BITS 16 and BITS 32 to specify the default sizes for
memory, keywords wouldn't be "required" for most of them would they? (No.
'byte' being the exception.)

> > > The icing on the cake would be if new versions of Nasm were
> > > to issue a warning when the LGDT & LIDT operand size was
> > > not specified.
>
> > Huh?
>
> > It is specified: BITS 16, BITS 32, ...
>
> No, I mean if the operand size was not specified in the instruction.

It is specified elsewhere when not specified in the instruction: BITS 16,
BITS 32, ... It's just that NASM doesn't always use the specified default
size as the default size, e.g., for memory operands. It's not
"smart" enough to use the default size for memory, apparently ...

> I know the semantics are slightly different but compare the
> following two for syntax and appearance. Just as Nasm will
> tell us that it does not know the operand size if we code
>
> neg [mem]

I am of the strong opinion that NASM shouldn't complain for that ...

BITS 16 or BITS 32 specifies the default size (or should), i.e., no 8-bit
instructions by default, 16-bit or 32-bit forms by default. IMO, NASM
should compile - if it worked properly - the 0xF7 form for both. Selecting
the 8-bit form 0xF6 in 16-bit or 32-bit code is the non-standard choice.
The stack size is fixed and is not 8-bits. Using overrides to select a
larger or smaller operand is a non-standard choice too.

> I would like the assembler to say the same thing if we code
>
> lgdt [mem]
>
> and for the same reason.

NASM is "dumb" ... ? That wasn't the 'same reason' was it? LOL!

It's definitely not using information gleaned from BITS 16 or BITS 32. It's
definitely not taking the default stack size into consideration. If the
stack is 16-bits and we do a push [mem], why wouldn't NASM "think" the
memory data size is 16-bits just like the stack?

> The operand size was not specified.

Nor, AISI, should it need to be ... A valid form exists for BITS 16 and BITS
32, and o16/o32 and a16/a32 can select different operand and addressing
combinations when needed.

> Also, I've seen printed code (from Intel) that makes no
> acknowledgement whatever of the need to specify 32-bit in certain
> circumstances.... In fact I've just checked two Intel sources and
> their code makes no mention of it.

Have you read the mixed-mode code sections of the manuals yet? (No.)

Personally, if I knew enough about NASM's code and was being paid to
develop it, I would take it away from the use of keywords like 'word',
'dword', etc. I.e., minimize their usage even further. I don't have a nice
solution for eliminating 'byte' since something is needed to select 8-bit
forms for memory due to syntax conflicts. However, I probably wouldn't want to
work on it since it gained 64-bit code ability.


Rod Pemberton



Antoine Leca

Dec 12, 2011, 10:58:28 AM
James Harris wrote:
> On Nov 12, 3:22 am, "Rod Pemberton" wrote:
>>> If the base of the GDT is at 2**24 or above the
>>> lgdt command will be unsuccessful.
>>
>> Not sure. Does it just wrap? I.e., upper address bits are effectively
>> truncated since they aren't loaded or are cleared? That's my guess.
>
> I think I see your point. Per a 386 manual, the high order byte is
> "not used." Perhaps this refers to what comes from memory. That byte
> is probably set to zero in the register. I suppose a separate bit
> could be set to say that when placing the GDTR value onto the bus only
> output 24 bits of the offset and zero the top 8 bits. Sounds like hard
> work for the hardware designer but I suppose it could be done that
> way, perhaps on an early design taken from the 286.

I guess the hardware designer did not work that hard :-)
From the 80386 hardware reference, about differences with 80286:

13.3.1 Wraparound of 80286 24-Bit Physical Address Space

With the 80286, any base and offset combination that
addresses beyond 16M bytes wraps around to the first
megabyte of the 80286 address space. With the 80386,
since it has a greater physical address space, any
such address falls into the 17th megabyte. [...]

I cannot believe GDT accesses could be any different (on any processor),
even if they are not explicitly mentioned here.


You have obviously noticed that this works the same as the precedent of
the 8086->80286 compatibility, about the 1-MB wrap around (which was
noticed by IBM engineers and was "fixed" then by the A20 gate...)
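
For anyone who wants to see the 1-MB wrap in code, here is a minimal sketch,
assuming 16-bit real mode with interrupts disabled. The routine name
probe_wrap is made up for illustration. It only compares one byte, so a match
is consistent with wrapping but does not prove it.

```nasm
; Sketch only: real-mode code, assumed to run with interrupts disabled.
        bits 16

probe_wrap:                         ; hypothetical name
        push    ds
        push    es
        xor     ax, ax
        mov     ds, ax              ; DS:0000 -> linear 0x000000
        mov     ax, 0xFFFF
        mov     es, ax              ; ES:0010 -> 0xFFFF0 + 0x10 = 0x100000
        mov     al, [ds:0x0000]     ; byte at the bottom of memory
        mov     ah, [es:0x0010]     ; the same byte if address bit 20 is dropped
        pop     es
        pop     ds
        ret                         ; AL == AH is consistent with wrapping
```

A more careful probe would write a test value through one address and read it
back through the other.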


Antoine

James Harris

Dec 14, 2011, 3:15:34 PM
On Dec 12, 1:23 pm, "Rod Pemberton" <do_not_h...@noavailemail.cmm>
wrote:

...

> > The thing is, o16 and o32 are in Nasm to provide the optional
> > generation of operand size overrides.
>
> False.  I originally responded with "True."  Upon rereading, I realized that
> you said "optional" ...  To program mixed-mode code, they are required.

I meant that they add an optional prefix. For example,

1E o16 push ds ;in bits 16 mode
661E o16 push ds ;in bits 32 mode

The codes on the LHS are those generated by the assembler. The same
instruction, "o16 push ds", has generated two different machine code
sequences. Why? The second was in bits 32 mode so needed the prefix to
tell the CPU to do a 16-bit push. The o16 generates a prefix in one
mode but not the other. Hence, "optional."

...

> > The CPU has no prefix for 8-bit data
>
> True.  Why would that be needed?

I meant that if you could just use an oNN prefix to select 16- or 32-
bit operands, as you seemed to prefer, there is no equivalent o8 so
you would have to use a different way to tell the assembler to use an
8-bit operand leading to an inconsistent syntax.

...

> > Push was chosen to illustrate the point but the same issue exists
> > elsewhere. Say we choose neg as the example instruction. Then
>
> >   neg [mem]
>
> Push is different from neg.  It doesn't have the same forms.  The issue with
> neg does not exist with push.

The thing is, push and neg (and other instructions) allow reference to
memory such as these, none of which is valid in Nasm as the data size
is not specified.

push [mem]
neg [mem]
not [mem]
bt [mem], 1

As I understand it you would prefer them all to use the bits size by
default and I've no idea how you would specify byte widths on neg and
not. I like the fact that in Nasm they all use the same syntax (not
shown) to make them valid.

...

> > neg [mem] is not valid in Nasm because the assembler has no way to
> > tell the width of the memory that is to be negated.
>
> No.  It's not valid because NASM doesn't use BITS 16 or BITS 32 to do so.
> It could.  It should.

All I can say to that is YMMV.

> Not doing so implies that mixed-mode code, 16-bit
> data in 32-bit code or 32-bit data in 16-bit code, is more prevalent than
> 16-bit data in 16-bit code or 32-bit data in 32-bit code.  It's not.  The
> default code segment size is what 95% to 100% of the code in that section
> will be.  E.g., if I specify the following, I know or expect data at 'mem'
> to be 16-bits:
>
> BITS 16
> neg [mem]

Well, it's an opinion. I don't agree with it but I understand it. I do
use 16-bit values in 32-bit code.

> > So Nasm provides data-size
> > qualifiers and we can write
>
> >  neg dword [mem]
> >  neg word [mem]
> >  neg byte [mem]
>
> > Nice and simple!
>
> Is it?  In BITS 16, dword inserts an o32.  In BITS 32, word inserts an o16.
> In BITS 16, mem is 2-bytes.  In BITS 32, mem is 4-bytes.  Quick!  Without
> rereading, tell me which has an override in BITS 32?

LOL! The point is that I can code the instruction thinking about the
data and leave the assembler to generate a prefix if necessary.
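
As a sketch of what that looks like in practice, assuming a dword-sized
variable at a hypothetical label mem and flat binary output, the same three
source lines assemble as follows in each mode. The bytes in the comments are
the prefix, opcode and ModRM byte only; the displacement that follows is
omitted.

```nasm
        bits 16
        neg     word  [mem]     ; F7 1E     native size, no prefix
        neg     dword [mem]     ; 66 F7 1E  assembler adds the operand-size prefix
        neg     byte  [mem]     ; F6 1E     separate 8-bit opcode, never a prefix

        bits 32
        neg     word  [mem]     ; 66 F7 1D  prefix now needed for the 16-bit form
        neg     dword [mem]     ; F7 1D     native size, no prefix
        neg     byte  [mem]     ; F6 1D     separate 8-bit opcode, never a prefix

mem:    dd      0               ; hypothetical variable
```

The source says what the data is; which line gets the 66h prefix depends only
on the BITS setting.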

> > The alternative, if you prefer the above o32 prefix
> > arrangement would become
>
> >  o32 neg [mem]
> >  o16 neg [mem]
> >  neg byte [mem]
>
> > which is pretty inconsistent - both in terms of clarity and syntax.
>
> Did you compile and disassemble those?  For which mode(s) did you compile?
> Did you decompile for the same mode(s)?  Theoretical ... ?

Theoretical, yes. I thought I was clear enough. When I said, "would
become," I was saying that if you would prefer to use o16 and o32
instruction prefixes rather than Nasm's word and dword data prefixes
you would have to use a different format for the byte option. I
personally prefer a consistent format because I find it easier to
remember but maybe you remember each instruction separately.

...

> I've ignored the 'byte' case since that confuses things.

LOL! It doesn't confuse things in Nasm. It might confuse things in
your model. Sorry to laugh. Your comment reminded me of a comedy
sketch where the judge tells the lawyers not to mention the facts
"because they only confuse me."

>  'byte' syntax in
> x86 doesn't need overrides since they are the same encodings in both modes.
> What NASM needs is a decent way to specify the 8-bit form when syntax
> conflicts with 16/32-bit forms.

This is going rather off topic but how would you like to specify a
byte?

> The override switches the compile mode from the default for the specific
> instruction.  This was real weird for me to grasp at first.  I like it now.
> I think of x86 code as purely 16-bit when in 16-bit mode or as purely 32-bit
> in 32-bit, with the option to use 8-bit instructions or selectively override
> an instruction to a different address or operand size.  I.e., if I specify
> BITS 16, I want registers, stack, and memory data to all be 16-bits in size.
> Now, NASM doesn't do that in its entirety and the x86 instruction set has a
> few limitations too, but that is what I desire.

OK. I guess I tend to mix operand sizes more than you.

...

> > Is it possible you are under a misapprehension?
>
> Did you mean misconception or miscomprehension?  It's possible, but
> unlikely, IMO.  ;-)

:-) Misapprehension: "an understanding of something that is not
correct."

>
> > Data size specifiers have to be put in where they are
> > needed. They apply any time the operand size is not
> > known from the operands.
>
> True, but NASM complains for those situations ...  This is not necessarily
> because there is insufficient information.  It just doesn't use all
> available info.

True. Nasm doesn't store data sizes. I was surprised at that at first
but I've been working without it doing so for so long that I don't
think about its absence.

> > For example,
>
> >   neg word [mem]
> >   not byte [esi]
> >   push dword [eax + 4 * ebp]
> >   bt dword [mem], 1
> >   bt word [mem], 1
>
> > All of these require the data size specification and we
> > don't need to wait for the assembler to tell us so.
>
> You need to do so when NASM complains.  I checked.  NASM complains for all
> those cases for both BITS 16 and BITS 32.

Of course it does. That's why I wrote them! :-) The point is that you
can tell in advance that size specifiers are needed. You don't need to
wait for the assembler to tell you.

...

> > > It is specified: BITS 16, BITS 32, ...
>
> > No, I mean if the operand size was not specified in the instruction.
>
> It is specified elsewhere when not specified in the instruction: BITS 16,
> BITS 32, ...  It's just that NASM doesn't always use the specified default
> size as the default size, e.g., for memory operands.  It's not
> "smart" enough to use the default size for memory, apparently ...

It's not a case of being smart or not. It's just that Nasm doesn't
work that way.

...

> Personally, if I knew enough about NASM's code and was being paid to
> develop it, I would take it away from the use of keywords like 'word',
> 'dword', etc.  I.e., minimize their usage even further.

If you ever do please work on a fork rather than the original. ;-)

James

Rod Pemberton

Dec 15, 2011, 5:16:11 AM
"James Harris" <james.h...@googlemail.com> wrote in message
news:0c8befde-e3ae-4913...@p20g2000vbm.googlegroups.com...
> On Dec 12, 1:23 pm, "Rod Pemberton" <do_not_h...@noavailemail.cmm>
> wrote:
...

> I meant that they add an optional prefix. For example,
>
> 1E o16 push ds ;in bits 16 mode
> 661E o16 push ds ;in bits 32 mode
>

BITS 16
1E push ds
1E o16 push ds ; unneeded syntax
661E o32 push ds

BITS 32
1E push ds
661E o16 push ds
1E o32 push ds ; unneeded syntax

Interesting ... missing overrides ...

It appears NASM does distinguish between o16 and o32 depending on the
BITS setting. It compares the override size to the mode size and disregards
the keyword or disregards emitting the override when unnecessary for that
mode, i.e., same size code segment and instruction size. I use the correct
override in the correct mode, so I guess I never noticed or forgot. I.e., I
would've expected and preferred o16 and o32 to generate the operand
override in all cases ... NASM should warn when it's not using the prefix,
or warn that it's invalid.

> The codes on the LHS are those generated by the assembler. The same
> instruction, "o16 push ds", has generated two different machine code
> sequences. Why?

Apparently, it's been deemed unnecessary for the one mode ... Once
you add o32 to the mix, I think it makes more sense, perhaps?

> The second was in bits 32 mode so needed the prefix to
> tell the CPU to do a 16-bit push. The o16 generates a prefix
> in one mode but not the other. Hence, "optional."

o32 generates the prefix in the other ...

> I meant that if you could just use an oNN prefix to select
> 16- or 32-bit operands, as you seemed to prefer, [...]

Why do you think this is a preference instead of "embracing the way
the x86 microprocessor instruction encoding actually works"?

Most who program in assembly desire a 1-to-1 correspondence between
instruction syntax and encoded binary values. That's not entirely possible
on x86 due to instruction duplication. But, for override prefixes - which
are a single byte - it is possible. That's what was implemented for NASM.
Correction: That's what was implemented for NASM, apparently with mode size
dependence ... However, it seems you're arguing that it shouldn't be that
way. You're arguing that the syntax should be coherent and symmetric, even
if it's meaningless, unnecessary, or doesn't correspond well to the encoded
binary values. IMO, that's been done. It's called MASM. You remember the
exceptionally large list of keywords for size specifications that I posted?
I believe that is the expression of that philosophy. x86 instructions with
a memory operand generally have a register or stack for the other half of
the operation. I.e., the memory size is usually "known": matches the
register or stack size or is set by keyword, like BITS. There are only a
few instructions that truly are mixed size instructions between register
and memory: movsx, movzx. There are 24 or so that are mixed size with an
imm8. There are about 18 that are single argument like 'neg'. 'setcc' is
the only one with just r/m8 form. 'bswap' is the only one with just an r32
form. There are a few with just r/m16 form, but they are all system
instructions. There are 23 or so that don't have an r/m8 form. There are
even a few with an r32/m16 form ... How do you expect to fit a symmetric
syntax to all of that?

> Well, it's an opinion. I don't agree with it but I understand it.

Where do you think NASM syntax started from? An opinion by somebody that
memory operands shouldn't need keywords like PTR that MASM used, perhaps
"influenced" by TASM's ideal mode ... See "Subject 3" section:
http://webster.cs.ucr.edu/Page_TechDocs/X86FAQ/tasm.html

> The point is that I can code the instruction thinking about the
> data and leave the assembler to generate a prefix if necessary.

I code the instruction without thinking about the data. I only need to know
register names and brackets for memory operands. NASM will tell me when
it "thinks" I'm insufficiently clear ... Then, I decide what additional
syntax is needed. I usually "know" which ones will be problems. It's the
two cases I mentioned before. But, why risk an incorrect keyword?

> I personally prefer a consistent format because I find it easier
> to remember but maybe you remember each instruction separately.

I personally prefer a syntax that matches the binary encodings as closely as
possible. That's my absolute top, far above-all-else, preference. Given
that, I also prefer a syntax that matches the AMD and Intel manuals,
somewhat. Neither is entirely possible with x86. There are ambiguities in
encodings and syntax. I don't like AMD and Intel manual syntax completely
either. But, I would rather have a syntax that is close to the manual
syntax than one that is not close. I like syntax to match the manuals
because those are what I read. GAS syntax is very different from the
manuals. MASM is quite a bit different, IMO too. NASM is much closer, but
not perfect. Even so, you can see from the manuals that encodings and
syntax are duplicated for many instructions. Some method must be used to
select the alternates, if the ability to encode the alternates is a
priority. For me, being able to encode the same sequence is good to have
since it allows one to compare the resulting binary with the original to
confirm the source is correct. When reading or coding x86 code, one can
make mistakes. If you can't compare the binaries for equality, you can't be
entirely sure the new program source is correct. You can manually check
each differing byte sequence via disassembly. But, even then, you may miss
something.

> This is going rather off topic but how would you like to
> specify a byte?

Specify a byte or specify a memory argument a byte in size? We were
discussing the latter, but that wasn't what you just asked. I have no real
preference as long as I can select or determine the correct syntax. I get
aggravated when I can't determine the correct syntax or select a certain
instruction encoding via syntax. When constructing an instruction with
db's, you lose all the important features of assembly, like calculating
offsets. It helps if the syntax makes some sense and is memorizable, i.e.,
abbreviation or naming is based on what AMD or Intel use. I wouldn't expect
an "operand size override" to begin with Z. The AMD and Intel manuals use
r8 for registers, imm8 for constants, and m8 for memory. There is no reason
why 'm8' or 'byte' etc can't be used for 8-bit memory locations.

> I guess I tend to mix operand sizes more than you.

NASM takes care of that for registers. The question is why are you doing
that for memory? Using the native size for memory prevents alignment and
boundary issues, such as alignment faults, stack faults from under/overflow,
and address wraps (like 32-bit on A20 boundary).

> The point is that you can tell in advance that size specifiers
> are needed. You don't need to wait for the assembler to tell you.

Yes, you could. But, if you get into the habit of pre-emptively placing
keywords, you may also end up placing unnecessary or faulty keywords.


Rod Pemberton








Rod Pemberton

Dec 15, 2011, 5:19:06 AM
"James Harris" <james.h...@googlemail.com> wrote in message
news:0c8befde-e3ae-4913...@p20g2000vbm.googlegroups.com...
> On Dec 12, 1:23 pm, "Rod Pemberton" <do_not_h...@noavailemail.cmm>
> wrote:
...

> > > Is it possible you are under a misapprehension?
>
> > Did you mean misconception or miscomprehension? It's
> > possible, but unlikely, IMO. ;-)
>
> :-) Misapprehension: "an understanding of something that
> is not correct."

If you look up apprehension and comprehension or apprehend and comprehend,
they're not quite the same. Personally, I've only used or seen apprehension
or apprehend used in the fear and foreboding sense. That's why I replied.
I wasn't aware that apprehension had the other meanings. From the
dictionary definitions, I wasn't quite sure what the difference was either,
other than apprehension had "understanding" as a lower ranked definition.
Perhaps, this is a UK English versus US English issue? I.e., we're not
using it?

http://vspages.com/apprehension-vs-comprehension-the-difference-between-1167/
http://www.differencebetween.net/language/difference-between-apprehension-and-comprehension/


Rod Pemberton







James Harris

Dec 15, 2011, 3:28:13 PM
On Dec 15, 10:16 am, "Rod Pemberton" <do_not_h...@noavailemail.cmm>
wrote:
> "James Harris" <james.harri...@googlemail.com> wrote in message

...

I do make an effort to reply when you ask me a direct question so I
hope you don't mind if I skip many of those in your post this time. In
general we are talking about preferences so it doesn't seem a big
loss. I will respond to those below.

> > I guess I tend to mix operand sizes more than you.
>
> NASM takes care of that for registers.  The question is why are you doing
> that for memory?

For scalars I normally match the mode size (e.g. 16-bit values if in
16-bit mode) for speed but, say, in 32-bit mode I might store a table
of bytes or 16-bit words. Also, there's never any question of whether
to range-check right-sized parameters.

> Using the native size for memory prevents alignment and
> boundary issues, such as alignment faults, stack faults from under/overflow,
> and address wraps (like 32-bit on A20 boundary).

I guess you don't mean that exactly as written. You know that
alignment prevents alignment problems. Data sizes do not. Even 32-bit
words in 32-bit mode should be consciously aligned. That applies to
data, bss and stack.

> > The point is that you can tell in advance that size specifiers
> > are needed. You don't need to wait for the assembler to tell you.
>
> Yes, you could.  But, if you get into the habit of pre-emptively placing
> keywords, you may also end up placing unnecessary or faulty keywords.

IMHO it's not hard to work out where to use them. As a suggestion,
don't think of individual instructions but of addressing of memory
indicated by square brackets and the presence or absence of a way for
Nasm to tell the width of the memory being addressed.
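
A minimal sketch of that rule of thumb, assuming 32-bit code and a
hypothetical label mem. Where a register operand fixes the width no keyword
is needed; where nothing else does, the keyword is required.

```nasm
        bits 32
        mov     eax, [mem]          ; width comes from EAX: no keyword needed
        add     [mem], cx           ; width comes from CX: no keyword needed
        mov     dword [mem], 1      ; an immediate has no width: keyword required
        inc     byte [mem]          ; lone memory operand: keyword required
        movzx   eax, byte [mem]     ; source is narrower than EAX: keyword required

mem:    dd      0                   ; hypothetical variable
```

Dropping the keyword from any of the last three lines should produce Nasm's
"operation size not specified" error.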

James

James Harris

Dec 15, 2011, 3:30:26 PM
On Dec 15, 10:19 am, "Rod Pemberton" <do_not_h...@noavailemail.cmm>
wrote:

...

> If you look up apprehension and comprehension or apprehend and comprehend,

...

> Perhaps, this is a UK English versus US English issue?

Quite likely. IME your version is more consistent and logical.

James

Rod Pemberton

Dec 16, 2011, 2:57:52 AM
"James Harris" <james.h...@googlemail.com> wrote in message
news:89579024-f5f7-4fb0...@o9g2000yqa.googlegroups.com...
> On Dec 15, 10:16 am, "Rod Pemberton" <do_not_h...@noavailemail.cmm>
> wrote:
> > "James Harris" <james.harri...@googlemail.com> wrote in message
...

> > Using the native size for memory prevents alignment and
> > boundary issues, such as alignment faults, stack faults from
> > under/overflow, and address wraps (like 32-bit on
> > A20 boundary).
>
> I guess you don't mean that exactly as written. You know that
> alignment prevents alignment problems. Data sizes do not.
>

They don't? What if I'm 1) in 16-bit mode, 2) I have an address that is
correctly aligned to 16-bits, 3) the address is located 16-bits below a
boundary, and 4) I write or read 32-bits of data to that address ... Didn't
16-bits just cross the boundary? What if it crossed a page boundary from a
mapped to unmapped page when paging is enabled? What if it crossed the A20
boundary near 1MB when A20 wraps memory? What if it popped a 32-bit item
from the top of a 16-bit stack? Those are examples of alignment and/or
boundary problems due to data size, yes? In all cases, they were correctly
aligned for 16-bits, just not 32-bits ... Well, that's what I was thinking
of as issues.
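
A sketch of the first scenario, assuming real mode on a 386 or later with the
usual 64 KiB segment limit. The offset 0xFFFE is chosen only for
illustration.

```nasm
        bits 16
        mov     bx, 0xFFFE          ; 2-byte aligned, 2 bytes below the 64 KiB limit
        mov     ax, [bx]            ; word access: offsets 0xFFFE-0xFFFF, inside the segment
        mov     eax, [bx]           ; dword access: offsets 0xFFFE-0x10001, past the limit
```

As far as I know the last instruction is expected to raise a general
protection fault (interrupt 13) rather than wrap, since its upper two bytes
fall outside the segment. The address is fine for a word and wrong for a
dword.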


Rod Pemberton



James Harris

Dec 16, 2011, 2:46:56 PM
On Dec 16, 7:57 am, "Rod Pemberton" <do_not_h...@noavailemail.cmm>
wrote:
> "James Harris" <james.harri...@googlemail.com> wrote in message
> news:89579024-f5f7-4fb0...@o9g2000yqa.googlegroups.com...
> > On Dec 15, 10:16 am, "Rod Pemberton" <do_not_h...@noavailemail.cmm>
> > > "James Harris" <james.harri...@googlemail.com> wrote in message
>
> ...
>
> > > Using the native size for memory prevents alignment and
> > > boundary issues, such as alignment faults, stack faults from
> > > under/overflow, and address wraps (like 32-bit on
> > > A20 boundary).
>
> > I guess you don't mean that exactly as written.  You know that
> > alignment prevents alignment problems. Data sizes do not.
>
> They don't?  What if I'm 1) in 16-bit mode, 2) I have an address that is
> correctly aligned to 16-bits, 3) the address is located 16-bits below a
> boundary, and 4) I write or read 32-bits of data to that address ...  Didn't
> 16-bits just cross the boundary?

Trying to divine the principle.... Tell me if I am right. I think you
mean, and meant, that using a word size that is wider than the "native
size" has caused a misaligned access whereas keeping, in this case, to
16-bit values would have avoided the misalignment. Have I understood?

> What if it crossed a page boundary from a
> mapped to unmapped page when paging is enabled?  What if it crossed the A20
> boundary near 1MB when A20 wraps memory?  What if it popped a 32-bit item
> from the top of a 16-bit stack?  Those are examples of alignment and/or
> boundary problems due to data size, yes?

Yes. These are also examples of accessing a datum wider than the
native size. Your principle sounds more general and I thought you were
applying it also to accessing narrower data under the same sort of
conditions such as you would get if you had a native size of 32-bits,
32-bit alignment and accessed 16-bit data.

As you know, a data section that includes dd or a bss section that
includes resd does not align the values automatically. For that an
align directive may be needed. Similarly, a stack does not
automatically realign itself if sp/esp have been loaded with an odd
number. Hence, my comment that alignment - i.e. explicit alignment -
may be needed to ensure alignment of data. It can be used in your
scenarios too for the data sections by using align 4 before dd or
resd. Of course, the stack would need alignment to be ensured by code.
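
A minimal sketch of that, assuming flat binary output and made-up label names
(value, buffer):

```nasm
section .data
        db      1                   ; leaves the location counter at an odd offset
        align   4, db 0             ; pad with zero bytes up to a multiple of 4
value:  dd      0x12345678          ; now 4-byte aligned within the section

section .bss
        resb    1                   ; again leaves an odd offset
        alignb  4                   ; reserve padding up to a multiple of 4
buffer: resd    1                   ; now 4-byte aligned within the section
```

Both directives align relative to the start of the section, so the section
itself also has to start on a suitable boundary.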

> In all cases, they were correctly
> aligned for 16-bits, just not 32-bits ...  Well, that's what I was thinking
> of as issues.

Understood.

James

Rod Pemberton

Dec 17, 2011, 3:51:50 AM
"James Harris" <james.h...@googlemail.com> wrote in message
news:77234136-b1f0-483c...@m7g2000vbc.googlegroups.com...
>
> Your principle sounds more general and I thought you were
> applying it also to accessing narrower data under the same sort of
> conditions such as you would get if you had a native size of 32-bits,
> 32-bit alignment and accessed 16-bit data.
>

Not to be flippant, but who aligns ... ? I.e., most programmers will just
define some bytes in their code for their data and access it - unaligned and
without any concern for it.

For those that do align, who aligns to 32-bits for a 16-bit segment ... ?
I.e., that would be "safe" practice and probably required in a corporate
environment ...

In many cases, especially for smaller programs, the code and data will be in
the same code section (e.g., text) without any alignment directives. Most
programmers will separate the two, but sometimes they are interleaved.
In the latter case, alignment directives will be needed everywhere, if used.
Attempting to code something like that in a corporate environment might be
problematic, even if that is the model required to implement the program,
e.g., Forth interpreter.

I think I've only explicitly aligned the GDT, perhaps IDT in assembly ...
Since my code is predominantly C, my code in assembly is usually smaller in
size. In which case, should I really care about speed, register stalls,
alignment, etc?


Rod Pemberton



James Harris

Dec 17, 2011, 5:39:51 AM
On Dec 17, 8:51 am, "Rod Pemberton" <do_not_h...@noavailemail.cmm>
wrote:
> "James Harris" <james.harri...@googlemail.com> wrote in message

...

> > Your principle sounds more general and I thought you were
> > applying it also to accessing narrower data under the same sort of
> > conditions such as you would get if you had a native size of 32-bits,
> > 32-bit alignment and accessed 16-bit data.
>
> Not to be flippant, but who aligns ... ?  I.e., most programmers will just
> define some bytes in their code for their data and access it - unaligned and
> without any concern for it.

I do. Of course, I sometimes come across unaligned data and when that
happens I break down in tears. :-((

Aligned data is the only option on some CPUs. x86 is very forgiving
(as long as alignment checking has not been enabled). I've seen newer
data structure designs consciously properly align fields. (When I say
newer I'm probably talking about the last fifteen years or so.) Older
designs sometimes have unaligned values. From memory I think the BDA
has some.

Once there is data alignment in a .data or a .bss section it is
maintained so there's no need for a lot of explicit align statements.
And IIRC most executable formats begin data and bss sections with a
default alignment so you can know what you are starting from.

> For those that do align, who aligns to 32-bits for a 16-bit segment ... ?
> I.e., that would be "safe" practice and probably required in a corporate
> environment ...

I don't know who does it but as you know it's very easy to do
*regardless* of segment size. In Nasm I think it's

align 4, db 0 ;in the .data section
align 4, resb 1 ;in the .bss section

or something like that.

> In many cases, especially for smaller programs, the code and data will be in
> the same code section (e.g., text) without any alignment directives.  Most
> programmers will separate the two, but sometimes they are interleaved.
> In the latter case, alignment directives will be needed everywhere, if used.
> Attempting to code something like that in a corporate environment might be
> problematic, even if that is the model required to implement the program,
> e.g., Forth interpreter.

Assemblers normally allow temporarily switching to a data section so
data values can be separated from code in the object file. Here's an
example not of good coding (it is not good coding) but to illustrate
the point.

section .code
mov eax, [mydata]

section .data
align 4, db 0
mydata: dd 53
myvalu: dd -1

section .code
add eax, [myvalu]
ret

The .data section happens to be defined in the middle of the code but
the idea is that it will not appear there in the object module.
Instead it will appear separately, grouped with any other .data
sections.

(The value myvalu does not need an explicit alignment as the alignment
of 4 will have been maintained.)

Nasm allows local labels (those whose definitions begin with a dot are
appended to the most recent prior label that does not begin with a
dot) and, for any who are not aware, the mechanism works well across
sections so you can do, starting in section .code,

mymod:

[section .data]
.mydata: dd 1023
__SECT__

.myroutine:
mov eax, [.mydata]
ret

As well as leaving you back in the section (.code) that you started
from this fragment generates a field in the .data section called

mymod.mydata

and that field is referred to in the mov instruction. Sweet! All
related labels - code and data - will be grouped under the initial
name mymod. This can help to modularise big files of code and any that
include lots of other files.

On the topic of code, it can be worth aligning branch targets too. I
don't think it's good to go overboard on this but aligning frequently
jumped-to labels can help performance. IIRC the main thing is to avoid
jumping to within a few bytes of the end of a code cache line and it
is related to the CPU's decoding-ahead of the next few instructions.
Details can be found in the optimisation manuals but something like

align 8
label:

might be appropriate. The align directive in a .code section like this
generates nops so code prior to the label, if any, can run over the
alignment code without generating an exception.

The value of 8 doesn't necessarily get to the beginning of a cache
line, which may be much larger, but it does mean the next 8 bytes will
be in the same cache line as each other. In practice another power of
two alignment might be better depending, mainly, on what follows.

For details I would need to reread the manufacturers' manuals and Fog's
excellent optimisation documents to be sure of picking good values. If
going to this level there are related issues to consider which I
haven't mentioned such as multi-byte nops and code movement.

> I think I've only explicitly aligned the GDT, perhaps IDT in assembly ...
> Since my code is predominantly C, my code in assembly is usually smaller in
> size.  In which case, should I really care about speed, register stalls,
> alignment, etc?

I would hope that any compiler would align data properly. If you
access C variables from assembler, alignment should already be correct,
shouldn't it? If you define variables in assembler it may matter. I
should say that IMHO it's good practice to ensure alignment always, but
only code that gets executed repeatedly will have a noticeable impact
on performance.
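
For the case of a variable defined in assembler and read from C, a minimal sketch might look like this. The name shared_count is hypothetical, and some object formats expect a leading underscore on C-visible symbols.

```nasm
section .data
    align 4, db 0           ; pad with zero bytes; align's default filler is nop (0x90)
global shared_count         ; hypothetical name; C side: extern int shared_count;
shared_count: dd 0          ; 4-byte value on a 4-byte boundary
```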

James

s_dub...@yahoo.com

unread,
Dec 17, 2011, 12:07:25 PM12/17/11
to
On Dec 17, 4:39 am, James Harris <james.harri...@googlemail.com>
wrote:
> On Dec 17, 8:51 am, "Rod Pemberton" <do_not_h...@noavailemail.cmm>
> wrote:
>
> > "James Harris" <james.harri...@googlemail.com> wrote in message
>
> [...]
>
> > In many cases, especially for smaller programs, the code and data will be in
> > the same code section (e.g., text) without any alignment directives.  Most
> > programmers will separate the two, but sometimes they are interleaved.
> > In the latter case, alignment directives will be needed everywhere, if used.
> > Attempting to code something like that in a corporate environment might be
> > problematic, even if that is the model required to implement the program,
> > e.g., Forth interpreter.
>
> Assemblers normally allow temporarily switching to a data section so
> data values can be separated from code in the object file. Here's an
> example not of good coding (it is not good coding) but to illustrate
> the point.
>
> section .code
>   mov eax, [mydata]
>
> section .data
>   align 4, db 0
>   mydata:  dd 53
>   myvalu:  dd -1
>
> section .code
>   add eax, [myvalu]
>   ret
>
> The .data section happens to be defined in the middle of the code but
> the idea is that it will not appear there in the object module.
> Instead it will appear separately, grouped with any other .data
> sections.
>

And the programmer needs to realize that CS=DS is required for the
above. The address of mydata follows the last of [.code] plus the
alignment of 4: address 000Ch in this case, if the output format is -f
bin.

Personally, I prefer sections sized to a multiple of 16, and place
'align 16' at the end of similarly named sections. For the above:

section .code
add eax, [myvalu]
ret
ALIGN 16

Which causes .data to be paragraph aligned, since that section was
named next after .code in the above.

Then, if CS != DS is desired, the first naming of 'section .data' is
changed to [SECTION .data vstart=0] so that the address of 'mydata'
would be DS:0000h and dereference properly.
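
Putting those two pieces together, a sketch for -f bin, assuming 16-bit code and that whatever loads the image sets DS to the paragraph where .data lands:

```nasm
bits 16

section .code
    mov eax, [mydata]       ; mydata assembles as offset 0000h (vstart=0 below)
    add eax, [myvalu]       ; myvalu assembles as offset 0004h
    ret
    align 16                ; pad .code out to a paragraph boundary

section .data vstart=0      ; labels here count from 0, not from the file position
mydata: dd 53
myvalu: dd -1
```

The data still sits right after the padded code in the output file; vstart only changes the addresses the labels are given.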

The thing is though, the programmer's target obj format has everything
to say about his desire to 'code craft' on these issues, and what it
allows is further mitigated by his choice and use of tools, such as C
structure member alignment options. Nothing new, but it bears
repeating. (So too NASM's struc..endstruc calls attention to aligning
data members and to aligning when declaring its instances.)

The very reason I prefer NASM is that it allows -f bin, and 'code
crafting' just about every which way the CPU can toggle its flip-
flops.

Steve