On 12.06.2021 21:06, James Harris wrote:
> On Sunday, 26 April 2015 at 19:54:07 UTC+1, James Harris wrote:
>> You know that, per Intel's directions, after setting the CR0 PE flag
>> mov eax, cr0
>> or al, 1
>> mov cr0, eax
>> we are expected to have something like
>> jmp seg:pmode_running
>> I had taken that jump instruction for granted but the recent Qemu/GDT
>> thread has brought up some issues about the jump, as follows.
>> 1. The jump appears to be necessary in order to put the correct pmode
>> GDT entry number in CS (in its upper 13 bits, i.e. shifted left 3 bits)
>> and also to set the low bits of CS so that they contain the CPL and TI,
>> all of which should be zero.
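To make the selector layout in point 1 concrete, here is a minimal sketch in C. The helper names `make_selector` and `selector_index` are made up for illustration; only the bit positions come from the description above (the low two bits are the RPL field, which in CS reflects the CPL).

```c
#include <stdint.h>

/* Build a segment selector: bits 15..3 = descriptor index,
   bit 2 = TI (0 = GDT, 1 = LDT), bits 1..0 = RPL. */
static uint16_t make_selector(unsigned index, unsigned ti, unsigned rpl)
{
    return (uint16_t)(((index & 0x1FFFu) << 3) | ((ti & 1u) << 2) | (rpl & 3u));
}

/* Extract the descriptor index back out of a selector. */
static unsigned selector_index(uint16_t sel)
{
    return sel >> 3;
}
```

So GDT entry 1 at ring 0 gives selector 0x0008 and GDT entry 3 gives 0x0018, which is why ring 0 GDT selectors always look like multiples of 8.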
> Going back to this old thread as I have some more information.
> The Sybex book Programming the 80386 says on the back that its authors are one of the 80386's logic designers and the 80386's (and thus x86-32's) chief architect, so they ought to know a thing or two about the design! On page 605 the book says that after setting the PE bit the processor enters "16-bit protected mode" - which I didn't know even existed (but see below).
Old (some newer as well) docs are often written with doubtful wording.
I remember my early attempts to enter PM (on a 486, though): it doesn't work that
way. PM16 or PM32 starts exactly at the point where CS gets altered.
> The processor apparently stays in that mode (Protected Mode but 16-bit) for as long as we want, and what changes it to 32-bit PMode is an instruction which loads CS with the selector for a 32-bit descriptor.
PM16 and PM32 differ by just one bit in the code segment descriptor.
> Correspondingly, it would appear to be possible to change /back/ to 16-bit PMode by loading CS with the selector of a 16-bit descriptor.
Yes, my mode switch from PM32 to RM16 needs one step to PM16 in between
but the switches from RM16 to PM32 or LM don't need a PM16 step.
> Basically, B=0 means 16-bit and B=1 means 32-bit.
> I knew about the bit, of course, but never really understood it; and I didn't realise that B=0 was the mode the processor executed in after enabling PMode prior to reloading CS.
yeah, it wasn't too well documented :)
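For reference, a rough sketch of where that bit sits in an 8-byte code segment descriptor (for code segments the manuals call it D, it shares its position with B of data/stack descriptors). `make_code_descriptor` is a hypothetical helper, and the access byte 0x9A (present, DPL 0, readable code) is just an assumed typical value.

```c
#include <stdint.h>

/* Pack a code-segment descriptor (8 bytes) into a 64-bit value.
   base: 32-bit linear base; limit: 20-bit limit;
   is32: D bit (0 = 16-bit default size, 1 = 32-bit);
   gran4k: G bit (limit counted in 4 KiB units when set). */
static uint64_t make_code_descriptor(uint32_t base, uint32_t limit,
                                     int is32, int gran4k)
{
    uint64_t d = 0;
    d |= (uint64_t)(limit & 0xFFFFu);              /* limit 15..0  */
    d |= (uint64_t)(base & 0xFFFFFFu) << 16;       /* base 23..0   */
    d |= (uint64_t)0x9Au << 40;                    /* assumed access byte */
    d |= (uint64_t)((limit >> 16) & 0xFu) << 48;   /* limit 19..16 */
    d |= (uint64_t)(is32 ? 1u : 0u) << 54;         /* D bit        */
    d |= (uint64_t)(gran4k ? 1u : 0u) << 55;       /* G bit        */
    d |= (uint64_t)(base >> 24) << 56;             /* base 31..24  */
    return d;
}
```

A flat 32-bit code descriptor comes out as 00CF9A000000FFFF and a 64k 16-bit one as 00009A000000FFFF. With identical base, limit and G, the two variants differ only in bit 54.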
> The different stages of enabling PMode are now possibly easier to guess at.
> 1. After enabling CR0.PE
> The processor will be in 16-bit PM
> but early processors (including 386 and 486) did not automatically flush the prefetch queue so they could have already decoded some of the following bytes as RM. I would guess that many such decodings would be different but that some could be the same. If that's right then some instructions could be validly executed here even without flushing the prefetch queue. There's no value in doing so but ISTM informative to see exactly what is likely to change and when.
> 2. After flushing the prefetch queue
> This will apparently be true 16-bit PM with a 64k limit on code addresses - and probably the same for data addresses.
> From this point it turns out that contrary to normal practice one could load selectors for 32-bit /data/ descriptors while still running the code in 16-bit mode. I say that because the code in
it's a bit dated :)
> includes the following:
> LGDT tGDT__pword
and the GDT should already contain valid entries here.
> ; switch to protected mode
correction: prepare to switch
> MOV EAX,CR0
> MOV EAX,1
> MOV CR0,EAX
;> ; clear prefetch queue
;> JMP SHORT flush
> ; set DS,ES,SS to address flat linear space (0 ... 4GB)
> MOV BX,FLAT_DES-Temp_GDT
> MOV DS,BX
> MOV ES,BX
> MOV SS,BX
> Note the data selectors being loaded before the code selector (which the code changes much later) - and the botched update of CR0 which appeared in many Intel sources of the time.
it doesn't matter where the data/stack selectors are initialized as long
as they aren't used. I set DS and SS:ESP before, and ES, FS, GS
after the far jump.
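About the botched CR0 update in the quoted listing: the MOV EAX,1 throws away whatever was just read from CR0. A small C model of the difference, with made-up function names and an arbitrary sample value for CR0:

```c
#include <stdint.h>

#define CR0_PE 0x00000001u  /* protection enable bit */

/* What the listing presumably intended (OR EAX,1): set PE, keep the rest. */
static uint32_t cr0_set_pe(uint32_t cr0)
{
    return cr0 | CR0_PE;
}

/* What MOV EAX,1 actually writes back: every other CR0 bit is cleared. */
static uint32_t cr0_botched(uint32_t cr0)
{
    (void)cr0;          /* the value read from CR0 is simply discarded */
    return CR0_PE;
}
```

With a sample CR0 of 60000010 the first gives 60000011 and the second gives 00000001, so bits such as the cache control ones silently change.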
> FWIW, the code also goes on to do a bunch of other stuff such as
> ; initialize stack pointer to some (arbitrary) RAM location
> MOV ESP, OFFSET end_Temp_GDT
wherever you want it to be :)
> ; copy eprom GDT to RAM
> MOV ESI, DWORD PTR GDT_eprom +2 ; get base of eprom GDT
> MOV EDI,GDTbase ; point ES:EDI to GDT base in RAM.
> MOV CX,WORD PTR gdt_eprom +0 ; limit of eprom GDT
> INC CX
> SHR CX,1 ; easier to move words
> REP MOVS WORD PTR ES : [EDI] , WORD PTR DS:[ESI]
> ;copy eprom IDT to RAM
> MOV ESI, DWORD PTR IDT_eprom +2 ; get base of eprom IDT
> MOV EDI,IDTbase
> MOV CX,WORD PTR idt_eprom +0
> INC CX
> SHR CX,1
> REP MOVS WORD PTR ES : [EDI] , WORD PTR DS:[ESI]
this ROM GDT may point to an already wiped RAM area !!!
modern (already old now) BIOS use only temporary PM32 and LM.
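As a side note on the quoted copy loops, the word count they compute is (limit + 1) / 2, done in 16 bits. A tiny sketch of that arithmetic, with `words_to_copy` being a made-up name:

```c
#include <stdint.h>

/* Word count as computed by the listing: CX = limit; INC CX; SHR CX,1.
   The arithmetic is done in 16 bits, just like CX. */
static uint16_t words_to_copy(uint16_t limit)
{
    uint16_t bytes = (uint16_t)(limit + 1u);  /* table size in bytes, wraps at 64 KiB */
    return (uint16_t)(bytes >> 1);
}
```

A three-entry GDT (limit 0017) gives 12 words, while a full 64k table (limit FFFF) wraps around to a count of zero.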
> etc, all before setting CS to a 32-bit descriptor.
> 3. After loading CS (via a far jump or far call or by a TSS switch) to refer to a 32-bit descriptor.
I wouldn't use a far CALL nor (far worse) a task switch here.
If you're brave you could use what I do for mode switches, but only after
the initial PM and stack setup:
PUSHFD ;push EFLAGS, needs 66 if within 16 bit code
PUSH new_descriptor ;selector of the new code descriptor, I use immediate constants here
PUSH new_offset ;
IRETD ;pops EIP, CS, EFLAGS (the step that actually reloads CS)
> The code will finally be in PM32.
could be in PM16 as well.
>> I knew the above but the following points are of particular interest
>> just now as I had not considered them before - or if I had then I had
>> forgotten the subtleties of the problem.
OK, nothing new on the matter to me.
>> 3. That jump instruction has a 16-bit form and a 32-bit form. It is
>> encoded in hex as
>> EA oo oo ss ss (16-bit form)
>> EA oo oo oo oo ss ss (32-bit form)
>> where the Ss are the selector and the Os are the offset as hex bytes.
> The above info implies that it's the /short/ version of the jump which is required.
yes of course because this jump is still RM16 [NOT PM16] code.
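For illustration, the short form from point 3 can be assembled by hand like this. `encode_far_jmp16` is a hypothetical helper and the selector/offset values in the note below are only examples.

```c
#include <stddef.h>
#include <stdint.h>

/* Emit JMP ptr16:16 as EA oo oo ss ss: opcode, then the 16-bit offset
   (little endian), then the 16-bit selector. Returns bytes written (5). */
static size_t encode_far_jmp16(uint8_t *out, uint16_t sel, uint16_t off)
{
    out[0] = 0xEA;
    out[1] = (uint8_t)(off & 0xFF);
    out[2] = (uint8_t)(off >> 8);
    out[3] = (uint8_t)(sel & 0xFF);
    out[4] = (uint8_t)(sel >> 8);
    return 5;
}
```

Selector 0008 with offset 2345 comes out as EA 45 23 08 00.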
>> 5. Immediately after the MOV to CR0 to set the pmode bit the CPU is
>> still in 16-bit mode. Right?
> That turns out to be right.
Yes still in 16 bit Realmode, until CS is loaded.
>> 6. Now, where it gets interesting is that the offset field of the EA
>> jump instruction seems to be an offset not from the jump instruction but
>> from the start of the segment. Is that correct? If so then we have to be
>> careful which jump form is encoded, as follows.
>> If the executing code is in the low 64k of a descriptor's space then we
>> can encode the simple
>> EA oo oo ss ss
>> because the offset can fit in 16 bits. But if the executing code is
>> above the 64k mark relative to the start of the segment then we need to
>> encode the 32-bit form for 16-bit mode, i.e.
>> 66 EA oo oo oo oo ss ss
only if there is already code loaded up there (guess how to do that before PM...).
>> Solution 1. Set up a temporary GDT entry to point to the place in memory
>> where the code is running. In the above case, the GDT entry could point
>> at 0x10000 and then the jump offset would be 0x2345, leading to the jump
>> instruction being encoded as
>> EA 45 23 ss ss (bytes shown in memory order, i.e. little endian)
0000:0600 66 EA 45 23 10 00 18 00 jmp far 0018:102345 ;assume FLAT CS
FFFF:2355 ;aka 0010:2345 PM32 code
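A small sketch to double-check the numbers: the 66-prefixed form stores the 32-bit offset first and the selector last, same as the EA forms quoted earlier, and FFFF:2355 is the real-mode view of linear 102345. Both function names are made up, and the real-mode helper assumes A20 is enabled (no wrap at 1 MB).

```c
#include <stddef.h>
#include <stdint.h>

/* Emit the operand-size-prefixed far jump usable from 16-bit code:
   66 EA oo oo oo oo ss ss. Returns bytes written (8). */
static size_t encode_far_jmp32_from16(uint8_t *out, uint16_t sel, uint32_t off)
{
    size_t n = 0;
    out[n++] = 0x66;                          /* operand-size prefix */
    out[n++] = 0xEA;                          /* far jump opcode     */
    for (int i = 0; i < 4; i++)
        out[n++] = (uint8_t)(off >> (8 * i)); /* offset, little endian */
    out[n++] = (uint8_t)(sel & 0xFF);         /* selector low byte   */
    out[n++] = (uint8_t)(sel >> 8);           /* selector high byte  */
    return n;
}

/* Real-mode segment:offset to linear address (A20 assumed enabled). */
static uint32_t real_mode_linear(uint16_t seg, uint16_t off)
{
    return ((uint32_t)seg << 4) + off;
}
```

For selector 0018 and offset 00102345 this yields 66 EA 45 23 10 00 18 00, and real_mode_linear(0xFFFF, 0x2355) is 0x102345.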