
Paging makes the OS reboot infinitely


אורי ויסבלום

Dec 4, 2022, 7:48:08 AM
Hello, I'm writing an OS and I can't find my problem. The second I set the CR0 paging bit, the system reboots itself, and it keeps doing so in an infinite loop.
Currently I don't care about page allocation; I'll write that part later. For now I want a paging setup that works, even with only one page directory entry.
If you know what might cause this, please let me know.
The kernel calls initPDT with some arbitrary number and then goes into an infinite loop.
My code is as follows (ignore irq and print, they work fine):

#define PDT_SIZE 1024

typedef enum {
    PRESENT = 1,
    READWRITE = 2,
    USER = 4,
    WRITETHROUGH = 8,
    CACHE = 16
} PageDirectoryFlags;

typedef uint32_t PDEntry;

typedef struct {
    PDEntry entries[PDT_SIZE];
} PD;

typedef uint32_t PTEntry;

typedef struct {
    PTEntry entries[PDT_SIZE];
} PT;

void startVirtualMode(uint32_t address) {
    __asm__("mov %0, %%cr3"::"r"(address));
    uint32_t cr0 = 0;
    __asm__("mov %%cr0, %0":"=r"(cr0));
    cr0 |= 0x80000000;
    __asm__("mov %0, %%cr0"::"r"(cr0));
}

void initPDT(uint32_t address) {
    PD* table = (PD*)address;

    for (int i = 0; i < PDT_SIZE; i++) {
        table->entries[i] = READWRITE | PRESENT | USER | (i>>22);
    }

    // last entry points to the pdt itself
    table->entries[PDT_SIZE-1] = READWRITE | PRESENT | address;

    irqInstallHandler(14, pagefault);

    startVirtualMode(address);
}

James Harris

Dec 4, 2022, 4:43:26 PM
On 04/12/2022 12:48, אורי ויסבלום wrote:

> Hello, I'm writing an OS and I can't find my problem. The second i set the CR0 paging bit, the system reboots itself, and goes to an infinite loop.

A lot can happen in a second. I'll assume you mean that no instructions
are executed after you set CR0.PG.

If in 32-bit mode do you have a page directory and the requisite initial
page tables set up (or the equivalent) and do they identity-map the code
location you are running at? Are they all marked Present and are all
their other bits correct?

In case of not-present or a protection fault or for debugging (see
below) have you got interrupts working and a handler for the paging
interrupt? Is the handler's interrupt gate fully correct?

Have you loaded CR3?

Do you follow MOV CR0 with a JMP?


> currently i dont care about page allocations, i'll write this part later, for now i want a paging setup that works, even with only one page directory entry.
> if you know what might cause this, please let me know.
> the kernel calls initPDT with some arbitrary number and goes to an infinite loop.

IIRC once you enable paging (and execute a JMP) every CPU-controlled
memory access will go via paging, including access to the GDT and other
system tables.

You could set up a handler for the paging interrupt and write something
to the screen if it gets triggered. Remember to have all memory you use
mapped with Present PTEs and Present PDEs.
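
To make the "identity-map the code you are running at" point concrete, here is a minimal sketch in C of one page directory plus one page table that map the first 4 MiB of virtual memory onto the same physical addresses, using 4 KiB pages. The names `build_identity_4mib`, `translate`, and `pt_phys` are hypothetical and not from the original post. The code runs as an ordinary user-space program and only simulates the page walk in software; it does not load CR3 or touch CR0.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ENTRIES     1024u
#define PAGE_SIZE   4096u
#define PG_PRESENT  0x1u
#define PG_RW       0x2u
#define FRAME_MASK  0xFFFFF000u

/* Fill pd and pt so that virtual 0..4 MiB maps to the same physical
 * addresses. pt_phys is the physical address where pt will live; in a
 * real kernel with paging still off that is simply (uint32_t)pt.
 * Assumption: both tables are 4 KiB aligned in physical memory. */
static void build_identity_4mib(uint32_t pd[ENTRIES], uint32_t pt[ENTRIES],
                                uint32_t pt_phys)
{
    memset(pd, 0, ENTRIES * sizeof pd[0]);   /* every PDE starts not-present */
    for (uint32_t i = 0; i < ENTRIES; i++)
        pt[i] = (i * PAGE_SIZE) | PG_PRESENT | PG_RW;   /* page i -> frame i */
    pd[0] = (pt_phys & FRAME_MASK) | PG_PRESENT | PG_RW;
}

/* Software page walk for 4 KiB pages. Returns 0 and stores the physical
 * address in *phys, or returns -1 where the hardware would raise #PF.
 * Simplification: only the single table at pt_phys is known here. */
static int translate(const uint32_t pd[ENTRIES], const uint32_t pt[ENTRIES],
                     uint32_t pt_phys, uint32_t virt, uint32_t *phys)
{
    uint32_t pde = pd[virt >> 22];
    if (!(pde & PG_PRESENT) || (pde & FRAME_MASK) != pt_phys)
        return -1;
    uint32_t pte = pt[(virt >> 12) & 0x3FFu];
    if (!(pte & PG_PRESENT))
        return -1;
    *phys = (pte & FRAME_MASK) | (virt & 0xFFFu);
    return 0;
}
```

With tables like these, the instruction following the MOV to CR0, the stack, the GDT and IDT, and the page fault handler all remain reachable as long as they sit below 4 MiB; anything above that would fault.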

I notice you are using inline asm. Some do. But linking a separate asm
file can be easier to work with.


--
James Harris


Joe Monk

Dec 4, 2022, 5:15:20 PM

> cr0 |= 0x80000000;

Paging is bit 31.

https://wiki.osdev.org/CPU_Registers_x86

Joe

anti...@math.uni.wroc.pl

Dec 4, 2022, 6:30:06 PM
% printf "0x%x\n" $[1<<31]
0x80000000
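
The same check can be written in C, the language of the original post. This is only a small sketch; the names `CR0_PE`, `CR0_PG`, and `cr0_with_paging` are made up for illustration. It uses an unsigned constant because shifting a signed 32-bit `1` left by 31 is undefined behaviour in C.

```c
#include <assert.h>
#include <stdint.h>

#define CR0_PE (UINT32_C(1) << 0)    /* bit 0: protection enable */
#define CR0_PG (UINT32_C(1) << 31)   /* bit 31: paging enable    */

/* Return the given CR0 value with paging switched on; all other bits
 * are left as they were. */
static uint32_t cr0_with_paging(uint32_t cr0)
{
    return cr0 | CR0_PG;
}
```

So `cr0 |= 0x80000000;` in the original code does set bit 31.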

--
Waldek Hebisch

Alexei A. Frounze

Dec 4, 2022, 9:06:44 PM
If your page tables aren't properly set up, your kernel will almost certainly
triple-fault and reset because it itself uses page tables to access memory.
When you screw up your page tables, memory accesses either won't work at all
due to permissions or will read/write the wrong memory cells, not the ones
you're expecting.

Are you setting up 4KB pages or 4MB pages? You can't magically have both.
For 4KB pages you aren't setting up any page tables or you're not showing your
code for this.
For 4MB pages recursive mapping won't work. And you're not setting
PDE.PS=1 for this either.

Also, with PDT_SIZE=1024, (i>>22) is always 0 in
----8<----
for (int i = 0; i < PDT_SIZE; i++) {
table->entries[i] = READWRITE | PRESENT | USER | (i>>22);
}
----8<----
For 4MB pages you probably want (i<<22) here.

You need to understand the page table hierarchy/structure and
properly set it up.
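
For the 4MB-page variant mentioned above, a page directory that identity-maps the whole 4 GiB address space might be filled roughly as follows. This is a sketch under stated assumptions: `PDE_PS` and `fill_identity_4mib_pages` are names invented here, and on real hardware CR4.PSE (bit 4) must also be set before PDE.PS=1 entries are honoured. The code itself only fills an array and runs as a normal program.

```c
#include <assert.h>
#include <stdint.h>

#define PDT_SIZE    1024u
#define PDE_PRESENT 0x01u
#define PDE_RW      0x02u
#define PDE_PS      0x80u   /* bit 7: this PDE maps a 4 MiB page directly */

/* Entry i covers virtual addresses [i << 22, (i + 1) << 22) and points at
 * the physical 4 MiB frame with the same base address. */
static void fill_identity_4mib_pages(uint32_t pd[PDT_SIZE])
{
    for (uint32_t i = 0; i < PDT_SIZE; i++)
        pd[i] = (i << 22) | PDE_PS | PDE_RW | PDE_PRESENT;
}
```

Note that for every `i` below 1024 the original expression `(i>>22)` evaluates to 0, so it contributes nothing to the entry.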

Alex

אורי ויסבלום

Dec 6, 2022, 12:00:35 PM
First of all, thank you all for replying, I really appreciate it.

James:
> If in 32-bit mode do you have a page directory and the requisite initial
> page tables set up (or the equivalent) and do they identity-map the code
> location you are running at? Are they all marked Present and are all
> their other bits correct?

Yes, I mean that no instructions are executed after I set CR0. I didn't know a JMP to new code is needed after it; I thought the return at the end of the function would be enough. It makes a lot of sense that I should identity-map the kernel's code into the virtual address space, but I reckon that's not my only problem.
Yes, I'm in 32-bit mode. No, I don't have initial page tables set up, but if I'm not mistaken that shouldn't make the whole system reboot; only when I try a memory access should it raise a page fault.
Although now I realize that if I'm saying the function's ending is equivalent to a jump, it means I am performing a memory access, so yes, that might be it.
Yes, the page fault handler is correct; the code just never reaches it. I loaded CR3, as you can see in the code.

I believe that for the GDT specifically the CPU does not use paging for the address, but it does for JMP operations.
I did set up a page fault handler. I've tried linking a separate asm file; it is a bit easier, but not that different.

Joe, I think Waldek is right.

Alex:
> If your page tables aren't properly set up, your kernel will almost certainly
> triple-fault and reset because it itself uses page tables to access memory.
> When you screw up your page tables, memory accesses either won't work at all
> due to permissions or will read/write wrong memory cells, not the ones
> you're expecting.

Yeah, but I thought it would raise a page fault, not reboot the system over and over again. I think James is right in guessing that the reboot happens because the kernel's code isn't identity-mapped.


> Are you setting up 4KB pages or 4MB pages? You can't magically have both. For 4KB pages you aren't setting up any page tables or you're not showing your code for this.
4KB. You're right, I'm not setting up page tables, but again, IMO it should raise a page fault.


> Also, with PDT_SIZE=1024, (i>>22) is always 0. For 4MB pages you probably want (i<<22) here.
Oh, I forgot that I need to shift left; you're right.
Why are you talking specifically about 4MB pages? Isn't that right for 4KB pages too?

(Also, this setup is just a placeholder; they're obviously not real page tables.)
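
On the 4KB-versus-4MB question, it may help to see how a 32-bit virtual address is split when 4 KiB pages are used. In that scheme a page directory entry holds the physical address of a page table rather than `i<<22`, and it is the page table entries that step through memory in 4 KiB units. The following is a small sketch of the arithmetic; the helper names are hypothetical and it runs as a plain user-space program.

```c
#include <assert.h>
#include <stdint.h>

/* Top 10 bits select the page directory entry. */
static uint32_t pd_index(uint32_t virt)    { return virt >> 22; }

/* Middle 10 bits select the entry inside that page table. */
static uint32_t pt_index(uint32_t virt)    { return (virt >> 12) & 0x3FFu; }

/* Low 12 bits are the byte offset within the 4 KiB page. */
static uint32_t page_offset(uint32_t virt) { return virt & 0xFFFu; }

/* Page table entry that identity-maps the page selected by
 * (pd_idx, pt_idx): frame address in bits 31..12, flags in bits 11..0. */
static uint32_t identity_pte(uint32_t pd_idx, uint32_t pt_idx, uint32_t flags)
{
    return (pd_idx << 22) | (pt_idx << 12) | (flags & 0xFFFu);
}
```

For example, virtual address 0x00403025 uses page directory entry 1, page table entry 3, and offset 0x25.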

אורי ויסבלום

Dec 6, 2022, 2:41:23 PM
Now, here is the new code I've written (with a basic kmalloc, without free because I don't need it for now). The same bug happens. Any other ideas?
BTW, I've checked all of the OR expressions and all of the allocPage() outputs; they're as intended.


#define PDT_SIZE 1024
#define KERNEL_START 0x1000 // a constant in my linker code
#define PAGE_SIZE 0x1000
#define KERNEL_SIZE 3 // the size of the kernel code is 10KB for now (I've checked it), so I gave it 3 pages * 4KB = 12KB
#define KERNEL_END (KERNEL_START + PAGE_SIZE*KERNEL_SIZE)

PTEntry* kernelPTAddr = 0;
uint32_t firstFree = 0;

uint32_t allocPage() {
    if (firstFree == 0)
        firstFree = KERNEL_END;
    firstFree += PAGE_SIZE;
    return firstFree - PAGE_SIZE;
}

void initPDT() {
    PDEntry* table = allocPage();
    kernelPTAddr = allocPage();
    // Here it was with the struct of PD earlier, but I've realized it doesn't work correctly, it doesn't take into account the size of P*Entry, so I'm using pointers now.
    *table = READWRITE | PRESENT | (uint32_t)kernelPTAddr;

    for (int i = 1; i <= KERNEL_SIZE; i++) {
        *(kernelPTAddr+i*sizeof(PTEntry)) = PRESENT | READWRITE | DIRTY | KERNEL_START + (i-1)*PAGE_SIZE;
    }

    // last entry points to the PDT itself
    *(table+(PDT_SIZE-1)*sizeof(PDEntry)) = READWRITE | PRESENT | (uint32_t)table;

    irqInstallHandler(14, pagefault);

    startVirtualMode((uint32_t)table);
}
void startVirtualMode(uint32_t address); // the same as before.

// typedefs (these are in the .h file; the order of the definitions in this mail is just for convenience)
typedef enum {
PRESENT = 1,
READWRITE = 2,
USER = 4,
WRITETHROUGH = 8,
CACHE = 16,
DIRTY = 64 // this is specifically in PTEntry and not in PDEntry
// you may have noticed i don't mention some other flags, such as PageSize, Global, PageAttributeTable
// I didn't because we don't need them
} PageFlags;

typedef uint32_t PDEntry;
typedef uint32_t PTEntry;
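
One thing worth checking in the code above is the pointer arithmetic. In C, adding an integer to a `PTEntry*` already advances in units of `sizeof(PTEntry)`, so `*(kernelPTAddr+i*sizeof(PTEntry))` would touch entry `4*i` rather than entry `i`, assuming 4-byte entries. The sketch below illustrates this in plain user-space C; `entry_hit_by_scaled_index` and `set_entry` are hypothetical helpers, not part of the posted kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t PTEntry;

/* Which entry does *(base + i * sizeof(PTEntry)) actually touch?
 * Pointer arithmetic already counts in elements, so the extra sizeof
 * multiplies the index by the entry size. */
static size_t entry_hit_by_scaled_index(size_t i)
{
    return i * sizeof(PTEntry);
}

/* Plain indexing: writes entry `index` and nothing else. */
static void set_entry(PTEntry *table, size_t index, PTEntry value)
{
    table[index] = value;   /* identical to *(table + index) = value */
}
```

By the same reasoning, `table+(PDT_SIZE-1)*sizeof(PDEntry)` lands 4092 entries (16368 bytes) past the start of the table, which is outside the 4 KiB page that was allocated for it. The posted code also never clears the freshly allocated pages, so the untouched entries contain whatever was in memory before.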

Scott Lurndal

Dec 6, 2022, 3:37:56 PM
Here's some working code that starts in real mode and
ends in long mode with paging enabled setting up enough state to
call a C++ function. This should give you an idea of the steps
required.

setup.s:

/*
// Boot Component
//
// This module implements the second-stage boot. This code is physically
// located 512 bytes into the image loaded by the PXE code (the image is
// constructed by boot/build.c). This part of the PXE image is
// loaded at address 0x90000 (576kB) and the rest is loaded at address
// 0x10000 (64kB). Using this loading mechanism, the image may be no more
// than 512 kB in size.
//
// After the syslinux PXE loader loads the first 64k of the image at
// address 0x90000, it will jump, in 16-bit real-mode, to the address
// 0x90200. The image as built by boot/build.c will contain a 512-byte
// Master Boot Record (which is unused by the PXE boot protocol, but provided
// for compatibility with alternate boot methods) followed immediately by
// the code below.
//
// This code will ensure that the data segment register correctly
// points to the same region as the code register. If an additional
// 512 bytes are needed, an alternate segment register can be loaded
// with the address of the mbr (0x90000) and used for scratch storage.
//
// The code will then perform any needed real-mode initialization steps,
// turn on protected mode, paging and long mode, then branch to a
// 32-bit code segment with an identity mapping of all physical memory.
//
// Written:
// 2004-December-21 Scott Lurndal
//
// Modified:
// 2006-October-27 Scott Lurndal
// Separated platform level initialization from processor initialization
//
*/

DVMM_START=0x100000
PAGE_TABLE_ADDR=0x0

SS_CODE64=0x10 # GDT Segment Selector for 64 bit code segment
SS_CODE32=0x18 # GDT Segment Selector for 32 bit code segment
SS_DATA=0x20 # GDT Segment Selector for data segment

PE_BIT=0x1 # Protected Mode Enable bit in CR0
PG_BIT=0x80000000 # Paging Mode Enable bit in CR0
PAE_BIT=0x00000020 # Physical address Extensions enable in CR4
PG_FLAGS=0x7 # Supervisor/ReadWrite/Present Flags for Paging
LME_BIT=0x100 # Long Mode Enable Bit in EFER

SMAP=0x534D4150

.code16 # Generate code for 16-bit real mode
.text

.global start
start:
jmp continue

/*
// Following the initial 16-bit jump instruction is a header which is filled
// in by the PXE netboot code (syslinux package).
*/
.ascii "HdrS" # Header signature
.word 0x0203 # Header version #
.long 0 # boot loader hook
.long 0 # start_syst (kernel version string)
.byte 0 # Type of loader
loadflags:
.byte 1 # Load high.
.word 0x8000 # Setup move size
h_start32:
.long 0x100000 # 32-bit code start address
.long 0 # Address of loaded ramdisk (DVMM: Configfile)
.long 0 # Ramdisk size in bytes
ap_start:
.word 0
.word 0
.word message+1024 # Heap end pointer (end of second-sector-boot)
.word 0 # pad
.long 0 # Command line pointer
.long 0xefffffff # Highest safe address for initrd

continue:
call next

.org 0x1fe
.word 0xaa55 # Signature looked for by first stage boot
#end of header

next:
movw %cs, %ax # Get CS (Should be 0x9020)
movw %ax, %ds # Set DS (Set to 0x9020)
pushw $0 # Clear flags
popfw

testw $1,ap_start # BSP or AP?
jnz 1f # AP, skip platform initialization

call initbsp # Do platform initialization
1:
call loaddt32 # load 32-bit descriptor tables
call relocate_pt # Relocate page tables
call enable_protectedmode # Enable protected mode

9:
hlt # Halt the CPU
jmp 9b # stay halted

initbsp:
call getmemsize
call initvideo

movl $0xec00, %eax
movl $0x2, %ebx
int $0x15 # Configure BIOS for long mode

call enableA20

cli
movb $0x80, %al # Set to disable NMI
outb %al, $0x70 # Disable NMI
xorw %ax, %ax
outb %al, $0xf0 # Not used on modern systems (math coproc)
call iodelay
outb %al, $0xf1 # Not used on modern systems (math coproc)
call iodelay

movb $0xff, %al # Mask all interrupts
outb %al, $0xA1 # program the second (slave) PIC
call iodelay
movb $0xfb, %al # Mask all but IRQ2 cascade
outb %al, $0x21 # program the first (master) PIC

call initdt32 # Setup 32-bit descriptor tables

# Save first 16 bytes of IVT and clear
xorw %ax, %ax # Clear
movw %ax, %es # Setup %ES to point at the IVT
movl %es:0x0, %eax
movl %eax, %es:0x200
movl %es:0x4, %eax
movl %eax, %es:0x204
movl %es:0x8, %eax
movl %eax, %es:0x208
movl %es:0xc, %eax
movl %eax, %es:0x20c
xorl %eax, %eax
movl %eax, %es:0x0
movl %eax, %es:0x4
movl %eax, %es:0x8 # Set percpu_cpunum to zero
movl %eax, %es:0xc # ...

ret

relocate_pt:
#
# Relocate the identity mapped page tables to low memory starting at
# address 0xa000. These page tables will be used only until the 64-bit
# C++ code establishes the real page tables.
#
orw $PG_FLAGS, identitypdp # Set the flags in the PDP entry
orw $PG_FLAGS, identitypml4 # Set the flags in the PML4 entry
orw $PG_FLAGS, identitypml4e2 # Same, but for the 2nd PML4 entry.
orw $PG_FLAGS, percpupdp # Same, for per-cpu pdp
orw $PG_FLAGS, percpupml4 # Same, for per-cpu pml4
lea identitydir, %si # Set the source address

movl $0xa00, %eax # Page Tables segment 0xa00 (addr = 0xa000)
movw %ax, %es
movw $PAGE_TABLE_ADDR, %di # Set the destination address

lea identitypml4e2 , %cx
addw $0x8, %cx
subw %si, %cx # Get the length of the page table.
shr $1, %cx # Get the number of words to be copied.
cld
rep movsw # Shift page tables to page table segment

#
# re-base the page tables at 0xa000
#
shl $4, %ax # Convert segment to physical address
addw %ax, %es:identitypdp-identitydir
addw %ax, %es:identitypml4-identitydir
addw %ax, %es:identitypml4e2-identitydir
addw %ax, %es:percpupdp-identitydir
addw %ax, %es:percpupml4-identitydir
ret

loaddt32:
lidt idtdesc32 # load 32-bit IDT
data32 lgdt gdtdesc32 # load 32-bit GDT
ret


noe820:
lea noe820msg, %si
call print
call delay
call delay
jmp 1b
noe820msg:
.string "No E820 Support in BIOS!\n"

#
# One E820 Map entry. Instead of using INT $15 to read directly
# into our E820 map buffer, we'll read into this map entry. We do
# this because some broken BIOS will only update portions of the
# buffer expecting the prior contents in the remainder.
#
mapentry:
.quad 0 # Base address
.quad 0 # Size, in bytes
.long 0 # Type field

getmemsize:
movw %ds, %ax # segment for mapentry
movw %ax, %es # Use it for int15
lea mapentry, %di # And this buffer
movw $0x9000, %ax # Segment for map buffer
movw %ax, %fs # load seg register
xorl %esi, %esi # Clear offset in map buffer
xorl %ebx, %ebx # Continuation counter

1:
movl memsize, %ecx # Remaining size
cmpl $20, %ecx # Enough left?
jl 2f # No, done.

movl $0x0000e820, %eax # Phoenix BIOS GET SYSTEM MEMORY MAP
movl $SMAP, %edx # string 'SMAP'
movl $20, %ecx
int $0x15 # Call bios
jc noe820 # No E820 - we don't support older bios!

cmpl $SMAP, %eax # Correct return value?
jne noe820 # No E820 - we don't support older bios!

movl %es:(%di), %eax # Get base low
movl %eax, %fs:(%si) # Store in buffer
movl %es:4(%di), %eax # Get base high
movl %eax, %fs:4(%si) # Store in buffer
movl %es:8(%di), %eax # Get size low
movl %eax, %fs:8(%si) # Store in buffer
movl %es:12(%di), %eax # Get size high
movl %eax, %fs:12(%si) # Store it
movl %es:16(%di), %eax # Get type
movl %eax, %fs:16(%si) # Store it
movw %si, %ax # Get buffer offset
addw $20, %ax # Bump to next entry
movw %ax, %si # Restore buffer offset
subl $20, memsize # Decrement by amount used

cmpl $0, %ebx # done yet?
jne 1b # Nope, get next chunk

2:
ret

initvideo:
ret

enableA20:
call testA20 # Check if a20 is enabled
jnz 1f # Yes, done.

movw $0x2401,%ax # Enable A20 gate (BIOS function number, immediate operand)
int $0x15 # Call the BIOS

call testA20 # Check again,
jnz 1f # yes, done.

call flush_keyboard # Flush keyboard controller

movb $0xD1, %al # Command Write
outb %al, $0x64 # send to register
call flush_keyboard # Flush keyboard controller
movb $0xDF, %al # A20 On
outb %al, $0x60 # send to register
call flush_keyboard # Flush keyboard controller
#
# Some chips/motherboards also require enabling A20 via port 0x92.
# This needs to be done before checking if A20 is enabled.
#
inb $0x92, %al # Get current state
orb $0x02, %al # Set Fast A20 bit
andb $0xfe, %al # Clear reset bit, just in case
outb %al, $0x92 # Enable fast A20

call testA20 # Check if a20 is enabled
jnz 1f

lea noa20msg, %si # Load message
call print # and print it.

hlt
1:
ret

A20ADDR=0x200

testA20:
pushw %cx # Save caller's register
pushw %ax # Save caller's register
xorw %cx, %cx # Set %cx = 0
movw %cx, %fs # Load %FS = 0000
decw %cx # Set %cx = FFFF
movw %cx, %gs # Load %GS = FFFF
movw $0x4000, %cx # Loop 16K times as A20 enable can take a while
movw %fs:(A20ADDR), %ax # Fetch word at address 0x200
pushw %ax # Save it for later restore
1:
incw %ax # Bump it
movw %ax, %fs:(A20ADDR) # Store it back at 0x200
call iodelay # Wait one.
cmpw %gs:(A20ADDR+0x10), %ax # See if it appears at 0x100200
loope 1b # Issue the compare 16K times
# ZF = 1 if the same value appears at both
# ZF = 0 if different values at both

popw %fs:(A20ADDR) # Restore original value
popw %ax # Restore caller's register
popw %cx # Restore caller's register
ret

flush_keyboard:
pushl %ecx # Save caller's reg
movl $100000, %ecx # Set maximum loop count for broken hdw
1:
decl %ecx # Count an iteration
jz 3f # Exhausted iteration count, bail.

call iodelay
inb $0x64, %al # Read 8042 status port
testb $1, %al # Is there data at port 60 for system?
jz 2f # No.

call iodelay
inb $0x60, %al # Read and discard byte
jmp 1b

2:
testb $2, %al # System data still pending?
jnz 1b # yes, loop

3:
popl %ecx
ret

initdt32:
xorl %eax, %eax # Clear register
movw %cs, %ax # Fetch CS
shll $4, %eax # Flatten it out
movl %eax, %ebx # Save it
addl $gdt, %eax # relative to gdt
movl %eax, (gdtdesc32+2) # Set address and

movl $idt32, %eax # Get base of IDT
movl $20, %ecx # Number of entries to munge

# Set the IDT32 entries in proper format
1: addr32 movw 2(,%eax,1), %dx # Copy high word
addr32 movw %dx, 6(,%eax,1) # Copy high word
addr32 movw $SS_CODE32, 2(,%eax,1) # Set code segment selector

addl $8, %eax # Bump to next entry
decw %cx # Down by one
testw $0xffff,%cx # Is it zero yet?
jnz 1b # No, loop around

movl %ebx, %eax # Retrieve segment base address
addl $idt32, %eax # relative to idt
movl %eax, (idtdesc32+2) # Set address and

movl %ebx, cs_realmode # Save the flattened CS base address
ret


enable_protectedmode: /* Do we need to set up the stack segment here ? */

movl %cr0, %eax # Get CR0
orl $PE_BIT, %eax # Enable 32 bit protected mode
movl %eax, %cr0 # Set CR0

# Set the base address in the 32 bit code descriptor of the GDT to
# cs_realmode. We can then specify enable_longmode as the
# operand of the far jmp directly without having to account for
# cs_realmode at runtime.

movl cs_realmode, %eax # Get CS pointer
movl $gdt, %ebx # Get GDT base

addr32 movw $0xFFFF, SS_CODE32(,%ebx,1) # 4GB Code segment limit
addr32 movl %eax, SS_CODE32+2(,%ebx,1) # Code segment base
addr32 orw $0x9A00, SS_CODE32+4(,%ebx,1) # Code descriptor flags
addr32 movw $0xCF, SS_CODE32+6(,%ebx,1) # Code descriptor flags

# Do the same for the 32 bit data descriptor in the GDT
addr32 movw $0xFFFF, SS_DATA(,%ebx,1) # 4GB Data segment limit
addr32 movl %eax, SS_DATA+2(,%ebx,1) # Data segment base
addr32 orw $0x9200, SS_DATA+4(,%ebx,1) # Data descriptor flags
addr32 movw $0xCF, SS_DATA+6(,%ebx,1) # Data descriptor flags

# A far jump is needed to actually activate the protected mode

# jmp enable_longmode,[SS_CODE32] (Invoke address using GDT entry #4)
.byte 0x66 # code32 override
.byte 0xea # jmpi instruction
.long enable_longmode # Address to invoke
.word SS_CODE32 # GDT[2]


iodelay:
outb %al,$0x80
ret

/* Print a null-terminated string. Address in si. */

print:
lodsb #
andb %al, %al
jz 2f

xorb %bh, %bh # Display page 0
movw $1, %cx # Print one time
movb $0x7, %bl # Text attribute
movb $0x0e, %ah # BIOS TTY
# char in %al
int $0x10 # Display character
jmp print

2:
ret

delayreload: .long 0x8000000 # 2 billion+

/* Delay for slightly over one second. Assumes 2.4ghz clock */
delay:
movl delayreload, %eax
1:
decl %eax
jg 1b

2:
incl %eax
jg 2b

ret

/* Print a word */
/* Value to print is in %ax. Pointer to label in %si */
dumpword:
pushw %dx # Save DX
movw %ax,%dx # Copy PC

call print

lea regstring, %si
call print

movw %dx,%ax # Reload AX
andw $0xf000,%ax # Mask high digit
shrw $12,%ax # Shift
call putdigit # Print digit

movw %dx,%ax
andw $0x0f00,%ax
shrw $8,%ax
call putdigit

movw %dx,%ax
andw $0x00f0,%ax
shrw $4,%ax
call putdigit

movw %dx,%ax
andw $0x000f,%ax
call putdigit

lea crlf, %si
call print

popw %dx
ret

putdigit:
lea digitlist,%si
addw %ax,%si
movb (%si),%al
movw $1, %cx # Repeat only once
movw $0x0007,%bx # Page # and Attribute
movb $0x0e,%ah # Function WRITE TTY
int $0x10 # Video call
ret

digitlist:
.ascii "0123456789ABCDEF"

message:
.string "Welcome to the 3Leaf Networks DVMM.....\r\n"

noa20msg:
.string "Unable to enable gate A20!\r\n"

regstring:
.string " value is: 0x"

crlf:
.string "\r\n"


.code32
.text
.global enable_longmode
enable_longmode:
movl $SS_DATA, %eax
movl %eax, %ds
movl %eax, %es
movl %eax, %fs
movl %eax, %gs

# Set up the protected mode stack pointer

lss stack_segdesc, %esp # Load SS:ESP
movl cs_realmode, %eax # Get CS address

# Set up cr3 with PML4 base before paging is enabled

leal identitypml4, %eax # Get PML4 base address
subl $identitydir, %eax
addl $0xa000, %eax # Bump to real base
movl %eax, %cr3

movl %cr4, %eax # Get CR4
orl $PAE_BIT, %eax # Set PAE for long mode
movl %eax, %cr4 # Set CR4

# Enable long mode
movl $0xC0000080, %ecx # EFER address
rdmsr # Read EFER Register into EAX
orl $LME_BIT, %eax # Enable long mode
wrmsr # Set EFER

# Enable paging to activate long mode
movl %cr0, %eax # Get CR0
mov $0x80050033,%eax
movl %eax, %cr0 # Set CR0
jmp 1f # Clear pipeline


1:
movl memsize, %eax # Amount of e820 buf space remaining
movzwl ap_start, %ebx # BSP or AP flag
movl cs_realmode, %esi # Real-mode base address

# Here, we are still in compatibility mode.
# A far jump is needed to actually activate 64-bit mode.

# jmp 0x10000,[SS_CODE64] (Invoke address using GDT entry #2)

.byte 0xea # jmpi instruction
.long DVMM_START # Address to invoke, dvmmstart is based at DVMM_START
.word SS_CODE64 # GDT[2]


lea bummer, %si
call print
9: hlt
jmp 9b

bummer:
.string "Fell through jump to C++ code - halting\r\n"

intde:
hlt
intdb:
hlt
intnmi:
hlt
intbp:
hlt
intof:
hlt
intbr:
hlt
intud:
hlt
intnm:
hlt
intdf:
hlt
intts:
hlt
intnp:
hlt
intss:
hlt
intgp:
hlt
intpf:
hlt
intmf:
hlt
intac:
hlt
intmc:
hlt
intxf:
hlt


memsize:
.long 512 # Number of bytes available for E820 memory map

cs_realmode: # Save the CS value in real mode.
.long 0x0 # for offset calculations in protected mode.

#
# Global Descriptor Table
#
# The GDT has one reserved entry (0), but this code was written with
# the belief that there are two reserved entries. No reason to fix that
# now, since setup64.S just creates its own GDT anyway.
#
# Entry 2 is a flat 4GB view for 32-bit code access. Entry 3 is
# a flat 4GB view for 32-bit data access.
#
# The same GDT can be used in both 32-bit mode and 64-bit mode,
# except that the code segment descriptor used in long mode has
# the Long mode bit set.

.align 16
gdt:
.long 0, 0 # Entry 0 (NULL segment descriptor)
.long 0, 0 # Entry 1 (reserved)
.long 0x0000ffff, 0x00AF9A00 # 0-4Gb Code read/exec compat mode
.long 0x0000ffff, 0x00CF9A00 # 0-4Gb Code read/exec 32-bit
# (Segment Base modified at run time.)
.long 0x0000ffff, 0x00CF9200 # 0-4Gb Data read/write 32-bit
# (Segment Base modified at run time.)
gdt_end:

gdtdesc32: # 32 bit GDT Descriptor
.word gdt_end-gdt-1 # limit
.long 0 # 32 bit GDT base (Needs flattening)

.align 16
idt32:
.long intde, 0x00008e00 # #DE /* Reversed the order of the IDT entries */
.long intdb, 0x00008e00 # #DB
.long intnmi, 0x00008e00 # #NMI
.long intbp, 0x00008e00 # #BP
.long intof, 0x00008e00 # #OF
.long intbr, 0x00008e00 # #BR
.long intud, 0x00008e00 # #UD
.long intnm, 0x00008e00 # #NM
.long intdf, 0x00008e00 # #DF
.long 0, 0x00008e00 # Reserved
.long intts, 0x00008e00 # #TS
.long intnp, 0x00008e00 # #NP
.long intss, 0x00008e00 # #SS
.long intgp, 0x00008e00 # #GP
.long intpf, 0x00008e00 # #PF
.long intmf, 0x00008e00 # #MF
.long intac, 0x00008e00 # #AC
.long intmc, 0x00008e00 # #MC
.long intxf, 0x00008e00 # #XF
idt32_end:

idtdesc32:
.word idt32_end-idt32-1 # limit
.long 0 # 32 bit IDT base (Needs flattening)

stack:
.space 4096, 0
stacktop:
stack_segdesc:
.long stacktop
.word SS_DATA

# Initial Page Tables in the long mode will provide an identity mapping for
# a virtual address space of 0-1Gb with a page size of 2Mb.
# The PD, PDP, PML4 must be aligned on a boundary congruent to zero modulo 4096.

.align 4096,0
identitydir:
.quad 0x0000000000000087 # 0000000000000000 - 00000000001fffff
.quad 0x0000000000200087 # 0000000000200000 - 00000000003fffff
.quad 0x0000000000400087 # 0000000000400000 - 00000000005fffff
.quad 0x0000000000600087 # 0000000000600000 - 00000000007fffff
.quad 0x0000000000800087 # 0000000000800000 - 00000000009fffff
.quad 0x0000000000a00087 # 0000000000a00000 - 0000000000bfffff
.quad 0x0000000000c00087 # 0000000000c00000 - 0000000000dfffff
.quad 0x0000000000e00087 # 0000000000e00000 - 0000000000ffffff
.quad 0x0000000001000087 # 0000000001000000 - 00000000011fffff
.quad 0x0000000001200087 # 0000000001200000 - 00000000013fffff
.quad 0x0000000001400087 # 0000000001400000 - 00000000015fffff
.quad 0x0000000001600087 # 0000000001600000 - 00000000017fffff
.quad 0x0000000001800087 # 0000000001800000 - 00000000019fffff
.quad 0x0000000001a00087 # 0000000001a00000 - 0000000001bfffff
.quad 0x0000000001c00087 # 0000000001c00000 - 0000000001dfffff
.quad 0x0000000001e00087 # 0000000001e00000 - 0000000001ffffff
.quad 0x0000000002000087 # 0000000002000000 - 00000000021fffff
.quad 0x0000000002200087 # 0000000002200000 - 00000000023fffff
.quad 0x0000000002400087 # 0000000002400000 - 00000000025fffff
.quad 0x0000000002600087 # 0000000002600000 - 00000000027fffff
.quad 0x0000000002800087 # 0000000002800000 - 00000000029fffff
.quad 0x0000000002a00087 # 0000000002a00000 - 0000000002bfffff
.quad 0x0000000002c00087 # 0000000002c00000 - 0000000002dfffff
.quad 0x0000000002e00087 # 0000000002e00000 - 0000000002ffffff
.quad 0x0000000003000087 # 0000000003000000 - 00000000031fffff
.quad 0x0000000003200087 # 0000000003200000 - 00000000033fffff
.quad 0x0000000003400087 # 0000000003400000 - 00000000035fffff
.quad 0x0000000003600087 # 0000000003600000 - 00000000037fffff
.quad 0x0000000003800087 # 0000000003800000 - 00000000039fffff
.quad 0x0000000003a00087 # 0000000003a00000 - 0000000003bfffff
.quad 0x0000000003c00087 # 0000000003c00000 - 0000000003dfffff
.quad 0x0000000003e00087 # 0000000003e00000 - 0000000003ffffff
.quad 0x0000000004000087 # 0000000004000000 - 00000000041fffff
.quad 0x0000000004200087 # 0000000004200000 - 00000000043fffff
.quad 0x0000000004400087 # 0000000004400000 - 00000000045fffff
.quad 0x0000000004600087 # 0000000004600000 - 00000000047fffff
.quad 0x0000000004800087 # 0000000004800000 - 00000000049fffff
.quad 0x0000000004a00087 # 0000000004a00000 - 0000000004bfffff
.quad 0x0000000004c00087 # 0000000004c00000 - 0000000004dfffff
.quad 0x0000000004e00087 # 0000000004e00000 - 0000000004ffffff
.quad 0x0000000005000087 # 0000000005000000 - 00000000051fffff
.quad 0x0000000005200087 # 0000000005200000 - 00000000053fffff
.quad 0x0000000005400087 # 0000000005400000 - 00000000055fffff
.quad 0x0000000005600087 # 0000000005600000 - 00000000057fffff
.quad 0x0000000005800087 # 0000000005800000 - 00000000059fffff
.quad 0x0000000005a00087 # 0000000005a00000 - 0000000005bfffff
.quad 0x0000000005c00087 # 0000000005c00000 - 0000000005dfffff
.quad 0x0000000005e00087 # 0000000005e00000 - 0000000005ffffff
.quad 0x0000000006000087 # 0000000006000000 - 00000000061fffff
.quad 0x0000000006200087 # 0000000006200000 - 00000000063fffff
.quad 0x0000000006400087 # 0000000006400000 - 00000000065fffff
.quad 0x0000000006600087 # 0000000006600000 - 00000000067fffff
.quad 0x0000000006800087 # 0000000006800000 - 00000000069fffff
.quad 0x0000000006a00087 # 0000000006a00000 - 0000000006bfffff
.quad 0x0000000006c00087 # 0000000006c00000 - 0000000006dfffff
.quad 0x0000000006e00087 # 0000000006e00000 - 0000000006ffffff
.quad 0x0000000007000087 # 0000000007000000 - 00000000071fffff
.quad 0x0000000007200087 # 0000000007200000 - 00000000073fffff
.quad 0x0000000007400087 # 0000000007400000 - 00000000075fffff
.quad 0x0000000007600087 # 0000000007600000 - 00000000077fffff
.quad 0x0000000007800087 # 0000000007800000 - 00000000079fffff
.quad 0x0000000007a00087 # 0000000007a00000 - 0000000007bfffff
.quad 0x0000000007c00087 # 0000000007c00000 - 0000000007dfffff
.quad 0x0000000007e00087 # 0000000007e00000 - 0000000007ffffff
.quad 0x0000000008000087 # 0000000008000000 - 00000000081fffff
.quad 0x0000000008200087 # 0000000008200000 - 00000000083fffff
.quad 0x0000000008400087 # 0000000008400000 - 00000000085fffff
.quad 0x0000000008600087 # 0000000008600000 - 00000000087fffff
.quad 0x0000000008800087 # 0000000008800000 - 00000000089fffff
.quad 0x0000000008a00087 # 0000000008a00000 - 0000000008bfffff
.quad 0x0000000008c00087 # 0000000008c00000 - 0000000008dfffff
.quad 0x0000000008e00087 # 0000000008e00000 - 0000000008ffffff
.quad 0x0000000009000087 # 0000000009000000 - 00000000091fffff
.quad 0x0000000009200087 # 0000000009200000 - 00000000093fffff
.quad 0x0000000009400087 # 0000000009400000 - 00000000095fffff
.quad 0x0000000009600087 # 0000000009600000 - 00000000097fffff
.quad 0x0000000009800087 # 0000000009800000 - 00000000099fffff
.quad 0x0000000009a00087 # 0000000009a00000 - 0000000009bfffff
.quad 0x0000000009c00087 # 0000000009c00000 - 0000000009dfffff
.quad 0x0000000009e00087 # 0000000009e00000 - 0000000009ffffff
.quad 0x000000000a000087 # 000000000a000000 - 000000000a1fffff
.quad 0x000000000a200087 # 000000000a200000 - 000000000a3fffff
.quad 0x000000000a400087 # 000000000a400000 - 000000000a5fffff
.quad 0x000000000a600087 # 000000000a600000 - 000000000a7fffff
.quad 0x000000000a800087 # 000000000a800000 - 000000000a9fffff
.quad 0x000000000aa00087 # 000000000aa00000 - 000000000abfffff
.quad 0x000000000ac00087 # 000000000ac00000 - 000000000adfffff
.quad 0x000000000ae00087 # 000000000ae00000 - 000000000affffff
.quad 0x000000000b000087 # 000000000b000000 - 000000000b1fffff
.quad 0x000000000b200087 # 000000000b200000 - 000000000b3fffff
.quad 0x000000000b400087 # 000000000b400000 - 000000000b5fffff
.quad 0x000000000b600087 # 000000000b600000 - 000000000b7fffff
.quad 0x000000000b800087 # 000000000b800000 - 000000000b9fffff
.quad 0x000000000ba00087 # 000000000ba00000 - 000000000bbfffff
.quad 0x000000000bc00087 # 000000000bc00000 - 000000000bdfffff
.quad 0x000000000be00087 # 000000000be00000 - 000000000bffffff
.quad 0x000000000c000087 # 000000000c000000 - 000000000c1fffff
.quad 0x000000000c200087 # 000000000c200000 - 000000000c3fffff
.quad 0x000000000c400087 # 000000000c400000 - 000000000c5fffff
.quad 0x000000000c600087 # 000000000c600000 - 000000000c7fffff
.quad 0x000000000c800087 # 000000000c800000 - 000000000c9fffff
.quad 0x000000000ca00087 # 000000000ca00000 - 000000000cbfffff
.quad 0x000000000cc00087 # 000000000cc00000 - 000000000cdfffff
.quad 0x000000000ce00087 # 000000000ce00000 - 000000000cffffff
.quad 0x000000000d000087 # 000000000d000000 - 000000000d1fffff
.quad 0x000000000d200087 # 000000000d200000 - 000000000d3fffff
.quad 0x000000000d400087 # 000000000d400000 - 000000000d5fffff
.quad 0x000000000d600087 # 000000000d600000 - 000000000d7fffff
.quad 0x000000000d800087 # 000000000d800000 - 000000000d9fffff
.quad 0x000000000da00087 # 000000000da00000 - 000000000dbfffff
.quad 0x000000000dc00087 # 000000000dc00000 - 000000000ddfffff
.quad 0x000000000de00087 # 000000000de00000 - 000000000dffffff
.quad 0x000000000e000087 # 000000000e000000 - 000000000e1fffff
.quad 0x000000000e200087 # 000000000e200000 - 000000000e3fffff
.quad 0x000000000e400087 # 000000000e400000 - 000000000e5fffff
.quad 0x000000000e600087 # 000000000e600000 - 000000000e7fffff
.quad 0x000000000e800087 # 000000000e800000 - 000000000e9fffff
.quad 0x000000000ea00087 # 000000000ea00000 - 000000000ebfffff
.quad 0x000000000ec00087 # 000000000ec00000 - 000000000edfffff
.quad 0x000000000ee00087 # 000000000ee00000 - 000000000effffff
.quad 0x000000000f000087 # 000000000f000000 - 000000000f1fffff
.quad 0x000000000f200087 # 000000000f200000 - 000000000f3fffff
.quad 0x000000000f400087 # 000000000f400000 - 000000000f5fffff
.quad 0x000000000f600087 # 000000000f600000 - 000000000f7fffff
.quad 0x000000000f800087 # 000000000f800000 - 000000000f9fffff
.quad 0x000000000fa00087 # 000000000fa00000 - 000000000fbfffff
.quad 0x000000000fc00087 # 000000000fc00000 - 000000000fdfffff
.quad 0x000000000fe00087 # 000000000fe00000 - 000000000fffffff
.quad 0x0000000010000087 # 0000000010000000 - 00000000101fffff
.quad 0x0000000010200087 # 0000000010200000 - 00000000103fffff
.quad 0x0000000010400087 # 0000000010400000 - 00000000105fffff
.quad 0x0000000010600087 # 0000000010600000 - 00000000107fffff
.quad 0x0000000010800087 # 0000000010800000 - 00000000109fffff
.quad 0x0000000010a00087 # 0000000010a00000 - 0000000010bfffff
.quad 0x0000000010c00087 # 0000000010c00000 - 0000000010dfffff
.quad 0x0000000010e00087 # 0000000010e00000 - 0000000010ffffff
.quad 0x0000000011000087 # 0000000011000000 - 00000000111fffff
.quad 0x0000000011200087 # 0000000011200000 - 00000000113fffff
.quad 0x0000000011400087 # 0000000011400000 - 00000000115fffff
.quad 0x0000000011600087 # 0000000011600000 - 00000000117fffff
.quad 0x0000000011800087 # 0000000011800000 - 00000000119fffff
.quad 0x0000000011a00087 # 0000000011a00000 - 0000000011bfffff
.quad 0x0000000011c00087 # 0000000011c00000 - 0000000011dfffff
.quad 0x0000000011e00087 # 0000000011e00000 - 0000000011ffffff
.quad 0x0000000012000087 # 0000000012000000 - 00000000121fffff
.quad 0x0000000012200087 # 0000000012200000 - 00000000123fffff
.quad 0x0000000012400087 # 0000000012400000 - 00000000125fffff
.quad 0x0000000012600087 # 0000000012600000 - 00000000127fffff
.quad 0x0000000012800087 # 0000000012800000 - 00000000129fffff
.quad 0x0000000012a00087 # 0000000012a00000 - 0000000012bfffff
.quad 0x0000000012c00087 # 0000000012c00000 - 0000000012dfffff
.quad 0x0000000012e00087 # 0000000012e00000 - 0000000012ffffff
.quad 0x0000000013000087 # 0000000013000000 - 00000000131fffff
.quad 0x0000000013200087 # 0000000013200000 - 00000000133fffff
.quad 0x0000000013400087 # 0000000013400000 - 00000000135fffff
.quad 0x0000000013600087 # 0000000013600000 - 00000000137fffff
.quad 0x0000000013800087 # 0000000013800000 - 00000000139fffff
.quad 0x0000000013a00087 # 0000000013a00000 - 0000000013bfffff
.quad 0x0000000013c00087 # 0000000013c00000 - 0000000013dfffff
.quad 0x0000000013e00087 # 0000000013e00000 - 0000000013ffffff
.quad 0x0000000014000087 # 0000000014000000 - 00000000141fffff
.quad 0x0000000014200087 # 0000000014200000 - 00000000143fffff
.quad 0x0000000014400087 # 0000000014400000 - 00000000145fffff
.quad 0x0000000014600087 # 0000000014600000 - 00000000147fffff
.quad 0x0000000014800087 # 0000000014800000 - 00000000149fffff
.quad 0x0000000014a00087 # 0000000014a00000 - 0000000014bfffff
.quad 0x0000000014c00087 # 0000000014c00000 - 0000000014dfffff
.quad 0x0000000014e00087 # 0000000014e00000 - 0000000014ffffff
.quad 0x0000000015000087 # 0000000015000000 - 00000000151fffff
.quad 0x0000000015200087 # 0000000015200000 - 00000000153fffff
.quad 0x0000000015400087 # 0000000015400000 - 00000000155fffff
.quad 0x0000000015600087 # 0000000015600000 - 00000000157fffff
.quad 0x0000000015800087 # 0000000015800000 - 00000000159fffff
.quad 0x0000000015a00087 # 0000000015a00000 - 0000000015bfffff
.quad 0x0000000015c00087 # 0000000015c00000 - 0000000015dfffff
.quad 0x0000000015e00087 # 0000000015e00000 - 0000000015ffffff
.quad 0x0000000016000087 # 0000000016000000 - 00000000161fffff
.quad 0x0000000016200087 # 0000000016200000 - 00000000163fffff
.quad 0x0000000016400087 # 0000000016400000 - 00000000165fffff
.quad 0x0000000016600087 # 0000000016600000 - 00000000167fffff
.quad 0x0000000016800087 # 0000000016800000 - 00000000169fffff
.quad 0x0000000016a00087 # 0000000016a00000 - 0000000016bfffff
.quad 0x0000000016c00087 # 0000000016c00000 - 0000000016dfffff
.quad 0x0000000016e00087 # 0000000016e00000 - 0000000016ffffff
.quad 0x0000000017000087 # 0000000017000000 - 00000000171fffff
.quad 0x0000000017200087 # 0000000017200000 - 00000000173fffff
.quad 0x0000000017400087 # 0000000017400000 - 00000000175fffff
.quad 0x0000000017600087 # 0000000017600000 - 00000000177fffff
.quad 0x0000000017800087 # 0000000017800000 - 00000000179fffff
.quad 0x0000000017a00087 # 0000000017a00000 - 0000000017bfffff
.quad 0x0000000017c00087 # 0000000017c00000 - 0000000017dfffff
.quad 0x0000000017e00087 # 0000000017e00000 - 0000000017ffffff
.quad 0x0000000018000087 # 0000000018000000 - 00000000181fffff
.quad 0x0000000018200087 # 0000000018200000 - 00000000183fffff
.quad 0x0000000018400087 # 0000000018400000 - 00000000185fffff
.quad 0x0000000018600087 # 0000000018600000 - 00000000187fffff
.quad 0x0000000018800087 # 0000000018800000 - 00000000189fffff
.quad 0x0000000018a00087 # 0000000018a00000 - 0000000018bfffff
.quad 0x0000000018c00087 # 0000000018c00000 - 0000000018dfffff
.quad 0x0000000018e00087 # 0000000018e00000 - 0000000018ffffff
.quad 0x0000000019000087 # 0000000019000000 - 00000000191fffff
.quad 0x0000000019200087 # 0000000019200000 - 00000000193fffff
.quad 0x0000000019400087 # 0000000019400000 - 00000000195fffff
.quad 0x0000000019600087 # 0000000019600000 - 00000000197fffff
.quad 0x0000000019800087 # 0000000019800000 - 00000000199fffff
.quad 0x0000000019a00087 # 0000000019a00000 - 0000000019bfffff
.quad 0x0000000019c00087 # 0000000019c00000 - 0000000019dfffff
.quad 0x0000000019e00087 # 0000000019e00000 - 0000000019ffffff
.quad 0x000000001a000087 # 000000001a000000 - 000000001a1fffff
.quad 0x000000001a200087 # 000000001a200000 - 000000001a3fffff
.quad 0x000000001a400087 # 000000001a400000 - 000000001a5fffff
.quad 0x000000001a600087 # 000000001a600000 - 000000001a7fffff
.quad 0x000000001a800087 # 000000001a800000 - 000000001a9fffff
.quad 0x000000001aa00087 # 000000001aa00000 - 000000001abfffff
.quad 0x000000001ac00087 # 000000001ac00000 - 000000001adfffff
.quad 0x000000001ae00087 # 000000001ae00000 - 000000001affffff
.quad 0x000000001b000087 # 000000001b000000 - 000000001b1fffff
.quad 0x000000001b200087 # 000000001b200000 - 000000001b3fffff
.quad 0x000000001b400087 # 000000001b400000 - 000000001b5fffff
.quad 0x000000001b600087 # 000000001b600000 - 000000001b7fffff
.quad 0x000000001b800087 # 000000001b800000 - 000000001b9fffff
.quad 0x000000001ba00087 # 000000001ba00000 - 000000001bbfffff
.quad 0x000000001bc00087 # 000000001bc00000 - 000000001bdfffff
.quad 0x000000001be00087 # 000000001be00000 - 000000001bffffff
.quad 0x000000001c000087 # 000000001c000000 - 000000001c1fffff
.quad 0x000000001c200087 # 000000001c200000 - 000000001c3fffff
.quad 0x000000001c400087 # 000000001c400000 - 000000001c5fffff
.quad 0x000000001c600087 # 000000001c600000 - 000000001c7fffff
.quad 0x000000001c800087 # 000000001c800000 - 000000001c9fffff
.quad 0x000000001ca00087 # 000000001ca00000 - 000000001cbfffff
.quad 0x000000001cc00087 # 000000001cc00000 - 000000001cdfffff
.quad 0x000000001ce00087 # 000000001ce00000 - 000000001cffffff
.quad 0x000000001d000087 # 000000001d000000 - 000000001d1fffff
.quad 0x000000001d200087 # 000000001d200000 - 000000001d3fffff
.quad 0x000000001d400087 # 000000001d400000 - 000000001d5fffff
.quad 0x000000001d600087 # 000000001d600000 - 000000001d7fffff
.quad 0x000000001d800087 # 000000001d800000 - 000000001d9fffff
.quad 0x000000001da00087 # 000000001da00000 - 000000001dbfffff
.quad 0x000000001dc00087 # 000000001dc00000 - 000000001ddfffff
.quad 0x000000001de00087 # 000000001de00000 - 000000001dffffff
.quad 0x000000001e000087 # 000000001e000000 - 000000001e1fffff
.quad 0x000000001e200087 # 000000001e200000 - 000000001e3fffff
.quad 0x000000001e400087 # 000000001e400000 - 000000001e5fffff
.quad 0x000000001e600087 # 000000001e600000 - 000000001e7fffff
.quad 0x000000001e800087 # 000000001e800000 - 000000001e9fffff
.quad 0x000000001ea00087 # 000000001ea00000 - 000000001ebfffff
.quad 0x000000001ec00087 # 000000001ec00000 - 000000001edfffff
.quad 0x000000001ee00087 # 000000001ee00000 - 000000001effffff
.quad 0x000000001f000087 # 000000001f000000 - 000000001f1fffff
.quad 0x000000001f200087 # 000000001f200000 - 000000001f3fffff
.quad 0x000000001f400087 # 000000001f400000 - 000000001f5fffff
.quad 0x000000001f600087 # 000000001f600000 - 000000001f7fffff
.quad 0x000000001f800087 # 000000001f800000 - 000000001f9fffff
.quad 0x000000001fa00087 # 000000001fa00000 - 000000001fbfffff
.quad 0x000000001fc00087 # 000000001fc00000 - 000000001fdfffff
.quad 0x000000001fe00087 # 000000001fe00000 - 000000001fffffff
.quad 0x0000000020000087 # 0000000020000000 - 00000000201fffff
.quad 0x0000000020200087 # 0000000020200000 - 00000000203fffff
.quad 0x0000000020400087 # 0000000020400000 - 00000000205fffff
.quad 0x0000000020600087 # 0000000020600000 - 00000000207fffff
.quad 0x0000000020800087 # 0000000020800000 - 00000000209fffff
.quad 0x0000000020a00087 # 0000000020a00000 - 0000000020bfffff
.quad 0x0000000020c00087 # 0000000020c00000 - 0000000020dfffff
.quad 0x0000000020e00087 # 0000000020e00000 - 0000000020ffffff
.quad 0x0000000021000087 # 0000000021000000 - 00000000211fffff
.quad 0x0000000021200087 # 0000000021200000 - 00000000213fffff
.quad 0x0000000021400087 # 0000000021400000 - 00000000215fffff
.quad 0x0000000021600087 # 0000000021600000 - 00000000217fffff
.quad 0x0000000021800087 # 0000000021800000 - 00000000219fffff
.quad 0x0000000021a00087 # 0000000021a00000 - 0000000021bfffff
.quad 0x0000000021c00087 # 0000000021c00000 - 0000000021dfffff
.quad 0x0000000021e00087 # 0000000021e00000 - 0000000021ffffff
.quad 0x0000000022000087 # 0000000022000000 - 00000000221fffff
.quad 0x0000000022200087 # 0000000022200000 - 00000000223fffff
.quad 0x0000000022400087 # 0000000022400000 - 00000000225fffff
.quad 0x0000000022600087 # 0000000022600000 - 00000000227fffff
.quad 0x0000000022800087 # 0000000022800000 - 00000000229fffff
.quad 0x0000000022a00087 # 0000000022a00000 - 0000000022bfffff
.quad 0x0000000022c00087 # 0000000022c00000 - 0000000022dfffff
.quad 0x0000000022e00087 # 0000000022e00000 - 0000000022ffffff
.quad 0x0000000023000087 # 0000000023000000 - 00000000231fffff
.quad 0x0000000023200087 # 0000000023200000 - 00000000233fffff
.quad 0x0000000023400087 # 0000000023400000 - 00000000235fffff
.quad 0x0000000023600087 # 0000000023600000 - 00000000237fffff
.quad 0x0000000023800087 # 0000000023800000 - 00000000239fffff
.quad 0x0000000023a00087 # 0000000023a00000 - 0000000023bfffff
.quad 0x0000000023c00087 # 0000000023c00000 - 0000000023dfffff
.quad 0x0000000023e00087 # 0000000023e00000 - 0000000023ffffff
.quad 0x0000000024000087 # 0000000024000000 - 00000000241fffff
.quad 0x0000000024200087 # 0000000024200000 - 00000000243fffff
.quad 0x0000000024400087 # 0000000024400000 - 00000000245fffff
.quad 0x0000000024600087 # 0000000024600000 - 00000000247fffff
.quad 0x0000000024800087 # 0000000024800000 - 00000000249fffff
.quad 0x0000000024a00087 # 0000000024a00000 - 0000000024bfffff
.quad 0x0000000024c00087 # 0000000024c00000 - 0000000024dfffff
.quad 0x0000000024e00087 # 0000000024e00000 - 0000000024ffffff
.quad 0x0000000025000087 # 0000000025000000 - 00000000251fffff
.quad 0x0000000025200087 # 0000000025200000 - 00000000253fffff
.quad 0x0000000025400087 # 0000000025400000 - 00000000255fffff
.quad 0x0000000025600087 # 0000000025600000 - 00000000257fffff
.quad 0x0000000025800087 # 0000000025800000 - 00000000259fffff
.quad 0x0000000025a00087 # 0000000025a00000 - 0000000025bfffff
.quad 0x0000000025c00087 # 0000000025c00000 - 0000000025dfffff
.quad 0x0000000025e00087 # 0000000025e00000 - 0000000025ffffff
.quad 0x0000000026000087 # 0000000026000000 - 00000000261fffff
.quad 0x0000000026200087 # 0000000026200000 - 00000000263fffff
.quad 0x0000000026400087 # 0000000026400000 - 00000000265fffff
.quad 0x0000000026600087 # 0000000026600000 - 00000000267fffff
.quad 0x0000000026800087 # 0000000026800000 - 00000000269fffff
.quad 0x0000000026a00087 # 0000000026a00000 - 0000000026bfffff
.quad 0x0000000026c00087 # 0000000026c00000 - 0000000026dfffff
.quad 0x0000000026e00087 # 0000000026e00000 - 0000000026ffffff
.quad 0x0000000027000087 # 0000000027000000 - 00000000271fffff
.quad 0x0000000027200087 # 0000000027200000 - 00000000273fffff
.quad 0x0000000027400087 # 0000000027400000 - 00000000275fffff
.quad 0x0000000027600087 # 0000000027600000 - 00000000277fffff
.quad 0x0000000027800087 # 0000000027800000 - 00000000279fffff
.quad 0x0000000027a00087 # 0000000027a00000 - 0000000027bfffff
.quad 0x0000000027c00087 # 0000000027c00000 - 0000000027dfffff
.quad 0x0000000027e00087 # 0000000027e00000 - 0000000027ffffff
.quad 0x0000000028000087 # 0000000028000000 - 00000000281fffff
.quad 0x0000000028200087 # 0000000028200000 - 00000000283fffff
.quad 0x0000000028400087 # 0000000028400000 - 00000000285fffff
.quad 0x0000000028600087 # 0000000028600000 - 00000000287fffff
.quad 0x0000000028800087 # 0000000028800000 - 00000000289fffff
.quad 0x0000000028a00087 # 0000000028a00000 - 0000000028bfffff
.quad 0x0000000028c00087 # 0000000028c00000 - 0000000028dfffff
.quad 0x0000000028e00087 # 0000000028e00000 - 0000000028ffffff
.quad 0x0000000029000087 # 0000000029000000 - 00000000291fffff
.quad 0x0000000029200087 # 0000000029200000 - 00000000293fffff
.quad 0x0000000029400087 # 0000000029400000 - 00000000295fffff
.quad 0x0000000029600087 # 0000000029600000 - 00000000297fffff
.quad 0x0000000029800087 # 0000000029800000 - 00000000299fffff
.quad 0x0000000029a00087 # 0000000029a00000 - 0000000029bfffff
.quad 0x0000000029c00087 # 0000000029c00000 - 0000000029dfffff
.quad 0x0000000029e00087 # 0000000029e00000 - 0000000029ffffff
.quad 0x000000002a000087 # 000000002a000000 - 000000002a1fffff
.quad 0x000000002a200087 # 000000002a200000 - 000000002a3fffff
.quad 0x000000002a400087 # 000000002a400000 - 000000002a5fffff
.quad 0x000000002a600087 # 000000002a600000 - 000000002a7fffff
.quad 0x000000002a800087 # 000000002a800000 - 000000002a9fffff
.quad 0x000000002aa00087 # 000000002aa00000 - 000000002abfffff
.quad 0x000000002ac00087 # 000000002ac00000 - 000000002adfffff
.quad 0x000000002ae00087 # 000000002ae00000 - 000000002affffff
.quad 0x000000002b000087 # 000000002b000000 - 000000002b1fffff
.quad 0x000000002b200087 # 000000002b200000 - 000000002b3fffff
.quad 0x000000002b400087 # 000000002b400000 - 000000002b5fffff
.quad 0x000000002b600087 # 000000002b600000 - 000000002b7fffff
.quad 0x000000002b800087 # 000000002b800000 - 000000002b9fffff
.quad 0x000000002ba00087 # 000000002ba00000 - 000000002bbfffff
.quad 0x000000002bc00087 # 000000002bc00000 - 000000002bdfffff
.quad 0x000000002be00087 # 000000002be00000 - 000000002bffffff
.quad 0x000000002c000087 # 000000002c000000 - 000000002c1fffff
.quad 0x000000002c200087 # 000000002c200000 - 000000002c3fffff
.quad 0x000000002c400087 # 000000002c400000 - 000000002c5fffff
.quad 0x000000002c600087 # 000000002c600000 - 000000002c7fffff
.quad 0x000000002c800087 # 000000002c800000 - 000000002c9fffff
.quad 0x000000002ca00087 # 000000002ca00000 - 000000002cbfffff
.quad 0x000000002cc00087 # 000000002cc00000 - 000000002cdfffff
.quad 0x000000002ce00087 # 000000002ce00000 - 000000002cffffff
.quad 0x000000002d000087 # 000000002d000000 - 000000002d1fffff
.quad 0x000000002d200087 # 000000002d200000 - 000000002d3fffff
.quad 0x000000002d400087 # 000000002d400000 - 000000002d5fffff
.quad 0x000000002d600087 # 000000002d600000 - 000000002d7fffff
.quad 0x000000002d800087 # 000000002d800000 - 000000002d9fffff
.quad 0x000000002da00087 # 000000002da00000 - 000000002dbfffff
.quad 0x000000002dc00087 # 000000002dc00000 - 000000002ddfffff
.quad 0x000000002de00087 # 000000002de00000 - 000000002dffffff
.quad 0x000000002e000087 # 000000002e000000 - 000000002e1fffff
.quad 0x000000002e200087 # 000000002e200000 - 000000002e3fffff
.quad 0x000000002e400087 # 000000002e400000 - 000000002e5fffff
.quad 0x000000002e600087 # 000000002e600000 - 000000002e7fffff
.quad 0x000000002e800087 # 000000002e800000 - 000000002e9fffff
.quad 0x000000002ea00087 # 000000002ea00000 - 000000002ebfffff
.quad 0x000000002ec00087 # 000000002ec00000 - 000000002edfffff
.quad 0x000000002ee00087 # 000000002ee00000 - 000000002effffff
.quad 0x000000002f000087 # 000000002f000000 - 000000002f1fffff
.quad 0x000000002f200087 # 000000002f200000 - 000000002f3fffff
.quad 0x000000002f400087 # 000000002f400000 - 000000002f5fffff
.quad 0x000000002f600087 # 000000002f600000 - 000000002f7fffff
.quad 0x000000002f800087 # 000000002f800000 - 000000002f9fffff
.quad 0x000000002fa00087 # 000000002fa00000 - 000000002fbfffff
.quad 0x000000002fc00087 # 000000002fc00000 - 000000002fdfffff
.quad 0x000000002fe00087 # 000000002fe00000 - 000000002fffffff
.quad 0x0000000030000087 # 0000000030000000 - 00000000301fffff
.quad 0x0000000030200087 # 0000000030200000 - 00000000303fffff
.quad 0x0000000030400087 # 0000000030400000 - 00000000305fffff
.quad 0x0000000030600087 # 0000000030600000 - 00000000307fffff
.quad 0x0000000030800087 # 0000000030800000 - 00000000309fffff
.quad 0x0000000030a00087 # 0000000030a00000 - 0000000030bfffff
.quad 0x0000000030c00087 # 0000000030c00000 - 0000000030dfffff
.quad 0x0000000030e00087 # 0000000030e00000 - 0000000030ffffff
.quad 0x0000000031000087 # 0000000031000000 - 00000000311fffff
.quad 0x0000000031200087 # 0000000031200000 - 00000000313fffff
.quad 0x0000000031400087 # 0000000031400000 - 00000000315fffff
.quad 0x0000000031600087 # 0000000031600000 - 00000000317fffff
.quad 0x0000000031800087 # 0000000031800000 - 00000000319fffff
.quad 0x0000000031a00087 # 0000000031a00000 - 0000000031bfffff
.quad 0x0000000031c00087 # 0000000031c00000 - 0000000031dfffff
.quad 0x0000000031e00087 # 0000000031e00000 - 0000000031ffffff
.quad 0x0000000032000087 # 0000000032000000 - 00000000321fffff
.quad 0x0000000032200087 # 0000000032200000 - 00000000323fffff
.quad 0x0000000032400087 # 0000000032400000 - 00000000325fffff
.quad 0x0000000032600087 # 0000000032600000 - 00000000327fffff
.quad 0x0000000032800087 # 0000000032800000 - 00000000329fffff
.quad 0x0000000032a00087 # 0000000032a00000 - 0000000032bfffff
.quad 0x0000000032c00087 # 0000000032c00000 - 0000000032dfffff
.quad 0x0000000032e00087 # 0000000032e00000 - 0000000032ffffff
.quad 0x0000000033000087 # 0000000033000000 - 00000000331fffff
.quad 0x0000000033200087 # 0000000033200000 - 00000000333fffff
.quad 0x0000000033400087 # 0000000033400000 - 00000000335fffff
.quad 0x0000000033600087 # 0000000033600000 - 00000000337fffff
.quad 0x0000000033800087 # 0000000033800000 - 00000000339fffff
.quad 0x0000000033a00087 # 0000000033a00000 - 0000000033bfffff
.quad 0x0000000033c00087 # 0000000033c00000 - 0000000033dfffff
.quad 0x0000000033e00087 # 0000000033e00000 - 0000000033ffffff
.quad 0x0000000034000087 # 0000000034000000 - 00000000341fffff
.quad 0x0000000034200087 # 0000000034200000 - 00000000343fffff
.quad 0x0000000034400087 # 0000000034400000 - 00000000345fffff
.quad 0x0000000034600087 # 0000000034600000 - 00000000347fffff
.quad 0x0000000034800087 # 0000000034800000 - 00000000349fffff
.quad 0x0000000034a00087 # 0000000034a00000 - 0000000034bfffff
.quad 0x0000000034c00087 # 0000000034c00000 - 0000000034dfffff
.quad 0x0000000034e00087 # 0000000034e00000 - 0000000034ffffff
.quad 0x0000000035000087 # 0000000035000000 - 00000000351fffff
.quad 0x0000000035200087 # 0000000035200000 - 00000000353fffff
.quad 0x0000000035400087 # 0000000035400000 - 00000000355fffff
.quad 0x0000000035600087 # 0000000035600000 - 00000000357fffff
.quad 0x0000000035800087 # 0000000035800000 - 00000000359fffff
.quad 0x0000000035a00087 # 0000000035a00000 - 0000000035bfffff
.quad 0x0000000035c00087 # 0000000035c00000 - 0000000035dfffff
.quad 0x0000000035e00087 # 0000000035e00000 - 0000000035ffffff
.quad 0x0000000036000087 # 0000000036000000 - 00000000361fffff
.quad 0x0000000036200087 # 0000000036200000 - 00000000363fffff
.quad 0x0000000036400087 # 0000000036400000 - 00000000365fffff
.quad 0x0000000036600087 # 0000000036600000 - 00000000367fffff
.quad 0x0000000036800087 # 0000000036800000 - 00000000369fffff
.quad 0x0000000036a00087 # 0000000036a00000 - 0000000036bfffff
.quad 0x0000000036c00087 # 0000000036c00000 - 0000000036dfffff
.quad 0x0000000036e00087 # 0000000036e00000 - 0000000036ffffff
.quad 0x0000000037000087 # 0000000037000000 - 00000000371fffff
.quad 0x0000000037200087 # 0000000037200000 - 00000000373fffff
.quad 0x0000000037400087 # 0000000037400000 - 00000000375fffff
.quad 0x0000000037600087 # 0000000037600000 - 00000000377fffff
.quad 0x0000000037800087 # 0000000037800000 - 00000000379fffff
.quad 0x0000000037a00087 # 0000000037a00000 - 0000000037bfffff
.quad 0x0000000037c00087 # 0000000037c00000 - 0000000037dfffff
.quad 0x0000000037e00087 # 0000000037e00000 - 0000000037ffffff
.quad 0x0000000038000087 # 0000000038000000 - 00000000381fffff
.quad 0x0000000038200087 # 0000000038200000 - 00000000383fffff
.quad 0x0000000038400087 # 0000000038400000 - 00000000385fffff
.quad 0x0000000038600087 # 0000000038600000 - 00000000387fffff
.quad 0x0000000038800087 # 0000000038800000 - 00000000389fffff
.quad 0x0000000038a00087 # 0000000038a00000 - 0000000038bfffff
.quad 0x0000000038c00087 # 0000000038c00000 - 0000000038dfffff
.quad 0x0000000038e00087 # 0000000038e00000 - 0000000038ffffff
.quad 0x0000000039000087 # 0000000039000000 - 00000000391fffff
.quad 0x0000000039200087 # 0000000039200000 - 00000000393fffff
.quad 0x0000000039400087 # 0000000039400000 - 00000000395fffff
.quad 0x0000000039600087 # 0000000039600000 - 00000000397fffff
.quad 0x0000000039800087 # 0000000039800000 - 00000000399fffff
.quad 0x0000000039a00087 # 0000000039a00000 - 0000000039bfffff
.quad 0x0000000039c00087 # 0000000039c00000 - 0000000039dfffff
.quad 0x0000000039e00087 # 0000000039e00000 - 0000000039ffffff
.quad 0x000000003a000087 # 000000003a000000 - 000000003a1fffff
.quad 0x000000003a200087 # 000000003a200000 - 000000003a3fffff
.quad 0x000000003a400087 # 000000003a400000 - 000000003a5fffff
.quad 0x000000003a600087 # 000000003a600000 - 000000003a7fffff
.quad 0x000000003a800087 # 000000003a800000 - 000000003a9fffff
.quad 0x000000003aa00087 # 000000003aa00000 - 000000003abfffff
.quad 0x000000003ac00087 # 000000003ac00000 - 000000003adfffff
.quad 0x000000003ae00087 # 000000003ae00000 - 000000003affffff
.quad 0x000000003b000087 # 000000003b000000 - 000000003b1fffff
.quad 0x000000003b200087 # 000000003b200000 - 000000003b3fffff
.quad 0x000000003b400087 # 000000003b400000 - 000000003b5fffff
.quad 0x000000003b600087 # 000000003b600000 - 000000003b7fffff
.quad 0x000000003b800087 # 000000003b800000 - 000000003b9fffff
.quad 0x000000003ba00087 # 000000003ba00000 - 000000003bbfffff
.quad 0x000000003bc00087 # 000000003bc00000 - 000000003bdfffff
.quad 0x000000003be00087 # 000000003be00000 - 000000003bffffff
.quad 0x000000003c000087 # 000000003c000000 - 000000003c1fffff
.quad 0x000000003c200087 # 000000003c200000 - 000000003c3fffff
.quad 0x000000003c400087 # 000000003c400000 - 000000003c5fffff
.quad 0x000000003c600087 # 000000003c600000 - 000000003c7fffff
.quad 0x000000003c800087 # 000000003c800000 - 000000003c9fffff
.quad 0x000000003ca00087 # 000000003ca00000 - 000000003cbfffff
.quad 0x000000003cc00087 # 000000003cc00000 - 000000003cdfffff
.quad 0x000000003ce00087 # 000000003ce00000 - 000000003cffffff
.quad 0x000000003d000087 # 000000003d000000 - 000000003d1fffff
.quad 0x000000003d200087 # 000000003d200000 - 000000003d3fffff
.quad 0x000000003d400087 # 000000003d400000 - 000000003d5fffff
.quad 0x000000003d600087 # 000000003d600000 - 000000003d7fffff
.quad 0x000000003d800087 # 000000003d800000 - 000000003d9fffff
.quad 0x000000003da00087 # 000000003da00000 - 000000003dbfffff
.quad 0x000000003dc00087 # 000000003dc00000 - 000000003ddfffff
.quad 0x000000003de00087 # 000000003de00000 - 000000003dffffff
.quad 0x000000003e000087 # 000000003e000000 - 000000003e1fffff
.quad 0x000000003e200087 # 000000003e200000 - 000000003e3fffff
.quad 0x000000003e400087 # 000000003e400000 - 000000003e5fffff
.quad 0x000000003e600087 # 000000003e600000 - 000000003e7fffff
.quad 0x000000003e800087 # 000000003e800000 - 000000003e9fffff
.quad 0x000000003ea00087 # 000000003ea00000 - 000000003ebfffff
.quad 0x000000003ec00087 # 000000003ec00000 - 000000003edfffff
.quad 0x000000003ee00087 # 000000003ee00000 - 000000003effffff
.quad 0x000000003f000087 # 000000003f000000 - 000000003f1fffff
.quad 0x000000003f200087 # 000000003f200000 - 000000003f3fffff
.quad 0x000000003f400087 # 000000003f400000 - 000000003f5fffff
.quad 0x000000003f600087 # 000000003f600000 - 000000003f7fffff
.quad 0x000000003f800087 # 000000003f800000 - 000000003f9fffff
.quad 0x000000003fa00087 # 000000003fa00000 - 000000003fbfffff
.quad 0x000000003fc00087 # 000000003fc00000 - 000000003fdfffff
.quad 0x000000003fe00087 # 000000003fe00000 - 000000003fffffff

.balign 4096,0
identitypdp:
.quad identitydir - identitydir

.balign 4096,0
percpudir:
.quad 0x0000000000000087 # 0000000000000000 - 00000000001fffff

.balign 4096,0
percpupdp:
.quad percpudir - identitydir

# The pdpe and the pml4e will be modified at runtime to include
# the necessary flags.

.balign 4096,0
identitypml4:
.quad identitypdp - identitydir

.space 0x800
percpupml4: # Offset in PML4 = 0x808
.quad percpupdp - identitydir
.space 0x20
identitypml4e2: # Offset in PML4 = 0x830
.quad identitypdp - identitydir
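
For anyone reading the tables above: each page-directory entry is the 2 MiB-aligned physical address ORed with 0x87 (present, writable, user, page-size), and the byte offset of an entry inside the PML4 decides which 512 GiB slice of virtual address space it covers. Here is a minimal C sketch of both calculations. The function and macro names are mine and are not part of the posted source; the flag values are the architectural x86-64 ones.

```c
#include <assert.h>
#include <stdint.h>

/* x86-64 paging flag bits (architectural values). */
#define PTE_PRESENT  0x001ull
#define PTE_WRITABLE 0x002ull
#define PTE_USER     0x004ull
#define PTE_PS       0x080ull   /* in a page-directory entry: 2 MiB page */

#define LARGE_PAGE_SIZE 0x200000ull

/* Hypothetical helper: the PDE that identity-maps the idx-th 2 MiB page. */
static uint64_t identity_pde(unsigned idx)
{
    assert(idx < 512);          /* one page directory holds 512 entries */
    return (uint64_t)idx * LARGE_PAGE_SIZE
         | PTE_PRESENT | PTE_WRITABLE | PTE_USER | PTE_PS;
}

/* Hypothetical helper: canonical virtual base address covered by the
 * PML4 entry located at byte offset `off` inside the PML4 page. */
static uint64_t pml4_slot_base(unsigned off)
{
    assert(off % 8 == 0 && off < 4096);
    uint64_t va = (uint64_t)(off / 8) << 39;  /* each entry spans 512 GiB */
    if (va & (1ull << 47))                    /* sign-extend bit 47 */
        va |= 0xffff000000000000ull;
    return va;
}
```

With these, identity_pde(0x45) gives 0x0000000008a00087 (the entry for 08a00000 - 08bfffff), and pml4_slot_base(0x830) gives 0xffff830000000000. That would agree with the 0xffff830000100000 load address mentioned in the setup64.S header comment if PHYSMAP_BASE is 0xffff830000000000, which is an assumption since its definition is not shown here.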


setup64.S:
#include "dvmm_constants.h"
#include "core/asm_offsets.h"

/**
* \fn void ::dvmmstart(uint32 e820space, uint32 bspflag, uint32 rmseg)
* \brief The 64-bit code entry point
* \param e820space The number of bytes of e820 space remaining. Passed
* in the <b>%eax</b> register.
* \param bspflag Set to zero to indicate that the Bootstrap Processor
* (BSP) or one to indicate that an Application Processor (AP) has
* invoked this function. Passed in the <b>%ebx</b> register.
* \param rmseg The real-mode segment selector used by the second-stage
* bootstrap program. Passed in the <b>%esi</b> register.
* \remark This function never returns.
*
* This code is the long mode 64-bit code entry point. It is
* loaded by the PXE boot loader (or 3leaf floppy boot loader) to
* physical address 0x100000, virtual address 0xffff830000100000.
* It is responsible for setting up the 64 bit GDT and IDT and then
* invoking the C++ code. It executes based at 0x100000 until the
* call into ::dvmm_bsp_start, from whence we're running at 0xffff83...
*
* \note Note that this code is linked into the main hypervisor ELF image.
*
*/

SS_CODE=0x10
SS_DATA=0x18

.code64
.section inittext,"xa",@progbits
# .text

# On entry to dvmmstart:
# %eax contains the amount of e820 buffer space remaining
# %ebx contains 0 for the bootstrap processor (BSP), and 1 for application
# processors (AP).
# %esi contains the real-mode segment selector for the secondary bootstrap
#
.global dvmmstart
dvmmstart:
#
# Get processor into known state.
#

cld
cli

movl %esi, %r15d # Save real-mode segment base
xorq %rsi, %rsi # EAX contains 'memsize' initialized
movl %eax, %esi # before calling setup64.s
movabsq $PHYSMAP_BASE, %r12 # Base DVMM virtual address

#
# 64bit RSP points to the stack's linear address.
#
leaq init_stack_top(%rip), %rsp
addq %r12, %rsp # Rebase the stack

#
# Fill in the GDTR.
#
leaq gdt64(%rip), %rax # Load the address of the gdt
leaq gdtdesc64(%rip), %rcx # Load the address of the gdt descriptor
addq $2, %rcx # &gdt goes 2 bytes into the gdt descriptor
movq %r12, (%rcx) # Start with PHYSMAP_BASE
addq %rax, (%rcx) # offset to gdt
lgdtq gdtdesc64(%rip) # Load the GDT

#
# Load up the new selectors.
#

movl $SS_DATA, %eax
movl %eax, %ds
movl %eax, %ss
movl %eax, %es
movl %eax, %fs
movl %eax, %gs

#
# Fill in the IDTR.
#
leaq idt64(%rip), %rax # Load the address of the idt
leaq idtdesc64(%rip), %rcx # Load the address of the idt descriptor
addq $2, %rcx # &idt goes 2 bytes into the idt descriptor
movq %r12, (%rcx) # Start with PHYSMAP_BASE
addq %rax, (%rcx) # add offset to &idt

#
# Fill in the IDT.
#
testl $1, %ebx # BSP or AP?
jnz 2f # AP, just load IDT

movabsq $handlerlist, %r11 # List of interrupt handlers
movl $((idt64_end - idt64) / 16), %ecx # Number of entries to copy
1:
movq (%r11), %rdx # Get handler address
addq $8, %r11 # Skip to next
movw %dx, (%rax) # Set the low 16 bits of the target offset
movw $SS_CODE, 2(%rax) # Set the target CS
movw $0x8e00, 4(%rax) # Set the gate flags
shrq $16, %rdx # Shift handler address down by 16
movw %dx, 6(%rax) # Set bits 31-16 of the target offset
shrq $16, %rdx # Shift handler address down by 16 more
movl %edx, 8(%rax) # Set bits 63-32 of the target offset

addq $16, %rax # Next IDT64 entry
decw %cx # Down by one
testw $0xffff, %cx # Is it zero yet?
jnz 1b # No, loop around

2:
lidt idtdesc64(%rip) # Load the IDT

#
# Clear BSS
#
testl $1, %ebx # BSP or AP?
jnz 1f # AP, skip clear BSS & constructors

xorq %rax, %rax
leaq _sbss(%rip), %rdi
movq $0, %rcx
leaq _ebss(%rip), %rcx
subq %rdi, %rcx
rep stosb

pushq %rsi # Save RSI, since we'll need it later.
#
# Invoke the debugger early initialization function
#
movabsq $debugger, %rdi # Set this
movabsq $_ZN10c_debugger10early_initEv, %rcx # Call ::early_init
call *%rcx

#
# Invoke global C++ constructors.
#
movabsq $__call_constructors, %rcx
call *%rcx

popq %rsi # Restore RSI.
#
# Invoke C++ code. Pass begin and end address of memory map.
#
1:
movq $512, %rax # Starting with 512 bytes
subq %rsi, %rax # Subtract remaining
addq $0x90000, %rax # e820 data map address

# Use x86_64 calling conventions
movq $0x90000, %rdi # Pass arg1 to main
addq %r12, %rdi # Rebase arg1

movq %rax, %rsi # Pass arg2 to main
addq %r12, %rsi # Rebase arg2
movq %r15, %rdx # Arg3 is real-mode segment base
xorq %rbp, %rbp # Indicate end of stack for -fframe-pointer

movabsq $dvmm_bsp_start, %rcx # We use an indirect call to invoke main
testl $1, %ebx # BSP or AP?
jz 4f # BSP, use 'dvmm_bsp_start'

movabsq $dvmm_ap_start, %rcx # AP, use 'dvmm_ap_start'
4: call *%rcx # because it is more than 2GB away from here.

#
# Should never return.
#
1:
hlt
jmp 1b

#
# Global Descriptor Table
#
.align 16,0
gdt64:
.long 0, 0 # Entry 0 (NULL segment descriptor)
.long 0, 0 # Entry 1 (reserved)
.long 0x0000ffff, 0x00AF9A00 # Code read/exec long mode
.long 0x0000ffff, 0x00CF9200 # Data read/write.
gdt64_end:

gdtdesc64: # 64 bit GDT Descriptor
.word gdt64_end-gdt64-1 # limit
.quad PHYSMAP_BASE # 64 bit GDT base (overwritten above)

#
# Interrupt Descriptor Table
#

/* These are overwritten above in dvmmstart. */
#define IGATE .quad 0x0, 0x0

.align 16,0
idt64:
IGATE #DE
IGATE #DB
IGATE #NMI
IGATE #BP
IGATE #OF
IGATE #BR
IGATE #UD
IGATE #NM
IGATE #DF
IGATE #Reserved
IGATE #TS
IGATE #NP
IGATE #SS
IGATE #GP
IGATE #PF
IGATE #Reserved (vector 15)
IGATE #MF
IGATE #AC
IGATE #MC
IGATE #XF
idt64_end:

idtdesc64:
.word idt64_end-idt64-1 # limit
.quad PHYSMAP_BASE # 64 bit IDT base (overwritten above)

handlerlist:
.quad fault_divide
.quad trap_debug
.quad fault_nmi
.quad trap_breakpoint
.quad trap_overflow
.quad fault_bounds
.quad fault_ud
.quad fault_nm
.quad abort_df
.quad fault_invalid9
.quad fault_invtss
.quad fault_notpresent
.quad fault_stack
.quad fault_gp
.quad fault_page
.quad fault_invalid15
.quad fault_fpe
.quad fault_alignment
.quad machine_check
.quad fault_simd
.quad fault_invalid20
.quad fault_invalid21
.quad fault_invalid22
.quad fault_invalid23
.quad fault_invalid24
.quad fault_invalid25
.quad fault_invalid26
.quad fault_invalid27
.quad fault_invalid28
.quad fault_invalid29
.quad fault_security
.quad fault_invalid31

.align 4096,0
/*
* Application processor initialization code. This needs to be aligned on a
* page boundary, as we use the LAPIC startup interprocessor interrupt (IPI)
* to start the application processors (AP) from the bootstrap processor (BSP),
* and the startup IPI requires a page-aligned address within the first
* megabyte of DRAM as a start-vector.
*/
.global dvmm_ap_init
.global dvmm_ap_base
.code16
dvmm_ap_init:
jmp 1f
dvmm_ap_base:
.word 0x0000
1: wbinvd
mov %cs, %ax
mov %ax, %ds
cli

mov %cs:(0x2), %ax
mov %ax, %es
sub $0x20, %ax
/*
* We only need a few bytes (4, to be exact) on the stack used prior
* to invoking dvmmstart in long-mode. Use the bottom portion of the
* segment starting at $3000:$0, the first portion of the 512 bytes at
* that segment is only used by the bootstrap processor to hold the
* BIOS E820 table.
*/
mov %ax, %ss
mov $0x1f0, %sp

mov %es, %ax
mov %ax, %cs:(2f-dvmm_ap_init)

movw $1, %es:(0x20) # Mark AP startup
.byte 0xea # Long Jump
.word 0x0 # Offset from seg base == 0
2: .word 0x0 # Real-mode segment base

.global dvmm_ap_init_end
dvmm_ap_init_end:
.long 0

.global init_stack
.global init_stack_top
.align STACK_SIZE_ASM,0
init_stack:
.space STACK_SIZE_ASM, 0
init_stack_top:

/* vim: sw=4 sts=4 sta ts=8:
*/

Alexei A. Frounze

Dec 6, 2022, 11:24:14 PM
On Tuesday, December 6, 2022 at 11:41:23 AM UTC-8, אורי ויסבלום wrote:
> Now, here is the new code I've made (with a basic kmalloc, without free because I don't need it for now). The same bug happens. Any other ideas?
> BTW, I've checked all of the or expressions and all of the allocPage() outputs, they're as intended.
>
>
> #define PDT_SIZE 1024
> #define KERNEL_START 0x1000 // a constant in my linker code
> #define PAGE_SIZE 0x1000
> #define KERNEL_SIZE 3 // the size of the kernel code is 10KB for now (I've checked it), so I gave it 3 pages * 4KB = 12KB
> #define KERNEL_END (KERNEL_START + PAGE_SIZE*KERNEL_SIZE)
>
> PTEntry* kernelPTAddr = 0;
> uint32_t firstFree = 0;
>
> uint32_t allocPage() {
> if (firstFree == 0)
> firstFree = KERNEL_END;
> firstFree += PAGE_SIZE;
> return firstFree - PAGE_SIZE;
> }
>
> void initPDT() {
> PDEntry* table = allocPage();
> kernelPTAddr = allocPage();
> // Here it was with the struct of PD earlier, but I've realized it doesn't work correctly, it doesn't take into account the size of P*Entry, so I'm using pointers now.
> *table = READWRITE | PRESENT | (uint32_t)kernelPTAddr;
>
> for (int i = 1; i <= KERNEL_SIZE; i++) {
> *(kernelPTAddr+i*sizeof(PTEntry)) = PRESENT | READWRITE | DIRTY | KERNEL_START + (i-1)*PAGE_SIZE;

So, when you add a number to a pointer to type T, how many bytes does the pointer advance?

Also, why do you use i and (i-1) in this statement? Shouldn't they be the same?

> }
>
> // last entry points to the PDT itself
> *(table+(PDT_SIZE-1)*sizeof(PDEntry)) = READWRITE | PRESENT | (uint32_t)table;

Same question w.r.t. pointer arithmetic.

...
> typedef uint32_t PDEntry;
> typedef uint32_t PTEntry;

Alex

אורי ויסבלום

Dec 7, 2022, 1:47:41 AM
Scott - I'll read your code and see what the difference is with mine, though it's a long assembly code that moves to *long mode*, and I'm making a 32-bit system.

Alex:
>So, when you add a number to a pointer to type T, how many bytes does the pointer advance?
Ok, I've tried again and now it works as it should, advances with sizeof(T).

>Also, why do you use i and (i-1) in this statement? Shouldn't they be the same?
The loop there is for making the kernel's virtual and physical addresses identical, so after activating the paging bit in CR0 it'll still go back to the line it was supposed to go to.
How does it work? The kernel address is 0x00001000, which means the first entry in PDT and the second entry in the PT that you got.
The kernel's code is 3 pages long. The for loop works like this: put the first page of the kernel into the second entry of the PT, then keep going with the next page and the next entry.

Alexei A. Frounze

Dec 7, 2022, 2:22:36 AM
On Tuesday, December 6, 2022 at 10:47:41 PM UTC-8, אורי ויסבלום wrote:
...
> >Also, why do you use i and (i-1) in this statement? Shouldn't they be the same?
> The loop there is for making the kernel's virtual and physical addresses identical

In that case PTE[i] should "contain" i*4096, not (i-1)*4096.

Alex

Alexei A. Frounze

Dec 7, 2022, 2:24:40 AM
Never mind, I missed the KERNEL_START addend.

Is it all working now?

Alex

James Harris

Dec 7, 2022, 9:09:54 AM
On 06/12/2022 17:00, אורי ויסבלום wrote:

> First of all, thank you all for replying, I really appreciate it.
>
> James:
>> If in 32-bit mode do you have a page directory and the requisite initial
> page tables set up (or the equivalent) and do they identity-map the code
> location you are running at? Are they all marked Present and are all
> their other bits correct?
>
> Yes, I mean no instructions are happening after I set CR0. Didn't know it needs to JMP to a new line of code after it, I thought the jump at the end of the scope of the function is enough, but it makes a lot of sense I should identity-map the kernel's code into virtual mode. But I reckon it's not my only problem there.

Over the years Intel have said that after enabling paging (with MOV CR0)
some of their processors require a JMP, others require identity paging
and still others require both. It's more portable to use identity
mapping and a jump of some sort such as

mov cr0, eax
jmp .next
.next:

Your RET /might be/ good enough as the "jump" or it might not. I don't
know. It's safest just to include a JMP so as to be sure your code meets
Intel specs. (I cannot comment on AMD chips but they would likely have
the same requirements as Intel.)


--
James Harris


James Harris

Dec 7, 2022, 9:17:22 AM
On 06/12/2022 19:41, אורי ויסבלום wrote:

> Now, here is the new code I've made (with a basic kmalloc, without free because I don't need it for now). The same bug happens. Any other ideas?
> BTW, I've checked all of the or expressions and all of the allocPage() outputs, they're as intended.
>
>
> #define PDT_SIZE 1024
> #define KERNEL_START 0x1000 // a constant in my linker code
> #define PAGE_SIZE 0x1000
> #define KERNEL_SIZE 3 // the size of the kernel code is 10KB for now (I've checked it), so I gave it 3 pages * 4KB = 12KB
> #define KERNEL_END (KERNEL_START + PAGE_SIZE*KERNEL_SIZE)
>
> PTEntry* kernelPTAddr = 0;
> uint32_t firstFree = 0;
>
> uint32_t allocPage() {
> if (firstFree == 0)
> firstFree = KERNEL_END;
> firstFree += PAGE_SIZE;
> return firstFree - PAGE_SIZE;
> }
>
> void initPDT() {
> PDEntry* table = allocPage();
> kernelPTAddr = allocPage();
> // Here it was with the struct of PD earlier, but I've realized it doesn't work correctly, it doesn't take into account the size of P*Entry, so I'm using pointers now.
> *table = READWRITE | PRESENT | (uint32_t)kernelPTAddr;
>
> for (int i = 1; i <= KERNEL_SIZE; i++) {
> *(kernelPTAddr+i*sizeof(PTEntry)) = PRESENT | READWRITE | DIRTY | KERNEL_START + (i-1)*PAGE_SIZE;

I might be wrong but when you add an integer to a pointer in C doesn't
it automatically scale the integer by the size of the type of element
pointed at so

kernelPTAddr + 3

would form the address kernelPTAddr + 12 (because kernelPTAddr points to
elements of PTEntry which has size 4)?
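
A small hosted-C sketch of that scaling rule, for anyone who wants to check it outside the kernel. `PTEntry` is the typedef from the quoted code; the helper names `byte_distance` and `pte_slot` are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t PTEntry;   /* same typedef as in the quoted code */

/* Number of bytes between `base` and `base + n`. The compiler scales n by
 * sizeof(PTEntry), so the result is n * 4. */
size_t byte_distance(PTEntry *base, size_t n)
{
    return (size_t)((char *)(base + n) - (char *)base);
}

/* Address of entry i in a page table: plain indexing, no sizeof needed. */
PTEntry *pte_slot(PTEntry *table, size_t i)
{
    assert(i < 1024);        /* a 32-bit page table has 1024 entries */
    return &table[i];
}
```

So `kernelPTAddr + i*sizeof(PTEntry)` in the quoted loop would land on entry 4*i rather than entry i; `kernelPTAddr[i]` is probably the intended form.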

Do you have any kind of Protected Mode printing working? It would help
debugging if you could print out things such as the non-zero elements of
the PD and PT.

...

> typedef uint32_t PDEntry;
> typedef uint32_t PTEntry;


--
James Harris


Scott Lurndal

Dec 7, 2022, 10:11:26 AM
אורי ויסבלום <turhu...@gmail.com> writes:
>Scott - I'll read your code and see what the difference is with mine, though it's a long assembly code that moves to *long mode*, and I'm making a 32-bit system.
>

Just leave out the EFER programming, and run the
tertiary boot code in .code32 instead of .code64.


However, can you elaborate on why you don't want to use
long mode?

אורי ויסבלום

Dec 7, 2022, 12:32:19 PM
BTW can you look at my flags enum? I might have made some mistakes there.

Alex:
>Is it all working now?
I meant the pointer arithmetic part is now working as intended. The OS is still rebooting over and over again.

James:
I'll try the JMP part, just in case.

>I might be wrong but when you add an integer to a pointer in C ...
Yup, Alex mentioned it earlier, I've fixed it in my code, but didn't send it all over again.

>Do you have any kind of Protected Mode printing working? It would help debugging if you could print out things such as the non-zero elements of the PD and PT.
Yes, I printed those and they seem fine, but judge for yourself.
PDT[0] = 0x5003
PDT[0][1] = 0x1043
PDT[0][2] = 0x2043
PDT[0][3] = 0x3043
PDT[-1] = 0x4003
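
For reference, a sketch of how such values split into a frame address and flag bits, assuming 32-bit non-PAE paging with 4KB pages. The helper names are hypothetical; the flag values follow the enum posted at the start of the thread, plus DIRTY as bit 6 (0x40), which is not part of that enum.

```c
#include <assert.h>
#include <stdint.h>

enum {
    PRESENT   = 0x01,
    READWRITE = 0x02,
    USER      = 0x04,
    DIRTY     = 0x40    /* bit 6; assumed, not in the posted enum */
};

/* Upper 20 bits: physical address of the 4KB frame (or of the page table). */
uint32_t entry_frame(uint32_t entry) { return entry & 0xFFFFF000u; }

/* Lower 12 bits: control flags. */
uint32_t entry_flags(uint32_t entry) { return entry & 0x00000FFFu; }

/* Build an entry from a 4KB-aligned physical address and flag bits. */
uint32_t make_entry(uint32_t frame, uint32_t flags)
{
    assert((frame & 0xFFFu) == 0 && (flags & ~0xFFFu) == 0);
    return frame | flags;
}
```

Read this way, 0x5003 is a present, writable pointer to a page table at physical 0x5000, and 0x1043, 0x2043, 0x3043 map virtual 0x1000-0x3FFF onto the same physical addresses. If those are the only non-zero entries, nothing outside that 12KB range is mapped.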

Scott:
Sorry, I'm not familiar with EFER programming and code32 and 64.
Yes, this is not intended to be an OS for wide use. It's the final project for a national cyber program I'm in. Its purpose is for me to learn about operating systems, not to make the best, most optimized one.
That's why I don't want to use long mode: I'm trying to focus on the basics and less on the things around them. It may not be very complicated to switch, but we've decided to work on it this way.

anti...@math.uni.wroc.pl

Dec 7, 2022, 2:57:25 PM
אורי ויסבלום <turhu...@gmail.com> wrote:
> Scott - I'll read your code and see what the difference is with mine, though it's a long assembly code that moves to *long mode*, and I'm making a 32-bit system.

If you want simple 32-bit example you may look at:

http://www.math.uni.wroc.pl/~p-wyk4/so/pr/

This is a toy OS (about 2000 lines total) that boots from floppy and
switches to 32-bit protected mode with paging. Actual switching
is done in:

http://www.math.uni.wroc.pl/~p-wyk4/so/pr/boot-0.0/botsys.S

Line 48 switches to protected (16-bit) mode. Long jump in line 56
loads descriptor for 32-bit code segment, so from this point
execution is 32-bit. Line 70 turns on paging. Jump in line 75
transfers execution to virtual address. Page tables were filled
with initial values by C code in routine 'inicjuj_stronnicowanie',
that is first routine in

http://www.math.uni.wroc.pl/~p-wyk4/so/pr/boot-0.0/stronnicowanie.c

'botsys.S' has some complications because in 1995 (when this code
was written) GNU assembler only supported 32-bit code. First part
is executed in 16-bit mode, so I was careful to use instructions
with encodings which executed correctly in 16-bit mode and used
'.byte 0x66' to insert address override.

Extra thing, line 65 in botsys.S is first place where
C code has any chance of working and in this line I call
routine to put initial values in page tables. The called
routine only uses relative address, C code was linked to
execute in high memory, starting from 0xc0011000, so C code
using absolute address would fail before we have activated
paging and switched to addresses expected by C. So line 77
in botsys.S is first place where general C code can run,
and from that point interesting things are done by C code.
Anyway, code in botsys.S after line 77 is related to design
of OS but irrelevant from point of view of switching to
protected mode.

Concerning debugging, I used direct writes to screen memory
to get feedback how far my code got. There are rather short
sequences where instructions must be "as is" and one can not
add a printout.
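
A minimal sketch of that kind of progress marker, assuming the standard VGA color text buffer at physical address 0xB8000 with 80x25 cells (an assumption about the target machine; the toy OS linked above may do it differently). The name `debug_mark` is made up. The buffer pointer is a parameter so the routine can also be exercised against an ordinary array.

```c
#include <assert.h>
#include <stdint.h>

#define VGA_TEXT_BASE 0xB8000u  /* assumed: color text mode buffer */
#define VGA_COLS 80
#define VGA_ROWS 25

/* Store one character cell: low byte is the character, high byte the
 * attribute (0x07 is light grey on black). */
void debug_mark(volatile uint16_t *screen, int row, int col, char c, uint8_t attr)
{
    assert(row >= 0 && row < VGA_ROWS && col >= 0 && col < VGA_COLS);
    screen[row * VGA_COLS + col] = (uint16_t)((uint16_t)attr << 8 | (uint8_t)c);
}
```

In a kernel the call would look like `debug_mark((volatile uint16_t *)VGA_TEXT_BASE, 0, 0, 'A', 0x07)`, with the assert replaced by whatever check the kernel has. Once paging is on, that address has to be mapped too.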

--
Waldek Hebisch

Alexei A. Frounze

Dec 7, 2022, 11:02:08 PM
On Wednesday, December 7, 2022 at 9:32:19 AM UTC-8, אורי ויסבלום wrote:
> BTW can you look at my flags enum? I might have made some mistakes there.

They seem fine at first glance.

> Alex:
> >Is it all working now?
> I meant the pointer arithmetics part is now working as intended. the OS is still rebooting over and over again

What addresses are in GDTR, GDT, IDTR, IDT, ESP and, finally, TSS if any?
The segment, interrupt and task tables must be accessible by the CPU as well.
Not to mention the stack, obviously.

Another thing I'd want to make sure is that the tables are all set up
before CR0.PG is set to 1.
I'd add a volatile attribute on those asm statements in startVirtualMode()
and also include "memory" in the clobber lists of those asm's.
And I'd check the disassembly of the result.

The C compiler doesn't know that those asm statements are relevant
to memory accesses and it can rearrange some things, e.g. generate
code to set CR0.PG before the tables are fully prepared.
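
A sketch of what that could look like with GCC-style inline assembly, assuming a 32-bit build (hence the `__i386__` guard). The short `jmp 1f` is the optional jump discussed elsewhere in this thread. `CR0_PG`, `CR0_PE` and `cr0_with_paging` are names invented here, and the assert is a hosted-build convenience that a kernel would replace with its own check.

```c
#include <assert.h>
#include <stdint.h>

#define CR0_PE 0x00000001u   /* CR0 bit 0: protected mode enable */
#define CR0_PG 0x80000000u   /* CR0 bit 31: paging enable */

/* Pure helper: the CR0 value to write back. */
uint32_t cr0_with_paging(uint32_t cr0)
{
    assert(cr0 & CR0_PE);    /* setting PG with PE clear raises #GP */
    return cr0 | CR0_PG;
}

#if defined(__i386__)
/* Ring 0 only. `volatile` stops the compiler from deleting the statements;
 * the "memory" clobber makes each one a compiler barrier, so the stores that
 * fill the tables are emitted before CR3 and CR0 are written. */
void startVirtualMode(uint32_t pd_phys)
{
    uint32_t cr0;
    __asm__ volatile("mov %0, %%cr3" : : "r"(pd_phys) : "memory");
    __asm__ volatile("mov %%cr0, %0" : "=r"(cr0) : : "memory");
    cr0 = cr0_with_paging(cr0);
    __asm__ volatile("mov %0, %%cr0\n\t"
                     "jmp 1f\n"
                     "1:" : : "r"(cr0) : "memory");
}
#endif
```

Checking the disassembly is still worthwhile; the clobber only constrains the compiler, it does not make the tables themselves correct.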

Alex

Alexei A. Frounze

Dec 8, 2022, 1:48:32 AM
On Wednesday, December 7, 2022 at 8:02:08 PM UTC-8, Alexei A. Frounze wrote:
> On Wednesday, December 7, 2022 at 9:32:19 AM UTC-8, אורי ויסבלום wrote:
> > BTW can you look at my flags enum? I might have made some mistakes there.
> They seem fine at first glance.
> > Alex:
> > >Is it all working now?
> > I meant the pointer arithmetics part is now working as intended. the OS is still rebooting over and over again
> What addresses are in GDTR, GDT, IDTR, IDT, ESP and, finally, TSS if any?
> The segment, interrupt and task tables must be accessible by the CPU as well.
> Not to mention the stack, obviously.

And if you're writing anything to the screen buffer, you may want to have
a mapping for that too.

Alex

Dan Cross

Dec 13, 2022, 5:31:58 PM
In article <34818e6d-23a5-4c02...@googlegroups.com>,
אורי ויסבלום <turhu...@gmail.com> wrote:
>First of all, thank you all for replying, I really appreciate it.
>
>James:
>> If in 32-bit mode do you have a page directory and the requisite initial
>page tables set up (or the equivalent) and do they identity-map the code
>location you are running at? Are they all marked Present and are all
>their other bits correct?
>
>Yes, I mean no instructions are happening after I set CR0. Didn't know it needs to JMP to a new line of code after it, I thought the jump at
>the end of the scope of the function is enough, but it makes a lot of sense I should identity-map the kernel's code into virtual mode. But I
>reckon it's not my only problem there.

This is for x86, but the same principle applies generally: once
you turn on paging by setting the PG bit in %cr0, the next
instruction must necessarily come from an address that is mapped
in the address space described by the page tables that you are
pointing to in %cr3. There is no jumping around permitted; the
next instruction is either mapped or you get a page fault.

Perhaps you were thinking of a long-jump between segments?
Once you're turning on 32-bit paging, that's not generally at
play anymore.

Further, the IDT and page fault handler must also be properly
mapped.

This implies that, at the moment you turn on paging, you must
have an identity mapping for the code that is doing the enabling
logic. Once you've set that up, you can jump pretty much
anywhere that's mapped and executable, and get rid of the
identity mapping, etc.

>Yes, I'm in 32-bit mode. No, I don't have an initial page tables setup, but if I'm not mistaken, that shouldn't cause the whole system reboot
>thing, until I'm trying to do a memory-based operation, and then it should call for a page fault.

Fetching the next instruction is a memory-based operation. :-)
So is jumping to a page fault handler. So is jumping to the
double fault handler when your page fault handler itself faults.
The last resort is a triple-fault, which is almost certainly
what the CPU is doing.

>Although now I realize that if I'm saying the function's ending is equivalent to a jump, it means I am trying to execute a memory-based
>operation, so yeah, it might be it.

It could be _any_ instruction; it's the fetch that's killing
your program.

>Yes, the handler for page fault is correct, the code just doesn't get to it. I loaded CR3, as you can see in the code.
>
>I believe for GDT specifically the CPU is not using paging for the address, but it does for JMP operations.

The GDTR stores a linear address; similarly with other CPU
resident tables such as the IDTR, LDTR, etc. The TSS must also
be in the linear address space. That means that they must all
be in the virtual address space you are creating when you enable
paging, and that the various CPU registers are loaded with the
virtual address of the table structures.

The exception is %cr3, which must hold the _physical_ address
of the page table root. Similarly, the PDT entries must be
filled with the physical addresses of the PTs, which must also
hold the physical addresses of the pages they map.
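
One way to act on this is a pre-flight check run before setting CR0.PG: walk the tables in software and confirm that every region the CPU will touch is identity-mapped. The sketch below assumes 32-bit non-PAE paging with 4KiB pages and, for simplicity, a single page table behind PDE 0 (so only the first 4MiB). `identity_mapped` is a hypothetical helper, not something from the OP's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PRESENT   0x1u
#define PAGE_SIZE 0x1000u

/* True if every page in [start, start+len) is present and maps to itself.
 * `pd` is the page directory, `pt0` the page table referenced by pd[0]. */
bool identity_mapped(const uint32_t *pd, const uint32_t *pt0,
                     uint32_t start, uint32_t len)
{
    assert(len > 0);
    uint32_t first = start & ~(PAGE_SIZE - 1);
    uint32_t last  = (start + len - 1) & ~(PAGE_SIZE - 1);
    for (uint32_t va = first; ; va += PAGE_SIZE) {
        if ((va >> 22) != 0 || !(pd[0] & PRESENT))
            return false;                 /* outside the single table */
        uint32_t pte = pt0[(va >> 12) & 0x3FFu];
        if (!(pte & PRESENT) || (pte & ~0xFFFu) != va)
            return false;                 /* missing or not 1:1 */
        if (va == last)
            return true;
    }
}
```

Fed with the values printed earlier in the thread (PDT[0] = 0x5003 and PT entries 1 to 3), such a check passes for 0x1000-0x3FFF but fails for the VGA text buffer at 0xB8000. The GDT, IDT and stack ranges would need the same test.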

>I did do a page fault handler. I've tried including asm file, it is a bit easier but it's not that different.
>
>Joe, I think Waldek is right.
>
>Alex:
>>If your page tables aren't properly set up, your kernel will almost certainly
>triple-fault and reset because it itself uses page tables to access memory. When you screw up your page tables, memory accesses either won't
>work at all due to permissions or will read/write wrong memory cells, not the ones you're expecting.
>
>Yeah, but I thought it would call for a page fault, not reboot the system over and over again. I think James is right about his guess about
>not identity-mapping the kernel's code as the reason for its rebooting.

That is almost certainly correct.

>>Are you setting up 4KB pages or 4MB pages? You can't magically have both. For 4KB pages you aren't setting up any page tables or you're not
>showing your code for this.
>4KB. You're right, I'm not setting up page tables, but again it should do a page fault IMO.

Think of it this way: how does the CPU find the fault handler?
What mode is the CPU in, and what address space is the fault
handler in when paging is enabled?

>>Also, with PDT_SIZE=1024, (i>>22) is always 0. For 4MB pages you probably want (i<<22) here.
>Oh, I forgot that I need to shift left, you're right.
>Why are you talking specifically about 4MB pages? Isn't it right for 4KB too?

No, 4KiB pages shift the PFN (Page Frame Number) 12 bits to the
left. The page offset (byte offset within the page) is 12 bits
wide, hence the 4096-byte page size (2^12 == 1 << 12 == 4096).

x86 page directories for 32-bit paging (where PDEs and PTEs are
32-bits wide) cover the full 4GiB of the 32-bit address space.
Think of the page tables as a two-level radix tree; each node in
the tree is 4KiB with 1024 (32-bit) entries. Given a 32-bit
virtual address, the index in the PD of the PDE for the corresponding PT
is the top 10 bits of the address; a single PDE thus covers 4MiB
of address space. The index in the PT selecting an actual page
of memory is the next 10 bits, and the offset in the page is the
bottom 12 bits. Note that each node in the tree is not just
4096 bytes wide, it's also aligned to a 4096-byte boundary; thus
the bottom 12 bits for PDEs and PTEs are free for control bits
(and some reserved for software use).
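
The split described above can be sketched in a few lines of C. The function names are illustrative only, and 32-bit non-PAE paging with 4KiB pages is assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Top 10 bits: index into the page directory. */
uint32_t pd_index(uint32_t vaddr) { return vaddr >> 22; }

/* Middle 10 bits: index into the page table. */
uint32_t pt_index(uint32_t vaddr) { return (vaddr >> 12) & 0x3FFu; }

/* Bottom 12 bits: byte offset within the 4KiB page. */
uint32_t page_offset(uint32_t vaddr) { return vaddr & 0xFFFu; }

/* Inverse: rebuild the address from its three fields. */
uint32_t make_vaddr(uint32_t pdi, uint32_t pti, uint32_t off)
{
    assert(pdi < 1024 && pti < 1024 && off < 4096);
    return (pdi << 22) | (pti << 12) | off;
}
```

For example, virtual address 0x00001234 uses PDE 0, PTE 1 and offset 0x234, which is why a kernel loaded at 0x1000 starts at the second entry of the first page table.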

So what you likely want is a single PDT pointing to PTs, and
however many PTs you need pointing to 4KiB pages. Again, make sure you
are filling in the PDEs and PTEs with _physical_ addresses of
whatever they point to, since the MMU works in the physical
address space. Note that you can cover the entire virtual
address space with a little over 4MiB of RAM devoted to tables.
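
As a rough sketch of that layout, assuming 32-bit non-PAE paging: the code below fills one page table so that the first 4MiB is identity-mapped and hooks it into a page directory. `pt_phys` stands for the table's physical address, which the caller has to supply; all names are illustrative and not taken from the OP's code.

```c
#include <assert.h>
#include <stdint.h>

#define ENTRIES   1024u
#define PAGE_SIZE 4096u
#define PRESENT   0x1u
#define READWRITE 0x2u

/* Identity-map virtual [0, 4MiB) through pd[0]. pt_phys is the PHYSICAL
 * address where `pt` lives; it must be 4KiB-aligned. */
void identity_map_first_4mib(uint32_t *pd, uint32_t *pt, uint32_t pt_phys)
{
    assert((pt_phys & (PAGE_SIZE - 1)) == 0);
    for (uint32_t i = 0; i < ENTRIES; i++)
        pt[i] = (i * PAGE_SIZE) | PRESENT | READWRITE;  /* page i -> frame i */
    pd[0] = pt_phys | PRESENT | READWRITE;
}

/* Bytes of tables needed to cover all 4GiB: 1024 PTs plus one PD. */
uint32_t full_table_bytes(void)
{
    return ENTRIES * PAGE_SIZE + PAGE_SIZE;   /* 4MiB + 4KiB */
}
```

Mapping the whole first 4MiB this way also covers the low-memory stack, the tables and the VGA text buffer, at the cost of making page 0 accessible; a real kernel would likely tighten that later.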

>(Also, this setup is just for placeholding, they're obviously not real page tables)

It's best to get a handle on how the hardware works first, then
get more elaborate.

- Dan C.

anti...@math.uni.wroc.pl

Dec 13, 2022, 6:36:14 PM
Dan Cross <cr...@spitfire.i.gajendra.net> wrote:
> In article <34818e6d-23a5-4c02...@googlegroups.com>,
> אורי ויסבלום <turhu...@gmail.com> wrote:
> >First of all, thank you all for replying, I really appreciate it.
> >
> >James:
> >> If in 32-bit mode do you have a page directory and the requisite initial
> >page tables set up (or the equivalent) and do they identity-map the code
> >location you are running at? Are they all marked Present and are all
> >their other bits correct?
> >
> >Yes, I mean no instructions are happening after I set CR0. Didn't know it needs to JMP to a new line of code after it, I thought the jump at
> >the end of the scope of the function is enough, but it makes a lot of sense I should identity-map the kernel's code into virtual mode. But I
> >reckon it's not my only problem there.
>
> This is for x86, but the same principle applies generally: once
> you turn on paging by setting the PG bit in %cr0, the next
> instruction must necessarily come from an address that is mapped
> in the address space described by the page tables that you are
> pointing to in %cr3. There is no jumping around permitted; the
> next instruction is either mapped or you get a page fault.
>
> Perhaps you were thinking of a long-jump between segments?
> Once you're turning on 32-bit paging, that's not generally at
> play anymore.

386 is special here: one has to jump to make sure that the processor's
view of the state of the world is consistent. And yes, page tables
have to be correctly set up with 1-1 mapping of currently
executing code.

--
Waldek Hebisch

Dan Cross

Dec 14, 2022, 8:35:07 AM
You're referring to section 10.4.4 of the 80386 Programmer's
Reference Manual? Such language is absent in the current Intel
SDM (it was dropped when the 486 came out) when describing
paging in 32-bit mode, and it is not clear that OP is
targeting an actual 80386. Certainly, adding a `jmp 1f; 1:`
isn't going to hurt, but it is not necessary on any Intel or
AMD microprocessor manufactured in the last 30 years, even in
32-bit mode.

- Dan C.

James Harris

Dec 14, 2022, 5:59:22 PM
On 14/12/2022 13:35, Dan Cross wrote:
> In article <tnb29c$19mk$1...@gioia.aioe.org>, <anti...@math.uni.wroc.pl> wrote:
>> Dan Cross <cr...@spitfire.i.gajendra.net> wrote:
>>> In article <34818e6d-23a5-4c02...@googlegroups.com>,

...

>>>> Didn't know it needs to JMP to a new line of code after it, I thought the jump at
>>>> the end of the scope of the function is enough, but it makes a lot of sense I should identity-map the kernel's code into virtual mode. But I
>>>> reckon it's not my only problem there.
>>>
>>> This is for x86, but the same principle applies generally: once
>>> you turn on paging by setting the PG bit in %cr0, the next
>>> instruction must necessarily come from an address that is mapped
>>> in the address space described by the page tables that you are
>>> pointing to in %cr3. There is no jumping around permitted; the
>>> next instruction is either mapped or you get a page fault.
>>>
>>> Perhaps you were thinking of a long-jump between segments?
>>> Once you're turning on 32-bit paging, that's not generally at
>>> play anymore.
>>
>> 386 is special here: one has to jump to make sure that the processor's
>> view of the state of the world is consistent. And yes, page tables
>> have to be correctly set up with 1-1 mapping of currently
>> executing code.
>
> You're referring to section 10.4.4 of the 80386 Programmer's
> Reference Manual? Such language is absent in the current Intel
> SDM (it was dropped when the 486 came out) when describing
> paging in 32-bit mode, and it is not clear that OP is
> targeting an actual 80386. Certainly, adding a `jmp 1f; 1:`
> isn't going to hurt, but it is not necessary on any Intel or
> AMD microprocessor manufactured in the last 30 years, even in
> 32-bit mode.

For ref, here's that section:

https://www.scs.stanford.edu/05au-cs240c/lab/i386/s10_04.htm

Certainly the 386 required identity mapping OR a jump (or both) whereas
the 486 needed both.

The Pentium Architecture and Programming Manual, Order Number 241430 says:

"The 32-bit Intel architectures have different requirements for enabling
paging and switching to protected mode. The Intel386 processor requires
following steps 1 [jump] or 2 [identity mapping] above. The Intel486
processor requires following both steps 1 and 2 above. The Pentium
processor requires only step 2 but for upwards and downwards code
compatibility with the Intel386 and Intel486 processors, it is
recommended both steps 1 and 2 be taken."


--
James Harris


Dan Cross

Dec 14, 2022, 8:12:03 PM
In article <tndkg6$2srs1$1...@dont-email.me>,
James Harris <james.h...@gmail.com> wrote:
>On 14/12/2022 13:35, Dan Cross wrote:
>> In article <tnb29c$19mk$1...@gioia.aioe.org>, <anti...@math.uni.wroc.pl> wrote:
>>> Dan Cross <cr...@spitfire.i.gajendra.net> wrote:
>>>> In article <34818e6d-23a5-4c02...@googlegroups.com>,
>>> 386 is special here: one has to jump to make sure that the processor's
>>> view of the state of the world is consistent. And yes, page tables
>>> have to be correctly set up with 1-1 mapping of currently
>>> executing code.
>>
>> You're referring to section 10.4.4 of the 80386 Programmer's
>> Reference Manual? Such language is absent in the current Intel
>> SDM (it was dropped when the 486 came out) when describing
>> paging in 32-bit mode, and it is not clear that OP is
>> targeting an actual 80386. Certainly, adding a `jmp 1f; 1:`
>> isn't going to hurt, but it is not necessary on any Intel or
>> AMD microprocessor manufactured in the last 30 years, even in
>> 32-bit mode.
>
>For ref, here's that section:
>
> https://www.scs.stanford.edu/05au-cs240c/lab/i386/s10_04.htm
>
>Certainly the 386 required identity mapping OR a jump (or both) whereas
>the 486 needed both.

Ah, I stand corrected. The cache architecture of the 386
allowed it to execute a jump from an unmapped address, provided
that instruction immediately followed the move to %cr0 that
enabled paging. The 486 removed _that_ since it had a different
cache architecture, but I mistook the timeframe in which the jmp
requirement was removed, which was for the Pentium. Forgive me,
25 years is a long time.

>The Pentium Architecture and Programming Manual, Order Number 241430 says:
>
>"The 32-bit Intel architectures have different requirements for enabling
>paging and switching to protected mode. The Intel386 processor requires
>following steps 1 [jump] or 2 [identity mapping] above. The Intel486
>processor requires following both steps 1 and 2 above. The Pentium
>processor requires only step 2 but for upwards and downwards code
>compatibility with the Intel386 and Intel486 processors, it is
>recommended both steps 1 and 2 be taken."

So now we're down to compatibility with processors that are a
quarter century obsolete. :-) Seriously, on a modern x86
processor, you don't need the jmp after turning on paging. It
won't hurt anything should someone try and run this on an
actual 386 or 486, but it is no longer required.

Moreover, this business about a jmp after setting CR0[PG] isn't
addressing the OP's issue, which surely has to do with not
properly setting up the virtual address space before enabling
paging.

- Dan C.

אורי ויסבלום

Dec 26, 2022, 4:54:15 AM
Hi all, thank you for answering and helping, and sorry for not answering for a long time.
I've managed to find the bug using GDB, it was as you said, the kernel wasn't fully mapped to virtual memory.

James Harris

Jan 6, 2023, 11:39:10 AM
On 15/12/2022 01:12, Dan Cross wrote:
> In article <tndkg6$2srs1$1...@dont-email.me>,
> James Harris <james.h...@gmail.com> wrote:

...

>> The Pentium Architecture and Programming Manual, Order Number 241430 says:
>>
>> "The 32-bit Intel architectures have different requirements for enabling
>> paging and switching to protected mode. The Intel386 processor requires
>> following steps 1 [jump] or 2 [identity mapping] above. The Intel486
>> processor requires following both steps 1 and 2 above. The Pentium
>> processor requires only step 2 but for upwards and downwards code
>> compatibility with the Intel386 and Intel486 processors, it is
>> recommended both steps 1 and 2 be taken."
>
> So now we're down to compatibility with processors that are a
> quarter century obsolete. :-) Seriously, on a modern x86
> processor, you don't need the jmp after turning on paging. It
> won't hurt anything should someone try and run this on an
> actual 386 or 486, but it is no longer required.

I would agree with you that it's been many years since the 386 and 486
were available new from Intel ... but they are far from obsolete. Even
now they are bought and sold second hand. Those who buy them presumably
do so not to have them sit on a shelf but run software on them. YMMV, of
course, but I cannot see an advantage in avoiding compatibility with
such older processors, especially as the JMP instruction is so easy to
include.


--
James Harris


James Harris

Jan 6, 2023, 11:43:35 AM
On 26/12/2022 09:54, אורי ויסבלום wrote:

> Hi all, thank you for answering and helping, and sorry for not answering for a long time.
> I've managed to find the bug using GDB, it was as you said, the kernel wasn't fully mapped to virtual memory.

That's good to hear. Thanks for reporting back.

For anyone who may be interested, I found a detailed write-up of related
problems being experienced by SCO Unix:

http://www.os2museum.com/wp/sco-unix-3-2v4-0-vs-ia-32-semantics-changes/


--
James Harris


Scott Lurndal

Jan 6, 2023, 1:14:50 PM
James Harris <james.h...@gmail.com> writes:
>On 15/12/2022 01:12, Dan Cross wrote:
>> In article <tndkg6$2srs1$1...@dont-email.me>,
>> James Harris <james.h...@gmail.com> wrote:
>
>...
>
>>> The Pentium Architecture and Programming Manual, Order Number 241430 says:
>>>
>>> "The 32-bit Intel architectures have different requirements for enabling
>>> paging and switching to protected mode. The Intel386 processor requires
>>> following steps 1 [jump] or 2 [identity mapping] above. The Intel486
>>> processor requires following both steps 1 and 2 above. The Pentium
>>> processor requires only step 2 but for upwards and downwards code
>>> compatibility with the Intel386 and Intel486 processors, it is
>>> recommended both steps 1 and 2 be taken."
>>
>> So now we're down to compatibility with processors that are a
>> quarter century obsolete. :-) Seriously, on a modern x86
>> processor, you don't need the jmp after turning on paging. It
>> won't hurt anything should someone try and run this on an
>> actual 386 or 486, but it is no longer required.
>
>I would agree with you that it's been many years since the 386 and 486
>were available new from Intel ... but they are far from obsolete. Even
>now they are bought and sold second hand.

That doesn't make them not obsolete. Many obsolete items are still
purchased and sold by collectors, hobbyists, etc (e.g. hand-crank
wallphones, WE 500 desksets, Burroughs Calculators). They're still
obsolete.

James Harris

Jan 6, 2023, 2:26:09 PM
Sure, but machines with old CPUs are still //in use//, as I mentioned.

Some definitions of 'obsolete':

"no longer used because of being replaced by something newer and more
effective" (MacMillan)

"No longer in use; gone into disuse; disused or neglected (often in
favour of something newer)" (Wiktionary)

"no longer in use" (Wordnet)

Maybe another word would fit better, e.g. deprecated: "Said of a
function or feature planned to be phased out, but still available for
use." (Wiktionary) Though there may be a more apt word than that.



--
James Harris


Dan Cross

Jan 7, 2023, 9:51:38 AM
In article <tp9irb$377ta$2...@dont-email.me>,
As I've said repeatedly in this thread, I don't care if someone
wants to `jmp 1f; 1:` in their code after enabling paging if
they want to be explicitly compatible with the 386/486, but this
justification is a stretch.

Is there utility in old 386 and 486 hardware? Perhaps. But
there is a cost, too: compared to modern hardware, those systems
are power-hungry, physically large and slow. Even Linux removed
support for the 386 back in 2012. VAXen are still bought and
sold on the secondary market and still used in production, too,
but those machines really are obsolete. As of 2003, Melbourne's
commuter train control system ran on PDP-11s, which are beyond
obsolete. Being used and/or sold on the secondary market is
orthogonal to whether a machine is obsolete or not. Should
modern software constrain itself to archaic dialects of C, for
example, because someone _may_ still be using PDP-11s? No.

Furthermore, if you're going to make use of a feature that never
existed on one of those older processors, such as an instruction
introduced in newer microarchs, or enabling long mode, then
going out of one's way for compatibility with behavior of a
processor that doesn't support those features is strictly
unnecessary.

More importantly, executing a jmp, or not, after setting the PG
bit in cr0 had nothing to do with the OP's actual problem. By
fixating on this, most people missed the actual problem.
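
For reference, the "identity mapping" step from the quoted Intel text
can be sketched in plain C. This is only a sketch under stated
assumptions: it uses 4 MiB pages (the PS bit, which needs CR4.PSE and
so a Pentium or later), and the names fill_identity_pd and
translate_4m are made up for illustration. Note that in the original
post's loop, `i>>22` evaluates to 0 for every i below 1024, so the
shift direction is worth a look.

```c
#include <stdint.h>

#define PD_ENTRIES 1024u

/* Flag bits of a 32-bit page directory entry. */
enum {
    PDE_PRESENT   = 1u << 0,
    PDE_READWRITE = 1u << 1,
    PDE_PAGE_SIZE = 1u << 7   /* PS: entry maps a 4 MiB page directly */
};

/* Hypothetical helper: make every virtual address map to the same
 * physical address, using one 4 MiB page per directory entry. */
static void fill_identity_pd(uint32_t pd[PD_ENTRIES])
{
    for (uint32_t i = 0; i < PD_ENTRIES; i++) {
        /* Entry i covers virtual addresses [i << 22, (i + 1) << 22). */
        pd[i] = (i << 22) | PDE_PAGE_SIZE | PDE_READWRITE | PDE_PRESENT;
    }
}

/* Software model of the lookup the MMU performs for a 4 MiB entry:
 * top 10 bits come from the entry, low 22 bits from the address. */
static uint32_t translate_4m(const uint32_t pd[PD_ENTRIES], uint32_t vaddr)
{
    uint32_t pde = pd[vaddr >> 22];
    return (pde & 0xFFC00000u) | (vaddr & 0x003FFFFFu);
}
```

With a table like this loaded into CR3, the instruction following the
mov to cr0 is fetched from the same address it had before paging was
turned on, jmp or no jmp.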

- Dan C.

Dan Cross

unread,
Jan 11, 2023, 3:24:04 PM1/11/23
to
In article <tp9j3k$377ta$3...@dont-email.me>,
James Harris <james.h...@gmail.com> wrote:
>[snip]
>For anyone who may be interested, I found a detailed write-up of related
>problems being experienced by SCO Unix:
>
>http://www.os2museum.com/wp/sco-unix-3-2v4-0-vs-ia-32-semantics-changes/

Sadly, that article seems to be wrong on what seem like basic
facts. It says this:

"On the other hand, setting the PG bit in CR0 does
not cause a TLB flush. The TLB is still valid and
the JMP instruction can be fetched without referring
to the paging structures."

What is this referring to? The 80386 programmer's manual is
very explicit that the TLB ("page translation cache") caches the
most recent page translations. If paging is not enabled, the
TLB is not consulted at all, and presumably doesn't have entries
for the current memory accesses. There is a way to write
entries to the TLB via the test registers, possibly without
paging enabled, but that's not generally applicable and this
seems like a red herring.

More likely, the jmp was simply in the instruction cache,
and enabling CR0[PG] didn't reset any of that state, so
the instruction just executed, as it didn't need to be
fetched from memory.

Scott Lurndal

unread,
Jan 11, 2023, 4:03:28 PM1/11/23
to
cr...@spitfire.i.gajendra.net (Dan Cross) writes:
>In article <tp9j3k$377ta$3...@dont-email.me>,
>James Harris <james.h...@gmail.com> wrote:
>>[snip]
>>For anyone who may be interested, I found a detailed write-up of related
>>problems being experienced by SCO Unix:
>>
>>http://www.os2museum.com/wp/sco-unix-3-2v4-0-vs-ia-32-semantics-changes/
>
>Sadly, that article seems to be wrong on what seem like basic
>facts. It says this:
>
> "On the other hand, setting the PG bit in CR0 does
> not cause a TLB flush. The TLB is still valid and
> the JMP instruction can be fetched without referring
> to the paging structures."
>
>What is this referring to? The 80386 programmer's manual is
>very explicit that the TLB ("page translation cache") caches the
>most recent page translations. If paging is not enabled, the
>TLB is not consulted at all, and presumably doesn't have entries
>for the current memory accesses.

Paging can be enabled, disabled, then re-enabled, in which
case there may be (now invalid) entries in the TLB from the first
set of page tables. If they're not flushed when PG is
re-enabled and the address of the subsequent JMP hits in the TLB, it will
likely branch to nowhere using the old TLB entry rather than doing a
page table walk.

Yes, the first time PG is set after reset, it is unlikely that the TLB will
have a hit on the VA of the JMP, but it doesn't hurt to
always completely flush the TLBs when installing a new
page table, even if they're still at the reset values.
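
The stale-entry scenario can be illustrated with a toy software model.
This is a sketch, not a description of any particular CPU: it assumes a
processor that leaves the TLB untouched when PG is toggled, and all the
Toy* and toy_* names are invented for the example.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_PAGES 4u   /* toy address space: 4 virtual pages */

typedef struct {
    bool     valid[TOY_PAGES];
    uint32_t frame[TOY_PAGES];
} ToyTlb;

typedef struct {
    const uint32_t *page_table;  /* stands in for CR3 */
    bool            paging;      /* stands in for CR0.PG */
    ToyTlb          tlb;
} ToyCpu;

static void toy_flush_tlb(ToyCpu *cpu)
{
    for (uint32_t i = 0; i < TOY_PAGES; i++)
        cpu->tlb.valid[i] = false;
}

/* Loading "CR3" installs a table and flushes the TLB. */
static void toy_load_cr3(ToyCpu *cpu, const uint32_t *table)
{
    cpu->page_table = table;
    toy_flush_tlb(cpu);
}

/* ASSUMPTION: toggling "PG" leaves the TLB contents alone. */
static void toy_set_pg(ToyCpu *cpu, bool on)
{
    cpu->paging = on;
}

static uint32_t toy_translate(ToyCpu *cpu, uint32_t vpage)
{
    if (!cpu->paging)
        return vpage;                  /* paging off: no translation */
    if (!cpu->tlb.valid[vpage]) {      /* miss: walk the table, cache it */
        cpu->tlb.frame[vpage] = cpu->page_table[vpage];
        cpu->tlb.valid[vpage] = true;
    }
    return cpu->tlb.frame[vpage];      /* hit: table is not consulted */
}
```

In this model, if the table is edited in place while paging is off and
PG is then set again without a flush, toy_translate keeps returning the
old frame until toy_load_cr3 (or toy_flush_tlb) is called.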

Dan Cross

unread,
Jan 11, 2023, 7:35:04 PM1/11/23
to
In article <yuFvL.133042$t5W7....@fx13.iad>,
This is all certainly true. However, the document linked above
seemed to indicate that SCO Unix would not cold boot because of
flushing the TLB, i.e., from cold boot, which seems unrelated. Put
another way, I don't think that was what was going on in the
particular case discussed over at os2museum.com, or if it was,
it was not well explained.

- Dan C.

Scott Lurndal

unread,
Jan 12, 2023, 12:37:48 PM1/12/23
to
It is possible that the TLB reset value is indeterminate in
that generation of processor.

Dan Cross

unread,
Jan 12, 2023, 9:28:54 PM1/12/23
to
In article <JzXvL.296372$MVg8....@fx12.iad>,
Possibly. But the excerpted code in that article shows CR3 being
loaded with the PD physical address, which _does_ flush the TLB.
The article then talks about how the TLB is not flushed when
CR0[PG] is set (on the 80386), and concludes that the code
worked on those chips because the jmp instruction was fetched
from an address that was in the TLB (he's a bit handwavey about
the instruction pipeline/cache after that). This suggests that
the author believes that the TLB comes into play before paging
is enabled, and that a valid TLB entry will exist after the mov
to cr0. This is simply incorrect: I see no evidence or
documentation that the TLB is loaded when paging is turned off;
for that matter, there's no guarantee that the text of the `jmp`
instruction itself won't cross over a page boundary.
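
The page-boundary point is easy to check for a given address. A minimal
sketch, assuming 4 KiB pages; crosses_page and PAGE_BYTES are
illustrative names, not taken from the article or from any kernel:

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_BYTES 4096u

/* True if the `len` bytes starting at `addr` span more than one
 * 4 KiB page. `len` is expected to be at least 1. */
static bool crosses_page(uint32_t addr, uint32_t len)
{
    uint32_t offset = addr & (PAGE_BYTES - 1u);
    return offset + len > PAGE_BYTES;
}
```

For example, a 2-byte short jmp placed at 0x1FFF has its second byte in
the next page, so a single cached translation for the first page would
not be enough to fetch it.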

The most likely case here is that the author was just mistaken.

- Dan C.
