
Can I write a Bootloader using GAS?


CuppoJava

Jul 22, 2010, 9:05:59 PM
Hi everyone,

I just read through "Programming from the Ground Up" to acquaint
myself with x86 assembly programming, and I'm interested in writing a
bootloader for myself.

Does anyone know of a tutorial for writing a bootloader with GAS? I
have found only tutorials using NASM, which I find to be a little more
subtle and harder to understand than GAS.

In particular, I don't understand how the following NASM commands
would look like in GAS:

org 0x07c00
This command (somehow) makes the program load at address 0x07c00, I'm
not sure about the details.

jmp codesel:program_start
I'm not sure how segment addressing looks in GAS.

lgdt[gdtr]
Loading the global descriptor table. Is this a macro? I'm not sure how
NASM would know how to do this without it being specified somewhere.

times 510-($-$$) db 0
Filling from the current location to address 510 with 0's.

Thanks a lot for your help and guidance.
-Patrick

nedbrek

Jul 23, 2010, 8:25:57 AM
Hello all,

"CuppoJava" <patrick...@gmail.com> wrote in message
news:4926123d-2b90-4741...@z30g2000prg.googlegroups.com...
> Hi everyone,


>
> Does anyone know of a tutorial for writing a bootloader with GAS? I
> have found only tutorials using NASM, which I find to be a little more
> subtle and harder to understand than GAS.

Check http://wiki.osdev.org/Bare_Bones

When you use GAS, you need some linker magic to handle some of the stuff...

HTH,
Ned


Rod Pemberton

Jul 23, 2010, 10:14:49 AM
"CuppoJava" <patrick...@gmail.com> wrote in message
news:4926123d-2b90-4741...@z30g2000prg.googlegroups.com...
>
> I just read through "Programming from the Ground Up" to acquaint
> myself with x86 assembly programming, and I'm interested in writing a
> bootloader for myself.
>
> Does anyone know of a tutorial for writing a bootloader with GAS? I
> have found only tutorials using NASM, which I find to be a little more
> subtle and harder to understand than GAS.
>
> In particular, I don't understand how the following NASM commands
> would look like in GAS:
>
> org 0x07c00
> This command (somehow) makes the program load at address 0x07c00, I'm
> not sure about the details.
>

I think that's set with the linker: LD...

After assembling with GAS, you'll need to link the .o object file:

ld --oformat binary -Ttext 0x7C00 -o boot.bin boot.o

If GAS has a method to set the org in your .S file, I'm not aware of it, but
I'm not as familiar with GAS bootloader code as with NASM.


GAS bootloaders typically start:

.code16
.text
.global _start
_start:

> jmp codesel:program_start
> I'm not sure how segment addressing looks in GAS.
>

I think that's:

ljmp $CODESEL, $program_start

.code32
program_start:

Actually, most will say $start32 and start32: since that's the long jump
used to flush the CPU's prefetch queue and load CS with a 32-bit code
selector, which completes the switch to 32-bit mode. This comes after the
code that sets the PE bit in %cr0 to 1.

I'm not completely sure about how to specify CODESEL in GAS at the moment.
It may be an equate, or a computed offset like NASM, i.e., codesel label
minus start of gdt.


After that you usually reload segments registers with 32-bit selectors:

movl $DATASEL, %eax
movl %eax, %ds
movl %eax, %es
# etc...

SS and ESP should be loaded together, either via the LSS instruction or via
back-to-back moves. A move to SS holds off interrupts until the following
instruction has executed, so an ESP load placed immediately after it is
safe. It's a special feature. I'm not sure of the GAS form of LSS, but I
can look it up.

> lgdt[gdtr]
> Loading the global descriptor table. Is this a macro? I'm not sure how
> NASM would know how to do this without it being specified somewhere.
>

Looking at some files online, it seems to be something like the following.
FYI, I'm not 100% sure I've got the syntax correct.


lgdt gdt

# elsewhere your gdt is setup
gdt:
# gdt info using .byte, .word, .long assembly pseudo-instructions
.word (gdt_end - gdt - 1)
.
.
.
.
# a bunch of .long or .word or .byte to fill in descriptor data
gdt_end:
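
For reference, lgdt is an ordinary instruction rather than a macro. Its memory operand is a 6-byte structure (16-bit limit, then 32-bit linear base), and each descriptor in the table is 8 bytes with the base and limit split across several fields. The Python sketch below only illustrates that byte layout. The helper names (gdt_entry, gdtr) and the table address 0x7C40 are made up for the example, and the flat 4 GiB code and data descriptors are a common convention rather than anything taken from this thread.

```python
import struct

def gdt_entry(base: int, limit: int, access: int, flags: int) -> bytes:
    """Pack one 8-byte segment descriptor (hypothetical helper name)."""
    return struct.pack(
        "<HHBBBB",
        limit & 0xFFFF,                                   # limit bits 0..15
        base & 0xFFFF,                                    # base bits 0..15
        (base >> 16) & 0xFF,                              # base bits 16..23
        access,                                           # present / DPL / type byte
        ((flags & 0x0F) << 4) | ((limit >> 16) & 0x0F),   # flags nibble + limit bits 16..19
        (base >> 24) & 0xFF,                              # base bits 24..31
    )

def gdtr(table: bytes, linear_base: int) -> bytes:
    """Pack the 6-byte operand that lgdt reads: (table size - 1), then the linear base."""
    return struct.pack("<HI", len(table) - 1, linear_base)

# Assumed layout: null descriptor, flat 32-bit code, flat 32-bit data.
NULL_DESC = bytes(8)
CODE_DESC = gdt_entry(0, 0xFFFFF, 0x9A, 0xC)   # execute/read, 4 KiB granularity, 32-bit
DATA_DESC = gdt_entry(0, 0xFFFFF, 0x92, 0xC)   # read/write, 4 KiB granularity, 32-bit
GDT = NULL_DESC + CODE_DESC + DATA_DESC

# A selector with RPL 0 in the GDT is the descriptor's byte offset into the table.
CODESEL = 1 * 8
DATASEL = 2 * 8

print(CODE_DESC.hex(" "))
print(gdtr(GDT, 0x7C40).hex(" "))   # 0x7C40 is an arbitrary example address
```

In GAS the same bytes would be written out with .word, .byte and .long, and CODESEL/DATASEL would be equates or label differences as described above.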

> times 510-($-$$) db 0
> Filling from the current location to address 510 with 0's.
>

I think that's something like:

.fill 0x1fe - (. - _start) , 1, 0


And the 55 AA signature in NASM,

dw 0xaa55

Is this in GAS,

.word 0xAA55
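
To make the padding-plus-signature layout concrete, here is a small Python sketch of what the finished 512-byte sector should look like, whichever assembler produces it. The function name make_boot_sector is hypothetical, and the two-byte stub (a short jump to itself) merely stands in for real boot code.

```python
SECTOR_SIZE = 512
SIGNATURE = b"\x55\xAA"   # the word 0xAA55 as stored on disk (little-endian)

def make_boot_sector(code: bytes) -> bytes:
    """Pad code with zeros up to offset 510, then append the boot signature."""
    room = SECTOR_SIZE - len(SIGNATURE)          # 510 bytes available for code
    if len(code) > room:
        raise ValueError(f"code is {len(code)} bytes, only {room} fit")
    return code + bytes(room - len(code)) + SIGNATURE

stub = b"\xEB\xFE"        # placeholder: jmp short to itself
sector = make_boot_sector(stub)
print(len(sector), sector[510:].hex())
```

Running the same checks on the real boot.bin (length 512, last two bytes 55 AA) is a quick way to confirm that the .fill expression and the signature came out as intended.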


Examples of small bootloaders in GAS used to be everywhere. I can't find a
good GAS example. I can help you track down the other details of converting
NASM to GAS if you wish. Did you find a good NASM example? E.g., enable
CR0.PE, long jump, lgdt and gdt table, 0xAA55 signature, etc.? Actually,
I'm having problems finding a basic NASM example today that shows everything
too...

HTH,


Rod Pemberton
Topic for AOD regulars, maybe, someone (Did I elect myself?) should
construct a basic bootloader or PM startup identical in NASM and GAS... then
put it on AOD FAQ. I used to mention the "muVinux" loader which I now think
was derived from Chris Giese code... Maybe something by CG would be a
start.


CuppoJava

Jul 23, 2010, 11:31:11 AM
Thank you very much for your help.

I am following this tutorial: http://www.osdever.net/tutorials/hello_btldr.php?the_id=85
which is clearly written but for NASM.

After reading around a bit more, it seems GAS is not easily capable of
generating real-mode code. Perhaps I'll put in some time to learn
NASM.

As an aside: would you mind explaining exactly what org 0x07c00 does?

This is how I thought it works:

(1) An assembly program is converted to binary by looking up the
opcodes for all the mnemonics in the program.
(2) A new file is created, and bytes 0 to 0x07c00 are padded with
zeroes. Then the program binary is copied into this file starting from
address 0x07c00.
Is that right?

Thanks
-Patrick

s_dub...@yahoo.com

Jul 23, 2010, 11:56:23 AM

Not since 8080 ASM.

Nowadays ORG 0x7C00 means: here starts a block which is expected to be
loaded at segment:offset where offset is 0x7C00.

Due to the approach IBM took when it wrote its ROM BIOS code to load a
floppy boot sector for the IBM PC, we still have a destination load
address of 0000:7C00h. This means the first physical byte of
the boot code will be loaded to segment 0, offset 0x7C00 by the
ROM BIOS bootstrap routine.

The algorithm for (1) is a lot more involved nowadays also.

hth,

Steve
>
> Thanks
>   -Patrick

CuppoJava

Jul 23, 2010, 12:14:33 PM
Thanks for the explanation Steve,

I think I get it now.

So all the "org 0x7c00" directive does, is change how labels are
resolved into addresses. Is that right?

For this program:
_start:
movl %eax, %ebx

_start would be resolved by the linker to point to 0.

But for this program:
org 1
_start:
movl %eax, %ebx

_start would be resolved by the linker to point to 1.

Is that correct?
-Patrick

Maxim S. Shatskih

Jul 23, 2010, 1:13:24 PM
>of 0000:7C00h

07C0:0000 can also be used.

--
Maxim S. Shatskih
Windows DDK MVP
ma...@storagecraft.com
http://www.storagecraft.com

Maxim S. Shatskih

Jul 23, 2010, 1:18:27 PM
> So all the "org 0x7c00" directive does, is change how labels are
> resolved into addresses.

I think yes.

s_dub...@yahoo.com

Jul 23, 2010, 3:23:51 PM

I want to say yes. However, it can be more involved than that,
depending on your toolset, memory model, modules combined into a
group, etc. In any case you need to experiment with the toolset of
your choice and learn what _it_ does.

for a long example using NASM..

[MAP ALL ORG0.MAP]
;;--------------------------------------------------------60
;; File: ORG0.NSM
;; Last:
;; Init:
;; Vers: 0.0.0 r0
;; Note: test org 0.
;;--------------------------------------------------------60
;; Test & map ORG 0
;;--------------------------------------------------------60
BITS32
org 0
[SECTION .text]
_start:
mov ebx, eax

mov edx, _start

TIMES 30h db 90h
mov ax, 0
int 16h
int 19h
;;--------------------------------------------------------60
;;--------------------------------------------------------60
;; [SECTION .dseg]
;;--------------------------------------------------------60
;; --== EO .MOD ==--
;;--------------------------------------------------------60
-= VS =-
[MAP ALL ORG1.MAP]
;;--------------------------------------------------------60
;; File: ORG1.NSM
;; Last:
;; Init:
;; Vers: 0.0.0 r0
;; Note: test org 1.
;;--------------------------------------------------------60
;; Test & map ORG 1
;;--------------------------------------------------------60
BITS32
org 1
[SECTION .text]
_start:
mov ebx, eax

mov edx, _start

TIMES 30h db 90h
mov ax, 0
int 16h
int 19h
;;--------------------------------------------------------60
;;--------------------------------------------------------60
;; [SECTION .dseg]
;;--------------------------------------------------------60
;; --== EO .MOD ==--
;;--------------------------------------------------------60

I added a reference to _start (mov edx,_start) to illustrate a few
things..

- NASM Map file
---------------------------------------------------------------

Source file: ORG0.NSM
Output file: ORG0.BIN

-- Program origin
-------------------------------------------------------------

00000000

-- Sections (summary)
---------------------------------------------------------

Vstart Start Stop Length Class Name
00000000 00000000 00000040 00000040 progbits .text

-- Sections (detailed)
--------------------------------------------------------

---- Section .text
------------------------------------------------------------

class: progbits
length: 00000040
start: 00000000
align: not defined
follows: not defined
vstart: 00000000
valign: not defined
vfollows: not defined

-- Symbols
--------------------------------------------------------------------

---- Section .text
------------------------------------------------------------

Real Virtual Name
00000000 00000000 BITS32
00000000 00000000 _start

-=VS=-

- NASM Map file
---------------------------------------------------------------

Source file: ORG1.NSM
Output file: ORG1.BIN

-- Program origin
-------------------------------------------------------------

00000001

-- Sections (summary)
---------------------------------------------------------

Vstart Start Stop Length Class Name
00000001 00000001 00000041 00000040 progbits .text

-- Sections (detailed)
--------------------------------------------------------

---- Section .text
------------------------------------------------------------

class: progbits
length: 00000040
start: 00000001
align: not defined
follows: not defined
vstart: 00000001
valign: not defined
vfollows: not defined

-- Symbols
--------------------------------------------------------------------

---- Section .text
------------------------------------------------------------

Real Virtual Name
00000001 00000001 BITS32
00000001 00000001 _start

The above MAP listing hints at the difference.

The following listings give no clue..

1 [MAP ALL ORG0.MAP]
2 ;;--------------------------------------------------------60
3 ;; File: ORG0.NSM
4 ;; Last:
5 ;; Init:
6 ;; Vers: 0.0.0 r0
7 ;; Note: test org 0.
8 ;;--------------------------------------------------------60
9 ;; Test & map ORG 0
10 ;;--------------------------------------------------------60
11 BITS32
12 org 0
13 [SECTION .text]
14 _start:
15 00000000 6689C3 mov ebx, eax
16
17 00000003 66BA[00000000] mov edx, _start
18
19 00000009 90<rept> TIMES 30h db 90h
20 00000039 B80000 mov ax, 0
21 0000003C CD16 int 16h
22 0000003E CD19 int 19h
23 ;;--------------------------------------------------------60
24 ;;--------------------------------------------------------60
25 ;; [SECTION .dseg]
26 ;;--------------------------------------------------------60
27 ;; --== EO .MOD ==--
28 ;;--------------------------------------------------------60

-=VS=-

1 [MAP ALL ORG1.MAP]
2 ;;--------------------------------------------------------60
3 ;; File: ORG1.NSM
4 ;; Last:
5 ;; Init:
6 ;; Vers: 0.0.0 r0
7 ;; Note: test org 1.
8 ;;--------------------------------------------------------60
9 ;; Test & map ORG 1
10 ;;--------------------------------------------------------60
11 BITS32
12 org 1
13 [SECTION .text]
14 _start:
15 00000000 6689C3 mov ebx, eax
16
17 00000003 66BA[00000000] mov edx, _start
18
19 00000009 90<rept> TIMES 30h db 90h
20 00000039 B80000 mov ax, 0
21 0000003C CD16 int 16h
22 0000003E CD19 int 19h
23 ;;--------------------------------------------------------60
24 ;;--------------------------------------------------------60
25 ;; [SECTION .dseg]
26 ;;--------------------------------------------------------60
27 ;; --== EO .MOD ==--
28 ;;--------------------------------------------------------60

Which appear identical except for the ORG

Now I load each of these .bin files using a debugger which sees each
load command and loads each to a new segment...

--------------------------------------------------
*** Symbolic Instruction Debugger *** Release 3.2
Copyright (c) 1983,1984,1985,1988,1990,1991
Digital Research, Inc. All Rights Reserved
--------------------------------------------------

#rorg0.bin <---loads to current segment
Start End
142E:0000 142E:003F
#rorg1.bin <---loads to a subsequent segment
Start End
1432:0000 1432:003F
#d142e:0
142E:0000 66 89 C3 66 BA {00 00 00 00} 90 90 90 90 90 90 90 f..f............
142E:0010 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 ................
142E:0020 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 ................
142E:0030 90 90 90 90 90 90 90 90 90 B8 00 00 CD 16 CD 19 ................
.. I space here to show the end of the first, also to explain that the address
.. of _start is between the brackets I've put in.
142E:0040 66 89 C3 66 BA {01 00 00 00} 90 90 90 90 90 90 90 f..f............
142E:0050 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 ................
142E:0060 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 ................
142E:0070 90 90 90 90 90 90 90 90 90 B8 00 00 CD 16 CD 19 ................

The dump command dumps from the first .bin thru the second, so note,
142E:0040h == 1432:0000h (this has everything to do with the code
being an exact multiple of 16 bytes, a 'paragraph'.)

If you can pick up on this idea, you will see that, of the two nearly
identical programs, one .bin is 'correct' and the other will 'fail', but
not because of the segment it resides in. In the first .bin, the address of
_start does indeed point to the start of its segment. The second .bin
is faulty because the address it holds for _start (1) is off by one from
its actual _start, which sits at offset 0 in the first paragraph of its
segment. The second .bin would have to be loaded at offset 1 in the 1st
paragraph of its segment to be 'correct', because it is ORG'ed at 1.
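
The same check can be done mechanically. The Python sketch below takes the first nine bytes of each image as shown in the dump, pulls out the 32-bit immediate that holds the address of _start, and compares it with the offset the image was actually loaded at. The names embedded_start and IMM_OFFSET are invented for this example, and the position of the immediate (byte 5, right after 66 89 C3 66 BA) is read off the listing rather than computed.

```python
import struct

# First nine bytes of each image, copied from the debugger dump.
ORG0_IMAGE = bytes.fromhex("6689C366BA00000000")
ORG1_IMAGE = bytes.fromhex("6689C366BA01000000")

IMM_OFFSET = 5     # the immediate of "mov edx, _start" follows 66 89 C3 66 BA
LOAD_OFFSET = 0    # the debugger loaded both images at offset 0 of their segment

def embedded_start(image: bytes) -> int:
    """Return the little-endian 32-bit address the assembler stored for _start."""
    (value,) = struct.unpack_from("<I", image, IMM_OFFSET)
    return value

for name, image in (("ORG0.BIN", ORG0_IMAGE), ("ORG1.BIN", ORG1_IMAGE)):
    addr = embedded_start(image)
    verdict = "matches" if addr == LOAD_OFFSET else "does NOT match"
    print(f"{name}: _start encoded as {addr:#x}, {verdict} load offset {LOAD_OFFSET:#x}")
```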

Whichever toolset you choose, you'll infrequently need to verify an
issue, like the above, but you will be very happy to know how to do
it. So my advice is to do test problems to learn how to check for
them.

hth.

Steve

Rod Pemberton

Jul 23, 2010, 5:01:51 PM
"CuppoJava" <patrick...@gmail.com> wrote in message
news:251f0e04-5636-439c...@h40g2000pro.googlegroups.com...
>
> I am following this tutorial: [link]

> which is clearly written but for NASM.
>

Do I think NASM is easier? Yes. I prefer it to MASM or GAS. Some people
prefer NASM-style syntax but use other assemblers, such as FASM or YASM.

> After reading around a bit more, it seems GAS is not easily capable of
> generating real-mode code.
>

Well, I've seen numerous OS projects based on the GNU toolchain, and they
all have 16-bit startups using GAS. I do know that GAS has the .code16 and
.code16gcc directives, plus the data16 and addr16 size prefixes, for 16-bit
support. I don't yet know what the 16-bit code that GAS emits looks like.

I haven't converted any of my boot code or 32-bit protected-mode switch code
to GAS. So, I'm not exactly sure of all the limitations, especially 16-bit
limitations. But, I have implemented 32-bit inline assembly in C for GCC
(i.e., DJGPP) which is basically GAS syntax.

> As an aside: would you mind explaining exactly what org 0x07c00 does?
>

Sorry, I don't know _exactly_... It may do some things I'm not aware of.

What I do know is:

1) that it indicates the code is to be loaded at offset 0x7c00 from the
start of a segment, probably CS, maybe DS, or both...
2) that offsets will have 0x7c00 added to correct their addresses for the
load location of 0x7c00

Data is loaded based on DS, so data addresses are probably corrected by ORG.
Code is loaded based on CS, so jump addresses are probably corrected by ORG
too.

I.e., if I say "ORG 0x7c00", load the code to 0x7c00 with CS and DS
segment=0, and run it there, it works.
I.e., if I say "ORG 0x7c00", load the code to segmentX*16+0x7c00 with CS and
DS segment=segmentX, and run it there, it works.


E.g., from some code that compiles for 0x7c00 (BIOS boot) and for 0x100 (DOS
.com):

lss sp, [stk]

; elsewhere
stk:
dw 07c00h
dw 0

When compiled for "ORG 0x100", the (dis)assembled instruction is:

lss sp, [0x1b3]

When compiled for "ORG 0x7c00", the (dis)assembled instruction is:

lss sp, [0x7cb3]

You can see that 0x100 or 0x7c00 is added to the offset for stk: label -
apparently 0xb3. That would be a data address based on an offset from DS
segment.
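
The arithmetic in that example is small enough to check by hand, but here it is as a Python sketch anyway. The function name resolve is made up, and 0xB3 is the offset of stk inferred from the two disassemblies above.

```python
def resolve(org: int, offset_in_image: int) -> int:
    """Address the assembler emits for a label: the ORG value plus the label's offset in the image."""
    return org + offset_in_image

STK_OFFSET = 0xB3   # inferred from the disassembly: 0x1b3 - 0x100 and 0x7cb3 - 0x7c00

for org in (0x100, 0x7C00):
    print(f"ORG {org:#06x}: stk resolves to {resolve(org, STK_OFFSET):#06x}")
```

The bytes of the program are otherwise the same in both builds. Only the addresses embedded in instructions that refer to labels change.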


Rod Pemberton


BGB / cr88192

Jul 24, 2010, 1:31:59 AM

"Rod Pemberton" <do_no...@notreplytome.cmm> wrote in message
news:i2cvvj$42f$1...@speranza.aioe.org...

> "CuppoJava" <patrick...@gmail.com> wrote in message
> news:251f0e04-5636-439c...@h40g2000pro.googlegroups.com...
>>
>> I am following this tutorial: [link]
>> which is clearly written but for NASM.
>>
>
> Do I think NASM is easier? Yes. I prefer it to MASM or GAS. Some people
> prefer NASM syntax, but using other assemblers: FASM, YASM.
>

yeah. I use my own assembler (BGBASM), but it also still uses (more or less)
NASM syntax.
however, the main difference between mine and NASM is that mine is generally
used for assembling things at runtime, whereas NASM is generally used as a
static assembler.

YASM can apparently be used as both a static and runtime assembler, but I
haven't really much investigated the specifics (I wrote mine well before I
found out about YASM, and generally stuck with my own assembler).

also different:
mine supports multiple opcodes per line ("add eax, 15; shl eax, 4", note ';'
uses whitespace to decide whether to merge lines or be a comment);
mine supports C-style comments;
mine doesn't support assembly-time expressions;
the preprocessor works differently;
...


>> After reading around a bit more, it seems GAS is not easily capable of
>> generating real-mode code.
>>
>
> Well, I've seen numerous OS projects based on the GNU toolchain, and they
> all have 16-bit startups using GAS. I do know that GAS has .code16,
> .data16, and .code16gcc directives for 16-bit support. I don't yet know
> what the 16-bit code that GAS emits looks like.
>
> I haven't converted any of my boot code or 32-bit protected-mode switch
> code
> to GAS. So, I'm not exactly sure of all the limitations, especially
> 16-bit
> limitations. But, I have implemented 32-bit inline assembly in C for GCC
> (i.e., DJGPP) which is basically GAS syntax.
>

AFAIK GAS didn't originally generate real-mode code, and instead as86 was
typically used (or NASM or others).
eventually they added 16-bit support into GAS, and as86 I guess started to
fade away.

I may be wrong here though...

no real comment here...


BGB / cr88192

Jul 24, 2010, 1:37:01 AM

"Maxim S. Shatskih" <ma...@storagecraft.com.no.spam> wrote in message
news:i2cijk$2sh5$1...@news.mtu.ru...
>of 0000:7C00h

<--


07C0:0000 can also be used.
-->

technically, yes, as this is the same address...

however, the BIOS likely can't jump to this address, as doing so would
risk breaking many/most bootloaders. a bootloader could itself use this
address (probably with a far jump to a label), for example to
avoid needing "org" or similar.
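
A quick Python sketch of why the two spellings are the same address, and why a loader that far-jumps to segment 07C0 can be assembled with an origin of 0. The function name physical and the label offset 0x20 are illustrative only.

```python
def physical(segment: int, offset: int) -> int:
    """Real-mode translation: the segment is shifted left 4 bits (x16), then the offset is added."""
    return (segment << 4) + offset

# Both spellings name the byte where the BIOS places the boot sector.
print(hex(physical(0x0000, 0x7C00)), hex(physical(0x07C0, 0x0000)))

LABEL = 0x20   # hypothetical label 0x20 bytes into the sector

via_org_7c00 = physical(0x0000, 0x7C00 + LABEL)   # CS=0x0000, code assembled with org 0x7C00
via_org_0 = physical(0x07C0, LABEL)               # CS=0x07C0, code assembled with org 0
print(hex(via_org_7c00), hex(via_org_0))
```

The same rule explains the debugger dump earlier in the thread, where 142E:0040 and 1432:0000 refer to the same byte.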

Rod Pemberton

Jul 24, 2010, 9:23:53 AM
"Rod Pemberton" <do_no...@notreplytome.cmm> wrote in message
news:i2c84e$jmi$1...@speranza.aioe.org...

> Actually, I'm having problems finding a basic NASM example
> today that shows everything too...
>

While looking for some NASM info, I found this comparison of GAS and NASM
syntax:
http://www.ibm.com/developerworks/linux/library/l-gas-nasm.html?S_TACT=105AGX52&S_CMP=cn-a-l


RP


CuppoJava

Jul 24, 2010, 10:00:41 AM
On Jul 24, 9:23 am, "Rod Pemberton" <do_not_h...@notreplytome.cmm>
wrote:
> "Rod Pemberton" <do_not_h...@notreplytome.cmm> wrote in message

>
> news:i2c84e$jmi$1...@speranza.aioe.org...
>
> > Actually, I'm having problems finding a basic NASM example
> > today that shows everything too...
>
> While looking for some NASM info, I found this comparison of GAS and NASM
> syntax:http://www.ibm.com/developerworks/linux/library/l-gas-nasm.html?S_TAC...
>
> RP

Thank you everyone for all your help, and especially to Steve's mini-
tutorial on how I can go about finding out some of these answers for
myself. This will keep me busy for a while to come.

-Patrick

Maxim S. Shatskih

Jul 24, 2010, 4:33:25 PM
> 07C0:0000 can also be used.
> -->
>
> technically, yes, as this is the same address...
>
> however, the BIOS can't likely jump to this address, as it would essentially
> risk breaking many/most bootloaders.

Compaq Presario did exactly this.

kiranbha...@gmail.com

Oct 19, 2013, 6:53:04 AM

Rod Pemberton

Oct 19, 2013, 8:09:07 AM
On Sat, 19 Oct 2013 06:53:04 -0400, <kiranbha...@gmail.com> wrote:
> On Friday, July 23, 2010 6:35:59 AM UTC+5:30, CuppoJava wrote:

>> [old post]
>
> [links]

We appreciate your contribution. However, most of the information
in the original conversation you replied to already covered what is
in the links you posted. Also, if you hadn't noticed, you replied
to a Usenet message from about four years ago. CuppoJava hasn't
posted since that time. This is a Usenet group, not a Google
Groups group. I.e., please respond only to 2013 messages during
2013, preferably from the same month too, at least for Usenet.

"Can I write a Bootloader using GAS?"
https://groups.google.com/d/msg/alt.os.development/O2TWfoDiYfU/k9lvluhKiDQJ

The answer is: "Yes". You can write a bootloader using GAS.


Rod Pemberton