I have started learning assembly. I have been learning C for a few
months now; I have been concentrating on algorithms and am now about
to deal with pointers. I believe a quick look at assembly language
will help me deal with programming languages in the near or distant
future.
I am learning from two PDFs:
- "PC Assembly Language" by P. Carter
- and "Programming from the Ground Up" by J. Bartlett.
I find the second one, "Programming from the Ground Up", more
accessible, I mean easier to read for now. I understand it's all about
Linux programming. I have Linux and XP and I run both from time to time.
Is assembly language very different between Windows and Linux?
I ask because the second PDF, "Programming from the Ground Up", says in
its 3rd chapter
movl %1, %eax
whereas the 1st book, "PC Assembly Language", says about moving data
that it goes the other way around:
mov dest, src
So, is assembly language completely different between Linux and
Windows, even the opposite?
Or is it just because there's a difference between movl and mov?
Which document should I concentrate on first?
Is there no way I can run the Linux assembly from the second PDF on
Windows, and the other way round?
Thanks,
Pascal
It probably would be wise to do the two in the opposite order.
> I believe a quick look at assembly language
> will help me deal with programming language for the near or distant
> futur.
>
> I am learning from 2 pdfs :
> -"PC Assembly language" from P. Carter
> -and "Programming from the ground up" from J. Bartlett.
>
> I find the second one Programming from ... more accessible, i mean
> easier to read for now. I understand it's all about Linux programming.
> I have linux and xp and i run both from time to time.
>
> Is assembly language very different from Windows to Linux ?
No. The difference is primarily in the available libraries of
subroutines. Think printf() in C/C++ vs PRINT in Basic or WriteLn() in
Pascal. They're all slightly different subroutines to do more or less
the same thing.
> This is because, in the second pdf programming from... in the 3rd
> chapter, it says
> movl %1, %eax
>
> whereas the 1st book "PC Assembly...", says about moving data that it
> goes the other way around :
> mov dest, src
>
> So, it assembly language between linux and windows completly
> different, the opposite?
It's not the OS, it's the different assemblers (= assembly language
compilers), which accept the same instructions in slightly different
forms. The one with percent signs is (G)AS, and for x86 it typically
uses AT&T syntax, which puts the source operand first and the
destination last. The destination-first order is used in MASM, TASM,
FASM and a number of other assemblers. A size keyword such as DWORD
isn't always necessary in those assemblers because it's often possible
to deduce the operand size from the register name used (EAX is 4 bytes,
AX is 2 bytes, AL is 1 byte, RAX is 8 bytes). Think of the different
x86 assemblers (= compilers) as programs speaking different flavors of
more or less the same original language (e.g. Spanish vs Portuguese).
> Or is it just because there's a difference between movl and mov ?
L tells (G)AS that the operand is long in size (4 bytes). In other
assemblers the role of L may be played by special keywords like DWORD.
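To make the size suffix concrete, here is a small C sketch (C because that is the language the original poster is learning). The helper names gas_suffix and intel_keyword are made up for illustration; they only tabulate the usual pairing between an operand size in bytes, the GAS mnemonic suffix, and the size keyword used by Intel-syntax assemblers.

```c
#include <stddef.h>

/* GAS (AT&T syntax) mnemonic suffix for an operand of the given size
   in bytes. Returns '?' for a size that has no integer suffix here.
   Hypothetical helper, for illustration only. */
char gas_suffix(size_t bytes)
{
    switch (bytes) {
    case 1: return 'b';   /* byte: movb, matches AL  */
    case 2: return 'w';   /* word: movw, matches AX  */
    case 4: return 'l';   /* long: movl, matches EAX */
    case 8: return 'q';   /* quad: movq, matches RAX */
    default: return '?';
    }
}

/* Size keyword that plays the same role in Intel-syntax assemblers.
   Hypothetical helper, for illustration only. */
const char *intel_keyword(size_t bytes)
{
    switch (bytes) {
    case 1: return "BYTE";
    case 2: return "WORD";
    case 4: return "DWORD";
    case 8: return "QWORD";
    default: return "?";
    }
}
```

So gas_suffix(4) gives 'l', the L in movl, and intel_keyword(4) gives "DWORD".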
> What document should i first concentrate on ?
You need to read all of these documents:
1. Processor manual (either Intel or AMD)
2. Assembler manual for your assembler
3. Some book/document/articles that teach you programming in assembly
(that's what you've quoted)
4. Some documents on how to do I/O (file, console, etc.), memory
management, etc. in your target OS for which you're writing your
programs. You need to know the API that the OS provides to programs.
> There is no way i can run Linux assembly from the second pdf on
> Windows and the other way round?
There's no way to learn Windows API from a book on Linux API and vice
versa. The assembly language itself is CPU-specific, not OS-specific.
Alex
Assembly language is the same but the system calls are different. For
example, to terminate your program in Linux you might code
mov ebx, [error_code]
mov eax, SYS_exit
int 0x80
To terminate your program in Windows you might write
push dword [error_code]
call [ExitProcess]
These both use the same assembly language but, as you can see, Linux
and Windows offer different system calls (syscalls) and have different
means of invoking them.
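For readers coming from C, the Linux half of this can be tried without writing any assembly. The following is only a sketch and is Linux-specific: exit_status_via_syscall is a made-up helper name, and the C library's syscall() wrapper picks the actual entry mechanism itself (on 64-bit Linux that is normally not int 0x80). It forks a child that terminates through the raw exit system call and reports the status the parent sees.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>   /* SYS_exit */
#include <sys/types.h>     /* pid_t */
#include <sys/wait.h>      /* waitpid, WIFEXITED, WEXITSTATUS */
#include <unistd.h>        /* fork, syscall, _exit */

/* Fork a child that ends itself with the raw Linux exit system call,
   then return the exit status observed by the parent, or -1 on failure.
   Hypothetical helper, Linux only. */
int exit_status_via_syscall(int error_code)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Same request as the assembly sequence:
           mov ebx, [error_code] / mov eax, SYS_exit / int 0x80 */
        syscall(SYS_exit, error_code);
        _exit(127);   /* not reached if the system call succeeds */
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

On Windows the corresponding C call would be ExitProcess() from windows.h, which is why the two assembly fragments look so different even though the instructions themselves are ordinary x86.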
When I was looking into this for myself I wrote some primers for
assembly under Linux and Windows. They are available at
http://codewiki.wikispaces.com/linux_stdout.nasm
http://codewiki.wikispaces.com/winstdout.nasm
They both use the same assembler and both do the same things which
makes it easier to compare. Note that the assembler software used is
the same whether running under Linux or Windows.
> This is because, in the second pdf programming from... in the 3rd
> chapter, it says
> movl %1, %eax
>
> whereas the 1st book "PC Assembly...", says about moving data that it
> goes the other way around :
> mov dest, src
>
> So, it assembly language between linux and windows completly
> different, the opposite?
>
> Or is it just because there's a difference between movl and mov ?
As has been pointed out movl is the mnemonic for the move instruction
under the Gas assembler and mov is the mnemonic for the same
instruction under a number of other assemblers: Masm, Nasm, Fasm, Yasm
etc. The mov variant is also what you will see in the Intel and AMD
manuals. (Intel's manuals are easier to read, IMHO, but both sets of
manuals can be overwhelming so you are probably better off starting
with a tutorial text. Use the manuals to check instruction
descriptions.)
>
> What document should i first concentrate on ?
Whatever I'm learning, I generally find it helpful to have two or more
sources. Each author gives a different slant, and there's less chance
of mistakes or bias in one affecting your learning. So I'd recommend
looking at both.
>
> There is no way i can run Linux assembly from the second pdf on
> Windows and the other way round?
If you use Nasm on both Linux and Windows the assembler instructions -
mov, and, or, add, sub, etc - will be the same, i.e. identical. Only
the syscalls differ.
James
It might be wiser to direct the reader to download and install the latest
distribution from the NASM site. The last time I checked, the copy in the
Debian/Ubuntu repo (what you get when you do a "sudo apt-get install
nasm") was an old, broken version that newbies should not have to contend
with.
Nathan.
> If you use Nasm on both Linux and Windows the assembler instructions -
> mov, and, or, add, sub, etc - will be the same, i.e. identical. Only
> the syscalls differ.
>
> James
If I understand things correctly:
There are different assemblers: Gas, Masm, Nasm, Fasm, Yasm.
Each has its own language and set of rules (like movl being specific to
Gas...)? Gas is specific to Linux (it won't work on Windows)? But, as
I think I already did, I can install Nasm on Linux?
Each processor (Intel, AMD) works differently depending on whether it's
dealing with Gas, Masm, Nasm, Fasm or Yasm?
That's why the posts above say to read the processor manufacturer's
documentation.
But I believe I need to understand the basics before I get into
technical documentation? And as I said earlier, I don't want to get too
deep into assembly language; I'd just like to understand how things
work when I compile, for instance, a C file.
Cheers,
Thanks again
Pascal
gas runs on Windows (the Windows version is part of the "MinGW" package
among others), and NASM runs on just about everything, including Linux,
Windows, and MacOS. Most Linux distributions include NASM.
-hpa
I've split off Nasm instructions to a separate page and allowed for
both options.
Can you remember what was broken about the Nasm version you saw and
roughly how long ago that was?
James
Yes
> Each has its own language and set of rules (like movl is specific to
> Gas...) ?
Yes, though most implement a similar language. Gas is particularly
different.
> Gas is specific to Linux (it won't work on Windows) ?
Per HPA's answer there is a version of Gas for Windows.
> But as
> i think i did already, i can install Nasm in Linux ?
Yes, you can install it on a number of operating systems.
>
> Each processor (Intel, AMD) works differently wether it's dealing with
> Gas, Masm, Nasm, Fasm, Yasm ?
I'm not sure what you mean. To generate a given program (which will
work on various Intel and AMD CPUs) you need to write the source
expected by the assembler chosen.
Intel and AMD make compatible CPUs. There are minor differences but in
general code you write for x86 will run on CPUs from either
manufacturer.
> That's why posts above say to read the processor manufacturer
> documentation.
>
> But i believe i need to understand the basics before i get into
> technical documentation? And as i said earlier, i don't to get too
> deep into assembly language, i'd just like to understand how things
> work when i compile for instance, a C file.
OK. Don't be afraid to look at the Intel instruction set manual
volumes, though.
James
Thanks
> > Each processor (Intel, AMD) works differently wether it's dealing with
> > Gas, Masm, Nasm, Fasm, Yasm ?
>
> I'm not sure what you mean. To generate a given program (which will
> work on various Intel and AMD CPUs) you need to write the source
> expected by the assembler chosen.
>
I think you answered my question, even though English is not my mother
tongue and so it's not easy to read me right.
Just to make sure: for instance, can a Masm program be different
depending on whether it runs on Intel or on AMD or any other, given
that, as you say, there are minor differences between the two? I don't
even have any clue about other processor names (AMD, ARM, IBM, Intel,
MIPS, Motorola, NEC, SUN, TI, Transmeta, VIA; list from the bottom of
the page at http://en.wikipedia.org/wiki/Processors). Are these the
main ones? Then, for instance, can Masm be different from AMD to VIA?
Is it possible to write assembly code for mobile phones?
You say "various" Intel and AMD. Do you mean Intel has different sets
of rules for its x86 processors (Celeron, Atom, Dual Core...)?
Then, as I work on a computer with an Atom processor, do I need to find
the specific Atom documentation? Though I would just like some kind
of introduction to assembly in general...
Given that time is limited, do I have to concentrate my efforts on
Gas? The document written by Jonathan Bartlett, "Programming from the
Ground Up", is really accessible and friendly to read and
understand. It deals with Gas.
Or should I learn Asm from another accessible document, "PC Assembly
Language" by Paul Carter?
As it was said that Gas has the syntax most distant from the others, I
should perhaps first concentrate on Asm. But as it's an introduction
(maybe 100 hours), I don't mind; I just want to focus on something that
can help me understand higher-level languages such as C.
Thx,
pascal
x86 is over thirty years old and has been extended many times. In
general most people ignore anything prior to the 386 these days,
because that's the first CPU that supported 32-bit mode, and everyone
wants to avoid both 16-bit real and 16-bit protected mode (and most
x86 OSs require 32-bit mode anyway these days). Not all of those
extensions have been adopted by all of the manufacturers (and some
have simply died out), and sometimes figuring out which ones are which
is a bit trying, but the core stuff is pretty consistent between the
various x86 vendors. Some stuff is less so. For example, some of
the Atoms don't implement 64-bit mode, so you can't run a 64-bit OS or
64-bit applications on them. Other extensions, like the various SSE
extensions, are also implemented somewhat irregularly, especially
the newer versions. The more modern extensions have a well
defined (and fairly consistent) way of detecting their presence; some
of the stuff going back to the Pentium and before is much more ad
hoc. In the Intel manuals, the sections on architectural
compatibility go a long way toward sorting out what goes where, at
least from an Intel perspective.
But if you're learning assembly on x86, just stick with the core
instructions, and don't worry about the rest of the stuff.
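For the curious, the "well defined way of detecting their presence" is the CPUID instruction, and it can be tried from C. This is a minimal sketch, assuming GCC or Clang (it relies on their cpuid.h header and __get_cpuid helper); the struct and function names are made up for illustration. It reads three feature bits: SSE2, SSE3, and long mode (64-bit support).

```c
#include <stdbool.h>

#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>            /* GCC/Clang helper: __get_cpuid() */
#define HAVE_X86_CPUID 1
#else
#define HAVE_X86_CPUID 0      /* not an x86 target: report nothing */
#endif

/* Hypothetical result type, for illustration only. */
struct x86_features {
    bool is_x86;     /* false when compiled for a non-x86 target */
    bool sse2;       /* CPUID leaf 1, EDX bit 26 */
    bool sse3;       /* CPUID leaf 1, ECX bit 0 */
    bool long_mode;  /* CPUID leaf 0x80000001, EDX bit 29 */
};

/* Ask the CPU which of a few extensions it implements. */
struct x86_features query_x86_features(void)
{
    struct x86_features f = { false, false, false, false };
#if HAVE_X86_CPUID
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
    f.is_x86 = true;
    /* __get_cpuid returns 0 if the requested leaf is not supported. */
    if (__get_cpuid(1, &eax, &ebx, &ecx, &edx)) {
        f.sse2 = ((edx >> 26) & 1u) != 0;
        f.sse3 = (ecx & 1u) != 0;
    }
    if (__get_cpuid(0x80000001u, &eax, &ebx, &ecx, &edx))
        f.long_mode = ((edx >> 29) & 1u) != 0;
#endif
    return f;
}
```

On an Atom without 64-bit mode, long_mode would come back false while the SSE bits could still be set, which is the kind of irregularity described above.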
The assembler syntax is, as others have mentioned, specific to the
assembler. And the way you call the OS, or other programs/functions,
is specific to the OS you're using. But the x86 instructions remain
the same. FWIW, if you're using GAS, it has an "Intel" mode that
makes its syntax more like the one Intel uses in its documentation.
Anyway, x86 assembler, with the above caveats, is the same on any x86
processor, no matter the manufacturer: Intel, AMD, Via, etc.
Many other, non-x86, instruction set architectures (ISAs) exist.
These span a huge range of sizes, styles, purposes, and whatnot. For
example, PowerPC is implemented by many PPC microcontrollers, as well
as the POWERx chips used in some very large IBM servers. Many mobile
phones have an ARM CPU or two in them. IBM mainframes implement the
current version of the ISA defined by S/360 45 years ago. Many small
CPUs, often intended for embedded applications, exist. The decades-old
8-bit 8051 is still heavily used (as one example of many), and is
available in literally hundreds of versions from dozens of vendors.
Nor is there any real pattern to the manufacture of these devices.
Intel, for example, manufactures, in various forms, CPUs based on the
x86, IPF, 8051 and ARM ISAs (as well as several others), none of which
are compatible with each other, and while Intel is the only
manufacturer of IPF chips, the others all have additional sources.
IBM manufactures POWER CPUs, as well as some non-POWER PPC devices and
zSeries (current S/360 ISA) CPUs, and has in the past manufactured x86
chips.
> Is it possible to write assembly code for Mobile phones ?
Sure, so long as the OS on that device allows you to load code like
that. Most smartphones use ARMs.
> You say "various" Intel and Amd. Do you mean Intel has different set
> of rules for x86 processors (Celeron, Atom, Dual Core...)?
>
> Then as i work on a computer with an Atom processor, i need to find
> that specific Atom documentation? Tough, i just would like some kind
> of initiation to assembly in global...
Sure. As mentioned above, some Atoms don't have 64-bit mode. Intel
sells a bunch of CPUs of various vintages. The older models still on
sale don't implement all the latest instruction set extensions that
the newest models do. Some CPUs support the virtualization
extensions, others do not.
> Provided time can't be extended, do i have to concentrate efforts on
> Gas? This document written by Jonathan Bartlett "Programming from
> ground up..." is really accessible and friendly to read and
> understand. It deals with Gas.
>
> Or should i learn Asm from another accessible doc "PC Assembly
> Language" from Paul Carter?
>
> As it said Gas has the most distant syntax from others, I should first
> concentrate on Asm. But as it's an initiation (maybe 100 hours), i
> don't mind, i just want to focus on something that can help understand
> higher level language such as C.
Pick an assembler that works well on your platform. One with a lot of
sample code is a plus too. Once you get some experience under your
belt, you'll realize that the particular syntax a particular assembler
uses is mostly irrelevant. While learning ARM assembler after
learning x86 assembler is a fairly big chore (since they're so
different, although usually once you've got one ISA under your belt,
learning a new one is significantly easier), learning MASM syntax for
x86 after learning GAS syntax for x86 is not really a big deal.
Basically, get out there and learn some assembler, and worry about the
rest later.
Hi Pascal,
Hey, you've already got a language named after you! :)
I don't think this is a "language problem"... at least not human
languages. Your question is a good one, but hard to answer in few words.
Let me have a shot at it...
First, I'm going to introduce still another language: machine language.
This is what the CPU actually "sees" and runs. Regardless which
assembler you use... or C compiler, or other language... the machine
language would be the same (with minor exceptions). A different CPU
would (might) use a different machine language.
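A concrete way to see this: in 32-bit code, the GAS line "movl $1, %eax" and the Nasm/Masm line "mov eax, 1" both assemble to the same five bytes, B8 01 00 00 00. The C sketch below builds those bytes by hand. The function and enum names are made up for illustration, and only this one instruction form (move a 32-bit immediate into a 32-bit register) is covered.

```c
#include <stddef.h>
#include <stdint.h>

/* Register numbers as they appear in x86 machine code (first four only). */
enum reg32 { EAX = 0, ECX = 1, EDX = 2, EBX = 3 };

/* Encode "mov r32, imm32" for 32-bit code: one opcode byte, 0xB8 plus
   the register number, followed by the immediate in little-endian
   order. Writes 5 bytes to out and returns that length.
   Hypothetical helper, for illustration only. */
size_t encode_mov_r32_imm32(uint8_t out[5], enum reg32 reg, uint32_t imm)
{
    out[0] = (uint8_t)(0xB8u + (unsigned)reg);
    out[1] = (uint8_t)(imm & 0xFFu);          /* lowest byte first */
    out[2] = (uint8_t)((imm >> 8) & 0xFFu);
    out[3] = (uint8_t)((imm >> 16) & 0xFFu);
    out[4] = (uint8_t)((imm >> 24) & 0xFFu);  /* highest byte last */
    return 5;
}
```

Calling encode_mov_r32_imm32(buf, EAX, 1) fills buf with B8 01 00 00 00, whichever assembler syntax you would have used to write the instruction.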
The "x86" in the name of this newsgroup refers to the "x86 family" of
CPUs. This started with the 8086, then 80186, 80286, 80386, 80486... at
this point, Intel introduced "Pentium" which unlike "586" is
"registerable" as a trademark (AMD can't call their CPU a "Pentium",
even if it's essentially the same thing) . The progression continues...
I guess we're up to 80886 or beyond, but names are mostly used rather
than numbers, these days. Newer versions introduce new instructions -
which won't work on older models, of course - but the old instructions
still work (with *very* rare exceptions).
Intel, AMD, and other manufacturers make chips besides the "x86 family",
of course, but when we say "Intel processor" (in this newsgroup, at
least) we usually mean "x86 family". Macs used to use a Motorola CPU -
completely different machine language - different "architecture" - but
they've switched to Intel. Most any "desktop" computer or laptop you
encounter these days will be running an "x86 family" CPU.
> Just to make sure, for instance, a Masm program can be different if it
> runs on Intel or if it run on AMD or any other provided as you say
> there are minor differences between the two.
The differences are "more minor" than that. For "beginner purposes", you
can consider any "x86 family" as identical. There are "newer
instructions", but you probably won't be using them at first.
> I don't even have any
> clues about other processor names (AMD, ARM, IBM, Intel, MIPS,
> Motorola, NEC, SUN, TI, Transmetta, VIA (list from
> http://en.wikipedia.org/wiki/Processors
> (bottom of the page)? Are these the main ones?
I guess so. I'm not familiar with anything but "x86 family".
> Then for instance, Masm
> can be different from AMD to VIA ?
Well, no... assuming we're talking about an "x86 family" CPU from each.
I don't think Masm will produce machine language for anything *but* x86.
I know Nasm won't. There's a "Nasm offshoot" called "Narm" for ARM
registered on SourceForge - I don't know if it even works. I suspect,
although I'm not sure, that there *are* versions of Gas which will emit
machine code for some (all?) of these other architectures you mention.
> Is it possible to write assembly code for Mobile phones ?
Sure. Assembly language is just a human-readable(?) representation of
machine language, and *any* CPU runs its own machine language.
Probably(?) not x86 family, though...
> You say "various" Intel and Amd. Do you mean Intel has different set
> of rules for x86 processors (Celeron, Atom, Dual Core...)?
Newer instructions won't work on older processors, of course. And there
are (wildly!) different rules for writing the fastest instructions (and
combinations of instructions) on different processors. But if you start
out assuming they're "all the same", it'll be a long while before you
notice that it isn't really true. :)
> Then as i work on a computer with an Atom processor, i need to find
> that specific Atom documentation?
Ideally, Atom-specific documentation written in AT&T (Gas) syntax... or
Nasm syntax, even. (Nasm is "similar" to "Intel syntax" but not quite
the same). I don't think you're going to find that. I assume "Atom" is a
64-bit CPU? Neither of your proposed books cover that. No matter, the
32-bit stuff will work. When you're ready to "unlock the advanced
features of the Atom" (I assume there are some), you'll need
Atom-specific documentation.
> Tough, i just would like some kind
> of initiation to assembly in global...
Unfortunately, assembly language isn't very "global"...
> Provided time can't be extended, do i have to concentrate efforts on
> Gas? This document written by Jonathan Bartlett "Programming from
> ground up..." is really accessible and friendly to read and
> understand.
Agreed! I wish PGU had been available when I first started learning this
stuff. I'd probably be a Gas user instead of the devout Nasmist I am. :)
> It deals with Gas.
Yes. Some people say Gas/AT&T syntax is "ugly". I won't disagree, but
you can get used to it, I guess. Some people say Gas is only intended as
a backend to gcc and doesn't handle human-generated errors very well.
Probably true at one time, but I think it does fine now. A few years
back, the ".intel_syntax noprefix" directive was added (I guess they
agree that their syntax is ugly). This allows "Intel syntax" - more like
Masm than Nasm, but they're getting closer. :) (PGU uses "straight" AT&T
syntax, so it won't help you)
FWIW, Yasm will accept either Gas or Nasm syntax, I understand. Haven't
tried it.
Wilhelm Zadrapa has translated the examples from PGU into Nasm syntax:
<http://home.myfairpoint.net/fbkotler/nasm-pgu-examples.tar.bz2>
Jonathan was receptive to the idea of a "Nasm syntax version" of PGU,
but I don't think anyone has done anything about it, besides the
examples. So, yeah, it deals with Gas...
It also deals with Linux. That's a "plus", for me, but it isn't the most
"popular" platform. I personally think that it's a lot easier to get a
simple program up and running under Linux than under Windows, but
opinions vary on that. If you're okay with Linux, that shouldn't be an
issue.
> Or should i learn Asm from another accessible doc "PC Assembly
> Language" from Paul Carter?
The beauty of Dr. Carter's book - besides that it uses Nasm - is that
you can use it from Dos (djgpp), Windows (several compilers), Linux,
BSD... ??? Maybe even Mac(?). He accomplishes this magic by using C to
interface with the OS. You might want to look at how he does that - but
perhaps not right at first...
I should mention that Jeff Duntemann's Third Edition of "Assembly
Language Step by Step" is out. Earlier editions concentrated on Dos -
this one uses Linux - and Nasm! :)
... but perhaps you've got enough decisions...
> As it said Gas has the most distant syntax from others, I should first
> concentrate on Asm. But as it's an initiation (maybe 100 hours), i
> don't mind, i just want to focus on something that can help understand
> higher level language such as C.
I think you'd get that from either of the books you mention, and others.
Since you seem to "like" Jonathan Bartlett's book, it's probably a good
place for you to start...
Best,
Frank
You are almost right. Intel and AMD CPUs execute the same machine code
(which is what assembler is translated into). However, later CPUs from
both Intel and AMD have additional instructions that are not available
in earlier CPUs. In the Nasm documentation there is a table of
instructions showing which CPU they were introduced into.
The sequence went something like 386, 486, Pentium, Pentium with MMX,
Pentium Pro, Pentium II, Pentium III, etc. If you write code for a
Pentium, for example, it should run on a Pentium with MMX and all
later models. You can tell Nasm which CPU you are writing for with a
"cpu" directive then Nasm will make sure you are not using
instructions which were introduced later. See the Nasm manual for
this.
On the other hand, CPUs from ARM, IBM, MIPS, etc each have their own
machine code and are not compatible with Intel and AMD chips.
> Are these the main ones?
In terms of how much they are used in familiar computers Intel and
Intel-compatible CPUs are the main ones. They have a ninety-something
percent share in the desktop PC and laptop markets.
> Then for instance, Masm
> can be different from AMD to VIA ?
I think VIA makes a range of CPUs which *are* compatible with Intel
and AMD.
>
> Is it possible to write assembly code for Mobile phones ?
All devices with CPUs can have assembly written for them - if you can
find an assembler and a means of loading the program on to the device.
>
> You say "various" Intel and Amd. Do you mean Intel has different set
> of rules for x86 processors (Celeron, Atom, Dual Core...)?
Yes, in part. Code for earlier Intel and AMD CPUs works on later ones.
Later CPUs may use instructions which are not present on earlier CPUs.
>
> Then as i work on a computer with an Atom processor, i need to find
> that specific Atom documentation? Tough, i just would like some kind
> of initiation to assembly in global...
You can program the Atom using 386 instructions or 486 instructions or
Pentium instructions etc. It's probably a good idea initially to use
older instructions even on the Atom as then you can be sure you will
be learning the fundamental instructions and not the fancy ones added
later.
>
> Provided time can't be extended, do i have to concentrate efforts on
> Gas? This document written by Jonathan Bartlett "Programming from
> ground up..." is really accessible and friendly to read and
> understand. It deals with Gas.
>
> Or should i learn Asm from another accessible doc "PC Assembly
> Language" from Paul Carter?
I'd recommend using both for a few weeks. After a while you will know
which one you want to concentrate on. As others have said there are
other introductory texts.
>
> As it said Gas has the most distant syntax from others, I should first
> concentrate on Asm. But as it's an initiation (maybe 100 hours), i
> don't mind, i just want to focus on something that can help understand
> higher level language such as C.
Given your requirements I wonder if you would benefit from the book
Inner Loops by Rick Booth. It was written in the days of the Pentium
(so uses fundamental instructions which are present on all later
CPUs). In Chapter 2 he brilliantly describes 32-bit assembler. I've
not seen a clearer explanation. And in Chapter 8 he shows how C
components translate to assembly language. Due to the book's age
second hand copies can be picked up quite cheaply from Amazon.
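In the meantime, you can watch the translation happen with your own compiler. The following is only a sketch: sum_ints is a made-up example function, and the comments describe what a compiler typically does on x86, not the exact output, which varies with compiler, version and options.

```c
#include <stddef.h>

/* Add up the n ints starting at address a. */
int sum_ints(const int *a, size_t n)
{
    int total = 0;                   /* usually kept in a register      */
    for (size_t i = 0; i < n; i++)   /* compare + conditional jump      */
        total += a[i];               /* address is a + i * sizeof(int)  */
    return total;                    /* int results come back in EAX    */
}
```

With GCC, "gcc -S -O1 sum.c" writes the generated assembly to sum.s in AT&T (Gas) syntax, and adding "-masm=intel" switches the listing to Intel syntax, so the same function can be read both ways. The a[i] line is also a preview of pointers: the array index turns into plain address arithmetic.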
James
It was about a year-or-so ago so I do not remember the particulars.
However, I have just now performed the "apt-get" task and this is what
resulted:
nathan@aspireone:~$ uname -a
Linux aspireone 2.6.31-16-generic #52-Ubuntu SMP Thu Dec 3 22:00:22 UTC 2009 i686 GNU/Linux
nathan@aspireone:~$ nasm -v
NASM version 2.05.01 compiled on Nov 5 2008
Hmm... while typing this post, I noticed that the "Update Manager"
popped into action. I looked at the list, but do not see any entries
for Nasm. I guess I will have to update Nasm manually.
Not a big deal. Just that I think it is pointless for Linux users to
be wrestling bugs that have already been fixed in later versions.
Nathan.