how do you start learning assembly language

Greg

Jan 6, 2008, 11:07:52 AM

My name is Greg and I am new to this forum. I am interested in learning
assembly language but do not know where to start. Can someone tell me
how to begin and what to use? I have tried to research the subject,
but it is all confusing. Which assembly languages should I learn?

regards

Greg

hutch--

Jan 6, 2008, 11:56:49 AM

Greg,

For Windows, www.masm32.com

For Linux, ask some of the members for the most current method.

Regards,

hutch at movsd dot com

Betov

Jan 6, 2008, 12:13:08 PM

hutch-- <hu...@movsd.com> wrote in news:e80faf71-982c-4716-af2c-b54aa9...@i3g2000hsf.googlegroups.com:

> For Windows, www.masm32.com

Illegal, very weird syntax, not conforming with the actual one,
as defined by NASM, and as applied by all the actual Assemblers.

Betov.

< http://rosasm.org >

Wolfgang Kern

Jan 6, 2008, 1:01:59 PM

Greg asked:

Hello and welcome to the arena,
where best to start will depend on what you already know about
programming in general, which environments you're familiar with,
and not least on the CPU family you are interested in learning.

BASIC ?, C ?, ...
DOS/windoze/L'unix ?

I assume x86..AMD64 CPUs are what you are looking for. Here I'd use
FASM, NASM, RosAsm (Win PE) or even MASM tutorials, alongside the
indispensable CPU manuals (free downloads from Intel and AMD).

There are others too, but these use a weird, confusing syntax like
AT&T/GAS, so you won't find a match with the CPU manuals.

__
wolfgang

Herbert Kleebauer

Jan 6, 2008, 1:25:32 PM

That depends on why you want to learn assembly programming. If
you are interested in hardware, then start with a simple 8 bit
controller, for example the Atmel AVR family. You can download
the development software from Atmel's web site. And even with
a simple hobby equipment you can build your first system.

If you are only interested in programming a CPU without doing
hardware experiments (but you really will miss much), you can
also program the CPU of your PC.

Anyhow, you always have to start by reading the CPU manual, which
you can download in PDF format from the manufacturer's web site.
Intel also offers free printed versions for the x86 CPUs. If
you choose the x86 architecture, use Google to search for a 386
manual, which is much smaller than the current Pentium manuals.

Keith Kanios

Jan 6, 2008, 2:26:33 PM

The following will cover both Windows and Linux x86/64 assembly
language without incurring extra learning due to the use of proprietary
tools.

http://nasm.sourceforge.net/
http://www.asmcommunity.net/projects/nasmx/
http://www.asmcommunity.net/

Ratch

Jan 6, 2008, 4:56:27 PM

"Betov" <be...@free.fr> wrote in message
news:XnF9A1DB95855...@212.27.60.40...

What's "illegal" and "weird" about the MASM syntax? MASM was around
before NASM, right? If anything, NASM should conform to MASM. Are you
saying that MASM is not an "actual" assembler? If so, you are making a damn
fool of yourself. Ratch

Rod Pemberton

Jan 6, 2008, 5:57:28 PM

"Greg" <gregch...@gmail.com> wrote in message
news:4ab77345-451e-4f63...@e23g2000prf.googlegroups.com...

> what assembly languages should i learn

Others might say learn for ARM cpu's, etc., since they dominate the small
device market for embedded cpu's (phones, mp3 players, etc.). I'd say
assembly for Intel or AMD cpu's. They dominate the PC and (progressively)
the supercomputer, mainframe, and miniframe markets.

> i am interested in learning
> assembly language but do not know where to start,

The first thing you want to _really_ learn well is hexadecimal (Real
oldtimers will say octal, but hex is far more useful.). You should memorize
the 4-bit binary (nybble) sequences for all sixteen values:
0,1,2,3,4,5,6,7,8,9,a,b,c,d,e,f. Next thing you should _really_ learn well
is how these three bitwise operators work: and, or, xor. Start with 2-bit
combinations. Then you should learn about one's and two's complement in
binary - which are used for arithmetic. Next, you need to get the cpu and
instructions manuals. Now, you want to learn the register set for the
microprocessor you choose. You want to know their names, their sizes (8-bit
or byte, 16-bit or word, 32-bit or dword), if one register is a sub-register
of another register, etc. Then, you can begin learning how each instruction
affects certain registers, flags, or memory. Along the way you'll pick up
lots of other stuff: BCD, ASCII, signed vs. unsigned, etc., etc.
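
As a rough illustration of the first few steps in that list (the nybble patterns, and/or/xor, and one's and two's complement), here is a small Python sketch. It is not part of the original post. The helper names are made up for the example, and the 8-bit default width is an assumption chosen to keep the numbers short.

```python
# Illustrative sketch only: all helper names here are hypothetical.

def nybble_table():
    """Map each hex digit to its 4-bit binary pattern."""
    return {format(v, "x"): format(v, "04b") for v in range(16)}

def ones_complement(value, bits=8):
    """Flip every bit of `value` within a `bits`-wide word."""
    mask = (1 << bits) - 1
    return ~value & mask

def twos_complement(value, bits=8):
    """Negate `value` within a `bits`-wide word: flip all bits, then add one."""
    mask = (1 << bits) - 1
    return (ones_complement(value, bits) + 1) & mask

def to_signed(value, bits=8):
    """Interpret a `bits`-wide unsigned bit pattern as a signed number."""
    sign_bit = 1 << (bits - 1)
    return value - (1 << bits) if value & sign_bit else value

if __name__ == "__main__":
    # The sixteen nybble patterns worth memorizing.
    for digit, pattern in nybble_table().items():
        print(digit, pattern)
    # and, or, xor on two 4-bit values.
    a, b = 0b1100, 0b1010
    print(format(a & b, "04b"), format(a | b, "04b"), format(a ^ b, "04b"))
    # The same bit pattern read as unsigned (0xff) and as signed (-1).
    print(hex(twos_complement(1)), to_signed(0xFF))
```

The same byte 0xFF can be read as 255 or as -1, which is the signed vs. unsigned distinction mentioned at the end of the paragraph.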

> what do i use,

Unfortunately, there are many choices here... You'll need a computer,
probably a PC, with an OS (Windows, Linux, DOS, etc.). ;) Then you'll need
an assembler which works for your OS. For x86 (AMD, and Intel) cpu's, this
is how I rank some of the more widely used assemblers:

NASM - easy, clean syntax, not OS specific, IMO better than the others at
creating special forms of certain instructions
MASM - has many weird syntax operators, but is the original MS x86 assembler
(Windows, DOS) - so it has a large codebase
AT&T or GAS - powerful, used by FSF's (e.g., GNU C, Linux) compilers and
utilities, not easy to think in since arguments are reversed, difficult
syntax, but easy to convert to from NASM syntax. Can be very confusing
since the format is different from the Intel/AMD manuals.
HLA - assembly in a pseudo-C format (That's _not_ how its author would
describe IT...)

There are numerous others, TASM, WASM, YASM, FASM. The more used ones are
frequently clones or derivatives of either MASM or NASM... Then, of course,
we have some of the assemblers of others here: Windela, RosAsm, etc.

Rod Pemberton

Greg

Jan 7, 2008, 10:43:21 AM

thanks to everyone for the information it really helped.
i am now researching on how to take this further using the information
that all of you have been so liberal with, many thanks.
i have never worked with linux so want to do the programming using
windows, i thought it would be rather difficult to start learning
linux whilst also trying to learn assembly programming. i have
experience with C/C++, VB/VB.Net, Java, Delphi.

I did some computer logic at college in a module called Digital Logic
and Design. That was back home in Zimbabwe; I am now based in South
Africa. I got a distinction for the module, but back then I wasn't
really much into the stuff, I just did it because I had to. So I will
have to do a refresher on that, since I have just realised that an
assembly programmer will need all that background. I am also catching
up on my knowledge of computer architecture. It seems that becoming an
assembly programmer is quite a mountain. I am probably nowhere near
the summit, but I will work my way up, especially with such a forum
as a resource.

Please do keep the posts coming, especially the links and step-by-step
info, because I am really intent on becoming an assembly programmer.

I think, from what everyone is saying, I will go for NASM. For the OS I
will go for Windows/DOS, and I will program for the CPU. I would love
to experiment with h/w, but I also wanted something cost-effective, so
I will do h/w later when circumstances allow. I will use Intel; mine
is 32-bit.

thnx again

cheers

Anthony Fremont

Jan 7, 2008, 12:20:31 PM

I'll say it. ;-) Any assembler that takes the syntax I wrote and assumes
that I must have actually meant something else is pretty much crap. I'd
rather have an error than have to look at the actual machine code
generated to determine what went wrong. If I'm using an illegal register
(8088 days) or addressing mode, just give me an error.

I suggest the OP learn ARM assembler first. It's a lot cleaner and easier
to grasp IMO.

//\\o//\\annabee

Jan 7, 2008, 9:20:15 PM

for windevs, there is _only one_.

- RosAsm

If you are just looking to pretend to be an asmer, in front of
complete beginners, to boost your ego, use any of the other ones.

If you choose the _only one_, you will be on your ass laughing at 95.5%
of the morons posting to a programming ng about assemblers, within
4 weeks.

Those people have implants, so tightly grown, over such a long time
that they are not even aware of it.

And choosing RosAsm - _You_ will be able to see it.

( I just call it CLS - Corporate Lobotomy Syndrome
also known in certain circles as "Clear Screen" ).

;-D

Its gad damn true, _every word_!

Wolfgang Kern

Jan 7, 2008, 3:38:45 PM

Ratch replied to betov's notes,

>> Illegal, very weird syntax, not conforming with the actual one,
>> as defined by NASM, and as applied by all the actual Assemblers.

> What's "illegal" and "weird" about the MASM syntax? MASM was around
> before NASM, right? If anything, NASM should conform to MASM. Are you
> saying that MASM is not an "actual" assembler? If so, you are making a
> damn fool of yourself. Ratch

Yes, MASM predates FASM and NASM. But what may be the purpose for
FASM, NASM, RosAsm and KESYS-script to exist at all?
For sure not because MASM was once "that" superior.
MASM lacks strict/native addressing modes by default (period).
(May Lord Logic save us from:) GAS and even HLA are better on this ...

But the 'Iczelion' and 'skeleton' tutorials, or their transformed variants
found in any tool (which is worth mentioning at all), show how to
talk to the CPU w/o detouring through LIBs or weird combinations of API calls.
__
wolfgang

Wolfgang Kern

Jan 7, 2008, 3:17:16 PM

"Ratch" <wat...@comcast.net> schrieb im Newsbeitrag
news:QNmdnXbYWP0S0xza...@comcast.com...

Wolfgang Kern

Jan 7, 2008, 4:04:22 PM

Wannabee replied to Greg:
...

>> I think from what I have got from what everyone is saying i will go
>> for NASM for OS, I will go for Windows/DOS , i will program for CPU, i
>> would love to experiment with h/w but i also wanted a cost effective
>> thingy so i will do h/w later when circumstances allow, i will use
>> Intel, i have a 32bit .

>> thnx again

> for windevs, there is _only one_.

> - RosAsm

> If you are just looking to pretend to be an asmer, in front of
> complete beginners, to boast your ego, use any of the other once.

> If you chose the _only one_, you will be on you ass lauging at 95.5%
> of the morons posting to a programming ng about assemblers, before
> 4 weeks.

Att Greg:
I can confirm a pretty high learning speed for WinDoze if you use RosAsm.
But if you're interested in HW-programming (BIOS calls and/or DOS)
check on 16-bit options given in FASM/NASM (or even MASM) in addition.
And if L'unix is your main target-OS then NASM should be your choice.

__
wolfgang

Ratch

Jan 7, 2008, 5:26:31 PM

"Anthony Fremont" <nob...@noplace.net> wrote in message
news:13o4nr1...@news.supernews.com...

Does MASM do what you describe above? If so, can you give an example?
Ratch

Ratch

Jan 7, 2008, 5:32:13 PM

"Wolfgang Kern" <now...@never.at> wrote in message
news:flu2m1$7o5$1...@newsreader1.xoc.utanet.at...

Can you give a short example where MASM does not do something as good
as most assemblers, or performs worse than most? Ratch

Wolfgang Kern

Jan 7, 2008, 6:15:48 PM

Ratch asked:
...

> Can you give a short example where MASM does not do something as good
> as most assemblers, or performs worse than most? Ratch

just one:

MOV eax,[data_label+offset*8+40h]

__
wolfgang

Frank Kotler

Jan 7, 2008, 6:44:47 PM

Greg wrote:

...

> I think from what I have got from what everyone is saying i will go
> for NASM

Excellent choice! (IMHO)

> for OS, I will go for Windows/DOS ,

Well, okay... I suppose you don't need to learn Linux, too...

> i will program for CPU,

Good! Many of the tutorials and examples you'll encounter will almost
give you the impression that you "have to" use "invoke". You probably
*will* want to use it, but "invoke" is *not* a CPU instruction. You'll
want to learn what *instructions* are involved when you use "invoke"!
It's easy enough...

invoke SomeFunc, param1, param2, param3

emits:

push param3
push param2
push param1
call SomeFunc

As you can see, you don't "need" to use it, but push-push-push-call gets
old pretty quick. You'll also want include files with Windows equates
(long names for small integers). (converted from C .h files mostly, so
you're probably familiar with 'em)
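
The invoke expansion described here is mechanical enough to sketch in a few lines of Python. This is only an illustration of the right-to-left push order from the example; it makes no claim about how any real assembler or macro package implements "invoke", and `expand_invoke` is a hypothetical helper name.

```python
# Illustrative sketch only: expand_invoke is a made-up helper, not a real tool.

def expand_invoke(line):
    """Expand 'invoke Func, a, b, c' into the push/call lines it stands for."""
    keyword, rest = line.strip().split(None, 1)
    if keyword.lower() != "invoke":
        raise ValueError("not an invoke line")
    parts = [p.strip() for p in rest.split(",")]
    func, params = parts[0], parts[1:]
    # Parameters are pushed right to left, so the first one ends up on top
    # of the stack when the call is made.
    return [f"push {p}" for p in reversed(params)] + [f"call {func}"]

if __name__ == "__main__":
    for out in expand_invoke("invoke SomeFunc, param1, param2, param3"):
        print(out)
```

Run on the line from the post, it prints the same push-push-push-call sequence.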

You'll probably want that "NASMX" package from Keith and Bryant:

http://www.asmcommunity.net/projects/nasmx/

I see they've switched to Jeremy Gordon's "GoLink", etc. I vaguely
recalled that Jeremy had a nice introductory tutorial to Windows
programming on his site:

http://www.jorgon.freeserve.co.uk/

It's gotten even better than I remembered! Great resource! Thanks
Jeremy! The tutorials and examples are in GoAsm syntax, of course, but
close enough to Nasm that it shouldn't be a problem. (you might even
want to try his assembler!)

Another source of examples is the NaGoA project - seems to be "sleeping"
lately, but they've got a raft of examples:

http://www.visual-assembler.pt.vu/

There's also a "Yahoo group":

http://tech.groups.yahoo.com/group/win32-nasm-users/

The list is not very active, but they've got a bunch of examples in the
"files" section - *two* translations of the examples from the Iczelion
tutorials so that they'll assemble with Nasm. Note that this isn't a
"translation of the Iczelion Tutorials". They remain here:

http://win32assembly.online.fr/tutorials.html

Not Windows-specific - in fact platform independent (he does this trick
by interfacing with C, which may interest you) - Dr. Carter's tut may
provide a more generic "introduction to assembly", which some of the
other tuts may tend to assume...

http://www.drpaulcarter.com/pcasm/

That ought to keep you busy for a while. If by any chance there are any
questions <G>, get back to us!

Best,
Frank

Evenbit

Jan 7, 2008, 8:40:05 PM

On Jan 7, 10:43 am, Greg <gregchipo...@gmail.com> wrote:
>
> I have done some computer logic at college when we did a module called
> Digital Logic and Design thats back home in Zimbabwe, i am now based
> in South Africa, i got a distinction for the module but back then i
> wasn't really much into the stuff, i just did it because i had to, so
> I will have to do a refresher on that since i have just realised that
> to be an assembly programmer one will sure need all that background
> info, I am also catching up on my knowledge on Computer Architect. It
> seems that to be a assembly programmer its quite a mountain. I am
> probably now where near the summit but i will sure work my way up
> especially with such a forum as a resource tool.

Most of what you'd want to know about computer architecture is covered
in Volume 2 of Randy's Art of Assembly book:

http://webster.cs.ucr.edu/AoA/Windows/HTML/AoATOC.html

Nathan.
http://del.icio.us/Evenbit/x86

Anthony Fremont

Jan 7, 2008, 11:32:50 PM

Ratch wrote:
> "Anthony Fremont" <nob...@noplace.net> wrote in message

>> I'll say it. ;-) Any assembler that takes the syntax I wrote and

>> assumes that I must have actually meant something else is pretty
>> much crap. I'd rather have an error than to have to look at the
>> actually machine code generated to determine what went wrong. If
>> I'm using an illegal register (8088 days) or addressing mode, just
>> give me an error. I suggest the OP learn ARM assembler first. It's allot
>> cleaner and
>> easier to grasp IMO.
>
>
> Does MASM do what you describe above? If so, can you give an
> example? Ratch

I'll admit that it's been a very long time (late 90's) since I tinkered with
MASM and PC assembler. Not knowing the details of the legal combinations of
registers being used in various addressing modes, I wrote some code as if the
processor was orthogonal. Problem was, it isn't, or wasn't back then.
There were various restrictions on how each register could be used. AIUI,
that has been pretty much eliminated in later CPUs. At any rate, I
discovered that things like brackets weren't treated as expected. In fact
they were pretty much optional as the assembler would just "interpret" the
written code as it saw fit by using a different addressing mode than what I
thought I coded. A few days of that and finding out that even "ideal" mode
in TASM would still do the same kinds of things (so that ancient MS code
examples would assemble), and I'd had enough of that.

Ratch

Jan 7, 2008, 11:55:50 PM

"Anthony Fremont" <nob...@noplace.net> wrote in message

news:13o5v7k...@news.supernews.com...

It is true that MASM & TASM will take the context of the code to
generate what it assumes is the correct instruction if you do not use the
OFFSET directive or square brackets []. But, if you DO use those two
address modifiers, the address is explicit, and MASM/TASM will mark the code
as an error if the op code and address don't match. Of course a "legal"
instruction still might not be what you wanted, but that is not the fault of
the assembler. Anyway, MASM is capable of explicit code generation if you
tell it to do that. Ratch
>
>

Ratch

Jan 8, 2008, 12:09:07 AM

"Wolfgang Kern" <now...@never.at> wrote in message

news:flubtf$sm4$1...@newsreader1.xoc.utanet.at...

Assuming it is MASM code, the instruction will error because you are
using a directive (OFFSET) in a nonsensical way. Describe what you are
trying to do. My assertion is that MASM can generate every coding that the
CPU is able to execute. Ratch

Frank Kotler

Jan 8, 2008, 1:04:28 AM

mov cl, [80h]

...

Best,
Frank

//\\o//\\annabee

Jan 8, 2008, 11:10:44 AM

The real trouble comes with trying to manage a large app _efficiently_.
I have no more trouble maneuvering my 3 MB source than a 50 KB source in
RosAsm.
This is a problem that has a life-or-death impact on a project, and one that
is hard to see until one gets there.

I think I could also do asm with MASM, but nowhere near as easily as in
RosAsm.
Same for NASM, FASM (last time I checked).

But without a runtime, source-level debugger, I would not even try to use
an assembler ever again, unless a very important issue came
up. It is not that it could not be done, but it fundamentally kills the
best thing about RosAsm, which is to play with creativity, because the tool
responds as fast as the mind and fingers are ready to go. For me this leads
to a lot of typos, and the runtime debugger saves my ass time and time
again. It catches close to all the stack mismatches I make, and invalid
pointer assignments.

(The ones I don't see are mostly on code paths that I haven't been
able to run.)

RosAsm also offers macros capable of personalizing the source code, which
I think is great. As long as there are ways to get at the real source code,
I have no problems with macros.

I just don't think it's more meaningful to write each time:

and D$edi + oSectionState (NOT sectionstate_hidden)

( or in "real" asm : and D$edi + 78 0FFFF_FFF7 )

than to just write

show

Only one problem with RosAsm: the useFULLness of the right click
should be replaced by doing this by just _thinking_ it, as this right-clicking
can wear a lot on the shoulders. So I recommend Thai massages + RosAsm.
Alternatively, Indian massage. And also the Ma-uri is good..

:)

Another thing. RosAsm is monosource. This has a lot of benefits,
especially being able to rename the equates to shorter and better ones at
any time you feel like it.

In Delphi, the above "sectionstate_Hidden" would likely be written
ssHidden, and if you suddenly want to do something like that on a RosAsm
source, this is CTRL+R and "do all" and then it's done. The monosource of
RosAsm is a _feature_. And not supporting *.obj files is also a _feature_
in my view. It's all genius if you ask me, from the small details to the
large details.

So this is basically a beautiful tool with 2 design flaws. Cannot do
Telekinesis, nor massages.

> __
> wolfgang

James Van Buskirk

Jan 8, 2008, 3:05:06 AM

"Ratch" <wat...@comcast.net> wrote in message
news:18idnUQYntrtmB7a...@comcast.com...

> Assuming it is MASM code, the instruction will error because you are
> using a directive (OFFSET) in a nonsensical way. Describe what you are
> trying to do. My assertion is that MASM can generate every coding that
> the CPU is able to execute. Ratch

C:\>link /dump /disasm add2.obj
Microsoft (R) COFF/PE Dumper Version 8.00.40310.39
Copyright (C) Microsoft Corporation. All rights reserved.

Dump of file add2.obj

File Type: COFF OBJECT

testme:
00000000: 01 D8 add eax,ebx
00000002: 03 C3 add eax,ebx
00000004: C3 ret

Summary

5 .code

--
write(*,*) transfer((/17.392111325966148d0,6.5794487871554595D-85, &
6.0134700243160014d-154/),(/'x'/)); end

Wolfgang Kern

Jan 8, 2008, 8:33:30 AM

Ratch wrote:

...
>>> Can you give a short example where MASM does not do something as good
>>> as most assemblers, or performs worse than most? Ratch

>> just one:
>> MOV eax,[data_label+offset*8+40h]

> Assuming it is MASM code, the instruction will error because you are

> using a directive (OFFSET) in a nonsensical way. Describe what you are
> trying to do.
> My assertion is that MASM can generate every coding that the
> CPU is able to execute. Ratch

We all know that this is possible, but you may know much better than
a confused newbie which addressing modes need workarounds and what's
not all in the 'reserved' list.

__
wolfgang

Wolfgang Kern

Jan 8, 2008, 8:54:00 AM

Wannabee wrote:

[... about the advantages of RosAsm]

I agree on that.
If RosAsm got additional support for L'unix and AMD64,
then it might become an ultimate tool ...

> So this is basically a beautiful tool with 2 designflaws. Cannot do
> Telekinesis, nor massages.

Telekinesis ?
From assistive technology for the disabled:
Just use two (stereoscopic) webcams which follow your eye focus,
and if you blink three times at a character on screen the magic joins in ;)

And I prefer young naked girls over a PC for massages anyway.
__
wolfgang

//\\o//\\annabee

Jan 8, 2008, 6:56:39 PM

On Tue, 08 Jan 2008 05:54:00 -0800, Wolfgang Kern <now...@never.at> wrote:

>
> Wannabee skrev:
>
> [... about the advatage of RosAsm]
>
> I agree on that.
> If RosAsm would get additional support for L'unix and AMD64,
> then it may become an ultimative tool ...
>
>
>> So this is basically a beautiful tool with 2 designflaws. Cannot do
>> Telekinesis, nor massages.
>
> Telekinesis ?
> Out of handicapped support:
> Just use two (stereoscope) webcams which follow your eye focus,

Is that what was left of the research I heard of with the diodes attached
to the front of the head in order to move the mouse by detecting the muscles
that would move when you got angry enough?

> and if you blink triceley to a character on screen the magic joins in ;)
>
> And I prefer young naked girls over a PC for massages anyway.

:D agreed

> __
> wolfgang
>
>
>

Ratch

Jan 8, 2008, 10:47:01 AM

"Wolfgang Kern" <now...@never.at> wrote in message

news:fm02ig$ds1$1...@newsreader1.xoc.utanet.at...

The reserved word list is documented. What addressing modes need "work
arounds"? Ratch

Ratch

Jan 8, 2008, 11:02:09 AM

"James Van Buskirk" <not_...@comcast.net> wrote in message
news:CY2dnZhK1Ikysx7a...@comcast.com...

> "Ratch" <wat...@comcast.net> wrote in message
> news:18idnUQYntrtmB7a...@comcast.com...
>
>> Assuming it is MASM code, the instruction will error because you are
>> using a directive (OFFSET) in a nonsensical way. Describe what you are
>> trying to do. My assertion is that MASM can generate every coding that
>> the CPU is able to execute. Ratch
>
> C:\>link /dump /disasm add2.obj
> Microsoft (R) COFF/PE Dumper Version 8.00.40310.39
> Copyright (C) Microsoft Corporation. All rights reserved.
>
>
> Dump of file add2.obj
>
> File Type: COFF OBJECT
>
> testme:
> 00000000: 01 D8 add eax,ebx
> 00000002: 03 C3 add eax,ebx
> 00000004: C3 ret

So? They are equivalent instructions. Who cares if the op codes are
different? Here's what I get when I try that with MASM. Ratch

00000000 03 C3 ADD EAX,EBX
00000002 03 C3 ADD EAX,EBX
00000004 C3 RET

//\\o//\\annabee

Jan 8, 2008, 7:58:48 PM

On Tue, 08 Jan 2008 08:02:09 -0800, Ratch <wat...@comcast.net> wrote:

>
> "James Van Buskirk" <not_...@comcast.net> wrote in message
> news:CY2dnZhK1Ikysx7a...@comcast.com...
>> "Ratch" <wat...@comcast.net> wrote in message
>> news:18idnUQYntrtmB7a...@comcast.com...
>>
>>> Assuming it is MASM code, the instruction will error because you
>>> are
>>> using a directive (OFFSET) in a nonsensical way. Describe what you are
>>> trying to do. My assertion is that MASM can generate every coding that
>>> the CPU is able to execute. Ratch
>>
>> C:\>link /dump /disasm add2.obj
>> Microsoft (R) COFF/PE Dumper Version 8.00.40310.39
>> Copyright (C) Microsoft Corporation. All rights reserved.
>>
>>
>> Dump of file add2.obj
>>
>> File Type: COFF OBJECT
>>
>> testme:
>> 00000000: 01 D8 add eax,ebx
>> 00000002: 03 C3 add eax,ebx
>> 00000004: C3 ret
>
> So? They are equivalent instructions. Who cares if the op codes
> are
> different?

Assembly programmers ....

Wolfgang Kern

Jan 8, 2008, 2:00:30 PM

Ratch wrote:
...

>>> My assertion is that MASM can generate every coding that the
>>> CPU is able to execute. Ratch

>> We all know that this is possible, but you may know much better than
>> a confused newbie which addressing modes need workarounds and what's
>> not all in the 'reserved' list.

> The reserved word list is documented. What addressing modes need "work
> arounds"? Ratch

MOV eax,label ;should compile to B8.... mov eax,imm32
MOV eax,[label] ;should compile to A1.... mov eax,[imm32]

AFAIK MASM needs either a 'ptr' or an 'offset' directive and may confuse
itself with the brackets and more if 'label' isn't declared as a dword.
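
To make the two encodings concrete, here is a small Python sketch that builds the byte sequences for both forms in 32-bit mode. The opcodes B8 (mov eax, imm32) and A1 (mov eax, [moffs32]) are the ones named in the comments above; the function names are hypothetical, and the sketch ignores relocation entirely.

```python
# Illustrative sketch only: function names are made up for this example.
import struct

def mov_eax_imm32(value):
    """B8 id: load the 32-bit constant itself into EAX."""
    return b"\xB8" + struct.pack("<I", value)

def mov_eax_moffs32(address):
    """A1 moffs32: load the dword stored at `address` into EAX."""
    return b"\xA1" + struct.pack("<I", address)

if __name__ == "__main__":
    label = 0x00000010  # assumed address of a data label
    print(mov_eax_imm32(label).hex(" "))    # address of the label
    print(mov_eax_moffs32(label).hex(" "))  # contents at the label
```

Both instructions carry the same four little-endian bytes after the opcode; only the opcode decides whether they are treated as a value or as an address to read from.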

__
wolfgang

Wolfgang Kern

Jan 8, 2008, 2:09:04 PM

Wannabee wrote:
...

>> Telekinesis ?
>> Out of handicapped support:
>> Just use two (stereoscope) webcams which follow your eye focus,

> Is that what was left of the research I heard of with the diodes
> attached to the front of the head in order to move the mouse by
> detecting the muscles that would move when you got angry enough?

We still have something similar in use, but EMG/EEG-controlled
devices have joined the scene recently and they work astonishingly well.
__
wolfgang

//\\o//\\annabee

Jan 8, 2008, 11:20:00 PM

< http://dsplab.eng.fiu.edu/DSP/Publications/publication.asp?num=60 >

Like this?

I want one. Then I can program from my bed, without even having to get up.
Then I should mostly be rested when coding, and if not sleepy, code.

> __
> wolfgang
>
>
>

Ratch

Jan 8, 2008, 4:36:16 PM

"//\\o//\\annabee" <w...@www.akow> wrote in message
news:op.t4m00ajdwzh472@cyh1axtn1428g42...

> On Tue, 08 Jan 2008 08:02:09 -0800, Ratch <wat...@comcast.net> wrote:
>
>>
>> "James Van Buskirk" <not_...@comcast.net> wrote in message
>> news:CY2dnZhK1Ikysx7a...@comcast.com...
>>> "Ratch" <wat...@comcast.net> wrote in message
>>> news:18idnUQYntrtmB7a...@comcast.com...
>>>
>>>> Assuming it is MASM code, the instruction will error because you
>>>> are
>>>> using a directive (OFFSET) in a nonsensical way. Describe what you are
>>>> trying to do. My assertion is that MASM can generate every coding that
>>>> the CPU is able to execute. Ratch
>>>
>>> C:\>link /dump /disasm add2.obj
>>> Microsoft (R) COFF/PE Dumper Version 8.00.40310.39
>>> Copyright (C) Microsoft Corporation. All rights reserved.
>>>
>>>
>>> Dump of file add2.obj
>>>
>>> File Type: COFF OBJECT
>>>
>>> testme:
>>> 00000000: 01 D8 add eax,ebx
>>> 00000002: 03 C3 add eax,ebx
>>> 00000004: C3 ret
>>
>> So? They are equivalent instructions. Who cares if the op codes
>> are
>> different?
>
> Assembly programmers ....

Why? Ratch

Ratch

Jan 8, 2008, 5:28:17 PM

"Wolfgang Kern" <now...@never.at> wrote in message

news:fm0hqv$tj5$1...@newsreader1.xoc.utanet.at...

>
> Ratch wrote:
> ...
>>>> My assertion is that MASM can generate every coding that the
>>>> CPU is able to execute. Ratch
>
>>> We all know that this is possible, but you may know much better than
>>> a confused newbie which addressing modes need workarounds and what's
>>> not all in the 'reserved' list.
>
>> The reserved word list is documented. What addressing modes need "work
>> arounds"? Ratch
>
> MOV eax,label ;should compile to B8.... mov eax,imm32

Why? Why should it assemble to an address with relocation by default?
It makes just as much sense to assemble so as to load the contents at the
label address. That's what it does. If you want the label address, use the
OFFSET directive. Either way, a load from a label is not an imm32. There is
relocation involved. See below.

> MOV eax,[label] ;should compile to A1.... mov eax,[imm32]

And so it does, with relocation. A constant does not have relocation.
See 'R' flags below.

>
> AFAIK MASM needs either a 'ptr' or an 'offset' directive and may confuse

MASM can't read your mind. It needs to know the size of the operand so
as to select the correct op code and check for errors. Doesn't any
assembler?

> itself with the brackets and more if 'label' isn't declared as a dword.

Show me. Ratch

>

00000010 00000003 LAB1 DWORD 3

00000000 .CODE

00000000 START:

00000000 TESTME:
00000000 B8 00000010 R MOV EAX,OFFSET LAB1
00000005 A1 00000010 R MOV EAX,LAB1
0000000A A1 00000010 R MOV EAX,[LAB1]
0000000F B8 00000010 MOV EAX,4*DWORD

Greg

Jan 9, 2008, 3:20:23 AM

Hello everyone,
I just thought I should ask: what is the job market for assembly
programmers the world over?
I am not really learning assembly programming for the sake of getting
a better job, but more so because I like the concept of programming for
assemblies and kernels and the like. So, back to my question: what is
the job market for assembly programmers the world over? I ask especially
on the assumption that most of you are probably doing this for a
living.

cheers

greg

James Van Buskirk

Jan 9, 2008, 4:22:35 AM

"Ratch" <wat...@comcast.net> wrote in message

news:dPadnQFrvOEdAx7a...@comcast.com...

> So? They are equivalent instructions. Who cares if the op codes are
> different? Here's what I get when I try that with MASM. Ratch

C:\>link /dump /disasm ex2.obj

Dump of file ex2.obj

File Type: COFF OBJECT

testme2:
0000000000000000: A1 9A 78 56 34 12 mov eax,dword ptr [000000123456789Ah]
                  00 00 00
0000000000000009: A1 78 56 34 12 00 mov eax,dword ptr [0000000012345678h]
                  00 00 00
0000000000000012: 67 A1 78 56 34 12 mov eax,dword ptr [12345678h]
0000000000000018: 8B 05 78 56 34 12 mov eax,dword ptr [12345696h]
000000000000001E: 8B 04 25 78 56 34 mov eax,dword ptr [12345678h]
                  12
0000000000000025: 8B 04 65 78 56 34 mov eax,dword ptr [12345678h]
                  12
000000000000002C: 8B 04 A5 78 56 34 mov eax,dword ptr [12345678h]
                  12
0000000000000033: 8B 04 E5 78 56 34 mov eax,dword ptr [12345678h]
                  12
000000000000003A: 40 8B 05 78 56 34 mov eax,dword ptr [123456B9h]
                  12
0000000000000041: 40 8B 04 25 78 56 mov eax,dword ptr [12345678h]
                  34 12
0000000000000049: 40 8B 04 65 78 56 mov eax,dword ptr [12345678h]
                  34 12
0000000000000051: 40 8B 04 A5 78 56 mov eax,dword ptr [12345678h]
                  34 12
0000000000000059: 40 8B 04 E5 78 56 mov eax,dword ptr [12345678h]
                  34 12
0000000000000061: C3 ret

Summary

62 .code

Wolfgang Kern

Jan 9, 2008, 9:12:21 AM

Ratch asked:

>>>> testme:
>>>> 00000000: 01 D8 add eax,ebx
>>>> 00000002: 03 C3 add eax,ebx
>>>> 00000004: C3 ret

>>> So? They are equivalent instructions.
>>> Who cares if the op codes are different?

>> Assembly programmers ....

> Why? Ratch

It can help to determine which tool produced the code.
And even the variant with the set direction bit is easier to
handle for compilers and SMC programmers, sometimes the other
way may show better opportunities (a reason for 'db'-code).

Replace the ADD example with the MOV-doubles, and find LD/ST
instead of MOV.

__
wolfgang

Wolfgang Kern

Jan 9, 2008, 8:56:06 AM

James Van Buskirk posted:

>> So? They are equivalent instructions. Who cares if the op codes are
>> different? Here's what I get when I try that with MASM. Ratch

> C:\>link /dump /disasm ex2.obj
> Microsoft (R) COFF/PE Dumper Version 8.00.40310.39
> Copyright (C) Microsoft Corporation. All rights reserved.

Something seems to be wrong with this interpretation ...
If it shall be 64-bit long mode code then

8B 05 .. should mean 'mov eax, [RIP+..]', same for 40 8b 05 ..

and I miss a 'Zero-page' indicator on 67 A1 ... or at least
an 'A32' for the '67' prefix.

Ok this SIB forms and the 40-rex are valid, but bloated and redundant,
so they may be just used to classify a compiler by disassembling.

A good test anyway.

__
wolfgang

______________________

Wolfgang Kern

Jan 9, 2008, 9:28:06 AM

Wannabee wrote:
...
>>>> Telekinesis ?
>>>> Out of handicapped support:
>>>> Just use two (stereoscope) webcams which follow your eye focus,

>>> Is that what was left of the research I heard of with the diodes
>>> attached to the front of the head in order to move the mouse by
>>> detecting the muscles that would move when you got angry enough?

>> We still have something similar in use, but EMG/EEG-controlled
>> devices joined in the scenario recently and they work astonishing well.

> < http://dsplab.eng.fiu.edu/DSP/Publications/publication.asp?num=60 >

> Like this?

Yeah 1999 ...,
last year they managed to build a directly brain-controlled artificial arm.
Don't know if it's published on the net.

> I want one. Then I can program from my bed, and not even having to get up.
> Then I should mostly be rested when coding, and if not sleepy, code.

:)

These things aren't mass-production items and need individual adaptation,
so they are quite expensive.
Much cheaper will be Mr.Bean's broom-supported remote-control :)
__
wolfgang

Wolfgang Kern

Jan 9, 2008, 10:05:39 AM

Ratch wrote:

>>>>> My assertion is that MASM can generate every coding that the
>>>>> CPU is able to execute. Ratch

>>>> We all know that this is possible, but you may know much better than
>>>> a confused newbie which addressing modes need workarounds and what's
>>>> not all in the 'reserved' list.

>>> The reserved word list is documented. What addressing modes need "work
>>> arounds"? Ratch

>> MOV eax,label ;should compile to B8.... mov eax,imm32

> Why? Why should it assemble to an address with relocation by default.
> It makes just as much sense to assemble so as to load the contents at the
> label address. That's what it does. If you want the label address,
> use the OFFSET directive. Either way, a load from a label is not a
> imm32. There is relocation involved. See below.

>> MOV eax,[label] ;should compile to A1.... mov eax,[imm32]

> As so it does, with relocation. A constant does not have relocation.
> See 'R' flags below.

I don't see any need for relocation (except for external referenced labels),
the compiler calculates this immediate value from 'data' or 'code' offset.

>> AFAIK MASM needs either a 'ptr' or an 'offset' directive and may confuse

> MASM can't read your mind. It needs to know the size of the operand so
> as to select the correct op code and check for errors. Doesn't any
> assembler?

Most non-MASM tools 'know' the operand size and need size-casts only
for the few instructions which really need one
INC/DEC[mem]; MOVZX/MOVSX r,r/m; MUL/DIV[mem]; and perhaps a few more.

NASM may show up to three size casts in one instruction, but these
are optional code style modifiers and not mandatory.

>> itself with the brackets and more if 'label' isn't declared as a dword.

> Show me. Ratch

> 00000010 00000003 LAB1 DWORD 3
> 00000000 .CODE
> 00000000 START:
> 00000000 TESTME:
> 00000000 B8 00000010 R MOV EAX,OFFSET LAB1
> 00000005 A1 00000010 R MOV EAX,LAB1
> 0000000A A1 00000010 R MOV EAX,[LAB1]
> 0000000F B8 00000010 MOV EAX,4*DWORD

You just explained it.
Perhaps MASM lacks on direct reference calculation and need to
detour with runtime reallocation ?

Ok, all MASM users will know how to treat their tool ...
But I may continue to recommend FASM/NASM/RosAsm for newbies.
__
wolfgang
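
For anyone comparing the B8 and A1 lines in the listing above, the difference can be sketched in a few lines of Python. This is only an illustration of the two 32-bit encodings (one opcode byte followed by a little-endian dword); the function names are invented for this note and belong to no assembler, and the relocation that MASM flags with 'R' is left out.

```python
import struct

def mov_eax_imm32(value):
    """B8 id: load the constant itself into EAX (what OFFSET LAB1 yields)."""
    return bytes([0xB8]) + struct.pack("<I", value)

def mov_eax_moffs32(address):
    """A1 moffs32: load the dword stored at the given address into EAX."""
    return bytes([0xA1]) + struct.pack("<I", address)

LAB1 = 0x00000010  # offset of LAB1, taken from the listing

print(mov_eax_imm32(LAB1).hex(" "))    # b8 10 00 00 00
print(mov_eax_moffs32(LAB1).hex(" "))  # a1 10 00 00 00
```

Both forms carry the same four address bytes; only the opcode decides whether EAX receives the address or the contents found there.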

Ratch

Jan 9, 2008, 11:32:35 AM

"Wolfgang Kern" <now...@never.at> wrote in message

news:fm2om6$osg$2...@newsreader1.xoc.utanet.at...

>
> Ratch asked:
>
>>>>> testme:
>>>>> 00000000: 01 D8 add eax,ebx
>>>>> 00000002: 03 C3 add eax,ebx
>>>>> 00000004: C3 ret
>
>>>> So? They are equivalent instructions.
>>>> Who cares if the op codes are different?
>
>>> Assembly programmers ....
>
>> Why? Ratch
>
> It can help to determine which tool produced the code.

Who cares?

> And even the variant with the set direction bit is easier to
> handle for compilers

I have no idea what you mean.

> and SMC programmers,

What does "SMC" mean? Define your acronyms before you use them, unless
they are so common that everyone knows what they are.

> sometimes the other
> way may show better opportunities (a reason for 'db'-code).

The assembler has to choose one coding or the other. As long as each
instruction is functionally equivalent and the same length, what more
can an assembler do if several possibilities exist?

>
> Replace the ADD example with the MOV-doubles, and find LD/ST
> instead of MOV.

What is "MOV-doubles" and "LD/ST"? Is that NASM jargon? If so, I
don't know it. Ratch

Ratch

Jan 9, 2008, 11:55:43 AM

"James Van Buskirk" <not_...@comcast.net> wrote in message

news:MqSdnUbhj47IDxna...@comcast.com...

What does the above illustrate? The first constant is too large to fit
into a 32-bit register and MASM marks that line in error. Because the []
cannot possibly be addresses, MASM assumes they are (), and codes as a
constant. It should probably mark that code in error. Anyway, below is a
MASM listing. Ratch

00000000 TESTME:
MOV EAX,DWORD PTR [000000123456789AH]
TEST.asm(32) : error A2084: constant value too large
00000000 B8 12345678 MOV EAX,DWORD PTR [0000000012345678H]
00000005 B8 12345678 MOV EAX,DWORD PTR [12345678H]
0000000A B8 12345696 MOV EAX,DWORD PTR [12345696H]
0000000F B8 12345678 MOV EAX,DWORD PTR [12345678H]
00000014 B8 12345678 MOV EAX,DWORD PTR [12345678H]
00000019 B8 123456B9 MOV EAX,DWORD PTR [123456B9H]
0000001E B8 12345678 MOV EAX,DWORD PTR [12345678H]
00000023 B8 12345678 MOV EAX,DWORD PTR [12345678H]
00000028 B8 12345678 MOV EAX,DWORD PTR [12345678H]
0000002D B8 12345678 MOV EAX,DWORD PTR [12345678H]

Ratch

Jan 9, 2008, 12:11:58 PM

"Wolfgang Kern" <now...@never.at> wrote in message

news:fm2om7$osg$4...@newsreader1.xoc.utanet.at...

>
> Ratch wrote:
>
>>>>>> My assertion is that MASM can generate every coding that the
>>>>>> CPU is able to execute. Ratch
>
>>>>> We all know that this is possible, but you may know much better than
>>>>> a confused newbie which addressing modes need workarounds and what's
>>>>> not all in the 'reserved' list.
>
>>>> The reserved word list is documented. What addressing modes need "work
>>>> arounds"? Ratch
>
>>> MOV eax,label ;should compile to B8.... mov eax,imm32
>
>> Why? Why should it assemble to an address with relocation by default.
>> It makes just as much sense to assemble so as to load the contents at the
>> label address. That's what it does. If you want the label address,
>> use the OFFSET directive. Either way, a load from a label is not a
>> imm32. There is relocation involved. See below.
>
>>> MOV eax,[label] ;should compile to A1.... mov eax,[imm32]
>
>> As so it does, with relocation. A constant does not have relocation.
>> See 'R' flags below.
>
> I don't see any need for relocation (except for external referenced
> labels),
> the compiler calculates this immediate value from 'data' or 'code' offset.

I don't know what compiler you are using, but MASM does not. It does
not assume that an instruction like MOV EAX,[24] is from the beginning of a
code segment. All addressing in MASM has relocation. If you want just a
constant between addresses, you should subtract the labels, which will
destroy the relocation. MOV EAX,LABEL1-BEGIN_LABEL

>
>>> AFAIK MASM needs either a 'ptr' or an 'offset' directive and may confuse
>
>> MASM can't read your mind. It needs to know the size of the operand so
>> as to select the correct op code and check for errors. Doesn't any
>> assembler?
>
> Most non-MASM tools 'know' the operand size and need size-casts only
> for the few instructions which really need one
> INC/DEC[mem]; MOVZX/MOVSX r,r/m; MUL/DIV[mem]; and perhaps a few more.

To do that, they may have to make unwarranted assumptions. MASM is
more critical.

>
> NASM may show up to three size casts in one instruction, but these
> are optional code style modifiers and not mandatory.

MASM also can modify the address size if there is any ambiguity. If
there is any question, MASM will not go forward with the assembly.

>
>>> itself with the brackets and more if 'label' isn't declared as a dword.
>
>> Show me. Ratch
>
>> 00000010 00000003 LAB1 DWORD 3
>> 00000000 .CODE
>> 00000000 START:
>> 00000000 TESTME:
>> 00000000 B8 00000010 R MOV EAX,OFFSET LAB1
>> 00000005 A1 00000010 R MOV EAX,LAB1
>> 0000000A A1 00000010 R MOV EAX,[LAB1]
>> 0000000F B8 00000010 MOV EAX,4*DWORD
>
> You just explained it.
> Perhaps MASM lacks on direct reference calculation and need to
> detour with runtime reallocation ?

What is "direct reference calculation"? MASM is out of the picture at
runtime.

>
> Ok, all MASM users will know how to treat their tool ...
> But I may continue to recommend FASM/NASM/RosAsm for newbies.

Whatever. Ratch

James Van Buskirk

Jan 9, 2008, 12:36:53 PM

"Wolfgang Kern" <now...@never.at> wrote in message

news:fm2om5$osg$1...@newsreader1.xoc.utanet.at...

> James Van Buskirk posted:

>> C:\>link /dump /disasm ex2.obj
>> Microsoft (R) COFF/PE Dumper Version 8.00.40310.39
>> Copyright (C) Microsoft Corporation. All rights reserved.

> Something seems to be wrong with this interpretation ...
> If it shall be 64-bit long mode code then

> 8B 05 .. should mean 'mov eax, [RIP+..]', same for 40 8b 05 ..

Exactly so. This is one point where assemblers tend to get too
high-level. It can be awkward on some assemblers to choose between
an absolute address and an RIP-relative address. Also it can be
difficult to figure out how to specify that crazy 64-bit moffset32
when you need it.

> and I miss a 'Zero-page' indicator on 67 A1 ... or at least
> an 'A32' for the '67' prefix.

If DUMPBIN didn't point out the distinction between RIP-relative and
absolute addresses, then the above is no surprise.

> Ok this SIB forms and the 40-rex are valid, but bloated and redundant,
> so they may be just used to classify a compiler by disassembling.

Not completely redundant. Sometimes, actually quite often, you want
to align code within a loop. You can't insert NOPs there because they
gobble up real resources. Therefore it is necessary to be able to use
different forms of an instruction that have different lengths. Not
all assemblers let you specify the instruction closely enough to do
this.

> A good test anyway.

Too bad DUMPBIN couldn't properly disassemble F0 0F C7 C8; would have
been more fun trying to get ml64.exe to assemble to that.
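
The odd-looking 12345696h and 123456B9h in the DUMPBIN output are consistent with the RIP-relative reading: the disp32 is added to the address of the next instruction. A small Python sketch of that arithmetic, assuming the disp32 sits in the last four bytes of the instruction (true for the two forms in the dump); rip_relative_target is a name made up for this note.

```python
import struct

def rip_relative_target(insn_address, insn_bytes):
    """Return the address referenced by a RIP-relative instruction.

    Assumes the disp32 occupies the last four bytes of the instruction,
    which holds for the 8B 05 and 40 8B 05 forms in the dump.
    """
    disp32 = struct.unpack("<i", insn_bytes[-4:])[0]  # signed displacement
    next_rip = insn_address + len(insn_bytes)         # RIP already points past the instruction
    return (next_rip + disp32) & 0xFFFFFFFFFFFFFFFF

plain = bytes.fromhex("8B0578563412")       # at offset 18h in the dump
with_rex = bytes.fromhex("408B0578563412")  # at offset 3Ah in the dump

print(hex(rip_relative_target(0x18, plain)))     # 0x12345696
print(hex(rip_relative_target(0x3A, with_rex)))  # 0x123456b9
```

So DUMPBIN appears to print the resolved target without marking it as RIP-relative, which is why the two lines look like absolute loads from a different address.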

Spam Killer

Jan 9, 2008, 3:58:10 PM

On Mon, 07 Jan 2008 23:44:47 GMT, Frank Kotler wrote:
>...
>I see they've switched to Jeremy Gordon's "GoLink", etc. I vaguely
>recalled that Jeremy had a nice introductory tutorial to Windows
>programming on his site:
>...
Yeah, and the unicode examples reminded me that NASM could use a
"du" directive. The DOS-Box must be set to the "Lucida Console" font.

; nasm -f win32 unicode.asm
; GoLink /console unicode.obj kernel32.dll

%include "nasmx.inc"
%include "windows.inc"
%include "kernel32.inc"

%imacro du 1-*
    %rep %0
        %ifstr %1
            du1 %1
        %else
            dw %1
        %endif
        %rotate 1
    %endrep
%endmacro

%macro du1 1
    %strlen _charcnt %1
    %assign _cnt 1
    %rep _charcnt
        %substr _char %1 _cnt
        dw _char
        %assign _cnt _cnt+1
    %endrep
%endmacro

section .bss
written     resd 1

section .data
string      du "Hello world in Russian: "
            du 417h, 434h, 440h, 430h, 432h, 441h, 442h, 432h
            du 443h, 439h, ' ', 41Ch, 438h, 440h, ", and Greek: "
            du 39Ah, 3B1h, 3BBh, 3B7h, 3BCh, 3ADh, 3C1h, 3B1h
            du ' ', 3BAh, 3CCh, 3C3h, 3BCh, 3B5h, ' from NASM.'
%assign STRLEN $-string
%assign STRLEN STRLEN/2

global start

section .text
start:      invoke GetStdHandle, STD_OUTPUT_HANDLE
            invoke WriteConsoleW, eax, string, STRLEN, \
                   written, NULL

ExitProg:   invoke GetLastError
            invoke ExitProcess, eax
--
wfz
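
To see what the du macro above is meant to emit, here is a rough Python imitation. It is a sketch only: the Python function du is a stand-in written for this note, it assumes plain ASCII inside the quoted strings (as in the post), and it covers just the first part of the data.

```python
import struct

def du(*items):
    """Mimic the du macro: each character of a string becomes one
    16-bit little-endian word, each number is emitted as one word."""
    out = b""
    for item in items:
        if isinstance(item, str):
            for ch in item:
                out += struct.pack("<H", ord(ch))
        else:
            out += struct.pack("<H", item)
    return out

data = du("Hello world in Russian: ",
          0x417, 0x434, 0x440, 0x430, 0x432, 0x441, 0x442, 0x432,
          0x443, 0x439, " ", 0x41C, 0x438, 0x440)

strlen = len(data) // 2      # same idea as STRLEN = ($-string)/2
print(strlen)                # count of 16-bit units, which WriteConsoleW expects
print(data[:8].hex(" "))     # 48 00 65 00 6c 00 6c 00
```

The division by two matters because WriteConsoleW takes a character count, not a byte count.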

Rod Pemberton

Jan 9, 2008, 10:10:40 PM

While you guys are discussing differences in compiling assembly sequences,
could you guys answer some syntax questions for me?

NASM will assemble these:

lgdt [dword ebp+0x1c]
lidt [dword ebp+0x1c]
lgdt [ebp+0x8]
lidt [ebp+0x8]

for BITS 32 as:

00000000 0F01951C000000 lgdt [ebp+0x1c]
00000007 0F019D1C000000 lidt [ebp+0x1c]
0000000E 0F015508 lgdt [ebp+0x8]
00000012 0F015D08 lidt [ebp+0x8]

WDIS (OpenWatcom) disassembles as:

0000 0F 01 95 1C 00 00 00 lgdt fword ptr 0x1c[ebp]
0007 0F 01 9D 1C 00 00 00 lidt fword ptr 0x1c[ebp]
000E 0F 01 55 08 lgdt fword ptr 0x8[ebp]
0012 0F 01 5D 08 lidt fword ptr 0x8[ebp]

But, re-assembles each of those sequences as (all four byte sequences, no
seven byte sequences):

000F 0F 01 55 1C lgdt fword ptr 0x1c[ebp]
0013 0F 01 5D 1C lidt fword ptr 0x1c[ebp]
0017 0F 01 55 08 lgdt fword ptr 0x8[ebp]
001B 0F 01 5D 08 lidt fword ptr 0x8[ebp]

If the offset is larger than 0xFF, it will assemble the 7-byte sequence.
But, how do you force the 32-bit offset form when its value is below 0xFF?
What keyword am I missing?

I have the same problem with GAS. Objdump disassembles as:

0: 0f 01 95 1c 00 00 00 lgdtl 0x1c(%ebp)
7: 0f 01 9d 1c 00 00 00 lidtl 0x1c(%ebp)
e: 0f 01 55 08 lgdtl 0x8(%ebp)
12: 0f 01 5d 08 lidtl 0x8(%ebp)

GAS re-assembles as:

1c: 0f 01 55 1c lgdtl 0x1c(%ebp)
20: 0f 01 5d 1c lidtl 0x1c(%ebp)
24: 0f 01 55 08 lgdtl 0x8(%ebp)
28: 0f 01 5d 08 lidtl 0x8(%ebp)

Thanks.

Rod Pemberton
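
The two lengths in the listings above come from the mod field of the ModRM byte. A minimal Python sketch of the encoding, assuming 32-bit code and EBP as base; encode_lgdt_ebp and its force_disp32 parameter are invented for illustration and do not correspond to a keyword in WASM or GAS. Note that the 8-bit displacement is sign-extended, so the short form only covers -128 to 127.

```python
import struct

def encode_lgdt_ebp(disp, force_disp32=False):
    """Encode LGDT [ebp+disp] (0F 01 /2) for 32-bit code.

    mod=01 selects a sign-extended 8-bit displacement,
    mod=10 selects a full 32-bit displacement.
    """
    reg = 2  # /2 selects LGDT in the 0F 01 group (3 would be LIDT)
    rm = 5   # EBP as base register
    if -128 <= disp <= 127 and not force_disp32:
        modrm = (0b01 << 6) | (reg << 3) | rm
        return bytes([0x0F, 0x01, modrm]) + struct.pack("<b", disp)
    modrm = (0b10 << 6) | (reg << 3) | rm
    return bytes([0x0F, 0x01, modrm]) + struct.pack("<i", disp)

print(encode_lgdt_ebp(0x1C).hex(" "))                     # 0f 01 55 1c
print(encode_lgdt_ebp(0x1C, force_disp32=True).hex(" "))  # 0f 01 95 1c 00 00 00
```

An assembler that always picks the shortest form behaves like the first call; NASM's "dword" inside the brackets corresponds to the second.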

robert...@yahoo.com

Jan 9, 2008, 11:51:30 PM

Does the assembler in question allow you to define the offset as an
external constant? If so, you'll probably force it to generate the
long form since it won't know that the value is small enough to fit
into eight bits. In MASM you'd use "extern abcdef:abs".

That introduces the obvious nuisance of having to define the offset in
another module but...

Wolfgang Kern

Jan 10, 2008, 7:38:21 AM

James Van Buskirk replied:

>>> C:\>link /dump /disasm ex2.obj
>>> Microsoft (R) COFF/PE Dumper Version 8.00.40310.39
>>> Copyright (C) Microsoft Corporation. All rights reserved.

>> Something seems to be wrong with this interpretation ...
>> If it shall be 64-bit long mode code then

>> 8B 05 .. should mean 'mov eax, [RIP+..]', same for 40 8b 05 ..

> Exactly so. This is one point where assemblers tend to get too
> high-level. It can be awkward on some assemblers to choose between
> an absolute address and an RIP-relative address. Also it can be
> difficult to figure out how to specify that crazy 64-bit moffset32
> when you need it.

>> and I miss a 'Zero-page' indicator on 67 A1 ... or at least
>> an 'A32' for the '67' prefix.

> If DUMPBIN didn't point out the distinction between RIP-relative and
> absolute addresses, then the above is no surprise.

OK.

>> Ok this SIB forms and the 40-rex are valid, but bloated and redundant,
>> so they may be just used to classify a compiler by disassembling.

> Not completely redundant. Sometimes, actually quite often, you want
> to align code within a loop. You can't insert NOPs there because they
> gobble up real resources. Therefore it is necessary to be able to use
> different forms of an instruction that have different lengths. Not
> all assemblers let you specify the instruction closely enough to do
> this.

Alignment by redundancy can be useful, even NOPs and 'EB 00' (once used
after I/O for delay) shouldn't take any resources on modern CPUs,
also the max. instruction length can be easily exceeded by prefixes.

>> A good test anyway.

> Too bad DUMPBIN couldn't properly disassemble F0 0F C7 C8; would have
> been more fun trying to get ml64.exe to assemble to that.

F0 0F C7 C8 produces a "memory operand only" on my disassembler yet,
need to check for its meaning in the CMPXCHG8 group.
It wont raise an illegal exception on certain CPUs ?
All my AMDs cry "06" on it.

__
wolfgang
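
The "memory operand only" complaint can be read straight off the ModRM byte. A short Python sketch, assuming the standard ModRM layout (mod in bits 7-6, reg in bits 5-3, r/m in bits 2-0); split_modrm is a helper name made up for this note.

```python
def split_modrm(byte):
    """Split a ModRM byte into its (mod, reg, rm) fields."""
    return byte >> 6, (byte >> 3) & 7, byte & 7

code = bytes.fromhex("F00FC7C8")
prefix, opcode, modrm = code[0], code[1:3], code[3]
mod, reg, rm = split_modrm(modrm)

print(f"prefix={prefix:02X} opcode={opcode.hex().upper()} mod={mod} reg={reg} rm={rm}")

# mod == 3 means the operand is a register, but 0F C7 /1 (CMPXCHG8B)
# is only defined for a memory operand, so this byte sequence is invalid.
is_register_form = (mod == 3)
print(is_register_form)
```

With F0 as a LOCK prefix on top of an invalid register form, an invalid-opcode exception is the documented outcome.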

Wolfgang Kern

Jan 10, 2008, 7:11:28 AM

Ratch asked:

>>>>>> testme:
>>>>>> 00000000: 01 D8 add eax,ebx
>>>>>> 00000002: 03 C3 add eax,ebx
>>>>>> 00000004: C3 ret
>>>>> So? They are equivalent instructions.
>>>>> Who cares if the op codes are different?
>>>> Assembly programmers ....
>>> Why? Ratch

>> It can help to determine which tool produced the code.
> Who cares?

Me and my disassembler ...,
and all who are interested in details may care as well.

>> And even the variant with the set direction bit is easier to
>> handle for compilers

> I have no idea what you mean.

I see. bit1 is the dest<->source direction bit in many opcodes.

>> and SMC programmers,
> What does "SMC" mean? Define your acronyms before you use them, unless
> they are so common that everyone knows what they are.

SMC for Self_Modifying_Code isn't a common known abbreviation ?

>> sometimes the other
>> way may show better opportunities (a reason for 'db'-code).

> The assembler has to choose one coding or the other. As long as each
> instruction is functionally equivalent and the same length, what more
> can an assembler do if several possibilities exist?

The assembler (the tool) may do it either way, but an ASM programmer
can decide to force a desired opcode if several variants exist.

>> Replace the ADD example with the MOV-doubles, and find LD/ST
>> instead of MOV.

> What is "MOV-doubles" and "LD/ST"?
> Is that NASM jargon?

Not at all. LD and ST are known from Z-80,620x,650x,68xx,180x...

> If so, I don't know it. Ratch

MOV doubles are: 8B c1 MOV eax,ecx ;LD (read) eax from ecx
89 c8 MOV eax,ecx ;ST (write) ecx to eax
the sense is here: 8B 06 MOV eax,[esi] ;LD eax from [esi]
89 06 MOV [esi],eax ;ST eax to [esi]

There may be different pipes used for RD and WR also on your CPU,
but as you would say: "Who cares (about performance details)" ?
__
wolfgang
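
The direction bit can be checked mechanically on the four byte pairs quoted in this thread. The Python below is a sketch, not a disassembler: it handles only two-byte register-to-register forms of the ADD and MOV families in 32-bit code, and decode_reg_reg plus the two tables are names invented for this note.

```python
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]
MNEMONIC = {0x00: "add", 0x88: "mov"}  # opcode with the d and w bits cleared

def decode_reg_reg(code):
    """Decode a two-byte 32-bit reg,reg instruction of the ADD/MOV families."""
    opcode, modrm = code
    d_bit = (opcode >> 1) & 1  # bit 1 of the opcode: direction
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    assert mod == 3, "sketch handles register operands only"
    name = MNEMONIC[opcode & ~0x03]
    # d=0: r/m is the destination and reg the source; d=1: the other way round
    dst, src = (REG32[reg], REG32[rm]) if d_bit else (REG32[rm], REG32[reg])
    return f"{name} {dst},{src}"

for hexbytes in ("01D8", "03C3", "8BC1", "89C8"):
    print(hexbytes, decode_reg_reg(bytes.fromhex(hexbytes)))
```

Each pair decodes to the same text (add eax,ebx twice, then mov eax,ecx twice), which is why a listing alone cannot tell you which encoding the assembler picked.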

Wolfgang Kern

Jan 10, 2008, 8:02:12 AM

Ratch wrote:
...

>> I don't see any need for relocation (except for external referenced
>> labels), the compiler calculates this immediate value from 'data' or
>> 'code' offset.

> I don't know what compiler you are using, but MASM does not. It does
> not assume that an instruction like MOV EAX,[24] is from the beginning of
> a code segment. All addressing in MASM has relocation. If you want just a
> constant between addresses, you should subtract the labels, which will
> destroy the relocation. MOV EAX,LABEL1-BEGIN_LABEL

Perhaps this is why MASM created executables are 'a bit' larger ?

Offsets in DOS.com and DOS.exe are relative to Segments. Exe-files may
need reallocation, but only for segment altering instructions and not
for every single data/code label.

Offsets in windoze(tm) are relative to a cheated virtual address
(which default seems to be 00400000h) and a 'paging'-OS can grant
memory with an instance chosen start address.
So also here there is no need for reallocation in the main code.
Only external referenced labels will need a 'link'.

...
__
wolfgang

//\\o//\\annabee

Jan 10, 2008, 6:11:05 PM

What does that mean. "cry 06" on it?

>
> __
> wolfgang
>
>
>

Wolfgang Kern

Jan 10, 2008, 9:32:44 AM

Wannabee asked:

...

> What does that mean. "cry 06" on it?

Exception 06: 'illegal opcode'

__
wolfgang

James Van Buskirk

Jan 10, 2008, 11:38:46 AM

"Wolfgang Kern" <now...@never.at> wrote in message

news:fm55tn$h08$2...@newsreader1.xoc.utanet.at...

> Alignment by redundancy can be useful, even NOPs and 'EB 00' (once used
> after I/O for delay) shouldn't take any resources on modern CPUs,
> also the max. instruction length can be easily exceeded by prefixes.

See http://www.agner.org/optimize/instruction_tables.pdf . NOP at least
eats up decoding resources and takes up a port or pipeline slot that
might otherwise have gone to a useful instruction. EB 00 uses a BTB
entry as well as a port slot.

> F0 0F C7 C8 produces a "memory operand only" on my disassembler yet,
> need to check for its meaning in the CMPXCHG8 group.
> It wont raise an illegal exception on certain CPUs ?
> All my AMDs cry "06" on it.

http://www.x86.org/errata/dec97/f00fbug.htm

And while we're on the topic of Pentium Classic trivia,

C:\Asm\FASM\forum>link /dump /disasm ex3.obj

Dump of file ex3.obj

File Type: COFF OBJECT

testme3:
0000000000000000: D1 D8 rcr eax,1
0000000000000002: C1 D8 01 rcr eax,1
0000000000000005: A9 05 00 00 00 test eax,5
000000000000000A: F7 C0 05 00 00 00 test eax,5
0000000000000010: C3 ret

Summary

11 .code

Remember how A9 05 00 00 00 was pairable but not F7 C0 05 00 00 00?
D1 D8 is generally faster than C1 D8 01 on processors through
current editions.

Ratch

Jan 10, 2008, 12:09:25 PM

"Wolfgang Kern" <now...@never.at> wrote in message

news:fm55tm$h08$1...@newsreader1.xoc.utanet.at...

>
> Ratch asked:
>
>>>>>>> testme:
>>>>>>> 00000000: 01 D8 add eax,ebx
>>>>>>> 00000002: 03 C3 add eax,ebx
>>>>>>> 00000004: C3 ret
>>>>>> So? They are equivalent instructions.
>>>>>> Who cares if the op codes are different?
>>>>> Assembly programmers ....
>>>> Why? Ratch
>
>>> It can help to determine which tool produced the code.
>> Who cares?
>
> Me and my disassembler ...,
> and all who are interested in details may care as well.

Well, it is easy enough to see what the assembler generated from the
listing or a dump. MASM doesn't seem to generate different instructions
when encountering identical consecutive inputs like the example above. Not
when I tried it anyway. The question remains, what instruction should it
select? Or maybe it should stop at each fork and present a multiple list of
possibilities so you can select the one you want.

>
>>> And even the variant with the set direction bit is easier to
>>> handle for compilers
>
>> I have no idea what you mean.
>
> I see. bit1 is the dest<->source direction bit in many opcodes.

I see no reference to that in the Intel 32-bit documentation. The only
thing about direction is the direction flag (DF).

>
>>> and SMC programmers,
>> What does "SMC" mean? Define your acronyms before you use them, unless
>> they are so common that everyone knows what they are.
>
> SMC for Self_Modifying_Code isn't a common known abbreviation ?

Not for me. I don't do "SMC". For one thing, it is not reentrant.

>
>>> sometimes the other
>>> way may show better opportunities (a reason for 'db'-code).
>
>> The assembler has to choose one coding or the other. As long as each
>> instruction is functionally equivalent and the same length, what more
>> can an assembler do if several possibilities exist?
>
> The assembler(the tool) may do it either way, but an ASM-programmer
> can decide to force a desired opcode if several variants exists.

What a drag! Having to scan the listings and decide to modify a
perfectly good instruction.

>
>>> Replace the ADD example with the MOV-doubles, and find LD/ST
>>> instead of MOV.
>
>> What is "MOV-doubles" and "LD/ST"?
>> Is that NASM jargon?
>
> Not at all. LD and ST are known from Z-80,620x,650x,68xx,180x...

I am MASMatized, and suffer from MASMatosis. I know very little about
any other processor.

>
>> If so, I don't know it. Ratch
>
> MOV doubles are: 8B c1 MOV eax,ecx ;LD (read) eax from ecx
> 89 c8 MOV eax,ecx ;ST (write) ecx to eax
> the sense is here: 8B 06 MOV eax,[esi] ;LD eax from [esi]
> 89 06 MOV [esi],eax ;ST eax to [esi]

LD and ST are not used in MASM, which is what I am about. MOV is what
MASM understands. Actually MOV (move) is a misnomer. It really should be
CPY (copy)

>
> There may be different pipes used for RD and WR also on your CPU,
> but as you would say: "Who cares (about performance details)" ?

It is the coder's choice. Ratch

Ratch

Jan 10, 2008, 12:23:05 PM

"Wolfgang Kern" <now...@never.at> wrote in message

news:fm55to$h08$3...@newsreader1.xoc.utanet.at...

>
> Ratch wrote:
> ...
>>> I don't see any need for relocation (except for external referenced
>>> labels), the compiler calculates this immediate value from 'data' or
>>> 'code' offset.
>
>> I don't know what compiler you are using, but MASM does not. It does
>> not assume that an instruction like MOV EAX,[24] is from the beginning of
>> a code segment. All addressing in MASM has relocation. If you want just a
>> constant between addresses, you should subtract the labels, which will
>> destroy the relocation. MOV EAX,LABEL1-BEGIN_LABEL
>
> Perhaps this is why MASM created executables are 'a bit' larger ?

That should make no difference as far as the object file is concerned.
If the executable is larger, talk to your linker.

>
> Offsets in DOS.com and DOS.exe are relative to Segments. Exe-files may
> need reallocation, but only for segment altering instructions and not
> for every single data/code label.

Unless you can guarantee the program will load in the same physical
address everytime, relocation is a must. I did a brain wipe on DOS. I am
strictly windows now.

>
> Offsets in windoze(tm) are relative to a cheated virtual address
> (which default seems to be 00400000h) and a 'paging'-OS can grant
> memory with an instance chosen start address.
> So also here there is no need for reallocation in the main code.
> Only external referenced labels will need a 'link'.

If that 00400000H "virtual address" is mapped to a different physical
address, then you will need relocation. Ratch

//\\o//\\annabee

Jan 10, 2008, 10:34:45 PM

On Thu, 10 Jan 2008 18:23:05 +0100, Ratch <wat...@comcast.net> wrote:

> If that 00400000H "virtual address" is mapped to a different physical
> address, then you will need relocation. Ratch

Even worse for memory addresses allocated at runtime. It DOES change even
in the same version of the OS. (Turning skin support off in XP.) And it
always changes for upgrades (rollups) and stuff like that. But, since the
address space is virtual, there should not be any reason to??
(I don't know, but I sort of thought that was the purpose of virtual memory
addresses, that you could just map in the address space that the user app
is expecting. But with M$, even respecting the user address space is too
much to ask...)?

--
0 = a newb
1 = c programmer
110101 = c programmer with asm skills
111111111 = full time asm programer

- (All flags raised and paranoid as hell)

Wolfgang Kern

Jan 10, 2008, 2:36:03 PM

Ratch wrote:

>>>>>>> Who cares if the op codes are different?
>>>>>> Assembly programmers ....
>>>>> Why? Ratch
>>>> It can help to determine which tool produced the code.
>>> Who cares?
>> Me and my disassembler ...,
>> and all who are interested in details may care as well.

> Well, it is easy enough to see what the assembler generated from the
> listing or a dump. MASM doesn't seem to generate different instructions
> when encountering identical consecutive inputs like the example above.
> Not when I tried it anyway. The question remains, what instruction should
> it select? Or maybe it should stop at each fork and present a multiple
> list of possibilities so you can select the one you want.

Some tools got commandline switches to override a given default behaviour,
others offer directives and the rest can use DB-coding.

[direction bit]

>>> I have no idea what you mean.
>> I see. bit1 is the dest<->source direction bit in many opcodes.
> I see no reference to that in the Intel 32-bit documentation. The only
> thing about direction is the direction flag (DF).

Check 00,A1,89 vs. 02,A3,8b codes, or look at Intel's 25366714.pdf

Volume 2B, Appendix B, B.1.7 "direction bit"

not to be confused with the "direction flag".

>> SMC for Self_Modifying_Code isn't a common known abbreviation ?
> Not for me. I don't do "SMC". For one thing, it is not reentrant.

Why can't SMC be reentrant ? Its main purpose is "short but reusable".
It is rarely used on PCs, I use it on Vmode changes to keep my routines.

[opcode variants]

>> The assembler(the tool) may do it either way, but an ASM-programmer
>> can decide to force a desired opcode if several variants exists.

> What a drag! Having to scan the listings and decide to modify a
> perfectly good instruction.

You aren't forced to anyway, but a programmer may desire to choose.

>>> What is "MOV-doubles" and "LD/ST"? Is that NASM jargon?
>> Not at all. LD and ST are known from Z-80,620x,650x,68xx,180x...
> I am MASMatized, and suffer from MASMatosis. I know very little about
> any other processor.

Ok.

>> MOV doubles are: 8B c1 MOV eax,ecx ;LD (read) eax from ecx
>> 89 c8 MOV eax,ecx ;ST (write) ecx to eax
>> the sense is here: 8B 06 MOV eax,[esi] ;LD eax from [esi]
>> 89 06 MOV [esi],eax ;ST eax to [esi]

> LD and ST are not used in MASM, which is what I am about. MOV is what
> MASM understands. Actually MOV (move) is a misnomer. It really should be
> CPY (copy)

>> There may be different pipes used for RD and WR also on your CPU,
>> but as you would say: "Who cares (about performance details)" ?

> It is the coder's choice. Ratch

Right.
__
wolfgang

Wolfgang Kern

Jan 10, 2008, 3:22:07 PM

Ratch wrote:
...

>> Offsets in windoze(tm) are relative to a cheated virtual address
>> (which default seems to be 00400000h) and a 'paging'-OS can grant
>> memory with an instance chosen start address.
>> So also here there is no need for reallocation in the main code.
>> Only external referenced labels will need a 'link'.

> If that 00400000H "virtual address" is mapped to a different physical
> address, then you will need relocation. Ratch

??? please show me how to access 'physical memory' in windoze.

If this MASM relocation trick worked, I'd switch to MASM today
__
wolfgang
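
On the relocation argument: as far as the PE format goes, base relocations deal with the virtual address an image is loaded at, not with physical memory, which paging hides from the program. A rough Python sketch under that assumption; PREFERRED_BASE, rebase and the fixup list are names made up here, and a real loader reads its fixups from the relocation table in the image rather than from a hand-written list.

```python
import struct

PREFERRED_BASE = 0x00400000  # default image base mentioned in the thread

def rebase(image, fixup_offsets, actual_base):
    """Apply 32-bit absolute fixups when an image loads at another virtual base."""
    delta = actual_base - PREFERRED_BASE
    patched = bytearray(image)
    for off in fixup_offsets:
        (addr,) = struct.unpack_from("<I", patched, off)
        struct.pack_into("<I", patched, off, (addr + delta) & 0xFFFFFFFF)
    return bytes(patched)

# A1 10 00 40 00 = mov eax,[00400010h]; the address field starts at offset 1.
code = bytes.fromhex("A110004000")

print(rebase(code, [1], 0x00400000).hex(" "))  # a1 10 00 40 00 (unchanged)
print(rebase(code, [1], 0x10000000).hex(" "))  # a1 10 00 00 10
```

If the image lands at its preferred base the delta is zero and nothing is patched, which fits the point that a main executable rarely needs its fixups while a module sharing the address space with others may.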

Wolfgang Kern

Jan 10, 2008, 3:42:59 PM

"James Van Buskirk" wrote:
>> Alignment by redundancy can be useful, even NOPs and 'EB 00' (once used
>> after I/O for delay) shouldn't take any resources on modern CPUs,
>> also the max. instruction length can be easily exceeded by prefixes.

> See http://www.agner.org/optimize/instruction_tables.pdf . NOP at least
> eats up decoding resources and takes up a port or pipeline slot that
> might otherwise have gone to a useful instruction. EB 00 uses a BTB
> entry as well as a port slot.

Yes, I have this 'pentium' paper already.
AMD optimisation guides show this differently.

>> F0 0F C7 C8 produces a "memory operand only" on my disassembler yet,
>> need to check for its meaning in the CMPXCHG8 group.
>> It wont raise an illegal exception on certain CPUs ?
>> All my AMDs cry "06" on it.

> http://www.x86.org/errata/dec97/f00fbug.htm

So I'm glad to use only AMD's.

> And while we're on the topic of Pentium Classic trivia,

> C:\Asm\FASM\forum>link /dump /disasm ex3.obj
> Microsoft (R) COFF/PE Dumper Version 8.00.40310.39
> Copyright (C) Microsoft Corporation. All rights reserved.

> Dump of file ex3.obj
>
> File Type: COFF OBJECT
>
> testme3:
> 0000000000000000: D1 D8 rcr eax,1
> 0000000000000002: C1 D8 01 rcr eax,1
> 0000000000000005: A9 05 00 00 00 test eax,5
> 000000000000000A: F7 C0 05 00 00 00 test eax,5
> 0000000000000010: C3 ret
>
> Summary
>
> 11 .code
>
> Remember how A9 05 00 00 00 was pairable but not F7 C0 05 00 00 00?
> D1 D8 is generally faster than C1 D8 01 on processors through
> current editions.

The F7.. test on the ACCU may use both ALU-pipes ?
D1 D8 should be faster on all CPUs,
IIRC it doesn't load the shift counter.
__
wolfgang
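
The size difference between the paired encodings in the dump above can be checked mechanically. This is a minimal sketch in Python (chosen only for illustration; the thread itself uses assembler syntax). The byte strings are copied from the quoted listing, and the helper name `layout` is made up for this example. It says nothing about timing, only about length and offsets.

```python
# Encodings copied from the "link /dump /disasm" listing quoted above.
listing = [
    ("rcr eax,1",  "D1 D8"),              # shift-by-one form, no immediate byte
    ("rcr eax,1",  "C1 D8 01"),           # shift-by-imm8 form, count stored as 01
    ("test eax,5", "A9 05 00 00 00"),     # short accumulator-only form
    ("test eax,5", "F7 C0 05 00 00 00"),  # general r/m32 form with ModRM byte C0
    ("ret",        "C3"),
]

def layout(entries):
    """Return (offset, length, mnemonic) rows plus the total size in bytes."""
    rows, offset = [], 0
    for mnemonic, hex_bytes in entries:
        length = len(bytes.fromhex(hex_bytes))
        rows.append((offset, length, mnemonic))
        offset += length
    return rows, offset

rows, total = layout(listing)
for offset, length, mnemonic in rows:
    print(f"{offset:04X}  {length} byte(s)  {mnemonic}")
print(f"total: {total:#x} bytes")
```

The computed offsets line up with the addresses in the dump, and the total matches the "11 .code" summary line (11h bytes).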

//\\o//\\annabee

Jan 11, 2008, 1:20:25 AM

On Thu, 10 Jan 2008 20:36:03 +0100, Wolfgang Kern <now...@never.at> wrote:

>
> check 00,A1,89 vs. 02,A3,8b codes, or look at Intels 25366714.pdf
>
> Volume 2B,Appendix B, B.1.7 "direction bit"

I find it in section B.1.4.8 of
the volume 2B Instruction Set Reference, appendix B:

I rename all my downloaded Intel manuals,
so I have no "253..." on my computer.

"
In many two operand instructions, a direction bit (d) indicates which
operand is considered the source and which is the destination, see Table B-11
"

think I heard you say it before, but I forgot since then.
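
For the opcode pairs mentioned above (00/02, A1/A3, 89/8B), the two members of each pair differ only in bit 1 of the opcode byte. A rough Python sketch of that, plus the two equivalent encodings of a register-to-register MOV; the helper names `direction_bit` and `modrm` are invented for this example, and the register numbers are the standard x86 ones.

```python
def direction_bit(opcode: int) -> int:
    """Bit 1 of the primary opcode byte (the 'd' bit of the manual's appendix)."""
    return (opcode >> 1) & 1

def modrm(mod: int, reg: int, rm: int) -> int:
    """Pack the three ModRM fields into one byte."""
    return (mod << 6) | (reg << 3) | rm

EAX, EBX = 0, 3  # x86 register numbers

# "mov eax, ebx" can be encoded either way round:
store_form = bytes([0x89, modrm(3, EBX, EAX)])  # 89 /r: MOV r/m32, r32
load_form = bytes([0x8B, modrm(3, EAX, EBX)])   # 8B /r: MOV r32, r/m32

for a, b in [(0x00, 0x02), (0xA1, 0xA3), (0x89, 0x8B)]:
    print(f"{a:02X} vs {b:02X}: d = {direction_bit(a)} vs {direction_bit(b)}")
print(" ".join(f"{x:02X}" for x in store_form), "|",
      " ".join(f"{x:02X}" for x in load_form))
```

Both byte sequences do the same thing, which is why an assembler is free to pick either one and why a programmer may want a way to force the choice.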

> not to confuse with the "direction flag".
>
>>> SMC for Self_Modifying_Code isn't a common known abbreviation ?
>> Not for me. I don't do "SMC". For one thing, it is not reenterant.
>
> Why can't SMC be reentrant ? It's main purpose is "short but reusable".
> It is rare used on PCs, I use it on Vmode changes to keep my routines.

Also me. Why, it can just rewrite, or copy the code ahead or after.

> [opcode variants]
>>> The assembler(the tool) may do it either way, but an ASM-programmer
>>> can decide to force a desired opcode if several variants exists.
>
>> What a drag! Having to scan the listings and decide to modify a
>> perfectly good instruction.
>
> You aren't forced to anyway, but a programmer may desire to chose.

Yep. In fact so much that he will seriously consider writing his own
assembler just to be 100% sure he gets what he asks for.
In my view, all assemblers should provide a hex listing, even with addresses
pseudoed into the table, as a cross reference. It could be an extra feature in
the margin, which could stay hidden most of the time. It should also be
possible to choose in which mode to write. Is this possible? I don't know,
but I figure it could be possible?

Terence

Jan 10, 2008, 6:32:52 PM

I wrote and recommended 16-bit MASM, because Greg is still doing a
correspondence course to complete his interrupted degree, while now in
Pretoria. The 16 bit version (and possibly DEBUG.exe) is what he
needs right now to experiment with on an x86 type computer.

Ratch

Jan 10, 2008, 8:02:43 PM

"Wolfgang Kern" <now...@never.at> wrote in message

news:fm6015$be8$3...@newsreader1.xoc.utanet.at...

>
> Ratch wrote:
> ...
>
>>> Offsets in windoze(tm) are relative to a cheated virtual address
>>> (which default seems to be 00400000h) and a 'paging'-OS can grant
>>> memory with an instance chosen start address.
>>> So also here there is no need for reallocation in the main code.
>>> Only external referenced labels will need a 'link'.
>
>> If that 040000000H "virtual address" is mapped to a different physical
>> address, then you will need relocation. Ratch
>
> ??? please show me how to access 'physical memory' in windoze.

You can't, at least not by standard methods.

>
> If this MASM relocation trick worked, I'd switch to MASM today

It is not a method or trick, it is a requirement. Any OS needs to
practice relocation if it wants to load a program at an arbitrary location
in its physical memory. Relocation is the bridge between logical addresses
and physical addresses. MASM and other assemblers that produce code for
Wintel must support relocation. Ratch

Ratch

Jan 10, 2008, 8:13:38 PM

"Wolfgang Kern" <now...@never.at> wrote in message

news:fm6013$be8$1...@newsreader1.xoc.utanet.at...

OK, that is documented. But why should I care about that when I have
an assembler that takes care of those details.

>
>>> SMC for Self_Modifying_Code isn't a common known abbreviation ?
>> Not for me. I don't do "SMC". For one thing, it is not reenterant.
>
> Why can't SMC be reentrant ? It's main purpose is "short but reusable".
> It is rare used on PCs, I use it on Vmode changes to keep my routines.

Because the first thread that uses that section of code modifies it,
and makes it unusable for the following threads.

>
> [opcode variants]
>>> The assembler(the tool) may do it either way, but an ASM-programmer
>>> can decide to force a desired opcode if several variants exists.
>
>> What a drag! Having to scan the listings and decide to modify a
>> perfectly good instruction.
>
> You aren't forced to anyway, but a programmer may desire to chose.

Seems to me like a lot of work for little gain. Ratch

hutch--

Jan 11, 2008, 2:21:36 AM

The level of ignorance here is truly underwhelming.

> ??? please show me how to access 'physical memory' in windoze.

I cannot help you with wInDoZe but the answer for Microsoft Windows is
very simple, write a ring0 device driver. Best of luck if you have to
try and use a Mickey Mouse Club assembler but the simple answer is
download the Microsoft DDK for the OS version you wish to assault and
you can hit the big time there when you use the industry standard MASM
to do the job.

Poor Wolfgang needs to understand that the 1992 DOS assembler approach
will not do the job with late model 32 and 64 bit protected mode
operating systems.

Ho hum, pigs bum etc ....

hutch at movsd dot com

Wolfgang Kern

Jan 11, 2008, 9:04:15 AM

Ratch wrote:

...

>>>> SMC for Self_Modifying_Code isn't a common known abbreviation ?
>>> Not for me. I don't do "SMC". For one thing, it is not reenterant.
>> Why can't SMC be reentrant ? It's main purpose is "short but reusable".
>> It is rare used on PCs, I use it on Vmode changes to keep my routines.

> Because the first thread that uses that section of code modifies it,
> and makes it unusable for the following threads.

SMC is usually not used in a per-thread way. Its advantage is found
in infrequently changed global routines, where it saves on multiple code
variants having to be loaded or kept in memory.

__
wolfgang

Wolfgang Kern

Jan 11, 2008, 9:43:19 AM

Ratch wrote:

>>>> Offsets in windoze(tm) are relative to a cheated virtual address
>>>> (which default seems to be 00400000h) and a 'paging'-OS can grant
>>>> memory with an instance chosen start address.
>>>> So also here there is no need for reallocation in the main code.
>>>> Only external referenced labels will need a 'link'.

>>> If that 040000000H "virtual address" is mapped to a different physical
>>> address, then you will need relocation. Ratch

>> ??? please show me how to access 'physical memory' in windoze.

> You can't, at least not by standard methods.

>> If this MASM relocation trick worked, I'd switch to MASM today

> It is not a method or trick, it is a requirement. Any OS needs to
> practice relocation if it wants to load a program at an arbitrary location
> in its physical memory.

Again, DOS, windoze, Linux and other paging OSes need relocation for segments
and external references only, and not for every code/data label.

> Relocation is the bridge between logical addresses and physical addresses.

NO! The page tables are the bridge and nothing else.
What would it help you to know the physical address within a paged OS ?
Don't tell me your application or even a ring0-driver will disable
the paging and use physical addressing from then on.
Only the system's memory manager needs to know this and can handle it.

> MASM and other assemblers that produce code for
> Wintel must support relocation. Ratch

Don't know Wintel, but I expect a similar issue.

__
wolfgang
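
The "page tables are the bridge" point can be sketched as a toy model. This is only an illustration in Python, assuming the classic 32-bit x86 split without PAE (10-bit directory index, 10-bit table index, 12-bit offset). The dictionaries stand in for real page directories, and the physical frame numbers are made up.

```python
def translate(page_directory, linear):
    """Walk a toy two-level table and return the physical address."""
    dir_index = (linear >> 22) & 0x3FF    # top 10 bits select the page table
    table_index = (linear >> 12) & 0x3FF  # next 10 bits select the page
    offset = linear & 0xFFF               # low 12 bits pass through unchanged
    page_table = page_directory.get(dir_index)
    if page_table is None or table_index not in page_table:
        raise LookupError(f"page fault at {linear:#010x}")
    return page_table[table_index] + offset

# Two hypothetical processes, both using virtual page 00401000h,
# backed by different (invented) physical frames.
process_a = {1: {1: 0x0012_3000}}
process_b = {1: {1: 0x0456_7000}}

va = 0x0040_1234
print(hex(translate(process_a, va)), hex(translate(process_b, va)))
```

The program bytes are identical in both cases. Only the table contents differ, so nothing in the code has to be patched when the physical location changes.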

Wolfgang Kern

Jan 11, 2008, 9:24:27 AM

Wannabee wrote:
..

>> Volume 2B,Appendix B, B.1.7 "direction bit"
> I find it in section B.1.4.8 of
> the volume 2B Instruction Set Referance, appendix B:

> I rename all my download intel manuals.
> So I have no "253..." on my computer.

I mainly use AMD-docs and have the Intel books just to
keep my disassembler up to date.

> "
> In many two operand instructions, a direction bit (d) indicates which
> operand is considered the source and which is the destination,
> see Table B-11
> "
> think I heard you say it before, but I forgot since then.

looks like this is mentioned once a year by someone anyway ;)

>> not to confuse with the "direction flag".

>>>> SMC for Self_Modifying_Code isn't a common known abbreviation ?
>>> Not for me. I don't do "SMC". For one thing, it is not reenterant.
>> Why can't SMC be reentrant ? It's main purpose is "short but reusable".
>> It is rare used on PCs, I use it on Vmode changes to keep my routines.

> Also me. Why, it can just rewrite, or copy the code ahead or after.

The disadvantage of SMC is that it may cause a whole TLB-flush
besides some cache lines becoming invalidated.
So this huge penalty restricts its usage, and it's not recommended
for frequent modify/execute.

[opcode variants]
>>>> The assembler(the tool) may do it either way, but an ASM-programmer
>>>> can decide to force a desired opcode if several variants exists.

> >> What a drag! Having to scan the listings and decide to modify a
> >> perfectly good instruction.
> >
> > You aren't forced to anyway, but a programmer may desire to chose.
>
> Yep. In fact so much that he will consider, seriously writing his own
> assembler just to be 100% sure he gets what he asks for.
> in my view, all asemblers should provide a hex listing, even with adresses
> pseudoed into the table, as a cross referance. Could be an extra feature in
> the margin, which could be not seen most the time. It should also be
> possible to choose in which mode to write. Is this possible? I dont know,
> but I figure it could be possible?

An assembler tool could offer a default configuration and/or
a right-click option for possible code variants ;)
__
wolfgang

//\\o//\\annabee

Jan 11, 2008, 9:22:36 AM

On Fri, 11 Jan 2008 15:24:27 +0100, Wolfgang Kern <now...@never.at> wrote:

> An assembler tool could offer a default configuration and/or
> a right-click option for possible code variants ;)

I have dreams.:)

> __
> wolfgang
>
>
>

Ratch

Jan 11, 2008, 12:37:48 PM

"Wolfgang Kern" <now...@never.at> wrote in message

news:fm7vc7$fib$4...@newsreader1.xoc.utanet.at...

>
> Ratch wrote:
>
>>>>> Offsets in windoze(tm) are relative to a cheated virtual address
>>>>> (which default seems to be 00400000h) and a 'paging'-OS can grant
>>>>> memory with an instance chosen start address.
>>>>> So also here there is no need for reallocation in the main code.
>>>>> Only external referenced labels will need a 'link'.
>
>>>> If that 040000000H "virtual address" is mapped to a different physical
>>>> address, then you will need relocation. Ratch
>
>>> ??? please show me how to access 'physical memory' in windoze.
>
>> You can't, at least not by standard methods.
>
>>> If this MASM relocation trick worked, I'd switch to MASM today
>
>> It is not a method or trick, it is a requirement. Any OS needs to
>> practice relocation if it wants to load a program at an arbitrary location
>> in its physical memory.
>
> Again, DOS,windoze,Linux and other paging-OS needs relocation for segments
> and external references only, and not for every code/data label.

I don't do DOS. I know nothing about Linux. I only do Windows, and
don't worry about segments because Win32 has a flat memory model.

>
>> Relocation is the bridge between logical addresses and physical
>> addresses.
>
> NO! The page tables are the bridge and nothing else.

I don't see how. Even the labels within the pages need to have
relocation unless they will always be loaded in the same physical memory.
And the code would have to be assembled at least once for a particular physical
address.

> What would it help you to know the physical address within an paged OS ?

If using relocation like Wintel does, and MASM supports, no good at
all. The linker and loader take care of that through relocation. If coding
without relocation, like maybe an embedded application, then the code has to
be assembled for a particular starting address and cannot be moved
elsewhere.

> Don't tell me your application or even a ring0-driver will disable
> the paging and use physical addressing furtherhin.

Nope, Wintel is a relocation based method.

> Only the systems memory-manager need to know and can handle this.

Through relocation.

>
>> MASM and other assemblers that produce code for
>> Wintel must support relocation. Ratch
>
> Don't know Wintel, but I expect a similar issue.

Wintel means relocation. Ratch

Frank Kotler

Jan 11, 2008, 3:45:42 PM

Ratch wrote:
> "Wolfgang Kern" <now...@never.at> wrote in message

...

> I don't do DOS. I know nothing about Linux. I only do Windows, and
> don't worry about segments because Win-32 it has a flat memory model.

fs?

>>>Relocation is the bridge between logical addresses and physical
>>>addresses.
>>
>>NO! The page tables are the bridge and nothing else.
>
> I don't see how. Even the labels within the pages need to have
> relocation unless they will always be loaded in the same physical memory.
> And the code would have to assembled at least once for a particular physical
> address.

Would you like to rephrase this? Or are you willing to have us believe
that you have *no* idea what you're talking about?

Best,
Frank

//\\o//\\annabee

Jan 11, 2008, 2:03:42 PM

On Fri, 11 Jan 2008 21:45:42 +0100, Frank Kotler
<fbko...@verizon.net> wrote:

I guess I misunderstood this issue. I am not even sure what you are
talking about.
Calls and jumps are relative, so they will work even if the base has moved,
but not if they refer to data (if the data base has moved).

>
> Best,
> Frank
>

Frank Kotler

Jan 11, 2008, 4:45:11 PM

Ratch wrote:

...

> MASM and other assemblers that produce code for
> Wintel must support relocation. Ratch

Dunno if this DEBUG script still works. Used to. You telling me that
DEBUG supports relocation?

Best,
Frank

f 100 354 0
e 100 'MZ'
e 104 2
e 108 2
e 10a 1e
e 10c 1e
e 111 2
a 120
push cs
pop ds
mov dx,e
mov ah,9
int 21
mov ax,4c01
int 21

e 12e 'Win32 EXE!'
e 138 7 d a 24 40
e 140 'PE'
e 144 4c 1 1
e 154 e0 0 2 1 b 1
e 169 10
e 176 40
e 179 10
e 17d 2
e 180 4
e 188 4
e 190 54
e 195 2
e 19c 2
e 1a2 10
e 1a5 10
e 1aa 10
e 1ad 10
e 1b4 10
e 1c0 11 10
e 1c4 43
e 238 '.text'
e 245 10
e 248 54
e 24d 2
e 25c 60
e 25f e0
e 300 b8 25 10 40 00 99 52 50 50 52 ff 15 40 10 40
e 310 c3 40 10
e 319 ff ff ff ff 35 10
e 321 40 10
e 325 'Hello world!'
e 335 'user32.dll'
e 340 46 10
e 348 'MessageBoxA'

n hwmb2.com
r cx
254

w
q

Ratch

Jan 11, 2008, 7:22:56 PM

"Frank Kotler" <fbko...@verizon.net> wrote in message
news:WzQhj.492$rG.270@trndny02...

> Ratch wrote:
>> "Wolfgang Kern" <now...@never.at> wrote in message
>
> ...
>> I don't do DOS. I know nothing about Linux. I only do Windows, and
>> don't worry about segments because Win-32 it has a flat memory model.
>
> fs?

Not for me to worry about.

>
>>>>Relocation is the bridge between logical addresses and physical
>>>>addresses.
>>>
>>>NO! The page tables are the bridge and nothing else.
>>
>> I don't see how. Even the labels within the pages need to have
>> relocation unless they will always be loaded in the same physical memory.
>> And the code would have to assembled at least once for a particular
>> physical address.
>
> Would you like to rephrase this?

No.

>Or are you willing to have us believe that you have *no* idea what you're
>talking about?

Believe what you want. Ratch

Ratch

Jan 11, 2008, 7:26:16 PM

"//\\o//\\annabee" <w...@www.akow> wrote in message
news:op.t4r4kgbowzh472@cyh1axtn1428g42...

That's right. Jumps to a label are relative from the jump instruction,
but loads from a label are logical addresses. They have to be converted to
a physical address through relocation. Ratch

Ratch

Jan 11, 2008, 7:30:23 PM

"Frank Kotler" <fbko...@verizon.net> wrote in message

news:HrRhj.230$rV6.22@trndny06...

> Ratch wrote:
>
> ...
>> MASM and other assemblers that produce code for Wintel must support
>> relocation. Ratch
>
> Dunno if this DEBUG script still works. Used to. You telling me that DEBUG
> supports relocation?

That's DOS code. I don't do DOS. I have no idea what that DEBUG
script does, or what it means. Ratch

Frank Kotler

Jan 11, 2008, 8:15:37 PM

//\\o//\\annabee wrote:
> On Fri, 11 Jan 2008 21:45:42 +0100, Frank Kotler
> <fbko...@verizon.net> wrote:
>
>> Ratch wrote:
>>
>>> "Wolfgang Kern" <now...@never.at> wrote in message
>>
>> ...
>>> I don't do DOS. I know nothing about Linux. I only do
>>> Windows, and don't worry about segments because Win-32 it has a flat
>>> memory model.
>>
>> fs?

The point being that fs points to "thread local storage" in Windows, and
does *not* have a "base" of zero. (AFAIK)

>>>>> Relocation is the bridge between logical addresses and physical
>>>>> addresses.
>>>>
>>>>
>>>> NO! The page tables are the bridge and nothing else.
>>>
>>> I don't see how. Even the labels within the pages need to
>>> have relocation unless they will always be loaded in the same
>>> physical memory. And the code would have to assembled at least once
>>> for a particular physical address.
>>
>>
>> Would you like to rephrase this? Or are you willing to have us
>> believe that you have *no* idea what you're talking about?
>
>
> I guess I misunderstood this issue. I am not even sure what you are
> talking about.
> Call and jumps are relative, so it will work even if the base has
> moved, but not
> if it refers to data. (if the database has moved)

Correct. But this has nothing whatever to do with *physical* memory.

I'm not up-to-date with Windows... last I knew, "apps" were loaded to a
known address - a 4 with some number of zeros... we've seen 40000h and
40000000h posted in this thread - I thought it was 4000000h... Dlls had
a "preferred" load address, but *could* be relocated if necessary to
load at a different address. But this has *nothing* to do with physical
memory addresses - which we ordinarily don't know and don't care about.
The paging mechanism takes care of this, and it has *nothing* to do with
relocation.

Best,
Frank
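
The difference between a relative call and an absolute data reference when the (virtual) load base moves can be shown with a few lines of Python. This is a sketch under stated assumptions: the second base address and the three RVAs are invented for illustration, and the helper names are not from any real tool.

```python
def call_rel32(site, target):
    """Encode CALL rel32 (opcode E8); the displacement counts from the next instruction."""
    rel = (target - (site + 5)) & 0xFFFFFFFF
    return bytes([0xE8]) + rel.to_bytes(4, "little")

def mov_ecx_abs(address):
    """Encode MOV ECX,[abs32] (8B 0D followed by a 32-bit virtual address)."""
    return bytes([0x8B, 0x0D]) + address.to_bytes(4, "little")

OLD_BASE, NEW_BASE = 0x0040_0000, 0x0050_0000          # NEW_BASE is a made-up alternative
CODE_RVA, FUNC_RVA, DATA_RVA = 0x1000, 0x1020, 0x3000  # hypothetical layout

for base in (OLD_BASE, NEW_BASE):
    call = call_rel32(base + CODE_RVA, base + FUNC_RVA)
    load = mov_ecx_abs(base + DATA_RVA)
    print(f"base {base:08X}: call = {call.hex()}  mov = {load.hex()}")
```

The call bytes come out the same for both bases, while the MOV operand changes. Every address involved is a virtual one; physical memory does not appear anywhere in the calculation.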

Frank Kotler

Jan 11, 2008, 8:24:01 PM

Ratch wrote:

...
> That's DOS code.

Is it? How do you make that distinction?

> I don't do DOS. I have no idea what that DEBUG
> script does, or what it means.

Agreed.

Any idea what this means?

>>e 140 'PE'

Best,
Frank

Ratch

Jan 11, 2008, 9:47:52 PM

"Frank Kotler" <fbko...@verizon.net> wrote in message

news:REUhj.893$YW6.98@trndny07...

> Ratch wrote:
>
> ...
>> That's DOS code.
>
> Is it? How do you make that distinction?

Because I see INT instructions in the code.

>
>> I don't do DOS. I have no idea what that DEBUG script does, or what it
>> means.
>
> Agreed.
>
> Any idea what this means?
>
>>>e 140 'PE'

No. Ratch

Ratch

Jan 11, 2008, 10:03:53 PM

"Frank Kotler" <fbko...@verizon.net> wrote in message

news:ZwUhj.250$sA6.218@trndny08...

> //\\o//\\annabee wrote:
>> On Fri, 11 Jan 2008 21:45:42 +0100, Frank Kotler
>> <fbko...@verizon.net> wrote:
>>
>>> Ratch wrote:
>>>
>>>> "Wolfgang Kern" <now...@never.at> wrote in message
>>>
>>> ...
>>>> I don't do DOS. I know nothing about Linux. I only do Windows,
>>>> and don't worry about segments because Win-32 it has a flat memory
>>>> model.
>>>
>>> fs?
>
> The point being that fs points to "thread local storage" in Windows, and
> does *not* have a "base" of zero. (AFAIK)

The Windows OS takes care of loading FS and any other seg registers it
needs. That is why I don't worry about it. Logically, the addresses look
flat to the programmer.

>
>>>>>> Relocation is the bridge between logical addresses and physical
>>>>>> addresses.
>>>>>
>>>>>
>>>>> NO! The page tables are the bridge and nothing else.
>>>>
>>>> I don't see how. Even the labels within the pages need to have
>>>> relocation unless they will always be loaded in the same physical
>>>> memory. And the code would have to assembled at least once for a
>>>> particular physical address.
>>>
>>>
>>> Would you like to rephrase this? Or are you willing to have us believe
>>> that you have *no* idea what you're talking about?
>>
>>
>> I guess I misunderstood this issue. I am not even sure what you are
>> talking about.
>> Call and jumps are relative, so it will work even if the base has moved,
>> but not
>> if it refers to data. (if the database has moved)
>
> Correct. But this has nothing whatever to do with *physical* memory.

Sure it does. Because, when a program is loaded at an arbitrary
physical address of memory, the operand of a MOV EAX,TAG1 has to change.
The linker and loader cannot change every operand, otherwise it would change
operands that are constants. So MASM marks the operands that are
relocatable and the linker and loader use that info to modify the operand
for the particular physical base address.

>
> I'm not up-to-date with Windows... last I knew, "apps" were loaded to a
> known address - a 4 with some number of zeros... we've seen 40000h and
> 40000000h posted in this thread - I thought it was 4000000h... Dlls had a
> "preferred" load address, but *could* be relocated if neccessary to load
> at a different address. But this has *nothing* to do with physical memory
> addresses - which we ordinarily don't know and don't care about. The
> paging mechanism takes care of this, and it has *nothing* to do with
> relocation.

The paging mechanism is practicing relocation. MASM marks the operand
fields as relocatable so the linker and loader know what operands to
adjust. They are in cahoots with each other. Ratch

Frank Kotler

Jan 11, 2008, 10:21:19 PM

Ratch wrote:
> "Frank Kotler" <fbko...@verizon.net> wrote in message
> news:REUhj.893$YW6.98@trndny07...
>
>>Ratch wrote:
>>
>>...
>>
>>> That's DOS code.
>>
>>Is it? How do you make that distinction?
>
>
> Because I see INT instructions in the code.

I see. So if I looked in a Windows executable made with Masm, I wouldn't
see an "int" instruction, is that correct?

Best,
Frank

Ratch

Jan 11, 2008, 10:57:55 PM

"Frank Kotler" <fbko...@verizon.net> wrote in message

news:PmWhj.510$rG.496@trndny02...

I believe that is correct for ordinary application programs. I don't
know for sure about ring-0 or dll programs, I never wrote any. Ratch

Charles Crayne

Jan 11, 2008, 11:01:48 PM

On Fri, 11 Jan 2008 20:47:52 -0600
"Ratch" <wat...@comcast.net> wrote:

> Because I see INT instructions in the code.

I don't want to spoil the little game which you and Frank are playing,
but here's a hint for those who are watching from the sidelines. Does
anyone remember trying to run a Windows program from DOS and getting a
message that the program requires Windows?

-- Chuck

Keith Kanios

Jan 12, 2008, 12:34:23 AM

On Jan 11, 10:01 pm, Charles Crayne <ccra...@crayne.org> wrote:
> On Fri, 11 Jan 2008 20:47:52 -0600
>

> "Ratch" <watc...@comcast.net> wrote:
> > Because I see INT instructions in the code.
>
> I don't want to spoil the little game which you are Frank are playing,
> but here's a hint for those who are watching from the sidelines. Does
> anyone remember trying to run a Windows program from DOS and getting a
> message that the program requires Windows?
>
> -- Chuck

You mean the INT 0x21 that most likely contributed to printing such a
message embedded in the MZ (DOS STUB) portion of a PE... for
starters???

IIRC, NT utilizes INT 0x2E as the kernel entry-point, akin to *nix
0x80, but the function entry numbers seem to change from service pack
to service pack making it more desirable to just utilize the
encapsulating Win32 DLLs.

Frank Kotler

Jan 12, 2008, 2:51:06 AM

Ratch wrote:

...
>>>>> That's DOS code.
>>>>
>>>>Is it? How do you make that distinction?
>>>
>>>
>>> Because I see INT instructions in the code.
>>
>>I see. So if I looked in a Windows executable made with Masm, I wouldn't
>>see an "int" instruction, is that correct?
>
>
> I believe that is correct for ordinary application programs. I don't
> know for sure about ring-0 or dll programs, I never wrote any. Ratch

Take a look at offset 47h into your ordinary application programs -
seems to be a "fixed" offset - and see if you don't see CD 21. As Chuck
hinted, it's the "dos stub" that says "this program requires Windows",
or "won't run in a DOS session", or whatever it says these days - one I
posted said "Win32 EXE!" - obsessively short ones just say "Win32!", or
don't do anything at all. But you really ought to have a dos stub - I
think it's "always" there.

While you're peering into your executables, see if you see ".reloc" (MS
says) or ".rloc" (what I see... in a .dll but not in an .exe) - near
where you see ".text" and ".data", etc.. I don't think you'll see it in
an .exe.

The relocation section (in a .dll) is used to adjust memory offsets if
the code doesn't load at its preferred address, as you know. Since an
.exe goes into memory first, it always loads at 400000h, and so doesn't
need relocation. But it's strictly virtual addresses that are being
manipulated. Has nothing to do with where it may be mapped to in
physical memory. That's handled by the paging mechanism - part software
and part hardware. Relocation is not "the bridge between logical
addresses and physical addresses". It isn't a question of "belief".

Best,
Frank
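
The adjustment a relocation section allows can be sketched like this in Python. It is a simplified illustration, not the real PE base-relocation format: the fixup list is a plain list of byte offsets, the alternative base 01000000h is made up, and the code bytes are the `mov ecx` pattern with absolute 32-bit operands.

```python
import struct

def apply_fixups(image: bytearray, fixup_offsets, preferred_base, actual_base):
    """Add (actual - preferred) to each 32-bit absolute address listed in fixup_offsets."""
    delta = actual_base - preferred_base
    for off in fixup_offsets:
        (value,) = struct.unpack_from("<I", image, off)
        struct.pack_into("<I", image, off, (value + delta) & 0xFFFFFFFF)
    return image

# mov ecx,[00403000] ; mov ecx,[00403004] ; push 0
code = bytearray.fromhex("8B0D00304000" "8B0D04304000" "6A00")
fixups = [2, 8]  # offsets of the two absolute operands; the constant in "push 0" is not listed
apply_fixups(code, fixups, 0x0040_0000, 0x0100_0000)
print(code.hex())
```

Only the operands named in the fixup list change, and they change from one virtual address to another. Which physical frames end up behind those pages is decided separately by the paging mechanism.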

robert...@yahoo.com

Jan 12, 2008, 3:18:16 AM

On Jan 12, 1:51 am, Frank Kotler <fbkot...@verizon.net> wrote:
> Take a look at offset 47h into your ordinary application programs -
> seems to be a "fixed" offset - and see if you don't see CD 21. As Chuck
> hinted, it's the "dos stub" that says "this program requires Windows",
> or "won't run in a DOS session", or whatever it says these days - one I
> posted said "Win32 EXE!" - obsessivly short ones just say "Win32!", or
> don't do anything at all. But you really ought to have a dos stub - I
> think it's "always" there.

I'm not sure you can call the stub part of a Win32 program. Other
than figuring out how long it is (from the MZ header), so that the
loader can find the PE header, it's completely ignored when loading a
Win32 program. It can be quite ill-formed, and nothing will happen,
unless you try running the program on a DOS machine (or with
FORCEDOS).
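
How a loader gets from the MZ header to the PE header can be sketched in a few lines of Python. This assumes the standard layout, where the dword at file offset 3Ch of the MZ header holds the offset of the PE signature; the function name and the tiny test buffer are invented for this example.

```python
import struct

def find_pe_header(data: bytes) -> int:
    """Return the file offset of the PE signature, or raise ValueError."""
    if data[:2] != b"MZ":
        raise ValueError("no MZ header")
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)  # pointer stored at offset 3Ch
    if data[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("no PE signature")
    return e_lfanew

# Tiny made-up buffer: MZ magic, zeros where a stub would be, PE signature at 40h.
fake = bytearray(0x44)
fake[0:2] = b"MZ"
struct.pack_into("<I", fake, 0x3C, 0x40)
fake[0x40:0x44] = b"PE\0\0"
print(hex(find_pe_header(bytes(fake))))
```

Nothing between the MZ magic and the PE signature is inspected here, which fits the remark that the stub itself can be ill-formed without affecting a Win32 load. The DEBUG script earlier in the thread appears to follow the same layout: DEBUG addresses start at 100h, so the byte 40 at 13Ch is file offset 3Ch and 'PE' at 140h is file offset 40h.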

> While you're peering into your executables, see if you see ".reloc" (MS
> says) or ".rloc" (what I see... in a .dll but not in an .exe) - near
> where you see ".text" and ".data", etc.. I don't think you'll see it in
> an .exe.
>
> The relocation section (in a .dll) is used to adjust memory offsets if
> the code doesn't load at its preferred address, as you know. Since an
> .exe goes into memory first, it always loads at 400000h, and so doesn't
> need relocation.

By default an EXE is linked with /FIXED and /BASE:0x400000, so the
relocation section is omitted. Since the EXE almost always loads
first, the area at 0x400000 is inherently free. This speeds load time
(no relocations), improves paging (since images of the relocated pages
are not needed, the loaded image can be just a collection of memory
maps of sections of the EXE file), and reduces executable file size.

OTOH, if you link with /FIXED:NO (the default for DLLs), relocation
information is included. Further, if you /BASE the executable
somewhere odd, it will be relocated by the loader (for example, if
you /BASE:0, Windows will relocate the image to 0x10000, since it
can't run at 0x0). If you /BASE the application someplace odd (like
0), and you link with /FIXED, it will fail to load.

There's nothing preventing you from linking a DLL /FIXED, although
that will cause problems if something else loads at that location
first.
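
The combinations of /FIXED and /BASE described above boil down to a small decision. Here is a toy Python model of it, not the actual Windows loader: the function name, the `is_free` callback, the occupied address and the fallback addresses are all hypothetical.

```python
def choose_load_address(preferred_base, has_relocations, is_free, fallback_base):
    """Return the virtual base the image ends up at, or None if it cannot load."""
    if preferred_base != 0 and is_free(preferred_base):
        return preferred_base   # loads where it was linked, no fixups needed
    if has_relocations:
        return fallback_base    # loader applies fixups with delta = fallback - preferred
    return None                 # linked /FIXED and the preferred base is unusable

occupied = {0x1000_0000}        # made-up address already taken by another module

def is_free(base):
    return base not in occupied

cases = {
    "EXE, /FIXED, default base":   choose_load_address(0x0040_0000, False, is_free, 0x0200_0000),
    "EXE, /FIXED:NO, /BASE:0":     choose_load_address(0x0, True, is_free, 0x0001_0000),
    "EXE, /FIXED, /BASE:0":        choose_load_address(0x0, False, is_free, 0x0001_0000),
    "DLL, /FIXED, base occupied":  choose_load_address(0x1000_0000, False, is_free, 0x0200_0000),
    "DLL, relocs, base occupied":  choose_load_address(0x1000_0000, True, is_free, 0x0200_0000),
}
for name, base in cases.items():
    print(f"{name:30} -> {'fails to load' if base is None else f'{base:08X}'}")
```

In every branch the value returned is a virtual address inside the process, chosen before any page is mapped to physical memory.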

Charles Crayne

Jan 12, 2008, 3:56:40 AM

On Sat, 12 Jan 2008 00:18:16 -0800 (PST)
"robert...@yahoo.com" <robert...@yahoo.com> wrote:

> By default an EXE is linked with /FIXED and /BASE:0x400000, so the
> relocation section is omitted.

The key point here is "linked". The object files have to contain
relocation information, or multiple object files could NOT be linked
into a single executable.

-- Chuck

Ratch

Jan 12, 2008, 10:57:45 AM

"Frank Kotler" <fbko...@verizon.net> wrote in message

news:Kj_hj.536$rG.248@trndny02...

> Ratch wrote:
>
> ...
>>>>>> That's DOS code.
>>>>>
>>>>>Is it? How do you make that distinction?
>>>>
>>>>
>>>> Because I see INT instructions in the code.
>>>
>>>I see. So if I looked in a Windows executable made with Masm, I wouldn't
>>>see an "int" instruction, is that correct?
>>
>>
>> I believe that is correct for ordinary application programs. I
>> don't know for sure about ring-0 or dll programs, I never wrote any.
>> Ratch
>
> Take a look at offset 47h into your ordinary application programs - seems
> to be a "fixed" offset - and see if you don't see CD 21. As Chuck hinted,
> it's the "dos stub" that says "this program requires Windows", or "won't
> run in a DOS session", or whatever it says these days - one I posted said
> "Win32 EXE!" - obsessivly short ones just say "Win32!", or don't do
> anything at all. But you really ought to have a dos stub - I think it's
> "always" there.

I could care less. I know I did not put it there, so I don't know
about it. I let the linker and OS loader worry about those things, and I
don't try to second guess them.

>
> While you're peering into your executables, see if you see ".reloc" (MS
> says) or ".rloc" (what I see... in a .dll but not in an .exe) - near where
> you see ".text" and ".data", etc.. I don't think you'll see it in an .exe.

I never peered into an exe, and it does not interest me. I let the
tools that were written for that job do their thing, and I don't try to do
their work.

>
> The relocation section (in a .dll) is used to adjust memory offsets if the
> code doesn't load at its preferred address, as you know. Since an .exe
> goes into memory first, it always loads at 400000h, and so doesn't need
> relocation. But it's strictly virtual addresses that are being
> manipulated. Has nothing to do with where it may be mapped to in physical
> memory. That's handled by the paging mechanism - part software and part
> hardware.

Just by definition, the paging mechanism is a relocation method. It
converts a logical address to a physical address. Isn't that obvious?

> Relocation is not "the bridge between logical
> addresses and physical addresses". It isn't a question of "belief".

No, it is a method of relocation. Ratch

Keith Kanios

Jan 12, 2008, 2:07:00 PM

On Jan 12, 9:57 am, "Ratch" <watc...@comcast.net> wrote:
> Just by definition, the paging mechanism is a relocation method. It
> converts a logical address to a physical address. Isn't that obvious?
>

I haven't heard that one before.

However, I've heard/implemented other definitions of what paging is/
does...

1.) To get around physical address space fragmentation/holes.
2.) For memory space isolation, a form of protection.
3.) To abstract the architecture's full memory address space even if
you don't have the equivalent amount of RAM to support it.
4.) To establish shared memory regions despite potential barriers,
e.g. a monolithic kernel, dynamic libraries and file buffers.
5.) As a means of memory/buffer overflow protection, e.g. stack
"guard" pages.

I think there may be a few more definitions, but the above should
cover the majority of expected use.

I've always heard of relocation as a by-product of #2 and a result of
#4 above, but never as a method *of* paging.

As an example, your standard PE is linked with the assumption of
having a base address at 0x00400000. If for some reason it can't load
there, the loader will try to find a memory location sufficient for
loading and recalculate all appropriate relocatable addresses based
upon the newly established base address. Please note that this is
beyond your standard relocation calculations like those done for DLL
linking.
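
The recalculation described above can be sketched roughly as follows. This is only an illustration in Python, not the actual Windows loader: the function name `rebase`, the hand-written fixup list, and the new base 0x00500000 are assumptions made for the example, and it presumes the image still carries its relocation information. The instruction bytes are the `mov ecx D$SomeData` encoding quoted elsewhere in this thread.

```python
import struct

PREFERRED_BASE = 0x00400000  # base address the image was linked for

def rebase(image, fixup_offsets, new_base, preferred_base=PREFERRED_BASE):
    """Add (new_base - preferred_base) to each listed 32-bit absolute address."""
    delta = new_base - preferred_base
    for off in fixup_offsets:
        (addr,) = struct.unpack_from("<I", image, off)
        struct.pack_into("<I", image, off, (addr + delta) & 0xFFFFFFFF)
    return image

# 8B 0D 00 30 40 00 is mov ecx, [0x00403000]; the address field starts at offset 2.
code = bytearray.fromhex("8b0d00304000")
rebase(code, [2], new_base=0x00500000)   # hypothetical new load address
print(" ".join(f"{b:02x}" for b in code))
```

Only the four address bytes change; the opcode bytes stay as they were.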

Wolfgang Kern

Jan 12, 2008, 3:20:04 PM

Ratch wrote:

And now you have stumbled into the trap you prepared for yourself (period).

Sometimes it may be wise to accept the truth ...
but a few youngsters always try to prove or sell what's in their mind,
or, even worse, sell a shortcoming of their preferred tool as a feature.

__
wolfgang (for me the discussion with Ratch ends here;
other detailed questions, if any at all, are welcome)

//\\o//\\annabee

Jan 12, 2008, 1:40:10 PM

I have gaps to fill here for sure.
What happens with the data segment inside the PE?

If I say

[somedata: 100]
mov ecx D$SomeData ; "static" data "segment"

Does the CPU translate the label's address based on information stored
elsewhere, or does the "linker" do it? What about a dynamic segment?
That is allocated at runtime, yes? So then it must be the CPU doing the
translation, and relocation must just be a way of telling the CPU the
new relative address?

[somedata2: ?]

mov ecx D$SomeData2

If I save this code (its binary) to a file, then CHANGE the PE and
load the binary file, it will no longer work. The address will be
translated incorrectly.

So all data and code can be "relocated", and this must be just a small
operation for the loader, which somehow informs the CPU where to find the
new relative address for the page(s)? Then what I said earlier must be
wrong.

The binary:
8b 0d 00 30 40 00 (00403000) ; mov ecx D$Somedata
8b 0d 04 30 40 00 (00403004) ; mov ecx D$SomeData2

6A 00 ; push 0
FF 15 30 10 40 00 (00401030) ; call "kernel32.ExitProcess"

Hmm. Just 4 bytes apart? That's weird, isn't it?
Shouldn't those segments be further apart?
What's the rest of the PE filled with, then?

Did you do some trickery here, Betov? Did you get fed
up with being criticized for the PE being sooooo "large" and
implement some tricks to make it smaller in the case
of small PEs? :D

Then why is it so big?

See what I am saying?

:)

A hex listing is definitely needed sometimes.

Ratch

Jan 12, 2008, 4:07:36 PM

"Keith Kanios" <ke...@kanios.net> wrote in message
news:8ebe1a7f-dd88-4e6b...@v29g2000hsf.googlegroups.com...

> On Jan 12, 9:57 am, "Ratch" <watc...@comcast.net> wrote:
>> Just by definition, the paging mechanism is a relocation method. It
>> converts a logical address to a physical address. Isn't that obvious?
>>
>
> I haven't heard that one before.
>
> However, I've heard/implemented other definitions of what paging is/
> does...
>
> 1.) To get around physical address space fragmentation/holes.
> 2.) For memory space isolation, a form of protection.
> 3.) To abstract the architecture's full memory address space even if
> you don't have the equivalent amount of RAM to support it.
> 4.) To establish shared memory regions despite potential barriers,
> e.g. a monolithic kernel, dynamic libraries and file buffers.
> 5.) As a means of memory/buffer overflow protection, e.g. stack
> "guard" pages.
>
> I think there may be a few more definitions, but the above should
> cover the majority of expected use.

I would call the above a list of uses for paging, but not a definition
of paging itself.

>
> I've always heard of relocation as a by-product of #2 and a result of
> #4 above, but never as a method *of* paging.

I never spoke of a method of paging. I talked about paging being a
method of relocation.

>
> As an example, your standard PE is linked with the assumption of
> having a base address at 0x00400000. If for some reason it can't load
> there, the loader will try to find a memory location sufficient for
> loading and recalculate all appropriate relocatable addresses based
> upon the newly established base address. Please note that this is
> beyond your standard relocation calculations like those done for DLL
> linking.

I will agree with that. All high level OS's have dynamic allocators
that practice dynamic relocation in addition to supporting the static
allocation from the assembler/linker/loader. Ratch

Keith Kanios

Jan 12, 2008, 4:30:40 PM

On Jan 12, 12:40 pm, //\\\\o//\\\\annabee <w...@www.akow> wrote:
>
> I have gaps to fill here for sure.
> What happens with the datasegment, inside the PE?
>
> if I say
>
> [somedata: 100]
> mov ecx D$SomeData ; "static" data "segment"
>
> Does the CPU perform the relative translation of the label based on
> information elsewhere, or does the "linker", do it??

The linker provides structures so that the loader can do its job.

> What about a dynamic
> segment. That is allocated at runtime, yes? So then it must be the CPU
> doing the translation. And the relocation must then just be telling
> somehow the CPU the new relative adresse?

A "dynamic" segment? Let me introduce you to the world of computing
without the unnecessary abstractions. You have data that loads from a
pre-existing source, and then you have dynamic (e.g. malloc)
allocations. Either way, the CPU has nothing to do with it, as it only
provides the means by which paging structures become relevant.

However, if you are talking about actual *segmentation*, then you are
way out of scope.

> [somedata2: ?]
>
> mov ecx D$SomeData2
>
> If I save this code (its binary) to a file,
> and CHANGE the PE, and then load the binary file, then
> it will no more work. The address will be incorrectly translated.

Study the PE internals from the official documentation, it can be
enlightening if you keep an open mind.

> So all data and code can be "relocated" and this must be just some small
> operation for the loader, to just somehow inform the CPU where to find the
> new relative address for the page(s)?. then what I said earlier must be
> wrong.
>
> The binary:
> 8b 0d 00 30 40 00 (00403000) ; mov ecx D$Somedata
> 8b 0d 04 30 40 00 (00403004) ; mov ecx D$SomeData2
>
> 6A 00 ; push 0
> FF 15 30 10 40 00 (00401030) ; call "kernel32.ExitProcess"
>
> hmm. just 4 bytes appart? thats weird isnt it?
> shouldnt those segements be at further apart?
> Whats the rest of the PE filled with then?
>

A smart linker won't waste another 4KB just so that a static variable
can have its own "segment" ;)
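
One way to check this against the listing quoted above is to pull the 32-bit operand out of each instruction and see which 4 KB page it falls in. The following is a small Python sketch; the helper name `absolute_operand` is made up for the example, and it assumes the address is simply the last four bytes of each instruction, stored little-endian, which holds for these two encodings.

```python
import struct

PAGE_SIZE = 4096  # 4 KB pages, as on 32-bit x86

def absolute_operand(instr_hex):
    """Return the 32-bit little-endian address held in the last 4 bytes."""
    raw = bytes.fromhex(instr_hex)
    return struct.unpack("<I", raw[-4:])[0]

first = absolute_operand("8b 0d 00 30 40 00")   # mov ecx D$SomeData
second = absolute_operand("8b 0d 04 30 40 00")  # mov ecx D$SomeData2

print(hex(first), hex(second))                    # the two data addresses
print(second - first)                             # distance in bytes
print(first // PAGE_SIZE == second // PAGE_SIZE)  # do they share a page?
```

Both variables sit next to each other in the same data section, so nothing needs to be placed between them.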

Once again, read the official documentation, don't depend on RosASM's
half-assed implementation.

//\\o//\\annabee

Jan 12, 2008, 3:03:51 PM

On Sat, 12 Jan 2008 22:30:40 +0100, Keith Kanios <ke...@kanios.net> wrote:

> On Jan 12, 12:40 pm, //\\\\o//\\\\annabee <w...@www.akow> wrote:
>>
>> I have gaps to fill here for sure.
>> What happens with the datasegment, inside the PE?
>>
>> if I say
>>
>> [somedata: 100]
>> mov ecx D$SomeData ; "static" data "segment"
>>
>> Does the CPU perform the relative translation of the label based on
>> information elsewhere, or does the "linker", do it??
>
> The linker provides structures so that the loader can do its job.

That's fine with me, but it does not explain the thing. It's just a
meaningless sentence.

>
>> What about a dynamic
>> segment. That is allocated at runtime, yes? So then it must be the CPU
>> doing the translation. And the relocation must then just be telling
>> somehow the CPU the new relative adresse?
>
> A "dynamic" segment? Let me introduce you to the world of computing
> without the unnecessary abstractions. You have data that loads from a
> pre-existing source, and then you have dynamic (e.g. malloc)
> allocations. Either way, the CPU has nothing to do with it, as it only
> provides the means in which paging structures become relevant.
>
> However, if you are talking about actual *segmentation*, then you are
> way off scope.

The way I pictured it, each one took at least 512 bytes.
I always thought (and had been told) that RosAsm "dynamic" memory is
allocated at runtime: it is summed up and then allocated by a dynamic
memory call (by the loader, I would guess, perhaps added to the running
PE image), while the "static" memory is real memory inside the PE image
(stored on disk), i.e. it increases the size of the file, while
"dynamic data" does not.

For RosAsm, we have the icon, which takes some memory and requires
some room. Then we have the code itself, the source, and the static
data area. Are there more? I may have to read the RosAsm source to
make sure.

>> [somedata2: ?]
>>
>> mov ecx D$SomeData2
>>
>> If I save this code (its binary) to a file,
>> and CHANGE the PE, and then load the binary file, then
>> it will no more work. The address will be incorrectly translated.
>
> Study the PE internals from the official documentation, it can be
> enlightening if you keep an open mind.

Yes. But this is why I chose to use an assembler: to avoid having to know
it all at once.
This was way too much for me to start with when I first started asm
programming. That is why I chose RosAsm, which makes coding in asm so easy.

>> So all data and code can be "relocated" and this must be just some small
>> operation for the loader, to just somehow inform the CPU where to find
>> the
>> new relative address for the page(s)?. then what I said earlier must be
>> wrong.
>>
>> The binary:
>> 8b 0d 00 30 40 00 (00403000) ; mov ecx D$Somedata
>> 8b 0d 04 30 40 00 (00403004) ; mov ecx D$SomeData2
>>
>> 6A 00 ; push 0
>> FF 15 30 10 40 00 (00401030) ; call "kernel32.ExitProcess"
>>
>> hmm. just 4 bytes appart? thats weird isnt it?
>> shouldnt those segements be at further apart?
>> Whats the rest of the PE filled with then?
>>
>
> A smart linker won't waste another 4KB just so that a static variable
> can have its own "segment" ;)

The PE is 3.5 KB with icon, source, data, and code. This is less than the
cluster it occupies on disk.

> Once again, read the official documentation, don't depend on RosASM's
> half-assed implementation.

Why don't you read it? If you had, you would have been able to explain
it.

I have coded 3+ megs with this assembler. As far as I can tell, it is not
only the best assembler ever created (I also coded a small app in NASM),
it is by far the most impressive _programming_ tool in all of existence.
As for half-assed, I can assure you that what is half-assed is _all_ the
other assemblers that are not your version of "half-assed", for the
simple reason that they are completely useless for anything but the very
crude hobby coder. It takes nearly nothing to make a simple assembler. The
difference between an encoder and a beast like RosAsm is the difference
between a spit in the ocean and a full-blown programming studio. Just
that in this case, a very small team has been doing the work, for
free, with all code included, and Betov did most of it. Calling
that half-assed is so utterly indecent and ignorant, it deserves no
response.

Ratch

Jan 12, 2008, 5:47:07 PM

"Wolfgang Kern" <now...@never.at> wrote in message

news:fmb87f$a06$1...@newsreader1.xoc.utanet.at...

How so? Saying it's so doesn't make it so. Remember, relocation means
moving to somewhere else. Paging moves a logical address to a different
physical address, right?

>
> Sometimes it may be wise to accept the truth ...

Always that is true. The question is what is the truth.

> but a few youngsters always try to prove or sell what's in their mind,
> or more worse sell the lack in their prefered tool as a feature.

If they can substantiate their assertions, then fine.

> __
> wolfgang (for me the discussion with Ratch ends here.
> other detail questions, if any at all, are welcome)

Declare victory and walk away. Ratch

Charles Crayne

Jan 12, 2008, 6:33:16 PM

On Sat, 12 Jan 2008 21:03:51 +0100
//\\\\o//\\\\annabee <w...@www.akow> wrote:

> Thats fine with me, but it does not explain the thing. Its just a
> meaningless sentance.

The compiler or assembler creates a list of locations in the object
code module which contain addresses which must be relocated. If the
executable is going to be relocatable, the linker combines the lists
from the various object modules, otherwise, it performs the relocation,
and discards the lists.
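
As a rough illustration of that bookkeeping, here is a toy Python sketch. It is not how any particular linker is written: `ObjectModule`, `link`, and the `relocatable` flag are names invented for the example, and it assumes each listed field holds an image-relative 32-bit value that only needs the base address added. Real linkers also resolve symbols and lay out sections.

```python
import struct
from dataclasses import dataclass, field

@dataclass
class ObjectModule:
    code: bytes
    relocs: list = field(default_factory=list)  # offsets of 32-bit address fields

def link(modules, base, relocatable):
    """Concatenate modules, merge their relocation lists, resolve against base."""
    image = bytearray()
    merged = []
    for module in modules:
        start = len(image)                       # where this module lands
        image += module.code
        merged += [start + off for off in module.relocs]
    for off in merged:                           # perform the relocation for 'base'
        (value,) = struct.unpack_from("<I", image, off)
        struct.pack_into("<I", image, off, (value + base) & 0xFFFFFFFF)
    # Keep the combined list only if the executable is to stay relocatable.
    return image, (merged if relocatable else [])

mod_a = ObjectModule(bytes.fromhex("8b0d00300000"), [2])  # mov ecx, [0x3000]
mod_b = ObjectModule(bytes.fromhex("a104300000"), [1])    # mov eax, [0x3004]
image, relocs = link([mod_a, mod_b], base=0x00400000, relocatable=True)
print(image.hex(), relocs)
```

With `relocatable=False` the same image comes back, but the list is dropped, so a loader could no longer move it.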

-- Chuck

Charles Crayne

Jan 12, 2008, 6:46:23 PM

On Sat, 12 Jan 2008 16:47:07 -0600
"Ratch" <wat...@comcast.net> wrote:

> Remember, relocation means
> moving to somewhere else.

By which definition, picking up your computer from the left side of your
desk and setting it down on the right side, is clearly a form of memory
relocation. However, the conventional use of the term does not
refer to moving anything, but rather, to the process of changing the
addresses within the object code so that it will run correctly at a
different address than it was compiled/assembled for. Paging does not
do this.
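
To make the distinction concrete, here is a minimal Python sketch of page translation. The page tables are plain dictionaries and the frame numbers are invented for the example; real x86 paging uses multi-level tables walked by the hardware. The point it tries to show is that the instruction bytes are identical under both mappings: paging changes where an address lands, not the address stored in the code.

```python
PAGE_SIZE = 4096

def translate(page_table, vaddr):
    """Map a virtual address to a physical one via {virtual page: frame}."""
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    return page_table[vpn] * PAGE_SIZE + offset

code = bytes.fromhex("8b0d00304000")   # mov ecx, [0x00403000], never patched

table_a = {0x403: 0x1A2}   # hypothetical mapping in one process
table_b = {0x403: 0x07F}   # hypothetical mapping in another

print(hex(translate(table_a, 0x00403000)))
print(hex(translate(table_b, 0x00403000)))
print(code.hex())          # same bytes in both cases
```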

-- Chuck

Keith Kanios

Jan 12, 2008, 7:01:02 PM

On Jan 12, 2:03 pm, //\\\\o//\\\\annabee <w...@www.akow> wrote:
> On Sat, 12 Jan 2008 22:30:40 +0100, Keith Kanios <ke...@kanios.net> wrote:
>
> > On Jan 12, 12:40 pm, //\\\\o//\\\\annabee <w...@www.akow> wrote:
>
> >> I have gaps to fill here for sure.
> >> What happens with the datasegment, inside the PE?
>
> >> if I say
>
> >> [somedata: 100]
> >> mov ecx D$SomeData ; "static" data "segment"
>
> >> Does the CPU perform the relative translation of the label based on
> >> information elsewhere, or does the "linker", do it??
>
> > The linker provides structures so that the loader can do its job.
>
> Thats fine with me, but it does not explain the thing. Its just a
> meaningless sentance.
>

It explains everything, but you lack the necessary knowledge to
understand what that sentence meant. Vital difference. Drop the
foolish pride and take the time to read the necessary documentation
instead of assuming the world will simply rearrange itself to meet
your understanding.

>
> >> What about a dynamic
> >> segment. That is allocated at runtime, yes? So then it must be the CPU
> >> doing the translation. And the relocation must then just be telling
> >> somehow the CPU the new relative adresse?
>
> > A "dynamic" segment? Let me introduce you to the world of computing
> > without the unnecessary abstractions. You have data that loads from a
> > pre-existing source, and then you have dynamic (e.g. malloc)
> > allocations. Either way, the CPU has nothing to do with it, as it only
> > provides the means in which paging structures become relevant.
>
> > However, if you are talking about actual *segmentation*, then you are
> > way off scope.
>
> the way I pictured it, was like in the amount of at least 512 bytes.
> I allways thought (and had been told ) that RosAsm "dynamic" memory, is
> allocated at
> runtime. It is zummed and then it is allocated by a dynamic memory call.
> (by the loader - I would guess, added to the running PE image perhaps) -
> While the "Static" memory is real memory inside the PE image. (Stored on
> disk). Eg- it increases the size of the file. While "dynamic data" does
> not.
>
> For rosasm, we have the icon, that takes some memory, and will require
> some room. Then
> we have the code itself, the source, and the static data area. Are there
> more? I will maybe have to read the RosAsm source to make sure.

Then RosASM unnecessarily abstracts pre-existing facilities. What you
call "static data" is actually initialized data. What you call
"dynamic data" is actually uninitialized data. In a short generalized
summary, initialized data (.*data) sections are loaded from a data
source and uninitialized data (.bss) sections are dynamically
allocated.

The assembler/compiler/linker specifies in the OBJ/PE how much BSS
space is to be allocated and then the loader zero-allocates it,
followed by calculations that patch up relocatable address references
to that newly allocated BSS space.
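
A minimal Python sketch of the zero-allocation step, under simplifying assumptions: `map_section` is a made-up helper, and the two parameters loosely mirror the SizeOfRawData and VirtualSize fields of a PE section header. It ignores alignment and the address patching mentioned above.

```python
import struct

def map_section(raw_data, virtual_size):
    """Build a section in memory: file bytes first, zero fill for the rest."""
    memory = bytearray(virtual_size)        # bytearray(n) starts as n zero bytes
    count = min(len(raw_data), virtual_size)
    memory[:count] = raw_data[:count]       # initialized data comes from the file
    return memory

# [SomeData: 100] is stored in the file; [SomeData2: ?] takes no file space.
on_disk = struct.pack("<I", 100)
section = map_section(on_disk, virtual_size=8)
print(len(on_disk), len(section), section.hex())
```

The file only carries the four initialized bytes, while the mapped section is eight bytes long.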

> >> [somedata2: ?]
>
> >> mov ecx D$SomeData2
>
> >> If I save this code (its binary) to a file,
> >> and CHANGE the PE, and then load the binary file, then
> >> it will no more work. The address will be incorrectly translated.
>
> > Study the PE internals from the official documentation, it can be
> > enlightening if you keep an open mind.
>
> Yes. But this is why I chose to use an assembler, to avoid having to know
> it all at once.
> this was way to much for me to start with when I first started asm
> programming. Why I choise RosAsm which makes the coding in asm so easy.
>

This is a prime example of a life-long lesson you should definitely
adhere to: What is easy now can cost you in the long run. It is
generally harder to unlearn things, and when you absorb bad/incomplete
knowledge it makes you stubborn and resistant to re-learning things
the right way. Now you are stuck with the task of unlearning the
half-assed garbage floating around inside your head and re-learning
things as they actually are.

Patience is a virtue ;)

>
> >> So all data and code can be "relocated" and this must be just some small
> >> operation for the loader, to just somehow inform the CPU where to find
> >> the
> >> new relative address for the page(s)?. then what I said earlier must be
> >> wrong.
>
> >> The binary:
> >> 8b 0d 00 30 40 00 (00403000) ; mov ecx D$Somedata
> >> 8b 0d 04 30 40 00 (00403004) ; mov ecx D$SomeData2
>
> >> 6A 00 ; push 0
> >> FF 15 30 10 40 00 (00401030) ; call "kernel32.ExitProcess"
>
> >> hmm. just 4 bytes appart? thats weird isnt it?
> >> shouldnt those segements be at further apart?
> >> Whats the rest of the PE filled with then?
>
> > A smart linker won't waste another 4KB just so that a static variable
> > can have its own "segment" ;)
>
> The PE is 3,5 kb. with icon, source, data, and code. This is less then the
> cluster it occupies on disk.
>
> > Once again, read the official documentation, don't depend on RosASM's
> > half-assed implementation.
>
> Why dont you read it. Then if you did you could have been able to explain
> it?

I have, but I don't think it is my job to spoon-feed you
specifications due to your inherent laziness. Your viral GPL mindset,
where you think everyone should just yield code/info on a whim, simply
does not apply in the real world. Drop the spoon, let other things
drop, and make an effort to learn things yourself.

> I have coded 3+ megs with this assembler. As far as I cant tell its not
> only the best assembler ever created (I also coded some small app in Nasm)
> it is by far the most impressive _programming_ tool in all of existance.
> As for half asses, I can ensure you that what is half-assed is _all_ the
> other assemblers that are not your version of "half-assed" only for the
> simply reason that they are completly useless for anything but the very
> crude hobby coder. It takes nearly nothing to make a simple assembler. The
> diffrence between an encoder and a beast like RosAsm is the diffrence
> between a spit in the oscean and a full blown programmming studio. Just
> that in this case, just a very small team has been doing the work, for
> free, with all code included, and mostly Betov did most of it. Calling
> that halfasses is so utterly indecent. and ignorant, it deserve no
> response.

You code in RosASM for the same reason people use C++, C# and Java...
not because it is necessarily better, but because it makes certain
things easier. Your choice of RosASM is not much different than using
HLA, with HLA being honest about what it really is... which would
easily explain the competitive bitterness between the two user bases
in light of HLA's one-sided comparative popularity ;)

Make no mistake, the complexity of a tool grows proportionately to its
power/usefulness. If you want to push a button and think it is real
programming, download VB.Net and get the masquerade over with.

Keith Kanios

Jan 12, 2008, 7:11:30 PM

On Jan 12, 4:47 pm, "Ratch" <watc...@comcast.net> wrote:
>
> How so? Saying it's so doesn't make it so. Remember, relocation means
> moving to somewhere else. Paging moves a logical address to a different
> physical address, right?
>

Not necessarily. Locations can be "identity mapped", where a span of
virtual memory can be numerically equal to its physical memory
location.
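
For illustration, an identity mapping can be sketched in Python like this. The dictionary page table, the helper names, and the choice of the first 1 MB are all assumptions for the example, not a description of any particular kernel.

```python
PAGE_SIZE = 4096

def identity_map(start, length):
    """Page table entries where each virtual page number equals its frame."""
    first = start // PAGE_SIZE
    last = (start + length - 1) // PAGE_SIZE
    return {vpn: vpn for vpn in range(first, last + 1)}

def translate(page_table, vaddr):
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    return page_table[vpn] * PAGE_SIZE + offset

low_memory = identity_map(0x00000000, 0x00100000)   # hypothetical: first 1 MB
print(hex(translate(low_memory, 0x000B8000)))       # comes back unchanged
```

Paging is still active for such a range; the translation just happens to return the number it was given.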

However, I think this is getting to the point where everyone is
splitting hairs as I can easily move this down one level and say that
physical memory is simply a method of relocation since the actual
address is mapped to a particular RAM bank/chip. What is more
important, is how *useful* that definition is in the real world.
Answer: Not very, in either case.