
Nasm lgdt and lidt


James Harris

Nov 13, 2011, 2:31:20 PM
IME Nasm is better than certain other assemblers in a number of ways,
one of which is allowing expression of corner cases. Where some
assemblers require explicit definition of the bytes needed to make a
little-used instruction Nasm often provides a means of expressing it
mnemonically.

I've been trying to find a way to code a couple of obscure
instructions, lgdt and lidt, in a certain way. I should say I may have
misunderstood the instruction encoding and would welcome correction
but with that said, the issue as I understand it can be illustrated
using lgdt as follows.

Say I wanted to load a 6-byte GDT value I could code

lgdt [mem]

This would work in protected mode but in real mode it seems lgdt will
by default load five bytes instead of six. See the comments on 16-bit
and 32-bit operands at

http://pdos.csail.mit.edu/6.828/2011/readings/i386/LGDT.htm

I tried to tell Nasm to use the form with the 32-bit base by means of

lgdt dword [mem]

but the version of Nasm I have does not accept that. I can't find any
info on lgdt data size hints/overrides in the latest Nasm doc either.

I can code

o32 lgdt [mem]

which, I think, does the job. The question is that since Nasm allows
other instructions to use a data size mnemonic like dword does it have
a similar mnemonic for use in the lgdt instruction? As above, this
seems only needed in real mode when loading the GDTR for protected
mode. As I say, a corner case, but for consistency with other
instructions it looks like there should be a way to encode this
without using o32. Anyone know if there is one - or if I am on the
wrong track?

James

Frank Kotler

Nov 13, 2011, 5:52:58 PM
AFAIK, you'd be stuck with o32 in that case. It isn't clear to me how
one would, in RMODE code, locate the gdt such that the upper bits of the
linear address would be non-zero, so it's "way out in the corner", but
if you need it, I think o32 would be the way to do it. It was news to me
that lidt/lgdt only loads 24 bits in 16-bit code, but I guess it's true.
I don't see why you'd need 32 bits, but you ought to be able to do it!

Best,
Frank

Rod Pemberton

Nov 14, 2011, 5:07:46 AM
"James Harris" <james.h...@nospicedham.googlemail.com> wrote in message
news:f203cd58-ee54-4310...@t8g2000yql.googlegroups.com...
>
> IME Nasm is better than certain other assemblers in a number of ways,
> one of which is allowing expression of corner cases.
>

It's better than MASM or GAS, but doesn't allow everything. E.g., for
instructions with the short eax/ax form, IIRC, there is no way to tell NASM
to use the normal eax/ax form of the instruction. The short form will
always be encoded. That probably affects all of these instructions:
MOV, XCHG, ADC, ADD, AND, CMP, OR, SBB, SUB, TEST, XOR

Recent post of mine on 64-bit short form in NASM:
http://groups.google.com/group/comp.lang.asm.x86/msg/6726fc2a55755f32

> I've been trying to find a way to code a couple of obscure
> instructions, lgdt and lidt, in a certain way.

2008 post of mine on issues with LGDT and LIDT in NASM, WASM, GAS ...
http://groups.google.com/group/alt.lang.asm/msg/15dee384b7a3af6d

> I should say I may have
> misunderstood the instruction encoding and would welcome correction
> but with that said, the issue as I understand it can be illustrated
> using lgdt as follows.
>
> Say I wanted to load a 6-byte GDT value I could code
>
> lgdt [mem]
>
> This would work in protected mode but in real mode it seems lgdt
> will by default load five bytes instead of six. See the comments on
> 16-bit and 32-bit operands at
>
> [link]
>
> I tried to tell Nasm to use the form with the 32-bit base by means of
>
> lgdt dword [mem]
>
> but the version of Nasm I have does not accept that. I can't find any
> info on lgdt data size hints/overrides in the latest Nasm doc either.
>
> I can code
>
> o32 lgdt [mem]
>
> which, I think, does the job.

If you missed it, I replied to that a few days ago on a.o.d.:
http://groups.google.com/group/alt.os.development/msg/81e0b71c8a8c26d9

> The question is that since Nasm allows
> other instructions to use a data size mnemonic like dword does it have
> a similar mnemonic for use in the lgdt instruction? As above, this
> seems only needed in real mode when loading the GDTR for protected
> mode. As I say, a corner case, but for consistency with other
> instructions it looks like there should be a way to encode this
> without using o32. Anyone know if there is one - or if I am on the
> wrong track?
>

That must've been directed at Frank since he's listed as an active NASM
developer (or maybe HPA) ...

NASM has a website:
http://www.nasm.us/

NASM has a forum:
http://forum.nasm.us/

NASM has mailing lists:
http://sourceforge.net/mail/?group_id=6208

Read or post to NASM's mailing list via Usenet reader:
news://news.gmane.org/gmane.comp.lang.nasm.bugs
news://news.gmane.org/gmane.comp.lang.nasm.cvs
news://news.gmane.org/gmane.comp.lang.nasm.devel
news://news.gmane.org/gmane.comp.lang.nasm.general

Read via any browser with built-in Usenet support, like Opera, using the
same links immediately above.

Read via any browser via Gmane mailing-list gateway:
http://dir.gmane.org/?prefix=gmane.comp.lang.nasm

Is this a clax FAQ ... ?


Rod Pemberton


James Harris

Nov 14, 2011, 4:27:09 PM
On Nov 13, 10:52 pm, Frank Kotler
<fbkot...@nospicedham.myfairpoint.net> wrote:

...

> AFAIK, you'd be stuck with o32 in that case. It isn't clear to me how
> one would, in RMODE code, locate the gdt such that the upper bits of the
> linear address would be non-zero,

That's a good point. I guess it would have to have been set up in so-
called unreal mode. Wolfgang seems to have tried exactly that at

http://groups.google.com/group/alt.os.development/msg/5b8882f9d77efa37

To quote, "Both load only 16+24 bits without a 66h prefix and zero out
the high byte." In other words the instruction would fail if the
initial GDT were located too high.
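
A minimal sketch of the truncation described above, written in Python purely for illustration (it is not from any of the posts). It assumes the usual 6-byte lgdt operand layout, a 16-bit limit followed by a 32-bit base, and models the quoted behaviour that a 16-bit operand size keeps only 24 bits of the base. The function names and the sample base 0xC0100000 are hypothetical.

```python
import struct

def gdt_pseudo_descriptor(limit: int, base: int) -> bytes:
    """Pack the 6-byte memory operand read by lgdt: 16-bit limit, 32-bit base."""
    return struct.pack("<HI", limit, base)

def loaded_base(operand: bytes, operand_size: int) -> int:
    """Return the base that would end up in GDTR for a 16- or 32-bit operand size."""
    _limit, base = struct.unpack("<HI", operand)
    if operand_size == 16:
        # Without the 66h prefix in 16-bit code only 24 base bits are used
        # and the high byte is zeroed.
        return base & 0x00FFFFFF
    return base

operand = gdt_pseudo_descriptor(0x0027, 0xC0100000)  # hypothetical GDT placed high
print(hex(loaded_base(operand, 16)))  # plain lgdt assembled under bits 16
print(hex(loaded_base(operand, 32)))  # o32 lgdt
```

With a GDT below 16 MB both calls return the same base, which is why the issue rarely shows up in practice.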

> so it's "way out in the corner", but
> if you need it, I think o32 would be the way to do it. It was news to me
> that lidt/lgdt only loads 24 bits in 16-bit code, but I guess it's true.

It was a surprise to me too. I was blissfully unaware of the issue
until a few days ago when it came up in a discussion with Rod on
alt.os.development. I have code that uses lgdt in real mode that I
wrote years ago. I never even thought about the possibility of it
loading only a 3-byte base. As I didn't use unreal mode the GDT
started below 16Mby and it has never been an issue. However, it would
be good if the assembler could have flagged up a warning about an
unspecified data size - but maybe that's too much to ask. ;-)

> I don't see why you'd need 32 bits, but you ought to be able to do it!

As above, it's not something I would do but it seems possible....
Actually, another potential way to run into this problem just occurred
to me. It is quite legitimate to enable paging and protected mode at
the same time. I just checked a relevant Intel doc (in this case the
PPro developer's manual volume 3) and it seems quite easy.

According to that document, when enabling paging only the mov to cr0
and the immediately following jump need to be identity mapped. This
sounds to me like the programmer could easily decide to place the GDT
in high memory such as somewhere near the top of the address space. If
so the full 4-byte base *would* be needed in lgdt. That doesn't sound
at all far fetched.

To try to get my head round lgdt and opsiz encoding I used nasm to get
some comparisons with other instructions with a similar form. Taking
push as an example in "bits 16" mode:

push word [mem] generates FF36[1501]
push dword [mem] generates 66FF36[1501]

Notably,

push [mem]

is rightly not supported as the operand size is not known. It would be
great if nasm were to similarly generate a warning for

lgdt [mem]

for the same reason: operand size not known. It would also be great if
it were to accept

lgdt word [mem]
lgdt dword [mem]

Those three (the first not permitted, the last two permitted) would
make it consistent with push and the like and would help highlight to
the programmer (e.g. me) that the data size needed to be thought
about. What do you think? The same would be relevant to lidt too.
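
As a rough illustration of the two push encodings listed above, here is a Python sketch (not part of the original post) that builds the bytes by hand. It assumes "bits 16" with a plain 16-bit displacement, and it assumes the bracketed [1501] in the listing stands for the little-endian address 0115h; the function name is hypothetical.

```python
def encode_push_mem16(disp16: int, operand_bits: int) -> bytes:
    """Encode 'push word/dword [disp16]' as assembled under 'bits 16'."""
    if operand_bits not in (16, 32):
        raise ValueError("operand_bits must be 16 or 32")
    # In 16-bit code the 66h prefix selects the 32-bit operand size.
    prefix = b"\x66" if operand_bits == 32 else b""
    # ModRM: mod=00, reg=6 (the /6 opcode extension for PUSH), rm=110 (disp16 only).
    modrm = (0b00 << 6) | (6 << 3) | 0b110
    return prefix + bytes([0xFF, modrm]) + disp16.to_bytes(2, "little")

print(encode_push_mem16(0x0115, 16).hex())  # ff361501
print(encode_push_mem16(0x0115, 32).hex())  # 66ff361501
```

The only difference between the two forms is the prefix byte, which is the same thing o32 adds in front of lgdt.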

I've never heard of a nasm change causing it to baulk on old code but
I wonder if at least a warning could be generated....?

James

James Harris

Nov 14, 2011, 5:00:05 PM
On Nov 14, 10:07 am, "Rod Pemberton"
<do_not_h...@nospicedham.noavailemail.cmm> wrote:
> "James Harris" <james.harri...@nospicedham.googlemail.com> wrote in message

...

> > IME Nasm is better than certain other assemblers in a number of ways,
> > one of which is allowing expression of corner cases.
>
> It's better than MASM or GAS, but doesn't allow everything.  E.g., for
> instructions with the short eax/ax form, IIRC, there is no way to tell NASM
> to use the normal eax/ax form of the instruction.  The short form will
> always be encoded.  That probably affects all of these instructions:
> MOV, XCHG, ADC, ADD, AND, CMP, OR, SBB, SUB, TEST, XOR

Could you give a specific example?

...

> If you missed it, I replied to that a few days ago on a.o.d.:
> http://groups.google.com/group/alt.os.development/msg/81e0b71c8a8c26d9

Was that where you thought I was a masm programmer? :-o

I didn't miss it and will reply.

> > The question is that since Nasm allows
> > other instructions to use a data size mnemonic like dword does it have
> > a similar mnemonic for use in the lgdt instruction? As above, this
> > seems only needed in real mode when loading the GDTR for protected
> > mode. As I say, a corner case, but for consistency with other
> > instructions it looks like there should be a way to encode this
> > without using o32. Anyone know if there is one - or if I am on the
> > wrong track?
>
> That must've been directed at Frank since he's listed as an active NASM
> developer (or maybe HPA) ...

I hoped Frank and/or others who know a lot more about x86 encoding
than I do would see the post.

James

Rod Pemberton

Nov 15, 2011, 8:11:10 AM
"James Harris" <james.h...@nospicedham.googlemail.com> wrote in message
news:7242756a-9d2b-4caf...@e15g2000vba.googlegroups.com...
> On Nov 14, 10:07 am, "Rod Pemberton"
> <do_not_h...@nospicedham.noavailemail.cmm> wrote:
> > "James Harris" <james.harri...@nospicedham.googlemail.com> wrote in
...

> > > IME Nasm is better than certain other assemblers in a number
> > > of ways, one of which is allowing expression of corner cases.
>
> > It's better than MASM or GAS, but doesn't allow everything. E.g., for
> > instructions with the short eax/ax form, IIRC, there is no way to tell
> > NASM to use the normal eax/ax form of the instruction. The short form
> > will always be encoded. That probably affects all of these instructions:
> > MOV, XCHG, ADC, ADD, AND, CMP, OR, SBB, SUB, TEST, XOR
>
> Could you give a specific example?
>

Let's take the ADC instruction:

ADC AL,0A5h

The byte sequences for two encodings or forms of that instruction:
(14h vs. 80h)

14 A5
80 D0 A5

The instruction form listed in the manuals:

ADC AL,imm8
ADC r/m8,imm8

The first can only specify AL. It's the "short" form since it produces a
shorter byte sequence when encoded. The second form (i.e.,"normal" or
"long" form) can specify any r8, including AL. That means both encode "ADC
AL, imm8". NASM will generate the 80h form for any r8 except AL:

BITS 16
ADC AL, 0A5h
ADC BL, 0A5h
ADC CL, 0A5h
ADC DL, 0A5h

nasm -f bin -o jh.bin jh.asm
ndisasm -b16 jh.bin

00000000 14A5 adc al,0xa5
00000002 80D3A5 adc bl,0xa5
00000005 80D1A5 adc cl,0xa5
00000008 80D2A5 adc dl,0xa5

NASM will always generate the short form (14h) for AL. How do you tell NASM
to generate the 80h form for AL? AFAIK, you can't. It's possible it's been
added to NASM. I haven't read newer NASM manuals.
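
To make the two forms concrete, here is a small Python sketch (an illustration added alongside the post, not NASM's actual logic) that computes the bytes for "ADC r8, imm8". The register numbering is the standard x86 one; the function name and the prefer_short flag are hypothetical.

```python
# Standard x86 numbering of the 8-bit registers.
REG8 = {"al": 0, "cl": 1, "dl": 2, "bl": 3, "ah": 4, "ch": 5, "dh": 6, "bh": 7}

def adc_r8_imm8(reg: str, imm: int, prefer_short: bool = True) -> bytes:
    """Encode ADC reg, imm8, optionally forcing the long 80h form for AL."""
    if reg == "al" and prefer_short:
        # Short form: opcode 14h implies AL, so no ModRM byte is needed.
        return bytes([0x14, imm & 0xFF])
    # Long form: opcode 80h, ModRM with mod=11, reg=2 (/2 selects ADC), rm=register.
    modrm = (0b11 << 6) | (2 << 3) | REG8[reg]
    return bytes([0x80, modrm, imm & 0xFF])

print(adc_r8_imm8("al", 0xA5).hex())                      # 14a5
print(adc_r8_imm8("al", 0xA5, prefer_short=False).hex())  # 80d0a5
print(adc_r8_imm8("bl", 0xA5).hex())                      # 80d3a5
```

The prefer_short=False case is the encoding the post says cannot be requested from NASM by syntax alone.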

The same short form encoding arises for AX/EAX with ADC (15h vs. 81h for a
full-size immediate; 83h is the separate sign-extended imm8 form). It
should be so for the other arithmetic and binary instructions above. MOV
and XCHG should have similar issues.

There are other situations in x86 where instruction syntax can have
different byte sequences. I.e., x86 is not 1-to-1 encoding. "r/m, r" forms
will sometimes have an "r, r/m" form too. So, two encodings for "r, r" can
be generated.

Let's take ADC again:

ADC AL,DL

The byte sequences for two encodings or forms of that instruction:
(10h vs. 12h)

10 D0 ; ADC r/m8,r8
12 C2 ; ADC r8,r/m8

Both allow "r8, r8". Same instruction. Different encoding. AFAIK, one
"r8, r8" form is not supported by NASM syntax: 12h. The same issue
applies to 11h vs 13h with 16/32-bit regs.
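
The two byte sequences above differ only in which ModRM field holds the destination. A Python sketch of that swap follows; it is an illustration only, and the function name and the use_load_form flag are hypothetical.

```python
# Standard x86 numbering of the 8-bit registers.
REG8 = {"al": 0, "cl": 1, "dl": 2, "bl": 3, "ah": 4, "ch": 5, "dh": 6, "bh": 7}

def adc_r8_r8(dst: str, src: str, use_load_form: bool = False) -> bytes:
    """Encode ADC dst, src for two 8-bit registers in either direction form."""
    d, s = REG8[dst], REG8[src]
    if use_load_form:
        # 12 /r is ADC r8, r/m8: destination in the reg field, source in r/m.
        return bytes([0x12, 0xC0 | (d << 3) | s])
    # 10 /r is ADC r/m8, r8: destination in the r/m field, source in reg.
    return bytes([0x10, 0xC0 | (s << 3) | d])

print(adc_r8_r8("al", "dl").hex())                      # 10d0
print(adc_r8_r8("al", "dl", use_load_form=True).hex())  # 12c2
```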

If you ever get around to creating assembly code for some binary, you'll
likely attempt to check it by comparing the resulting binaries. If the
binary was created by MASM and you're using NASM for the recreation,
you'll come across branches that NASM encodes one way and MASM encodes
another way. I.e., look at Jcc for multiple encodings.

Get a hex editor. Pick an instruction that can have multiple equivalent
encodings (above). Manually construct the instructions as you believe they
are encoded. Many x86 instructions use a ModRM byte, 8 bits that read
naturally in octal as 2/3/3 fields (mod, reg, r/m), for encoding
registers and the memory mode. Disassemble.
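
The 2/3/3 split mentioned above is that of the ModRM byte (mod, reg, r/m). A short Python sketch, added here as an illustration with a hypothetical helper name, shows why octal is convenient: each octal digit of the byte is one field.

```python
def split_modrm(byte: int) -> tuple[int, int, int]:
    """Split a ModRM byte into its (mod, reg, rm) fields: 2, 3 and 3 bits."""
    return (byte >> 6) & 0b11, (byte >> 3) & 0b111, byte & 0b111

for value in (0xD0, 0xC2, 0x36):
    # Three octal digits line up with mod, reg and rm respectively.
    print(f"{value:02X}h = {value:03o} octal -> mod/reg/rm = {split_modrm(value)}")
```

For example D0h is 320 in octal: mod 3 (register direct), reg 2, rm 0, which matches the "80 D0 A5" encoding of ADC AL earlier in the thread.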

You may want other disassemblers besides NASM's NDISASM to cross-check the
disassembly. While NASM does a good job for assembly, NDISASM doesn't (or
didn't ...) disassemble everything correctly. You may have to develop a
template or minimal object file for some disassemblers to plug-in code or
add bytes via binary edit. E.g., I created an obj template for NASM that
allows me to compile NASM assembly to an obj compatible with OpenWatcom's
WDIS disassembler (MASM clone). So, I can see what both NDISASM and WDIS
"think" NASM encoded. I originally did so in an attempt to determine when
and where various MASM keywords were used, but the old version of WDIS I use
didn't emit them.


Rod Pemberton


Markus Wichmann

Nov 15, 2011, 3:10:19 PM
When is it a problem that nasm generates _less_ code than MASM? AFAIK
the short forms execute faster. Plus you get more code into the cache
that way. If that form of the instruction destroys some form of
alignment, you can easily enforce this with the "align" macro (which by
default inserts 90h bytes).

> NASM will always generate the short form (14h) for AL. How do you tell NASM
> to generate the 80h form for AL? AFAIK, you can't. It's possible it's been
> added to NASM. I haven't read newer NASM manuals.
>

Well, if you do know the exact machine code you want to have, you might
as well tell NASM about it:

db 80h, 0d0h, 0a5h

> The same short form encoding arises for AX/EAX (15h vs. 83h) with ADC. It
> should be so for the other arithmetic and binary instructions above. MOV
> and XCHG should have similar issues.
>
> There are other situations in x86 where instruction syntax can have
> different byte sequences. I.e., x86 is not 1-to-1 encoding. "r/m, r" forms
> will sometimes have an "r, r/m" form too. So, two encodings for "r, r" can
> be generated.
>

Yes, that is normal. x86 offers some flexibility there. That's actually
quite nice. However, assemblers were originally created so programmers
wouldn't have to fiddle with the specifics of machine code. That's why
assemblers do have some freedom when choosing their encoding. And I
don't know of any use case where the specific encoding is so important.
And even if it is, you could hardcode it. The expression after the "d?"
pseudo instructions isn't even a critical expression, so you might as
well make it variable if need be.

> Let's take ADC again:
>
> ADC AL,DL
>
> The byte sequences for two encodings or forms of that instruction:
> (10h vs. 12h)
>
> 10 D0 ; ADC r/m8,r8
> 12 C2 ; ADC r8,r/m8
>
> Both allow "r8, r8". Same instruction. Different encoding. AFAIK, one
> "r8, r8" form is not supported by NASM syntax: 12h. The same issue
> applies to 11h vs 13h with 16/32-bit regs.
>
> If you ever get around to creating assembly code for some binary, you'll
> likely attempt to check it by comparing the resulting binaries.

I really don't know of any way this could possibly disturb you, _ever_.
Especially, I don't know why I would want to write an assembler source
file that generates the _exact_ same output as some existing file.
That's what disassemblers are for! Even "smart" disassemblers like objdump.

In my book, "getting around to creating assembly code" happens only for
few reasons; and trying to recreate an existing binary is not one of
them. Usually people want to code in assembly because they want to get
to know their machine better or because they have to optimize something
beyond the capabilities of their compilers.

> If the
> binary was created by MASM and you're using NASM for the recreation,
> you'll come across branches that NASM encodes one way and MASM encodes
> another way. I.e., look at Jcc for multiple encodings.
>

Looking at the manuals, I don't see much ambiguity. Or do you mean:

70 cb JO rel8off
0F 80 cw JO rel16off
0F 80 cd JO rel32off

So you can encode a jump within 128 bytes in three ways? NASM always
chooses the smallest possible one. I don't see what's wrong with that
behaviour.
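
A sketch of the "smallest possible one" rule in Python, for illustration only. It is not NASM's implementation: the function name, the force_near flag and the addresses are hypothetical. The displacement is measured from the end of the jump instruction, and the near form uses rel16 or rel32 depending on the assumed code size.

```python
def encode_jo(target: int, address: int, bits: int = 32, force_near: bool = False) -> bytes:
    """Encode JO to 'target' from an instruction located at 'address'."""
    short_disp = target - (address + 2)          # short form is 2 bytes long
    if not force_near and -128 <= short_disp <= 127:
        return bytes([0x70]) + short_disp.to_bytes(1, "little", signed=True)
    width = bits // 8                            # 2 bytes for rel16, 4 for rel32
    near_disp = target - (address + 2 + width)   # near form: 0F 80 plus displacement
    return bytes([0x0F, 0x80]) + near_disp.to_bytes(width, "little", signed=True)

print(encode_jo(0x10, 0).hex())                            # 700e
print(encode_jo(0x10, 0, force_near=True).hex())           # 0f800a000000
print(encode_jo(0x10, 0, bits=16, force_near=True).hex())  # 0f800c00
```

Note that the displacement value itself changes between the forms because the instruction length changes, which is one reason hand-patching such bytes is error-prone.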

> Get a hex editor. Pick an instruction that can have multiple equivalent
> encodings (above). Manually construct the instructions as you believe they
> are encoded. Many x86 instructions use 8-bits in octal as 2/3/3 for
> encoding registers and the memory mode. Disassemble.
>

You do know that assemblers were invented so people wouldn't have to
bother with machine code, right? Anyway, do you want to tell us the long
way that there are different ways to encode the same instruction and you
think NASM chooses the wrong ones?

> You may want other disassemblers besides NASM's NDISASM to cross-check the
> disassembly. While NASM does a good job for assembly, NDISASM doesn't (or
> didn't ...) disassemble everything correctly.

How come that again? I've yet to see a file ndisasm can't handle. Ugly
hacks, like jumps into the middle of an instruction it can't do,
granted, but it can do anything else. It even has an "intelligent sync"
feature, allowing it to correctly decode an instruction/data mix.
However, for that to work, the address of an instruction start has to
occur as part of an instruction before the address itself. If that
doesn't happen, there's also "manual sync".

> You may have to develop a
> template or minimal object file for some disassemblers to plug-in code or
> add bytes via binary edit. E.g., I created an obj template for NASM that
> allows me to compile NASM assembly to an obj compatible with OpenWatcom's
> WDIS disassembler (MASM clone). So, I can see what both NDISASM and WDIS
> "think" NASM encoded. I originally did so in an attempt to determine when
> and where various MASM keywords were used, but the old version of WDIS I use
> didn't emit them.
>

And that's where you see the difference between ndisasm and wdis, or,
for that matter, objdump. The latter two try to be "intelligent" and
interpret file formats. ndisasm isn't so presumptuous. It assumes that
the input is already binary machine code, so in order to deal with file
formats it is _your_ responsibility to extract the code. And set up the
disassembler correctly ("origin" and operating mode come to mind).

>
> Rod Pemberton
>
>

Ciao,
Markus

hopcode

Nov 16, 2011, 2:31:29 AM
Il 15.11.2011 21:10, Markus Wichmann ha scritto:
> That's why
> assemblers do have some freedom when choosing their encoding. And I
> don't know of any use case where the specific encoding is so important.

Hi Markus,
imho, they shouldn't choose their encoding. afaik,
there isn't any assembler out in the world that lets you
customize opcode output. i had a discussion about that
on an external forum some time ago.
my assembler lets you choose the "fingerprint"
at every possible evaluable instruction (i am thinking
of extending the feature to be scriptable).
this has many advantages, for example:

1) applies a sort of steganography on the output object.
( i sign my object with my name :-)
2) indirectly simplifies copy protection techniques on that object.
3) the dumb heuristic of antivirus software doesn't mark the object
as "virus" just because other malware was made with the same
"fingerprint" as your object.

What is more important, assembler, especially when used as
back-end, should not raise error from the following code,

nestlevel equ 127
enter 20h,nestlevel

only because "nestlevel" is out of range (0-31)

Intels docs tell us it is an _imm8_

C8 iw ib ENTER imm16, imm8 Create a nested stack frame for a procedure
then CPU microoperations of ENTER, applies internally a mod 32
on that imm8.

my reasons against that warning/error from the compiler are:

1) possible future extension of that instruction by Intel
2) "nestlevel" may be used later in the code for other purposes.

Also, i am fundamentally for a WYSIWYG assembler.
And when in doubt apply hardware specification,
not the assembler native encoding.

Cheers,



.:mrk[hopcode]
.:x64lab:.
group http://groups.google.com/group/x64lab
site http://sites.google.com/site/x64lab

Bob Masta

Nov 16, 2011, 7:55:45 AM
I seem to recall that someone (I think it was Eric Isaacson
with A86) used (uses?) these different encodings to provide
a "fingerprint" to identify code written with his assembler.

Best regards,


Bob Masta

DAQARTA v6.02
Data AcQuisition And Real-Time Analysis
www.daqarta.com
Scope, Spectrum, Spectrogram, Sound Level Meter
Frequency Counter, FREE Signal Generator
Pitch Track, Pitch-to-MIDI
Science with your sound card!

Markus Wichmann

Nov 16, 2011, 8:46:09 AM
On 16.11.2011 08:31, hopcode wrote:
> Il 15.11.2011 21:10, Markus Wichmann ha scritto:
>> That's why
>> assemblers do have some freedom when choosing their encoding. And I
>> don't know of any use case where the specific encoding is so important.
>
> Hi Markus,
> imho, they shouldnt choose their encoding.

You are not making sense. If I want to

add ecx, edx

in 32 bit mode or higher, the assembler can choose to encode the stuff as

01 D1

or

03 CA

It both means the same and it's the assembler's job to choose an
encoding. What do you mean by "they shouldnt choose"? Should they
present you with the choice? How long do you wish to click or type
through dialogs until you have actually assembled a sizable project?

> afaik,
> there isn't any assembler out in the world that lets you
> customize opcode output.

Some stuff you actually can do in NASM. There is the "strict" keyword.
The example given in the documentation is that usually, NASM would assemble

push dword 21h

as

6A 21

(in 32 bit mode). However, some people might want, for some reason like
leaving enough space to be filled in later, to have the full length
operand encoding. So you can write

push strict dword 21h

which gets assembled to

68 21 00 00 00
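
The difference between the two results can be sketched in Python as follows. This is an illustration of the encoding choice in 32-bit mode, not NASM's code; the function name and the strict parameter merely mirror the keyword.

```python
def encode_push_imm32(value: int, strict: bool = False) -> bytes:
    """Encode 'push dword value' for 32-bit mode, optionally forcing the imm32 form."""
    if not strict and -128 <= value <= 127:
        # 6A ib: the byte is sign-extended to 32 bits by the CPU.
        return bytes([0x6A]) + value.to_bytes(1, "little", signed=True)
    # 68 id: full 32-bit immediate, as requested by 'strict dword'.
    return bytes([0x68]) + (value & 0xFFFFFFFF).to_bytes(4, "little")

print(encode_push_imm32(0x21).hex())               # 6a21
print(encode_push_imm32(0x21, strict=True).hex())  # 6821000000
```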


> i had on external
> forum discussion about that some time ago.
> my assembler let you choose the "fingerprint"
> at every possible evaluable instruction (i am thinking
> to extend the feature as scriptable)
> this has many advantages, for example:
>
> 1) applies a sort of steganography on the output object.
> ( i sign my object with my name :-)

Nice feature, but pretty obscure. I'd simply write my name into the data
section and do something with it.

> 2) indirectly simplifies copy protection techniques on that object.

What do you mean by that? Do you think a skilled reverse engineer is
incapable of making sense of specially crafted code?

> 3) the dumb heuristic of antivirus software doesn't mark the object
> as "virus" just because other malware was made with the same
> "fingerprint" as your object.
>

Well, that's a problem for the anti virus industry. I stopped using AV
software "agents" long ago and resorted to monthly scans of the hard
drive. That's easier than writing my code just so that it doesn't tip
off the AV. That problem is not unique to assembly either: I once
tripped over the AV of my school (boy, that's some time ago...) while
coding Pascal. I changed some condition to its equivalent (i.e. make "if
(x > 0)" to "if (0 < x)") and it worked again. But I started thinking at
that time, that AV software only relies on two checks: fingerprints and
heuristics. The more fingerprints are gathered into the database the
more bit patterns are suddenly forbidden. At some point the trouble
caused by false positives is bound to be more expensive than any damage
a virus could do.

As for heuristics, the word is derived from something greek that
translates roughly as "quick test". Which is also prone to false positives.

Plus, as long as there is an industry out there that relies on viruses
to exist, there won't be any real solutions.

> What is more important, assembler, especially when used as
> back-end, should not raise error from the following code,
>
> nestlevel equ 127
> enter 20h,nestlevel
>
> only because "nestlevel" is out of range (0-31)
>

Well, I'd want a warning issued there, but only at request.

> Intels docs tell us it is an _imm8_
>

Yeah, and that it's masked automatically to five bits. Which is
something someone might have overlooked.

> C8 iw ib ENTER imm16, imm8 Create a nested stack frame for a procedure
> then CPU microoperations of ENTER, applies internally a mod 32
> on that imm8.
>
> my reasons against that warning/error from the compiler are:
>
> 1) future eventual extension of that instruction from Intel

That's what actively developed assemblers are for.

> 2) "nestlevel" may be used later in the code for other purposes.
>

True. NASM however would allow you to suppress the warning entirely by
writing

enter 20h, nestlevel&31
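
For illustration, a Python sketch of what ends up in the instruction and what the CPU then uses. It assumes the C8 iw ib layout quoted above and the mod 32 masking both posters mention; the function names are hypothetical.

```python
def encode_enter(frame_size: int, nesting_level: int) -> bytes:
    """Encode ENTER imm16, imm8 as C8 iw ib, accepting any 8-bit nesting level."""
    if not 0 <= frame_size <= 0xFFFF:
        raise ValueError("frame size must fit in 16 bits")
    if not 0 <= nesting_level <= 0xFF:
        raise ValueError("nesting level must fit in 8 bits")
    return bytes([0xC8]) + frame_size.to_bytes(2, "little") + bytes([nesting_level])

def effective_nesting_level(nesting_level: int) -> int:
    """Nesting level the processor acts on: the imm8 taken modulo 32."""
    return nesting_level % 32

nestlevel = 127
print(encode_enter(0x20, nestlevel).hex())       # c820007f
print(encode_enter(0x20, nestlevel & 31).hex())  # c820001f
print(effective_nesting_level(nestlevel))        # 31
```

Both encodings behave the same at run time under that assumption; masking in the source only makes the intent explicit.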


> Also, i am fundamentally for a WYSIWYG assembler.
> And when in doubt apply hardware specification,
> not the assembler native encoding.
>

Again, no sense in there. The problem with the catch phrase WYSIWYG here
is that WYS is necessarily some textual representation of WYG, and there
is no bijective function between the two. In particular, the same
mnemonics may have multiple encodings.

I never liked that term, either. I saw it applied to HTML designers and
could make the whole class stare in amazement by simply pointing w3m to
the site they created. What they got was nothing like they saw before.

hopcode

Nov 16, 2011, 10:49:05 AM
Il 16.11.2011 14:46, Markus Wichmann ha scritto:
>
> You are not making sense. If I want to
> add ecx, edx
> in 32 bit mode or higher, the assembler can choose to encode the stuff as
> 01 D1
> or
> 03 CA
>
> It both means the same and it's the assembler's job to choose an
> encoding. What do you mean by "they shouldnt choose"? Should they
> present you with the choice? How long do you wish to click or type
> through dialogs until you have actually assembled a sizable project?
>
>> afaik,
>> there isn't any assembler out in the world that lets you
>> customize opcode output.

but, as reported in the following posts, does A86 really allow it?

> Some stuff you actually can do in NASM. There is the "strict" keyword.
>>
>> 1) applies a sort of steganography on the output object.
>> ( i sign my object with my name :-)
>
> Nice feature, but pretty obscure. I'd simply write my name into the data
> section and do something with it.

Your opinion, but there's nothing obscure there, because
the _delta_ between
> 01 D1
> or
> 03 CA
as you wrote above in the example is fully possible and documented.

>> 2) indirectly simplifies copy protection techniques on that object.
> What do you mean by that? Do you think a skilled reverse engineer is
> uncapable of making sense of a specially crafted code?
!?
the reason i used the word "steganography" στεγανός (hidden).
but the goal is signing the object as unique, not making it "obscure".

>> 3) the dumb heuristic of antivirus software doesn't mark the object
>> as "virus" just because other malware was made with the same
>> "fingerprint" as your object.
>>
> Well, that's a problem for the anti virus industry. I stopped using AV
> software "agents" long ago and resorted to monthly scans of the hard
> drive.
> As for heuristics, the word is derived from something greek that
> translates roughly as "quick test". Which is also prone to false positives.
> Plus, as long as there is an industry out there that relies on viruses
> to exist, there won't be any real solutions.

then, we can argue that real solutions are certainly not
the false positives ones ;-)

in fact a lot of AV software marks code, sent to me by email and licensed
2 or more years ago, as malware. there's no malware in it (i know the
source and the binaries) and that code doesn't do any damage.

in the other case, wouldn't it be embarrassing if your customer downloads
your updated app and it is automatically quarantined by the local AV ?

>> What is more important, assembler, especially when used as
>> back-end, should not raise error from the following code,
>>
>> nestlevel equ 127
>> enter 20h,nestlevel
>>
>> only because "nestlevel" is out of range (0-31)
>>
>
> Well, I'd want a warning issued there, but only at request.
..mmh, warning ok, but not more than that in this case.
>
> True. NASM however would allow you to suppress the warning entirely by
> writing
> enter 20h, nestlevel&31
>
and here again, You will agree that it's the developer's responsibility;
so, "kurz und knapp" (You are German, i suppose), the developer chooses
his own encoding. the NASM "strict" is too "strict" for my taste, and
an improved level of customization that _simplifies_ working with
"fingerprints" isn't possible with NASM at the moment.

>> Also, i am fundamentally for a WYSIWYG assembler.
>> And when in doubt apply hardware specification,
>> not the assembler native encoding.
>>
>
> Again, no sense in there. The problem with the catch phrase WYSIWYG here
> is that WYS is necessarily some textual representation of WYG, and there
> is no bijective function between the two. In particular, the same
> mnemonics may have multiple encodings.
> I never liked that term, either. I saw it applied to HTML designers and
> could make the whole class stare in amazement by simply pointing w3m to
> the site they created. What they got was nothing like they saw before.
>

WYSIWYG... that's only a provocative way to stimulate discussion.
unless told otherwise, or in special cases, i think the shorter encoding is
the best choice. anyway even for assemblers there is a special term to express
what in the phrase "WYSIWYG" is commonly referred to printing from
word-processors and, as reported, to rendering HTML.

in any case, i would like to recall that opcodes are hardware
functionalities. the more the assembler is well designed, the more it
is capable to encode those functionalities in a 1:1 way,
also WYSIWYG.

Rod Pemberton

Nov 16, 2011, 1:28:02 PM
"Markus Wichmann" <null...@nospicedham.gmx.net> wrote in message
news:bitap8-...@voyager.wichi.de.vu...
> On 15.11.2011 14:11, Rod Pemberton wrote:
...

> > 00000000 14A5 adc al,0xa5
> > 00000002 80D3A5 adc bl,0xa5
> > 00000005 80D1A5 adc cl,0xa5
> > 00000008 80D2A5 adc dl,0xa5
> >
>
> When is it a problem that nasm generates _less_ code than MASM?

It's a problem when you need to have the exact byte sequence for one form to
be generated, but can't generate it.

E.g. #1, let's say you're recreating source for a program and it uses that
instruction, but it's not available in your assembler. What do you do?
Hard code it? What if the instruction has an offset, and that offset
gets changed with changes to the source? Do you manually correct the
hardcoded instruction each time? Do you create a self-modifying patch?

E.g. #2, let's say you're testing an x86 interpreter. How do you use NASM
to generate the code to test the missing long form? What if the 014h form
is not implemented yet? So, you either hardcode the missing instruction,
binary edit a file to add it, or perhaps code a C program to emit it ...
Not having all forms of instruction encodings available complicates porting,
patching, and source code recreation.

I am of the opinion, as are many others who program in assembly, that I
should be able to generate all forms of an instruction. As you become more
involved with programming in assembly, there will come a time and
place when you need that functionality.

> AFAIK
> the short forms execute faster. Plus you get more code into the cache
> that way. If that form of the instruction destroys some form of
> alignment, you can easily enforce this with the "align" macro (which by
> default insert's 90h's)
>

True.

> > NASM will always generate the short form (14h) for AL. How do you tell
> > NASM to generate the 80h form for AL? AFAIK, you can't. It's possible
> > it's been added to NASM. I haven't read newer NASM manuals.
> >
>
> Well, if you do know the exact machine code you want to have, you might
> as well tell NASM about it:
>
> db 80h, 0d0h, 0a5h
>

That's exactly what is not wanted. We were discussing unimplemented corner
cases. That's an unimplemented corner case.

> However, assemblers were originally created so programmers
> wouldn't have to fiddle with the specifics of machine code. That's why
> assemblers do have some freedom when choosing their encoding.

How do you choose exactly? That's the point. Without assembler syntax for
it, you can't. You can only hardcode it, as above with db ...
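
To make the two ADC forms concrete, here is a minimal Python sketch that builds the bytes by hand. It is not NASM syntax and not anything NASM provides; the function names are made up for illustration. The register numbers are the standard x86 ones, which is why the output matches the 80 D3 / 80 D1 / 80 D2 bytes in the listing above.

```python
# Register numbers for the 8-bit registers as used in the ModR/M r/m field.
REG8 = {"al": 0, "cl": 1, "dl": 2, "bl": 3, "ah": 4, "ch": 5, "dh": 6, "bh": 7}


def adc_imm8_long(reg, imm):
    """Long form: 80 /2 ib, with mod=11b (register) and reg field=2 (ADC)."""
    modrm = (0b11 << 6) | (2 << 3) | REG8[reg]
    return bytes([0x80, modrm, imm & 0xFF])


def adc_al_imm8_short(imm):
    """Short form, AL only: 14 ib."""
    return bytes([0x14, imm & 0xFF])


print(adc_al_imm8_short(0xA5).hex())    # 14a5
print(adc_imm8_long("al", 0xA5).hex())  # 80d0a5
print(adc_imm8_long("bl", 0xA5).hex())  # 80d3a5
```

Both of the first two byte sequences perform adc al,0xa5; the assembler syntax only reaches one of them.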

I, and probably many others, have run into "boundary cases". NASM is far
better in regards to this for x86, but not complete. You can get NASM to
generate long form of instructions that GAS and WASM won't. I'm not sure
about other assemblers.

> > If you ever get around to creating assembly code for some binary, you'll
> > likely attempt to check it by comparing the resulting binaries.
>
> I really don't know of any way this could possibly disturb you, _ever_.
> Especially, I don't know why I would want to write an assembler source
> file that generates the _exact_ same output as some existing file.

To ensure that the disassembled source for one assembler is equivalent to
the original source of another assembler, you want to check that the new
binary is identical to the original. If it's not, you'll have to check each
difference by disassembling. There could be thousands of them. Also, if
some instruction assembles to a form using a larger offset, then all code
and data that comes after is off by a byte or more, and then the recreated
source won't work. You want to check as few differences as possible. You
don't want to be checking numerous different but equivalent "reg, reg"
encodings. You'd like to check only the encodings which could break the
program, i.e., those with differently sized offsets, such as jumps and
branches. But, if you've
got many "reg, reg" encodings to search through, you can't easily locate
wrongly sized branches and jumps. Also, if you can't select different sizes
of instruction encodings, how do you fix it? Kludge ... Hardcode ...
Self-modifying code ... etc.
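
The comparison step can be sketched in a few lines of Python. This is only an illustration under simple assumptions (both images already in memory, helper name invented here); a real check would read the two files from disk.

```python
def diff_offsets(original: bytes, rebuilt: bytes):
    """Return (offset, original_byte, rebuilt_byte) for every differing byte.

    Only meaningful when both images have the same length. A length
    mismatch means some instruction changed size and everything after
    it has shifted, so that case is reported as an error instead.
    """
    if len(original) != len(rebuilt):
        raise ValueError("length mismatch: %d vs %d" % (len(original), len(rebuilt)))
    return [(i, a, b) for i, (a, b) in enumerate(zip(original, rebuilt)) if a != b]


# Two equivalent encodings of "mov ax, bx" (89 D8 vs. 8B C3), followed by ret.
orig = bytes([0x89, 0xD8, 0xC3])
new = bytes([0x8B, 0xC3, 0xC3])
for off, a, b in diff_offsets(orig, new):
    print("%08X: %02X -> %02X" % (off, a, b))
```

Every line this prints for an equivalent "reg, reg" pair is noise that hides the differences which actually matter.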

> In my book, "getting around to creating assembly code" happens only
> for few reasons; and trying to recreate an existing binary is not one of
> them.

Sometimes, programs need to be converted to another assembler, e.g., the
original assembler no longer exists or is hard to obtain. How do you
recompile? Let's say you've got a binary that can be disassembled, but you
need to be able to verify the recreated source is correct ... That means
comparing the resulting binary with the original. If you have a Public
Domain program as a binary but no source, e.g., DOS device driver, what do
you do? Let's say you need to update that program, or convert or re-use a
part of it.

> > If the
> > binary was created by MASM and you're using NASM for the recreation,
> > you'll come across branches that NASM encodes one way and MASM
> > encodes another way. I.e., look at Jcc for multiple encodings.
> >
>
> Looking at the manuals, I don't see much ambiguity. Or do you mean:
>
> 70 cb JO rel8off
> 0F 80 cw JO rel16off
> 0F 80 cd JO rel32off
>
> So you can encode a jump within 128 bytes in three ways?

Only one of those encodes a jump exclusively within 128 bytes ... But, yes,
the 0F 80 form can encode a short branch.

> NASM always chooses the smallest possible one. I don't
> see what's wrong with that behaviour.

Nothing is "wrong" with that behavior. All assemblers have a "default"
behavior.

Fortunately, NASM allows you to select between those forms using NEAR and
o16/o32. NASM doesn't allow you to select on the earlier AL or AX/EAX
forms.
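
As a rough illustration of how the same short branch fits two of the JO forms, here is a Python sketch that builds the bytes directly. It assumes a 32-bit operand size for the 0F 80 form and uses made-up function names; the displacement is counted from the end of the instruction in both cases.

```python
import struct


def jo_short(instr_offset, target):
    """70 cb: 2 bytes, rel8 measured from the end of the instruction."""
    rel = target - (instr_offset + 2)
    if not -128 <= rel <= 127:
        raise ValueError("target out of rel8 range")
    return bytes([0x70]) + struct.pack("<b", rel)


def jo_near32(instr_offset, target):
    """0F 80 cd: 6 bytes, rel32 measured from the end of the instruction."""
    rel = target - (instr_offset + 6)
    return bytes([0x0F, 0x80]) + struct.pack("<i", rel)


print(jo_short(0, 0x10).hex())   # 700e
print(jo_near32(0, 0x10).hex())  # 0f800a000000
```

The two results branch to the same place but differ by four bytes in length, which is exactly what shifts everything that follows.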

E.g., one piece of assembly I converted required NASM's SHORT on a couple of
JMPs, BYTE on all the arithmetic, and WORD on many MOVs and all the
remaining JMPs. NASM allows you to choose those corner cases. It would've
been much harder if one had to db all of them. That same x86 code has stuff
like x86 instructions as labels: CLD, CMOVE. You can argue CMOVE wasn't
around at the time, but not CLD. So, that assembler preferenced branch
names over instructions, while NASM doesn't.

> > Get a hex editor. Pick an instruction that can have multiple equivalent
> > encodings (above). Manually construct the instructions as you believe
> > they are encoded. Many x86 instructions use 8-bits in octal as 2/3/3
> > for encoding registers and the memory mode. Disassemble.
>
> You do know that assemblers were invented so people wouldn't have to
> bother with machine code, right?

I first programmed in assembly in the early '80's for 6502 with an assembler
which was one-step beyond a text-editor ...

> Anyway, do you want to tell us the long way that there are
> different ways to encode the same instruction and you think
> NASM chooses the wrong ones?

That's not what I said.

James was asking about problems with corner cases with NASM, especially LGDT
and LIDT. There are instructions where NASM can't generate all forms. It's
fairly well known that NASM emits some instructions differently from MASM,
which causes problems with porting.

E.g., NASM will emit the first hex byte for these instructions, whereas some
other assembler (e.g., perhaps MASM, A86, or ASM86) emits the second:

or   09 vs. 0B
xor  31 vs. 33
cmp  39 vs. 3B
cmp  38 vs. 3A
mov  89 vs. 8B
mov  88 vs. 8A
sub  28 vs. 2B
etc.

Those are probably all "r8,r8" variations.

For that program, there are just over 150 binary differences, half of which
are instructions, in just 6KB. In a normal program, the quantity of
differences could be huge.
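
For what it's worth, most of the pairs listed above differ only in bit 1 of the opcode, the direction bit that swaps which operand is the ModR/M reg field. A small Python check (the list simply restates the pairs as posted) shows this, and also shows that the sub pair as written differs in two bits; 28 vs. 2A or 29 vs. 2B would fit the pattern.

```python
# Opcode pairs exactly as posted: (mnemonic, first, second).
PAIRS = [
    ("or", 0x09, 0x0B),
    ("xor", 0x31, 0x33),
    ("cmp", 0x39, 0x3B),
    ("cmp", 0x38, 0x3A),
    ("mov", 0x89, 0x8B),
    ("mov", 0x88, 0x8A),
    ("sub", 0x28, 0x2B),
]

D_BIT = 0x02  # direction bit in these ALU/MOV opcodes


def differs_only_in_d_bit(a, b):
    """True when the two opcodes are the same apart from the direction bit."""
    return a ^ b == D_BIT


for name, a, b in PAIRS:
    print("%-3s %02X vs %02X  xor=%02X  d-bit only: %s"
          % (name, a, b, a ^ b, differs_only_in_d_bit(a, b)))
```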

> > You may want other disassemblers besides NASM's NDISASM to
> > cross-check the disassembly. While NASM does a good job for
> > assembly, NDISASM doesn't (or didn't ...) disassemble everything
> > correctly.
>
> How come that again? I've yet to see a file ndisasm can't handle. Ugly
> hacks, like jumps into the middle of an instruction it can't do,
> granted, but it can do anything else.
...

> It even has an "intelligent sync" feature, allowing it to correctly decode
> an instruction/data mix. However, for that to work, the address of an
> instruction start has to occur as part of an instruction before the
> address itself. If that doesn't happen, there's also "manual sync".

It sounds like you're discussing Rosasm or the ancient x86 Bubble
disassembler ...

Which versions of NDISASM support this?

Anyway, "intelligent sync" as you described it, isn't guaranteed to work.
All that means is some found bytes were equivalent to an offset or address
within the address range you've chosen to disassemble. x86 has an almost
completely populated single-byte opcode map, so nearly any byte can begin a
valid instruction. So, there is no way to guarantee a correct sync via
programmatic methods. Even with full program analysis of all possible
branches and possible entry points, it's still possible that programmatic
methods won't locate the correct disassembly, i.e., multiple found
instruction paths. When lots of data or text is added to the mix, or code
that is not used by the main program, or the program is short, the problem
becomes more difficult. If there is a code size switch, e.g., 32-bit
following 16-bit, it is "impossible" to determine programmatically. The x86
16-bit and 32-bit encodings have a huge amount of overlap, but are slightly
different. That difference may be detectable, but it wouldn't be easy, which
is why I quoted impossible. For 64-bit mode, you could probably do a
frequency analysis on the code for a REX prefix to guess that it's 64-bit
code. Sequences of ASCII text can be guessed if bit 7 is clear for bytes in
a code sequence. Anyway, the safe method is for the programmer to determine the
code entry points, e.g., by determining text data via a dump, entry location
via an object header, etc. If one entry point is known to be correct, you
can sync for a while ...
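
The ASCII guess can be sketched as follows. This is a heuristic only, and a slightly stricter one than "bit 7 clear": it looks for runs of printable characters (20h to 7Eh) of some minimum length. The function name and the default threshold of 4 are arbitrary choices for the example.

```python
def ascii_runs(data: bytes, min_len: int = 4):
    """Yield (offset, text) for runs of printable ASCII at least min_len long."""
    start = None
    for i, b in enumerate(data):
        if 0x20 <= b < 0x7F:
            if start is None:
                start = i          # a new candidate run begins here
        else:
            if start is not None and i - start >= min_len:
                yield start, data[start:i].decode("ascii")
            start = None
    # A run that extends to the end of the image.
    if start is not None and len(data) - start >= min_len:
        yield start, data[start:].decode("ascii")


# A few code bytes, a DOS-style '$'-terminated string, then a ret.
image = bytes([0xEB, 0x03, 0x90, 0x90, 0x90]) + b"Hello, world$" + bytes([0xC3])
print(list(ascii_runs(image)))  # [(5, 'Hello, world$')]
```

A run found this way is only a hint that the range is data; plenty of valid instruction sequences are also printable.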


Rod Pemberton



Rugxulo

Nov 17, 2011, 12:54:31 AM
Hi,

On Nov 16, 6:55 am, N0S...@daqarta.com (Bob Masta) wrote:
>
> I seem to recall that someone (I think it was Eric Isaacson
> with A86) used (uses?) these different encodings to provide
> a "fingerprint" to identify code written with his assembler.

Allegedly ... yes, this was so he could catch and (mostly?) prove that
someone was using his assembler "commercially" without registering it.
But as recently as a handful of years ago, he said he'd never gone
after anyone yet. (Just FYI.)

Lars Erdmann

Nov 17, 2011, 5:59:51 PM
The Intel spec states that in either case, 16-bit or 32-bit operand size,
you will always need to specify a 6-byte memory location (for 64-bit, make
that a 10-byte location):

lgdt fword [mem]

but if the operand size is 16-bits, from the linear base address only 3
bytes (instead of 4 bytes) will actually be used (with the remaining byte
being set to zero in the GDTR). This is implicitely done by the CPU and not
by the assembler.
I guess that leaves little room for the assembler to find out what you want
to do unless you specify the operand size prefix explicitely as you did.
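
A small Python model of the behaviour described above may help. It is a sketch only: the function names are invented, and it models just the masking of the base, not anything else the CPU does.

```python
import struct


def make_pseudo_descriptor(limit, base):
    """Pack the 6-byte operand: 16-bit limit, then 32-bit base, little-endian."""
    return struct.pack("<HI", limit, base)


def load_gdtr(operand, operand_size):
    """Model what ends up in GDTR for a 16-bit or 32-bit operand size."""
    limit, base = struct.unpack("<HI", operand)
    if operand_size == 16:
        base &= 0x00FFFFFF  # only 3 bytes of the base are used
    return limit, base


desc = make_pseudo_descriptor(0x0027, 0x12345678)
print(desc.hex())                             # 270078563412
print([hex(v) for v in load_gdtr(desc, 16)])  # ['0x27', '0x345678']
print([hex(v) for v in load_gdtr(desc, 32)])  # ['0x27', '0x12345678']
```

The memory operand is six bytes either way; only the operand size in effect decides how much of the base survives.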

Lars


"James Harris" <james.h...@nospicedham.googlemail.com> schrieb im
Newsbeitrag
news:f203cd58-ee54-4310...@t8g2000yql.googlegroups.com...

s_dub...@nospicedham.yahoo.com

Nov 17, 2011, 7:23:45 PM
On Nov 13, 1:31 pm, James Harris
For 16 bits real mode, it looks like you need line # 43:

re: bits 16.

 1                                 ;; File: LGDT.NSM  By: s_dub...@yahoo.com
 2                                 ;; Last: 01-Jan-10 02:44:30 PM
 3                                 ;; Vers: 0r0
 4                                 ;; test %macro for 'jmps' -> 'jmp short'
 5                                 ;; ck LGDT - 17-Nov-11
 6
 7                                 [BITS 16]
 8
 9                                 ;;[BITS 32]
10
11                                 %macro jmps 1  ;; redefine 'jmps' addr as 'jmp short' addr
12                                     jmp short %1
13                                 %endmacro
14
15                                 main:
16                                     jmps .next  ;; the parameter is a label of the destination.
17 00000000 EB03               <1>     jmp short %1
18 00000002 90                         nop
19 00000003 90                         nop
20 00000004 90                         nop
21                                 .next:
22
23 00000005 66A1[6400]                 mov eax, [mem]
24 00000009 670F0110                   lgdt [eax]
25
26                                 BITS 32
27 0000000D A1[64000000]               mov eax, [mem]
28 00000012 0F0110                     lgdt [eax]
29
30 00000015 670F0115[64000000]         a16 lgdt [mem]
31 0000001D 0F0115[64000000]           a32 lgdt [mem]
32 00000024 660F0115[64000000]         o16 lgdt [mem]
33 0000002C 0F0115[64000000]           o32 lgdt [mem]
34 00000033 670F0116[6400]             lgdt [word mem]
35 00000039 0F0115[64000000]           lgdt [dword mem]
36
37                                 BITS 16
38 00000040 0F0116[6400]               a16 lgdt [mem]
39 00000045 670F0116[6400]             a32 lgdt [mem]
40 0000004B 0F0116[6400]               o16 lgdt [mem]
41 00000050 660F0116[6400]             o32 lgdt [mem]
42 00000056 0F0116[6400]               lgdt [word mem]
43 0000005B 670F0115[64000000]         lgdt [dword mem]
44
45 00000063 C3                         RET
46
47                                 mem:  ;; 6 byte pseudo-descriptor
48 00000064 0000                   limit: dw 0
49 00000066 00000000               base: dd 0
50
51                                 ;; -= eof =-


re: bits 32

 1                                 ;; File: LGDT.NSM  By: s_dub...@yahoo.com
 2                                 ;; Last: 01-Jan-10 02:44:30 PM
 3                                 ;; Vers: 0r0
 4                                 ;; test %macro for 'jmps' -> 'jmp short'
 5                                 ;; ck LGDT - 17-Nov-11
 6
 7                                 ;;[BITS 16]
 8
 9                                 [BITS 32]
10
11                                 %macro jmps 1  ;; redefine 'jmps' addr as 'jmp short' addr
12                                     jmp short %1
13                                 %endmacro
14
15                                 main:
16                                     jmps .next  ;; the parameter is a label of the destination.
17 00000000 EB03               <1>     jmp short %1
18 00000002 90                         nop
19 00000003 90                         nop
20 00000004 90                         nop
21                                 .next:
22
23 00000005 A1[64000000]               mov eax, [mem]
24 0000000A 0F0110                     lgdt [eax]
25
26                                 BITS 32
27 0000000D A1[64000000]               mov eax, [mem]
28 00000012 0F0110                     lgdt [eax]
29
30 00000015 670F0115[64000000]         a16 lgdt [mem]
31 0000001D 0F0115[64000000]           a32 lgdt [mem]
32 00000024 660F0115[64000000]         o16 lgdt [mem]
33 0000002C 0F0115[64000000]           o32 lgdt [mem]
34 00000033 670F0116[6400]             lgdt [word mem]
35 00000039 0F0115[64000000]           lgdt [dword mem]
36
37                                 BITS 16
38 00000040 0F0116[6400]               a16 lgdt [mem]
39 00000045 670F0116[6400]             a32 lgdt [mem]
40 0000004B 0F0116[6400]               o16 lgdt [mem]
41 00000050 660F0116[6400]             o32 lgdt [mem]
42 00000056 0F0116[6400]               lgdt [word mem]
43 0000005B 670F0115[64000000]         lgdt [dword mem]
44
45 00000063 C3                         RET
46
47                                 mem:  ;; 6 byte pseudo-descriptor
48 00000064 0000                   limit: dw 0
49 00000066 00000000               base: dd 0
50
51                                 ;; -= eof =-

The last nibble holds the r/m field, maybe used as operand size, the
manual is vague to me.

0F 01 nn, nn = 15h = 00 010 101b - or - 16h = 00 010 110b

0F 01 refers to a 2 byte opcode map of Group7.
From it, the modR/M -> mod[2],nnn[3],R/M[3].

nnn of 010b is LGDT (Ms), (a nnn of 011b is LIDT)

Ms, where M designates that "the ModR/M byte may refer only to memory;
e.g., LES, ...
and where s designates "Six-byte pseudo-descriptor"

-I'm guessing the R/M field encodes the 'so called' operand size; 6 =
16bit, 5 = 32bit, unless the 67h flips its meaning.. [HELP]!

hth,

Steve

Philip Lantz

Nov 18, 2011, 1:42:34 AM
On Thu, 17 Nov 2011 16:23:45 -0800, s_dubrovich wrote:

> On Nov 13, 1:31 pm, James Harris wrote:
> The last nibble holds the r/m field, maybe used as operand size, the
> manual is vague to me.
>
> 0F 01 nn, nn = 15h = 00 010 101b - or - 16h = 00 010 110b
>
> 0F 01 refers to a 2 byte opcode map of Group7.
> From it, the modR/M -> mod[2],nnn[3],R/M[3].
>
> nnn of 010b is LGDT (Ms), (a nnn of 011b is LIDT)
>
> Ms, where M designates that "the ModR/M byte may refer only to memory;
> e.g., LES, ...
> and where s designates "Six-byte pseudo-descriptor"
>
> -I'm guessing the R/M field encodes the 'so called' operand size; 6 =
> 16bit, 5 = 32bit, unless the 67h flips its meaning.. [HELP]!

No, the mod and R/M fields encode the addressing mode.

The operand size is based on the current operand size, which may be
overridden by the operand-size override prefix (66h).

67h is the address-size override prefix.
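
To see this in the bytes from the listing, the ModR/M byte can be split into its three fields. The helper below is a plain Python sketch with an invented name. With mod=0, an r/m of 6 means "16-bit displacement only" under 16-bit addressing, and an r/m of 5 means "32-bit displacement only" under 32-bit addressing, which is why 16h and 15h show up with [mem]; 10h is the [eax] form under 32-bit addressing.

```python
def split_modrm(byte):
    """Split a ModR/M byte into its (mod, reg, rm) fields: 2, 3 and 3 bits."""
    mod = byte >> 6
    reg = (byte >> 3) & 7
    rm = byte & 7
    return mod, reg, rm


for b in (0x10, 0x15, 0x16):
    print("%02X -> mod=%d reg=%d rm=%d" % ((b,) + split_modrm(b)))
```

In all three bytes the reg field is 2, which selects LGDT within the 0F 01 group; nothing in the byte says anything about operand size.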

hopcode

Nov 18, 2011, 3:50:14 AM
caught by curiosity, i browsed wikipedia yesterday and,
well, look what i found on

http://en.wikipedia.org/wiki/A86_%28software%29

this interesting paper about Hiding Information in Executable Objects
(HIEO[tm] from now on :-) ):
http://www.cs.jhu.edu/~rubin/courses/fall04/hydan.pdf

where creativity is the only limit. then i browsed the
A86/A386 features at http://eji.com/a86/features.htm

well, i think EI has a very nice sense of humour. the fact
that he hasn't released the source code _yet_ awakens my
curiosity, and he gains all my approval when, at the
same time, he claims:
"A86 assembles at a rate of over 100,000 lines per second."
incredible, isn't it? and then
"That's per second. NOT per minute, per second."

per second, ja? not per minute!
ok, i take his word for it, but i ask him anyway,
and publicly, the following, because it sounds very
interesting to me:

-1 why not share the source code
among the poor human beings for free?
-2 is there a 64-bit version of the assembler?

Perhaps he can answer, who knows?

Cheers,

--

Rod Pemberton

Nov 18, 2011, 4:37:56 AM
<s_dub...@nospicedham.yahoo.com> wrote in message
news:df70131e-f1b2-42b3...@o5g2000yqa.googlegroups.com...
> On Nov 13, 1:31 pm, James Harris
> <james.harri...@nospicedham.googlemail.com> wrote:
> > [snip LGDT encoding for NASM]
> >
> > I can code
> >
> > o32 lgdt [mem]
> >
> > which, I think, does the job.
>
> For 16 bits real mode, it looks like you need line # 43:
>

41 ... ?

AIUI, 43 allows you to locate the 5/6-byte GDT descriptor at a 32-bit
address while using 16-bit code, but will load a 5-byte descriptor for the
GDT base address (24-bit or 3-byte) and limit (16-bit or 2-byte). E.g., for
both 32-bit address and 6-byte descriptor (32-bit/4-byte address,
16-bit/2-byte limit), you'd need:

o32 lgdt [dword mem]
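
The prefix logic behind that can be sketched in Python. This is a model of the rule, not of NASM's output: the function name is made up, and the order in which the two prefixes are listed here is arbitrary, since the CPU does not care which comes first.

```python
def lgdt_prefixes(bits, operand_size, address_size):
    """Return the override prefixes needed inside a BITS 16 or BITS 32 section."""
    prefixes = []
    if address_size != bits:
        prefixes.append(0x67)  # address-size override
    if operand_size != bits:
        prefixes.append(0x66)  # operand-size override
    return bytes(prefixes)


print(repr(lgdt_prefixes(16, 16, 16).hex()))  # no prefix: 5-byte load, 16-bit address
print(lgdt_prefixes(16, 32, 16).hex())        # 66: as on listing line 41
print(lgdt_prefixes(16, 16, 32).hex())        # 67: as on listing line 43
print(lgdt_prefixes(16, 32, 32).hex())        # both: 6-byte load at a 32-bit address
```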

> [snip]
>
> 37 BITS 16
> 38 00000040 0F0116[6400] a16 lgdt [mem]
> 39 00000045 670F0116[6400] a32 lgdt [mem]
> 40 0000004B 0F0116[6400] o16 lgdt [mem]
> 41 00000050 660F0116[6400] o32 lgdt [mem]
> 42 00000056 0F0116[6400] lgdt [word mem]
> 43 0000005B 670F0115[64000000] lgdt [dword mem]
> 44
>
> [snip]


Rod Pemberton




Rod Pemberton

Nov 18, 2011, 4:40:24 AM
"Lars Erdmann" <lars.e...@nospicedham.arcor.de> wrote in message
news:4ec591e8$0$7619$9b4e...@newsspool1.arcor-online.net...
> "James Harris" <james.h...@nospicedham.googlemail.com> schrieb im
> Newsbeitrag
> news:f203cd58-ee54-4310...@t8g2000yql.googlegroups.com...
> >
[for Usenet newsgroups, you're reply goes in-between or after]

> The Intel spec states that in either case, 16-bit or 32-bit operand size,
> you will always need to specify a 6-byte memory location (for 64-bit,
> make that a 10-byte location):
>
> lgdt fword [mem]

FYI, that's not valid NASM syntax. 'fword' is for MASM.

Older NASM supports BYTE, WORD, DWORD, QWORD,
TWORD (i.e., TWORD is MASM's TBYTE). Newer NASM
supports those plus OWORD and YWORD for SSE (XMM) and
AVX (YMM) operands.

> but if the operand size is 16-bits, from the linear base address only 3
> bytes (instead of 4 bytes) will actually be used (with the remaining byte
> being set to zero in the GDTR). This is implicitely done by the CPU
> and not by the assembler.

I've seen this a half-dozen times recently. So, it's time to mention it.

There is no 'cite' in implicitly.


Rod Pemberton


Bob Masta

Nov 18, 2011, 8:14:28 AM
Just as a point of reference, MASM32 assembles 138,000 lines
of code in a little over 2 seconds on my cheap old 1.6 GHz
XP laptop. I don't recall for sure what language Microsoft
wrote MASM in, but I expect it was C. So 100,000 lines per
second sounds perfectly reasonable for A86, which was
probably written in assembler. (And may have been tested on
a system twice as fast as mine, anyway.)

s_dub...@nospicedham.yahoo.com

Nov 18, 2011, 9:54:44 AM
On Nov 18, 3:37 am, "Rod Pemberton"
<do_not_h...@nospicedham.noavailemail.cmm> wrote:
> <s_dubrov...@nospicedham.yahoo.com> wrote in message
>
> news:df70131e-f1b2-42b3...@o5g2000yqa.googlegroups.com...
>
> > On Nov 13, 1:31 pm, James Harris
> > <james.harri...@nospicedham.googlemail.com> wrote:
> > > [snip LGDT encoding for NASM]
>
> > > I can code
>
> > > o32 lgdt [mem]
>
> > > which, I think, does the job.
>
> > For 16 bits real mode, it looks like you need line # 43:
>
> 41 ... ?

Thanks to you and Philip for straightening me out.

Steve

Lars Erdmann

Nov 18, 2011, 11:45:53 AM

>> >
>
> [for Usenet newsgroups, you're reply goes in-between or after]

It's:
[for Usenet newsgroups, your reply goes in-between or after]

>> but if the operand size is 16-bits, from the linear base address only 3
>> bytes (instead of 4 bytes) will actually be used (with the remaining byte
>> being set to zero in the GDTR). This is implicitely done by the CPU
>> and not by the assembler.
>
> I've seen this a half-dozen times recently. So, it's time to mention it.
>
> There is no 'cite' in implicitly.

Do you want me to answer in german ?


Lars

wolfgang kern

Nov 18, 2011, 2:00:47 PM

Bob Masta replied to hopcode:
If I compared RosAsm with MASM32 in terms of compilation speed
(application performance will depend on code style anyway),
I'd have to say that MASM/MASM32 are awfully slow dead horses.

A86 follows a long-forgotten path in programming and is too far away from
being smart, nor does it show any features updated for newer hardware (>1987).

Compilation speed may matter for programs which use this unsafe
"compile at site" feature... an invitation for hackers for a long time :)

I take my time to compile (manually) and check against almost all odds,
but deliver only tested and fully working code.
So I actually don't care much about the time it takes to create user-demanded code.
__
wolfgang


Rugxulo

Nov 18, 2011, 2:05:40 PM
Hi,

On Nov 18, 10:45 am, "Lars Erdmann"
<lars.erdm...@nospicedham.arcor.de> wrote:
>
> >> being set to zero in the GDTR). This is implicitely done
>
> > I've seen this a half-dozen times recently.
> > So, it's time to mention it.
> > There is no 'cite' in implicitly.
>
> Do you want me to answer in german ?

Nein, it's okay, Rod is a jelly donut. ;-)

(mods: yes, off-topic, but a little humor never hurt anyone)

James Harris

Nov 18, 2011, 4:25:05 PM
On Nov 18, 9:40 am, "Rod Pemberton"
<do_not_h...@nospicedham.noavailemail.cmm> wrote:
> "Lars Erdmann" <lars.erdm...@nospicedham.arcor.de> wrote in message

...

> [for Usenet newsgroups, you're reply goes in-between or after]

...

> > being set to zero in the GDTR). This is implicitely done by the CPU
> > and not by the assembler.
>
> I've seen this a half-dozen times recently.  So, it's time to mention it.
>
> There is no 'cite' in implicitly.

And there's no apostrophe in your.

;-) Sorry, but given your comment and that I had noticed your you're
it seemed only just to make that reply. I repeatedly find I've made
similar mistakes - and it's good to see correct language being pointed
out.

James

Rod Pemberton

Nov 19, 2011, 11:36:36 AM
"Rugxulo" <rug...@nospicedham.gmail.com> wrote in message
news:086f4ed2-321c-45d1...@o14g2000yqh.googlegroups.com...
> On Nov 18, 10:45 am, "Lars Erdmann"
<lars.erdm...@nospicedham.arcor.de> wrote:
>
>
> > Do you want me to answer in german ?
> >
> Nein, it's okay, Rod is a jelly donut. ;-)
>

"Nein"? Ich verstehe nicht. Ich kann kein Deutsch. ;-)

Google translate to the rescue ...
http://translate.google.com/

> (mods: yes, off-topic, but a little humor never hurt anyone)
>

Yeah, since no one here liked your suggested Forth filename extensions,
we're all still awaiting your review of GP's DOS port of the Dillo GUI ...

;-)


Rod Pemberton


Rugxulo

Nov 19, 2011, 2:44:38 PM
Hi,

On Nov 19, 10:36 am, "Rod Pemberton"
<do_not_h...@nospicedham.noavailemail.cmm> wrote:
> "Rugxulo" <rugx...@nospicedham.gmail.com> wrote in message
>
> news:086f4ed2-321c-45d1...@o14g2000yqh.googlegroups.com...
>
> > On Nov 18, 10:45 am, "Lars Erdmann"
> <lars.erdm...@nospicedham.arcor.de> wrote:
>
> > > Do you want me to answer in german ?
>
> > Nein, it's okay, Rod is a jelly donut.   ;-)

http://en.wikipedia.org/wiki/Ich_bin_ein_Berliner#Jelly_doughnut_misconception

> "Nein"?  Ich verstehe nicht.  Ich kann kein Deutsch.  ;-)
>
> Google translate to the rescue ...http://translate.google.com/
>
> > (mods: yes, off-topic, but a little humor never hurt anyone)
>
> Yeah, since no one here liked your suggested Forth filename extensions,
> we're all still awaiting your review of GP's DOS port of the Dillo GUI ...
>
> ;-)

Heh, I knew Forthers wouldn't agree on anything, nobody ever does.

As for Dillo, last I heard it didn't accept downloads, and packet
drivers for new network cards (as you know) are rare, so I haven't
bothered trying it, even under VirtualBox, where networking works,
barely. (But it's still cool.)