Us "purists" don't like it much, but in the real world, most asm is in
subroutines called from C[++]. The hard parts are done in asm, and C[++]
provides the "glue" that holds it together, and provides the OS-specific
parts. (not going to be portable across CPUs, of course)
An alternative to this would be inline asm. When the C folks talk about
portability, they ain't talking about cross-toolchain portability of
inline asm!!! Awful mess. Maybe gcc on Linux (BSD/Mac) vs gcc on Windows
would work...
An object file of the proper format will (should) link with object
file(s) of the same format, regardless of *what* language they were
written in ("Python" is popular... I wanna wrestle a carnivorous reptile
that outweighs me... Yeah, right!). Different kind of "portability".
Since Nasm will produce object files in a variety of formats, it's a
suitable tool for the asm part. We won't have "binary portability" of
course, it'll have to be re-assembled for each OS we want to use it with
(same as C)...
I don't suppose C has a "getcpuvendor()"... maybe it does. Struck me as
something this might be "good for". (a better example might pass more
parameters...) Something like:
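bits 32
global getvendor
section .text
; C-callable: getvendor(char *vendorbuf), buffer needs 13 bytes
getvendor:
    pusha                   ; save all eight general registers (32 bytes)
    xor eax, eax            ; CPUID leaf 0
    cpuid                   ; vendor string comes back in ebx, edx, ecx
    mov eax, [esp + 36]     ; first arg: 32 (pusha) + 4 (return address)
    mov [eax], ebx
    mov [eax + 4], edx
    mov [eax + 8], ecx
    mov byte [eax + 12], 0  ; zero-terminate the string
    popa
    xor eax, eax            ; return 0
    ret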
For Linux, we can assemble this as "nasm -f elf getvendor.asm" - add
"-g" or "-O" switches if you want... For other-than-ELF, we probably
want an underscore on "getvendor", so "nasm -f win32 --prefix _
getvendor.asm". I'm not sure if MacIntel wants underscores for "-f
macho" or not. If my information is correct, Watcom C wants a *trailing*
underscore... "--postfix _" should do it. Hmmmm, what output format
would Watcom want? Hmmm... Borland wants "-f obj", I believe. "-f obj"
defaults to "bits 16", so we'll need "bits 32" if we want it to work
with "-f obj" - won't hurt "-f elf", etc... There, I put it in
(improving already!).
Now, we'll want something to call it from...
main()
{char vendorbuf[13];
getvendor(vendorbuf);
puts(vendorbuf);
}
gcc -o getvendortest getvendortest.c getvendor.o
That's horribly improper C. Won't tolerate "-Wall". We'll want "int
main"... "puts" is declared in "stdio.h", which we don't include, so
it's implicitly declared here. We could include a "getvendor.h"...
hardly seems worthwhile. "Prototype" it here? (before or after
"main"...?) An explicit "return". What's the "minimal" but "correct"
way to do it?
I don't know how to call it from VB or whateverall... any gurus? Hey, we
can call it from asm!
global _start
extern getvendor
section .text
_start:
    nop
commence:
    push vendorbuf          ; argument: address of the buffer
    call getvendor
    add esp, 4              ; caller removes the argument
    mov ecx, vendorbuf      ; buffer to write
    mov edx, 12             ; length of the vendor string
    call write_stdout
    mov ebx, eax            ; exit status = return value of write
    mov eax, 1              ; __NR_exit
    int 80h
;------------------
write_stdout:
    mov ebx, 1 ; STDOUT
    mov eax, 4 ; __NR_write
    int 80h
    ret
;-------------------
section .bss
    vendorbuf resb 13
;-----------------------
That'll have to be changed for other-than-Linux, of course. WriteFile...
put it in a window with some fancy font, if you like... Maybe, for
Windows, we ought to arrange for "ret 4" in the callee so it'd be more
like an API call?
Just a first draft...
Best,
Frank
This might be better:
#include <stdio.h>
void getvendor(char *vendorbuf);
int main(void) {
char vendorbuf[13];
getvendor(vendorbuf);
puts(vendorbuf);
return 0;
}
This assumes that vendorbuf is big enough for getvendor and that
getvendor terminates its output with a null character. Otherwise bad
things are likely to happen.
The real issue is the ABI. The ABI that the C compiler decides on *must*
agree with the ABI that the assembler routine expects. There is
unlikely to be much of a problem for a simple example like this, but
there are more headaches when it comes to complex routines that accept
many types of parameters, especially structures and also return complex
types like structures or unions. The only way to make the assembler and
the C code interoperate is to learn the ABI for that C implementation
and adjust the assembler code accordingly. It *is* possible to tell the
C compiler to use a different ABI, but this is likely to be a more
slippery path, as not all compilers have such support, and it isn't as
flexible as modifying your assembler code.
Take this example C function:
struct ret_struct my_func(
ptrdiff_t p,
size_t s,
struct example ex,
unsigned long long ull,
void (*fx)(int *, int *),
char c
);
This is artificial, but it does illustrate that this is not a simple
matter. You need to know the exact sizes of the above parameters and
how they should be laid out on the stack for the C compiler to not
choke on it. A similarly complicated assembler routine must also know how
the C compiler would send it its arguments and where it would expect
the return value.
There is no portable (from C's POV) way to do this.
At least it's programming related. It's definitely better than MAIBTYA
arguments that seemed to dominate for the last couple of years... (My
Assembler Is Better Than Your Assembler) Isn't it? I'd say we're now one
step closer to assembly, wouldn't you? ;-) ;-)
> Us "purists" don't like it much, but in the real world, most asm is in
> subroutines called from C[++]. The hard parts are done in asm,
Correction, the non-C related parts, i.e., CISC, are done in asm because
they can't be done in C... :)
<snip, asm conversation of desperate man with self>
> I don't know how to call it from VB or whateverall... any gurus? Hey, we
> can call it from asm!
:)
Actually, if you converted a couple of those small int 80h routines to
macros, you're on your way to C primitives...
Rod Pemberton
...
> This might be better:
>
> #include <stdio.h>
>
> void getvendor(char *vendorbuf);
>
> int main(void) {
> char vendorbuf[13];
> getvendor(vendorbuf);
> puts(vendorbuf);
> return 0;
> }
Perfect. Gcc can find nothing to complain about (must be very
frustrating for it). Thanks, Santosh! (although this may be futile...)
> This assumes that vendorbuf is big enough for getvendor
Okay... At this point, "getvendor.h" is starting to look better. Put the
prototype in it, and "#define VENDORBUFSIZE 13"... or maybe round it
up... Then "h2incn getvendor.h" (Thank you, Johannes!) will produce
"getvendor.inc" with "%define VENDORBUFSIZE 13", and our ass(m) ought to
be covered.
> and that
> getvendor terminates its output with a null character.
Sigh... so much for my assumption that asm is readable! :(
> Otherwise bad
> things are likely to happen.
A demo of how and why these bad things happen... the famous buffer
overflow exploit, and why we should be ashamed of ourselves for making
them... another day, perhaps...
> The real issue is the ABI. The ABI that the C compiler decides on *must*
> agree with the ABI that the assembler routine expects.
Yeah... I've been assuming that "Intel ABI" means something...
Okay. If the C compilers are going to get all random on us, this is not
going to be useful. Back to pure asm, where we know what the tools are
going to do!
Best,
Frank
You've uncovered my plot! :)
>>Us "purists" don't like it much, but in the real world, most asm is in
>>subroutines called from C[++]. The hard parts are done in asm,
>
> Correction, the non-C related parts, i.e., CISC, are done in asm because
> they can't be done in C... :)
Okay.
> <snip, asm conversation of desperate man with self>
Not all *that* desperate, actually.
>>I don't know how to call it from VB or whateverall... any gurus? Hey, we
>>can call it from asm!
>
>
> :)
>
> Actually, if you converted a couple of those small int 80h routines to
> macros, you're on your way to C primitives...
Okay, between you and Santosh, you've convinced me that this is *not* a
worthwhile path to walk down.
C is Hitler! I'm outta here. :)
Best,
Frank
Well, if we are programming in assembler we have already sacrificed any
notion of cross-architecture portability. So we might as well get into
all the nitty gritty details of what the C compilers do for the system,
at least if we want to use C code.
Structure padding is another issue that is troublesome for programmers
trying to use object code derived from C. C compilers typically place
some padding within structures to align them properly, and they would
expect any structures constructed by foreign code to do the same.
This is one reason why C programmers tend to prefer inline asm. You
don't have to worry about such things there.
The problem is, while assembler is as flexible as you can get, HLLs
generally have a lot of conventions and restrictions. It's even more of
a nightmare to try to interoperate with C++ code.
...
> Well, if we are programming in assembler we have already sacrificed any
> notion of cross-architecture portability.
Right.
> So we might as well get into
> all the nitty gritty details of what the C compilers do for the system,
> at least if we want to use C code.
Yeah. I was under the impression that within, say "32-bit C compilers
for x86", we wouldn't run into *too* many differences in the nitty
gritty details. I'm aware of some, and of ways to work around 'em within
limits. Apparently the situation is worse than I imagined.
> Structure padding is another issue that is troublesome for programmers
> trying to use object code derived from C. C compilers typically place
> some padding within structures to align them properly, and they would
> expect any structures constructed by foreign code to do the same.
Right. If the structure is under our control, we can include explicit
padding to alignments such that a "sane" compiler wouldn't touch 'em.
This may be like believing in the tooth fairy. If the structure is
"forced" on us, all we can do is hope for the best. Xwindows adds
specific padding to all(?) structures. Gcc doesn't mess with 'em (that
I've seen). Might not be true for other compilers(?). I understand that
there's a "pragma pack" syntax that will control this, but we can't
count on whether it's been used or not...
> This is one reason why C programmers tend to prefer inline asm. You
> don't have to worry about such things there.
As long as they stick with the same compiler...
> The problem is, while assembler is as flexible as you can get,
In the sense that we can do anything we want, yes. But an assembler is
*not* at liberty to mess with our code, even if the result does the same
thing, and *not* at liberty to pad our structure definitions beyond what
we "say". In that sense it's *less* flexible, I'd say.
> HLLs
> generally have a lot of conventions and restrictions.
Even with my limited experience, I've noticed. :)
> It's even more of
> a nightmare to try to interoperate with C++ code.
The only thing I know about C++ is that we have to declare our external
asm functions extern "C", to keep 'em from being "decorated". The
decoration apparently differs even from version to version of the same
compiler. We've still got the underscore issue. I'm guessing that I
don't want to bother with the issues I'm not yet aware of, if any. :)
This is apparently not as profitable an exercise as I believed...
Best,
Frank