
reading register cr0, no error, but weird behavior


bilsch01

Nov 16, 2015, 10:48:00 AM
The C program that calls the assembly function that reads register cr0
is as follows:

// run with sudo ./rdcr0
// gcc -Wall -O2 rdcr0.c -c
// then run: gcc -Wall -O2 rdcr0.o rdcr0_ref.o -o rdcr0

#include <stdio.h>
extern int rdregcr0(int a);

int main(void)
{
    int x;
    printf("\n this is the first statement\n\n");
    x = rdregcr0(1);
    printf("bit 1 of cr0 is: %d\n", x);
    printf("\n this is the last statement\n\n");
    return 0;
}

The assembly function that reads cr0 is:

; nasm -felf64 rdcr0.asm -o rdcr0_ref.o

global rdregcr0

SECTION .text

rdregcr0:
    push rbp
    mov rbp, rsp

    mov [rbp-8], rdi
    mov rcx, [rbp-8]
    mov rbx, mask1
    shl rbx, cl

    mov rax, cr0

    and rax, rbx

    pop rbp
    ret

SECTION .data

mask1 dq 1


There are no errors generated when running the C program with sudo.
The first printf() statement prints out but the second two printf's
don't print. These two are after the call to the assembly function that
reads cr0. There are no error messages. My objective is to read the
bits in cr0, but with this behavior I won't be able to do it. I can't
figure why I should be prevented from reading the register - reading
causes no harm. Your suggestions will be appreciated.

Bill S.

Frank Kotler

Nov 16, 2015, 11:48:09 AM
"mov rbx, mask1" gets you the address of "mask1". Pretty sure that's not
what you want. Try "mov rbx, [mask1]"
(not sure why you need a variable here at all...)

> shl rbx, cl
>
> mov rax, cr0
>
> and rax, rbx
>
> pop rbp
>
> ret
>
> SECTION .data
>
> mask1 dq 1
>
>
> There are no errors generated when running the C program with sudo.
> The first printf() statement prints out but the second two printf's
> don't print. These two are after the call to the assembly function that
> reads cr0. There are no error messages. My objective is to read the
> bits in cr0, but with this behavior I won't be able to do it. I can't
> figure why I should be prevented from reading the register - reading
> causes no harm. Your suggestions will be appreciated.
>
> Bill S.

I should probably leave this for someone who can try it, but... fools
rush in...

Best,
Frank

bilsch01

Nov 16, 2015, 12:33:14 PM
I fixed that now that you've pointed it out, but I get the same results.

Thanks for your help.

Melzzzzz

Nov 16, 2015, 1:03:22 PM
You should preserve the rbx register, as required by the x86-64 ABI
(it is callee-saved).

Rod Pemberton

Nov 16, 2015, 6:48:52 PM
FYI, generally, the SMSW instruction is used to get CR0's flags,
since it's not a privileged instruction like MOV CR0. However,
it doesn't return all of CR0's bits, just the lower 16 bits.

Does this C code work for your compiler? ...

#include <stdio.h>

unsigned long long __get_cr0(void)
{
    unsigned long long result;
    __asm__ __volatile__ (
        "smswq %%rax\n"
        "movq %%rax,%0\n"
        : "=r" (result)
        : /* no input */
        : "%rax" /* clobbered */
    );

    return result;
}

int main(void)
{
    unsigned long long x;
    printf("\n this is the first statement\n\n");
    x=__get_cr0();
    printf("bit 1 of cr0 is: %d\n",(int)(x&0x01));
    printf("\n this is the last statement\n\n");
    return(0);
}

If it does, you can compare the output and the (dis)assembly.

Note that it might be easier to just return CR0's full value
and mask and cast it in C. I did that here with this version.
I.e., if you want to see the full value for SMSW, just change
"%d" to "%llx" and "(int)(x&0x01)" to "x".

I used inline GAS (GNU Assembler) instead of NASM for the
assembly, which removes a layer of complexity, i.e., two source
files and linking. You must deal with GAS syntax, although
you could use the ".intel_syntax noprefix" directive to get
Intel-style (MASM-like) syntax with GAS. Fortunately, inline GAS
assembly has the nice "clobbered" register feature. It tells the
compiler which registers the assembly overwrites, so the compiler
preserves and rearranges in-use registers for you, i.e., there is
less need for you to understand the x86-64 ABI. You should
understand it anyway, just in case. I need to go do that too. So,
don't fault me if the code isn't 100% correct. :-)


Rod Pemberton



bilsch01

Nov 16, 2015, 11:04:31 PM
I pushed and later popped rbx, and also got rid of the unneeded data
section. My result is the same as before.
Here is the new assembly:

global rdregcr0

rdregcr0:
    push rbx
    push rbp
    mov rbp, rsp

    mov [rbp-16], rdi
    mov rcx, [rbp-16]
    xor rbx, rbx
    add rbx, 1
    shl rbx, cl

    mov rax, cr0

    and rax, rbx

    pop rbp
    pop rbx
    ret


bilsch01

Nov 17, 2015, 12:49:48 AM
Your code compiles and the resulting file gives the desired output. I
called your file pembr1.c and then used the following command line:

gcc -S -masm=intel pembr1.c -o pembr1.asm

The result has a lot of extraneous stuff interspersed with assembly code
and the code is partly still in ATT format. I deleted stuff that
doesn't look like code and slightly rearranged some lines. Here's what I
got:

__get_cr0:
push rbp
mov rbp, rsp
smswq %rax
movq %rax,rdx
mov QWORD PTR [rbp-8], rdx
mov rax, QWORD PTR [rbp-8]
pop rbp
ret

push rbp
mov rbp, rsp
sub rsp, 16
mov edi, OFFSET FLAT:.LC0
call puts
call __get_cr0
mov QWORD PTR [rbp-8], rax
mov rax, QWORD PTR [rbp-8]
and eax, 1
mov esi, eax
mov edi, OFFSET FLAT:.LC1
mov eax, 0
call printf
mov edi, OFFSET FLAT:.LC2
call puts
mov eax, 0
leave
.LC0: .string "\n this is the first statement\n"
.LC1: .string "bit 1 of cr0 is: %d\n"
.LC2: .string "\n this is the last statement\n"

If I could convert it completely to Intel Assembly then I would have the
whole thing in assembly.


A couple points I don't understand. ATT format is reversed from Intel
however you have this line:
"movq %%rax,%0\n" - which isn't reversed.

Thanks. Bill S.

Rod Pemberton

Nov 17, 2015, 1:49:53 AM
> Your code compiles and the resulting file gives the desired output. I
> called your file pembr1.c and then used the following command line:
>
> gcc -S -masm=intel pembr1.c -o pembr1.asm
>
> The result has a lot of extraneous stuff interspersed with assembly code
> and the code is partly still in ATT format. I deleted stuff that
> doesn't look like code and slightly rearranged some lines. Here's what I
> got:
>
> __get_cr0:
> push rbp
> mov rbp, rsp
> smswq %rax
> movq %rax,rdx
> mov QWORD PTR [rbp-8], rdx
> mov rax, QWORD PTR [rbp-8]
> pop rbp
> ret

This appears to be Intel/AT&T, except for "QWORD PTR" which appears
to be MASM and except for the GAS code I posted. I.e., it appears
to me that you have syntax for three different assemblers in that
code ... But, I'll accept that GCC output that for Intel syntax
without confirming.

Whether Intel/AT&T or MASM, you can't use the '%' which is only for
GAS' syntax for inlined assembly in C.

So, for these lines:

smswq %rax
movq %rax,rdx

You probably want these lines:

smsw rax
mov rdx,rax

GAS uses a reverse order and size qualifiers, i.e., 'q' for quad.

> [...]
>
> A couple points I don't understand. ATT format is reversed from Intel
> however you have this line:
> " "movq %%rax,%0\n - which isn't reversed.

But, it is reversed.

RAX is moved into another register, allocated by the compiler.
That output register is represented by %0 here.

For Intel or AT&T syntax, it would be "MOV ..., RAX" where '...'
is the other register.

I.e., GAS is "from, to" whereas Intel/AT&T is "to, from."

See the register order in the suggested changes for MOV above.

Rod Pemberton

Nov 17, 2015, 1:49:54 AM
On Tue, 17 Nov 2015 00:40:05 -0500, bilsch01 <kin...@nospicedham.comcast.net> wrote:

> A couple points I don't understand. ATT format is reversed from Intel
> however you have this line:
> " "movq %%rax,%0\n - which isn't reversed.

Correction.

Replace "Intel/AT&T" in my last post with "Intel" and
"GAS" with "GAS/AT&T".

One shouldn't post while almost asleep.


RP

wolfgang kern

Nov 17, 2015, 3:50:07 AM

"bilsch01" wrote:
...
> __get_cr0:
> push rbp
> mov rbp, rsp
> smswq %rax
> movq %rax,rdx
> mov QWORD PTR [rbp-8], rdx
> mov rax, QWORD PTR [rbp-8]
> pop rbp
> ret

Why such heavy detours via the stack?
I'd just have:

get_cr0:
smsw eax ;only lower 16 bits are used
ret
or:
mov rax,cr0 ;this needs PL=0 in VM,PM
ret

and then filter the desired bits or even show them all at once, perhaps
easier for you in an HLL with a bin/hex print (dunno if std.h contains such).
__
wolfgang

bilsch01

Nov 17, 2015, 8:05:30 AM
This works - I tried it. Thanks.

bilsch01

Nov 17, 2015, 8:20:32 AM
On 11/16/2015 10:45 PM, Rod Pemberton wrote:
> On Tue, 17 Nov 2015 00:40:05 -0500, bilsch01
> <kin...@nospicedham.comcast.net> wrote:
>
>> A couple points I don't understand. ATT format is reversed from Intel
>> however you have this line:
>> " "movq %%rax,%0\n - which isn't reversed.
>
> Correction.
>
> Replace "Intel/AT&T" in my last post with "Intel" and
> "GAS" with "GAS/AT&T".
>
> One shouldn't post while almost asleep.
>
>
> RP
>
I got it worked out to:

__get_cr0:
push rbp
mov rbp, rsp
xor rax,rax
smsw ax
mov rdx,rax
mov [rbp-8], rdx
mov rax,[rbp-8]
pop rbp
ret

Wolfgang's method works also.

Bernhard Schornak

Nov 17, 2015, 1:06:22 PM
bilsch01 wrote:


> __get_cr0:
> push rbp
> mov rbp, rsp


Do you really need a stack frame? This

SUB RSP, (Size_Of_Stackframe - 8)
MOV [RSP + Offset], RBP
...
MOV RBP, [RSP + Offset]
ADD RSP, (Size_Of_Stackframe - 8)
RET

is faster, more flexible (no negative offsets!) and saves one
multi-purpose register for other tasks.

Subtracting (Size_Of_Stackframe - 8) accounts for the return address
that CALL has already pushed. Both Windows and Linux demand RSP
aligned to a paragraph (16 bytes) at call sites, so our stack frame
is properly aligned if we subtract hex numbers ending with eight
(08h, 18h, 28h, ...).


> xor rax,rax
> smsw ax


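Do you evaluate the upper 16 bits later on? If not,

SMSW AX

alone should do it. If yes,

SMSW EAX (RAX)

stores a full 32-bit (64-bit) result without contortions. The bits
above the low 16 are not guaranteed to be zero, though, so mask with
0FFFFh if you need them cleared.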


> mov rdx,rax
> mov [rbp-8], rdx
> mov rax,[rbp-8]


RAX => RDX
RDX => MEM
MEM => RAX


The SMSW is slow enough on its own. You don't need additional
delays to slow it down further... ;)


> pop rbp
> ret


As Wolfgang said, the function could be reduced to "SMSW EAX"
within the body of the calling function.


Greetings from Augsburg

Bernhard Schornak

Rod Pemberton

Nov 17, 2015, 6:22:26 PM
On Tue, 17 Nov 2015 08:10:30 -0500, bilsch01 <kin...@nospicedham.comcast.net> wrote:

> Wolgang's method works also.

If you're using assembly, you can place the SMSW instruction
directly where you want it, i.e., inlined.

If you're using C, you can use attributes to remove the
stack frame or to place the instruction inline. A procedure
without a stack frame is called "naked" in C parlance.
So, you can end up with the same code as assembly with
optimization, or very similar code, i.e., maybe with a
few extra instructions to save registers. Of course,
the C compiler "knows" which registers are in use and
can generate the correct code for this every time.

In other words, if you're coding a standalone program in
assembly, it's best to use an assembler like MASM, NASM,
or GAS, each of which has a different assembly syntax for
x86. However, if you have any C involved, it's better to
use the host C compiler's inline assembly. The C compiler
can do many things to the code, such as inlining it,
making it naked (i.e., no stack frame or procedure
prolog and epilog), preserving in-use registers, optimizing
around the assembly code, or keeping it exactly as written
if it is marked as volatile. This just depends on which C
attributes are used. Using inline assembly in C also
eliminates extra linking and the need for an additional
assembler.

Which method do I use? Both. I use standalone assembly
when no C is involved and inlined assembly when C is.


Rod Pemberton

bilsch01

Nov 17, 2015, 8:38:07 PM
By 'inlined' you mean in the same file as the C code, preceded by _asm_
or asm?

The stack frame is necessary in many cases, is it not? But when we do
inlined those stack operations are done behind the scene?

Rod Pemberton

Nov 18, 2015, 4:41:17 PM
> By 'inlined' you mean in the same file as the C code,
> preceded by _asm_ or asm?

I mean the assembly instructions for a C function marked
as inline are placed where the inlined function is called.

This is the same as cutting out the assembly for the main
body of __get_cr0() or rdregcr0() and pasting it right
where each is called.

So, you'd have SMSW or MOV reg, CR0 placed right where
__get_cr0() or rdregcr0() was called in C, perhaps with
a few extra instructions to make it all work correctly.

Also, if the function is called multiple times, you won't
have a single piece of code for the procedure. Each call
will have its own copy of the procedure inserted locally.

> The stack frame is necessary in many cases, is it not?

It's needed to pass parameters to a C function and
possibly return some data via the stack, if passed
by reference, i.e., pointer.

> But when we do inlined those stack operations are done
> behind the scene?

All stack operations are already "behind the scenes" in
C since C doesn't require a stack.

Basically, the compiler will attempt to use registers
as much as possible for those inlined parameters, instead
of passing parameters via a stack and using a prolog and
epilog to set up and clean up. You can also tell some
C compilers to pass via registers, but that alone won't
inline the function. Unless it is marked as inline, it'll
still be called.

Philip Lantz

Nov 21, 2015, 11:20:32 PM
bilsch01 wrote:
>
> The C program that calls the assembly function that reads register cr0

[snip code]

> There are no errors generated when running the C program with sudo.
> The first printf() statement prints out but the second two printf's
> don't print. These two are after the call to the assembly function that
> reads cr0. There are no error messages. My objective is to read the
> bits in cr0, but with this behavior I won't be able to do it. I can't
> figure why I should be prevented from reading the register - reading
> causes no harm. Your suggestions will be appreciated.

You got a lot of good suggestions for improvements to your code, but
no one clearly pointed out the real problem:

You can't read CR0 from user-mode code.

(Rod alluded to this when he suggested using SMSW.) Running it with
sudo doesn't make any difference--it is still running in user mode
and the attempt to read CR0 will cause #GP.

I'm surprised you said it didn't give any errors. Doesn't your shell
print a message when a process dies with an exception?

Philip Lantz

Nov 21, 2015, 11:35:34 PM
bilsch01 wrote:

> The stack frame is necessary in many cases, is it not? But when we do
> inlined those stack operations are done behind the scene?

A frame pointer is very rarely necessary on i386 and x86-64, since they
support stack-pointer-relative addressing. A frame pointer is merely a
convenience in functions where the stack pointer changes during execution
of the function. It is fairly easy to arrange things so that the stack
pointer is changed only during the prologue and epilogue of the function,
so that offsets from the stack pointer to data stored in the stack are
consistent throughout the function.

The 8086 doesn't support stack-pointer-relative addressing, so a frame
pointer is more useful, but there is still no point in using one if
there is no data stored on the stack.

Melzzzzz

Nov 22, 2015, 12:50:48 AM
Heh, here is a quick kernel module that exposes the contents of cr0
in /proc/cr0_proc ;)

[bmaxa@maxa-pc kmodule]$ cat cr0_proc.c
#include <linux/module.h>
#include <linux/proc_fs.h>
#include <linux/seq_file.h>

static int proc_show(struct seq_file *m, void *v) {
    unsigned long long val;
    asm (
        "movq %%cr0,%%rax\n"
        "movq %%rax,%0\n"
        :"=r"(val)
        :
        :"%rax"
    );
    seq_printf(m, "cr0: %llx\n",val);
    return 0;
}

static int proc_open(struct inode *inode, struct file *file) {
    return single_open(file, proc_show, NULL);
}

static const struct file_operations proc_fops = {
    .owner = THIS_MODULE,
    .open = proc_open,
    .read = seq_read,
    .llseek = seq_lseek,
    .release = single_release,
};

static int __init proc_init(void) {
    proc_create("cr0_proc", 0, NULL, &proc_fops);
    printk("cr0_proc inserted\n");
    return 0;
}

static void __exit proc_exit(void) {
    remove_proc_entry("cr0_proc", NULL);
    printk("cr0_proc removed\n");
}

MODULE_LICENSE("GPL");
module_init(proc_init);
module_exit(proc_exit);

[bmaxa@maxa-pc kmodule]$ cat Makefile
ifneq ($(KERNELRELEASE),)
obj-m := cr0_proc.o

else
KDIR := /lib/modules/$(shell uname -r)/build
PWD := $(shell pwd)
default:
	$(MAKE) -C $(KDIR) SUBDIRS=$(PWD) modules
	rm -r -f .tmp_versions *.mod.c .*.cmd *.o *.symvers

endif

[bmaxa@maxa-pc kmodule]$ make
make -C /lib/modules/4.3.0-1-MANJARO/build SUBDIRS=/home/bmaxa/zfs/bmaxa_data/examples/kmodule modules
make[1]: Entering directory '/usr/lib/modules/4.3.0-1-MANJARO/build'
CC [M] /home/bmaxa/zfs/bmaxa_data/examples/kmodule/cr0_proc.o
Building modules, stage 2.
MODPOST 1 modules
CC /home/bmaxa/zfs/bmaxa_data/examples/kmodule/cr0_proc.mod.o
LD [M] /home/bmaxa/zfs/bmaxa_data/examples/kmodule/cr0_proc.ko
make[1]: Leaving directory '/usr/lib/modules/4.3.0-1-MANJARO/build'
rm -r -f .tmp_versions *.mod.c .*.cmd *.o *.symvers

[bmaxa@maxa-pc kmodule]$ sudo insmod ./cr0_proc.ko
[sudo] password for bmaxa:
[bmaxa@maxa-pc kmodule]$ cat /proc/cr0_proc
cr0: 80050033
[bmaxa@maxa-pc kmodule]$ sudo rmmod ./cr0_proc.ko

HTH

Alexei A. Frounze

Nov 22, 2015, 3:05:59 AM
On which CPU will this execute?

Melzzzzz

Nov 22, 2015, 3:36:03 AM
On Sat, 21 Nov 2015 23:59:00 -0800 (PST)
"Alexei A. Frounze" <alexf...@nospicedham.gmail.com> wrote:

> On which CPU will this execute?

That's not an issue ;)

Martin Str|mberg

Nov 22, 2015, 8:40:48 AM
Melzzzzz <m...@nospicedham.zzzzz.com> wrote:
> static int proc_show(struct seq_file *m, void *v) {
> unsigned long long val;
> asm (
> "movq %%cr0,%%rax\n"
> "movq %%rax,%0\n"
> :"=r"(val)
> :
> :"%rax"
> );
> seq_printf(m, "cr0: %llx\n",val);
> return 0;
> }

Why don't you use the inline assembly without clobbering around?

asm (
"movq %%cr0,%0\n"
:"=r"(val)
);

In "mov %cr0, X" X can be any general purpose register, right?

Alternatively if X only can be %rax:

asm (
"movq %%cr0,%0\n"
:"=a"(val)
);



Otherwise nice piece of code (although I haven't tested it).


--
MartinS

Martin Str|mberg

Nov 22, 2015, 8:40:58 AM
Alexei A. Frounze <alexf...@nospicedham.gmail.com> wrote:
> On which CPU will this execute?

The appearance of rax indicates amd64. This newsgroup makes that
almost certain.


--
MartinS

Melzzzzz

Dec 15, 2015, 1:21:41 AM
Oh, I think Alexei meant which CPU/core it will run on. To return a
result from each core, the number of CPUs first has to be determined,
then a thread has to be bound to each core in turn to read cr0 there.
That is more sophisticated than this example; I leave it as an
exercise for the OP ;)

bilsch01

Dec 16, 2015, 7:39:17 PM
On 12/14/2015 10:19 PM, Melzzzzz wrote:
> On Sun, 22 Nov 2015 08:14:34 +0000 (UTC)
> Martin Str|mberg <a...@nospicedham.ludd.luth.se> wrote:
>
>> Alexei A. Frounze <alexf...@nospicedham.gmail.com> wrote:
>>> On which CPU will this execute?
>>
>> The appearance of rax indicates amd64. This newsgroup makes that
>> almost certain.
>>
>>
>
>bind thread to each core and return cr0.

Can you give me a brief example of how to do this?
