%ebx/%rbx: when should it be preserved?

Victor Khimenko

Aug 18, 2015, 2:43:31 PM

to Native Client Discuss

Looks like CLANG and GCC have the opposite ideas WRT when %ebx/%rbx must be preserved. Clang preserves %rbx in x86-64 mode and there is the appropriate comment:

$ cat pepper_canary/toolchain/linux_pnacl/lib/clang/3.7.0/include/cpuid.h

...

/* x86-64 uses %rbx as the base register, so preserve it. */

...

__asm(" xchgq %%rbx,%q1\n" \

...

but GCC does the opposite:

$ cat pepper_canary/toolchain/linux_x86_glibc/lib/gcc/x86_64-nacl/4.4.3/include/cpuid.h

...

#if defined(__i386__) && defined(__PIC__)

/* %ebx may be the PIC register. */

...

__asm__ ("xchg{l}\t{%%}ebx, %1\n\t"

...

They couldn't both be right and still be compatible, so... which compiler does it wrong?

Mark Seaborn

Aug 18, 2015, 3:03:43 PM

to Native Client Discuss

"Should" according to what requirements?

You might be referring to calling conventions here, but that's not applicable, because this is inline assembly. The only question is whether this inline asm() is declaring the correct register constraints. I had a look at the code you referenced and it looks OK to me.

There's a separate question of whether GCC/LLVM handle inline assembly that clobbers %ebx/%rbx or reject it at compile time. GCC used not to handle that, but LLVM does, and I think newer versions of GCC might accept it now.

Cheers,

Mark

Roland McGrath

Aug 18, 2015, 5:05:58 PM

to native-cli...@googlegroups.com

Mark said it right.

In the x86-32 ABI, %ebx is a callee-saves register in general, but is special for PIC code. The ABI requires pointing %ebx at your GOT before calling into your PLT.

Because of this requirement, GCC used to simply make %ebx a fixed register under -fPIC. The compiler has been smarter than that for many years now, but I'm not sure about its ability to handle asm clobbers of %ebx in a function that makes PLT calls. Historically the compiler would barf on this, so code was explicitly written to save and restore %ebx inside the asm so that the compiler could continue to believe the register was never touched.

In the x86-64 ABI under the "small" code model (-mcmodel=small, the default), %rbx is just another callee-saves register; there is nothing special about it. No special register setup is required for PLT calls. In some of the other code models, %rbx is used by the compiler as some sort of base pointer for data. I don't understand the details, but I don't think it is part of any ABI. It's just the GCC implementation's choice to reserve %rbx as a fixed register in these cases. Those are the situations in which it might not be able to handle asm clauses that clobber %rbx.
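For illustration, here is a minimal sketch of the two approaches discussed in this thread: declaring %ebx/%rbx as an asm output and letting the compiler cope, versus saving and restoring it inside the asm with xchg so the compiler never sees it change. This is not the code from either cpuid.h. The names cpuid_regs, cpuid_direct, cpuid_preserve_bx and cpuid_vendor are hypothetical, and it assumes GCC/Clang extended asm syntax. On non-x86 targets both functions simply return zeros.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical container for the four CPUID output registers. */
typedef struct { uint32_t eax, ebx, ecx, edx; } cpuid_regs;

/* Approach 1: declare %ebx as an output ("=b") and let the compiler
   save/restore it as needed. Per the thread, older GCC could reject
   this when %ebx was reserved (e.g. as the i386 PIC register). */
static cpuid_regs cpuid_direct(uint32_t leaf) {
    cpuid_regs r = {0, 0, 0, 0};
#if defined(__x86_64__) || defined(__i386__)
    __asm__ volatile("cpuid"
                     : "=a"(r.eax), "=b"(r.ebx), "=c"(r.ecx), "=d"(r.edx)
                     : "0"(leaf), "2"((uint32_t)0));
#else
    (void)leaf; /* no CPUID here; return zeros */
#endif
    return r;
}

/* Approach 2: swap %ebx/%rbx into a scratch register around cpuid, so
   that from the compiler's point of view the register is untouched. */
static cpuid_regs cpuid_preserve_bx(uint32_t leaf) {
    cpuid_regs r = {0, 0, 0, 0};
#if defined(__x86_64__)
    uint64_t bx; /* receives cpuid's %rbx result via the swap */
    __asm__ volatile("xchgq %%rbx, %q1\n\t"
                     "cpuid\n\t"
                     "xchgq %%rbx, %q1"
                     : "=a"(r.eax), "=&r"(bx), "=c"(r.ecx), "=d"(r.edx)
                     : "0"(leaf), "2"((uint32_t)0));
    r.ebx = (uint32_t)bx;
#elif defined(__i386__)
    __asm__ volatile("xchgl %%ebx, %1\n\t"
                     "cpuid\n\t"
                     "xchgl %%ebx, %1"
                     : "=a"(r.eax), "=&r"(r.ebx), "=c"(r.ecx), "=d"(r.edx)
                     : "0"(leaf), "2"((uint32_t)0));
#else
    (void)leaf; /* no CPUID here; return zeros */
#endif
    return r;
}

/* Assemble the 12-byte vendor string from leaf 0 (order: ebx, edx, ecx). */
static void cpuid_vendor(cpuid_regs r, char out[13]) {
    assert(out != NULL);
    memcpy(out + 0, &r.ebx, 4);
    memcpy(out + 4, &r.edx, 4);
    memcpy(out + 8, &r.ecx, 4);
    out[12] = '\0';
}
```

On x86, passing cpuid_direct(0) and cpuid_preserve_bx(0) to cpuid_vendor should produce the same vendor string. The only difference between the two is what the compiler is told about %ebx/%rbx, which is exactly the register-constraint question Mark raised.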

Victor Khimenko

Aug 18, 2015, 5:20:14 PM

to Native Client Discuss

On Tue, Aug 18, 2015 at 9:03 PM, Mark Seaborn <msea...@chromium.org> wrote:

> On 18 August 2015 at 11:43, Victor Khimenko <kh...@chromium.org> wrote:
>> Looks like CLANG and GCC have the opposite ideas WRT when %ebx/%rbx must be preserved. Clang preserves %rbx in x86-64 mode and there is the appropriate comment:
>>
>> $ cat pepper_canary/toolchain/linux_pnacl/lib/clang/3.7.0/include/cpuid.h
>> ...
>> /* x86-64 uses %rbx as the base register, so preserve it. */
>> ...
>> __asm(" xchgq %%rbx,%q1\n" \
>> ...
>>
>> but GCC does the opposite:
>>
>> $ cat pepper_canary/toolchain/linux_x86_glibc/lib/gcc/x86_64-nacl/4.4.3/include/cpuid.h
>> ...
>> #if defined(__i386__) && defined(__PIC__)
>> /* %ebx may be the PIC register. */
>> ...
>> __asm__ ("xchg{l}\t{%%}ebx, %1\n\t"
>> ...
>>
>> They couldn't both be right and still be compatible, so... which compiler does it wrong?
>
> "Should" according to what requirements?

That's the question. Why does LLVM try to preserve %rbx in the 64-bit case while GCC tries to preserve %ebx in the 32-bit case? There was a mix-up of clang/gcc headers in our build system, where clang used gcc's headers. This is fixed now, but it looked strange to us that the clang and gcc headers preserve %xBX in precisely the opposite cases.

> You might be referring to calling conventions here, but that's not applicable, because this is inline assembly. The only question is whether this inline asm() is declaring the correct register constraints. I had a look at the code you referenced and it looks OK to me.
>
> There's a separate question of whether GCC/LLVM handle inline assembly that clobbers %ebx/%rbx or reject it at compile time. GCC used not to handle that, but LLVM does, and I think newer versions of GCC might accept it now.

GCC 5+ does not have any "xchg" magic in there, so we can assume it handles these cases fine now. But why is "xchg" in LLVM's header if LLVM handles everything just fine?

Victor Khimenko

Aug 18, 2015, 5:23:09 PM

to Native Client Discuss

I can understand why gcc does what it does: our old version of gcc has probably only ever supported the "small" code model for x86-64, and thus only needed to save %ebx in the 32-bit case. OK, that makes sense. But LLVM does not save %ebx in that case! Instead it saves %rbx in the 64-bit case, even in the small model! THAT does not make any sense to me...

Mark Seaborn

Aug 18, 2015, 5:32:35 PM

to Native Client Discuss

I don't know the answer for LLVM. I guess you'd have to dig through LLVM's version history to find the explanation given when that code was added, and maybe ask whoever wrote it. Maybe the code was cargo-culted and there's no particular reason.