[llvm-dev] How to debug if LTO generate wrong code?


Shi, Steven via llvm-dev

May 13, 2016, 10:18:46 AM
to llvm-dev, cfe...@lists.llvm.org

Hello,

I'm enabling clang LTO to reduce the code size of UEFI (http://www.uefi.org/) firmware (https://github.com/tianocore/edk2), which is mostly C code. My project is the llvm branch of https://github.com/shijunjing/edk2: https://github.com/shijunjing/edk2/tree/llvm. Most of my firmware modules work well after enabling LTO, but some X64 modules will not run (e.g. they hang with a CPU exception), and these X64 modules work well when built with LTO disabled (-fno-lto).

I don't know how to efficiently debug this wrong code from LTO or how to investigate whether there is a compiler bug. I would appreciate it if anyone could give me some suggestions about methods, commands, or best-known methods (BKMs) for debugging clang LTO issues.

Below are my clang LTO build tools and options. I use the clang 3.8 release with binutils 2.26 ld (I've pushed for ld to support the LLVM gold plugin: https://sourceware.org/bugzilla/show_bug.cgi?id=20070). Any suggestion is welcome!

 

 

 

##################

# CLANGLTO38 X64 definitions

##################

*_CLANGLTO38_X64_OBJCOPY_PATH         = DEF(GCC53_X64_PREFIX)objcopy

*_CLANGLTO38_X64_CC_PATH              = DEF(CLANG38_X64_PREFIX)clang

*_CLANGLTO38_X64_SLINK_PATH           = DEF(CLANG38_X64_PREFIX)llvm-ar

*_CLANGLTO38_X64_DLINK_PATH           = DEF(CLANG38_X64_PREFIX)clang

*_CLANGLTO38_X64_ASM_PATH             = DEF(CLANG38_X64_PREFIX)clang

*_CLANGLTO38_X64_PP_PATH              = DEF(CLANG38_X64_PREFIX)clang

*_CLANGLTO38_X64_RC_PATH              = DEF(GCC53_X64_PREFIX)objcopy

 

*_CLANGLTO38_X64_CC_FLAGS = -c -fshort-wchar -fno-strict-aliasing -Wall -Werror -Wno-array-bounds -Wno-empty-body -ffunction-sections -fdata-sections -include AutoGen.h -DSTRING_ARRAY_NAME=$(BASE_NAME)Strings -fno-stack-protector -fno-builtin -mms-bitfields -Wno-address -Wno-shift-negative-value -Wno-parentheses-equality -Wno-unknown-pragmas -Wno-tautological-constant-out-of-range-compare -Wno-incompatible-library-redeclaration -target x86_64-pc-linux-gnu -fno-asynchronous-unwind-tables -m64 -Wno-enum-conversion "-DEFIAPI=__attribute__((ms_abi))" -mno-red-zone -mcmodel=large -g -Os -flto 

*_CLANGLTO38_X64_DLINK_FLAGS  = -flto -nostdlib -Wl,-n -Wl,-q -Wl,--gc-sections -Wl,-z,common-page-size=0x40 -Wl,--entry,$(IMAGE_ENTRY_POINT) -Wl,-u,$(IMAGE_ENTRY_POINT) -Wl,-Map,$(DEST_DIR_DEBUG)/$(BASE_NAME).map -Wl,-melf_x86_64 -Wl,--oformat=elf64-x86-64

*_CLANGLTO38_X64_ASM_FLAGS            = -c -x assembler -imacros $(DEST_DIR_DEBUG)/AutoGen.h -m64 -target x86_64-pc-linux-gnu

*_CLANGLTO38_X64_RC_FLAGS             = -I binary -O elf64-x86-64        -B i386    --rename-section .data=.hii

*_CLANGLTO38_X64_NASM_FLAGS           = -f elf64

 

 

 

Steven Shi

Intel\SSG\STO\UEFI Firmware

 

Tel: +86 021-61166522

iNet: 821-6522

 

Umesh Kalappa via llvm-dev

May 13, 2016, 2:14:18 PM
to Shi, Steven, llvm-dev, cfe...@lists.llvm.org
Steven,

The brute-force method is to get the disassembly of the function that
hangs and check the differences in the generated code with and without
LTO.

Or try attaching gdb and checking which instruction causes the exception.

Thank you
~Umesh
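
Editor's sketch of the brute-force comparison suggested above: a minimal Python helper that diffs two `objdump -d` listings of the same function after stripping addresses and raw bytes, so that only real instruction changes show up. The function names and the regular expressions are illustrative assumptions, not part of any LLVM or binutils tool, and the masking is a heuristic that may need tuning for other listings.

```python
import difflib
import re

# Address and raw-byte prefix that objdump prints before each instruction,
# e.g. "  4004f0:       48 89 e5                ".
_PREFIX = re.compile(r"^\s*[0-9a-f]+:\s+(?:[0-9a-f]{2}(?:\s|$))+\s*")
# Long hex numbers in operands are usually absolute addresses that shift
# between builds; short ones (stack offsets, small immediates) are kept.
_ADDR = re.compile(r"\b(?:0x)?[0-9a-f]{4,}\b")


def normalize(listing: str) -> list[str]:
    """Reduce an objdump listing to 'mnemonic operands' with addresses masked."""
    out = []
    for line in listing.splitlines():
        if not _PREFIX.match(line):
            continue  # source lines, labels, blank lines
        insn = _PREFIX.sub("", line).strip()
        if not insn:
            continue  # continuation line that carries only instruction bytes
        parts = insn.split(None, 1)
        operands = _ADDR.sub("ADDR", parts[1]) if len(parts) > 1 else ""
        out.append(f"{parts[0]} {operands}".strip())
    return out


def diff_disassembly(without_lto: str, with_lto: str) -> list[str]:
    """Unified diff of the two normalized listings (empty list if identical)."""
    return list(difflib.unified_diff(normalize(without_lto), normalize(with_lto),
                                     "no-lto", "lto", lineterm=""))
```

Feeding it the text of the same function from the two builds (for example, saved `objdump -d` output) gives a short diff to read instead of two full listings.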

_______________________________________________
LLVM Developers mailing list
llvm...@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Shi, Steven via llvm-dev

May 16, 2016, 10:40:21 AM
to Umesh Kalappa, llvm-dev, cfe...@lists.llvm.org

Hi Umesh,

Thank you for the suggestion. I can use the brute-force method to narrow down the wrong LTO instructions here and there, but I still don't know why these wrong instructions are generated, or how to stop Clang LTO from generating them.

I suspect the wrong code is caused by a faulty LTO optimization pass, so I hope to first disable all optimizations in LTO and then enable them one by one to narrow down the root cause. But when I try to disable optimization by forcing -O0 in the LTO build, ld fails to recognize some clang bitcode libraries and fails to link.

 

For example, use the Clang_LTO_Fails_On_LD example attached to the bug below:

https://sourceware.org/bugzilla/show_bug.cgi?id=20070

 

If I force -O0 to disable optimization in LTO, ld fails to link:

~/clang38/bin/clang -o Hello.dll -flto -O0 -nostdlib -Wl,-n -Wl,-q -Wl,--gc-sections -Wl,-z,common-page-size=0x40 -Wl,--entry,_ModuleEntryPoint -Wl,-u,_ModuleEntryPoint -Wl,-Map,Hello.map -Wl,-melf_x86_64 -Wl,--oformat=elf64-x86-64 -Wl,--start-group,,@static_library_files.lst -Wl,--end-group

BaseLib.lib: error adding symbols: File format not recognized

clang-3.8: error: linker command failed with exit code 1 (use -v to see invocation)

 

But if I use -O1, -O2, or a higher -On, the ld link passes:

~/clang38/bin/clang -o Hello.dll -flto -O1 -nostdlib -Wl,-n -Wl,-q -Wl,--gc-sections -Wl,-z,common-page-size=0x40 -Wl,--entry,_ModuleEntryPoint -Wl,-u,_ModuleEntryPoint -Wl,-Map,Hello.map -Wl,-melf_x86_64 -Wl,--oformat=elf64-x86-64 -Wl,--start-group,,@static_library_files.lst -Wl,--end-group

 

 

How can I correctly disable all optimizations in clang LTO? How can I find out about, enable, and disable specific optimizations in clang LTO? Any suggestion is welcome!
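
Editor's sketch: one hedged way to automate the "enable them one by one" idea is a binary search over how many optimizations are allowed to run. To my knowledge, LLVM releases newer than the 3.8 used in this thread document an `-opt-bisect-limit=N` option that skips optional passes after the Nth; whether and how that option reaches the LTO code generator depends on the linker and plugin, so treat that part as an assumption. The code below shows only the search itself. `is_good` is a hypothetical callback that would rebuild the module with the given limit and report whether it still boots.

```python
from typing import Callable


def first_bad_limit(is_good: Callable[[int], bool], upper: int) -> int:
    """Return the smallest limit in 1..upper for which is_good(limit) is False.

    Assumes is_good(0) is True, is_good(upper) is False, and that results are
    monotonic: once a limit produces a bad build, every larger limit does too.
    """
    lo, hi = 0, upper  # invariant: lo is known good, hi is known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_good(mid):
            lo = mid
        else:
            hi = mid
    return hi


if __name__ == "__main__":
    # Stand-in for "rebuild and boot": pretend optimization number 37 breaks it.
    print(first_bad_limit(lambda limit: limit < 37, upper=500))  # 37
```

The same search works at a coarser granularity, for instance over a list of object files where the first k are built with -flto and the rest without.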

 

 

 


Shi, Steven via llvm-dev

May 17, 2016, 4:33:43 AM
to llvm-dev, cfe...@lists.llvm.org

Hello,

Let me ask a simple LTO question again. For the LLVM LTO example at http://llvm.org/docs/LinkTimeOptimization.html, I use the build commands below to generate binaries at three different optimization levels: -O0, -O1, and -O2. By listing the foo1~4 symbols with nm, I can see that these different optimizations really work.

1. How can I know which optimizations clang LTO uses at -O0, -O1, and -O2?

2. Is it compiler (e.g. clang/llvm) optimization or linker (e.g. ld) optimization that makes these differences?

3. How can I explicitly enable or disable these specific optimizations other than by using -O0, -O1, -O2?

 

 

$clang -emit-llvm -c main.c -o main.bc

$clang -emit-llvm -c a.c -o a.bc

$llvm-ar cr main.lib main.bc

$llvm-ar cr a.lib a.bc

$clang -O0 -flto main.lib a.lib -o main0

$clang -O1 -flto main.lib a.lib -o main1

$clang -O2 -flto main.lib a.lib -o main2

 

$nm main0

00000000004005a0 t foo1

0000000000400580 t foo2

00000000004005e0 t foo3

0000000000400530 t foo4

0000000000400500 t frame_dummy

$ nm main1

0000000000400550 t foo1

0000000000400580 t foo3

0000000000400530 t foo4

0000000000400500 t frame_dummy

$ nm main2

00000000004004d0 t frame_dummy
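
Editor's sketch: the nm comparison above can be scripted so that the symbols removed at each optimization level are listed directly. This is a minimal Python example using the nm lines quoted in this message; the helper name `symbols` is illustrative.

```python
def symbols(nm_output: str) -> set[str]:
    """Collect symbol names from nm output."""
    names = set()
    for line in nm_output.splitlines():
        fields = line.split()
        if len(fields) == 3:    # defined symbol: address, type, name
            names.add(fields[2])
        elif len(fields) == 2:  # undefined symbol: type, name
            names.add(fields[1])
    return names


main0 = """00000000004005a0 t foo1
0000000000400580 t foo2
00000000004005e0 t foo3
0000000000400530 t foo4
0000000000400500 t frame_dummy"""
main1 = """0000000000400550 t foo1
0000000000400580 t foo3
0000000000400530 t foo4
0000000000400500 t frame_dummy"""
main2 = "00000000004004d0 t frame_dummy"

print(sorted(symbols(main0) - symbols(main1)))  # ['foo2']
print(sorted(symbols(main1) - symbols(main2)))  # ['foo1', 'foo3', 'foo4']
```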

 

From the verbose output below, it seems only the linker (e.g. ld) is involved in doing the optimization?

 

$ clang -O2 -flto main.lib a.lib -o main2 -v

clang version 3.8.0 (tags/RELEASE_380/final)

Target: x86_64-unknown-linux-gnu

Thread model: posix

InstalledDir: /usr/local/bin

Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.9

Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.9.3

Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/5.3.1

Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/6.0.0

Found candidate GCC installation: /usr/local/bin/../lib/gcc/x86_64-pc-linux-gnu/7.0.0

Selected GCC installation: /usr/local/bin/../lib/gcc/x86_64-pc-linux-gnu/7.0.0

Candidate multilib: .;@m64

Candidate multilib: 32;@m32

Selected multilib: .;@m64

"/usr/bin/ld" -z relro --hash-style=gnu --build-id --eh-frame-hdr -m elf_x86_64 -dynamic-linker /lib64/ld-linux-x86-64.so.2 -o main2 /usr/lib/x86_64-linux-gnu/crt1.o /usr/lib/x86_64-linux-gnu/crti.o /usr/local/bin/../lib/gcc/x86_64-pc-linux-gnu/7.0.0/crtbegin.o -L/usr/local/bin/../lib/gcc/x86_64-pc-linux-gnu/7.0.0 -L/usr/local/bin/../lib/gcc/x86_64-pc-linux-gnu/7.0.0/../../../../lib64 -L/usr/local/bin/../lib64 -L/lib/x86_64-linux-gnu -L/lib/../lib64 -L/usr/lib/x86_64-linux-gnu -L/usr/local/bin/../lib/gcc/x86_64-pc-linux-gnu/7.0.0/../../.. -L/usr/local/bin/../lib -L/lib -L/usr/lib -plugin /usr/local/bin/../lib/LLVMgold.so -plugin-opt=mcpu=x86-64 -plugin-opt=O2 main.lib a.lib -lgcc --as-needed -lgcc_s --no-as-needed -lc -lgcc --as-needed -lgcc_s --no-as-needed /usr/local/bin/../lib/gcc/x86_64-pc-linux-gnu/7.0.0/crtend.o /usr/lib/x86_64-linux-gnu/crtn.o

Mehdi Amini via llvm-dev

May 17, 2016, 11:52:33 AM
to Shi, Steven, llvm-dev, cfe...@lists.llvm.org
On May 17, 2016, at 1:33 AM, Shi, Steven via llvm-dev <llvm...@lists.llvm.org> wrote:

> Hello,
> Let me ask a simple LTO question again. For the LLVM LTO example at http://llvm.org/docs/LinkTimeOptimization.html, I use the build commands below to generate binaries at three different optimization levels: -O0, -O1, and -O2. By listing the foo1~4 symbols with nm, I can see that these different optimizations really work.
> 1. How can I know which optimizations clang LTO uses at -O0, -O1, and -O2?

LTO is linker specific; clang is only forwarding the option to the linker here.

> 2. Is it compiler (e.g. clang/llvm) optimization or linker (e.g. ld) optimization that makes these differences?

In your case, you invoke clang with "-emit-llvm" and without any optimization level, so you get -O0.
As for what the linker does at these optimization levels, again this is linker specific.

> 3. How can I explicitly enable or disable these specific optimizations other than by using -O0, -O1, -O2?

If you're talking about LTO, this is linker specific again (ld is not the same program on every platform). For instance, there is no such thing as O0/O1/O2 on OS X.


 
 
> From the verbose output below, it seems only the linker (e.g. ld) is involved in doing the optimization?

Yes.
Usually the LTO pipeline is a bit different from what you're doing. I'm used to seeing:

$clang -flto -O3 -c main.c -o main.o
$clang -flto -O3 -c a.c -o a.o
$clang -flto -O3 main.o a.o -o main0


-- 
Mehdi



 
 

Umesh Kalappa via llvm-dev

May 17, 2016, 2:21:40 PM
to Mehdi Amini, llvm-dev, cfe...@lists.llvm.org
Steven,

As Mehdi stated, the optimisation level is specific to the linker, and it
enables the inter-procedural optimisation passes. Please refer to the function

PassManagerBuilder::addLTOOptimizationPasses() at
http://llvm.org/docs/doxygen/html/PassManagerBuilder_8cpp_source.html

As for internal options to disable them, I don't think you can do so.

Thank you
~Umesh

> _______________________________________________
> cfe-dev mailing list
> cfe...@lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

Mehdi Amini via llvm-dev

May 17, 2016, 4:02:40 PM
to Umesh Kalappa, llvm-dev, cfe...@lists.llvm.org

> On May 17, 2016, at 11:21 AM, Umesh Kalappa <umesh.k...@gmail.com> wrote:
>
> Steven,
>
> As mehdi stated , the optimisation level is specific to linker and it
> enables Inter-Pro opts passes ,please refer function

To be very clear: the -O option may trigger *linker* optimizations as well, independently of LTO.

--
Mehdi

Shi, Steven via llvm-dev

May 29, 2016, 10:36:36 AM
to mehdi...@apple.com, Umesh Kalappa, eli...@gmail.com, llvm-dev, cfe...@lists.llvm.org

Hi Mehdi,

After deeper debugging, I found that my firmware LTO wrong-code issue is related to the X64 code model: the large model (-mcmodel=large) is always overridden with the small model (-mcmodel=small) in an LTO build. I don't know how to correctly specify the large code model for my X64 firmware LTO build. I would appreciate it if you could let me know how.

Parts of my UEFI firmware (BIOS) have to be loaded and run at high addresses (above 2 GB) from the very beginning, and I need code that makes absolutely no assumptions about the addresses of the code and data sections. But the current LLVM LTO seems to stick to the small code model and generates a lot of code with 32-bit RIP-relative addressing, which causes CPU exceptions when run at addresses above 2 GB.

Below, I simply reuse Eli's codemodel1.c example (link: http://eli.thegreenplace.net/2012/01/03/understanding-the-x64-code-models) to show the LLVM LTO code model issue.

$ clang -g -O0 codemodel1.c -mcmodel=large -o codemodel1_large.bin

$ clang -g -O0 codemodel1.c -mcmodel=small -o codemodel1_small.bin

$ clang -g -O0 -flto codemodel1.c -mcmodel=large -o codemodel1_large_lto.bin

$ clang -g -O0 -flto codemodel1.c -mcmodel=small -o codemodel1_small_lto.bin

 

You will see that codemodel1_large_lto.bin and codemodel1_small_lto.bin are exactly the same!

And if you disassemble codemodel1_large_lto.bin, you will see that it uses small-code-model addressing (32-bit call displacements and 32-bit absolute data addresses), not large, as shown below.

 

$ objdump -dS codemodel1_large_lto.bin

 

int main(int argc, const char* argv[])

{

  4004f0:       55                      push   %rbp

  4004f1:       48 89 e5                mov    %rsp,%rbp

  4004f4:       48 83 ec 20             sub    $0x20,%rsp

  4004f8:       c7 45 fc 00 00 00 00    movl   $0x0,-0x4(%rbp)

  4004ff:       89 7d f8                mov    %edi,-0x8(%rbp)

  400502:       48 89 75 f0             mov    %rsi,-0x10(%rbp)

    int t = global_func(argc);

  400506:       8b 7d f8                mov    -0x8(%rbp),%edi

  400509:       e8 d2 ff ff ff          callq  4004e0 <global_func>

  40050e:       89 45 ec                mov    %eax,-0x14(%rbp)

    t += global_arr[7];

  400511:       8b 04 25 4c 10 60 00    mov    0x60104c,%eax

  400518:       03 45 ec                add    -0x14(%rbp),%eax

  40051b:       89 45 ec                mov    %eax,-0x14(%rbp)

    t += static_arr[7];

  40051e:       8b 04 25 dc 11 60 00    mov    0x6011dc,%eax

  400525:       03 45 ec                add    -0x14(%rbp),%eax

  400528:       89 45 ec                mov    %eax,-0x14(%rbp)

    t += global_arr_big[7];

  40052b:       8b 04 25 6c 13 60 00    mov    0x60136c,%eax

  400532:       03 45 ec                add    -0x14(%rbp),%eax

  400535:       89 45 ec                mov    %eax,-0x14(%rbp)

    t += static_arr_big[7];

  400538:       8b 04 25 ac 20 63 00    mov    0x6320ac,%eax

  40053f:       03 45 ec                add    -0x14(%rbp),%eax

  400542:       89 45 ec                mov    %eax,-0x14(%rbp)

    return t;

  400545:       8b 45 ec                mov    -0x14(%rbp),%eax

  400548:       48 83 c4 20             add    $0x20,%rsp

  40054c:       5d                      pop    %rbp

  40054d:       c3                      retq

  40054e:       66 90                   xchg   %ax,%ax

 

 

So, does LTO support the large code model? How do I correctly specify the LTO code model option?
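
Editor's sketch: a rough way to check many modules for this symptom is to count addressing patterns in `objdump -d` output. In the listings in this thread, the large-model build loads 64-bit addresses with `movabs` and calls through a register, while the small-model build calls a 32-bit displacement directly. The Python helper below is only a heuristic under that assumption (a `movabs` can also appear for ordinary 64-bit constants), and its names are illustrative.

```python
import re

_MOVABS = re.compile(r"\bmovabs\b")
_DIRECT_CALL = re.compile(r"\bcallq?\s+[0-9a-f]+\s+<")  # callq 4004e0 <sym>
_INDIRECT_CALL = re.compile(r"\bcallq?\s+\*")           # callq *%rax


def addressing_summary(listing: str) -> dict[str, int]:
    """Count instruction patterns that hint at the code model in use."""
    counts = {"movabs": 0, "direct_call": 0, "indirect_call": 0}
    for line in listing.splitlines():
        if _MOVABS.search(line):
            counts["movabs"] += 1
        if _DIRECT_CALL.search(line):
            counts["direct_call"] += 1
        if _INDIRECT_CALL.search(line):
            counts["indirect_call"] += 1
    return counts
```

A module that was meant to be built with -mcmodel=large but shows direct calls and no `movabs` at all is a candidate for the override described above.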

 

 


Mehdi Amini via llvm-dev

May 29, 2016, 4:28:07 PM
to Shi, Steven, llvm-dev, cfe...@lists.llvm.org
Hi,


Same answer as before: LTO is set up by the linker, so the option for that, if it exists, will be linker specific.

As far as I can tell, neither the libLTO-based linkers (ld64 on OS X, for example) nor the gold plugin support such an option, and the code model is always "default".

I don't know about lld; CC'ing Rafael about that.

-- 
Mehdi

Davide Italiano via llvm-dev

May 29, 2016, 5:18:58 PM
to Mehdi Amini, llvm-dev, cfe...@lists.llvm.org

lld doesn't either (yet), to the best of my knowledge.

Cheers,

--
Davide

Shi, Steven via llvm-dev

May 29, 2016, 8:10:23 PM
to mehdi...@apple.com, llvm-dev, cfe...@lists.llvm.org

Hi Mehdi,

GCC LTO seems to support the large code model on my side, as shown below. If the code model is linker specific, does GCC LTO use a special linker that is different from the one in GNU Binutils?

I'm a bit surprised that neither OS X ld64 nor the gold plugin supports the large code model in LTO. Since modern systems widely use 64-bit, needing code to run at high addresses (above 2 GB) is a reasonable requirement.

 

$ gcc -g -O0 -flto codemodel1.c -mcmodel=large -o codemodel1_large_lto_gcc.bin

$ objdump -dS codemodel1_large_lto_gcc.bin

 

int main(int argc, const char* argv[])

{

  40048b:       55                      push   %rbp

  40048c:       48 89 e5                mov    %rsp,%rbp

  40048f:       48 83 ec 20             sub    $0x20,%rsp

  400493:       89 7d ec                mov    %edi,-0x14(%rbp)

  400496:       48 89 75 e0             mov    %rsi,-0x20(%rbp)

    int t = global_func(argc);

  40049a:       8b 45 ec                mov    -0x14(%rbp),%eax

  40049d:       89 c7                   mov    %eax,%edi

  40049f:       48 b8 76 04 40 00 00    movabs $0x400476,%rax

  4004a6:       00 00 00

  4004a9:       ff d0                   callq  *%rax

  4004ab:       89 45 fc                mov    %eax,-0x4(%rbp)

    t += global_arr[7];

  4004ae:       48 b8 20 09 60 00 00    movabs $0x600920,%rax

  4004b5:       00 00 00

  4004b8:       8b 40 1c                mov    0x1c(%rax),%eax

  4004bb:       01 45 fc                add    %eax,-0x4(%rbp)

    t += static_arr[7];

  4004be:       48 b8 c0 0a 60 00 00    movabs $0x600ac0,%rax

  4004c5:       00 00 00

  4004c8:       8b 40 1c                mov    0x1c(%rax),%eax

  4004cb:       01 45 fc                add    %eax,-0x4(%rbp)

    t += global_arr_big[7];

  4004ce:       48 b8 60 0c 60 00 00    movabs $0x600c60,%rax

  4004d5:       00 00 00

  4004d8:       8b 40 1c                mov    0x1c(%rax),%eax

  4004db:       01 45 fc                add    %eax,-0x4(%rbp)

    t += static_arr_big[7];

  4004de:       48 b8 a0 19 63 00 00    movabs $0x6319a0,%rax

  4004e5:       00 00 00

  4004e8:       8b 40 1c                mov    0x1c(%rax),%eax

  4004eb:       01 45 fc                add    %eax,-0x4(%rbp)

    return t;

  4004ee:       8b 45 fc                mov    -0x4(%rbp),%eax

}

 


Mehdi Amini via llvm-dev

May 29, 2016, 8:17:01 PM
to Shi, Steven, llvm-dev, cfe...@lists.llvm.org
On May 29, 2016, at 5:10 PM, Shi, Steven <steve...@intel.com> wrote:

> Hi Mehdi,
> GCC LTO seems to support the large code model on my side, as shown below. If the code model is linker specific, does GCC LTO use a special linker that is different from the one in GNU Binutils?

I don't know anything about GCC.
(And I doubt the GNU linker supports LTO with LLVM).

> I'm a bit surprised that neither OS X ld64 nor the gold plugin supports the large code model in LTO. Since modern systems widely use 64-bit, needing code to run at high addresses (above 2 GB) is a reasonable requirement.

The fact that we don't support it for now seems to indicate that it is not a widely requested feature, especially considering that it is really a trivial option to add.
What is the linker you're using? Are you building your own clang?

-- 
Mehdi

Joerg Sonnenberger via llvm-dev

May 29, 2016, 8:27:24 PM
to cfe...@lists.llvm.org, llvm-dev
On Mon, May 30, 2016 at 12:10:09AM +0000, Shi, Steven via cfe-dev wrote:
> I'm a bit surprised if both OS X ld64 and gold plugin do not support
> large code model in LTO. Since modern system widely use the 64bit, the
> code need to run in high address (larger than 2 GB) is a reasonable requirement.

Actually, given that PIC is (almost) free in terms of codegen, there is
rarely a need for the large model on AMD64. Programs with more than 2GB
of static data are moderately rare and programs with more than 2GB of
text even more so.

Joerg

Shi, Steven via llvm-dev

May 29, 2016, 8:44:50 PM
to mehdi...@apple.com, llvm-dev, cfe...@lists.llvm.org

> (And I doubt the GNU linker supports LTO with LLVM).

[Steven]: I've pushed for GNU Binutils ld to support the LLVM gold plugin; see the details in this bug: https://sourceware.org/bugzilla/show_bug.cgi?id=20070. The new GNU ld linker works well with LLVM/Clang LTO when building IA32 code on my side. And according to the ld owner's comments in the bug, the current X64 LLVM LTO issue is in the LLVM LTO plugin.

> The fact that we don't support it for now seems to indicate that it is not a widely requested feature, especially considering that it is really a trivial option to add.
> What is the linker you're using? Are you building your own clang?

[Steven]: I'm using standard LLVM 3.8 with the new GNU ld linker above. I can build my own clang on my side if needed. I'm happy to know that it is not difficult to enable the large code model in LLVM LTO and that "it is really a trivial option to add". Could you let me know how to enable it? A lot of my work has been blocked by the large code model issue. Thank you!

Shi, Steven via llvm-dev

May 29, 2016, 10:12:18 PM
to llvm-dev
Hi Joerg,
My firmware case is that I need to load my firmware to run at a high address, which the hardware requires. It is not that my firmware really needs large static data or text. My firmware modules are shared libraries (like DLLs) built with "-fpic", and the firmware loader will load them at high addresses (above 2 GB). I think this need is quite common in system software such as firmware, drivers, and kernel-mode code.



Mehdi Amini via llvm-dev

May 30, 2016, 2:13:54 AM
to Shi, Steven, llvm-dev, cfe...@lists.llvm.org


I can't test it locally, but here is a starting point in the gold plugin, inspired by the code present in clang:

code-model-gold.patch

Shi, Steven via llvm-dev

May 30, 2016, 2:28:57 AM
to mehdi...@apple.com, llvm-dev, cfe...@lists.llvm.org

Hi Mehdi,

Should I first apply your attached patch to my LLVM 3.8 source, or should I use the latest LLVM SVN trunk instead?

 

 


 

From: mehdi...@apple.com [mailto:mehdi...@apple.com]
Sent: Monday, May 30, 2016 2:13 PM
To: Shi, Steven <steve...@intel.com>
Cc: Umesh Kalappa <umesh.k...@gmail.com>; eli...@gmail.com; llvm-dev <llvm...@lists.llvm.org>; cfe...@lists.llvm.org; Rafael Espíndola <rafael.e...@gmail.com>
Subject: Re: [cfe-dev] [llvm-dev] How to debug if LTO generate wrong code?

 

 


 

You need to use your linker-specific way of passing the option "-lto-use-large-codemodel=..." to the plugin.

 

Let me know if it works for you!

 

-- 

Mehdi
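
Editor's sketch: with the clang driver and the gold plugin setup shown earlier in the thread, plugin options appear on the ld command line as `-plugin-opt=...`, and the usual way to hand one through the driver is `-Wl,-plugin-opt=...`. The Python helper below only builds such arguments. Note the assumptions: `-lto-use-large-codemodel` exists only in a plugin built with the patch from this thread, the value it accepts is elided in the quoted message, so `VALUE` is a placeholder, and the helper name is made up for illustration.

```python
def plugin_opt_args(llvm_options: list[str]) -> list[str]:
    """Wrap each option so the clang driver forwards it to the linker plugin."""
    return [f"-Wl,-plugin-opt={opt}" for opt in llvm_options]


# Hypothetical link line; VALUE stands for whatever the patched plugin accepts.
cmd = ["clang", "-flto",
       *plugin_opt_args(["-lto-use-large-codemodel=VALUE"]),
       "main.o", "a.o", "-o", "main"]
print(" ".join(cmd))
```

Other linkers use a different spelling for passing options to their LTO back end, which is what "linker-specific" refers to here.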

 



 

 


Mehdi Amini via llvm-dev

May 30, 2016, 2:32:01 AM
to Shi, Steven, llvm-dev, cfe...@lists.llvm.org
Hi Steven,


On May 29, 2016, at 11:28 PM, Shi, Steven <steve...@intel.com> wrote:

> Hi Mehdi,
> Should I first apply your attached patch to my LLVM 3.8 source, or should I use the latest LLVM SVN trunk instead?

I wrote it on trunk, but I expect it to be fairly easy to port to 3.8. This is really just quickly plumbing an option into the TargetMachine creation.

Shi, Steven via llvm-dev

May 30, 2016, 3:13:36 AM
to mehdi...@apple.com, llvm-dev, cfe...@lists.llvm.org

Hi Mehdi,

The LLVM 3.8 gold-plugin.cpp is very different from the latest one on trunk. Your patch fails to compile on LLVM 3.8, as shown below. I will try it on the latest trunk later. Thank you for your help anyway!

 

Building CXX object tools/gold/CMakeFiles/LLVMgold.dir/gold-plugin.cpp.o

cd /home/jshi19/llvm38releasebuild/tools/gold && /home/jshi19/clang38/bin/clang++   -DGTEST_HAS_RTTI=0 -D_FILE_OFFSET_BITS=64 -D_GNU_SOURCE -D_LARGEFILE_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/home/jshi19/llvm38releasebuild/tools/gold -I/home/jshi19/llvm-3.8.0.src/tools/gold -I/home/jshi19/llvm38releasebuild/include -I/home/jshi19/llvm-3.8.0.src/include -I/home/jshi19/binutils-2.26/include  -fPIC -fvisibility-inlines-hidden -Wall -W -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wcovered-switch-default -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -std=c++11 -ffunction-sections -fdata-sections -O3 -DNDEBUG -fPIC    -fno-exceptions -fno-rtti -o CMakeFiles/LLVMgold.dir/gold-plugin.cpp.o -c /home/jshi19/llvm-3.8.0.src/tools/gold/gold-plugin.cpp

/home/jshi19/llvm-3.8.0.src/tools/gold/gold-plugin.cpp:60:16: error: unknown type name 'string'; did you mean 'std::string'?
static cl::opt<string> LTOCodeModel("lto-use-large-codemodel", cl::Hidden,
               ^~~~~~
               std::string
/usr/lib/gcc/x86_64-linux-gnu/5.3.1/../../../../include/c++/5.3.1/bits/stringfwd.h:74:33: note: 'std::string' declared here
  typedef basic_string<char>    string;
                                ^
/home/jshi19/llvm-3.8.0.src/tools/gold/gold-plugin.cpp:800:9: error: no template named 'StringSwitch' in namespace 'llvm'; did you mean 'StringSet'?
  llvm::StringSwitch<unsigned>(LTOCodeModel)
  ~~~~~~^~~~~~~~~~~~
        StringSet
/home/jshi19/llvm-3.8.0.src/include/llvm/ADT/StringSet.h:23:9: note: 'StringSet' declared here
  class StringSet : public llvm::StringMap<char, AllocatorTy> {
        ^
In file included from /home/jshi19/llvm-3.8.0.src/tools/gold/gold-plugin.cpp:16:
In file included from /home/jshi19/llvm-3.8.0.src/include/llvm/ADT/DenseSet.h:17:
In file included from /home/jshi19/llvm-3.8.0.src/include/llvm/ADT/DenseMap.h:17:
In file included from /home/jshi19/llvm-3.8.0.src/include/llvm/ADT/DenseMapInfo.h:17:
In file included from /home/jshi19/llvm-3.8.0.src/include/llvm/ADT/ArrayRef.h:13:
In file included from /home/jshi19/llvm-3.8.0.src/include/llvm/ADT/Hashing.h:49:
In file included from /home/jshi19/llvm-3.8.0.src/include/llvm/Support/Host.h:17:
/home/jshi19/llvm-3.8.0.src/include/llvm/ADT/StringMap.h:228:12: error: multiple overloads of 'StringMap' instantiate to the same signature 'void (unsigned int)'
  explicit StringMap(AllocatorTy A)
           ^
/home/jshi19/llvm-3.8.0.src/include/llvm/ADT/StringSet.h:23:28: note: in instantiation of template class 'llvm::StringMap<char, unsigned int>' requested here
  class StringSet : public llvm::StringMap<char, AllocatorTy> {
                           ^
/home/jshi19/llvm-3.8.0.src/tools/gold/gold-plugin.cpp:800:3: note: in instantiation of template class 'llvm::StringSet<unsigned int>' requested here
  llvm::StringSwitch<unsigned>(LTOCodeModel)
  ^
/home/jshi19/llvm-3.8.0.src/include/llvm/ADT/StringMap.h:225:12: note: previous declaration is here
  explicit StringMap(unsigned InitialSize)
           ^
3 errors generated.
tools/gold/CMakeFiles/LLVMgold.dir/build.make:65: recipe for target 'tools/gold/CMakeFiles/LLVMgold.dir/gold-plugin.cpp.o' failed
make[3]: *** [tools/gold/CMakeFiles/LLVMgold.dir/gold-plugin.cpp.o] Error 1
make[3]: Leaving directory '/home/jshi19/llvm38releasebuild'
CMakeFiles/Makefile2:17855: recipe for target 'tools/gold/CMakeFiles/LLVMgold.dir/all' failed
make[2]: *** [tools/gold/CMakeFiles/LLVMgold.dir/all] Error 2
make[2]: Leaving directory '/home/jshi19/llvm38releasebuild'
CMakeFiles/Makefile2:17867: recipe for target 'tools/gold/CMakeFiles/LLVMgold.dir/rule' failed
make[1]: *** [tools/gold/CMakeFiles/LLVMgold.dir/rule] Error 2
make[1]: Leaving directory '/home/jshi19/llvm38releasebuild'
Makefile:3944: recipe for target 'LLVMgold' failed
make: *** [LLVMgold] Error 2

Joerg Sonnenberger via llvm-dev

May 30, 2016, 6:18:26 AM5/30/16
to llvm...@lists.llvm.org
On Mon, May 30, 2016 at 02:12:09AM +0000, Shi, Steven via llvm-dev wrote:
> Hi Joerg,
> My firmware case is that I need to load my firmware to run at a high
> address, which the hardware requires. It is not that my firmware really
> needs large static data and text. My firmware modules are shared libraries
> (like DLLs) built with "-fpic", and the firmware loader will load them at
> a high address (above 2GB). I think this need is quite common in
> system software such as firmware, drivers and kernel-mode code.

It sounds more like there is confusion about which symbols are internal
and which are not. The normal code model on AMD64 requires code and data
to fit into 2GB, but non-local symbols are accessed indirectly. That
doesn't happen in your case. For functions, that's normally partially
the job of the linker (via stubs), but for data the compiler has to be
aware of it. The load address is irrelevant for PIC.
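
A minimal sketch of the local versus non-local distinction, assuming an x86-64 ELF target and a -fPIC build. The variable and function names are hypothetical, and the code generation described in the comments is what compilers typically emit, not something taken from the firmware in question:

```cpp
#include <cassert>

// Internal linkage: the definition is known to live in this module, so a
// -fPIC build can typically address it directly, relative to RIP.
static int local_value = 1;

// External linkage with default visibility: in a -fPIC shared object the
// symbol may be preempted at load time, so the compiler typically emits an
// indirect access through the GOT instead of a direct RIP-relative one.
int exported_value = 2;

// Mutators keep both variables from being folded into constants.
void bump_local() { ++local_value; }
void bump_exported() { ++exported_value; }

int sum_values() { return local_value + exported_value; }

// Usage check: the C++ semantics are identical either way; only the
// addressing mode chosen by the compiler differs.
void usage_check() {
  assert(sum_values() >= 3);
}
```

Comparing the relocations for the two variables in the object file (for example with `readelf -r`) is one way to see which accesses the compiler treated as non-local.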

Shi, Steven via llvm-dev

May 30, 2016, 7:39:09 AM5/30/16
to Joerg Sonnenberger, llvm...@lists.llvm.org, cfe...@lists.llvm.org
Yes, the "normal" code model you mentioned is the small code model, which uses RIP-relative addressing, but my firmware needs the large code model, which can reside anywhere in the full 64-bit address space. So I need LLVM LTO to support the large code model.

Steven Shi
Intel\SSG\STO\UEFI Firmware

Tel: +86 021-61166522
iNet: 821-6522

> -----Original Message-----
> From: llvm-dev [mailto:llvm-dev...@lists.llvm.org] On Behalf Of Joerg
> Sonnenberger via llvm-dev
> Sent: Monday, May 30, 2016 6:18 PM
> To: llvm...@lists.llvm.org
> Subject: Re: [llvm-dev] [cfe-dev] How to debug if LTO generate wrong code?
>

Shi, Steven via llvm-dev

May 30, 2016, 8:12:39 AM5/30/16
to mehdi...@apple.com, llvm-dev, cfe...@lists.llvm.org

Hi Mehdi,

Your patch does not compile with gold-plugin.cpp in the latest trunk either; the failure is the same as with llvm 3.8, as below.

 

 

Steven Shi

Intel\SSG\STO\UEFI Firmware

 

Tel: +86 021-61166522

iNet: 821-6522

 

Mehdi Amini via llvm-dev

May 30, 2016, 12:51:11 PM5/30/16
to Shi, Steven, llvm-dev, cfe...@lists.llvm.org
I didn't try to compile it locally (I don't have/use gold), but you should be able to fix the 3 errors fairly easily (looks like missing includes mostly).
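
Reading the three diagnostics: the first asks for `std::string` in place of the unqualified `string`, the second is what the compiler reports when `llvm::StringSwitch` has not been declared (it lives in `llvm/ADT/StringSwitch.h`), and the third appears to be fallout from the compiler's `StringSet` guess. The sketch below is a standalone stand-in for the kind of string-to-code-model mapping that the `StringSwitch` call performs. It is not the patch itself: the enum, the function name and the accepted spellings are assumptions made for illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical enum; the real patch maps to an unsigned value whose
// meaning is not shown in this thread.
enum class CodeModelChoice : unsigned { Default, Small, Kernel, Medium, Large };

// Plain-C++ equivalent of a StringSwitch with a default case. The accepted
// names are assumed to mirror the usual -mcmodel spellings.
CodeModelChoice parseCodeModel(const std::string &Name) {
  if (Name == "small")
    return CodeModelChoice::Small;
  if (Name == "kernel")
    return CodeModelChoice::Kernel;
  if (Name == "medium")
    return CodeModelChoice::Medium;
  if (Name == "large")
    return CodeModelChoice::Large;
  return CodeModelChoice::Default;
}

// Usage check: unknown names fall back to the default.
void usage_check() {
  assert(parseCodeModel("large") == CodeModelChoice::Large);
  assert(parseCodeModel("bogus") == CodeModelChoice::Default);
}
```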

-- 
Mehdi

Rafael Espíndola

May 30, 2016, 4:34:19 PM5/30/16
to Mehdi Amini, llvm-dev, cfe...@lists.llvm.org
We don't use cl::opt in gold; instead we parse the --plugin-opt values
that gold passes to the plugin (see process_plugin_option).

Cheers,
Rafael

> You need to use your linker-specific way of passing the option
> "-lto-use-large-codemodel=..." to the plugin.
>
> Let me know if it works for you!
>

> --
> Mehdi
>
>
>
>
> Steven Shi
> Intel\SSG\STO\UEFI Firmware
>
> Tel: +86 021-61166522
> iNet: 821-6522
>

> From: mehdi...@apple.com [mailto:mehdi...@apple.com]
> Sent: Monday, May 30, 2016 8:17 AM
> To: Shi, Steven <steve...@intel.com>
> Cc: Umesh Kalappa <umesh.k...@gmail.com>; eli...@gmail.com; llvm-dev
> <llvm...@lists.llvm.org>; cfe...@lists.llvm.org; Rafael Espíndola
> <rafael.e...@gmail.com>
> Subject: Re: [cfe-dev] [llvm-dev] How to debug if LTO generate wrong code?
>
>
>
> On May 29, 2016, at 5:10 PM, Shi, Steven <steve...@intel.com> wrote:
>
> Hi Mehdi,
> GCC LTO seems support large code model in my side as below, if the code
> model is linker specific, does the GCC LTO use a special linker which is
> different from the one in GNU Binutils?
>
>
> I don't know anything about GCC.

> (And I doubt the GNU linker supports LTO with LLVM).
>
>

> I’m a bit surprised if both OS X ld64 and gold plugin do not support large
> code model in LTO. Since modern system widely use the 64bit, the code need
> to run in high address (larger than 2 GB) is a reasonable requirement.
>
>

> The fact that we don't support it for now seems to indicate that it is not a
> widely requested feature, especially considering that it is really a trivial
> option to add.
> What is the linker you're using? Are you building your own clang?
>

Mehdi Amini via llvm-dev

May 30, 2016, 4:56:18 PM5/30/16
to Rafael Espíndola, llvm-dev, cfe...@lists.llvm.org


On 05/30/16 01:34 PM, Rafael Espíndola <rafael.e...@gmail.com> wrote:
> We don't use cl::opt in gold, instead we parse the -plugin-opts that
> gold passes the plugin (see process_plugin_option).

What about that:

$ grep ParseCommandLineOptions tools/gold/gold-plugin.cpp
      // ParseCommandLineOptions() expects argv[0] to be program name. Lazily
    cl::ParseCommandLineOptions(NumOpts, &options::extra[0]);



-- 
Mehdi

Rafael Espíndola

May 30, 2016, 7:52:47 PM5/30/16
to Mehdi Amini, llvm-dev, cfe...@lists.llvm.org
On 30 May 2016 at 16:56, Mehdi Amini <mehdi...@apple.com> wrote:
>
>
> On 05/30/16 01:34 PM, Rafael Espíndola <rafael.e...@gmail.com> wrote:
>
> We don't use cl::opt in gold, instead we parse the -plugin-opts that
> gold passes the plugin (see process_plugin_option).
>
> What about that:
>
> $ grep ParseCommandLineOptions tools/gold/gold-plugin.cpp
>
> // ParseCommandLineOptions() expects argv[0] to be program name.
> Lazily
>
> cl::ParseCommandLineOptions(NumOpts, &options::extra[0]);


That is for the options that the gold plugin itself doesn't understand
and just passes to llvm. This allows you to do things like
--plugin-opt=-debug-pass=Arguments.
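
The split described above can be modelled roughly as follows. This is a simplified, hypothetical sketch, not the code in gold-plugin.cpp: the struct, the function names and the choice of `mcpu=` as the example of a plugin-level option are assumptions.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified model of the plugin's option state.
struct PluginOptions {
  std::string MCpu;                // example of an option the plugin handles itself
  std::vector<std::string> Extra;  // everything else, forwarded to LLVM later
};

// Handle one value received through --plugin-opt=<Opt>.
void processPluginOption(PluginOptions &Opts, const std::string &Opt) {
  assert(!Opt.empty() && "gold should not pass an empty option");
  const std::string Prefix = "mcpu=";
  if (Opt.compare(0, Prefix.size(), Prefix) == 0) {
    Opts.MCpu = Opt.substr(Prefix.size());  // understood by the plugin
    return;
  }
  Opts.Extra.push_back(Opt);  // not understood: keep it for LLVM's parser
}

// Build the argument list for LLVM's command-line parser, which expects
// argv[0] to be a program name (a placeholder is used here).
std::vector<std::string> buildLLVMArgs(const PluginOptions &Opts) {
  std::vector<std::string> Args;
  Args.push_back("gold-plugin");
  Args.insert(Args.end(), Opts.Extra.begin(), Opts.Extra.end());
  return Args;
}
```

In this model, `-debug-pass=Arguments` ends up in `Extra` and reaches LLVM's own option parser, while `mcpu=...` is consumed by the plugin and never forwarded.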

Cheers,
Rafael

Mehdi Amini via llvm-dev

May 30, 2016, 7:55:32 PM5/30/16
to Rafael Espíndola, llvm-dev, cfe...@lists.llvm.org

Sent from my iPhone

> On May 30, 2016, at 4:52 PM, Rafael Espíndola <rafael.e...@gmail.com> wrote:
>
>> On 30 May 2016 at 16:56, Mehdi Amini <mehdi...@apple.com> wrote:
>>
>>
>> On 05/30/16 01:34 PM, Rafael Espíndola <rafael.e...@gmail.com> wrote:
>>
>> We don't use cl::opt in gold, instead we parse the -plugin-opts that
>> gold passes the plugin (see process_plugin_option).
>>
>> What about that:
>>
>> $ grep ParseCommandLineOptions tools/gold/gold-plugin.cpp
>>
>> // ParseCommandLineOptions() expects argv[0] to be program name.
>> Lazily
>>
>> cl::ParseCommandLineOptions(NumOpts, &options::extra[0]);
>
>
> That is for the options that the gold plugin itself doesn't understand
> and just passes to llvm. This allows you to do things like
> --plugin-opt=-debug-pass=Arguments.

This is what I expected, so my cl::opt should work, right? I don't really get your original point.

Mehdi

Rafael Espíndola

May 30, 2016, 7:56:30 PM5/30/16
to Mehdi Amini, llvm-dev, cfe...@lists.llvm.org
On 30 May 2016 at 19:55, Mehdi Amini <mehdi...@apple.com> wrote:
>
>
> Sent from my iPhone
>
>> On May 30, 2016, at 4:52 PM, Rafael Espíndola <rafael.e...@gmail.com> wrote:
>>
>>> On 30 May 2016 at 16:56, Mehdi Amini <mehdi...@apple.com> wrote:
>>>
>>>
>>> On 05/30/16 01:34 PM, Rafael Espíndola <rafael.e...@gmail.com> wrote:
>>>
>>> We don't use cl::opt in gold, instead we parse the -plugin-opts that
>>> gold passes the plugin (see process_plugin_option).
>>>
>>> What about that:
>>>
>>> $ grep ParseCommandLineOptions tools/gold/gold-plugin.cpp
>>>
>>> // ParseCommandLineOptions() expects argv[0] to be program name.
>>> Lazily
>>>
>>> cl::ParseCommandLineOptions(NumOpts, &options::extra[0]);
>>
>>
>> That is for the options that the gold plugin itself doesn't understand
>> and just passes to llvm. This allows you to do things like
>> --plugin-opt=-debug-pass=Arguments.
>
> This is what I expected, so my cl::opt should work, right? I don't really get your original point.

Just that the gold plugin itself never defines a cl::opt. It just
forwards the llvm ones.

Cheers,
Rafael

Shi, Steven via llvm-dev

May 31, 2016, 4:08:43 AM5/31/16
to Rafael Espíndola, Mehdi Amini, llvm-dev, cfe...@lists.llvm.org, af...@apple.com
Hi Mehdi,
What's the default code model for x86_64 Mac OS X apps? Andrew showed me the example code from a Mac OS X app below, which appears to use the small code model but can run at a high address (>4GB).

For example, if you read a global like this, the compiler will generate this code.
int constant = 0;

int get_constant(void)
{
  return constant;
}

(lldb) dis -n get_constant -b
a.out`get_constant:
a.out[0x100000f8c] <+0>: 55 pushq %rbp
a.out[0x100000f8d] <+1>: 48 89 e5 movq %rsp, %rbp
a.out[0x100000f90] <+4>: 8b 05 6a 00 00 00 movl 0x6a(%rip), %eax
a.out[0x100000f96] <+10>: 5d popq %rbp
a.out[0x100000f97] <+11>: c3 retq


Steven Shi
Intel\SSG\STO\UEFI Firmware

Tel: +86 021-61166522
iNet: 821-6522

Rafael Espíndola

May 31, 2016, 9:21:15 AM5/31/16
to Shi, Steven, llvm-dev, af...@apple.com, cfe...@lists.llvm.org
On 31 May 2016 at 01:08, Shi, Steven <steve...@intel.com> wrote:
> Hi Mehdi,
> What's the default code model for x86_64 Mac OS X apps? Andrew showed me the example code from a Mac OS X app below, which appears to use the small code model but can run at a high address (>4GB).

Small, but PIC.

> For example if you read a global like this the compiler will generate this code.
> int constant = 0;
>
> int get_constant(void)
> {
> return constant;
> }


Compiling for ELF with -fPIE -Os I get

get_constant:                           # @get_constant
# BB#0:                                 # %entry
        movl    constant(%rip), %eax
        retq

Which should also be able to run at any address.
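
As a worked example of why such code runs at any address, the RIP-relative operand in the lldb listing earlier in the thread (`8b 05 6a 00 00 00` at 0x100000f90) resolves to the address of the next instruction plus the 32-bit displacement. The helper name below is hypothetical; the arithmetic is the standard x86-64 rule.

```cpp
#include <cassert>
#include <cstdint>

// Effective address of a RIP-relative operand: the address of the *next*
// instruction (current address + instruction length) plus the signed
// 32-bit displacement encoded in the instruction.
std::uint64_t ripRelativeTarget(std::uint64_t InsnAddr, unsigned InsnLen,
                                std::int32_t Disp32) {
  return InsnAddr + InsnLen +
         static_cast<std::uint64_t>(static_cast<std::int64_t>(Disp32));
}

// Usage check with the values from the listing: a 6-byte movl at
// 0x100000f90 with displacement 0x6a reads from 0x100001000.
void usage_check() {
  assert(ripRelativeTarget(0x100000f90ULL, 6, 0x6a) == 0x100001000ULL);
}
```

Because only the displacement is encoded in the instruction, loading the image at a different base moves the instruction and its target by the same amount, so no relocation of this instruction is needed.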

Cheers,
Rafael

Shi, Steven via llvm-dev

May 31, 2016, 10:24:58 AM5/31/16
to Rafael Espíndola, llvm-dev, af...@apple.com, cfe...@lists.llvm.org
OK, I get it. Adding the "-pie" link option can force 64-bit relocation addresses (e.g. EM_X86_64) instead of 32-bit ones (e.g. R_X86_64_32S). I only used -fpic and -fpie in the clang LTO compile options and forgot to add the "-pie" link option for ld, so my clang + ld LTO executable was still not position independent.

Thank you all!

Steven Shi
Intel\SSG\STO\UEFI Firmware

Tel: +86 021-61166522
iNet: 821-6522

> -----Original Message-----
> From: Rafael Espíndola [mailto:rafael.e...@gmail.com]
> Sent: Tuesday, May 31, 2016 9:21 PM
> To: Shi, Steven <steve...@intel.com>
> Cc: Mehdi Amini <mehdi...@apple.com>; Umesh Kalappa
> <umesh.k...@gmail.com>; eli...@gmail.com; llvm-dev <llvm-
> d...@lists.llvm.org>; cfe...@lists.llvm.org; af...@apple.com
> Subject: Re: [cfe-dev] [llvm-dev] How to debug if LTO generate wrong code?
>

Shi, Steven via llvm-dev

Jun 7, 2016, 10:55:23 AM6/7/16
to Rafael Espíndola, llvm-dev, Lu, Hongjiu, af...@apple.com, cfe...@lists.llvm.org
Hi Rafael,
I have finally enabled the clang LTO build with the small code model and PIE, and my clang LTO Uefi firmware works now. Thank you! But I now have one more issue with the normal clang build (without LTO). I find that the small code model plus the "-fpie" build option makes clang generate some R_X86_64_GOTPCREL relocation entries in my firmware image, which does not happen in the clang LTO build. I would like to force the normal clang build to use R_X86_64_PC32 or R_X86_64_PLT32 instead of R_X86_64_GOTPCREL. How can I do that?

Rafael Espíndola

Jun 7, 2016, 11:04:54 AM6/7/16
to Shi, Steven, llvm-dev, Lu, Hongjiu, af...@apple.com, cfe...@lists.llvm.org
On 7 June 2016 at 10:54, Shi, Steven <steve...@intel.com> wrote:
> Hi Rafael,
> I have finally enabled the clang LTO build with the small code model and PIE, and my clang LTO Uefi firmware works now. Thank you! But I now have one more issue with the normal clang build (without LTO). I find that the small code model plus the "-fpie" build option makes clang generate some R_X86_64_GOTPCREL relocation entries in my firmware image, which does not happen in the clang LTO build. I would like to force the normal clang build to use R_X86_64_PC32 or R_X86_64_PLT32 instead of R_X86_64_GOTPCREL. How can I do that?

Does it reproduce with clang trunk?

Shi, Steven via llvm-dev

Jun 7, 2016, 11:15:54 AM6/7/16
to Rafael Espíndola, llvm-dev, Lu, Hongjiu, af...@apple.com, cfe...@lists.llvm.org
Yes. I use the trunk; my version is below. I can try the latest version tomorrow, and if you need it, I can also send the steps to reproduce the build on my Uefi firmware tomorrow. Thank you!
clang version 3.9.0 (trunk 271203)


Steven Shi
Intel\SSG\STO\UEFI Firmware

Tel: +86 021-61166522
iNet: 821-6522


> -----Original Message-----
> From: Rafael Espíndola [mailto:rafael.e...@gmail.com]
> Sent: Tuesday, June 07, 2016 11:05 PM
> To: Shi, Steven <steve...@intel.com>
> Cc: Mehdi Amini <mehdi...@apple.com>; Umesh Kalappa
> <umesh.k...@gmail.com>; llvm-dev <llvm...@lists.llvm.org>; cfe-
> d...@lists.llvm.org; af...@apple.com; Lu, Hongjiu <hongj...@intel.com>
> Subject: Re: [cfe-dev] [llvm-dev] How to debug if LTO generate wrong code?
>

Lu, Hongjiu via llvm-dev

Jun 7, 2016, 11:47:40 AM6/7/16
to Shi, Steven, Rafael Espíndola, llvm-dev, cfe...@lists.llvm.org, af...@apple.com
See:

https://llvm.org/bugs/show_bug.cgi?id=24964



> -----Original Message-----
> From: Shi, Steven