[llvm-dev] Building libjpeg-turbo with LTO


Shishir V Jessu via llvm-dev

Apr 8, 2020, 1:23:11 PM
to llvm-dev
Hi, 

I have tried to build libjpeg-turbo with LTO in LLVM using clang, but I get many errors in lld that look like the following:

ld: error: undefined symbol: jpeg_std_error
>>> referenced by jcstest.c:76
>>>               lto.tmp:(main)

ld: error: undefined symbol: jpeg_CreateCompress
>>> referenced by jcstest.c:86
>>>               lto.tmp:(main)

ld: error: undefined symbol: jpeg_set_defaults
>>> referenced by jcstest.c:88
>>>               lto.tmp:(main)

ld: error: undefined symbol: jpeg_default_colorspace
>>> referenced by jcstest.c:90
>>>               lto.tmp:(main)
>>> referenced by jcstest.c:114
>>>               lto.tmp:(main)

This only occurs when compiling with the -flto flag. Has anyone been able to build libjpeg-turbo with LTO? Are there any modifications I need to make to the makefile or other configuration in order to do so? Thanks for your help!

Best, 
Shishir Jessu

Shishir V Jessu via llvm-dev

Apr 8, 2020, 1:25:26 PM
to llvm-dev
To correct a typo: I am using both clang 6.0.0 and a local build of clang 10.0.0, and each results in the same error.

Best, 
Shishir Jessu

Teresa Johnson via llvm-dev

Apr 8, 2020, 2:02:12 PM
to Shishir V Jessu, llvm-dev
Are the object files for jcstest.c and the source files defining these symbols being directly LTO linked together, or are the defs first LTO linked into a shared library? It would be helpful to see the build commands involved. 
Teresa

_______________________________________________
LLVM Developers mailing list
llvm...@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev


--
Teresa Johnson | Software Engineer | tejo...@google.com |

Teresa Johnson via llvm-dev

Apr 8, 2020, 8:11:00 PM
to Shishir V Jessu, Fangrui Song, Rui Ueyama, llvm-dev
Adding a couple of lld folks.

I helped Shishir debug this. The link line looked like:
   /home/sjessu/build/bin/clang -O0 -flto -o jcstest jcstest.o  ./.libs/libjpeg.a
and the issue was that libjpeg.a was created with the system ar instead of llvm-ar. It worked when recreating libjpeg.a with llvm-ar.

lld has some special handling for archives with a missing symbol table, which often happens when the system ar creates an archive containing bitcode. In some cases lld emits an error, but it also contains a special hack for archives containing *only* bitcode objects, so that they are handled correctly even when the system ar left them without a symbol table. Unfortunately, in this case lld neither gave an error nor applied the special handling, because libjpeg.a also contains some native objects and thus had a non-empty symbol table. I created a version of libjpeg.a using the system ar and containing only the bitcode objects, and confirmed it links fine with lld (the native objects weren't needed in this case). BTW this is the code in ELF/Driver.cpp LinkerDriver::addFile.

Would it be possible to extend the hack in lld to handle cases like this with some bitcode objects and some non-bitcode objects, so that the bitcode objects are not simply ignored?

Thanks,
Teresa


Fangrui Song via llvm-dev

Apr 10, 2020, 1:35:15 PM
to Teresa Johnson, llvm-dev, Shishir V Jessu
On 2020-04-08, Teresa Johnson wrote:
>Adding a couple of lld folks.
>
>I helped Shishir debug this, the link line looked like:
> /home/sjessu/build/bin/clang -O0 -flto -o jcstest jcstest.o
> ./.libs/libjpeg.a
>and the issue was that libjpeg.a was created with the system ar instead of
>llvm-ar. It worked when recreating libjpeg.a with llvm-ar.
>
>I noticed that the lld code has some special handling for the case
>when there is a missing symbol table, which often happens with system ar
>created archives containing bitcode. I noticed that the lld code will
>sometimes emit an error, but actually contains a special hack to handle
>archives containing *only* bitcode objects, so that they are handled
>correctly even when there is no symbol table because it was created with
>the system ar.

https://reviews.llvm.org/D63781 added the "archive has no index; run
ranlib to add one" error.

A clarification: the LinkerDriver::addFile code handles mix-and-match
ELF object members and bitcode members. A LazyObjFile can be
either an ELF object file or an LLVM bitcode file.

>Unfortunately, in this case it neither gave an error nor did
>the special handling, because libjpeg.a also contains some native objects
>and thus had a non-zero symbol table. I created a version of libjpeg.a
>using the system library and containing only the bitcode objects, and
>confirmed it links fine with lld (the native objects weren't needed in this
>case). BTW this is the code in ELF/Driver.cpp LinkerDriver::addFile.
>
>Would it be possible to extend the hack in lld to handle cases like this
>with some bitcode objects and some non-bitcode objects, so that the bitcode
>objects are not simply ignored?
>
>Thanks,
>Teresa

I guess what happened here is that the archive has an incomplete symbol table.
nm -s (--print-armap) can print the symbol table.

% ar rc a.a a.bc a.o; nm -s a.a

Archive index:
_start in a.o
nm: a.bc: file format not recognized

a.o:
0000000000000000 T _start

Currently lld trusts the archive symbol table. If the archive symbol table
actually misses some entries (GNU ar does not add bitcode definitions to the
symbol table), lld will not know that some lazy definitions are actually
missing.
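
The incomplete index that nm -s shows can also be checked programmatically. Below is a minimal Python sketch, assuming a GNU-style (System V) archive whose index is the member named "/". The helper names iter_members, index_offsets, and members_missing_from_index are hypothetical and are not part of lld or binutils. The sketch lists the members that no index entry points to. A member without global symbols would be reported too, so the result is a hint, not proof.

```python
import struct

AR_MAGIC = b"!<arch>\n"
HEADER_SIZE = 60


def iter_members(data):
    """Yield (header_offset, raw_name, payload) for each archive member."""
    if not data.startswith(AR_MAGIC):
        raise ValueError("not an ar archive")
    pos = len(AR_MAGIC)
    while pos + HEADER_SIZE <= len(data):
        header = data[pos:pos + HEADER_SIZE]
        if header[58:60] != b"`\n":
            raise ValueError("bad member header at offset %d" % pos)
        name = header[0:16].rstrip(b" ")
        size = int(header[48:58].decode("ascii").strip())
        start = pos + HEADER_SIZE
        yield pos, name, data[start:start + size]
        pos = start + size + (size & 1)  # member data is padded to 2 bytes


def index_offsets(payload):
    """Map member header offset -> symbol names, from a GNU "/" index."""
    (count,) = struct.unpack_from(">I", payload, 0)
    offsets = struct.unpack_from(">%dI" % count, payload, 4)
    names = payload[4 + 4 * count:].split(b"\0")[:count]
    covered = {}
    for offset, symbol in zip(offsets, names):
        covered.setdefault(offset, []).append(symbol.decode())
    return covered


def members_missing_from_index(data):
    """Return names of members that no index entry refers to."""
    covered = {}
    members = []
    for offset, name, payload in iter_members(data):
        if name == b"/":
            covered = index_offsets(payload)
        elif name in (b"//", b"/SYM64/"):
            continue  # long-name table / 64-bit index: not parsed here
        else:
            members.append((offset, name))
    return [n.decode().rstrip("/") for off, n in members if off not in covered]
```

For the a.a archive above, members_missing_from_index(open("a.a", "rb").read()) would be expected to report a.bc, since GNU ar only indexed a.o.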

It seems that to make the GNU ar scenario work, lld has to distrust the archive symbol table when the archive contains bitcode files.
To avoid pessimizing archives that have only bitcode members and no ELF object members, we would refine the hack to "distrust" the archive symbol table only
if (the archive symbol table exists && an ELF object member exists && a bitcode member exists).

Does this scheme sound good?
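
The condition proposed above can be written as a small predicate. This is only a Python sketch of the decision rule as stated in this message, not lld code: the function name should_distrust_index and the "elf" / "bitcode" kind labels are hypothetical.

```python
def should_distrust_index(has_index, member_kinds):
    """Return True if the archive symbol table should be ignored.

    has_index:    whether the archive has a symbol table at all
    member_kinds: iterable of "elf", "bitcode" or "other", one per member
    """
    kinds = set(member_kinds)
    return bool(has_index) and "elf" in kinds and "bitcode" in kinds


# Mixed archive made by GNU ar: the index covers only the ELF members.
print(should_distrust_index(True, ["bitcode", "elf"]))      # True
# All-bitcode archive with an index (e.g. made by llvm-ar): keep trusting it.
print(should_distrust_index(True, ["bitcode", "bitcode"]))  # False
```

The all-bitcode archive without any index is left to the existing special case, so the predicate returns False for it as well.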


James Y Knight via llvm-dev

Apr 10, 2020, 5:07:59 PM
to Fangrui Song, llvm-dev, Shishir V Jessu
I don't think there really ought to be an expectation that this works with an ar implementation which can't parse the LTO files.

The only way it works with GCC is that they ship /usr/lib/bfd-plugins/liblto_plugin.so which "claims" the LTO object files and tells ar about the symbol table.

Either users should be using llvm-ar, or LLVM should be shipping a gnu binutils plugin.

Teresa Johnson via llvm-dev

Apr 10, 2020, 5:34:26 PM
to James Y Knight, llvm-dev, Shishir V Jessu
On Fri, Apr 10, 2020 at 2:07 PM James Y Knight <jykn...@google.com> wrote:
> I don't think there really ought to be an expectation that this works with an ar implementation which can't parse the LTO files.
>
> The only way it works with GCC is that they ship /usr/lib/bfd-plugins/liblto_plugin.so which "claims" the LTO object files and tells ar about the symbol table.
>
> Either users should be using llvm-ar, or LLVM should be shipping a gnu binutils plugin.

I believe the system ar will work in combination with the LLVM gold plugin, btw.

The confusing thing here is that it fails silently. If you don't know what you are looking for (I didn't even remember this when initially helping Shishir, and I spend a lot of time looking at LTO behavior), it's impossible to figure out why the link is failing. It would be friendliest to users if lld either consistently gave a meaningful error, or consistently just worked (like it does in the all bitcode case, even without a symbol table).

Fangrui - I am not sure I followed your suggestion. But if it means that a mixed bitcode/native object case would just be handled with or without a complete symbol table, that would be awesome. In this case the symbol table is incomplete (only has symbols for the native objects).

Teresa

Rui Ueyama via llvm-dev

Apr 13, 2020, 12:59:57 AM
to Teresa Johnson, llvm-dev, Shishir V Jessu
Teresa,

On Thu, Apr 9, 2020 at 9:10 AM Teresa Johnson <tejo...@google.com> wrote:
> Adding a couple of lld folks.
>
> I helped Shishir debug this, the link line looked like:
>    /home/sjessu/build/bin/clang -O0 -flto -o jcstest jcstest.o  ./.libs/libjpeg.a
> and the issue was that libjpeg.a was created with the system ar instead of llvm-ar. It worked when recreating libjpeg.a with llvm-ar.
>
> I noticed that the lld code has some special handling for the case when there is a missing symbol table, which often happens with system ar created archives containing bitcode. I noticed that the lld code will sometimes emit an error, but actually contains a special hack to handle archives containing *only* bitcode objects, so that they are handled correctly even when there is no symbol table because it was created with the system ar. Unfortunately, in this case it neither gave an error nor did the special handling, because libjpeg.a also contains some native objects and thus had a non-zero symbol table. I created a version of libjpeg.a using the system library and containing only the bitcode objects, and confirmed it links fine with lld (the native objects weren't needed in this case). BTW this is the code in ELF/Driver.cpp LinkerDriver::addFile.
>
> Would it be possible to extend the hack in lld to handle cases like this with some bitcode objects and some non-bitcode objects, so that the bitcode objects are not simply ignored?

Interesting suggestion. As you summarized, lld has a special hack for LTO in its archive handling: if an archive's symbol table is completely empty, we assume that the system ar (which doesn't understand the LLVM bitcode file format) was mistakenly used on bitcode files. However, if at least one member object file is in the native ELF format, the archive will have some symbols in its symbol table, so the hack won't kick in.

I think one approach to fixing the issue is to not trust the archive symbol table for bitcode files at all. Instead, we can read symbols directly from the symbol table of each bitcode archive member. That shouldn't be technically difficult. I'm a bit worried about the performance penalty, though: to read bitcode symbol tables, we have to identify which members are bitcode files and which are native ELF files, which means reading the file magic of every archive member. That might be noticeably slow, particularly if thin archives are in use, but it's highly dependent on the filesystem where the input files are laid out.
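
To illustrate the per-member cost mentioned above, here is a rough Python sketch of classifying members by file magic. It assumes the thin-archive situation where each member is a separate file on disk, so every member costs one open and one short read. The function names classify_magic and classify_files are hypothetical; the magic values are the ELF magic, the raw LLVM bitcode magic, and the bitcode wrapper magic.

```python
ELF_MAGIC = b"\x7fELF"
BITCODE_MAGIC = b"BC\xc0\xde"              # raw LLVM bitcode
BITCODE_WRAPPER_MAGIC = b"\xde\xc0\x17\x0b"  # 0x0B17C0DE, little-endian


def classify_magic(head):
    """Classify a file from its first four bytes."""
    if head.startswith(ELF_MAGIC):
        return "elf"
    if head.startswith(BITCODE_MAGIC) or head.startswith(BITCODE_WRAPPER_MAGIC):
        return "bitcode"
    return "other"


def classify_files(paths):
    """Open every member file and read only its magic (one read per member)."""
    kinds = {}
    for path in paths:
        with open(path, "rb") as f:
            kinds[str(path)] = classify_magic(f.read(4))
    return kinds
```

Only four bytes are read per member, so any slowdown would come mainly from opening many files, which matches the concern about filesystem dependence.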