[llvm-dev] LLVM_DYLIB and CLANG_DYLIB with MSVC


Cristian Adam via llvm-dev

May 30, 2021, 10:16:58 AM
to llvm...@lists.llvm.org

Hi,

Clang 12 can be configured on Windows with MinGW (either GNU or LLVM) with the following CMake parameters:

  • LLVM_BUILD_LLVM_DYLIB=ON
  • LLVM_LINK_LLVM_DYLIB=ON
  • CLANG_LINK_CLANG_DYLIB=ON

which substantially reduces the binary size of the build.

I configured the llvm-project with the following parameters:

  • CMAKE_BUILD_TYPE=Release
  • LLVM_TARGETS_TO_BUILD=X86
  • LLVM_ENABLE_PROJECTS=clang;clang-tools-extra

The installed (stripped) build of Clang 12 with llvm-mingw release 12.0.0 <https://github.com/mstorsjo/llvm-mingw/releases/tag/20210423> resulted in:

  • Normal build: 1.76 GB
  • shlib build: 481 MB

MSVC hides symbols by default (whereas MinGW makes them visible by default), so one needs to generate a .def file listing the symbols to be exported.

This is already done in two cases: for LLVM_BUILD_LLVM_C_DYLIB (llvm/tools/llvm-shlib/gen-msvc-exports.py) and for LLVM_EXPORT_SYMBOLS_FOR_PLUGINS (llvm/utils/extract_symbols.py).

I've put together a patch <https://github.com/cristianadam/llvm-project/commit/3a3b8a7df17a49ba7c0153b0c9a7ee25705ede46> that enables LLVM_DYLIB and CLANG_DYLIB for MSVC.

I tested with clang-cl from the official Clang 12 x64 Windows binary release:

  • Normal build: 1.42 GB
  • shlib build: 536 MB

The shlib release build compiled and linked fine with LLVM.dll and clang-cpp.dll, but unfortunately it crashes at runtime. For example, llvm-nm:

$ llvm-nm
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace.
Stack dump:
0.      Program arguments: llvm-nm
#0 0x00007ffd32807d43 llvm::StringMap<llvm::cl::Option *,llvm::MallocAllocator>::begin C:\Projects\llvm-project\repo\llvm\include\llvm\ADT\StringMap.h:204:0
#1 0x00007ffd32807d43 llvm::cl::HideUnrelatedOptions(class llvm::cl::OptionCategory &, class llvm::cl::SubCommand &) C:\Projects\llvm-project\repo\llvm\lib\Support\CommandLine.cpp:2589:0
#2 0x00007ff689df2b13 llvm::StringRef::StringRef C:\Projects\llvm-project\repo\llvm\include\llvm\ADT\StringRef.h:107:0
#3 0x00007ff689df2b13 main C:\Projects\llvm-project\repo\llvm\tools\llvm-nm\llvm-nm.cpp:2232:0
#4 0x00007ff689e26d04 invoke_main D:\agent\_work\10\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl:78:0
#5 0x00007ff689e26d04 __scrt_common_main_seh D:\agent\_work\10\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl:288:0
#6 0x00007ffd9a7f7034 (C:\Windows\System32\KERNEL32.DLL+0x17034)
#7 0x00007ffd9b742651 (C:\Windows\SYSTEM32\ntdll.dll+0x52651)

This crash is due to llvm::cl::HideUnrelatedOptions, which accesses TopLevelSubCommand, which is declared as:
extern ManagedStatic<SubCommand> TopLevelSubCommand;

The MSVC 2019 build behaves in the same way as the clang-cl build.

I have tried building without LLVM_ENABLE_THREADS and linking the CRT statically (LLVM_USE_CRT_RELEASE=MT), but neither helped.

The MSVC 2019 build sizes were:

  • Normal build: 1.74 GB
  • shlib build: 949 MB

I would appreciate any help in getting the shlib build running. It works fine with llvm-mingw, so I think it should also work with clang-cl / cl.

Cheers,
Cristian.

Reid Kleckner via llvm-dev

Jun 3, 2021, 1:11:56 PM
to Cristian Adam, Tom Stellard, Fangrui Song, llvm-dev
I agree, for the reasons you outline, LLVM needs to do more to support producing DLLs on Windows. However, I don't really have time to dig into your proposal and the issues you are running into.

Personally, I think LLVM should move away from that .def file generation Python script and towards source-level annotations. It would allow us to use -fvisibility=hidden in the shared library build on other platforms. That would greatly reduce the number of dynamic symbols in LLVM, which I understand is desirable. Today we use -Bsymbolic-functions, so we may have already captured most of the benefits of -fvisibility=hidden, but hidden visibility is always better for performance and binary size.

I put out this alternative proposal mainly to see if there is any enthusiasm for it. If not, you are welcome to continue forward with our existing solutions. But if there is broad interest in adding API annotations to LLVM, I think that would be a better way to go long term.

_______________________________________________
LLVM Developers mailing list
llvm...@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Martin Storsjö via llvm-dev

Jun 3, 2021, 5:10:04 PM
to Cristian Adam, llvm...@lists.llvm.org
On Sun, 30 May 2021, Cristian Adam via llvm-dev wrote:

> Due to the nature of MSVC regarding default visibility of symbols (hidden by
> default, whereas MinGW has visible by default), one needs to generate a .def
> file with the symbols needed to be exported.
>
> This is done already in two cases for LLVM_BUILD_LLVM_C_DYLIB
> (llvm/tools/llvm-shlib/gen-msvc-exports.py) and for
> LLVM_EXPORT_SYMBOLS_FOR_PLUGINS (llvm/utils/extract_symbols.py).
>
> I've put together a patch that enables LLVM_DYLIB and CLANG_DYLIB for MSVC.
>
> I tested with clang-cl from the official Clang 12 x64 Windows binary
> release:
>
> * Normal build: 1,42 GB
> * shlib build: 536 MB
>
> The shlib release build compiled and linked fine with LLVM.dll and
> clang-cpp.dll, unfortunately it crashes at runtime.

Without digging into the scripts, I have one hunch:

Does the def generator script differentiate between code and data symbols?

When accessing a data symbol from another DLL, the caller must have seen
a declaration with the dllimport attribute. For functions, it doesn't
matter (the call just does an extra hop via the import thunk), but for data
variables it matters. If the def file had proper DATA annotations for
such symbols, you would end up with linker errors (an undefined reference
to dataSymbol, where the import library only provides __imp_dataSymbol).

This is fixed up by the autoimport feature when linking in mingw mode
(which, in general, requires you to link against the mingw runtime too).
For cases where the caller references dataSymbol but only __imp_dataSymbol
is available, the linker adds an entry to a list of pseudo relocations.
The mingw runtime processes that list at load time: it maps the affected
sections as writable and patches the addresses to point at where the
symbols are located in the other DLL.

So to avoid this, we would either need to actually provide proper
dllimport declarations at least for all data symbols, or avoid cross-DLL
data accesses (by using e.g. accessor functions instead).
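
The two options above could be sketched roughly as follows. This is only
an illustration: DEMO_IMPORT, DEMO_CONSUMES_DLL, GlobalCounter and
getHiddenCounter are made-up names, not anything that exists in LLVM, and
the definitions that would normally live inside the DLL are placed in the
same file so the sketch builds on its own.

```cpp
// Hypothetical names throughout; this is not LLVM's actual code.
// DEMO_IMPORT expands to dllimport only for a Windows consumer of the DLL;
// elsewhere it is empty, so this file also builds as a single program.
#if defined(_WIN32) && defined(DEMO_CONSUMES_DLL)
#  define DEMO_IMPORT __declspec(dllimport)
#else
#  define DEMO_IMPORT
#endif

// Option 1: the consumer sees the data symbol with dllimport, so the
// compiler loads its address through __imp_GlobalCounter.
extern DEMO_IMPORT int GlobalCounter;

// Option 2: no cross-DLL data access at all. A function call works even
// without dllimport, at the cost of one extra hop via the import thunk.
int &getHiddenCounter();

// Definitions that would normally live inside the DLL. They are only
// compiled here when the file is not acting as a DLL consumer.
#if !defined(DEMO_CONSUMES_DLL)
int GlobalCounter = 0;
int &getHiddenCounter() {
  static int HiddenCounter = 0; // never referenced directly by other modules
  return HiddenCounter;
}
#endif
```

With option 2 the variable itself never has to be exported, which is why
it sidesteps the DATA/dllimport question entirely.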


// Martin

Fangrui Song via llvm-dev

Jun 3, 2021, 5:43:23 PM
to Reid Kleckner, llvm-dev
On 2021-06-03, Reid Kleckner wrote:
>I agree, for the reasons you outline, LLVM needs to do more to support
>producing DLLs on Windows. However, I don't really have time to dig into
>your proposal and the issues you are running into.
>
>Personally, I think LLVM should move away from that .def file generation
>python script, and towards source-level annotations. It would allow us to
>use fvisibility=hidden in the shared library build on other platforms. That
>would greatly reduce the number of dynamic symbols in LLVM, which I
>understand is desirable. Today we used Bsymbolic-functions, so we may have
>already captured most of the benefits of fvisibility=hidden, but hidden
>visibility is always better for performance and binary size.

Yes, for Clang, the benefits are mostly captured with just -Bsymbolic-functions.

For GCC, -fno-semantic-interposition is needed to enable interprocedural
optimizations for -fPIC compiles (https://reviews.llvm.org/D102453).

>I put out this alternative proposal mainly to see if there is any
>enthusiasm for it. If not, you are welcome to continue forward with our
>existing solutions. But if there is broad interest in adding API
>annotations to LLVM, I think that would be a better way to go long term.

I think explicit annotations make sense. I think most large-scale
projects considering ELF/Windows portability are doing this and
llvm-project is an unfortunate outlier.

If we add annotations (I personally favor it), there will be churn in the
llvm/include/llvm/**/*.h header files.

Moreover, as is, almost every function defined in llvm/lib/A/*.cpp is
declared in llvm/include/llvm/A/*.h if it is used by another library
llvm/lib/B. Every cross-lib API is public. We don't do a good job of
making clear which APIs are internal and which are public (and which
are more stable and which are less so).


Cristian Adam via llvm-dev

Jun 5, 2021, 9:46:45 AM
to Martin Storsjö, llvm...@lists.llvm.org

On 03/06/2021 23:09, Martin Storsjö wrote:
> On Sun, 30 May 2021, Cristian Adam via llvm-dev wrote:
>
>> Due to the nature of MSVC regarding default visibility of symbols (hidden by
>> default, whereas MinGW has visible by default), one needs to generate a .def
>> file with the symbols needed to be exported.
>>
>> This is done already in two cases for LLVM_BUILD_LLVM_C_DYLIB
>> (llvm/tools/llvm-shlib/gen-msvc-exports.py) and for
>> LLVM_EXPORT_SYMBOLS_FOR_PLUGINS (llvm/utils/extract_symbols.py).
>>
>> I've put together a patch that enables LLVM_DYLIB and CLANG_DYLIB for MSVC.
>>
>> I tested with clang-cl from the official Clang 12 x64 Windows binary
>> release:
>>
>>  *  Normal build: 1,42 GB
>>  *  shlib build: 536 MB
>>
>> The shlib release build compiled and linked fine with LLVM.dll and
>> clang-cpp.dll, unfortunately it crashes at runtime.
>
> Without digging into the scripts, I have one hunch:
>
> Does the def generator script differentiate between code and data symbols?

No. The CMake script just calls llvm-nm, exports the symbols and filters some out.

llvm-nm needs to exist beforehand. I tried using dumpbin, but it's slower than llvm-nm. CMake's __create_def also works, but it does not export all the symbols needed to link properly.
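
To make the code/data distinction concrete, here is a rough sketch of a generator that does tell the two apart. It is not what the CMake script in the patch does. It assumes nm-style input lines of the form "<address> <type> <name>", with the usual type letters (T for code; B, D and R for data), and the function name makeDefFile is made up for this example.

```cpp
#include <sstream>
#include <string>

// Turn nm-style lines ("<address> <type> <name>") into .def EXPORTS entries.
// Assumption: T marks code and B, D, R mark data; all other types are skipped.
std::string makeDefFile(const std::string &nmOutput) {
  std::istringstream in(nmOutput);
  std::ostringstream out;
  out << "EXPORTS\n";
  std::string line;
  while (std::getline(in, line)) {
    std::istringstream fields(line);
    std::string address, type, name;
    // Undefined symbols have no address column, so they yield only two
    // fields and are dropped here.
    if (!(fields >> address >> type >> name) || type.size() != 1)
      continue;
    switch (type[0]) {
    case 'T': // function: a plain export is enough
      out << "  " << name << "\n";
      break;
    case 'B':
    case 'D':
    case 'R': // variable: mark it so only __imp_<name> is provided
      out << "  " << name << " DATA\n";
      break;
    default:
      break;
    }
  }
  return out.str();
}
```

Marking variables as DATA would turn a silent runtime crash into a link error for any consumer that lacks a dllimport declaration, which at least makes the missing annotations easy to find.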

> For the cases where accessing a data symbol from another DLL, the caller would have to have seen a declaration with the dllimport attribute. For functions, it doesn't matter (it just does an extra hop via the import thunk), but for data variables it matters. If the def file would have proper DATA annotations for such symbols, you would end up with linker errors (where you'd have an undefined reference to dataSymbol, where the import library only provides __imp_dataSymbol).
>
> This is fixed up by the autoimport feature when linking in mingw mode (which, in general, requires you to link against the mingw runtime too); for cases where the caller references dataSymbol but you only have __imp_dataSymbol available, the linker adds an entry to a list of pseudo relocations, which the mingw runtime handles when loaded, which then maps sections as writable and patches up the addresses to where they are located in another DLL.
>
> So to avoid this, we would either need to actually provide proper dllimport declarations at least for all data symbols, or avoid cross-DLL data accesses (by using e.g. accessor functions instead).


According to <https://docs.microsoft.com/en-us/cpp/build/reference/exports?view=msvc-160>:

"When you export a variable from a DLL by using a .DEF file, you do not have to specify __declspec(dllexport) on the variable. However, in any file that uses the DLL, you must still use __declspec(dllimport) on the declaration of data."

I then tested this out in a small project: https://github.com/cristianadam/test-dll-def/

After that, I added dllimport declarations for llvm::cl::TopLevelSubCommand and llvm::cl::AllSubCommands, as seen in the updated patch:
https://github.com/cristianadam/llvm-project/commit/56ecad41992bd9345702fccaf3805ab186dca25c

Now llvm-nm doesn't crash anymore!

Adding dllimport declarations only for the exported data symbols should be less work than adding proper dllimport declarations for everything used from LLVM.dll and clang-cpp.dll. Now I just need to find out which data variables I need to update.

Thank you!

Cheers,
Cristian.

Joachim Meyer via llvm-dev

Jun 14, 2021, 1:34:09 PM
to llvm...@lists.llvm.org
Hi all,

I'm new to this mailing list, so here is a short intro from my side :)
======
I am a computer engineering student from Heidelberg University, currently
working on my master's thesis where I extend the hipSYCL Clang plugin to
enable optimized support for SYCL's nd_range parallel_for on CPUs.
In hipSYCL we happen to be able to support Windows with our Clang plugin and
would obviously love to keep it that way to bring my current work to Windows
at some point as well.

The pain point starts when using the new PM with what is currently the only
optimization pass in the plugin, as the static Key members of Analyses are not
correctly exported/imported on Windows. As mentioned by Martin, this could be
worked around by moving them into function-local static variables, but this
would be just another workaround and would not solve the general issue with
symbol exports on Windows, which also includes some symbols being filtered out,
making the system pretty flaky to work with.
(E.g. llvm::DominatorTreeBase<class llvm::BasicBlock, 1> seems to be exported
correctly, but llvm::DominatorTreeBase<class llvm::BasicBlock, 0> is not, for no
obvious reason. Ref:
https://github.com/fodinabor/hipSYCL/runs/2736333414?check_suite_focus=true#step:10:3489)

As far as I understand, marking required symbols as exported can help in this
situation, especially as proper support for using LLVM as a shared library in
the Clang driver and so on could drastically reduce the symbol pressure in
each of the binaries.
======

I am thus coming from a slightly different angle to the problem, but see the
same ideal solution as Reid and Fangrui, therefore I'd like to ask *if there
are objections against starting to explicitly mark symbols as being exported*.

The proposal from last week's Windows/COFF call was to start by adding a
header defining `LLVM_EXPORT` macros for explicitly marking exported symbols,
which can also be used to mark data symbols as `__declspec(dllimport)`.
With this in the codebase, the symbols required when using shared libraries
can be marked incrementally, so that more API boundaries can rely on the
explicit exports.
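
To give an idea of what such a header might look like, here is a minimal
sketch. Everything in it is hypothetical: DEMO_EXPORT stands in for the
proposed `LLVM_EXPORT`, DEMO_BUILD_SHARED and DEMO_EXPORTS are made-up
configuration macros, and Widget / countWidgets are placeholder
declarations. The definitions are included only so the sketch builds as a
single file.

```cpp
// Hypothetical export header; macro and config names are illustrative only.
#if defined(_WIN32)
#  if defined(DEMO_BUILD_SHARED) && defined(DEMO_EXPORTS)
#    define DEMO_EXPORT __declspec(dllexport) // compiling the DLL itself
#  elif defined(DEMO_BUILD_SHARED)
#    define DEMO_EXPORT __declspec(dllimport) // compiling a user of the DLL
#  else
#    define DEMO_EXPORT                       // static build: no annotation
#  endif
#elif defined(__GNUC__)
// On ELF/Mach-O this keeps the symbol visible under -fvisibility=hidden.
#  define DEMO_EXPORT __attribute__((visibility("default")))
#else
#  define DEMO_EXPORT
#endif

// Annotating the class covers its member functions and static data members,
// so users of a DLL build see the static member as dllimport.
class DEMO_EXPORT Widget {
public:
  int size() const;
  static int Key;
};

// Free functions are annotated individually.
DEMO_EXPORT int countWidgets();

// Definitions, which would normally live in the library's .cpp files.
int Widget::Key = 0;
int Widget::size() const { return 1; }
int countWidgets() { return 42; }
```

The same macro then serves both purposes mentioned above: exporting from
the library and importing (including data) in its users.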

As I see it, this is also in line with the comment by Tom Stellard regarding
the LLD-as-a-library design, where they wish for a single shared library that
only exports symbols if explicitly marked.

Another suggestion in the call was to provide a single shared library for each
of the projects (LLVM, Clang, ..), instead of many small shared libraries, as
is the case at the moment when using shared lib builds. The obligatory
exception to the rule might be the Support lib, which could be a useful
candidate to keep separate.

To conclude, I'd love feedback on whether the wider community generally sees
this as worth pursuing, and I am open to more suggestions on the best way
forward.

-- Joachim Meyer


Tom Stellard via llvm-dev

Jun 14, 2021, 1:49:32 PM
to Joachim Meyer, llvm...@lists.llvm.org

This would be a huge improvement. If you want to work on it, go for it.


> The proposal from last weeks' Windows/COFF call, was to start by adding a
> header defining `LLVM_EXPORT` macros for explicit marking of exported symbols,
> which can be used to mark data symbols as `__declspec(dllimport)` as well.
> With this in the codebase, the symbols required when using shared libraries,
> can be marked successively to enable more API boundaries to rely on the
> explicit exports.
>

There is already the LLVM_EXTERNAL_VISIBILITY macro defined in
llvm/Support/Compiler.h, which is used in llvm/lib/Target.
I would start using this one instead of creating a new LLVM_EXPORT macro.
We can always rename the macro later if people like the name LLVM_EXPORT
better.

- Tom


Reid Kleckner via llvm-dev

Jun 15, 2021, 2:00:40 PM
to Tom Stellard, llvm...@lists.llvm.org
On Mon, Jun 14, 2021 at 10:49 AM Tom Stellard via llvm-dev <llvm...@lists.llvm.org> wrote:
> There is already the LLVM_EXTERNAL_VISIBILITY macro defined in
> llvm/Support/Compiler.h macro which is used in llvm/lib/Target.
> I would start using this one instead of creating a new LLVM_EXPORT macro.
> We can always rename the macro later if people like the name LLVM_EXPORT
> better.

IMO we should go ahead and do a renaming and make project-specific headers. In the end, we need different macros for clang, lld, and llvm, and it seems wrong to put all of those in llvm/Support/Compiler.h.

Personally, I like the *_EXPORT name, but the other widely used convention is *_API. I think ICU uses that.

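Here are some examples of existing export header templates:
https://cmake.org/cmake/help/latest/module/GenerateExportHeader.html
https://gitlab.kitware.com/cmake/cmake/-/blob/master/Modules/exportheader.cmake.in
https://source.chromium.org/chromium/chromium/src/+/main:base/base_export.h;l=12?q=base_export%20&ss=chromium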

Tom Stellard via llvm-dev

Jun 15, 2021, 3:49:52 PM
to Reid Kleckner, llvm...@lists.llvm.org
On 6/15/21 11:00 AM, Reid Kleckner wrote:

> On Mon, Jun 14, 2021 at 10:49 AM Tom Stellard via llvm-dev <llvm...@lists.llvm.org> wrote:
>
> There is already the LLVM_EXTERNAL_VISIBILITY macro defined in
> llvm/Support/Compiler.h macro which is used in llvm/lib/Target.
> I would start using this one instead of creating a new LLVM_EXPORT macro.
> We can always rename the macro later if people like the name LLVM_EXPORT
> better.
>
>
> IMO we should go ahead and do a renaming and make project-specific headers. In the end, we need different macros for clang, lld, and llvm, and it seems wrong to put all of those in llvm/Support/Compiler.h.
>

Ok, this is fine with me too.

-Tom

> Personally, I like the *_EXPORT name, but the other widely used convention is *_API. I think ICU uses that.
>
> Here are some examples of existing export header templates:

> https://cmake.org/cmake/help/latest/module/GenerateExportHeader.html
> https://gitlab.kitware.com/cmake/cmake/-/blob/master/Modules/exportheader.cmake.in
> https://source.chromium.org/chromium/chromium/src/+/main:base/base_export.h;l=12?q=base_export%20&ss=chromium
