[boost] [cmake] Problem with dependencies for configure checks

149 views
Skip to first unread message

Andrey Semashev via Boost

unread,
Jun 3, 2021, 10:22:50 AM6/3/21
to bo...@lists.boost.org, Andrey Semashev
Hi,

I've been having this problem with CMake in Boost.Filesystem, where I
need to have the Boost::library targets available before
Boost::filesystem target is defined. This is needed in order to perform
configuration checks that will affect how Boost::filesystem will be
defined. For example, see the check_cxx_source_compiles calls:

https://github.com/boostorg/filesystem/blob/af6ac28b5785820b93a2af7dd6e9b801232fec19/CMakeLists.txt#L23-L50

For the check_cxx_source_compiles calls, I would have liked to add
targets like Boost::config and Boost::winapi to CMAKE_REQUIRED_LIBRARIES
so that the checks are compiled with include directories of those
dependencies added in the command line. This doesn't work because these
targets may not be defined at the point when the configuration checks
are performed.

My current workaround, as you can see, is to add the Boost root to
CMAKE_REQUIRED_INCLUDES, where the boost directory with the common
headers tree is located. However, this will not work if we're going to
make that directory optional or remove it. Also, you can see that I'm
reusing checks implemented in Boost.Config, for which I also had to hard
code its relative location from Boost.Filesystem. If we're going to
allow different libraries located in separate directory trees, this will
also break.

My question is, can this be fixed and how?

I was wondering if it is possible to make the target generation
two-stage, where the first stage identifies library locations, including
directories with their headers, and the second stage would define build
targets. This way all library directories would be available to the
second stage. Although I can already see the problem with second order
dependencies with this approach...

_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost

Peter Dimov via Boost

unread,
Jun 3, 2021, 11:43:44 AM6/3/21
to bo...@lists.boost.org, Peter Dimov
Andrey Semashev wrote:
> Hi,
>
> I've been having this problem with CMake in Boost.Filesystem, where I need to
> have the Boost::library targets available before Boost::filesystem target is
> defined. This is needed in order to perform configuration checks that will
> affect how Boost::filesystem will be defined. For example, see the
> check_cxx_source_compiles calls:
>
> https://github.com/boostorg/filesystem/blob/af6ac28b5785820b93a2af7dd6
> e9b801232fec19/CMakeLists.txt#L23-L50
>
> For the check_cxx_source_compiles calls, I would have liked to add targets like
> Boost::config and Boost::winapi to CMAKE_REQUIRED_LIBRARIES so that the
> checks are compiled with include directories of those dependencies added in
> the command line. This doesn't work because these targets may not be
> defined at the point when the configuration checks are performed.
>
> My current workaround, as you can see, is to add the Boost root to
> CMAKE_REQUIRED_INCLUDES, where the boost directory with the common
> headers tree is located. However, this will not work if we're going to make that
> directory optional or remove it. Also, you can see that I'm reusing checks
> implemented in Boost.Config, for which I also had to hard code its relative
> location from Boost.Filesystem. If we're going to allow different libraries
> located in separate directory trees, this will also break.
>
> My question is, can this be fixed and how?

I add ../config/include to CMAKE_REQUIRED_INCLUDES.

https://github.com/boostorg/stacktrace/blob/79aff77771a8f3a3d62852de6619a955fd4c2634/CMakeLists.txt#L75

I don't think we should mandate add_subdirectory calls to come in a specific
order. Even if we can hack something in the superproject to do it, users who use
add_subdirectory manually will need to get the order right.

Incidentally,

#include <${CMAKE_CURRENT_SOURCE_DIR}/config/has_stat_st_blksize.cpp>

doesn't work on Cygwin with the Windows CMake; this ends up generating

#include <C:/boost-dir/libs/filesystem/config/ has_stat_st_blksize.cpp>

which the Cygwin compiler doesn't like and the test always fails. So I've had to use

https://github.com/boostorg/stacktrace/blob/79aff77771a8f3a3d62852de6619a955fd4c2634/CMakeLists.txt#L59

instead.

Andrey Semashev via Boost

unread,
Jun 3, 2021, 12:03:25 PM6/3/21
to bo...@lists.boost.org, Andrey Semashev
But that would mean one has to specify second order dependencies as well
for configure checks. This might be fine for Boost.Config, but not for
other libraries that have dependencies. Boost.WinAPI is an example; I
wouldn't want to replicate its dependencies in all downstream libraries
that have checks involving it.

> I don't think we should mandate add_subdirectory calls to come in a specific
> order. Even if we can hack something in the superproject to do it, users who use
> add_subdirectory manually will need to get the order right.
>
> Incidentally,
>
> #include <${CMAKE_CURRENT_SOURCE_DIR}/config/has_stat_st_blksize.cpp>
>
> doesn't work on Cygwin with the Windows CMake; this ends up generating
>
> #include <C:/boost-dir/libs/filesystem/config/ has_stat_st_blksize.cpp>
>
> which the Cygwin compiler doesn't like and the test always fails. So I've had to use
>
> https://github.com/boostorg/stacktrace/blob/79aff77771a8f3a3d62852de6619a955fd4c2634/CMakeLists.txt#L59
>
> instead.

I'm not sure I understand. Is it that Cygwin doesn't accept angle
brackets with Windows paths?

Niall Douglas via Boost

unread,
Jun 3, 2021, 1:42:58 PM6/3/21
to bo...@lists.boost.org, Niall Douglas
On 03/06/2021 14:54, Andrey Semashev via Boost wrote:

> For the check_cxx_source_compiles calls, I would have liked to add
> targets like Boost::config and Boost::winapi to CMAKE_REQUIRED_LIBRARIES
> so that the checks are compiled with include directories of those
> dependencies added in the command line. This doesn't work because these
> targets may not be defined at the point when the configuration checks
> are performed.

You might find `if(TARGET <tgt>)` useful here.

Niall

Andrey Semashev via Boost

unread,
Jun 3, 2021, 2:19:25 PM6/3/21
to bo...@lists.boost.org, Andrey Semashev
On 6/3/21 8:42 PM, Niall Douglas via Boost wrote:
> On 03/06/2021 14:54, Andrey Semashev via Boost wrote:
>
>> For the check_cxx_source_compiles calls, I would have liked to add
>> targets like Boost::config and Boost::winapi to CMAKE_REQUIRED_LIBRARIES
>> so that the checks are compiled with include directories of those
>> dependencies added in the command line. This doesn't work because these
>> targets may not be defined at the point when the configuration checks
>> are performed.
>
> You might find `if(TARGET <tgt>)` useful here.

I don't think that helps in this case. The point is not to skip
configuration checks if the target is not available. The checks must
execute, I just want to have a way to specify include directories even
when the targets are not yet defined.

Peter Dimov via Boost

unread,
Jun 3, 2021, 2:31:03 PM6/3/21
to bo...@lists.boost.org, Peter Dimov
Andrey Semashev wrote:
> But that would mean one has to specify second order dependencies as well for
> configure checks.

¯\_(ツ)_/¯

> I'm not sure I understand. Is it that Cygwin doesn't accept angle brackets with
> Windows paths?

The Cygwin compiler just doesn't accept Windows paths, as far as I can see.

Performing C++ SOURCE FILE Test BOOST_FILESYSTEM_HAS_STAT_ST_BLKSIZE failed with the following output:
Change Dir: C:/boost-git/develop/__build_gcc/CMakeFiles/CMakeTmp

Run Build Command(s):C:/cygwin64/bin/make.exe -f Makefile cmTC_8f9df/fast && /usr/bin/make -f CMakeFiles/cmTC_8f9df.dir/build.make CMakeFiles/cmTC_8f9df.dir/build
make[1]: Entering directory '/cygdrive/c/boost-git/develop/__build_gcc/CMakeFiles/CMakeTmp'
Building CXX object CMakeFiles/cmTC_8f9df.dir/src.cxx.obj

C:/cygwin64/bin/c++.exe -DBOOST_FILESYSTEM_HAS_STAT_ST_BLKSIZE @CMakeFiles/cmTC_8f9df.dir/includes_CXX.rsp -o CMakeFiles/cmTC_8f9df.dir/src.cxx.obj -c C:/boost-git/develop/__build_gcc/CMakeFiles/CMakeTmp/src.cxx
C:/boost-git/develop/__build_gcc/CMakeFiles/CMakeTmp/src.cxx:1:10: fatal error: C:/boost-git/develop/libs/filesystem/config/has_stat_st_blksize.cpp: No such file or directory
1 | #include <C:/boost-git/develop/libs/filesystem/config/has_stat_st_blksize.cpp>
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.
make[1]: *** [CMakeFiles/cmTC_8f9df.dir/build.make:79: CMakeFiles/cmTC_8f9df.dir/src.cxx.obj] Error 1
make[1]: Leaving directory '/cygdrive/c/boost-git/develop/__build_gcc/CMakeFiles/CMakeTmp'
make: *** [Makefile:127: cmTC_8f9df/fast] Error 2


Source file was:
#include <C:/boost-git/develop/libs/filesystem/config/has_stat_st_blksize.cpp>

Andrey Semashev via Boost

unread,
Jun 3, 2021, 2:40:52 PM6/3/21
to bo...@lists.boost.org, Andrey Semashev
On 6/3/21 9:30 PM, Peter Dimov via Boost wrote:
> Andrey Semashev wrote:
>> But that would mean one has to specify second order dependencies as well for
>> configure checks.
>
> ¯\_(ツ)_/¯

Ok, so it seems I'll have to add a check to enforce that the unified
headers directory exists. This means that we are still dependent on `b2
headers`, at least until there is a CMake equivalent.

>> I'm not sure I understand. Is it that Cygwin doesn't accept angle brackets with
>> Windows paths?
>
> The Cygwin compiler just doesn't accept Windows paths, as far as I can see.
>
> Performing C++ SOURCE FILE Test BOOST_FILESYSTEM_HAS_STAT_ST_BLKSIZE failed with the following output:
> Change Dir: C:/boost-git/develop/__build_gcc/CMakeFiles/CMakeTmp
>
> Run Build Command(s):C:/cygwin64/bin/make.exe -f Makefile cmTC_8f9df/fast && /usr/bin/make -f CMakeFiles/cmTC_8f9df.dir/build.make CMakeFiles/cmTC_8f9df.dir/build
> make[1]: Entering directory '/cygdrive/c/boost-git/develop/__build_gcc/CMakeFiles/CMakeTmp'
> Building CXX object CMakeFiles/cmTC_8f9df.dir/src.cxx.obj
>
> C:/cygwin64/bin/c++.exe -DBOOST_FILESYSTEM_HAS_STAT_ST_BLKSIZE @CMakeFiles/cmTC_8f9df.dir/includes_CXX.rsp -o CMakeFiles/cmTC_8f9df.dir/src.cxx.obj -c C:/boost-git/develop/__build_gcc/CMakeFiles/CMakeTmp/src.cxx
> C:/boost-git/develop/__build_gcc/CMakeFiles/CMakeTmp/src.cxx:1:10: fatal error: C:/boost-git/develop/libs/filesystem/config/has_stat_st_blksize.cpp: No such file or directory
> 1 | #include <C:/boost-git/develop/libs/filesystem/config/has_stat_st_blksize.cpp>
> | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> compilation terminated.
> make[1]: *** [CMakeFiles/cmTC_8f9df.dir/build.make:79: CMakeFiles/cmTC_8f9df.dir/src.cxx.obj] Error 1
> make[1]: Leaving directory '/cygdrive/c/boost-git/develop/__build_gcc/CMakeFiles/CMakeTmp'
> make: *** [Makefile:127: cmTC_8f9df/fast] Error 2
>
>
> Source file was:
> #include <C:/boost-git/develop/libs/filesystem/config/has_stat_st_blksize.cpp>

Ok, I think I understand. You're using native Windows CMake that
generates Windows paths and a Cygwin compiler that doesn't understand
them. That looks like an incorrect config to me; you should be using
Cygwin CMake, which should generate Cygwin paths both in source files
and command lines.

Peter Dimov via Boost

unread,
Jun 3, 2021, 3:00:19 PM6/3/21
to bo...@lists.boost.org, Peter Dimov
Andrey Semashev wrote:
> On 6/3/21 9:30 PM, Peter Dimov via Boost wrote:
> > Andrey Semashev wrote:
> >> But that would mean one has to specify second order dependencies as
> >> well for configure checks.
> >
> > ¯\_(ツ)_/¯
>
> Ok, so it seems I'll have to add a check to enforce that the unified headers
> directory exists. This means that we are still dependent on `b2 headers`, at
> least until there is a CMake equivalent.

"We" are not dependent on any such thing, and there will never be such a
CMake equivalent. This directory will not be maintained in a CMake build.

Edward Diener via Boost

unread,
Jun 3, 2021, 3:02:16 PM6/3/21
to bo...@lists.boost.org, Edward Diener

Maybe

#include
<cygdrive/c/boost-git/develop/libs/filesystem/config/has_stat_st_blksize.cpp>

works ?

Andrey Semashev via Boost

unread,
Jun 3, 2021, 3:18:09 PM6/3/21
to bo...@lists.boost.org, Andrey Semashev
On 6/3/21 9:59 PM, Peter Dimov via Boost wrote:
> Andrey Semashev wrote:
>> On 6/3/21 9:30 PM, Peter Dimov via Boost wrote:
>>> Andrey Semashev wrote:
>>>> But that would mean one has to specify second order dependencies as
>>>> well for configure checks.
>>>
>>> ¯\_(ツ)_/¯
>>
>> Ok, so it seems I'll have to add a check to enforce that the unified headers
>> directory exists. This means that we are still dependent on `b2 headers`, at
>> least until there is a CMake equivalent.
>
> "We" are not dependent on any such thing, and there will never be such a
> CMake equivalent. This directory will not be maintained in a CMake build.

Well, the only other alternative I see is globbing through all libraries
to collect their include paths. And that would have to be done in every
library, unless a common solution exists. I just looked at Boost.Atomic
and it also relies on the unified headers dir, and I'm pretty sure I'm
going to need it in Boost.Log. So the scanning will happen at least
three times per build, which is already not good.

Peter Dimov via Boost

unread,
Jun 3, 2021, 6:05:51 PM6/3/21
to bo...@lists.boost.org, Peter Dimov
Andrey Semashev wrote:
> >> Ok, so it seems I'll have to add a check to enforce that the unified
> >> headers directory exists. This means that we are still dependent on
> >> `b2 headers`, at least until there is a CMake equivalent.
> >
> > "We" are not dependent on any such thing, and there will never be such
> > a CMake equivalent. This directory will not be maintained in a CMake build.
>
> Well, the only other alternative I see is globbing through all libraries to collect
> their include paths. And that would have to be done in every library, unless a
> common solution exists.

Why would every library need to do it? Why would it need to glob? I've done
most of them by now and not one of them needs to do so.

Andrey Semashev via Boost

unread,
Jun 3, 2021, 7:37:54 PM6/3/21
to bo...@lists.boost.org, Andrey Semashev
On 6/4/21 1:05 AM, Peter Dimov via Boost wrote:
> Andrey Semashev wrote:
>>>> Ok, so it seems I'll have to add a check to enforce that the unified
>>>> headers directory exists. This means that we are still dependent on
>>>> `b2 headers`, at least until there is a CMake equivalent.
>>>
>>> "We" are not dependent on any such thing, and there will never be such
>>> a CMake equivalent. This directory will not be maintained in a CMake build.
>>
>> Well, the only other alternative I see is globbing through all libraries to collect
>> their include paths. And that would have to be done in every library, unless a
>> common solution exists.
>
> Why would every library need to do it? Why would it need to glob? I've done
> most of them by now and not one of them needs to do so.

Probably because most of them don't do configure checks or don't use
Boost while doing so. In the cases I mentioned, I do use Boost for
configure time checks, and globbing is a way to avoid having to hardcode
the dependency tree for those checks.

Peter Dimov via Boost

unread,
Jun 3, 2021, 8:29:39 PM6/3/21
to bo...@lists.boost.org, Peter Dimov
Andrey Semashev wrote:
> > Why would every library need to do it? Why would it need to glob? I've
> > done most of them by now and not one of them needs to do so.
>
> Probably because most of them don't do configure checks or don't use Boost
> while doing so. In the cases I mentioned, I do use Boost for configure time
> checks, and globbing is a way to avoid having to hardcode the dependency
> tree for those checks.

I don't think globbing would be a particularly good practice in this case, but
whatever floats your boat, I suppose. I will continue hardcoding the required
include directories. That's not that great a practice either (*), but I don't see
anything better.

Why is that even necessary? Configure checks should be standalone. There's
no problem using <windows.h> there instead of insisting on winapi, is there?

The only other such place is Stacktrace, which wants to #error in its check
when BOOST_NO_CXX11_THREAD_LOCAL is defined, even though it already
uses thread_local itself, which would presumably fail. I understand why it's
done - Config presumably encodes knowledge of platforms where
thread_local compiles but then doesn't work - but the right thing would still
have been to make the check standalone. I was just too lazy to do it.

(*) It will fail when dependencies are pulled at configure time by
FetchContent or package manager integration. E.g.

pacman_add_subdir(boost_this boost/this)
pacman_add_subdir(boost_that boost/that)
pacman_add_subdir(boost_other boost/other)

If boost_this wants boost_other, it won't find it, because it hasn't been
downloaded yet. It will find it on the second configure run, but the checks
would have been cached then, and will not run.

Andrey Semashev via Boost

unread,
Jun 3, 2021, 8:47:27 PM6/3/21
to bo...@lists.boost.org, Andrey Semashev
On 6/4/21 3:29 AM, Peter Dimov via Boost wrote:
> Andrey Semashev wrote:
>>> Why would every library need to do it? Why would it need to glob? I've
>>> done most of them by now and not one of them needs to do so.
>>
>> Probably because most of them don't do configure checks or don't use Boost
>> while doing so. In the cases I mentioned, I do use Boost for configure time
>> checks, and globbing is a way to avoid having to hardcode the dependency
>> tree for those checks.
>
> I don't think globbing would be a particularly good practice in this case, but
> whatever floats your boat, I suppose.

Sure, I don't like it either. That's why I asked if something better
could be done in Boost.CMake, or however it is called.

> I will continue hardcoding the required
> include directories. That's not that great a practice either (*), but I don't see
> anything better.
>
> Why is that even necessary? Configure checks should be standalone. There's
> no problem using <windows.h> there instead of insisting on winapi, is there?

In the particular case of Boost.WinAPI I want it (Boost.WinAPI) to
define all the version macros it defines. This way I ensure that the
config check is compiling in the same environment as the library will. I
actually include a special header from Boost.Filesystem
(platform_config.hpp) to define all platform-specific macros, which
indirectly includes Boost.Config and Boost.WinAPI.

In Boost.Log I have a config check that uses Boost.Atomic to test if it
supports the required lock-free operations on a given target platform.

> (*) It will fail when dependencies are pulled at configure time by
> FetchContent or package manager integration. E.g.
>
> pacman_add_subdir(boost_this boost/this)
> pacman_add_subdir(boost_that boost/that)
> pacman_add_subdir(boost_other boost/other)
>
> If boost_this wants boost_other, it won't find it, because it hasn't been
> downloaded yet. It will find it on the second configure run, but the checks
> would have been cached then, and will not run.

I'm inclined to simply not support incomplete source trees.

Peter Dimov via Boost

unread,
Jun 3, 2021, 9:07:43 PM6/3/21
to bo...@lists.boost.org, Peter Dimov
Andrey Semashev wrote:
> In the particular case of Boost.WinAPI I want it (Boost.WinAPI) to define all the
> version macros it defines. This way I ensure that the config check is compiling
> in the same environment as the library will. I actually include a special header
> from Boost.Filesystem
> (platform_config.hpp) to define all platform-specific macros, which indirectly
> includes Boost.Config and Boost.WinAPI.

While on the subject of Filesystem configure checks, I don't know if it's been
reported already, but the statx kernel call is causing problems; Filesystem
operations failing with ENOTSUPP.

You shouldn't be assuming that the machine where the library is built will
be the same as the one the program will run on. (I think this has come up
in the context of Docker, where Filesystem is probably built in the host
environment but the program that uses it runs in Docker and apparently
the supported kernel calls aren't the same. Or perhaps it's built when the
Docker image is created, not sure.)

Andrey Semashev via Boost

unread,
Jun 4, 2021, 5:13:37 AM6/4/21
to bo...@lists.boost.org, Andrey Semashev
On 6/4/21 4:07 AM, Peter Dimov via Boost wrote:
> Andrey Semashev wrote:
>> In the particular case of Boost.WinAPI I want it (Boost.WinAPI) to define all the
>> version macros it defines. This way I ensure that the config check is compiling
>> in the same environment as the library will. I actually include a special header
>> from Boost.Filesystem
>> (platform_config.hpp) to define all platform-specific macros, which indirectly
>> includes Boost.Config and Boost.WinAPI.
>
> While on the subject of Filesystem configure checks, I don't know if it's been
> reported already, but the statx kernel call is causing problems; Filesystem
> operations failing with ENOTSUPP.

ENOSYS, perhaps?

Yes, there were reports.

https://github.com/boostorg/filesystem/issues/172

> You shouldn't be assuming that the machine where the library is built will
> be the same as the one the program will run on. (I think this has come up
> in the context of Docker, where Filesystem is probably built in the host
> environment but the program that uses it runs in Docker and apparently
> the supported kernel calls aren't the same. Or perhaps it's built when the
> Docker image is created, not sure.)

My stance on this is that Boost.Filesystem must be built on a system
with kernel headers matching or older than the systems that will run the
binary. This matches the same requirements for Windows. I don't support
configs with headers newer than runtime kernel.

Niall Douglas via Boost

unread,
Jun 4, 2021, 7:04:26 AM6/4/21
to bo...@lists.boost.org, Niall Douglas
On 04/06/2021 10:12, Andrey Semashev via Boost wrote:

> My stance on this is that Boost.Filesystem must be built on a system
> with kernel headers matching or older than the systems that will run the
> binary. This matches the same requirements for Windows. I don't support
> configs with headers newer than runtime kernel.

It's trivially easy to do a runtime check for statx kernel support and
fall back on older syscalls if needed.

This is LLFIO doing so, and it works on Linux kernel 2.6:

https://github.com/ned14/llfio/blob/develop/include/llfio/v2.0/detail/impl/posix/stat.ipp#L205

Niall

Andrey Semashev via Boost

unread,
Jun 4, 2021, 8:06:55 AM6/4/21
to bo...@lists.boost.org, Andrey Semashev
On 6/4/21 2:03 PM, Niall Douglas via Boost wrote:
> On 04/06/2021 10:12, Andrey Semashev via Boost wrote:
>
>> My stance on this is that Boost.Filesystem must be built on a system
>> with kernel headers matching or older than the systems that will run the
>> binary. This matches the same requirements for Windows. I don't support
>> configs with headers newer than runtime kernel.
>
> It's trivially easy to do a runtime check for statx kernel support and
> fall back on older syscalls if needed.
>
> This is LLFIO doing so, and it works on Linux kernel 2.6:
>
> https://github.com/ned14/llfio/blob/develop/include/llfio/v2.0/detail/impl/posix/stat.ipp#L205

The code you pointed to is not what I would call trivial. There are
quite a few places where statx is used in Boost.Filesystem, in some
places exclusively (i.e. there is no fallback). I could add a runtime
check, but I definitely don't want that check to happen on every call.
This means the result must be stored globally in a thread-safe manner.
Then there are other syscalls, on Linux and other systems, should I also
add runtime checks for those? All that to support what I consider
improperly configured systems. It just doesn't look like a worthwhile
effort while I have other things to do, including in Boost.Filesystem too.

Peter Dimov via Boost

unread,
Jun 4, 2021, 10:23:38 AM6/4/21
to bo...@lists.boost.org, Peter Dimov
Andrey Semashev wrote:
> My stance on this is that Boost.Filesystem must be built on a system with
> kernel headers matching or older than the systems that will run the binary.
> This matches the same requirements for Windows. I don't support configs
> with headers newer than runtime kernel.

I don't think this is a realistic requirement to put on users. Is it possible to
handle the runtime failure and fall back on the no-statx code path?

Peter Dimov via Boost

unread,
Jun 4, 2021, 10:27:18 AM6/4/21
to bo...@lists.boost.org, Peter Dimov
> I don't think globbing would be a particularly good practice in this case, but
> whatever floats your boat, I suppose. I will continue hardcoding the required
> include directories.

To put this in perspective, since winapi only depends on config and predef, and
those depend on nothing else, you only need three include directories:

"../config/include" "../predef/include" "../winapi/include"

Globbing, in contrast, would add 141 include directories.

I fail to see how the latter can in any way be considered a superior solution.

Niall Douglas via Boost

unread,
Jun 4, 2021, 10:45:20 AM6/4/21
to bo...@lists.boost.org, Niall Douglas
On 04/06/2021 13:04, Andrey Semashev via Boost wrote:
> On 6/4/21 2:03 PM, Niall Douglas via Boost wrote:
>> On 04/06/2021 10:12, Andrey Semashev via Boost wrote:
>>
>>> My stance on this is that Boost.Filesystem must be built on a system
>>> with kernel headers matching or older than the systems that will run the
>>> binary. This matches the same requirements for Windows. I don't support
>>> configs with headers newer than runtime kernel.
>>
>> It's trivially easy to do a runtime check for statx kernel support and
>> fall back on older syscalls if needed.
>>
>> This is LLFIO doing so, and it works on Linux kernel 2.6:
>>
>> https://github.com/ned14/llfio/blob/develop/include/llfio/v2.0/detail/impl/posix/stat.ipp#L205
>>
>
> The code you pointed to is not what I would call trivial. There are
> quite a few places where statx is used in Boost.Filesystem, in some
> places exclusively (i.e. there is no fallback).

There are only a **very** few things which statx provides which older
syscalls cannot. statx's big gain is it is a snapshot not subject to
races, but that hardly affects Filesystem.

> I could add a runtime
> check, but I definitely don't want that check to happen on every call.
> This means the result must be stored globally in a thread-safe manner.

At work we've got llfio::stat_t::fill() being called in some places
millions of times per second.

It is implemented by attempting statx first, and if it is not supported,
then stat or other syscalls to emulate the same.

You cannot cache statx support/non-support because only some filing
systems support statx, and others do not. So depending on the path used,
it may work, but at other times, not.

Right now at work we're on kernels which don't support statx, so we are
always getting back ENOSYS and then falling back to emulation. We have
not found any performance issues.

> Then there are other syscalls, on Linux and other systems, should I also
> add runtime checks for those? All that to support what I consider
> improperly configured systems. It just doesn't look like a worthwhile
> effort while I have other things to do, including in Boost.Filesystem too.

A very great deal of Boost users are either on older Linux kernels, or
are running Boost inside Docker containers where various newer syscalls
are disabled e.g. where I work. In my opinion, if you like your users,
you need to implement syscall fallback for all recent Linux syscalls,
same as you'd do on Windows. I agree that you can reasonably cut off all
Linux kernels past EOL, so you only need to support syscall fallback for
kernels since EOL.

Niall

Peter Dimov via Boost

unread,
Jun 4, 2021, 10:56:28 AM6/4/21
to bo...@lists.boost.org, Peter Dimov
Niall Douglas wrote:
> On 04/06/2021 13:04, Andrey Semashev via Boost wrote:
> > I could add a runtime
> > check, but I definitely don't want that check to happen on every call.
> > This means the result must be stored globally in a thread-safe manner.
>
> At work we've got llfio::stat_t::fill() being called in some places millions of
> times per second.
>
> It is implemented by attempting statx first, and if it is not supported, then stat
> or other syscalls to emulate the same.
>
> You cannot cache statx support/non-support because only some filing systems
> support statx, and others do not. So depending on the path used, it may work,
> but at other times, not.
>
> Right now at work we're on kernels which don't support statx, so we are
> always getting back ENOSYS and then falling back to emulation. We have not
> found any performance issues.

A kernel call is infinitely more expensive than checking a number against
ENOSYS. It's absolutely impossible for the test to ever be a performance issue.

Andrey Semashev via Boost

unread,
Jun 4, 2021, 11:07:05 AM6/4/21
to bo...@lists.boost.org, Andrey Semashev
On 6/4/21 5:26 PM, Peter Dimov via Boost wrote:
>> I don't think globbing would be a particularly good practice in this case, but
>> whatever floats your boat, I suppose. I will continue hardcoding the required
>> include directories.
>
> To put this in perspective, since winapi only depends on config and predef, and
> those depend on nothing else, you only need three include directories:
>
> "../config/include" "../predef/include" "../winapi/include"
>
> Globbing, in contrast, would add 141 include directories.

The number of include directories doesn't really matter, as long as they
don't cause conflicts (they shouldn't). The performance hit is
unfortunate, and as I said, I would have liked to avoid it. This is
ugly, I get it, and I fully agree, but I don't see a better way.

> I fail to see how the latter can in any way be considered a superior solution.

It is superior as it does not require active maintenance. Especially
given that a failed configure check because of a missing dependency
might not be noticeable in CI.

Of all resources we have in Boost, I consider maintainer's time and
effort the most valuable one. This is evident by the number of libraries
without active maintainance we have. Any solution that has zero
maintenance cost in the long run, without drastic downsides in other
areas, is a worthy solution.

Niall Douglas via Boost

unread,
Jun 4, 2021, 11:21:17 AM6/4/21
to bo...@lists.boost.org, Niall Douglas
On 04/06/2021 15:54, Peter Dimov via Boost wrote:
> Niall Douglas wrote:
>> On 04/06/2021 13:04, Andrey Semashev via Boost wrote:
>>> I could add a runtime
>>> check, but I definitely don't want that check to happen on every call.
>>> This means the result must be stored globally in a thread-safe manner.
>>
>> At work we've got llfio::stat_t::fill() being called in some places millions of
>> times per second.
>>
>> It is implemented by attempting statx first, and if it is not supported, then stat
>> or other syscalls to emulate the same.
>>
>> You cannot cache statx support/non-support because only some filing systems
>> support statx, and others do not. So depending on the path used, it may work,
>> but at other times, not.
>>
>> Right now at work we're on kernels which don't support statx, so we are
>> always getting back ENOSYS and then falling back to emulation. We have not
>> found any performance issues.
>
> A kernel call is infinitely more expensive than checking a number against
> ENOSYS. It's absolutely impossible for the test to ever be a performance issue.

Well, that depends.

If you're on a kernel that has no statx syscall, then the userspace
syscall dispatcher will spot the unknown syscall number, and you'll get
back ENOSYS very quickly, as you point out.

But if you're on a kernel with statx syscall, and instead it is the
filing system which doesn't support statx, then that's a full fat
syscall just to get back ENOSYS.

I don't personally think anyone who cares about metadata retrieval
performance would ever use Filesystem or std::filesystem because both
tend to do dynamic memory allocations, which easily swamp a statx call.
So, in that sense, even many syscalls can be done and it'll be
unmeasurable compared to say a filesystem path construction.

If this were Windows, because Filesystem calls many Win32 functions to
build a stat, that is a very different situation. But not on POSIX.

Niall

Peter Dimov via Boost

unread,
Jun 4, 2021, 11:28:36 AM6/4/21
to bo...@lists.boost.org, Peter Dimov
Andrey Semashev wrote:
> On 6/4/21 5:26 PM, Peter Dimov via Boost wrote:
> >> I don't think globbing would be a particularly good practice in this
> >> case, but whatever floats your boat, I suppose. I will continue
> >> hardcoding the required include directories.
> >
> > To put this in perspective, since winapi only depends on config and
> > predef, and those depend on nothing else, you only need three include
> directories:
> >
> > "../config/include" "../predef/include" "../winapi/include"
> >
> > Globbing, in contrast, would add 141 include directories.
>
> The number of include directories doesn't really matter, as long as they don't
> cause conflicts (they shouldn't). The performance hit is unfortunate, and as I
> said, I would have liked to avoid it. This is ugly, I get it, and I fully agree, but I
> don't see a better way.
>
> > I fail to see how the latter can in any way be considered a superior solution.
>
> It is superior as it does not require active maintenance. Especially given that a
> failed configure check because of a missing dependency might not be
> noticeable in CI.

We're talking about one of your libraries depending on another one of your
libraries. The dependencies of winapi aren't going to suddenly change. And no,
config and predef aren't going to acquire secondary dependencies either.

Checking that winapi can't be included isn't hard; add another check that only
includes it and checks nothing and message(AUTHOR_WARNING when it fails.

Andrey Semashev via Boost

unread,
Jun 4, 2021, 11:32:11 AM6/4/21
to bo...@lists.boost.org, Andrey Semashev
Isn't statx supposed to complete successfully through the stat route in
the kernel, if the filesystem does not support statx natively?

> Right now at work we're on kernels which don't support statx, so we are
> always getting back ENOSYS and then falling back to emulation. We have
> not found any performance issues.
>
>> Then there are other syscalls, on Linux and other systems, should I also
>> add runtime checks for those? All that to support what I consider
>> improperly configured systems. It just doesn't look like a worthwhile
>> effort while I have other things to do, including in Boost.Filesystem too.
>
> A very great deal of Boost users are either on older Linux kernels, or
> are running Boost inside Docker containers where various newer syscalls
> are disabled e.g. where I work. In my opinion, if you like your users,
> you need to implement syscall fallback for all recent Linux syscalls,
> same as you'd do on Windows. I agree that you can reasonably cut off all
> Linux kernels past EOL, so you only need to support syscall fallback for
> kernels since EOL.

My point is that if you're running a kernel that doesn't support statx
then you should compile Boost.Filesystem with headers from that kernel.
It will not have statx syscall and Boost.Filesystem will compile to use
the old stat syscall.

<rant>
Docker people seem to think of Docker images as of VMs while they are
not. Hence they consider it normal to build the image on one system and
then run it on another, with possibly older kernel. Well, this is wrong.
It doesn't work that way, and Boost.Filesystem is not the only piece of
software that will break because of that.
</rant>

Peter Dimov via Boost

unread,
Jun 4, 2021, 11:34:45 AM6/4/21
to bo...@lists.boost.org, Peter Dimov
Niall Douglas wrote:
> On 04/06/2021 15:54, Peter Dimov via Boost wrote:
> > A kernel call is infinitely more expensive than checking a number
> > against ENOSYS. It's absolutely impossible for the test to ever be a
> > performance issue.
>
> Well, that depends.
>
> If you're on a kernel that has no statx syscall, then the userspace syscall
> dispatcher will spot the unknown syscall number, and you'll get back ENOSYS
> very quickly, as you point out.
>
> But if you're on a kernel with statx syscall, and instead it is the filing system
> which doesn't support statx, then that's a full fat syscall just to get back
> ENOSYS.

Sure, but the alternative is to do that same fat syscall just to fail, as Filesystem
currently does. The performance concerns in our case are about the scenario
where the syscall doesn't fail, because that's the case that works. The
performance of something that doesn't work is irrelevant.

Peter Dimov via Boost

unread,
Jun 4, 2021, 11:37:35 AM6/4/21
to bo...@lists.boost.org, Peter Dimov
Andrey Semashev wrote:
> <rant>
> Docker people seem to think of Docker images as of VMs while they are not.
> Hence they consider it normal to build the image on one system and then run
> it on another, with possibly older kernel. Well, this is wrong.
> It doesn't work that way, and Boost.Filesystem is not the only piece of
> software that will break because of that.
> </rant>

As Niall mentioned, it might be that Docker just doesn't support the syscalls,
even when running on the same kernel.

Andrey Semashev via Boost

unread,
Jun 4, 2021, 11:39:56 AM6/4/21
to bo...@lists.boost.org, Andrey Semashev
Kernel should not return ENOSYS for a syscall that is implemented. This
error code is reserved for this specific purpose. What you should get is
probably ENOTSUP or EINVAL. Are you sure you're getting specifically ENOSYS?

> I don't personally think anyone who cares about metadata retrieval
> performance would ever use Filesystem or std::filesystem because both
> tend to do dynamic memory allocations, which easily swamp a statx call.
> So, in that sense, even many syscalls can be done and it'll be
> unmeasurable compared to say a filesystem path construction.
>
> If this were Windows, because Filesystem calls many Win32 functions to
> build a stat, that is a very different situation. But not on POSIX.

Boost.Filesystem doesn't implement stat, it uses it as a means to
implement other, more fine grained operations. On Windows, these
operations tend to be implemented with a single WinAPI call, in some
cases with a pair open/close file handle calls.

Andrey Semashev via Boost

unread,
Jun 4, 2021, 11:53:01 AM6/4/21
to bo...@lists.boost.org, Andrey Semashev
On 6/4/21 6:27 PM, Peter Dimov via Boost wrote:
> Andrey Semashev wrote:
>> On 6/4/21 5:26 PM, Peter Dimov via Boost wrote:
>>>> I don't think globbing would be a particularly good practice in this
>>>> case, but whatever floats your boat, I suppose. I will continue
>>>> hardcoding the required include directories.
>>>
>>> To put this in perspective, since winapi only depends on config and
>>> predef, and those depend on nothing else, you only need three include
>> directories:
>>>
>>> "../config/include" "../predef/include" "../winapi/include"
>>>
>>> Globbing, in contrast, would add 141 include directories.
>>
>> The number of include directories doesn't really matter, as long as they don't
>> cause conflicts (they shouldn't). The performance hit is unfortunate, and as I
>> said, I would have liked to avoid it. This is ugly, I get it, and I fully agree, but I
>> don't see a better way.
>>
>>> I fail to see how the latter can in any way be considered a superior solution.
>>
>> It is superior as it does not require active maintenance. Especially given that a
>> failed configure check because of a missing dependency might not be
>> noticeable in CI.
>
> We're talking about one of your libraries depending on another one of your
> libraries.

I can't be held to remember which of my libraries depend on what. My
memory isn't that good, sorry.

And I'm not immortal, you know. One day there will be another maintainer
(hopefully), and the less work he has to do the better.

> The dependencies of winapi aren't going to suddenly change.

Well, I wouldn't bet on that. I'm not going to add dependencies just
because, but I want to be able to should I need it without silently
breaking someone's configure checks.

> And no,
> config and predef aren't going to acquire secondary dependencies either.
>
> Checking that winapi can't be included isn't hard; add another check that only
> includes it and checks nothing and message(AUTHOR_WARNING when it fails.

Eh, that's still brittle and requires intervention...

Andrey Semashev via Boost

unread,
Jun 4, 2021, 12:10:00 PM6/4/21
to bo...@lists.boost.org, Andrey Semashev
On 6/4/21 6:34 PM, Peter Dimov via Boost wrote:
> Andrey Semashev wrote:
>> <rant>
>> Docker people seem to think of Docker images as of VMs while they are not.
>> Hence they consider it normal to build the image on one system and then run
>> it on another, with possibly older kernel. Well, this is wrong.
>> It doesn't work that way, and Boost.Filesystem is not the only piece of
>> software that will break because of that.
>> </rant>
>
> As Niall mentioned, it might be that Docker just doesn't support the syscalls,
> even when running on the same kernel.

This is a different case, and for that I added compile-time options to
disable some syscalls in Boost.Filesystem.

Peter Dimov via Boost

unread,
Jun 4, 2021, 12:18:21 PM6/4/21
to bo...@lists.boost.org, Peter Dimov
Andrey Semashev wrote:
> > Checking that winapi can't be included isn't hard; add another check
> > that only includes it and checks nothing and message(AUTHOR_WARNING
> > when it fails.
>
> Eh, that's still brittle and requires intervention...

It also has the advantage of telling you that your configure checks don't
work for some other reason.

Peter Dimov via Boost

unread,
Jun 4, 2021, 12:23:05 PM6/4/21
to bo...@lists.boost.org, Peter Dimov
Andrey Semashev wrote:
> On 6/4/21 6:34 PM, Peter Dimov via Boost wrote:
> > As Niall mentioned, it might be that Docker just doesn't support the
> > syscalls, even when running on the same kernel.
>
> This is a different case, and for that I added compile-time options to disable
> some syscalls in Boost.Filesystem.

That's not going to be of help in general. Many users have no control
over how the library is built. They get it from the system, or from a package
manager.

If this check is unreliable, it should be off by default and opt-in. It does more
harm than good.

Andrey Semashev via Boost

unread,
Jun 4, 2021, 12:37:01 PM6/4/21
to bo...@lists.boost.org, Andrey Semashev
On 6/4/21 7:20 PM, Peter Dimov via Boost wrote:
> Andrey Semashev wrote:
>> On 6/4/21 6:34 PM, Peter Dimov via Boost wrote:
>>> As Niall mentioned, it might be that Docker just doesn't support the
>>> syscalls, even when running on the same kernel.
>>
>> This is a different case, and for that I added compile-time options to disable
>> some syscalls in Boost.Filesystem.
>
> That's not going to be of help in general. Many users have no control
> over how the library is built. They get it from the system, or from a package
> manager.

That's a fair point. Damn, it looks like I'll have to make a runtime
check after all...

> If this check is unreliable, it should be off by default and opt-in. It does more
> harm than good.

Stuff like this never gets enabled. And besides, I do need statx
unconditionally for some of the operations.

Alexander Grund via Boost

unread,
Jun 8, 2021, 10:05:57 AM6/8/21
to bo...@lists.boost.org, Alexander Grund
> I don't think we should mandate add_subdirectory calls to come in a specific
> order. Even if we can hack something in the superproject to do it, users who use
> add_subdirectory manually will need to get the order right.
I know this is a kind of trade-off.
At least for the super-project it is kinda easy: You already have a
function which includes dependencies recursively. All you need to do is
do the depth-first.

For users who add_subdirectory the boost libs I admit this is an issue.
But to me this only means that Boost authors should avoid relying on
configure-checks which need other libs and where required use the
Boost::* aliases so the error in case it was not added is quite clear.
Users then need to make sure to e.g. add Boost::predef and Boost::config
first.
I don't think this is too bad.


Reply all
Reply to author
Forward
0 new messages