Include-cleaner experiment in WebRTC

248 views
Skip to first unread message

Florent Castelli

unread,
Apr 24, 2024, 10:11:42 AM4/24/24
to bu...@chromium.org
Hi,

As part of an effort to improve the code quality in the WebRTC library, we're currently experimenting with clang-include-cleaner to sort out which headers should be added, which should be removed and fix any surfacing layering violations.

This led me to discover how the tool is figuring out which compilation flags are used for each input. While CC files are fine since there is a matching command line in the compile compile_commands.json file, headers are a bit problematic. The tool attempts to identify a related file and use its flags, which works quite often (especially when the header is paired with a .cc file), but fails in some situations (we provide header-only mocks using gmock, so it needs to depend on a special target for the right include path).

Some workarounds exist around this such as adding headers to the compilation database with a tool called compdb (which is having specific issues too). I think we can do better and have that functionality integrated directly in GN since it has all the required information.
I think it would be interesting to see if we can generate rules to create pch files out of our headers automatically from executable, static_library or source_set rules automatically (possibly guarded by a GN setting), and those rules would not necessarily be built by default by Ninja either, merely used for other tooling relying on compilation databases.

Do you think that would be the right way forward?
Do you think we could come up with a solution that is transparent with the current toolchain definitions in GN?
Would anyone be interested in this project and willing to guide me to achieve this?

Any feedback is interesting!

/Florent

Nico Weber

unread,
Apr 24, 2024, 10:30:25 AM4/24/24
to Florent Castelli, bu...@chromium.org
Hi,

there's some very recent activity talking about include-cleaner over at https://issues.chromium.org/issues/40248422 . Maybe ask there, you might get a better reply there :)

Nico


--
You received this message because you are subscribed to the Google Groups "build" group.
To unsubscribe from this group and stop receiving emails from it, send an email to build+un...@chromium.org.
To view this discussion on the web visit https://groups.google.com/a/chromium.org/d/msgid/build/CADRnnSXWWqOcuNJUK-%3DdkFrdZFOge%2BMHwBDTsXs4yeFM%2BD-bbw%40mail.gmail.com.

Takuto Ikuta (生田 拓人)

unread,
Apr 25, 2024, 12:32:16 AM4/25/24
to Nico Weber, Florent Castelli, bu...@chromium.org
Hi,

On Wed, Apr 24, 2024 at 11:30 PM Nico Weber <tha...@chromium.org> wrote:
Hi,

there's some very recent activity talking about include-cleaner over at https://issues.chromium.org/issues/40248422 . Maybe ask there, you might get a better reply there :)

Nico


On Wed, Apr 24, 2024 at 10:11 AM 'Florent Castelli' via build <bu...@chromium.org> wrote:
Hi,

As part of an effort to improve the code quality in the WebRTC library, we're currently experimenting with clang-include-cleaner to sort out which headers should be added, which should be removed and fix any surfacing layering violations.

This led me to discover how the tool is figuring out which compilation flags are used for each input. While CC files are fine since there is a matching command line in the compile compile_commands.json file, headers are a bit problematic. The tool attempts to identify a related file and use its flags, which works quite often (especially when the header is paired with a .cc file), but fails in some situations (we provide header-only mocks using gmock, so it needs to depend on a special target for the right include path).

Some workarounds exist around this such as adding headers to the compilation database with a tool called compdb (which is having specific issues too). I think we can do better and have that functionality integrated directly in GN since it has all the required information.

Thanks, I didn't know about compdb. But can we also know what kind of issues does that have?
 
I think it would be interesting to see if we can generate rules to create pch files out of our headers automatically from executable, static_library or source_set rules automatically (possibly guarded by a GN setting), and those rules would not necessarily be built by default by Ninja either, merely used for other tooling relying on compilation databases.


Why do you need to have a rule for pch instead of individual headers?
 
Do you think that would be the right way forward? 

Yeah, I think it is reasonable to change GN to export compilation database entries for header files. Although it is better if clang-include-cleaner can handle #includes in header files.
 
Do you think we could come up with a solution that is transparent with the current toolchain definitions in GN?
Would anyone be interested in this project and willing to guide me to achieve this?


And I think we can do that by updating around

I can review the CL if you send that to me.
 
Any feedback is interesting!

/Florent

--
You received this message because you are subscribed to the Google Groups "build" group.
To unsubscribe from this group and stop receiving emails from it, send an email to build+un...@chromium.org.
To view this discussion on the web visit https://groups.google.com/a/chromium.org/d/msgid/build/CADRnnSXWWqOcuNJUK-%3DdkFrdZFOge%2BMHwBDTsXs4yeFM%2BD-bbw%40mail.gmail.com.

--
You received this message because you are subscribed to the Google Groups "build" group.
To unsubscribe from this group and stop receiving emails from it, send an email to build+un...@chromium.org.


--
Takuto Ikuta
Software Engineer in Tokyo
Chrome Ops (chrome browser build team)

Takuto Ikuta (生田 拓人)

unread,
Apr 25, 2024, 12:49:03 AM4/25/24
to Nico Weber, Florent Castelli, bu...@chromium.org
On Thu, Apr 25, 2024 at 1:31 PM Takuto Ikuta (生田 拓人) <tik...@google.com> wrote:
Hi,

On Wed, Apr 24, 2024 at 11:30 PM Nico Weber <tha...@chromium.org> wrote:
Hi,

there's some very recent activity talking about include-cleaner over at https://issues.chromium.org/issues/40248422 . Maybe ask there, you might get a better reply there :)

Nico


On Wed, Apr 24, 2024 at 10:11 AM 'Florent Castelli' via build <bu...@chromium.org> wrote:
Hi,

As part of an effort to improve the code quality in the WebRTC library, we're currently experimenting with clang-include-cleaner to sort out which headers should be added, which should be removed and fix any surfacing layering violations.

This led me to discover how the tool is figuring out which compilation flags are used for each input. While CC files are fine since there is a matching command line in the compile compile_commands.json file, headers are a bit problematic. The tool attempts to identify a related file and use its flags, which works quite often (especially when the header is paired with a .cc file), but fails in some situations (we provide header-only mocks using gmock, so it needs to depend on a special target for the right include path).

Some workarounds exist around this such as adding headers to the compilation database with a tool called compdb (which is having specific issues too). I think we can do better and have that functionality integrated directly in GN since it has all the required information.

Thanks, I didn't know about compdb. But can we also know what kind of issues does that have?
 
I think it would be interesting to see if we can generate rules to create pch files out of our headers automatically from executable, static_library or source_set rules automatically (possibly guarded by a GN setting), and those rules would not necessarily be built by default by Ninja either, merely used for other tooling relying on compilation databases.


Why do you need to have a rule for pch instead of individual headers?
 
Do you think that would be the right way forward? 

Yeah, I think it is reasonable to change GN to export compilation database entries for header files. Although it is better if clang-include-cleaner can handle #includes in header files.
 

I found https://github.com/llvm/llvm-project/issues/67550 to support header in clang-include-cleaner.
 
Do you think we could come up with a solution that is transparent with the current toolchain definitions in GN?
Would anyone be interested in this project and willing to guide me to achieve this?


And I think we can do that by updating around

I can review the CL if you send that to me.
 
Any feedback is interesting!

/Florent

--
You received this message because you are subscribed to the Google Groups "build" group.
To unsubscribe from this group and stop receiving emails from it, send an email to build+un...@chromium.org.
To view this discussion on the web visit https://groups.google.com/a/chromium.org/d/msgid/build/CADRnnSXWWqOcuNJUK-%3DdkFrdZFOge%2BMHwBDTsXs4yeFM%2BD-bbw%40mail.gmail.com.

--
You received this message because you are subscribed to the Google Groups "build" group.
To unsubscribe from this group and stop receiving emails from it, send an email to build+un...@chromium.org.
To view this discussion on the web visit https://groups.google.com/a/chromium.org/d/msgid/build/CADZ1XiaE1OepU9mBU1mKnETnQT28f1m_Ahwrq9npSSpYNPZa5Q%40mail.gmail.com.


--
Takuto Ikuta
Software Engineer in Tokyo
Chrome Ops (chrome browser build team)

Bruce Dawson

unread,
Apr 25, 2024, 1:15:00 AM4/25/24
to Takuto Ikuta (生田 拓人), Nico Weber, Florent Castelli, bu...@chromium.org
> I think it would be interesting to see if we can generate rules to create pch files out of our headers

Beware that pch files with clang don't work great. They tend to cause over including and therefore extra dependencies, and they don't have great performance characteristics. See "Structures 2 and 4 – Precompiled headers" in this blog post where I found that loading the PCH files was taking about four seconds per translation unit, and the precompiled headers were being redundantly created. They probably still had some savings, but it was hard to be sure.




--
Bruce Dawson, he/him

Florent Castelli

unread,
Apr 25, 2024, 5:16:30 AM4/25/24
to Takuto Ikuta (生田 拓人), Nico Weber, bu...@chromium.org
On Thu, Apr 25, 2024 at 6:32 AM Takuto Ikuta (生田 拓人) <tik...@google.com> wrote:
Hi,

On Wed, Apr 24, 2024 at 11:30 PM Nico Weber <tha...@chromium.org> wrote:
Hi,

there's some very recent activity talking about include-cleaner over at https://issues.chromium.org/issues/40248422 . Maybe ask there, you might get a better reply there :)

Nico


On Wed, Apr 24, 2024 at 10:11 AM 'Florent Castelli' via build <bu...@chromium.org> wrote:
Hi,

As part of an effort to improve the code quality in the WebRTC library, we're currently experimenting with clang-include-cleaner to sort out which headers should be added, which should be removed and fix any surfacing layering violations.

This led me to discover how the tool is figuring out which compilation flags are used for each input. While CC files are fine since there is a matching command line in the compile compile_commands.json file, headers are a bit problematic. The tool attempts to identify a related file and use its flags, which works quite often (especially when the header is paired with a .cc file), but fails in some situations (we provide header-only mocks using gmock, so it needs to depend on a special target for the right include path).

Some workarounds exist around this such as adding headers to the compilation database with a tool called compdb (which is having specific issues too). I think we can do better and have that functionality integrated directly in GN since it has all the required information.

Thanks, I didn't know about compdb. But can we also know what kind of issues does that have?

The main issue was that we need to add the file type to the command lines it generates with -xc++-headers. Otherwise, clang-include-header will complain about using C++ in a file that should be a c-header, which is producing a warning that turns into an error with -Werror.
It's also spewing lots of errors about many headers that can't be found, mainly from third_party. I didn't look too much into that as it did not impact the files I was trying to improve.
And lastly, it's not a very very fast tool.
 
 
I think it would be interesting to see if we can generate rules to create pch files out of our headers automatically from executable, static_library or source_set rules automatically (possibly guarded by a GN setting), and those rules would not necessarily be built by default by Ninja either, merely used for other tooling relying on compilation databases.


Why do you need to have a rule for pch instead of individual headers?

My naive mind thinks that we want well formed rules to "build" a header and that the natural output is a PCH file. It could also help to detect headers that are not self-contained (I did find a few in WebRTC api/ folder while experimenting). I do not plan to use them, they would just be a side-product of the verification and helping to bridge the compilation database for tools.
 
 
Do you think that would be the right way forward? 

Yeah, I think it is reasonable to change GN to export compilation database entries for header files. Although it is better if clang-include-cleaner can handle #includes in header files.

I think both approaches have merit. Header-only code would be difficult to verify for some projects. They could possibly run the checks over artificial CC files or test files though.
Also, wouldn't checking header files recursively from a CC file would increase the amount of work by a lot globally, it would probably need to be accompanied by special filter flags to specify which headers need to be scanned in a CC file and that adds probably a lot more complexity to the tooling (we'd need to annotate some CC files with which header they are meant to transitively test).
 
 
Do you think we could come up with a solution that is transparent with the current toolchain definitions in GN?
Would anyone be interested in this project and willing to guide me to achieve this?


And I think we can do that by updating around

That seems to be targeting the --export-compile-commands flags. While useful, it's currently not the preferred way for most to generate a compilation database as it adds reclient and its arguments as prefixes. The preferred way at the moment should be tools/clang/scripts/generate_compdb.py which also removes a few other arguments from the command line, and it exports the rules from "ninja -t compb" output. Maybe it could work in tandem with another script that is filtering the compilation database output to remove the prefixes.

It would be nice if we could make the output from --export-compile-commands compatible with our tooling by removing all the prefixes at least, which would enable them to be generated automatically by automatic "gn gen" invocations (ninja rules invoke gn gen with the last set of arguments used, possibly --export-compile-commands too). Currently, you need to manually regenerate the database when the build system has changed, and using a dummy build folder for analysis without reclient is not an option as you need to build that folder first to ensure all generated files have been created.

Currently, I've been poking around trying to get the CTool accept C++ headers and generate the right commands for the headers.
Reply all
Reply to author
Forward
0 new messages