Windows /showIncludes and interprocess communication

EShuman

Aug 10, 2023, 10:14:58 AM
to ninja-build
Hello!

On Windows, when Ninja creates a Subprocess to compile a file, it redirects the Subprocess's std::out and std::err to read dependencies from it:

startup_info.hStdOutput = child_pipe;
startup_info.hStdError = child_pipe;


I am using a custom compiler that, after compiling a file, prints the dependencies to std::err. In my case, printing to std::err might take about 2-3 seconds.
I tried to exclude Ninja from the process of reading from std::err by redirecting the Subprocess's std::err to nul:

startup_info.hStdError = nul;

With this change, writing to std::err only takes a few microseconds.

Is there a way to fix this issue and improve interprocess communication speed on Windows?

Ben Boeckel

Aug 10, 2023, 10:40:28 AM
to EShuman, ninja-build
On Thu, Aug 10, 2023 at 02:19:40 -0700, EShuman wrote:
> Hello!
>
> On Windows, when Ninja creates a Subprocess to compile a file, it redirects
> the Subprocess's std::out and std::err to read dependencies from it:

Well, stdout is for dependencies. stderr is for diagnostics (`ninja`
batches output per `build` rule to avoid interleaved output).

> startup_info.hStdOutput = child_pipe;
> startup_info.hStdError = child_pipe;
>
> I am using a custom compiler that, after compiling a file, prints the
> dependencies to std::err. In my case, printing to std::err might take about
> 2-3 seconds.
> I tried to exclude Ninja from the process of reading from std::err by
> redirecting the Subprocess's std::err to nul:
>
> startup_info.hStdError = nul;
>
> With this change, writing to std::err only takes a few microseconds.

How are users expected to get diagnostics from the compiler if `stderr`
is dropped on the floor? Or get correct dependencies if using your
custom compiler for that matter?

> Is there a way to fix this issue and improve interprocess communication
> speed on Windows?

--Ben

jha...@gmail.com

Aug 10, 2023, 12:08:14 PM
to ninja-build
How much data does the compiler print? 2-3 seconds sounds like a lot. Is the time spent capturing the output or actually reprinting it on the console (which is slow on Windows)?

Evan Martin

Aug 10, 2023, 12:44:34 PM
to EShuman, ninja-build
> I am using a custom compiler that, after compiling a file, prints the dependencies to std::err. In my case, printing to std::err might take about 2-3 seconds.
> I tried to exclude Ninja from the process of reading from std::err by redirecting the Subprocess's std::err to nul:

Ninja will wait for the subprocess to complete regardless, so it may not matter where you redirect the stderr to.
(I think we're all not clear on which layer is causing this to take so long.)

EShuman

Aug 11, 2023, 2:22:54 AM
to ninja-build
Just to clarify, I'm referring to this file: src/subprocess-win32.cc. In this file, a completion port is created to monitor pipes from Subprocesses:

ioport_ = ::CreateIoCompletionPort(INVALID_HANDLE_VALUE, NULL, 0, 1);

As far as I understand it, when a pipe is ready to be read from, it sends a packet to the completion port, and the data begins to be read from the pipe. However, based on the implementation in subprocess-win32.cc, the reading process from pipes is single-threaded (SubprocessSet::DoWork()). When I run Ninja with -j100, multiple Subprocesses (pipes) write to std::err simultaneously, but only one thread reads them sequentially. This seems to significantly slow down the process. The print data size is approximately 60-80 KB, and it's not displayed on the console.

David Turner

Aug 11, 2023, 10:25:03 AM
to EShuman, ninja-build
On Fri, Aug 11, 2023 at 8:22 AM EShuman <eugeneza...@gmail.com> wrote:
> Just to clarify, I'm referring to this file: src/subprocess-win32.cc. In this file, a completion port is created to monitor pipes from Subprocesses:
>
> ioport_ = ::CreateIoCompletionPort(INVALID_HANDLE_VALUE, NULL, 0, 1);
>
> As far as I understand it, when a pipe is ready to be read from, it sends a packet to the completion port, and the data begins to be read from the pipe.

Technically, the packet is sent to the completion port when the read has finished (i.e. the data is already in the read buffer). Another overlapped ReadFile() can then be issued to start the next asynchronous operation.
This allows the kernel to perform multiple reads in parallel even if the main program is single-threaded. Win32 overlapped I/O works very differently from POSIX async I/O.
 
> However, based on the implementation in subprocess-win32.cc, the reading process from pipes is single-threaded (SubprocessSet::DoWork()). When I run Ninja with -j100, multiple Subprocesses (pipes) write to std::err simultaneously, but only one thread reads them sequentially. This seems to significantly slow down the process. The print data size is approximately 60-80 KB, and it's not displayed on the console.

No, on Windows, the reads are performed in parallel by the kernel. Each subprocess has its own read buffer so there is no contention/locking there. The slowness might be from something else (e.g. it may be that pipe operations are slow on Windows, I have no idea, or something that happens later when processing the content of these files), but that part of the code seems really fine to me.
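
The completion-port flow described above can be illustrated with a toy, portable model. This is only a sketch under loud assumptions: it does not call the Win32 API and is not Ninja's code; `ToyPipe`, `CompleteRead` and `DrainAll` are made-up names, and a plain queue stands in for the completion port. It models only the ordering (a packet arrives after the data is already in that subprocess's buffer, then the next read is issued), not the kernel-side parallelism.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <vector>

// Toy stand-in for one child's pipe and its pending overlapped read.
struct ToyPipe {
  std::string pending;  // bytes the child wrote that have not been read yet
  std::string buffer;   // per-subprocess read buffer, filled by the "kernel"
  std::string output;   // everything the parent has collected so far
};

// Stand-in for the kernel finishing a read: moves up to `chunk` bytes into
// the pipe's own buffer, then posts a completion packet (the pipe's index).
// Returns false when the pipe has nothing left, i.e. no packet is posted.
bool CompleteRead(std::vector<ToyPipe>& pipes, std::size_t index,
                  std::size_t chunk, std::deque<std::size_t>& port) {
  ToyPipe& p = pipes[index];
  if (p.pending.empty()) return false;
  const std::size_t n = std::min(chunk, p.pending.size());
  p.buffer.assign(p.pending, 0, n);
  p.pending.erase(0, n);
  port.push_back(index);
  return true;
}

// Single-threaded loop: dequeue a packet, append the already-filled buffer,
// issue the next read. Returns the number of packets handled.
std::size_t DrainAll(std::vector<ToyPipe>& pipes, std::size_t chunk) {
  assert(chunk > 0);
  std::deque<std::size_t> port;  // stand-in for the completion port
  for (std::size_t i = 0; i < pipes.size(); ++i) {
    CompleteRead(pipes, i, chunk, port);  // one outstanding read per pipe
  }
  std::size_t packets = 0;
  while (!port.empty()) {
    const std::size_t index = port.front();
    port.pop_front();
    ToyPipe& p = pipes[index];
    p.output += p.buffer;  // the data is already here; this is a cheap append
    p.buffer.clear();
    ++packets;
    CompleteRead(pipes, index, chunk, port);  // re-issue the read
  }
  return packets;
}
```

In this model the single thread does one small append per packet; with 80 KB of output and an assumed 4 KB chunk that would be 20 packets per subprocess.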
 
