Path slash normalization on Windows

1,853 views
Skip to first unread message

Scott Graham

unread,
Apr 10, 2012, 6:51:06 PM4/10/12
to ninja...@googlegroups.com
Hi ninja-build,

I was undertaking to sanitize slashes as path separators on Windows.
That is, only supporting \ rather than /.

I would prefer to only support \ (rather than both \ and /) because of
the use of hashes containing filenames, and it feels a bit cleaner to
require only one or the other. It seems more natural to support \ on
Windows for interacting with other tools.

Does anyone agree/disagree with that as a general idea?


The main changes involved are modifying the canonicalization
functions, the associated test data, and ensuring that the dependency
helper always outputs in the correct style. I haven't tried this yet,
but I think I would like to explicitly disallow / in filenames at the
input level to avoid confusion too.

There's a commit here to see the general scope of this
https://github.com/sgraham/ninja/commit/dbd555ad41141e7abc916d0ce32f1533d0e17050
(I unfortunately changed line endings on some files so the diff is
bigger than it should be).

The overall motivation here is to stat in a more reasonable amount of
time on Windows. To do so, stating needs to be done on a directory
level, not a file level (see stat_cache.cc). And, if Nodes are stored
with consistent slashes, then the mtimes that come out of walking the
directory tree can be pushed into the Nodes more easily.

scott

Nicolas Desprès

unread,
Apr 11, 2012, 2:58:19 PM4/11/12
to ninja...@googlegroups.com
On Wed, Apr 11, 2012 at 12:51 AM, Scott Graham <sco...@chromium.org> wrote:
> Hi ninja-build,
>
> I was undertaking to sanitize slashes as path separators on Windows.
> That is, only supporting \ rather than /.
>
> I would prefer to only support \ (rather than both \ and /) because of
> the use of hashes containing filenames, and it feels a bit cleaner to
> require only one or the other. It seems more natural to support \ on
> Windows for interacting with other tools.
>
> Does anyone agree/disagree with that as a general idea?
>

If you are only speaking about the Windows port, then it sounds
reasonable to me. FYI, other project, like cmake, tend to support
slashes everywhere and do the substitution for you. But I think it is
not an option for Ninja to do the substitution.

-Nico

>
> The main changes involved are modifying the canonicalization
> functions, the associated test data, and ensuring that the dependency
> helper always outputs in the correct style. I haven't tried this yet,
> but I think I would like to explicitly disallow / in filenames at the
> input level to avoid confusion too.
>
> There's a commit here to see the general scope of this
> https://github.com/sgraham/ninja/commit/dbd555ad41141e7abc916d0ce32f1533d0e17050
> (I unfortunately changed line endings on some files so the diff is
> bigger than it should be).
>
> The overall motivation here is to stat in a more reasonable amount of
> time on Windows. To do so, stating needs to be done on a directory
> level, not a file level (see stat_cache.cc). And, if Nodes are stored
> with consistent slashes, then the mtimes that come out of walking the
> directory tree can be pushed into the Nodes more easily.
>
> scott

--
Nicolas Desprès

Scott Graham

unread,
Apr 11, 2012, 3:01:20 PM4/11/12
to ninja...@googlegroups.com
2012/4/11 Nicolas Desprès <nicolas...@gmail.com>:

> On Wed, Apr 11, 2012 at 12:51 AM, Scott Graham <sco...@chromium.org> wrote:
>> Hi ninja-build,
>>
>> I was undertaking to sanitize slashes as path separators on Windows.
>> That is, only supporting \ rather than /.
>>
>> I would prefer to only support \ (rather than both \ and /) because of
>> the use of hashes containing filenames, and it feels a bit cleaner to
>> require only one or the other. It seems more natural to support \ on
>> Windows for interacting with other tools.
>>
>> Does anyone agree/disagree with that as a general idea?
>>
>
> If you are only speaking about the Windows port, then it sounds
> reasonable to me.

Yes, I only meant for Windows. Non-Windows would still be '/' only of course.

thanks

Amine Khaldi

unread,
Apr 14, 2012, 7:47:11 AM4/14/12
to ninja...@googlegroups.com
Hi,

I just wanted to mention that in the topic of slashes on Windows, there seems to be a difference between the mingw32/mingw-w64 toolchains and the msvc ones. If you pass foo\bar to gcc in windows you'll get a concatenation. We faced this problem and had to convert the slashes to unix style (foo/bar) for such tools to work correctly on Windows, even though cl for example can deal with both foo\bar and foo/bar correctly.

Regards,
Amine.

okuoku

unread,
Apr 15, 2012, 3:05:56 AM4/15/12
to ninja...@googlegroups.com
2012/4/14 Amine Khaldi <amine....@reactos.org>:
- snip -

> If you pass foo\bar to gcc in windows you'll get a concatenation.
> We faced this problem and had to convert the slashes to unix style (foo/bar)
> for such tools to work correctly on Windows, even though cl for example can
> deal with both foo\bar and foo/bar correctly.

GNU ld/gold requires backslash escaping for response files. From ld(1),

>> Options in file are separated by whitespace.
>> A whitespace character may be included in
>> an option by surrounding the entire option in
>> either single or double quotes. Any character
>> (including a backslash) may be included by
>> prefixing the character to be included with a backslash.

So what we need is something like rspfile_escape = true or
rspfile_style = backslash option(I hope it is the last option specific
to response file..).

Normalizing backslashes to slash would overcome this situation; we
won't need to add rspfile_escape option _for_filenames_ , but other
parameters like gcc -DVERSION=\\"some\ string\\" are cannot be
normalized.
So i think we still need the rspfile_escape option.

-- oku

Andrey Kamaev

unread,
Apr 23, 2012, 6:22:47 AM4/23/12
to ninja...@googlegroups.com
Hi all,

I'm trying to use ninja on Windows for my projects and I think that ninja must support unix-style forward slashes (/) on Windows. And support of both types of slashes will significantly simplify the migration to ninja from other tools traditionally using backslashes.

Let me explain my point on the real example:

I've the following build command executed by ninja (I've omitted non-sensitive options):

arm-linux-androideabi-g++.exe -MMD -MT 3rdparty\tbb.dir\tbb40_20111130oss\src\tbb\arena.cpp.o -MF 3rdparty\tbb.dir\tbb40_20111130oss\src\tbb\arena.cpp.o.d -o 3rdparty\tbb.dir\tbb40_20111130oss\src\tbb\arena.cpp.o -c 3rdparty\tbb\tbb40_20111130oss\src\tbb\arena.cpp

And inside the source code of tbb I have:

#include "../rml/include/rml_tbb.h"
 
As result the generated dependencies file contains the following line:

 3rdparty\tbb\tbb40_20111130oss\src\tbb\/../rml/include/rml_tbb.h \

I suppose that the line was composed from the source file path path (3rdparty\tbb\tbb40_20111130oss\src\tbb\) and include path (../rml/include/rml_tbb.h).
After the path canonicalization inside ninja the path become rml/include/rml_tbb.h and is always marked as dirty. This makes every incremental build to recompile several files and relink the whole project.

And this behavior is shared by the various gcc-on-windows (at least google NDK, CodeSourcery toolchain, mingw 32-bit). So:

1) forward slash (/) has to be recognized on Windows (nobody will modify headers or fix all forks of gcc);
2) backward slash (\) support is nice-to-have but very important option for those who want to migrate to ninja.

For my needs I've modified CanonicalizePath function to recognize both types of slashes as a path separator. And I suggest to include this feature to the mainline.

Thanks for your attention,
Andrey

Evan Martin

unread,
Apr 23, 2012, 3:16:10 PM4/23/12
to ninja...@googlegroups.com
On Sun, Apr 15, 2012 at 12:05 AM, okuoku <oku...@gmail.com> wrote:
> 2012/4/14 Amine Khaldi <amine....@reactos.org>:
> - snip -
>> If you pass foo\bar to gcc in windows you'll get a concatenation.
>> We faced this problem and had to convert the slashes to unix style (foo/bar)
>> for such tools to work correctly on Windows, even though cl for example can
>> deal with both foo\bar and foo/bar correctly.
>
> GNU ld/gold requires backslash escaping for response files. From ld(1),
>
>>> Options in file are separated by whitespace.
>>> A whitespace character may be included in
>>> an option by surrounding the entire option in
>>> either single or double quotes. Any character
>>>  (including a backslash) may be included by
>>> prefixing the character to be included with a backslash.
>
> So what we need is something like rspfile_escape = true or
> rspfile_style = backslash option(I hope it is the last option specific
> to response file..).

Do you use response files when using gold?
I'm surprised they're needed.

> Normalizing backslashes to slash would overcome this situation; we
> won't need to add rspfile_escape option _for_filenames_ , but other
> parameters like gcc -DVERSION=\\"some\ string\\" are cannot be
> normalized.
> So i think we still need the rspfile_escape option.

Ugh, what a mess. :~(

okuoku

unread,
Apr 23, 2012, 4:31:20 PM4/23/12
to ninja...@googlegroups.com
2012/4/24 Evan Martin <mar...@danga.com>:

> On Sun, Apr 15, 2012 at 12:05 AM, okuoku <oku...@gmail.com> wrote:
- snip -

>> GNU ld/gold requires backslash escaping for response files. From ld(1),
>>
>>>> Options in file are separated by whitespace.
>>>> A whitespace character may be included in
>>>> an option by surrounding the entire option in
>>>> either single or double quotes. Any character
>>>>  (including a backslash) may be included by
>>>> prefixing the character to be included with a backslash.
>>
>> So what we need is something like rspfile_escape = true or
>> rspfile_style = backslash option(I hope it is the last option specific
>> to response file..).
>
> Do you use response files when using gold?
> I'm surprised they're needed.

Ah, of course we only need "escaped" response files when we are using
MinGW or any gcc cross compiler on Windows.
I'm also trying to use Ninja with Android NDK.

-- oku

Scott Graham

unread,
Apr 23, 2012, 5:05:48 PM4/23/12
to ninja...@googlegroups.com

This was my suggestion before too, but I've (mostly) come around to
viewing the case you mention above as a gcc bug, at least as far as
ninja is concerned.

In my current branch I've wrapped dependency output to more
aggressively normalize lists of headers to \, and made sure that all
paths in the generator match as well. Realistically, it's necessary to
normalize case anyway too, so there has to be a post-process in any
case.

I'm not sure this is the best solution, I just thought I'd mention it.

Andrey Kamaev

unread,
Apr 24, 2012, 4:06:57 PM4/24/12
to ninja...@googlegroups.com
Eh, now I see your point. And I agree that normalization at the dependency parser level should be more generic solution than hacking path canonicalization. But I still think that ninja should be aware of such gcc behavior.

By the way I've tried your branch and it works well for me. The only issue was that bootstraper script doesn't work in your branch (version.h is not generated for bootstrap.py and linker fails with duplicated declarations of "main" added by the utilities).

/Andrey

On Tuesday, April 24, 2012 1:05:48 AM UTC+4, Scott Graham wrote:

Scott Graham

unread,
Apr 24, 2012, 4:50:58 PM4/24/12
to ninja...@googlegroups.com
On Tue, Apr 24, 2012 at 1:06 PM, Andrey Kamaev <andrey...@itseez.com> wrote:
> Eh, now I see your point. And I agree that normalization at the dependency
> parser level should be more generic solution than hacking path
> canonicalization. But I still think that ninja should be aware of such gcc
> behavior.

Evan's "deplist" branch has the deplist helper which parses cl.exe's
/showIncludes output, and gcc .d files. I'm not sure if it normalizes
in that branch, but it does seem like a good place to make everything
"clean and tidy" for ninja anyway. That's where I do the normalize in
the Windows dep_database version.

>
> By the way I've tried your branch and it works well for me. The only issue
> was that bootstraper script doesn't work in your branch (version.h is not
> generated for bootstrap.py and linker fails with duplicated declarations of
> "main" added by the utilities).

Glad to hear. Sorry about the broken bootstrap, should be fixed now.

Reply all
Reply to author
Forward
0 new messages