A few comments / questions about Ninja

Ciprian Dorin Craciun

Apr 23, 2011, 3:17:09 PM

to ninja...@googlegroups.com

Hello all!

First of all, please accept my congratulations (and gratitude) for
this wonderful tool!

A few months ago I created a small meta-build system (in
Scheme), available below, which targeted the Plan9 `mk` tool:
git://github.com/cipriancraciun/volution-build-system.git
git://gitorious.org/volution-build-system/mainline.git

Today I've hacked -- and hacked is the correct term as I've made a
little mess -- the tool to generate `ninja` scripts.

And for the moment I have the following questions and observations:

* absolute paths -- it seems that even though Ninja accepts
absolute paths, when creating the target directories it creates them
relative to the current directory (thus ignoring the leading `/`),
although building still works; is this a feature or a bug? Should I
always generate relative paths?

* you can't have a target named `build`; feature / bug? (how
should these "phony" targets be named? I went with the name
`__build__`...)

* for now all my build rules look as follows, since my hack
merely transliterates the old `mk` script into a `ninja` script:
~~~~
rule sh
  command = exec </dev/null >/dev/null ; ${sh_command}
build .outputs/erlang/applications/rabbit/ebin/rabbit_exchange_type_headers.beam : sh .outputs/erlang/applications/rabbit/src/rabbit_exchange_type_headers.erl
  sh_command = "${vbs_tools_erl}" '-smp' '-mode' 'minimal' ... ;
~~~~
My question concerns how `command` is executed: for now the
documentation contains only generic references to a "command line" or
a "command", and gives no concrete interpretation. From what I
see -- and from an earlier post -- `command` is actually passed to
`sh -c`; I have had a lot of grief with sh and its escaping rules...
So will Ninja follow in the other build systems' footsteps
and just delegate command execution to `sh -c`, or will it "fix" this
mess and use `execve` directly?

* stdin and stdout: I must put `exec </dev/null >/dev/null` in the
command as it seems the Erlang compiler breaks if it finds a closed
stdin / stdout; bug / feature?

And now a rules design question: let's suppose you have an include
folder that you are actually building, and we have some files which
when building depend on some files in the include folder. Now the
solution here is to use `build target : inputs | header1 header2`,
right? But what if I don't know which files it depends on? (In my case
I don't have a tool that generates Erlang include dependencies...) How
should I handle this rule? I guess `builder target : input |
include_dir_name` is wrong and I've used `builder target : input |
include_dir_name/.ready` and the `.ready` file depends on all the
files in the folder. Any suggestions?

Thanks,
Ciprian.

Evan Martin

Apr 23, 2011, 4:18:04 PM

to ninja...@googlegroups.com

On Sat, Apr 23, 2011 at 12:17 PM, Ciprian Dorin Craciun
<ciprian...@gmail.com> wrote:
> I've created a few months ago a small meta-build system (in
> Scheme) available below, which targeted the Plan9 `mk` tool:
> git://github.com/cipriancraciun/volution-build-system.git
> git://gitorious.org/volution-build-system/mainline.git
>
> Today I've hacked -- and hacked is the correct term as I've made a
> little mess -- the tool to generate `ninja` scripts.

Cool! Do you have any docs or example build files from your build system?

I'm also curious how it worked out for you.
Do you find ninja any different in practice than how plan9 mk works?
(Simpler? More complicated? Faster? Slower?)

> And for the moment I have the following questions and observations:
>
> * absolute paths -- it seems that even though Ninja accepts
> absolute paths, when creating the target directories it creates them
> relative to the current directory (thus ignoring the leading `/`);
> (but building works); is this a feature / bug? should I always
> generate relative paths?

I haven't thought about this too hard. My vague feeling is that
absolute paths in a build system are usually a bug (your source should
still build even if you rename the directory it's contained in), but I
can understand conceptually depending on something like /usr/bin/gcc.
Could you describe your use case in more detail? It does sound like a
bug to me.

> * you can't have a target named `build`; feature / bug? (how
> should these "phony" targets be named? I went with the name
> `__build__`...)

Hah, a flaw in the tokenizer (the word "build" is tokenized as a
special word rather than a possible filename). I would like to fix
this. I opened https://github.com/martine/ninja/issues/27 for you.

> * for now all my build rules are like as follows, as my hack is
> actually transliterating the old `mk` to the `ninja` script:
> ~~~~
> rule sh
>   command = exec </dev/null >/dev/null ; ${sh_command}
> build .outputs/erlang/applications/rabbit/ebin/rabbit_exchange_type_headers.beam : sh .outputs/erlang/applications/rabbit/src/rabbit_exchange_type_headers.erl
>   sh_command = "${vbs_tools_erl}" '-smp' '-mode' 'minimal' ... ;
> ~~~~
> thus the question is related with the `command` execution: for
> now in the documentation we find only generic references as "command
> line", "command", but no concrete interpretation is given. From what I
> see -- and from an earlier post -- we see that `command` is actually
> passed to the `sh -c` interpreter; about this I have a great grief
> with sh and escaping rules...
> Therefore does Ninja step into the other build system's shoes
> and just delegate command execution to `sh -c` or shall "fix" this
> mess and just use "execve"?

I opened https://github.com/martine/ninja/issues/28 about the missing docs.

execve would require us to parse the command into an argv array, which
means we'd need to get into the ugly business of handling quoting.
However, maybe we can get away with defining a heavily reduced quoting
set (like say, escape spaces with backslashes if you don't want them
to be argument separators) and saying it's the ninja-file-generator's
problem to do quoting correctly.

Note that on Windows the only way to execute a command is with the
equivalent of system().

It seems to me that, other than simplifying the above problem, the
only advantage of not using a shell is that it might save on execution
time. But I don't think subshell startup is enough time to warrant
it.

My intuition is that you should just deal with the escaping yourself,
or perhaps even adjust your meta-build system to generate a specific
rule for each command it wants to run (which would allow you to use
one fewer layer of escaping). What do you think?
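
For generators that take the "deal with the escaping yourself" route, here is a minimal sketch in Python (the generator in this thread is written in Scheme; Python is used only for illustration). It assumes the command ends up in `sh -c`, and that a literal `$` in a Ninja file is written `$$`, as in current Ninja syntax. `ninja_command` is a hypothetical helper name, not part of Ninja.

```python
import shlex


def ninja_command(argv):
    """Turn an argument list into text for a Ninja `command =` line."""
    # Quote each argument for POSIX sh. shlex.quote also copes with
    # single quotes inside an argument, the usual corner case.
    shell_line = " ".join(shlex.quote(arg) for arg in argv)
    # Assumption: "$" introduces variables and escapes in a Ninja file,
    # so a literal dollar sign has to be doubled.
    return shell_line.replace("$", "$$")


argv = ["erlc", "-o", "out dir", "it's $HOME.erl"]
print("command = " + ninja_command(argv))
```

Starting from an argument list rather than a pre-joined string means the generator never has to parse shell syntax, only produce it.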

> * stdin and stdout: I must put `exec </dev/null >/dev/null` in the
> command as it seems the Erlang compiler breaks if it finds a closed
> stdin / stdout; bug / feature?

Sounds like a bug. I thought I provided /dev/null as stdin already,
but I might be confusing projects... Can you reduce the problem into
a test case somehow? (Maybe make a shell or Python script that
crashes if it's missing stdin or stdout and verify that it crashes
under Ninja.)
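
One possible shape for such a test case, sketched in Python: it checks whether file descriptors 0 and 1 are open, so a rule that runs it fails visibly if a descriptor arrives closed. The function names are illustrative.

```python
import os


def is_fd_open(fd):
    """Return True if the file descriptor refers to an open file."""
    try:
        os.fstat(fd)
    except OSError:
        return False
    return True


def closed_standard_streams():
    """List the names of the standard streams that are closed."""
    names = {0: "stdin", 1: "stdout"}
    return [name for fd, name in names.items() if not is_fd_open(fd)]


def exit_status():
    """Exit status for a script: 1 if stdin or stdout is closed, else 0."""
    return 1 if closed_standard_streams() else 0
```

A script that ends with `sys.exit(exit_status())` (after `import sys`) could then be run from a rule whose `command` omits the `exec </dev/null >/dev/null` prefix. A failing build step would point at the build tool rather than the Erlang compiler.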

> And now a rules design question: let's suppose you have an include
> folder that you are actually building, and we have some files which
> when building depend on some files in the include folder. Now the
> solution here is to use `build target : inputs | header1 header2`,
> right? But what if I don't know which files it depends on? (In my case
> I don't have a tool that generates Erlang include dependencies...) How
> should I handle this rule? I guess `builder target : input |
> include_dir_name` is wrong and I've used `builder target : input |
> include_dir_name/.ready` and the `.ready` file depends on all the
> files in the folder. Any suggestions?

Just to make sure I follow:
1) you have a command that generates headers into include_dir/
2) you have another command that relies on those generated headers
3) you have no way of getting a list of which headers will be generated
?

How do you generate the .ready file properly in that case?
I would think that whatever generates the .ready file could maybe
generate a depfile. Or you could make the command in #2 depend on the
command in #1 (which would mean you always rebuild if you generate any
headers, but that sounds like that is already what you get with a
.ready file).

I'm not sure I follow this example well enough to give you advice...

Ciprian Dorin Craciun

Apr 24, 2011, 4:41:32 AM

to ninja...@googlegroups.com

On Sat, Apr 23, 2011 at 23:18, Evan Martin <mar...@danga.com> wrote:
> On Sat, Apr 23, 2011 at 12:17 PM, Ciprian Dorin Craciun
> <ciprian...@gmail.com> wrote:
>> I've created a few months ago a small meta-build system (in
>> Scheme) available below, which targeted the Plan9 `mk` tool:
>> git://github.com/cipriancraciun/volution-build-system.git
>> git://gitorious.org/volution-build-system/mainline.git
>>
>> Today I've hacked -- and hacked is the correct term as I've made a
>> little mess -- the tool to generate `ninja` scripts.
>
> Cool! Do you have any docs or example build files from your build system?

Thanks! For now no documentation, just some examples (in another
"unpublished" repository :) It's in a very "prototype"-ish stage.) I
currently use it to build Erlang-based applications. (It's roughly
equivalent to the `rebar` Erlang build tool, but focuses just on the
build.)

~~~~
(vbs:require-erlang)

(vbs:define-erlang-application 'mosaic_httpg
    dependencies: '(rabbit rabbit_common amqp_client misultin goodies vme)
    erl: "\\./sources/.*\\.erl"
    hrl: "\\./sources/.*\\.hrl"
    additional-ebin: "\\./sources/.*\\.app"
    additional-priv: "\\./sources/.*\\.config")
~~~~

I intend to drop the current version and create a new one by
applying what I've learned from this first prototype. (My data model
mirrors make-based systems too closely, and it's not well suited to
Ninja.)

> I'm also curious how it worked out for you.
> Do you find ninja any different in practice than how plan9 mk works?
> (Simpler? More complicated? Faster? Slower?)

I found Ninja even simpler than the Plan9 mk -- which I chose
after bad (as in overly-complex) experiences with GNU Make...
The reason I dropped mk and moved to Ninja is its speed. I
think I hit a bug / quirk in mk which made it very unresponsive --
it took about 16 minutes to build everything from scratch, and about
40 seconds just to identify that nothing had to be made. (I think it
was related to the number of targets / prerequisites, which was
about 3000...)

As said I was astonished how quickly Ninja worked. But the killer
feature it has is the error reporting... :) No more intermingled
errors. :)

>> And for the moment I have the following questions and observations:
>>
>> * absolute paths -- it seems that even though Ninja accepts
>> absolute paths, when creating the target directories it creates them
>> relative to the current directory (thus ignoring the leading `/`);
>> (but building works); is this a feature / bug? should I always
>> generate relative paths?
>
> I haven't thought about this too hard. My vague feeling is that
> absolute paths in a build system are usually a bug (your source should
> still build even if you rename the directory it's contained in), but I
> can understand conceptually depending on something like /usr/bin/gcc.
> Could you describe your use case in more detail? It does sound like a
> bug to me.

For the moment I don't have a clear use-case for absolute paths --
I just used it because this was what I've used in mk and make.

But here is one potential example: let's assume that I want to use
Ninja to build (better said put together) a Linux distribution. Now I
could build the individual packages under the current working
directory, but the final CD could be placed somewhere else -- a
network mounted folder. (Or maybe some temporary files are too big and
I would like to create them in `/tmp`.) Sounds plausible?

>> * you can't have a target named `build`; feature / bug? (how
>> should these "phony" targets be named? I went with the name
>> `__build__`...)
>
> Hah, a flaw in the tokenizer (the word "build" is tokenized as a
> special word rather than a possible filename). I would like to fix
> this. I opened https://github.com/martine/ninja/issues/27 for you.

Thanks.

>> * for now all my build rules are like as follows, as my hack is
>> actually transliterating the old `mk` to the `ninja` script:
>> ~~~~
>> rule sh
>>   command = exec </dev/null >/dev/null ; ${sh_command}
>> build .outputs/erlang/applications/rabbit/ebin/rabbit_exchange_type_headers.beam : sh .outputs/erlang/applications/rabbit/src/rabbit_exchange_type_headers.erl
>>   sh_command = "${vbs_tools_erl}" '-smp' '-mode' 'minimal' ... ;
>> ~~~~
>> thus the question is related with the `command` execution: for
>> now in the documentation we find only generic references as "command
>> line", "command", but no concrete interpretation is given. From what I
>> see -- and from an earlier post -- we see that `command` is actually
>> passed to the `sh -c` interpreter; about this I have a great grief
>> with sh and escaping rules...
>> Therefore does Ninja step into the other build system's shoes
>> and just delegate command execution to `sh -c` or shall "fix" this
>> mess and just use "execve"?
>
> I opened https://github.com/martine/ninja/issues/28 about the missing docs.

Thanks again!

> execve would require us to parse the command into an argv array, which
> means we'd need to get into the ugly business of handling quoting.
> However, maybe we can get away with defining a heavily reduced quoting
> set (like say, escape spaces with backslashes if you don't want them
> to be argument separators) and saying it's the ninja-file-generator's
> problem to do quoting correctly.
>
> Note that on Windows the only way to execute a command is with the
> equivalent of system().
>
> It seems to me that, other than simplifying the above problem, the
> only advantage of not using a shell is that it might save on execution
> time. But I don't think subshell startup is enough time to warrant
> it.
>
> My intuition is that you should just deal with the escaping yourself,
> or perhaps even adjust your meta-build system to generate a specific
> rule for each command it wants to run (which would allow you to use
> one fewer layer of escaping). What do you think?

I tend to think I'm pretty good with Sh (as in Bash), and I've
written some other Sh script generators, but each time I had serious
problems with quoting and the Sh quirks (they mainly include quoting
single quotes) -- too many corner cases...

From my experience with build system generation (I've built one in
Sh which generates GNU Make files, this one which generates mk and
Ninja, and another one in GNU Make which auto-generates rules for
itself), I've identified the following requirements for the "recipe":
* almost always a recipe is a sequence of "commands", chained --
if one fails, we stop;
* there are a couple of cases when you can't just run the command
from the current directory and need to step into the directory of
either the output or the source (as with zipping, or with converters
which don't accept an output path and just dump their output in the
current directory or in the directory of the source);
* there are cases when you need to change the environment variables;
* there are cases when you need input / output redirections;

Therefore, if it were up to me to decide how to treat the
command, I would design a **strict** but simple sh-like language that
would allow me the following:
* change the environment; change the current working directory;
redirect input or output;
* chaining would be obtained by using multiple `command` variables;

An example:
~~~~
command = env VAR1=val1 VAR2=val2 & in in_file_to_redirect & out out_file_to_redirect & chdir folder & exec command arg1 arg2
~~~~

Any of the `env`, `in`, `out` or `chdir` could be missing.
Tokenization is space based as you've suggested, escaping spaces is
done with a single `\`.
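
To make the proposal concrete, here is a rough Python sketch of a parser for it. This is only an illustration of the mini-language described above, not anything Ninja implements. The function names are made up, and in this sketch an argument consisting of a literal `&` cannot be expressed.

```python
def tokenize(text):
    """Split on spaces; a backslash makes the next character literal."""
    tokens, current, escaped = [], [], False
    for ch in text:
        if escaped:
            current.append(ch)
            escaped = False
        elif ch == "\\":
            escaped = True
        elif ch == " ":
            if current:
                tokens.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        tokens.append("".join(current))
    return tokens


def parse_recipe(text):
    """Parse `env ... & in ... & out ... & chdir ... & exec ...` clauses."""
    recipe = {"env": {}, "in": None, "out": None, "chdir": None, "exec": None}
    clause = []
    for token in tokenize(text) + ["&"]:  # trailing "&" flushes the last clause
        if token != "&":
            clause.append(token)
            continue
        if not clause:
            raise ValueError("empty clause")
        keyword, args = clause[0], clause[1:]
        if keyword == "env":
            recipe["env"].update(arg.split("=", 1) for arg in args)
        elif keyword in ("in", "out", "chdir"):
            (recipe[keyword],) = args  # exactly one argument expected
        elif keyword == "exec":
            recipe["exec"] = args
        else:
            raise ValueError(f"unknown clause: {keyword}")
        clause = []
    return recipe


print(parse_recipe("env VAR1=val1 & chdir my\\ folder & exec command arg1 arg2"))
```

The result maps directly onto an `execve`-style call: an argument vector, an environment, a working directory and two optional redirections, with no shell involved.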

My only concern is how to treat a variable that holds multiple
values. How would `cmd $a` be expanded if `$a` contains `b`, `c` and
`d`? I guess `cmd` would receive three arguments. But how about `cmd
__$a__`? Would it be `cmd` `__b` `c` `d__`, or `cmd` `__b__` `__c__`
`__d__`? (This is inspired by Plan9's `rc`.)
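
The two candidate readings can be written down as a small Python sketch. The function names are invented for illustration, and neither reading is claimed to be what Ninja does.

```python
def expand_spliced(prefix, values, suffix):
    """Paste prefix and suffix onto the two ends of the whole list."""
    if not values:
        return [prefix + suffix]
    words = list(values)
    words[0] = prefix + words[0]
    words[-1] = words[-1] + suffix
    return words


def expand_distributed(prefix, values, suffix):
    """Paste prefix and suffix onto every element of the list."""
    return [prefix + value + suffix for value in values]


a = ["b", "c", "d"]
print(expand_spliced("__", a, "__"))      # first reading in the question
print(expand_distributed("__", a, "__"))  # second reading
```

The two readings agree when the variable holds exactly one value, so the choice only matters for list-valued variables.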

>> * stdin and stdout: I must put `exec </dev/null >/dev/null` in the
>> command as it seems the Erlang compiler breaks if it finds a closed
>> stdin / stdout; bug / feature?
>
> Sounds like a bug. I thought I provided /dev/null as stdin already,
> but I might be confusing projects... Can you reduce the problem into
> a test case somehow? (Maybe make a shell or Python script that
> crashes if it's missing stdin or stdout and verify that it crashes
> under Ninja.)

I've looked through the source code, and indeed you use
`/dev/null` when forking, but I've not investigated deeper. It might
be from the Erlang compiler. I'll have to see.

>> And now a rules design question: let's suppose you have an include
>> folder that you are actually building, and we have some files which
>> when building depend on some files in the include folder. Now the
>> solution here is to use `build target : inputs | header1 header2`,
>> right? But what if I don't know which files it depends on? (In my case
>> I don't have a tool that generates Erlang include dependencies...) How
>> should I handle this rule? I guess `builder target : input |
>> include_dir_name` is wrong and I've used `builder target : input |
>> include_dir_name/.ready` and the `.ready` file depends on all the
>> files in the folder. Any suggestions?
>
> Just to make sure I follow:
> 1) you have a command that generates headers into include_dir/

For the moment I just copy the include files from the sources into
this folder.

> 2) you have another command that relies on those generated headers

The Erlang compiler.

> 3) you have no way of getting a list of which headers will be generated
> ?

I could get such a list, but obtaining it is not easy... For
example I can imagine that instead of copying files there I could
unzip them from somewhere. (Erlang has `*.ez` files that are actually
Zips that contain an `include` folder.)

> How do you generate the .ready file properly in that case?

In the current case I just have `.ready` depend on all the
`.hrl` files that must be copied. But if I needed to unzip some files
there, I would invent a `.some_application.ez.unzipped` file on
which `.ready` depends.

> I would think that whatever generates the .ready file could maybe
> generate a depfile. Or you could make the command in #2 depend on the
> command in #1 (which would mean you always rebuild if you generate any
> headers, but that sounds like that is already what you get with a
> .ready file).
>
> I'm not sure I follow this example well enough to give you advice...

Not quite. As said the `.ready` file has a rule like: `build
some_folder/include/.ready : touch
<<a_list_of_files_inside_include>>`, thus the `.ready` is only
"retouched" when I change either any of the `.hrl` files, or in the
unzip case when the zip itself changes.
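
As a sketch of how a generator might emit this pattern, the following Python snippet produces the `.ready` statement and the statements of the targets that wait on it. The `erlc` rule name, the file names and the helper name are assumptions made for illustration. Paths are assumed to contain no spaces, `$` or `:`.

```python
def stamp_statements(stamp, headers, targets):
    """Return Ninja lines for a stamp file and the targets that depend on it.

    `targets` maps each output path to a (rule, source) pair.
    """
    lines = [
        "rule touch",
        "  command = touch ${out}",
        f"build {stamp} : touch {' '.join(headers)}",
    ]
    for output, (rule, source) in targets.items():
        # Everything after "|" is an extra dependency that is not passed
        # to the command as an input.
        lines.append(f"build {output} : {rule} {source} | {stamp}")
    return lines


print("\n".join(stamp_statements(
    "include/.ready",
    ["include/a.hrl", "include/b.hrl"],
    {"ebin/x.beam": ("erlc", "src/x.erl")},
)))
```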

Thanks for your reply, looking forward to hearing from you,
Ciprian.

Evan Martin

Apr 29, 2011, 1:50:00 PM

to ninja...@googlegroups.com

On Sun, Apr 24, 2011 at 1:41 AM, Ciprian Dorin Craciun
<ciprian...@gmail.com> wrote:
> The reason I've dropped mk and moved to Ninja is its speed. I
> think I've hit a bug / quirk in mk which made it very unresponsive --
> it took about 16 minutes to build everything from scratch, and about
> 40 seconds just to identify that nothing has to be made. (I think it
> was related with the number of targets / prerequisites which were
> about 3000...)

Wow, that seems pretty pathologically bad. I wonder if it's worth
reducing it to a bug report for the mk maintainers.

>
> As said I was astonished how quickly Ninja worked. But the killer
> feature it has is the error reporting... :) No more intermingled
> errors. :)

Ah, I'm glad you like it! I felt a bit guilty about this feature
given that it's orthogonal to the other goals of ninja, but I just
couldn't resist. I feel like the build system is the best layer to
manage this kind of output. (I've even considered adding an option
for generating HTML logs that allow better navigation through errors.)

> For the moment I don't have a clear use-case for absolute paths --
> I just used it because this was what I've used in mk and make.
>
> But here is one potential example: let's assume that I want to use
> Ninja to build (better said put together) a Linux distribution. Now I
> could build the individual packages under the current working
> directory, but the final CD could be placed somewhere else -- a
> network mounted folder. (Or maybe some temporary files are too big and
> I would like to create them in `/tmp`.) Sounds plausible?

Hm, yes. I guess more concretely an absolute path for the build
output directory can make a lot of sense. I would still claim that
most paths should be relative to that directory, but ninja doesn't
know much about paths so it's not like this claim makes any actual
implementation difference.
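
On the generator side, that convention could look like the following Python sketch: paths inside the build directory are emitted relative to it, while anything outside (a system header, a network mount) stays absolute. `ninja_path` is a hypothetical helper, and both arguments are assumed to be absolute POSIX paths.

```python
import posixpath


def ninja_path(path, build_root):
    """Return `path` relative to `build_root` when it lies inside it."""
    path = posixpath.normpath(path)
    build_root = posixpath.normpath(build_root)
    if posixpath.commonpath([path, build_root]) == build_root:
        return posixpath.relpath(path, build_root)
    return path  # outside the tree: keep the absolute path


print(ninja_path("/src/proj/out/a.beam", "/src/proj"))
print(ninja_path("/usr/lib/stdio.h", "/src/proj"))
```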

Can you open an issue at https://github.com/martine/ninja/issues with
a simple file that reproduces the original problem?

>>> * you can't have a target named `build`; feature / bug? (how
>>> should these "phony" targets be named? I went with the name
>>> `__build__`...)
>>
>> Hah, a flaw in the tokenizer (the word "build" is tokenized as a
>> special word rather than a possible filename). I would like to fix
>> this. I opened https://github.com/martine/ninja/issues/27 for you.
>
> Thanks.

I have a patch for this, but I want to verify I didn't regress
performance. I'll hopefully have it fixed today.

> I tend to think I'm pretty good with Sh (as in Bash), and I've
> written some other Sh script generators, but each time I had serious
> problems with quoting and the Sh quirks (they mainly include quoting
> single quotes) -- too many corner cases...

That's a good point, it can be really hard to get right...

I'm really wary of starting down this path, for exactly these sorts
of concerns.
On the other hand, especially on Windows it does seem useful to be
able to chain commands together.
I would like to think about this more.
For things like redirections and environment variables, I think I'd
prefer requiring you to generate a .sh or .bat file, but maybe there's
a small thing we could do that would cover 90% of the use cases.

I feel a bit bad saying it, but your solution seems pretty good to me.
In Chrome's build I have a rule called "stamp" that just touches a
file like your .ready, and it's used as a sort of state checkpoint in
various places. (E.g. in Chrome we have a Perl script that generates
a bunch of JavaScript bindings headers; rather than trying to line up
all the build output, I have bindings.stamp get updated whenever the
Perl script runs.)

Especially in cases where you're, for example, unzipping a zip file
where you don't know the contents, it's nice to be able to output a
known file name and use the state of that file in build rules as a
stand-in for "the contents of the zip file".

Evan Martin

May 17, 2011, 7:45:50 PM

to ninja...@googlegroups.com

On Fri, Apr 29, 2011 at 10:50 AM, Evan Martin <mar...@danga.com> wrote:
>> For the moment I don't have a clear use-case for absolute paths --
>> I just used it because this was what I've used in mk and make.
>>
>> But here is one potential example: let's assume that I want to use
>> Ninja to build (better said put together) a Linux distribution. Now I
>> could build the individual packages under the current working
>> directory, but the final CD could be placed somewhere else -- a
>> network mounted folder. (Or maybe some temporary files are too big and
>> I would like to create them in `/tmp`.) Sounds plausible?
>
> Hm, yes. I guess more concretely an absolute path for the build
> output directory can make a lot of sense. I would still claim that
> most paths should be relative to that directory, but ninja doesn't
> know much about paths so it's not like this claim makes any actual
> implementation difference.
>
> Can you open an issue at https://github.com/martine/ninja/issues with
> a simple file that reproduces the original problem?

Sorry that this problem got lost, I don't know if you ever made a bug from it.
But I managed to run into this myself. Simple bug, checked in a
regression test for it too.

https://github.com/martine/ninja/commit/b4bebb7ead0461529c9ff26cbb3db6b77a2643be

(We do need to not mangle absolute paths when you have dependencies on
e.g. /usr/lib/stdio.h.)
