Re: [9fans] `mk` (from Plan9 ports) efficiency related issue

erik quanstrom

Jan 17, 2011, 10:00:50 AM

> Any ideas what could cause this?

have you tried profiling mk?

- erik

Robert Raschke

Jan 17, 2011, 10:01:57 AM

Terribly sorry, my email won't help you much, apart from going "Wow, a 4000 link mk file!" and "Hmm, I wouldn't start from here if you want to go there." Your email also doesn't explain why you cannot generate a "normal" mk file.

If you want to stick with your approach, it almost looks like you may be better off generating a shell script that explicitly checks and runs everything you need. (And yes, essentially make your "generator" be a make in its own right. Another one won't hurt.)

But it's cool to see someone else who uses Erlang and RabbitMQ hanging out on this list. :-)

Robby

On Mon, Jan 17, 2011 at 2:47 PM, Ciprian Dorin Craciun <ciprian...@gmail.com> wrote:

Hello all!

Sorry for interrupting again, but I've stumbled on an `mk` issue...

I've written a little Scheme application that generates `mk`
scripts for building Erlang applications. (See below an extract of one
of my previous emails describing just the generator part; the thread
had the subject: <<mk (from plan9ports) modification time resolution
issue?>>.)

Now the problem is that the generated script (attached to this
email) has about 2671 prerequisites (` target :: prerequisite `), and
about 684 actual targets with recipes. (As I've explained below I'm
not using any meta-rules, and I'm explicit about each resulting file
and its dependencies.)

The problem is that the time needed to run the script has
extremely increased, and the processor is 100% eaten by `mk`.

For example:
* just running the `mk` script with the `-n` option takes about 14 seconds;
* running the commands printed by `mk -n` takes about 1 minute and 36 seconds;
* then running `mk` takes another 14 seconds (although it has nothing to do);
* but after cleaning and running `mk` (which I left running for
about 5 minutes and it still hadn't finished) it seems to pause for
about 14 seconds between each target (or batch of targets?).

But what is strange is that if, instead of building the default
target that builds everything, I build independent parts little by
little, it works without that great delay...

Any ideas what could cause this?

Thanks,
Ciprian.

----------
[[ Extract from the previous email. ]]
----------

BTW... People might wonder how come I have 367 targets (with 1221
prerequisites) for such a small project? :) The answer is I don't
write the `mk` script by hand, but I've written a small Scheme
application that just generates the `mk` script based on descriptions
like the following. (Thus the resulting `mk` script is quite
exhaustive with quite tight dependencies and doesn't use
meta-rules...) :)

So just out of curiosity are there any `mk` script generators out there?

Ciprian.

~~~~
(vbs:require-erlang)

(vbs:define-erlang-application 'rabbit
erl: "\\./(rabbitmq-server--latest/src|generated)/.*\\.erl"
hrl: "\\./(rabbitmq-server--latest/include|generated)/.*\\.hrl"
additional-ebin: "\\./generated/rabbit\\.app")

(vbs:define-erlang-application 'rabbit_common
erl: "\\./(rabbitmq-server--latest/src|generated)/(rabbit_writer|rabbit_reader|rabbit_framing_amqp_0_8|rabbit_framing_amqp_0_9_1|rabbit_framing_channel|rabbit_basic|rabbit_binary_generator|rabbit_binary_parser|rabbit_channel|rabbit_exchange_type|rabbit_misc|rabbit_net|rabbit_heartbeat|rabbit_msg_store_index|gen_server2|priority_queue|supervisor2)\\.erl"
hrl: "\\./(rabbitmq-server--latest/include|generated)/.*\\.hrl"
additional-ebin: "\\./generated/rabbit_common\\.app")

(vbs:define-erlang-application 'amqp_client
dependencies: 'rabbit_common
erl: "\\./rabbitmq-erlang-client--latest/src/.*\\.erl"
hrl: "\\./rabbitmq-erlang-client--latest/include/.*\\.hrl"
additional-ebin: "\\./generated/amqp_client\\.app")
~~~~

Robert Raschke

Jan 17, 2011, 10:04:19 AM

Err, "Wow, a 4000 line mk file!"

Ciprian Dorin Craciun

Jan 17, 2011, 10:06:58 AM

In fact I tried to `strace -f -T` it and it seems that in the
first second or so it `stats` all the files that exist, and then it
just waits 14 seconds computing something (100% processor), and
concludes that all is already built. (This is after I've already
successfully built it once).

But I haven't done any further profiling, as I stumbled upon this
issue only about an hour ago...

Ciprian.

erik quanstrom

Jan 17, 2011, 10:21:16 AM

> Err, "Wow, a 4000 line mk file!"

machine making machines -- oh, my
- c3po

- erik

Ciprian Dorin Craciun

Jan 17, 2011, 10:23:10 AM

On Mon, Jan 17, 2011 at 16:47, Ciprian Dorin Craciun

> Any ideas what could cause this?
>

P.S.: A complete build from nothing takes about 16 minutes...

Ciprian Dorin Craciun

Jan 17, 2011, 10:35:43 AM

On Mon, Jan 17, 2011 at 17:00, Robert Raschke <rtrl...@googlemail.com> wrote:
> Terribly sorry, my email won't help you much, apart from going "Wow, a 4000
> link mk file!" and "Hmm, I wouldn't start from here if you want to go
> there."

No problem. Any feedback is welcome, as I try to understand what
I'm not doing properly.

> Your email also doesn't explain why you cannot generate a "normal"
> mk file.

I'm afraid I don't understand the question. What do you mean by
"generating a normal mk file"?
A) Do you mean why am I using a generator that writes the `mk`
script instead of writing the `mk` script myself by hand? The answer
to this is complexity: writing `mk` is Ok when you have a simple
application to build, but as the application grows larger so does the
make script. (And using meta rules is not always possible.)
B) Why isn't the output script a "normal" `mk` script? Actually it is
a very simple script (no meta-rules, no shell expansion, etc.). It's
just big. :)

> If you want to stick with your approach, it almost looks like you may be
> better off to generate a shell script that explicitly checks and runs
> everything you need. (And yes, essentially make your "generator" be a make
> in its own right. Another one won't hurt.)

I favored the idea of using another make tool because of
portability and simplicity. The target is that I only need to generate
the make script once and distribute it with the source code, thus it
could be built with an already existing make tool (currently only `mk`
from Plan9 is supported, but I plan to also add GNU/BSD make support.)

The idea of generating shell scripts doesn't seem so tempting, as
I know that scripting -- at least in Bash -- is not very reliable or
easy...

(BTW I've chosen `mk` over `make` because its rules (syntax and
semantics) seem much simpler and saner than the ones in GNU make...
(A lot of automagic happens in that realm...) But if I'm not able to
fix this I'll have to resort back to make...) :(

> But it's cool to see someone else who uses Erlang and RabbitMQ hanging out
> on this list. :-)
>
> Robby

Glad to see another erlanger. :)

Ciprian.

Federico G. Benavento

Jan 17, 2011, 10:53:47 AM

>> Your email also doesn't explain why you cannot generate a "normal"
>> mk file.
>
> I'm afraid I don't understand the question. What do you mean by
> "generating a normal mk file"?
> A) Do you mean why am I using a generator that writes the `mk`
> script instead of writing the `mk` script myself by hand? The answer
> to this is complexity: writing `mk` is Ok when you have a simple
> application to build, but as the application grows larger so does the
> make script. (And using meta rules is not always possible.)
> B) Why isn't the output script a "normal" `mk` script? Actually it is
> a very simple script (no meta-rules, no shell expansion, etc.). It's
> just big. :)
>
>

a normal mkfile does have meta-rules and if you have so many targets
wouldn't it make sense to have more mkfiles?

--
Federico G. Benavento

Robert Raschke

Jan 17, 2011, 10:55:40 AM

On Mon, Jan 17, 2011 at 3:33 PM, Ciprian Dorin Craciun <ciprian...@gmail.com> wrote:

On Mon, Jan 17, 2011 at 17:00, Robert Raschke <rtrl...@googlemail.com> wrote:
> Your email also doesn't explain why you cannot generate a "normal"
> mk file.

I'm afraid I don't understand the question. What do you mean by
"generating a normal mk file"?
A) Do you mean why am I using a generator that writes the `mk`
script instead of writing the `mk` script myself by hand? The answer
to this is complexity: writing `mk` is Ok when you have a simple
application to build, but as the application grows larger so does the
make script. (And using meta rules is not always possible.)
B) Why isn't the output script a "normal" `mk` script? Actually it is
a very simple script (no meta-rules, no shell expansion, etc.). It's
just big. :)

Sorry, I meant an idiomatic mk file, in the sense as they are used within the Plan 9 distribution. Have a look at "Plan 9 Mkfiles" (http://www.cs.bell-labs.com/sys/doc/mkfiles.html) and "Maintaining Files on Plan 9 with Mk" (http://www.cs.bell-labs.com/sys/doc/mk.html), if you haven't already done so.

I think by listing all your dependencies one by one, step by step, you are bypassing a lot of the strengths of a make system. I would expect your generator to produce a mk include file with the meta rules plus the mk file itself which lists file dependencies in a concise manner.

Robby

andrey mirtchovski

Jan 17, 2011, 11:04:17 AM

something else to try if you're on a multiprocessor system: set $NPROC
to a value > 1 and see how this affects the runtime in the 14-second
case. your dependency tree may be very deep, but parallelizable.

Ciprian Dorin Craciun

Jan 17, 2011, 11:47:16 AM

I've tried that; setting NPROC=1 or NPROC=8 makes no difference to
the behaviour. (I'm on a Core2 Duo processor.)

Bakul Shah

Jan 17, 2011, 11:51:58 AM

On Mon, 17 Jan 2011 17:05:27 +0200 Ciprian Dorin Craciun <ciprian...@gmail.com> wrote:
> On Mon, Jan 17, 2011 at 16:59, erik quanstrom <quan...@quanstro.net> wrote:
> >> Any ideas what could cause this?
> >
> > have you tried profiling mk?
> >
> > - erik
>
> In fact I tried to `strace -f -T` it and it seems that in the
> first second or so it `stats` all the files that exist, and then it
> just waits 14 seconds computing something (100% processor), and
> concludes that all is already built. (This is after I've already
> successfully built it once).

strace tells you what system calls were made and when. To
find out which functions use most time, compile with -pg and
look at the gprof output once done. That 14 seconds were
probably spent computing dependencies. You can convert your
test.mk to a Makefile with a trivial sed script. See what
bsdmake or gmake does with it time wise. {bsd,g}make have
been abused with huge Makefiles for far longer and are
likely to be friendlier to them :-)

But the real issue is that mk has to check all the long
dependency chains your generator creates and it is probably
not tuned for such large mkfiles as typically one factors out
build logic in a set of mkfiles and uses meta rules where
appropriate.

erik quanstrom

Jan 17, 2011, 11:58:11 AM

> strace tells you what system calls were made and when. To
> find out which functions use most time, compile with -pg and
> look at the gprof output once done. That 14 seconds were
> probably spent computing dependencies. You can convert your
> test.mk to a Makefile with a trivial sed script. See what
> bsdmake or gmake does with it time wise. {bsd,g}make have
> been abused with huge Makefiles for far longer and are
> likely to be friendlier to them :-)

why not just use prof, which is exactly the tool for the job?

i don't see how comparing with *make would get one closer
to solving the mystery.

- erik

Ciprian Dorin Craciun

Jan 17, 2011, 12:33:22 PM

On Mon, Jan 17, 2011 at 17:51, Federico G. Benavento
<bena...@gmail.com> wrote:
>
> a normal mkfile does have meta-rules and if you have so many targets
> wouldn't it make sense to have more mkfiles?
>
> --
> Federico G. Benavento

I'll respond to both Robert and Federico in the same email, as
their observations and suggestions are on the same topic.

So for starters I've read both mentioned papers "Plan9 Mkfiles"
and "Maintaining Files on Plan9 with Mk", and I have the following
observations:
* the second paper "Maintaining Files on ..." is more a general
guide that describes the semantics and syntax of `mk` files;
* the first paper "Plan9 Mkfiles" focuses entirely and exclusively
on writing `mk` files for building Plan9 native applications; that is
it describes the general rules on how to write short and simple `mk`
files that take advantage of the existing Plan9 build infrastructure;
* as a consequence both of them seem to suggest writing small `mk`
scripts that individually build each application or library;
* unfortunately what they don't deal with is inter-dependencies
between multiple projects or libraries each with its own `mk` script;
* thus if looking into the `plan9port` source code, inside the
`src/mkfile` I see the following snippet (and I count about 50 other
similar instances):
~~~~
libs-%:V:
	for i in $LIBDIRS
	do
		(cd $i; echo cd `pwd`';' mk $MKFLAGS $stem; mk $MKFLAGS $stem)
	done
~~~~

But after reading the paper "Recursive Make Considered Harmful", I
see that this approach poses at least the following problems:
* when developing and updating a single library, there is no way
in which I can instruct `mk` to rebuild all dependent applications --
unless I know about them and enumerate them one by one; as a
consequence -- unless I know very well the real dependency graph --
`mk` doesn't help me at all and I have to `mk clean all`;
* second, there is a performance bottleneck as now libraries can't
be built in parallel; (this is fine if a library is quite big, but if
I have libraries with a number of files on par with the number of
processors I have wasted time;)
* third, it's almost impossible to make a dependency between one
library to another; the dependency is encoded in their order in the
variables like `LIBDIRS`;

I hope that these observations answer the first question of why
don't I just create a number of separate files one for each project or
library.

(The cited paper is at http://aegis.sourceforge.net/auug97.pdf .)

For the second question about why no "meta-rules", the answer is
somewhat trickier, thus I'll give a few problems which arise when using
meta-rules:
* not all files of the same type are built by using the same rule;
(for example for some files I want to enable debugging, for others
not); thus I don't see how a meta-rule would solve this problem;
(unless I create separate `mk` files or I resort to filename tags --
for example I would call `*.debug.c` all C files that I want to be
debugged, and `*.release.c` for those I don't);
* second, each file has a unique dependency graph -- for example
one `*.c` file might include other headers than another `*.c` file in
the same project; thus when updating a single header I want to be able
to build only those `*.c` files that actually depend on it; (this
observation is less important for C applications, but for other types
of programs -- like Java -- this fine grained dependency tracking
means a lot of saved computing power);
* third, in my brief experience with make files, meta-rules are
quite hard to get right... furthermore it is impossible to have two
patterns like: `%.%.x`; just imagine I have two types of images:
background and foreground and I want to superimpose them, then I would
like to be able to write `%{foreground}.%{background}.jpg :
%{foreground}.jpg %{background}.jpg ...`; (I know that I can resort
here to "<| generating command" but it seems just plain wrong...)

Now I hope nobody was offended by my observations regarding the
way `mk` scripts are currently written. I am sure that for the purpose
of building Plan9 applications this way is the best trade-off. But
applying the same techniques to more complex programming languages or
other systems is quite hard... Thus I just wanted to use the good `mk`
tool as a backend for a build script generator that is more intimately
acquainted with what it tries to build. (I want to extend my generator
to Python -- as a replacement for `setup.py` -- and Java -- as a
replacement for all the awful tools that exist in that ecosystem,
especially Ant.)

Ciprian.

Federico G. Benavento

Jan 17, 2011, 12:50:50 PM

when you have a clean mkfile, doing mk clean; mk install is faster than all the
dependency checking you'd want to do, especially if the project is a big bloat

take X11 for instance.... how long does it take to build it?

On Mon, Jan 17, 2011 at 2:31 PM, Ciprian Dorin Craciun

--
Federico G. Benavento

Federico G. Benavento

Jan 17, 2011, 1:30:53 PM

about debug, release

CONF=debug
DEBUG=`{if(~ $CONF release) echo -DNDEBUG}

CFLAGS=$CFLAGS $DEBUG

if you want debug you run
mk 'CONF=debug'

wouldn't something like that help?

or have 2 files mkdebug and mkrelease
so you include them from the other mkfiles as you see fit

On Mon, Jan 17, 2011 at 2:46 PM, Federico G. Benavento

--
Federico G. Benavento

Bakul Shah

Jan 17, 2011, 1:37:20 PM

On Mon, 17 Jan 2011 11:56:22 EST erik quanstrom <quan...@labs.coraid.com> wrote:
> > strace tells you what system calls were made and when. To
> > find out which functions use most time, compile with -pg and
> > look at the gprof output once done. That 14 seconds were
> > probably spent computing dependencies. You can convert your
> > test.mk to a Makefile with a trivial sed script. See what
> > bsdmake or gmake does with it time wise. {bsd,g}make have
> > been abused with huge Makefiles for far longer and are
> > likely to be friendlier to them :-)
>
> why not just use prof, which is exactly the tool for the job?

Ciprian specified plan9ports. Recipe for building a profiling
mk on unix:

cd $PLAN9/src/cmd/mk
9 mk clean
CC9="gcc -pg" 9 mk all

Then:

mk=$PLAN9/src/cmd/mk/o.mk
cd <source dir>
9 mk clean
9 $mk
gprof $mk > mk.gprof

This will show where time is being spent.

> i don't see how comparing with *make would get one closer
> to solving the mystery.

The comparison would reveal if other makes do better. I
suspect they do and that would solve Ciprian's problem.

Ciprian Dorin Craciun

Jan 17, 2011, 2:35:41 PM

On Mon, Jan 17, 2011 at 20:36, Bakul Shah <bakul...@bitblocks.com> wrote:
> On Mon, 17 Jan 2011 11:56:22 EST erik quanstrom <quan...@labs.coraid.com> wrote:
>> > strace tells you what system calls were made and when. To
>> > find out which functions use most time, compile with -pg and
>> > look at the gprof output once done. That 14 seconds were
>> > probably spent computing dependencies. You can convert your
>> > test.mk to a Makefile with a trivial sed script. See what
>> > bsdmake or gmake does with it time wise. {bsd,g}make have
>> > been abused with huge Makefiles for far longer and are
>> > likely to be friendlier to them :-)
>

>> i don't see how comparing with *make would get one closer
>> to solving the mystery.
>
> The comparison would reveal if other makes do better. I
> suspect they do and that would solve Ciprian's problem.

Ok. So I've transformed (with a `sed` script as suggested) the
script from `mk` to GNU `make`. The results are as follows:
* building the entire thing with no parallelism took about 1 minute
and 27 seconds, roughly as long as in my experiment where I just
obtained the raw commands and ran them with plain sh;
* after the build, issuing `make` a second time takes under one
second (0.2 seconds);

I'll try now to profile the `mk` tool as suggested.

Ciprian.

P.S.: I'm not suggesting that GNU `make` is a better tool, just
that for this particular task it behaves better. (Actually I find it
overly complex and almost incomprehensible in its full
implications...)

Bakul Shah

Jan 17, 2011, 3:31:57 PM

On Mon, 17 Jan 2011 21:59:58 +0200 Ciprian Dorin Craciun <ciprian...@gmail.com> wrote:
> Actually I'm using the `mk` from
> `http://swtch.com/plan9port/unix/mk-with-libs.tgz` which I'm assuming
> is an extract of the `plan9port` sources. As such in order to build it
> I had to resort to updating the Makefile, but I did it.

Look at the top 4 items:

  %   cumulative   self              self     total
 time   seconds   seconds    calls   s/call   s/call  name
28.11      2.24      2.24        1     2.24     2.24  clrmade
22.96      4.07      1.83        1     1.83     1.83  attribute
22.33      5.85      1.78        1     1.78     1.78  ambiguous.clone.2
20.45      7.48      1.63        1     1.63     1.63  cyclechk

Next look at the call graphs to see that these functions are
called 59+ million times! Significantly more times than
anything else. These are all in src/cmd/mk/graph.c -- mk is
using a simple algorithm with much worse time complexity than
is theoretically possible.
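
To illustrate how visit counts can balloon far beyond the number of rules, here is a minimal Python sketch. It is not mk's code: it assumes, without having checked graph.c, that a traversal re-walks shared prerequisites once per path. The `layered_dag` helper and its sizes are made up for the example.

```python
def layered_dag(layers, width):
    """Build a DAG in which every node of layer i depends on every node of layer i+1."""
    deps = {}
    for i in range(layers):
        for j in range(width):
            if i + 1 < layers:
                deps[(i, j)] = [(i + 1, k) for k in range(width)]
            else:
                deps[(i, j)] = []  # leaves have no prerequisites
    return deps


def naive_visits(deps, node):
    """Count node visits when shared prerequisites are re-walked on every path."""
    return 1 + sum(naive_visits(deps, d) for d in deps[node])


def memo_visits(deps, node, seen=None):
    """Count node visits when each node is expanded at most once."""
    if seen is None:
        seen = set()
    if node in seen:
        return 0
    seen.add(node)
    return 1 + sum(memo_visits(deps, d, seen) for d in deps[node])


deps = layered_dag(layers=8, width=4)   # 32 nodes, 112 edges
root = (0, 0)
print(naive_visits(deps, root))  # 21845
print(memo_visits(deps, root))   # 29
```

With only 32 nodes the per-path walk already makes 21845 visits, while marking nodes as seen keeps it at 29, so a few thousand tightly interlinked prerequisites could plausibly reach millions of calls.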

I haven't studied make algorithms much to know if one can
just compute transitive closure of the dependency matrix
upfront as that would definitely be much faster than walking
so many linked lists so many times. cycle check is then just
walking down the diagonal of TC(dependency-matrix) to find
self-dependencies. A dependency matrix is usually very sparse
so one can do better than Warshall's Algorithm with its
O(N^3) time complexity.
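
As an alternative to the transitive closure, a cycle check can also be done with a single depth-first search, which is linear in the number of targets plus dependencies. The following Python sketch shows that swapped-in technique under the assumption that the rules are available as a plain mapping from target to prerequisites; `find_cycle` is a hypothetical helper, not something from mk.

```python
def find_cycle(deps):
    """Return one dependency cycle as a list of targets, or None if there is none.

    deps maps each target to an iterable of its prerequisites; names that
    appear only as prerequisites are treated as leaves.
    """
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on the current path, finished
    color = {}
    for start in deps:
        if color.get(start, WHITE) != WHITE:
            continue
        color[start] = GREY
        path = [start]
        stack = [(start, iter(deps.get(start, ())))]
        while stack:
            node, prereqs = stack[-1]
            for nxt in prereqs:
                state = color.get(nxt, WHITE)
                if state == GREY:
                    # nxt is already on the current path: that is a cycle.
                    return path[path.index(nxt):] + [nxt]
                if state == WHITE:
                    color[nxt] = GREY
                    path.append(nxt)
                    stack.append((nxt, iter(deps.get(nxt, ()))))
                    break
            else:
                # All prerequisites explored; this node is finished.
                color[node] = BLACK
                stack.pop()
                path.pop()
    return None


print(find_cycle({"a": ["b"], "b": ["c"], "c": []}))     # None
print(find_cycle({"a": ["b"], "b": ["c"], "c": ["a"]}))  # ['a', 'b', 'c', 'a']
```

The explicit stack avoids Python's recursion limit on the long dependency chains a generator can produce.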

Not sure how any of this helps you though.... You are better
off using gnumake for maximum portability, however detestable
it might be. You have to learn just enough of it to get by.

Andy Spencer

Jan 18, 2011, 1:11:13 AM

On 2011-01-17 19:31, Ciprian Dorin Craciun wrote:
> * not all files of the same type are built by using the same rule;
> (for example for some files I want to enable debugging, for others
> not); thus I don't see how a meta-rule would solve this problem;
> (unless I create separate `mk` files or I resort to filename tags --
> for example I would call `*.debug.c` all C files that I want to be

Use a variable:

a_cflags=-g -Da
b_cflags=-g -Db
test: test.o a.o b.o
	gcc -o $target $prereq
%.o: %.c
	gcc $($stem^_cflags) -c -o $target $stem.c

Or only filename-tag the object files:

test: test.o a-debug.o b-opt.o c-prof.o
%.o: %.c
	gcc -c -o $target $prereq
%-opt.o: %.c
	gcc -O -c -o $target $prereq
%-debug.o: %.c
	gcc -g -c -o $target $prereq
%-prof.o: %.c
	gcc -p -c -o $target $prereq

> * second, each file has a unique dependency graph -- for example
> one `*.c` file might include other headers than another `*.c` file in
> the same project; thus when updating a single header I want to be able
> to build only those `*.c` files that actually depend on it; (this
> observation is less important for C applications, but for other types
> of programs -- like Java -- this fine grained dependency tracking
> means a lot of saved computing power);

Have you looked at cpp -M? Many cpps will generate make style
dependencies for you, I think they'll work with mk as well. You can
include them and then use meta rules for all the real work.

It's still a large list of dependencies though, so it might not help you
in this case. I'm not sure if Erlang has something similar or not.
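
For a generator that wants to consume such output rather than include it, here is a rough Python sketch. It assumes the dependencies come as `target: prerequisite ...` lines with backslash continuations, the shape that `gcc -MM` prints; `parse_deps`, `dependents`, and the sample text are hypothetical.

```python
def parse_deps(text):
    """Parse make-style dependency lines into {target: [prerequisites]}."""
    deps = {}
    # A backslash at the end of a line continues the rule on the next line.
    joined = text.replace("\\\n", " ")
    for line in joined.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        target, sep, prereqs = line.partition(":")
        if not sep:
            continue  # not a dependency line
        deps.setdefault(target.strip(), []).extend(prereqs.split())
    return deps


def dependents(deps, changed):
    """Targets that list `changed` directly among their prerequisites."""
    return sorted(t for t, prereqs in deps.items() if changed in prereqs)


sample = "a.o: a.c a.h \\\n  common.h\nb.o: b.c common.h\n"
deps = parse_deps(sample)
print(deps)                          # {'a.o': ['a.c', 'a.h', 'common.h'], 'b.o': ['b.c', 'common.h']}
print(dependents(deps, "a.h"))       # ['a.o']
print(dependents(deps, "common.h"))  # ['a.o', 'b.o']
```

The sketch splits on the first colon only, so it would need adjusting for paths that contain colons.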

> * third, in my brief experience with make files, meta-rules are
> quite hard to get right... furthermore it is impossible to have two
> patterns like: `%.%.x`; just imagine I have two types of images:
> background and foreground and I want to superimpose them, then I would
> like to be able to write `%{foreground}.%{background}.jpg :
> %{foreground}.jpg %{background}.jpg ...`; (I know that I can resort
> here to "<| generating command" but it seems just plain wrong...)

Mk's meta rules are much easier to get right because they don't get
messed up by all of GNU Make's automatically added meta rules. Also:

default:V: circle-square.png

(.*)-(.*).png:R: \1.png \2.png
	composite $prereq $target

P.S. anyone know a better way to composite images using the
plan9/plan9port image tools?

erik quanstrom

Jan 18, 2011, 8:31:23 AM

> P.S. anyone know a better way to composite images using the
> plan9/plan9port image tools?

what do you mean by better? there's a compose program in
contrib quanstro/radar which may compile on p9p just fine.

- erik