Add 'tup clean'?

Anatol Pomozov

Nov 11, 2011, 5:03:23 PM
to tup-...@googlegroups.com
Hi,

I think that, for the sake of completeness, a 'tup clean' operation should be added. As expected, it should remove all objects added by the build process.

If I remember correctly, Mike said that 'tup clean' is not needed because the clean operation is used only to work around the 'incorrect dependencies' problem. But in fact there are other cases where I would like to have 'tup clean':

- I test 'clean build' speed; before running a new iteration I need to revert the project to a pristine state.
- I upgraded the compiler (e.g. gcc from 4.4 to 4.5) or a system library and want to recompile my project to see if everything works fine.

Currently I do 'git clean -xdf' or something similar to remove all garbage from the project. But I would prefer that the build system be able to clean up after itself.

Does it make sense?

Elliott Hird

Nov 11, 2011, 8:57:35 PM
to tup-...@googlegroups.com
On 11 November 2011 22:03, Anatol Pomozov <anatol....@gmail.com> wrote:
> - I upgraded compiler (e.g. gcc from 4.4 to 4.5) or a system library and
> want to recompile my project and see if everything works fine.

Why doesn't tup simply remember dependencies from outside the build
tree? The only reason cleaning is necessary in this case is because
tup deliberately discards some of the dependency information.

Jed Brown

Nov 11, 2011, 9:23:18 PM
to tup-...@googlegroups.com
On Fri, Nov 11, 2011 at 19:57, Elliott Hird <penguino...@googlemail.com> wrote:
> Why doesn't tup simply remember dependencies from outside the build
> tree? The only reason cleaning is necessary in this case is because
> tup deliberately discards some of the dependency information.

There is no way to determine whether a program looked at a particular environment variable so you would end up rebuilding too frequently (assuming that the behavior of the compiler depends on the environment, which is the typical case).

Elliott Hird

Nov 11, 2011, 9:39:16 PM
to tup-...@googlegroups.com
On 12 November 2011 02:23, Jed Brown <j...@59a2.org> wrote:
> There is no way to determine whether a program looked at a particular
> environment variable so you would end up rebuilding too frequently (assuming
> that the behavior of the compiler depends on the environment, which is the
> typical case).

No, I mean: tup already gets complete filesystem dependency
information, it just discards all paths outside the project tree.

Jed Brown

Nov 11, 2011, 9:47:21 PM
to tup-...@googlegroups.com
On Fri, Nov 11, 2011 at 20:39, Elliott Hird <penguino...@googlemail.com> wrote:
> No, I mean: tup already gets complete filesystem dependency
> information, it just discards all paths outside the project tree.

I assume it's for performance/inotify reasons. It definitely seems like an issue if the compiler suite was on an NFS mount.

But the influence of environment variables means that there is always a use case (not involving timing) where you want to do a clean rebuild even though the build system is functioning perfectly.

Mike Shal

Nov 14, 2011, 12:17:28 PM
to tup-...@googlegroups.com
On Fri, Nov 11, 2011 at 5:03 PM, Anatol Pomozov
<anatol....@gmail.com> wrote:
> Hi,
> I think for the sake of completeness 'tup clean' operation should be added.

Sorry, but I'm still going to say "no" purely for ideological reasons.
This violates my rule #3 for build systems, which is that there must
only be one command to update a system. In tup, this is 'tup upd'.
This means we always have this separation between the developer and
the build system:

developer:
1) Change/add/remove some files in the system.
2) Run 'tup upd'

build system:
1) Analyze file states (timestamps, contents, etc)
2) Determine the minimum amount of work to do
3) Do that work

As soon as you add a 'tup clean' to the mix, now you are moving some
of the logic that belongs to the build system into the developer's
space. You might have something like:

developer:
1) Change/add/remove some files in the system.
2) Which files did I change? Just stuff in the project that can be
handled with 'tup upd'? Or other things that may need a full build?
2a) Maybe run 'tup upd'
2b) Maybe run 'tup clean; tup upd'
3) If I picked 2a) when I should've picked 2b), then try again with
'tup clean; tup upd'.

Notice how this sounds an awful lot like the responsibilities of the
build system. It is not your job to make those decisions; it is the
build system's. Therefore, tup will not have a 'clean'.

> As expected it should remove all objects added by the build process.
> If I remember correctly, Mike told that 'tup clean' is not needed because
> the clean operation is used only to avoid 'incorrect dependencies' problem.
> But in fact there are other cases when I would like to see 'tup clean':
> - I test 'clean build' speed, before running a new iteration I need to
> revert project to pristine state.

What are you trying to accomplish here? If you want to know the time
it takes a new developer to get the software installed, I would think
you'd have to add the time it takes to clone/checkout from version
control, or unpack the tarball. Once the new developer is set up,
however, the full build time is not a useful metric, since the only
builds should be incremental builds at that point.

> - I upgraded compiler (e.g. gcc from 4.4 to 4.5) or a system library and
> want to recompile my project and see if everything works fine.
> Currently I do 'git clean -xdf' or something similar to remove all garbage
> from the project. But I would prefer that the build system was able to clean
> after itself.

This is definitely a case that tup should handle properly. As tup was
initially designed, however, all of the dependencies are rooted at the
top of the project. The database will need to be re-worked to handle
dependencies on things like /lib/libc.so or whatever. In other words,
the fix here is not to add 'clean' and make you try to figure out when
you need to run 'clean' and when you don't; the fix is to have tup
properly analyze and track the dependencies.

-Mike

Mike Shal

Nov 14, 2011, 12:34:56 PM
to tup-...@googlegroups.com
On Fri, Nov 11, 2011 at 9:47 PM, Jed Brown <j...@59a2.org> wrote:
> On Fri, Nov 11, 2011 at 20:39, Elliott Hird
> <penguino...@googlemail.com> wrote:
>>
>> No, I mean: tup already gets complete filesystem dependency
>> information, it just discards all paths outside the project tree.
>
> I assume it's for performance/inotify reasons. It definitely seems like an
> issue if the compiler suite was on an NFS mount.

Part of it is the way the nodes are stored in the database. This will
need to be updated to provide a full view, rather than a local view of
just the project tree.

Another part as you mention is the inotify/scanning logic. This will
need to be updated to watch/scan files in directories that are outside
of the project tree.

Finally, there is the method of watching the file accesses of the
sub-process. Right now with the fuse setup, we can detect file
accesses anywhere in the file-system, but only if they use relative
paths. To detect full-path accesses (ie:
open("/usr/include/stdio.h")), the sub-process needs to execute in a
chroot environment (or maybe there's some other way to handle this?).
I have been playing around with getting chroot to work, but it is
tricky because chroot needs additional permissions, but other things
(like the sub-processes) need to run as the user.

So it is not a trivial thing to add, but I would like to do so at some
point to make tup much more robust when updating system libraries.

> But the influence of environment variables means that there is always a use
> case (not involving timing) where you want to do a clean rebuild even though
> the build system is functioning perfectly.

Tup controls the environment of the sub-processes, so we know whether
or not the environment changes between invocations. Although it
wouldn't have the granularity to know that a particular sub-process
looked at a particular environment variable, it can know that
something in the environment changed and therefore do a full re-build.
Or maybe you could explicitly list which environment variables are
passed to the sub-processes in Tuprules.tup, and then those are the
ones that can be compared against their previous values. That might be
too tedious, though.
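
A minimal sketch of the comparison described above, written in Python and not taken from tup itself. The names `env_fingerprint` and `needs_full_rebuild` are hypothetical, and the ignore list (PWD, OLDPWD) is an assumption about which variables churn. Passing `tracked` models the variant where only explicitly listed variables are compared:

```python
import hashlib

# Assumption: variables that change constantly and should never trigger a rebuild.
IGNORED = {"PWD", "OLDPWD"}

def env_fingerprint(environ, tracked=None):
    """Return a stable hash of an environment mapping.

    With tracked=None every variable except the ignored ones is hashed.
    With a list of names, only those variables are hashed.
    """
    if tracked is None:
        names = sorted(k for k in environ if k not in IGNORED)
    else:
        names = sorted(tracked)
    h = hashlib.sha256()
    for name in names:
        value = environ.get(name)
        h.update(name.encode() + b"\0")
        if value is None:
            # Keep "unset" distinct from "set to the empty string".
            h.update(b"unset\0")
        else:
            h.update(b"set:" + value.encode() + b"\0")
    return h.hexdigest()

def needs_full_rebuild(previous_fingerprint, environ, tracked=None):
    """True if the environment differs from the one recorded last time."""
    return previous_fingerprint != env_fingerprint(environ, tracked)
```

A build tool would store the fingerprint after each update and call `needs_full_rebuild(saved, os.environ)` before the next one. The coarse variant rebuilds on any change; the `tracked` variant trades that for the tedium of listing variables by hand.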

-Mike

Anatol Pomozov

Nov 14, 2011, 12:48:18 PM
to tup-...@googlegroups.com
Hi

On Mon, Nov 14, 2011 at 9:34 AM, Mike Shal <mar...@gmail.com> wrote:
> On Fri, Nov 11, 2011 at 9:47 PM, Jed Brown <j...@59a2.org> wrote:
>> On Fri, Nov 11, 2011 at 20:39, Elliott Hird
>> <penguino...@googlemail.com> wrote:
>>>
>>> No, I mean: tup already gets complete filesystem dependency
>>> information, it just discards all paths outside the project tree.
>>
>> I assume it's for performance/inotify reasons. It definitely seems like an
>> issue if the compiler suite was on an NFS mount.
>
> Part of it is the way the nodes are stored in the database. This will
> need to be updated to provide a full view, rather than a local view of
> just the project tree.
>
> Another part as you mention is the inotify/scanning logic. This will
> need to be updated to watch/scan files in directories that are outside
> of the project tree.
>
> Finally, there is the method of watching the file accesses of the
> sub-process. Right now with the fuse setup, we can detect file
> accesses anywhere in the file-system, but only if they use relative
> paths. To detect full-path accesses (ie:
> open("/usr/include/stdio.h")), the sub-process needs to execute in a
> chroot environment (or maybe there's some other way to handle this?).
> I have been playing around with getting chroot to work, but it is
> tricky because chroot needs additional permissions, but other things
> (like the sub-processes) need to run as the user.

I think you can utilize Linux capabilities and set CAP_SYS_CHROOT on the tup binary. In this case you don't need to run the binary as root.

But this works only on Linux.

Elliott Hird

Nov 14, 2011, 5:33:59 PM
to tup-...@googlegroups.com
On 14 November 2011 17:34, Mike Shal <mar...@gmail.com> wrote:
> To detect full-path accesses (ie:
> open("/usr/include/stdio.h")), the sub-process needs to execute in a
> chroot environment (or maybe there's some other way to handle this?).

There's fakeroot [1], which Debian packaging uses, but it's a return
to LD_PRELOAD.

A more promising avenue is UMLBox [2], which I've used, and which can
run Linux programs (under Linux only, unfortunately) with arbitrary
filesystem mount configurations without much of a speed reduction. It
takes about half a second to start up, though, which might be
considered too slow -- on the other hand, it only needs to be started
up once per build, and /only/ if there's any work to be done, so it's
not that bad. tup needing to be provided with a specially-configured
Linux kernel (not the one run on the host, just a compiled one) might
not be desirable, though.

[1] http://fakeroot.alioth.debian.org/
[2] https://bitbucket.org/GregorR/umlbox/wiki/Home

Elazar Leibovich

Nov 15, 2011, 12:24:46 AM
to tup-...@googlegroups.com
What's your practical suggestion for someone with a half-built tree who has just, say, changed the version of gcc? Or hacked a system .h file to determine whether there's a bug there? Or scp'ed a half-built file system from a friend's FreeBSD to your Ubuntu (he went on vacation, and you really need to finish the bugfix he just started, 'cause a client has called)?

encodr

Nov 15, 2011, 8:18:17 AM
to tup-...@googlegroups.com
On 15 Nov 2011, at 05:24, Elazar Leibovich wrote:

> What's your practical suggestion for someone with half-built tree, who just, say, changed the version of gcc. Or hacked a system h file to determine whether there's a bug there. Or scp'ed a half-build file system from a friend's freeBSD to your Ubuntu (he went to vacation, and you really need to finish the bugfix he just started, 'cause a client has called).

You could force rebuilds by changing the value of a globally used but harmless variable in a small included tup file, e.g.

arch.tup:
ARCH=x86
CC=gcc -DFORCETUP=0

Tupfile:
include arch.tup
: foo.c |> $(CC) %f -o %o |> foo

Then changing the value from 0 to 1 (or vice versa) and issuing "tup upd" will rebuild everything that uses $(CC).

Which is kinda clunky, but simple, and it works.

Question for Mike: would you be ideologically averse to adding a "force" option to tup, e.g.
   tup upd -f
which means "pretend all dependencies have changed"?
That way you don't need to change HOW you track dependencies, just quickly mark them all as changed and carry on as normal.

e

Elliott Hird

Nov 15, 2011, 4:42:05 PM
to tup-...@googlegroups.com
On 15 November 2011 05:24, Elazar Leibovich <ela...@gmail.com> wrote:
> What's your practical suggestion for someone with half-built tree, who just,
> say, changed the version of gcc. Or hacked a system h file to determine
> whether there's a bug there. Or scp'ed a half-build file system from a
> friend's freeBSD to your Ubuntu (he went to vacation, and you really need to
> finish the bugfix he just started, 'cause a client has called).

Tracking dependencies outside of the project root would handle all of these.

The only likely kind of dependencies that can't be feasibly tracked
are dependencies on the system clock, on random number sources, and on
environment variables. All but the last of these should be harmless in
practice.

Mike Shal

Nov 15, 2011, 11:39:17 PM
to tup-...@googlegroups.com

Thanks for the links! It sounds like fakeroot won't work for us
because of the static binary issue that keeps popping up. UMLBox is
interesting. I guess we'll have to compare its performance vs. a
chrooted fuse and see which is best.

-Mike

Mike Shal

Nov 15, 2011, 11:43:58 PM
to tup-...@googlegroups.com
On Tue, Nov 15, 2011 at 12:24 AM, Elazar Leibovich <ela...@gmail.com> wrote:
> What's your practical suggestion for someone with half-built tree, who just,
> say, changed the version of gcc. Or hacked a system h file to determine
> whether there's a bug there. Or scp'ed a half-build file system from a
> friend's freeBSD to your Ubuntu (he went to vacation, and you really need to
> finish the bugfix he just started, 'cause a client has called).

I don't have a practical suggestion at the moment. But adding a
'clean' command to tup is not going to help, since 'clean' is broken
by design. I'd rather fix the actual issues than introduce more
brokenness. There should be some branches to try out in a few days...

-Mike

Mike Shal

Nov 15, 2011, 11:48:25 PM
to tup-...@googlegroups.com

Unfortunately I think that puts us in the same boat as 'tup clean' -
essentially it is up to the developer to try to figure out whether it
is ok to run just 'tup upd' or if you have to do 'tup upd -f'. That
kind of decision making belongs in the build system -- putting that
responsibility onto the user means that the build system is broken.
Therefore, tup is currently broken. I think it can be fixed.

-Mike

Elliott Hird

Nov 16, 2011, 12:44:46 AM
to tup-...@googlegroups.com
On 16 November 2011 04:48, Mike Shal <mar...@gmail.com> wrote:
> Unfortunately I think that puts us in the same boat as 'tup clean' -
> essentially it is up to the developer to try to figure out whether it
> is ok to run just 'tup upd' or if you have to do 'tup upd -f'. That
> kind of decision making belongs in the build system -- putting that
> responsibility onto the user means that the build system is broken.
> Therefore, tup is currently broken. I think it can be fixed.

I note that these kinds of deep dependency tracking issues are also
experienced by the Nix package manager [1]. In Nix's case, it's to
eliminate undeclared ones; in tup's, it's to track them.
Unfortunately, I'm not sure a solution to one helps with the other.

[1] http://nixos.org/nix/

Mike Shal

Nov 22, 2011, 3:43:18 PM
to tup-...@googlegroups.com

There are currently 3 environment test branches out. They are as follows:

'environ':
- Tup saves the whole environment in its config table, and compares
the previous config with the current environment. If the environment
is different, all sub-processes are re-executed. This should work with
all existing Tupfiles, but can be annoying if your environment changes
for random reasons (for example, ssh'ing into a machine might give you
a different environment than a local login, so switching between the
two will cause tup to update lots of stuff). Some ever-changing
variables, like PWD and OLDPWD, are ignored to maintain some sanity :)

'environ-clear':
- Similar to 'environ', except the only environment variable that is
passed down is PATH. Everything else is purged from the environment,
so if for example a compiler needs a special environment variable set,
you would have to add it to your compiler rule. If the PATH changes
from one update to the next, everything is re-built. Probably more
usable than 'environ', but I'm curious to know what it would break.

'environ-export':
- Similar to 'environ-clear' in that only PATH is exported by default.
Anything else can be exported by using the export keyword. Eg:
Tupfile:
export FOO
: |> gcc -c bar.c ... |>

This will read FOO from the environment, and turn the command into
'FOO=value gcc -c bar.c ...'. If FOO changes in a future update, all
Tupfiles that have 'export FOO' are re-parsed. Currently there's no
way to unexport in a Tupfile, but it could probably be added if this
branch makes people happiest. This may help in not updating too much
during environment changes, but it does add some complexity to tup.

If you have any time to try out any or all of the branches, or just
have any general thoughts, please let me know what you think. I
recommend trying them out in a separate .tup directory (so re-check
out your source code, and run 'tup init' with the new tup) because if
you switch back it may confuse the older version of tup.

Thanks!
-Mike

Elliott Hird

Nov 23, 2011, 3:03:29 PM
to tup-...@googlegroups.com
On 22 November 2011 20:43, Mike Shal <mar...@gmail.com> wrote:
> 'environ-export':
>  - Similar to 'environ-clear', only PATH is exported by default.
> Anything else can be exported by using the export keyword. Eg:
> Tupfile:
> export FOO
> : |> gcc -c bar.c ... |>

++

This seems like the obvious best solution in retrospect. environ-clear
would probably be OK too, but this sugar seems handy.

I'm not sure what use unexport would be; surely you could just omit
the export line?

Oliver Kiddle

Nov 23, 2011, 2:21:33 PM
to tup-...@googlegroups.com
Mike Shal wrote:

> There are currently 3 environment test branches out. They are as follows:

> If you have any time to try out any or all of the branches, or just
> have any general thoughts, please let me know what you think. I

On the subject of general thoughts, I think environ-export is likely
the best approach. The first approach's invalidating of the build would be
really irritating every time you change something unrelated. Builds should
not be affected by environment variables. I've seen setups where people
had to source a pile of C-shell common ".profile" files before the build
would work. But sometimes it can be useful to allow one or two through.

I have some GNU make include files that are used by internal projects
and they loop through all environment variables (using $(.VARIABLES)
and $(origin)) and unexports them. I have a few exceptions which I pass
through: PATH, TERM, DISPLAY, HOME, LOGNAME, MAKE%, LC_% and LANG. HOME
is needed by some commands to find their rc files. TERM and DISPLAY are
mainly there for make run/make test though some things like fop might
need a DISPLAY. For the locale stuff I explicitly export LC_COLLATE=C
and LC_NUMERIC=C. Setting LC_CTYPE can also fix some things but break
others. Thinking about it now, I also wonder whether TMPDIR should
perhaps also be allowed through.
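
As a rough illustration of that whitelist outside of GNU make, here is a minimal Python sketch. The function name `scrub_environment` is hypothetical, and make's `%` patterns are written as shell-style `*` globs. The variable lists are the ones given above:

```python
from fnmatch import fnmatchcase

# Variables (or glob patterns) allowed through to build commands.
ALLOWED = ["PATH", "TERM", "DISPLAY", "HOME", "LOGNAME", "MAKE*", "LC_*", "LANG"]

# Locale settings pinned to fixed values regardless of the caller's environment.
FORCED = {"LC_COLLATE": "C", "LC_NUMERIC": "C"}

def scrub_environment(environ, allowed=ALLOWED, forced=FORCED):
    """Return a copy of environ containing only whitelisted variables,
    with the forced values applied on top."""
    clean = {
        name: value
        for name, value in environ.items()
        if any(fnmatchcase(name, pattern) for pattern in allowed)
    }
    clean.update(forced)
    return clean
```

The result can be passed as the `env` argument of `subprocess.run`. One caveat: if LC_ALL is set it passes the `LC_*` pattern and overrides the pinned LC_COLLATE and LC_NUMERIC, so a stricter setup might drop it.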

Oliver


Elliott Hird

Nov 23, 2011, 4:43:53 PM
to tup-...@googlegroups.com
On 23 November 2011 19:21, Oliver Kiddle <oki...@yahoo.co.uk> wrote:
> Thinking about it now, I also wonder whether TMPDIR should
> perhaps also be allowed through.

++

I wouldn't pass HOME; it's the kind of thing I could see affecting the
outcome of a build undesirably. Such things should be done as tup
configuration variables instead.

Mike Shal

Nov 24, 2011, 10:25:55 PM
to tup-...@googlegroups.com

Right now once you do "export FOO" in a Tupfile, then FOO is exported
to the commands in all following :-rules. Since you have to write
commands in order, you might have something like this:

export FOO
: |> I need FOO here |> output.txt
: output.txt |> I don't want FOO here, but need output.txt |>

That's the only reason I think you would need an "unexport". Not sure
if there is a real-world practical example though.

-Mike

Mike Shal

Nov 24, 2011, 10:28:10 PM
to tup-...@googlegroups.com

So it sounds like there are some cases here where you'd want things
other than PATH? I was hoping if it worked we could stick with
environ-clear since it is simpler, but if you need to pass things
through from the actual environment then we'd probably have to use
environ-export. (Keep in mind if you just need to explicitly set an
environment variable like LC_COLLATE=C, tup doesn't actually need a
dependency on the environment - that could just be written directly in
the :-rule for commands that need it).

-Mike

Mike Shal

Dec 5, 2011, 6:17:47 PM
to tup-...@googlegroups.com

There is now also an 'environ-export2' branch, which is similar to
environ-export but tracks things a little differently. I had some
issues getting it to work in Windows with the environ-export approach.
Feel free to try it out (again, a separate workspace is probably best)
and let me know if there are any issues. I'll probably merge
environ-export2 soon if there are no major problems.

-Mike

Slawomir Czarko

Jan 5, 2012, 12:44:11 PM
to tup-users
Hi,

About dependencies on things outside the "tup hierarchy" - how about
providing a hash of the "build system" as an environment variable and
exporting that environment variable at the beginning of the Tupfile?
That way all the commands would depend on it and if the hash changes
all the commands would be rerun.

The hash of the "build system" could be calculated outside tup by a
wrapper script calling tup. On an RPM-based system it could be something
like this:

wrapper script:

#!/bin/bash
# Add any other RPMs which are used in the build.
RPMS="gcc glibc-devel"
BUILD_SYSTEM_HASH=$(rpm -q $RPMS | md5sum | awk '{print $1}') tup upd

Tupfile:
export BUILD_SYSTEM_HASH

The exact method of calculating the hash value doesn't matter as long
as it detects changes in the system configuration.

-Slawomir

Jed Brown

Jan 5, 2012, 1:07:31 PM
to tup-...@googlegroups.com
On Thu, Jan 5, 2012 at 11:44, Slawomir Czarko <slawomi...@gmail.com> wrote:
> #!/bin/bash
> RPMS="gcc glibc-devel" # add any other RPMS which are used in the
> build

How do you determine which other packages might be used?

> BUILD_SYSTEM_HASH=`rpm -q $RPMS | md5sum | awk '{print $1}'` tup upd

This will force a rebuild if only a subminor version of the package changes, which seems overzealous (I wouldn't want to rebuild the world after a documentation fix or a security patch for static libc). Perhaps a way to subtly notify that something is different, but not force the full rebuild.

Slawomir Czarko

Jan 5, 2012, 2:58:52 PM
to tup-users
On Jan 5, 7:07 pm, Jed Brown <j...@59A2.org> wrote:
> On Thu, Jan 5, 2012 at 11:44, Slawomir Czarko <slawomir.cza...@gmail.com>wrote:
>
> > #!/bin/bash
> > RPMS="gcc glibc-devel" # add any other RPMS which are used in the
> > build
>
> How do you determine which other packages might be used?

It depends on your project. I don't know if there's an automated way
to figure it out - maybe running the whole build process with strace
would help. Or you could scan the code for system header includes and
then maybe use "rpm -qf ..." to find the corresponding RPMs. Similarly for
system libraries. Otherwise you need to figure out the dependencies
manually, much as when writing an RPM spec file. Determining which
other packages are used is not something to do on each build if an
automated method is used, since it will most likely take too long.

>
> > BUILD_SYSTEM_HASH=`rpm -q $RPMS | md5sum | awk '{print $1}'` tup upd
>
> This will force a rebuild if only a subminor version of the package
> changes, which seems overzealous (I wouldn't want to rebuild the world
> after a documentation fix or a security patch for static libc). Perhaps a
> way to subtly notify that something is different, but not force the full
> rebuild.

In such a case you can use this to be less sensitive to minor updates:

BUILD_SYSTEM_HASH=$(rpm -q --qf "%{VERSION}\n" $RPMS | md5sum | awk '{print $1}') tup upd

-Slawomir

Jed Brown

Jan 5, 2012, 6:14:02 PM
to tup-...@googlegroups.com
On Thu, Jan 5, 2012 at 13:58, Slawomir Czarko <slawomi...@gmail.com> wrote:
> It depends on your project. I don't know if there's an automated way
> to figure it out - maybe running the whole build process with strace
> would help. Or you could scan the code for system header includes and
> then maybe use "rpm -qf ..." to find corresponding RPMs. Similar with
> system libraries. Otherwise you need to figure out the dependencies
> manually, similar like when writing RPM spec file. Determining which
> other packages is not something to be done on each build if an
> automated method is used since it most likely will take too long.


I wonder how expensive it would be to just log mtime for external build dependencies during the build process, but not check them when computing what to build with "tup upd".

> In such a case you can use this to be less sensitive to minor updates:
>
> BUILD_SYSTEM_HASH=`rpm -q --qf "%{VERSION}\n" $RPMS | md5sum | awk
> '{print $1}'` tup upd

But now you don't rebuild when the major version of a dependency changes, which could actually change behavior. I think there is no way to win using the package version alone.

Slawomir Czarko

Jan 6, 2012, 4:51:32 AM
to tup-users
On Jan 6, 12:14 am, Jed Brown <j...@59a2.org> wrote:
> On Thu, Jan 5, 2012 at 13:58, Slawomir Czarko <slawomir.cza...@gmail.com>wrote:
> > In such a case you can use this to be less sensitive to minor updates:
> >
> > BUILD_SYSTEM_HASH=`rpm -q --qf "%{VERSION}\n" $RPMS | md5sum | awk
> > '{print $1}'` tup upd
>
> But now you don't rebuild when the major version of a dependency changes,
> which could actually change behavior. I think there is no way to win using
> the package version alone.

Hi Jed,

I'm not sure what you mean by major version and subminor version.

"rpm -q glibc-devel" produces

glibc-devel-2.14-5.i686

"rpm -q --qf "%{NAME}-%{VERSION}\n" glibc-devel" produces

glibc-devel-2.14
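
The two outputs differ only in the release and architecture fields. A minimal Python sketch of how that choice changes the hash, assuming the usual name-version-release.arch layout of "rpm -q" output. `split_nvra` and `build_system_hash` are hypothetical helper names, not part of rpm or tup, and SHA-256 stands in for md5sum:

```python
import hashlib

def split_nvra(package):
    """Split 'name-version-release.arch' into its four fields."""
    rest, arch = package.rsplit(".", 1)
    name, version, release = rest.rsplit("-", 2)
    return name, version, release, arch

def build_system_hash(packages, include_release=True):
    """Hash a list of package strings, optionally ignoring release and arch."""
    lines = []
    for pkg in sorted(packages):
        name, version, release, arch = split_nvra(pkg)
        if include_release:
            lines.append(f"{name}-{version}-{release}.{arch}")
        else:
            lines.append(f"{name}-{version}")
    return hashlib.sha256("\n".join(lines).encode()).hexdigest()
```

With `include_release=False`, a repackaging from release 5 to release 6 leaves the hash unchanged, while a move from 2.14 to 2.15 still changes it.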

-Slawomir

Slawomir Czarko

Jan 6, 2012, 4:59:52 AM
to tup-users
On Nov 14 2011, 6:17 pm, Mike Shal <mar...@gmail.com> wrote:
> On Fri, Nov 11, 2011 at 5:03 PM, Anatol Pomozov
>
Hi,

Maybe you can have a look at how it's done in ccache since they deal
with similar issues, at least as far as checking the compiler.

It would also be cool if rebuilds didn't happen when only comments or
white space (outside of string literals) were changed. One way to deal
with this scenario is to separate preprocessing from compilation and
to have some way of handling intermediate files so further steps of
the process don't happen if the output of the intermediate step didn't
change. For example, if the output of the preprocessor didn't change then
the compilation step shouldn't run at all. At the moment one can handle
this specific scenario by using tup + ccache. There are other cases though
where the build starts with one file and then runs a few tools in
succession, each tool taking as input the output of the previous tool.
At any point in this chain, if the output is the same as it was last
time, there's no need to run the following tools in the chain.
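
The chain idea can be sketched in a few lines of Python. This is only an illustration under simple assumptions, not how tup or ccache work internally: each tool is modelled as a function from bytes to bytes, and `run_chain`, `strip_comments` and `fake_compile` are hypothetical names:

```python
import hashlib

def digest(data):
    """Content hash used to decide whether a step's input changed."""
    return hashlib.sha256(data).hexdigest()

def run_chain(source, steps, cache):
    """Feed source through steps; return (final_output, names_of_steps_run).

    cache maps a step name to (input_hash, output) from the previous run.
    A step whose input hash is unchanged is skipped and its old output reused.
    """
    data, ran = source, []
    for name, tool in steps:
        key = digest(data)
        cached = cache.get(name)
        if cached is not None and cached[0] == key:
            data = cached[1]  # same input as last time: reuse the old output
        else:
            data = tool(data)
            cache[name] = (key, data)
            ran.append(name)
    return data, ran

def strip_comments(src):
    """Toy 'preprocessor': drop // comments and trailing white space."""
    return b"\n".join(line.split(b"//")[0].rstrip() for line in src.splitlines())

def fake_compile(preprocessed):
    """Toy 'compiler': just tags its input."""
    return b"OBJ:" + preprocessed
```

With `steps = [("cpp", strip_comments), ("cc", fake_compile)]`, editing only a comment re-runs "cpp" but not "cc", because the preprocessed output hashes to the same value as before.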

-Slawomir

Jed Brown

Jan 6, 2012, 8:07:25 AM
to tup-...@googlegroups.com
On Fri, Jan 6, 2012 at 03:51, Slawomir Czarko <slawomi...@gmail.com> wrote:
> I'm not sure what do you mean by major version and subminor version.
>
> "rpm -q glibc-devel" produces
>
> glibc-devel-2.14-5.i686

The 5 here can change due to repackaging (e.g. because a dependent library added a member to a publicly visible struct) with no source changes at all. This is usually harmless.

The behavior of a dependent library can change without needing to repackage at all. Going one level higher, glibc might change the direction that memcpy() runs, but a caller that links using shared libraries does not need to relink because the ABI did not change. But changing the memcpy() implementation might expose a bug affecting build correctness (this happened with libflashplayer.so not long ago).

I don't see an automated way to win. If you go for build correctness at all cost, then you have to track every file/package, including all dependencies down to the kernel. But this will mean full rebuilds far more frequently than you probably want for a big project.

Mike Shal

Jan 7, 2012, 4:33:17 PM
to tup-...@googlegroups.com

That sounds like a clever solution to me, at least until tup supports
tracking file accesses from outside the dev tree. I would think you'd
want to generate the hash as part of your OS update process (like a
user-generated post-install script if that is available). Or were you
suggesting tup do this hash automatically somehow? I don't think there
is an easy way to do this in a cross-platform manner.

-Mike

Mike Shal

Jan 7, 2012, 4:38:13 PM
to tup-...@googlegroups.com
On Thu, Jan 5, 2012 at 6:14 PM, Jed Brown <j...@59a2.org> wrote:
> On Thu, Jan 5, 2012 at 13:58, Slawomir Czarko <slawomi...@gmail.com>
> wrote:
>>
>> It depends on your project. I don't know if there's an automated way
>> to figure it out - maybe running the whole build process with strace
>> would help. Or you could scan the code for system header includes and
>> then maybe use "rpm -qf ..." to find corresponding RPMs. Similar with
>> system libraries. Otherwise you need to figure out the dependencies
>> manually, similar like when writing RPM spec file. Determining which
>> other packages is not something to be done on each build if an
>> automated method is used since it most likely will take too long.
>>
>
> I wonder how expensive it would be to just log mtime for external build
> dependencies during the build process, but not check them when computing
> what to build with "tup upd".

One test I tried is to use the example fuse file-system to mirror the
real fs (fusexmp.c). I compared the build.sh script in tup using a
chroot environment in the mirror fs against the real fs. (The build.sh
script just compiles a bunch of C files - it doesn't actually run
tup). In the real fs, the script ran in 10.598s, whereas in the chroot
it ran in 14.183s. So there is some overhead just from having all file
accesses go through fuse. There would of course be additional overhead
in tracking these dependencies in tup, but I don't know how
significant that would be yet.

Though I don't understand - what is the purpose of tracking mtimes for
external build dependencies, but not checking them during 'tup upd'?

-Mike

Jed Brown

Jan 7, 2012, 4:45:38 PM
to tup-...@googlegroups.com
On Sat, Jan 7, 2012 at 15:38, Mike Shal <mar...@gmail.com> wrote:
> Though I don't understand - what is the purpose of tracking mtimes for
> external build dependencies, but not checking them during 'tup upd'?

These files change frequently if you use a rolling release distribution (Debian Sid, Arch Linux, etc.), but usually in benign ways. The problem is that distinguishing meaningful from benign changes is really hard. If you have large projects, you don't want to spend hours rebuilding because of a benign packaging change, but it would be nice to be notified about changes if you get a strange build error (so you can trigger a more strictly accurate rebuild).
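
A minimal Python sketch of that "record but don't act" idea, assuming the build tool already knows which external paths were read. `record_mtimes` and `changed_since` are hypothetical names and nothing here reflects tup's database:

```python
import os

def record_mtimes(paths):
    """Snapshot modification times (in ns) of external files.

    Files that cannot be stat'ed are recorded as None.
    """
    snapshot = {}
    for path in paths:
        try:
            snapshot[path] = os.stat(path).st_mtime_ns
        except OSError:
            snapshot[path] = None
    return snapshot

def changed_since(snapshot):
    """Return the sorted paths whose mtime differs from the recorded one."""
    current = record_mtimes(snapshot)
    return sorted(p for p in snapshot if current[p] != snapshot[p])
```

The snapshot would be taken during the build and consulted only on request, for example after a strange build error, so routine updates never pay for a full rebuild because of a benign package change.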

Slawomir Czarko

Jan 8, 2012, 10:46:26 AM
to tup-users
I thought this would be done outside of tup since as you say there's
no cross-platform way to do it and the actual dependencies are project
specific.

-Slawomir