calling a sub-paver?


Matt Kangas

Sep 8, 2008, 5:28:39 PM
to paver
I'd like to use paver to orchestrate my company's entire build
process. Most of our code is python packages using setuptools/
distutils already. But we're talking about 30+ project dirs, scattered
over a subversion repository.

I'd like to write a top-level pavement.py that defines:
- "build" directory: all sub-projects install here (not site-packages)
- "build" task: scan subdirectories for "pavement.py"s, recursively
call "build"
- "test" task: requires "build", scans subdirectories, recursively
calls "test"

Ideally, the "scan subdirectories for sub-pavement.py and recurse"
would happen in-process, propagating variables set in the uppermost
level.

For those who know Ant, I'm looking for Paver's equivalent of "subant"
task.

How can I do this w/o using sh() to call the sub-paver?

Kevin Dangoor

Sep 9, 2008, 10:10:39 AM
to pa...@googlegroups.com
Hi Matt,

Right now, there's no way to do this. Someone asked me about this the
other day, and I didn't quite grasp the use case... your description
here makes sense to me.

I think such a thing is possible. There's a little bit of trickery
involved and some thought needs to be put in as far as what, exactly,
propagates to the sub-pavements.

There are some module globals that need to be considered. Primarily:

1. options
2. tasks

I'm guessing that we'd want these to propagate down to the sub-
pavement, but be overridden by any values that are in those sub-
pavements. (People with more experience with Make or Rake here can
chime in... I'm sure this stuff has been thought about) After the sub-
pavement has been run, the values should be reset.

That means making a deep copy of options, and a shallow copy of tasks,
temporarily moving aside the paver.defaults (which is where the
pavement runs), reinitializing paver.defaults, running the sub-
pavement, and then restoring everything. It is likely possible to
eliminate paver.defaults and just exec into a dictionary that we keep
around.
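
The save/run/restore sequence above could be sketched roughly like this. It assumes options and tasks are plain dictionaries and that the sub-pavement is exec'd into a fresh namespace dictionary; `run_sub_pavement`, `options`, and `tasks` are hypothetical stand-ins, not Paver's real internals.

```python
import copy


def run_sub_pavement(source, options, tasks):
    """Exec sub-pavement source with inherited state, then restore it.

    `options` and `tasks` are hypothetical plain dicts standing in for
    Paver's module globals; this is not Paver's real API.
    """
    saved_options = copy.deepcopy(options)  # deep copy: nested values may change
    saved_tasks = dict(tasks)               # shallow copy: task objects are shared
    namespace = {"options": options, "tasks": tasks}
    try:
        # The sub-pavement sees the inherited values and may override them.
        exec(compile(source, "<sub-pavement>", "exec"), namespace)
        # Snapshot what the child ended up with, taken before the restore.
        return copy.deepcopy(options), dict(tasks)
    finally:
        # Reset the parent's state however the sub-pavement exits.
        options.clear()
        options.update(saved_options)
        tasks.clear()
        tasks.update(saved_tasks)
```

Restoring in a `finally` clause means a failing sub-pavement cannot leave its overrides behind in the parent.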

It's not hard, but the devil's in the details.

Kevin

--
Kevin Dangoor
Product Manager
SitePen, Inc.
Web development experts:
development, support, training

ke...@sitepen.com

Matt Kangas

Sep 9, 2008, 11:47:09 AM
to paver
Hi Kevin,

If you'd like to see more concrete use-cases, this may be helpful:

http://www.pragmaticautomation.com/cgi-bin/pragauto.cgi/Build/Ant16Subant.rdoc
http://ant.apache.org/manual/CoreTasks/subant.html

(I just Googled "subant" and looked for the first reasonable howto)

I think what you describe is probably ideal in terms of ease-of-use.
However, neither Make nor Ant does anything this fancy:

recursive Make: you're just setting environment variables and calling
a sub-make.
- (all settings are env vars in Make anyway)
recursive Ant: can choose whether to inherit all properties, or
specific properties.
- Does NOT inherit task definitions: subproject "build.xml" files
still need to explicitly <include> files to get common tasks

The essential features IMO are:
- find "pavement.py" files in subdirectories
- invoke a sub-paver with the same Python interpreter and sys.path,
adjusting for relative paths (if present)... critical for loading
"common task" code
- call the same task in child as in parent
- pass build/install paths to the sub-paver, adjusting for relative
paths
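
To illustrate the "adjusting for relative paths" part, here is a small sketch using only the standard library. The helper names `relative_build_dir` and `absolute_search_path` are made up for this example; nothing like them is claimed to exist in Paver.

```python
import os


def relative_build_dir(build_dir, subproject_dir):
    """Return build_dir expressed relative to subproject_dir."""
    return os.path.relpath(os.path.abspath(build_dir),
                           os.path.abspath(subproject_dir))


def absolute_search_path(entries, base_dir):
    """Make relative sys.path-style entries absolute against base_dir.

    Entries that are already absolute are returned unchanged (normalized).
    """
    return [os.path.normpath(os.path.join(base_dir, entry))
            for entry in entries]
```

Making the search path absolute before changing into a subproject directory keeps imports of "common task" code working regardless of where the child runs.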

Kevin Dangoor

Sep 10, 2008, 9:16:16 AM
to pa...@googlegroups.com
On Sep 9, 2008, at 11:47 AM, Matt Kangas wrote:

> - find "pavement.py" files in subdirectories
> - invoke a sub-paver with the same Python interpreter and sys.path,
> adjusting for relative paths (if present)... critical for loading
> "common task" code
> - call the same task in child as in parent
> - pass build/install paths to the sub-paver, adjusting for relative
> paths

Hmm, I wonder if it's a hard requirement that the sub-paver has the
same Python interpreter. Things become a lot less complicated if
that's not the case, because of how imports work in Python and things
like that. Paver could grow a command line option that says "read in
an options pickle on stdin" to pass state from the parent paver to the
child paver.
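
The "options pickle on stdin" idea might look something like the sketch below (Python 3). The child here is an inline `-c` script standing in for a child paver; the command-line option itself is hypothetical and does not exist in Paver.

```python
import pickle
import subprocess
import sys

# Stand-in for a child paver that reads an options pickle on stdin.
# No such Paver command-line option exists; this only shows the mechanism.
CHILD_SOURCE = (
    "import pickle, sys\n"
    "options = pickle.load(sys.stdin.buffer)\n"
    "print(options['build_dir'])\n"
)


def run_child_with_options(options):
    """Run a child interpreter, sending `options` as a pickle on stdin."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD_SOURCE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
    )
    out, _ = proc.communicate(pickle.dumps(options))
    if proc.returncode != 0:
        raise RuntimeError("child exited with status %d" % proc.returncode)
    return out.decode().strip()
```

Since unpickling can execute arbitrary code, this only makes sense when the child trusts the parent, which is the case when one build invokes another.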

Is a goal here that the sub-pavements are runnable on their own? Make
(or bmake) has something like this:

FOO?=bar

which only sets FOO if it's not already set.

If the intention is that the sub-pavements can be run standalone but
can also be invoked as part of a larger build, then you'd want the
inherited settings to actually take precedence over the local settings
in some cases, right?
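
One way to picture that precedence, as a sketch only: split a sub-pavement's settings into "soft" defaults (like `FOO?=bar`) and "hard" settings (like `FOO=bar`). The function and parameter names are invented for illustration.

```python
def resolve_options(inherited, soft_defaults, hard_settings):
    """Combine a sub-pavement's settings with values from a parent build.

    soft_defaults behave like Make's `FOO?=bar` (used only when the parent
    did not set the key); hard_settings behave like `FOO=bar` (always win).
    All three arguments are plain dicts; the names are hypothetical.
    """
    resolved = dict(soft_defaults)  # lowest precedence
    resolved.update(inherited)      # parent's values override soft defaults
    resolved.update(hard_settings)  # the sub-pavement insists on these
    return resolved
```

Run standalone, `inherited` is empty and the soft defaults apply; run from a larger build, the parent's values replace them.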

By the way, it'll be good to nail down exactly what we want here, but
I have too many things on my plate to personally work on this right
now. I can certainly provide guidance to someone who does want to work
on it.

I am hoping to do a new release sometime soon with some deployment
goodies.

Kevin

Matt Kangas

Sep 10, 2008, 12:20:15 PM
to paver
I've taken a stab at implementing a subtask runner in a non-recursive
manner. Since my current projects are all "setup.py" based, I simply
scan for all such files, then iterate over those projects, calling
"python setup.py <task>" in each.

Some bits of horribleness in my current code:
- I'm using sh("find") to search for the sub-setup.py files, so I can
use the "-maxdepth" argument. Obviously, not portable!

- I'm using sh("cd <subdir>; python setup.py <target>") to run the
subtask. This will use whatever "python" is on the PATH, not
necessarily the one I started with.

- My "dependency_rank()" function is a fairly crazy way to sort the
list of subprojects in dependency order. It works and is fast, but I
doubt anyone could maintain it.


### code below ###

# $Id: pavement.py 28793 2008-09-10 07:08:20Z kangas $

import re
# path, task, needs and sh are used below without being imported; this
# assumes Paver puts them in the pavement's namespace.
#from paver.path import path

BUILD_DIR = path.getcwd() / "build"

# Build a regex that specifies project dependency-order
#   group "A" - build first
#   group "B" - build second
#   no match  - build last
#
# Used by dependency_rank() function
#
RE_DEPS = re.compile(r'(?P<B>platform-api)|(?P<A>singularity)')


@task
def build_dir():
    BUILD_DIR.mkdir()

@task
def clean():
    """Clean all build artifacts"""
    BUILD_DIR.rmtree()
    recur_task("clean")

@task
@needs('build_dir')
def build():
    """Build all subprojects"""
    recur_task("build", args="--build-base %s" % BUILD_DIR)

@task
def test():
    """Test all subprojects"""
    raise NotImplementedError()
    #recur_task("test")

# --------------------------------------

def dependency_rank(path_str, default='Z'):
    """
    Return a sort-key for a given project path;
    root dependencies should sort before dependents.
    "default" must sort after every group name in RE_DEPS
    """
    matches = list(RE_DEPS.finditer(path_str))
    if not matches:
        return default

    # collect the name of the group that matched in each match object
    # sample [y.groupdict() for y in matches] result:
    #   [{'B': 'platform-api', 'A': None}, {'B': None, 'A': 'singularity'}]
    labels = [name
              for m in matches
              for name, value in m.groupdict().items()
              if value]

    # the lowest-sorting label is the earliest build group
    return min(labels)

def find_subsetups(pattern='setup.py'):
    """
    Return paths to subproject "setup.py" files IN DEPENDENCY ORDER.
    First items in the list must be built first
    """
    txt = sh('find . -maxdepth 3 -name %s' % pattern, capture=True)
    lines = txt.splitlines()
    # sort result by dependency_rank
    return sorted(lines, key=dependency_rank)


def call_setup(setup_path, target, args="",
               python_bin='python', global_args=""):
    """We execute python_bin on the command line within dirname"""
    dirname = path(setup_path).dirname()
    basename = path(setup_path).basename()

    print("## %s: %s" % (target, dirname))
    exe = " ".join((python_bin, basename, global_args, target, args))
    # "&&" so the command is skipped if the cd fails
    full_exe = "cd %s && %s" % (dirname, exe)
    sh(full_exe)
    ### FIXME FIXME FIXME ### check error status?
    print("-------------------")
    print("## %s done: %s" % (target, dirname))

def recur_task(taskname, **kw):
    for p in find_subsetups():
        call_setup(p, taskname, **kw)
        print("")


### end code ###

Kevin Dangoor

Sep 15, 2008, 9:50:06 AM
to pa...@googlegroups.com
On Sep 10, 2008, at 12:20 PM, Matt Kangas wrote:

>
> I've taken a stab at implementing a subtask runner in a non-recursive
> manner. Since my current projects are all "setup.py" based, I simply
> scan for all such files, then iterate over those projects, calling
> "python setup.py <task>" in each.

Obviously, we'd want to scan for both pavement.py and setup.py :)

> Some bits of horribleness in my current code:
> - I'm using sh("find") to search for the sub-setup.py files, so I can
> use the "-maxdepth" argument. Obviously, not portable!

You want to use the goodies in the path module. They're not hard to
use, they're portable, and they don't need to fire up a shell... you
just get the results directly as path objects.

> - I'm using sh("cd <subdir>; python setup.py <target>") to run the
> subtask. This will use whatever "python" is on the PATH, not
> necessarily the one I started with.

sys.executable is what you want. It will tell you which Python is in
use.
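
For example, a sketch with the standard library's subprocess module instead of sh(); `run_setup` is a hypothetical helper name:

```python
import subprocess
import sys


def run_setup(setup_dir, target, args=()):
    """Run `setup.py <target>` in setup_dir with the current interpreter."""
    cmd = [sys.executable, "setup.py", target] + list(args)
    # cwd= replaces the shell-level "cd <subdir>;" and check_call raises
    # CalledProcessError if the child exits with a non-zero status.
    subprocess.check_call(cmd, cwd=setup_dir)
    return cmd
```

Passing the command as a list also avoids going through a shell, so paths with spaces need no quoting.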

> - My "dependency_rank()" function is a fairly crazy way to sort the
> list of subprojects in dependency order. It works and is fast, but I
> doubt anyone could maintain it.

I'm pretty sure there's gotta be another way to do this :)
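
One simpler possibility, as a sketch only: keep an explicit list of project names in build order and use the position in that list as the sort key. `BUILD_ORDER` is a hypothetical name; its contents are taken from the regex in the posted code.

```python
# Hypothetical build order: earlier entries are built first.
BUILD_ORDER = ["singularity", "platform-api"]


def dependency_rank(path_str, order=BUILD_ORDER):
    """Return a sort key: index of the first name in `order` found in path_str."""
    for rank, name in enumerate(order):
        if name in path_str:
            return rank
    return len(order)  # unmatched projects sort last
```

Used as `sorted(paths, key=dependency_rank)`, this keeps unmatched projects in their original relative order because Python's sort is stable.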

We likely want to end up with a function that can be called that will
track down sub-pavements and setups, run them with some standard set
of parameters (and pass in options automatically) and have some
customizable ordering.

Once we have a function that does that, we might be able to figure out
some nice, declarative options that can be used to create a task for it.

Kevin

Matt Kangas

Sep 15, 2008, 9:59:50 AM
to paver
Quick question: do the goodies in the path module provide any
control over the depth of traversal?

I looked at that previously, and the answer seemed to be no: it was
going to do a full depth-first traversal. I for one would like to avoid
walking the entire SVN tree, which is why I punted and went with
"find -maxdepth". (My full tree checkout takes 2.4 GB.)

Otherwise, I completely agree with your points re: sys.executable and
scanning for "pavement.py".

Kevin Dangoor

Sep 15, 2008, 10:24:57 AM
to pa...@googlegroups.com
On Sep 15, 2008, at 9:59 AM, Matt Kangas wrote:

>
> Quick question: do the goodies in the path module allow provide any
> control for the depth of traversal?

If it doesn't, there's no reason we can't add it. Our path.py is
already a fork from the original as it is (adding dry-run support), so
there's no reason why it can't just do everything we want.
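
As a rough idea of what a depth-limited search could look like, here is a sketch built on the standard library's os.walk rather than on Paver's path module. `find_build_files` and its parameters are hypothetical, and skipping `.svn` directories is an extra assumption that the original find command did not make.

```python
import os


def find_build_files(root, names=("pavement.py", "setup.py"), maxdepth=3):
    """Yield build files at most `maxdepth` levels below root.

    Depth is counted as in `find -maxdepth`: files directly inside
    root are at depth 1.
    """
    for dirpath, dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        depth = 0 if rel == os.curdir else rel.count(os.sep) + 1
        if depth + 1 >= maxdepth:
            dirnames[:] = []  # pruning in place stops os.walk descending
        else:
            dirnames[:] = sorted(d for d in dirnames if d != ".svn")
        for name in names:
            if name in filenames:
                yield os.path.join(dirpath, name)
```

Because the directory list is pruned in place, deeper parts of a large checkout are never read from disk at all.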

Kevin
