I do appreciate your feedback and your experience. I believe in all
the problems you're talking about, having lived them all firsthand
myself (maybe not as excruciatingly as you :)). However, I don't think
I believe that your proposed solution is the one that will work.
I was going to quote and reply inline, but it won't really do much
good given that I disagree at a pretty high level. So I'll just tell
you my current thoughts in overview form. First, things I agree with:
- Yes, every shell is different, and shell portability is rather nightmarish.
- Yes, the shell is only the beginning; the rest of the Unix tools are also key.
- Yes, forking on Windows sucks.
- Yes, people will use whatever we make the default, even if #! is
available as an escape.
However, here are some points that unfortunately, together, seem to
make your proposed solution non-viable:
- I'm not smart enough to design an objectively better DSL than POSIX.
From what I can tell by looking at the available competitors, neither
is anybody else. (The plan9 shell has a few tiny improvements...)
- Even if I was smart enough, implementing such a great DSL would be
about the same amount of work as implementing the POSIX tools in a
portable way.
- Even if we implemented it despite the above, we'd surely screw up
about a million times along the way, producing a million incompatible
versions of our DSL, so we're back to square one. (That's how there
got to be so many variants of sh, after all. You must have had this
same problem with MakeMaker input file syntax, no? Did you solve it
somehow?)
- The essential simplicity of redo - no new syntax - is totally
destroyed if we implement a new language. Lacking this essential
simplicity, redo is pointless, so nobody will use it.
- The python dependency is meant to be temporary; redo should be
written in C. Ruby and Perl people are likely to get religious about
it (I know you're a perl guy and probably much more enlightened, but
let's be honest about the masses) and not use it if it's python.
- Python changes (at least a bit) in every version anyway, so people
will still screw up portability.
- As it happens, the forking can be greatly reduced through some
cleverness that I haven't implemented yet. An earlier version of
minimal/do actually did this by coincidence, and it was awesome. (The
essential thing to do is note that redo-ifchange can be implemented in
sh as a subshell of an existing shell; you don't always have to run a
new shell. Long story, I'll get to it later.)
- Make already has all the problems you're talking about and it is
still popular, so fixing these problems is not necessary in order to
be popular. Meanwhile, no system which attempts to fix the problems
you're talking about is popular. (MakeMaker is sort of popular if you
use perl, but it made a lot of sacrifices in order to achieve that,
and if you don't use perl it seems to be unhelpful.)
So I just don't think it's going to turn out if we do it that way. I
think redo has the chance to at least get rid of make in many cases;
that's a huge step forward, since make syntax is *much much less
portable* than even sh syntax, because it has all the problems you
mentioned above, *plus* make is its own DSL with a million variations.
I also have what I think is good news:
- The world has made *massive* steps toward POSIX in the last 15
years. Depending on a POSIX shell today is not nearly as ambiguous as
it was 15 years ago.
- Open Source in particular has made life a lot better for everyone;
you can get bash or ash for any platform, and it's not a totally
insane requirement.
- In fact, given how long it has taken to stabilize the POSIX
environment, and that we've now made massive strides in that
direction, designing anything *new* is like throwing away 15+ years of
progress toward portability!
- I think we are quite possibly now - finally - in the home stretch
toward a golden age of compatible POSIX environments being available
everywhere. The freebsd guys have pointed me to an excellent shell
test suite that I'm meaning to look at; I think all the shell makers
have an incentive to pass a massive combined test suite if it were
available. (Think of things like the CSS ACID test, but for
sh/POSIX.)
- I recently found out about a port of busybox to win32:
https://github.com/pclouds/busybox-w32 . I tried it. It works,
mostly (there are quite a few bugs left), and it works with no
installation, no stupid registry settings, no stupid line ending
translation, no stupid fake directory mounts, no stupid .dll files, no
stupid "dll rebasing", and just one single monolithic 500k binary. It
includes not only ash, but a whole bunch of POSIXy command-line tools.
And on win32, those tools can be run without fork/exec in most cases!
(It's done by one of the git developers. He's really smart. He did
it because git actually has a lot of sh scripts, so it has a lot of
the same portability problems we're talking about. If his solution
works out, it'll be a truly beautiful one.)
So here's what I would propose:
Make it easy for people to use a POSIX-only subset of the standard
Unix tools. How exactly? Well, I'm not sure. The idea of using
busybox by default on *all* platforms would be a very interesting
possibility. Then if you wanted to "break out" of the busybox prison,
you'd have to use #!/bin/sh and take your chances. We could include a
copy of busybox with the redo distribution, so everyone would have it.
Of course, busybox isn't fully POSIX (yet?). It seems they're trying
reasonably hard at it though. Maybe there would be value in a fork of
busybox whose goal was more to be POSIX than to be tiny, which would
make the job easier. Or maybe that's a terrible idea, I don't know.
This solution sounds a bit messy, but well, portability is always a
bit messy. It has some advantage over any other proposal I've seen
however:
- It's already mostly written (busybox, gnu, bsd, whatever)
- It's based on a standard that has had 15+ years to stabilize
- Everybody already knows the language/syntax in question
- The language/syntax is well known to be excellent for running
programs, which is what build scripts tend to do
- If we do it, it benefits not just redo, but everybody, because
they'll now all have a *real* cross-platform POSIX environment that
they can use for anything else.
Am I wrong?
Have fun,
Avery
> Make it easy for people to use a POSIX-only subset of the standard
> Unix tools. How exactly? Well, I'm not sure. The idea of using
> busybox by default on *all* platforms would be a very interesting
> possibility. Then if you wanted to "break out" of the busybox prison,
> you'd have to use #!/bin/sh and take your chances. We could include a
> copy of busybox with the redo distribution, so everyone would have it.
For Windows, I think this sounds like the right way to go - I put mingw
and busybox on a test VM and "gcc" magically becomes available in the
busybox shell, so it seems like building stuff on Windows should be
a Simple Matter of Programming.
For Linux and OS X, to use
busybox for portability would be using a sledgehammer for a walnut
- both environments already come with reasonably POSIX-compliant shells
(bash, specifically) and POSIXish user-land tools. Given the compilation
differences in OS X (.dylib files, frameworks, etc.) I doubt the
differences in flags accepted by sed will be much trouble.
As I understand it, the real problems with portable shell scripting
crop up when you start porting to the ancient Unix-derived environments
- Solaris, AIX, HP-UX and so forth. Those are the places where finding
a good POSIX shell and user-land is likely to be difficult, and also the
places where busybox is not likely to be available.
The busybox README says:
# In theory it's possible to use Busybox under other operating systems
# (such as MacOS X, Solaris, Cygwin, or the BSD Fork Du Jour). This
# generally involves a different kernel and a different C library at the
# same time. While it should be possible to port the majority of the
# code to work in one of these environments, don't be suprised if it
# doesn't work out of the box.
On a tangent, I'd certainly support redo exporting POSIXLY_CORRECT=1
into the environment of .do files, to at least nudge people toward
writing more portable code on environments with a GNU userland.
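A minimal sketch of what that export could look like, assuming the runner is written in Python and launches each .do file as a child process. The helper names do_file_env and run_with_posix_env are made up for illustration and are not part of redo:

```python
import os
import subprocess

def do_file_env(base_env=None):
    """Return a copy of the environment with POSIXLY_CORRECT=1 added.

    The caller's environment is left untouched.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["POSIXLY_CORRECT"] = "1"
    return env

def run_with_posix_env(argv):
    """Run argv with the adjusted environment and return its stdout."""
    result = subprocess.run(
        argv,
        env=do_file_env(),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Only the child process sees the variable, so the user's interactive shell keeps whatever behaviour it had before.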
> For Linux and OS X, to use
> busybox for portability would be using a sledgehammer for a walnut
> - both environments already come with reasonably POSIX-compliant shells
> (bash, specifically) and POSIXish user-land tools.
OS X comes with zsh installed by default, and I believe zsh running as
sh was already mentioned as being the most POSIXy shell out there, at
least from redo's point of view.
--
Aaron Davies
aaron....@gmail.com
While most of your other points look sound to me, I'm not sure how you
reached this conclusion. Doesn't Python's extensive standard library
already provide the functionality of all the POSIX tools? The DSL
doesn't need to cover everything in POSIX - something like Perl's qw{}
and qx{} operators (whitespace-separated lists, and shellout with
variable interpolation) would cover most of what you need. For anything
more complex, you'd have a Real Programming Language, and one that's
friendlier and more memorable than shell.
> - The essential simplicity of redo - no new syntax - is totally
> destroyed if we implement a new language. Lacking this essential
> simplicity, redo is pointless, so nobody will use it.
Speaking as a Perl guy, I think I know Python better than I know
shell... Python's syntax, after all, is expressly designed for casual
programmers.
That said, the busybox approach sounds promising too. But require it
/everywhere/. Make portability as easy to achieve as possible.
Miles
--
BE ALOOF! (There has been a recent population explosion of lerts.)
-- seen on Slashdot
I'm not sure there is a 'proper thing' to do here; portability is not
a binary attribute, and writing more portable code is always going to
take more effort. The main question is, on the sliding scale from "any
use of any tool not ratified by POSIX is a fatal error" to "crazy go
nuts", where should redo aim?
It's not possible to guarantee perfect portability - the very first
redo script I wrote wasn't even portable from one Linux distro to
another, because I happened to invoke "tempfile" (which is Debian-only)
rather than "mktemp". No amount of POSIX lock-down would have prevented
that, short of actively blocking any tool not built-in to busybox (and
note that many of the tools people actually want to run from a build
script are not in busybox, like gcc and TeX).
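To make the "actively blocking" idea concrete, here is a rough sketch of a checker that flags commands outside an allowlist. The ALLOWED set is a hypothetical handful of applets, not the real busybox list (which depends on how busybox was configured), and the parsing is deliberately naive: it only looks at the first word of each pipeline segment.

```python
# Hypothetical allowlist; a real one would be generated from the
# busybox build actually in use.
ALLOWED = {"cat", "echo", "grep", "mktemp", "sed", "tr"}

def unknown_commands(script, allowed=ALLOWED):
    """Return command names in `script` that are not in `allowed`.

    Naive on purpose: ignores quoting, variables, subshells and
    control flow, and treats every "|" as a pipeline separator.
    """
    found = []
    for line in script.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        for segment in line.split("|"):
            words = segment.split()
            if words and words[0] not in allowed and words[0] not in found:
                found.append(words[0])
    return found
```

Note that a checker like this flags gcc just as readily as tempfile, which is the problem described above: a strict lock-down also blocks the tools a build script exists to run.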
At the other end of the spectrum, what is lost if people write
unportable build scripts? I'd wager most projects only get rebuilt on
successive versions of a single platform, and forcing people to write
portable build-scripts for their much less portable code would be
wasting their time. Some projects get big enough and popular enough that
people want to use them on multiple platforms, and then you have to port
the code anyway - would porting the build-scripts be that much more
work?
Another relevant question is how *far* build-scripts should be portable.
I've asserted in this thread (admittedly without hard evidence
- rebuttals welcome) that Linux and Mac OS X are already sufficiently
similar that porting build-scripts between "ls" syntaxes is not going to
be difficult compared to porting between the different library and
compilation processes for each platform. If you want to target other
*BSDs, I imagine it still wouldn't be too hard. If you want to target
Old School UNIX environments like Solaris and AIX, you should probably
skip busybox entirely and look into testing against the Heirloom
Tools[3].
I'm not trying to attack anyone, or claim that portability is a Bad
Thing or anything like that, I'm just trying to understand the
problem space a little better.
Python's stdlib wraps a staggering amount of the C and POSIX standard
library calls. However, actually stringing them together is often much,
much more cumbersome than the equivalent shell-script.
For example, I recently stumbled across the 'pipes' standard library
module, which is easily the simplest approach I've seen to building
shell pipelines. Here's the example from the documentation:
import pipes
t=pipes.Template()
t.append('tr a-z A-Z', '--')
f=t.open('/tmp/1', 'w')
f.write('hello world')
f.close()
print open('/tmp/1').read()
The equivalent in shell-script looks like this:
echo "hello world" | tr a-z A-Z > /tmp/1
cat /tmp/1
...and besides which, the "pipes" module uses os.system() and hence
requires a POSIXy shell anyway.
Some of the solutions proposed to questions posed on this list have
involved .do files that invoke sed and tr and even fancier processing on
various commands; actually implementing those directly in Python without
some kind of DSL syntactic sugar does not sound fun at all.
(If somebody *does* want to implement a shell-style DSL, possibly for
a C rewrite of redo, I happened to be reading through the manpage[3] for
Plan 9's "rc" shell today, and it's very, very nice... I'm told that
redo's design is very reminiscent of Plan 9, so it might be a good fit.)
My point was more that you wouldn't *need* to shell out to sed and tr,
because you could do the necessary string-processing directly in Python.
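As a rough illustration of that point, the two helpers below approximate `tr a-z A-Z` and a simple `sed 's/old/new/'` without shelling out. The function names are invented for this sketch, and Python's regex dialect is not the same as sed's POSIX basic regular expressions, so patterns will not always carry over unchanged.

```python
import re
import string

def tr_upper(text):
    """Approximate `tr a-z A-Z`: uppercase ASCII letters only."""
    table = str.maketrans(string.ascii_lowercase, string.ascii_uppercase)
    return text.translate(table)

def sed_substitute(text, pattern, replacement):
    """Approximate `sed 's/pattern/replacement/'`.

    Like sed without the g flag, only the first match on each line
    is replaced.
    """
    lines = text.split("\n")
    return "\n".join(re.sub(pattern, replacement, line, count=1) for line in lines)
```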
Miles
--
Claire: At least George Bush /pretends/ to speak English.
Michael: At least Arnold Schwarzenegger /tries/.
It won’t be a whole lot extra if busybox is bundled. Most of the
tools are close to POSIX. You’ll have to do without some GNUisms
and BSDisms, but few people are truly in need of those. Most
people know and use only a few of those. Everyone does know the
basics of the Unix toolbox; busybox provides almost all of what
most people use.
> The main question is, on the sliding scale from "any use of any
> tool not ratified by POSIX is a fatal error" to "crazy go
> nuts", where should redo aim?
>
> It's not possible to guarantee perfect portability
It’s possible to write unportable code in Python and in Perl.
But most Perl and Python scripts are perfectly portable. How is
this possible? Affordances.
That is what redo should aim for.
Sure it cannot guarantee anything. That doesn’t justify simply
throwing in the towel. Just because the halting problem shows
that the behaviour of programs is fundamentally unknowable does
not mean we don’t try to fix bugs or design programming languages
that help the programmer make fewer mistakes.
Making things that take only a tiny effort take zero effort makes
a qualitative difference. It changes the way things get done.
> I'd wager most projects only get rebuilt on successive versions
> of a single platform, and forcing people to write portable
> build-scripts for their much less portable code would be
> wasting their time.
How so? Does it take *that* much effort to write a script that
will work under busybox compared to writing one that will run in
their regular shell?
This is the main reason I cannot follow your argument. As far as
I can tell, “forcing” users to write toward busybox (by default;
they can always break out with a single line!) will cost zero
added effort 99% of the time. So why oppose the idea (if you
accept that shell is the right language for redo scripts)?
> Some projects get big enough and popular enough that people
> want to use them on multiple platforms, and then you have to
> port the code anyway - would porting the build-scripts be that
> much more work?
The projects that will profit the most are mid-sized ones that
aren’t highly platform-specific, have mostly casual users, and
only a handful of developers with a limited variety of systems to
test on.
Ideally porting to a platform the devs have never heard of would
consist of “untar and redo, and it works”. This is not going to
happen all of the time of course, but with the right affordances
it isn’t vanishingly unlikely, and there is no reason not to try
to encourage it to materialise.
> I've asserted in this thread (admittedly without hard evidence
> - rebuttals welcome) that Linux and Mac OS X are already
> sufficiently similar that porting build-scripts between "ls"
> syntaxes is not going to be difficult
No, but it’ll come up over and over and over and over. None of
these tiny cases will matter in the big scheme, but why should
they have to happen when the problem can be near-prevented once?
Regards,
--
Aristotle Pagaltzis // <http://plasmasturm.org/>
I think we are wasting our time here on orthogonal concerns.
For review: http://groups.google.com/group/redo-list/browse_thread/thread/291c851908332306
Redo itself is not actually specific to build systems. It is a
file-based dependency tracking system. The question it answers is "how
can we bring a target up to date with the minimal amount of work?"
Not: "How can we compile this piece of C code without recompiling
everything?" or "How can we update these generated documentation files
with the minimal amount of work?"
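A toy sketch of that core question, assuming content hashes as the change detector. This is not redo's actual implementation or on-disk format; file_stamp, record_build and needs_rebuild are names made up for illustration.

```python
import hashlib
import os

def file_stamp(path):
    """Return a content hash for path, or None if it does not exist."""
    if not os.path.exists(path):
        return None
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

def record_build(deps):
    """Snapshot the stamps of all dependencies after a successful build."""
    return {dep: file_stamp(dep) for dep in deps}

def needs_rebuild(target, recorded):
    """True if target is missing or any recorded dependency changed."""
    if not os.path.exists(target):
        return True
    return any(file_stamp(dep) != stamp for dep, stamp in recorded.items())
```

Nothing here knows what a compiler is; the same check works whether the target is an object file or a generated documentation page.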
The question asked by a cross-platform build system, "how can we
compile this C file on every system?", is entirely orthogonal to the
design of redo. It is a problem common to scons, make, cmake, redo,
and every other build system out there. Therefore, this should be
handled in a different project. As the #! syntax demonstrates, there
is nothing tying redo to the shell except for convenience. A potential
redoconf project is free to use the shell or busybox or a scripting
language like python to handle the cross-platform stuff and can
integrate seamlessly with redo without much trouble. That is where
these cross-platform efforts should be focused.
It would be a mistake to saddle redo with an autoconf clone.
--
________________________
Warm Regards
Prakhar Goel
e-mail: newt...@gmail.com, newt...@caltech.edu
LinkedIn Profile: http://www.linkedin.com/in/newt0311
Cell: (972) 992-8078
"The real universe is always one step beyond logic." --- Frank Herbert, Dune
That was not in dispute.
The only question was: which one should be the default?
Defaults Matter.
> It would be a mistake to saddle redo with an autoconf clone.
I didn’t see anyone talking about that.
No, it was not in dispute but it was forgotten. Since cross-platform
compatibility is not the primary focus of redo, the primary criteria
for redo's scripting engine are simplicity and flexibility. The only
option which satisfies these criteria is the shell (or rather a
shell) so far. Both python and perl would require additions to easily
call other programs, which redo scripts usually end up doing
frequently, and therefore add complexity. A custom DSL would be even
worse.
But that was already the consensus.
It just struck me that this is actually a very djb-ish approach,
in keeping with the spirit of qmail installing under /opt to
avoid platform differences.
These are what I like to call "famous last words" :)
We want to make writing .do scripts a *minimum* amount of work, not
"not that much more effort."
As far as I can see, portability will *always* be harder than
non-portability. If we make redo enforce portability, we will be
making it harder for a beginner to write redo scripts. That's bad.
For example, 'cat' and 'grep' and 'find' don't behave the same in
busybox as they do in bsd/gnu. So if you run a command from the
command line to try it out, it'll do one thing; if you run it from a
.do that uses busybox, it might do something else.
This is one reason why I'm hesitant to do the whole "enforce busybox"
thing. Another reason is that it seems thoroughly gross to force
people who already have a POSIX environment to install another one
just so they can make their build scripts crash more.
That said, it seems like a good idea to *let* people *ask* their build
scripts to crash more. Someone suggested a "--busybox" option to redo
that would request it to use busybox; that would be neat.
Also, on win32 people have much lower standards for duplicated code
and oversized install packages than on Unix. So a win32 version of
redo could include a copy of busybox and nobody would care. Since
win32 has no well-accepted standard POSIX environment, that might be
the only sane way to do it.
On win32, enforcing busybox by default would probably *help* people
make their build systems work at all, so it's not a tradeoff like it
is on Unix.
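The policy described above could be sketched roughly as follows. Everything here is an assumption for illustration: the function name do_command, the use of `sh -e` as the default runner, and the idea that a win32 build would find a bundled `busybox` on its PATH.

```python
def do_command(do_file, first_line, use_busybox=False, platform="posix"):
    """Build the argv used to run a .do file.

    first_line is the first line of the .do file; use_busybox stands
    in for a hypothetical --busybox option.
    """
    if first_line.startswith("#!"):
        # Explicit escape hatch: honour the script's own interpreter.
        return first_line[2:].split() + [do_file]
    if use_busybox or platform == "win32":
        # Opt-in on Unix, default on win32 where no standard POSIX
        # environment can be assumed.
        return ["busybox", "sh", "-e", do_file]
    return ["sh", "-e", do_file]
```

For example, do_command("all.do", "echo hi", use_busybox=True) returns ["busybox", "sh", "-e", "all.do"], while the same call without the flag on Unix falls back to the system sh.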
Have fun,
Avery
Hmm. I remember using busybox in some environment a while ago,
I think on some router firmware I was routinely twiddling. Some
commands are complete enough that I never had to adjust my
habits, some were limited in ways that took extra effort.
You wrote that busybox is closing in on POSIX, so I took it on
faith that the friction has decreased since. Too willing to hope,
bah. If even `cat` is still problematic, I agree that busybox as
default will chafe. Sigh.
> Another reason is that it seems thoroughly gross to force
> people who already have a POSIX environment to install another
> one just so they can make their build scripts crash more.
LOL :-)
I’m not worried about duplication. Busybox is unobtrusive, it
will just take a bit of disk space. I used to care a lot about
that but it’s just too cheap to pay it much mind any more. If
busybox were at all desirable by any other metric it would make
up for that downside more or less instantly.
> That said, it seems like a good idea to *let* people *ask*
> their build scripts to crash more. Someone suggested
> a "--busybox" option to redo that would request it to use
> busybox; that would be neat.
That would be neat if it’s optional, yes.
Oh, it's not bad. It's just that, as others have pointed out,
POSIX-only isn't actually all that nice. There are a bunch of
situations where both BSD and GNU versions of tools (eg. find) support
a particular feature, but busybox and/or POSIX does not.
Anyway, for someone really concerned about portability, installing
busybox for testing should be fine. And if it works in busybox,
there's a really good chance it'll work with GNU or BSD tools (as long
as they're sufficiently recent, not released by idiotic proprietary
Unix companies, etc). So it's still a good strategy for portability.
It's just that it'll still be nonzero extra work for someone to write
a .do script using busybox instead of the system they're familiar
with, and I don't know if we really want to force people to go through
that.
> I’m not worried about duplication. Busybox is unobtrusive, it
> will just take a bit of disk space. I used to care a lot about
> that but it’s just too cheap to pay it much mind any more.
I still care. It's an elegance thing. Give the "waste disk space"
people an inch and they'll take a mile, and the next thing you know
your base OS install is 6 gigs, and I wish I was exaggerating.
>> That said, it seems like a good idea to *let* people *ask*
>> their build scripts to crash more. Someone suggested
>> a "--busybox" option to redo that would request it to use
>> busybox; that would be neat.
>
> That would be neat if it’s optional, yes.
I think that gets us most of what we want: people who want to be
portable have a really easy step (install busybox, test with
--busybox), and people who don't care don't pay for it.
Of course, on win32 the situation is reversed: we should include
something like busybox because it's *easier* for them that way.
Have fun,
Avery