> I'd also like to see a little more focus on the perl/python/ruby
> equivalence from a newbie perspective. That is, more clarity around
> startup, script execution (including having an equivalent to python's
> "if __name__ == '__main__':" construct), class path management, etc.
> I know that this is one area where being in the JVM ecosystem makes
> our life worse rather than better, but approaching Clojure is still a
> bit daunting compared to these other languages.
I have a proposal for a standard way to make a namespace "executable"
and to invoke it as a program/script.
The basic idea is to mark each namespace intended to be run as a
"program" such that its entry point can be found, and to provide
support for calling that entry point easily.
An outline:
- add a key to the "ns" form to specify, as a symbol, the name of
  a function to be called when the namespace is "invoked" as a script.
    - I propose ":run" (with no default so no namespace is ever
      accidentally executable.)
    - If provided, the :run value in the ns form is stored as
      the :run value in the namespace's metadata.
    - The run function should
        - accept/expect Strings as arguments
            - it may accept any number of Strings using normal
              Clojure arity rules.
        - return an Integer
            - zero indicates success
            - non-zero indicates (some kind of) failure
            - error codes should be chosen and documented by
              the namespace author
- add a function to clojure.core to run an executable namespace:
    - (run ns-name arg*)
        - requires the namespace
        - retrieves the namespace's :run value and resolves it to
          a var in the namespace.
        - calls that function via:
            (apply the-run-func args)
        - returns the integer that the-run-func returns
- add support to clojure.main to handle a namespace name in
  "script position" by:
    - calling "run" on it, passing *command-line-args* in for the
      args.
    - returning the status value to the OS via Java's System/exit
      facility
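
To make the outline concrete, here is a minimal sketch of what "run"
could look like. It is only an illustration under stated assumptions:
"run" is not part of clojure.core, the "ns" form has no :run clause
today (so the marker is attached with alter-meta! instead), and the
demo.hello and demo.runner namespaces are made-up names.

```clojure
;; Sketch only. `run` is a proposed function, not an existing one.
(ns demo.hello)

(defn main
  "Entry point for this namespace: Strings in, integer status out."
  [& args]
  (println "hello," (count args) "args")
  0)

;; Stand-in for the proposed (:run main) clause, which `ns` does not
;; support: store the entry point's symbol in the namespace metadata.
(alter-meta! (the-ns 'demo.hello) assoc :run 'main)

(ns demo.runner)

(defn run
  "Requires ns-name, looks up its :run symbol, resolves that symbol to
  a var in the namespace, and applies it to the String args."
  [ns-name & args]
  (require ns-name)
  (let [target  (the-ns ns-name)
        run-sym (:run (meta target))
        run-var (when run-sym (ns-resolve target run-sym))]
    (when-not run-var
      (throw (IllegalArgumentException.
               (str ns-name " has no resolvable :run function"))))
    (apply run-var args)))

;; Returns whatever integer the entry point returns.
(run 'demo.hello "arg1" "arg2")
```

In this sketch "run" is an ordinary function, so the namespace name
has to be quoted; taking it unquoted would need a macro.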
This would be a change away from the current handling of scripts which
simply loads them, expecting them to do their operation at load time.
It also requires that the resource (file) containing the script be on
the classpath. (Though that's easy to accomplish by adding
"<path-to-the-script>" to the classpath.)
With this method, we have a new standard way to invoke a script, and
the scripts we load this way are "clean" in the sense that they don't
run arbitrary code while loading. (They can, of course, run arbitrary
code during macro expansion but it's still a good idea not to have
bits of executable code lying around being loaded when the
namespace's definition is loaded.)
A clear separation for scripts between load time and run time is a win.
Also with this method, we can treat (specially marked) namespaces that
have already been loaded into Clojure as runnable entities (albeit
with a rather restricted interface for arguments and return values).
We can invoke them from other Clojure code as we would in a shell
script, but without involving the OS at all.
Via string arguments, these Clojure scripts could also indicate or
react to a desire to use *in*, *out*, and *err* to communicate as
corresponding UNIX tools would when run by the shell.
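
As a rough illustration of calling such a script from other Clojure
code without involving the OS, the sketch below rebinds *in* and *out*
around the call. The demo.upcase namespace and its "main" function are
hypothetical; with-in-str and with-out-str are the existing
clojure.core macros.

```clojure
(ns demo.upcase
  (:require [clojure.string :as str]))

(defn main
  "Hypothetical script entry point: copies *in* to *out* in upper case
  and returns 0 for success."
  [& _args]
  (doseq [line (line-seq (java.io.BufferedReader. *in*))]
    (println (str/upper-case line)))
  0)

;; Call it the way a shell pipeline would, but entirely inside Clojure:
;; the input comes from a string and the output is captured as a string.
(def status (atom nil))

(def output
  (with-in-str "one\ntwo\n"
    (with-out-str
      (reset! status (main)))))
```

Here the status ends up in the "status" atom and the text the script
printed ends up in "output".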
People who want to "run code from a file" will still be able to do so
by using "load", but that would no longer be the supported mechanism
for writing/using/invoking Clojure scripts.
Thoughts?
--Steve
> I suggest using a "-main" function for this purpose. Then with
> (ns ... (:gen-class)) you can generate a static Java class with same
> behavior.
And if the :gen-class clause renamed main would the main function for
this namespace be that named function?
> And "public static void main(String[] args)" is already the standard
> Java way to make a class executable.
There's currently no way to return a "status" value from a method
with that signature without using System/exit, which has the
unfortunate side effect of also shutting down the JVM.
We could create a var for the status return analogous to
*command-line-args*. One could wrap the invocation of the main
function with:
    (binding [*command-line-status* 0]
      (apply the-main *command-line-args*)
      *command-line-status*)
Which would assume success, but allow the-main to indicate another
status code with
(set! *command-line-status* the-status)
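
A self-contained sketch of that idea follows. Everything in it is an
assumption for illustration: *command-line-status* does not exist in
clojure.core, and "the-main" and "invoke" are made-up names. The
^:dynamic marker is what current Clojure versions require for a var
that will be rebound with "binding".

```clojure
(ns demo.status)

;; Hypothetical var; nothing with this name exists in clojure.core.
(def ^:dynamic *command-line-status* 0)

(defn the-main
  "Example main: reports failure (status 2) when called with no args."
  [& args]
  (when (empty? args)
    (binding [*out* *err*]
      (println "usage: the-main ARG..."))
    (set! *command-line-status* 2)))

(defn invoke
  "Assumes success, but lets main-fn override the status with set!."
  [main-fn args]
  (binding [*command-line-status* 0]
    (apply main-fn args)
    *command-line-status*))

(invoke the-main ["a"]) ; => 0
(invoke the-main [])    ; => 2
```

The return value of the main function itself is ignored here; only the
var carries the status.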
--Steve
>
> On Apr 17, 11:23 pm, "Stephen C. Gilardi" <squee...@mac.com> wrote:
>>> I suggest using a "-main" function for this purpose. Then with
>>> (ns ... (:gen-class)) you can generate a static Java class with same
>>> behavior.
>>
>> And if the :gen-class clause renamed main would the main function for
>> this namespace be that named function?
>
> Sure. This even suggests a process: To "execute" a namespace, first
> compile it, then invoke the main() method of the generated class. We
> might try to simplify the compile path stuff first.
One of the main points of this is for it to work if you don't compile
it. We've had the "compile it" solution for a while and I've used it
and like it. But the current way of running scripts is very different
and I hope to change that. Launching through clojure.main or through
java directly ought to be very very similar.
>> There's currently no way to return a "status" value from a method
>> with that signature without using System/exit, which has the
>> unfortunate side effect of also shutting down the JVM.
>
> True, but the Clojure function can still return a value, if it's being
> called interactively. And if it's not being called interactively,
> it's ok to use System/exit. I frequently do this:
>
> (ns my.cool.namespace (:gen-class))
>
> (defn run [args]
> ... execution code goes here...
> ... return an integer ...)
>
> (defn -main [& args]
> (System/exit (run args)))
I think this demonstrates why your "run" function, and not your
"-main" function, is the entry point at the appropriate level for what
I'm after in my proposal. I'm proposing that the namespace be
executable from the repl *or any other Clojure code* via run, *and*
from the java command line via the namespace name being the first
argument to clojure.main.
I think not requiring that a namespace be compiled for this to work is
pretty key.
> From the REPL, I call "run". From the command line, I call "java
> my.cool.namespace", which invokes "-main", which calls "run".
> Different routes, same result.
Right. With my proposal it would be from the repl:
user=> (run my-ns "arg1" "arg2")
0
user=>
or from the command line:
% java -cp clojure.jar:. clojure.main my-ns arg1 arg2
% echo $?
0
%
In fact, with an easily found, well-defined "run" function (which I
think should *always* be specified by a :run clause in the namespace's
ns form), we could even default the definition of -main to be the one
you use:
> (defn -main [& args]
> (System/exit (run args)))
(with the "run" function named as specified for this namespace)
--Steve
On 18.04.2009 at 01:16, Stephen C. Gilardi wrote:
> I have a proposal for a standard way to make a namespace "executable"
> and to invoke it as a program/script.
I miss this so badly! Up to now, I always used gen-class to compile
a class with a main function to get this functionality. Since I mostly
don't do scripting in Clojure, this approach worked well until
yesterday, when I really needed this and didn't have it....
> Thoughts?
Ok. You asked for it, so I will play the devil's advocate!
Why this fixation on one somehow blessed function? When the
namespace is loaded I can call any public function directly.
Why do I need (run ...)? I don't see the value of this, since run
should probably return a numeric exit code, no? While any
other function will provide e.g. a map or a seq, which is probably
much more useful.
Let me bring scsh into the discussion and show how they handle
this issue. They have a command line switch to tell the scsh driver
which function to invoke after loading the script. This could also
be done for clojure.main:
java -cp .. clojure.main -E my.ns/main my-script-defining-my-ns.clj
The script does not need to be in the Classpath. As soon as it is
loaded the namespace is available. (Which brings me to the
double book-keeping of require...)
The script could be made self-contained using hashdot:
#! /usr/bin/env hashdot
;;.hashdot.main = clojure.main
;;.hashdot.args.pre = -E my.ns/main
(ns my.ns)
(defn main
[& args]
...)
./my-script-defining-my-ns.clj
With this approach I can also choose different entry points for
different invocations of the same script. Although I'm not sure this
is an interesting feature.
Just some thoughts on a different approach.
Sincerely
Meikel
> Ok. You asked for it, so I will play the devil's advocate!
>
> Why this fixation on one somehow blessed function? When the
> namespace is loaded I can call any public function directly.
> Why do I need (run ...)? I don't see the value of this, since run
> should probably return a numeric exit code, no? While any
> other function will provide e.g. a map or a seq, which is probably
> much more useful.
From within Clojure you can call any function you like. The purpose
of this function is to bridge the gap between the shell and Clojure
while still being reasonably callable from within Clojure. At the
command line all we have for arguments are strings and all we can
return is integer status.
The purpose of marking a function as "the one to call when invoking
the script" is to allow clojure.main to call the namespace as a
program by naming only the namespace. The purpose of making it a
standard interface (strings in, integer status out, use streams for
anything else) is that we can call it that way easily from either the
command line or from within Clojure. It's an interface that has been
tried and true for many years in Unix tools and in Java's "public
static void main(String[] args)" (modulo the return code issue being
different between the two).
> Let me bring scsh into the discussion and show how they handle
> this issue. They have a command line switch to tell the scsh driver
> which function to invoke after loading the script. This could also
> be done for clojure.main:
>
> java -cp .. clojure.main -E my.ns/main my-script-defining-my-ns.clj
I'm trying to make this kind of command line easier to use and
explain. We basically already have what you're proposing:
java -cp .. clojure.main -i my-script-defining-my-ns.clj -e "(my.ns/main *command-line-args*)"
Contrast that to what I'm proposing:
java -cp .. clojure.main my.ns
I still like mine, but an alternative of specifying the function
directly and then calling it with the rest of the args as strings and
returning the int it returns would also be succinct and perhaps even
clearer:
java -cp .. clojure.main my.ns/main
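
A sketch of how that lookup might work is below. The helper names
(resolve-entry, invoke-entry) and the demo.tool namespace are
hypothetical, and none of this is existing clojure.main behavior; it
only shows that a "my.ns/main" style argument carries enough
information to require the namespace and find the function.

```clojure
;; A made-up tool whose entry point returns an integer.
(ns demo.tool)

(defn main
  "Returns the number of arguments it was given."
  [& args]
  (count args))

(ns demo.dispatch)

(defn resolve-entry
  "Hypothetical helper: turns a string such as \"demo.tool/main\" into
  the var it names, requiring the namespace first. Returns nil when
  the string has no namespace part or the var does not exist."
  [arg]
  (let [sym (symbol arg)]
    (when-let [ns-part (namespace sym)]
      (require (symbol ns-part))
      (resolve sym))))

(defn invoke-entry
  "Calls the named function with the remaining args as Strings and
  returns whatever it returns."
  [arg & args]
  (if-let [entry (resolve-entry arg)]
    (apply entry args)
    (throw (IllegalArgumentException. (str "no such function: " arg)))))

(invoke-entry "demo.tool/main" "a" "b") ; => 2
```

A launcher could then pass the returned integer to System/exit, as in
the namespace-based variant.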
> The script does not need to be in the Classpath. As soon as it is
> loaded the namespace is available. (Which brings me to the
> double book-keeping of require...)
I'd rather have it on the classpath. What is the "double book-keeping
of require..."?
> The script could be made self-contained using hashdot:
>
> #! /usr/bin/env hashdot
> ;;.hashdot.main = clojure.main
> ;;.hashdot.args.pre = -E my.ns/main
>
> (ns my.ns)
>
> (defn main
> [& args]
> ...)
>
> ./my-script-defining-my-ns.clj
>
> With this approach I can also choose different entry points for
> different invocations of the same script. Although I'm not sure this
> is an interesting feature.
I'm not sure it's interesting either, but the idea of invoking a
Clojure *function* as a "program" rather than a Clojure namespace or
file is a good one I think.
> Just some thoughts on a different approach.
Thanks very much.
--Steve
On 18.04.2009 at 14:34, Stephen C. Gilardi wrote:
>> java -cp .. clojure.main -E my.ns/main my-script-defining-my-ns.clj
>
> I'm trying to make this kind of command line easier to use and
> explain. We basically already have what you're proposing:
>
> java -cp .. clojure.main -i my-script-defining-my-ns.clj -e "(my.ns/main *command-line-args*)"
I tried to make this work with hashdot, but then the script is read
twice because I have to specify it as -i script and it will be passed
again as an additional parameter to clojure.main.
The problem is that -e works before reading the script and hashdot
doesn't provide a hashdot.args.post. If it did, I could construct a
header like:
#! /usr/bin/env hashdot
;;.hashdot.main = clojure.main
;;.hashdot.args.pre = -i
;;.hashdot.args.post = -e "(my.ns/main)"
Just the usual great-tools-but-still-not-perfect issue. This could be
"fixed" in either hashdot or clojure.main, so it's not a problem with
either one. It's my problem for wanting to combine the two tools...
> I still like mine, but an alternative of specifying the function
> directly and then calling it with the rest of the args as strings
> and returning the int it returns would also be succinct and perhaps
> even clearer:
>
> java -cp .. clojure.main my.ns/main
Now, that would be cool. Although it would still rule out hashdot,
since this passes the script on as an additional argument. Could I
convince you to have an optional -E to allow the use of tools like
hashdot?
> I'd rather have it in classpath. What is the "double book-keeping of
> require..."
(ns foo.bar)
(in-ns 'user)
(require 'foo.bar)
This loads foo.bar again, although it already exists and I did not
specify the :reload argument. I ran into this a while ago
with a gen-class'd Class. It was a somewhat convoluted setup
which I don't remember any more, but it took me some time to
figure out that require doesn't care about all-ns and keeps its own
set of namespaces. This I would call double book-keeping.
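
A small sketch that makes the two registries visible side by side. It
uses create-ns rather than the ns macro so the result does not depend
on how ns behaves in a particular Clojure version; the namespace name
demo.bookkeeping is made up.

```clojure
;; create-ns registers a namespace without going through require.
(create-ns 'demo.bookkeeping)

;; Registry 1: the global namespace table, as seen through all-ns.
(def in-all-ns?
  (boolean (some #(= 'demo.bookkeeping (ns-name %)) (all-ns))))

;; Registry 2: the set of libs that require/use keeps for itself.
(def in-loaded-libs?
  (contains? (loaded-libs) 'demo.bookkeeping))

in-all-ns?      ; => true
in-loaded-libs? ; => false
```

So a namespace can exist while require still considers it not loaded.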
> Thanks very much.
Thanks for your great contributions, Stephen. In particular
the require/use system and clojure.main. :)
Sincerely
Meikel