Executable Namespace Proposal

Stephen C. Gilardi

Apr 17, 2009, 7:16:47 PM
to Clojure

On Apr 17, 2009, at 5:21 PM, Tom Faulhaber wrote:

> I'd also like to see a little more focus on the perl/python/ruby
> equivalence from a newbie perspective. That is, more clarity around
> startup, script execution (including having an equivalent to python's
> "if __name__ == '__main__':" construct), class path management, etc.
> I know that this is one area where being in the JVM ecosystem makes
> our life worse rather than better, but approaching Clojure is still a
> bit daunting compared to these other languages.

I have a proposal for a standard way to make a namespace "executable"
and to invoke it as a program/script.

The basic idea is to mark each namespace intended to be run as a
"program" so that its entry point can be found, and to provide
support for calling that entry point easily.

An outline:

- add a key to the "ns" form to specify, as a symbol, the name of
  a function to be called when the namespace is "invoked" as a script.

    - I propose ":run" (with no default, so no namespace is ever
      accidentally executable).

    - If provided, the :run value in the ns form is stored as
      the :run value in the namespace's metadata.

- The run function should

    - accept/expect Strings as arguments

        - it may accept any number of Strings using normal
          Clojure arity rules.

    - return an Integer

        - zero indicates success

        - non-zero indicates (some kind of) failure

        - error codes should be chosen and documented by
          the namespace author

- add a function to clojure.core to run an executable namespace:

    - (run ns-name arg*)

        - requires the namespace

        - retrieves the namespace's :run value and resolves it to
          a var in the namespace.

        - calls that function via:
          (apply the-run-func args)

        - returns the integer that the-run-func returns

- add support to clojure.main to handle a namespace name in
  "script position" by:

    - calling "run" on it, passing *command-line-args* in for the
      args.

    - returning the status value to the OS via Java's System/exit
      facility
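
The outline above can be sketched in a few lines of Clojure. This is only an illustration under stated assumptions, not an implementation of the proposal: "ns" has no :run key, so the entry point is recorded by hand with alter-meta!; the names demo.script, demo.runner and main are hypothetical; and "run" is an ordinary function that takes a quoted namespace symbol rather than living in clojure.core.

```clojure
;; Sketch only: `ns` has no :run option, so the entry point is
;; recorded by hand in the namespace's metadata.
(ns demo.script)

(defn main
  "Entry point: takes Strings, returns an integer status."
  [& args]
  (println "args:" (vec args))
  (if (seq args) 0 1))   ; 0 = success, 1 = no arguments given

;; Stand-in for the proposed (:run main) clause in the ns form.
(alter-meta! (the-ns 'demo.script) assoc :run 'main)

(ns demo.runner)

(defn run
  "Hypothetical version of the proposed run function."
  [ns-sym & args]
  ;; Only require when the namespace is not already loaded, so the
  ;; sketch also works for namespaces defined in the same file.
  (when-not (find-ns ns-sym)
    (require ns-sym))
  (let [run-sym (:run (meta (the-ns ns-sym)))
        run-var (and run-sym (ns-resolve ns-sym run-sym))]
    (when-not run-var
      (throw (IllegalArgumentException.
               (str ns-sym " is not marked executable (no :run entry)"))))
    ;; The status returned by the entry point is passed straight through.
    (apply run-var args)))
```

With these definitions, (demo.runner/run 'demo.script "arg1" "arg2") returns 0 and (demo.runner/run 'demo.script) returns 1. A launcher would hand that value to System/exit; code calling from the REPL would simply inspect it.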

This would be a change away from the current handling of scripts, which
simply loads them and expects them to do their work at load time.
It also requires that the resource (file) containing the script be on
the classpath. (Though that's easy to accomplish by adding
"<path-to-the-script>" to the classpath.)

With this method, we have a new standard way to invoke a script, and
the scripts we load this way are "clean" in the sense that they don't
run arbitrary code while loading. (They can, of course, run arbitrary
code during macro expansion, but it's still a good idea not to have
bits of executable code lying around being loaded when the
namespace's definition is loaded.)

A clear separation for scripts between load time and run time is a win.

Also with this method, we can treat (specially marked) namespaces that
have already been loaded into Clojure as runnable entities (albeit
with a rather restricted interface for arguments and return values).
We can invoke them from other Clojure code as we would in a shell
script, but without involving the OS at all.

Via string arguments, these Clojure scripts could also indicate, or
react to, a desire to use *in*, *out*, and *err* to communicate as the
corresponding UNIX tools would when run by the shell.

People who want to "run code from a file" will still be able to do so
by using "load", but that would no longer be the supported mechanism
for writing/using/invoking Clojure scripts.

Thoughts?

--Steve

Laurent PETIT

Apr 17, 2009, 10:24:53 PM
to clo...@googlegroups.com
Hi,

This indeed seems like an improvement to me, since from then on there will be no excuse for namespaces that perform time- or resource-consuming operations (I'm thinking of those scripts that automatically load GUIs, ...) at load (and even compile, *ouch*) time.

* Concerning the name of the keyword, may I suggest using :main instead of :run? "main" is the convention in the Java world for designating the class that serves as the application's entry point (it is used in MANIFESTs, in the name of public static void main(String args[]), ...).

* We could make this :main keyword play well with the :gen-class directive if it is present in the namespace declaration :

Examples of use:

(ns
  (:main)) ; there could be a default name for the function (if :main is present of course). This could be (-main) to offer a smooth migration if later on the ns is gen-classed

(ns
  (:main)
  (:gen-class (:prefix "foo"))) ; here :main would stay "aligned" with :gen-class by default (DRY principle) and the main function of the ns would be the main function of the generated class: (foo-main)

(ns
  (:main ns-main-function)) ; the main function for the ns is ns-main-function.

(ns
  (:main ns-main-function)
  (:gen-class)) ; here, since :main explicitly defines a name, it is intentionally not aligned with the generated class's main function

* Concerning the execution: it could then be interesting to be able to launch such a script as one would launch a plain old Java executable: java -cp clojure.jar:classes/:src/ my.ns
Following this last proposal would couple the new :main keyword even more closely with the :gen-class directive: if :main is used alone (without :gen-class at all), the namespace would still have to be compiled to a class file. (And then the function name had better always be aligned with :gen-class's.) I'm not sure this last step is a good idea, though, since it forces us either to always compile namespaces that should be executable, or to "auto-compile" the ns whenever the :main keyword is found (and I don't like the idea of compiling ahead of time in the background, since the classes/ directory may well not have been placed on the classpath by the user).

What do you think about that?

My 0,02 €,

--
Laurent


Stuart Sierra

Apr 17, 2009, 10:28:50 PM
to Clojure
On Apr 17, 7:16 pm, "Stephen C. Gilardi" <squee...@mac.com> wrote:
> I have a proposal for a standard way to make a namespace "executable"  
> and to invoke it as a program/script.
...
>      - add a key to the "ns" form to specify, as a symbol, the name of  
> a function to be called when the namespace is "invoked" as a script.
...
> Thoughts?

I suggest using a "-main" function for this purpose. Then with
(ns ... (:gen-class)) you can generate a static Java class with the
same behavior.

And "public static void main(String[] args)" is already the standard
Java way to make a class executable.

-Stuart Sierra

Laurent PETIT

Apr 17, 2009, 10:45:24 PM
to clo...@googlegroups.com


2009/4/18 Stuart Sierra <the.stua...@gmail.com>

There should be STMs for e-mails :-)

I agree with Stuart's answer, of course; I just wanted to add that it could also be "whateverprefix-main" if (:gen-class) defines a prefix.
And I still see value in having the :main keyword if the (:gen-class) is not necessary.

Regards,

--
Laurent
 

Stephen C. Gilardi

Apr 17, 2009, 11:23:29 PM
to clo...@googlegroups.com

On Apr 17, 2009, at 10:28 PM, Stuart Sierra wrote:

> I suggest using a "-main" function for this purpose. Then with
> (ns ... (:gen-class)) you can generate a static Java class with same
> behavior.

And if the :gen-class clause renamed main, would the main function for
this namespace be that named function?

> And "public static void main(String[] args)" is already the standard
> Java way to make a class executable.

There's currently no way to return a "status" value from a method
with that signature without using System/exit, which has the
unfortunate side effect of also shutting down the JVM.

We could create a var for the status return analogous to
*command-line-args*. One could wrap the invocation of the main function with:

(binding [*command-line-status* 0]
  (apply the-main *command-line-args*)
  *command-line-status*)

Which would assume success, but allow the-main to indicate another
status code with

(set! *command-line-status* the-status)
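
A minimal runnable sketch of this idea follows. It rests on assumptions: *command-line-status* is not an existing Clojure var, the-main and call-main are hypothetical names, and the ^:dynamic marker is what current Clojure versions need for a var to be rebindable with binding.

```clojure
;; Hypothetical var: holds the status the "program" wants to report.
(def ^:dynamic *command-line-status* 0)

(defn the-main
  "Example entry point: flags an error when called without arguments."
  [& args]
  (when (empty? args)
    ;; set! is legal here because call-main established a thread-local
    ;; binding for the var before calling us.
    (set! *command-line-status* 2)))

(defn call-main
  "Runs main-fn with args and returns the status it left behind.
  Assumes success (0) unless main-fn sets another value."
  [main-fn args]
  (binding [*command-line-status* 0]
    (apply main-fn args)
    *command-line-status*))
```

Here (call-main the-main []) returns 2 and (call-main the-main ["x"]) returns 0, regardless of what the-main itself returns.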

--Steve

Stuart Sierra

Apr 17, 2009, 11:40:00 PM
to Clojure
On Apr 17, 11:23 pm, "Stephen C. Gilardi" <squee...@mac.com> wrote:
> > I suggest using a "-main" function for this purpose.  Then with
> > (ns ... (:gen-class)) you can generate a static Java class with same
> > behavior.
>
> And if the :gen-class clause renamed main would the main function for  
> this namespace be that named function?

Sure. This even suggests a process: To "execute" a namespace, first
compile it, then invoke the main() method of the generated class. We
might try to simplify the compile path stuff first.

> There's currently no way to return a "status" value from an argument  
> with that signature without using System/exit which has the  
> unfortunate side effect of also shutting down the JVM.

True, but the Clojure function can still return a value, if it's being
called interactively. And if it's not being called interactively,
it's ok to use System/exit. I frequently do this:

(ns my.cool.namespace (:gen-class))

(defn run [args]
  ... execution code goes here...
  ... return an integer ...)

(defn -main [& args]
  (System/exit (run args)))

From the REPL, I call "run". From the command line, I call "java
my.cool.namespace", which invokes "-main", which calls "run".
Different routes, same result.
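
For illustration, here is one way the elided body might be filled in. Everything specific is an assumption: the namespace name demo.greeter, the greeting, and the choice of 1 as the failure code are hypothetical and not part of the original pattern.

```clojure
(ns demo.greeter
  (:gen-class))

(defn run
  "Does the actual work and returns an integer status (0 = success)."
  [args]
  (if (empty? args)
    (do
      ;; Report usage problems on stderr, as a Unix tool would.
      (binding [*out* *err*]
        (println "usage: demo.greeter NAME"))
      1)
    (do
      (println "Hello," (first args))
      0)))

(defn -main
  "Command-line entry point: hands run's status to the OS.
  Not meant for the REPL, since System/exit shuts down the JVM."
  [& args]
  (System/exit (run args)))
```

From the REPL, (run ["world"]) prints a greeting and returns 0, while (run []) returns 1 without ending the session.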

-Stuart Sierra

Stephen C. Gilardi

Apr 17, 2009, 11:59:11 PM
to clo...@googlegroups.com

On Apr 17, 2009, at 11:40 PM, Stuart Sierra wrote:

>
> On Apr 17, 11:23 pm, "Stephen C. Gilardi" <squee...@mac.com> wrote:
>>> I suggest using a "-main" function for this purpose. Then with
>>> (ns ... (:gen-class)) you can generate a static Java class with same
>>> behavior.
>>
>> And if the :gen-class clause renamed main would the main function for
>> this namespace be that named function?
>
> Sure. This even suggests a process: To "execute" a namespace, first
> compile it, then invoke the main() method of the generated class. We
> might try to simplify the compile path stuff first.

One of the main points of this is for it to work if you don't compile
it. We've had the "compile it" solution for a while and I've used it
and like it. But the current way of running scripts is very different
and I hope to change that. Launching through clojure.main or through
java directly ought to be very very similar.

>> There's currently no way to return a "status" value from an argument
>> with that signature without using System/exit which has the
>> unfortunate side effect of also shutting down the JVM.
>
> True, but the Clojure function can still return a value, if it's being
> called interactively. And if it's not being called interactively,
> it's ok to use System/exit. I frequently do this:
>
> (ns my.cool.namespace (:gen-class))
>
> (defn run [args]
> ... execution code goes here...
> ... return an integer ...)
>
> (defn -main [& args]
> (System/exit (run args)))

I think this demonstrates why your "run" function, and not your
"-main" function, is the entry point at the appropriate level for what
I'm after in my proposal. I'm proposing that the namespace be
executable from the repl *or any other Clojure code* via run, *and*
from the java command line via the namespace name being the first
argument to clojure.main.

I think not requiring that a namespace be compiled for this to work is
pretty key.

> From the REPL, I call "run". From the command line, I call "java
> my.cool.namespace", which invokes "-main", which calls "run".
> Different routes, same result.

Right. With my proposal it would be from the repl:

user=> (run my-ns "arg1" "arg2")
0
user=>

or from the command line:

% java -cp clojure.jar:. clojure.main my-ns arg1 arg2
% echo $?
0
%

In fact, with an easily found, well-defined "run" function (which I
think should *always* be specified by a :run clause in the namespace's
ns form), we could even default the definition of -main to be the one
you use:

> (defn -main [& args]
> (System/exit (run args)))

(with the "run" function named as specified for this namespace)

--Steve


Meikel Brandmeyer

Apr 18, 2009, 5:24:43 AM
to clo...@googlegroups.com
Hi,

Am 18.04.2009 um 01:16 schrieb Stephen C. Gilardi:

> I have a proposal for a standard way to make a namespace "executable"
> and to invoke it as a program/script.

I miss this so badly! Up to now, I always used gen-class to compile
a class with a main function to get this functionality. Since I mostly
don't do scripting in Clojure this approach worked well until yesterday.
There I needed that really badly and didn't have it....

> Thoughts?

Ok. You asked for it, so I will play the devil's advocate!

Why this fixation on one somehow blessed function? When the
namespace is loaded I can call any public function directly.
Why do I need (run ...)? I don't see the value of this, since run
should probably return a numeric exit code, no? While any
other function will provide eg. a map or a seq, which is probably
much more useful.

Let me bring scsh into the discussion and show how they handle
this issue. They have a command line switch to tell the scsh driver
which function to invoke after loading the script. This could also
be done for clojure.main:

java -cp .. clojure.main -E my.ns/main my-script-defining-my-ns.clj

The script does not need to be in the Classpath. As soon as it is
loaded the namespace is available. (Which brings me to the
double book-keeping of require...)

The script could be made self-contained using hashdot:

#! /usr/bin/env hashdot
;;.hashdot.main = clojure.main
;;.hashdot.args.pre = -E my.ns/main

(ns my.ns)

(defn main
  [& args]
  ...)

./my-script-defining-my-ns.clj

With this approach I can also choose different entry points for
different invocations of the same script. Although I'm not sure this
is an interesting feature.

Just some thoughts to look at a different approach.

Sincerely
Meikel

Stephen C. Gilardi

Apr 18, 2009, 8:34:57 AM
to clo...@googlegroups.com

On Apr 18, 2009, at 5:24 AM, Meikel Brandmeyer wrote:

> Ok. You asked for it, so I will play the devil's advocate!
>
> Why this fixing on one somehow blessed function? When the
> namespace is loaded I can call any public function directly.
> Why do I need (run ...)? I don't see the value of this, since run
> should probably return a numeric exit code, no? While any
> other function will provide eg. a map or a seq, which is probably
> much more useful.

From within Clojure you can call any function you like. The purpose
of this function is to bridge the gap between the shell and Clojure
while still being reasonably callable from within Clojure. At the
command line, all we have for arguments are strings, and all we can
return is an integer status.

The purpose of marking a function as "the one to call when invoking
the script" is to allow clojure.main to call the namespace as a
program by naming only the namespace. The purpose of making it a
standard interface (strings in, integer status out, use streams for
anything else) is that we can call it that way easily from either the
command line or from within Clojure. It's an interface that's been
tried and true for many years in Unix tools and in Java's "public
static void main (String[] args)" (modulo the return code issue being
different between the two).

> Let me bring scsh into the discussion and show how they handle
> this issue. They have a command line switch to tell the scsh driver
> which function to invoke after loading the script. This could also
> be done for clojure.main:
>
> java -cp .. clojure.main -E my.ns/main my-script-defining-my-ns.clj

I'm trying to make this kind of command line easier to use and
explain. We basically already have what you're proposing:

java -cp .. clojure.main -i my-script-defining-my-ns.clj -e "(my.ns/main *command-line-args*)"

Contrast that to what I'm proposing:

java -cp .. clojure.main my.ns

I still like mine, but an alternative of specifying the function
directly and then calling it with the rest of the args as strings and
returning the int it returns would also be succinct and perhaps even
clearer:

java -cp .. clojure.main my.ns/main
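
A rough sketch of what handling such a "my.ns/main" argument could look like. This is an assumption about one possible approach, not how clojure.main works; invoke-qualified is a hypothetical helper name.

```clojure
(defn invoke-qualified
  "Resolves a string like \"my.ns/main\" to a var and applies it to args.
  Hypothetical helper: loads the namespace first if it is not yet known."
  [qualified-name & args]
  (let [sym (symbol qualified-name)]
    (when-not (namespace sym)
      (throw (IllegalArgumentException.
               (str "expected ns/fn, got: " qualified-name))))
    (let [ns-sym (symbol (namespace sym))
          fn-sym (symbol (name sym))]
      (when-not (find-ns ns-sym)
        (require ns-sym))
      (if-let [v (ns-resolve ns-sym fn-sym)]
        (apply v args)
        (throw (IllegalArgumentException.
                 (str "no such var: " qualified-name)))))))
```

For example, (invoke-qualified "clojure.string/upper-case" "abc") returns "ABC". A launcher built on this would pass the remaining command-line strings as args and hand an integer result to System/exit.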

> The script does not need to be in the Classpath. As soon as it is
> loaded the namespace is available. (Which brings me to the
> double book-keeping of require...)

I'd rather have it on the classpath. What is the "double book-keeping
of require..."?

> The script could be made self-contained using hashdot:
>
> #! /usr/bin/env hashdot
> ;;.hashdot.main = clojure.main
> ;;.hashdot.args.pre = -E my.ns/main
>
> (ns my.ns)
>
> (defn main
> [& args]
> ...)
>
> ./my-script-defining-my-ns.clj
>
> With this approach I can also choose different entry points for
> different invokations of the same script. Although I'm not sure this
> is an interesting feature.

I'm not sure it's interesting either, but the idea of invoking a
Clojure *function* as a "program" rather than a Clojure namespace or
file is a good one I think.

> Just some thoughts to a look a different approach.

Thanks very much.

--Steve

Meikel Brandmeyer

Apr 18, 2009, 9:39:23 AM
to clo...@googlegroups.com
Hi,

Am 18.04.2009 um 14:34 schrieb Stephen C. Gilardi:

>> java -cp .. clojure.main -E my.ns/main my-script-defining-my-ns.clj
>
> I'm trying to make this kind of command line easier to use and
> explain. We basically already have what you're proposing:
>
> java -cp .. clojure.main -i my-script-defining-my-ns.clj -e "(my.ns/main *command-line-args*)"

I tried to make this work with hashdot, but then the script is read
twice, because I have to specify it as -i script and it is passed again
as an additional parameter to clojure.main.

The problem is that -e is evaluated before the script is read, and
hashdot doesn't provide a hashdot.args.post. If it did, I could
construct a header like:

#! /usr/bin/env hashdot
;;.hashdot.main = clojure.main

;;.hashdot.args.pre = -i
;;.hashdot.args.post = -e "(my.ns/main)"

Just the usual great-tools-but-still-not-perfect issue. This could be
"fixed" in either hashdot or clojure.main, so it's not really a problem
with either. It's my problem, for wanting to combine the two tools...

> I still like mine, but an alternative of specifying the function
> directly and then calling it with the rest of the args as strings
> and returning the int it returns would also be succinct and perhaps
> even clearer:
>
> java -cp .. clojure.main my.ns/main

Now, that would be cool. Although it would still rule out hashdot,
since hashdot passes the script on as an additional argument. Could I
convince you to have an optional -E to allow the use of tools like
hashdot?

> I'd rather have it in classpath. What is the "double book-keeping of
> require..."

(ns foo.bar)
(in-ns 'user)
(require 'foo.bar)

This loads foo.bar again, although it already exists and I did not
specify the :reload argument. I ran into this a while ago
with a gen-class'd class. It was a somewhat convoluted setup
which I don't remember any more, but it took me some time to
figure out that require doesn't consult all-ns and keeps its own
set of loaded namespaces. This is what I would call double book-keeping.
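
The two sets of books can be inspected directly. In the sketch below the namespace is created with create-ns rather than the ns form, an assumption made so that the result does not depend on what ns itself records in a given Clojure version.

```clojure
;; Create a namespace object without loading anything from disk.
(create-ns 'foo.bar)

;; Book 1: the namespace registry that find-ns and all-ns consult.
(def known-to-all-ns? (boolean (find-ns 'foo.bar)))

;; Book 2: the set of libs that require consults before loading.
(def known-to-require? (contains? (loaded-libs) 'foo.bar))

(println "in all-ns:" known-to-all-ns?
         "/ in loaded-libs:" known-to-require?)
```

The namespace exists but is not in the set returned by (loaded-libs), so a plain (require 'foo.bar) at this point still goes looking for foo/bar.clj on the classpath.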

> Thanks very much.

Thanks for your great contributions, Stephen. In particular
the require/use system and clojure.main. :)

Sincerely
Meikel
