[Web-SIG] A more useful command-line wsgiref.simple

Masklinn

Mar 29, 2012, 6:02:46 AM

to Web SIG

Moving here as suggested by Terry Reedy as this list may be more
interested than -ideas (note: some feedback already used to revise
the original proposal, and a very basic patch — with no tests — is
provided for the current CPython default branch)

Currently, calling wsgiref.simple_server simply mounts the (bundled)
demo app.

I think that's a bit of a lost opportunity: the community seems to have
mostly standardized on a wsgi script providing an application callable
in its global namespace (though details may differ, mod_wsgi does not
care for the script's name and mandates an `application` callable while
e.g. gunicorn wants a Python module and the callable name must be
configured), and it would be nice if simple_server could take such a
script and mount the application provided:

* This would allow testing that the script has no error without having
to go through mounting it in e.g. mod_wsgi
* It would make trivial/test applications (e.g. dynamic responders to
local JS) simpler to bootstrap as there would be no need for the
half-dozen lines of wsgiref.simple_server bootstrapping and "hard"
dependency on wsgiref,

    from wsgiref.simple_server import make_server

    def application(environ, start_response):
        'code'

    if __name__ == '__main__':
        httpd = make_server('', 8000, application)
        httpd.serve_forever()

could become:

    def application(environ, start_response):
        'code'

Since wsgiref already supports `python -mwsgiref.simple_server`, the
changes would be pretty simple:

* an optional positional argument of the form `script[:app]`, the script
is exec'd, the application (called "application" by default) is
extracted and then mounted in simple_server. If no script is specified,
just mount `demo_app` as before
* Add -H/--host and -p/--port options to set, respectively, the hostname
and the port to bind the server to.
* The current -msimple_server uses `handle_request` and only replies once.
To increase the usability of the CLI tool, use `serve_forever` *when and
only when the mounted application is not demo_app*, and in that case also
avoid opening a hardcoded example URL on launch.

This way the current sanity test/"PHPInfo" demo app works as it did before,
but it becomes possible to very easily serve a WSGI script with almost no
overhead in the script itself.
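
For illustration only, the behaviour listed above could be sketched roughly as follows. This is not the attached patch: the helper names (`load_application`, `build_parser`, `main`) and the placeholder `__name__` given to the exec'd script are assumptions made for this sketch, the split on ':' is taken literally from the `script[:app]` form, and the browser launch of the existing demo mode is left out.

```python
import argparse
from wsgiref.simple_server import make_server, demo_app


def load_application(script_path, app_name="application"):
    """Exec a script file and return the named callable from its globals."""
    # "__wsgi_script__" is a made-up placeholder name, chosen so that an
    # `if __name__ == '__main__':` block in the script does not run.
    namespace = {"__name__": "__wsgi_script__", "__file__": script_path}
    with open(script_path) as source:
        exec(compile(source.read(), script_path, "exec"), namespace)
    return namespace[app_name]


def build_parser():
    parser = argparse.ArgumentParser(prog="python -m wsgiref.simple_server")
    parser.add_argument("script", nargs="?",
                        help="script[:app]; the demo app is used if omitted")
    parser.add_argument("-H", "--host", default="",
                        help="hostname to bind to")
    parser.add_argument("-p", "--port", type=int, default=8000,
                        help="port to bind to")
    return parser


def main(argv=None):
    args = build_parser().parse_args(argv)
    if args.script is None:
        # Previous behaviour: mount demo_app and answer a single request.
        make_server(args.host, args.port, demo_app).handle_request()
        return
    script, _, name = args.script.partition(":")
    application = load_application(script, name or "application")
    make_server(args.host, args.port, application).serve_forever()
```

Calling `main(["-p", "8080", "hello.py"])` would then serve the `application` callable defined in a (hypothetical) hello.py until interrupted.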

Attachment: patch performing the above-specified alterations, using
argparse for arguments parsing and generation of help.

simple_server.patch

Graham Dumpleton

Mar 29, 2012, 6:38:08 AM

to Masklinn, Web SIG

On 29 March 2012 21:02, Masklinn <mask...@masklinn.net> wrote:
> Moving here as suggested by Terry Reedy as this list may be more
> interested than -ideas (note: some feedback already used to revise
> the original proposal, and a very basic patch — with no tests — is
> provided for the current CPython default branch)
>
> Currently, calling wsgiref.simple_server simply mounts the (bundled)
> demo app.
>
> I think that's a bit of a lost opportunity: the community seems to have
> mostly standardized on a wsgi script providing an application callable
> in its global namespace (though details may differ, mod_wsgi does not
> care for the script's name and mandates an `application` callable while

Apache/mod_wsgi only defaults to 'application', it is configurable.

As for the rest of the proposal, I tried to push the same idea a few
years back with intent that all WSGI servers would provide a similar
mechanism to that you describe as a lowest common denominator, but I
didn't get anywhere at the time.

Although people now perhaps appreciate more that a single approach
would be better, it has ballooned now into a much bigger goal with the
discussions on a common deployment mechanism. For example, a WARP file
(Python WAR file equivalent) being the latest idea. This was touched
on again at Web Summit at PyCon this year.

Graham

> _______________________________________________
> Web-SIG mailing list
> Web...@python.org
> Web SIG: http://www.python.org/sigs/web-sig
> Unsubscribe: http://mail.python.org/mailman/options/web-sig/graham.dumpleton%40gmail.com
>

Masklinn

Mar 29, 2012, 6:47:16 AM

to Graham Dumpleton, Web SIG

On 2012-03-29, at 12:38 , Graham Dumpleton wrote:
> On 29 March 2012 21:02, Masklinn <mask...@masklinn.net> wrote:
>> Moving here as suggested by Terry Reedy as this list may be more
>> interested than -ideas (note: some feedback already used to revise
>> the original proposal, and a very basic patch — with no tests — is
>> provided for the current CPython default branch)
>>
>> Currently, calling wsgiref.simple_server simply mounts the (bundled)
>> demo app.
>>
>> I think that's a bit of a lost opportunity: the community seems to have
>> mostly standardized on a wsgi script providing an application callable
>> in its global namespace (though details may differ, mod_wsgi does not
>> care for the script's name and mandates an `application` callable while
>
> Apache/mod_wsgi only defaults to 'application', it is configurable.

OK, I didn't see any note on the subject so I didn't think that was the
case. Sorry. That's what my (revised) proposal supports anyway,
defaulting to 'application' but overridable.

PJ Eby

Mar 29, 2012, 1:46:12 PM

to Masklinn, Web SIG

On Thu, Mar 29, 2012 at 6:02 AM, Masklinn <mask...@masklinn.net> wrote:

> * This would allow testing that the script has no error without having
> to go through mounting it in e.g. mod_wsgi
> * It would make trivial/test applications (e.g. dynamic responders to
> local JS) simpler to bootstrap as there would be no need for the
> half-dozen lines of wsgiref.simple_server bootstrapping and "hard"
> dependency on wsgiref,

Looks good to me. A few things I'd suggest adding:

* Add an option to serve a single request or forever

* Have it optionally launch any script in the webbrowser, not just the demo

* Allow use of a module instead of a script to obtain the application

* drop the ':' separator syntax, or else use os.pathsep so that it works properly on Windows (where ':' can denote a drive letter)

Masklinn

Mar 29, 2012, 2:14:09 PM

to PJ Eby, Web SIG

On 2012-03-29, at 19:46 , PJ Eby wrote:
>
> * Add an option to serve a single request or forever
> * Have it optionally launch any script in the webbrowser, not just the demo

Should these options be activated only when not mounting the demo, or
all the time (meaning they'd default to single and open respectively, in
order not to change the current behavior?)

> * Allow use of a module instead of a script to obtain the application

How would the runner make the difference? Do it as Python does, a
positional script file and a -m option for a module?

> * drop the ':' separator syntax, or else use os.pathsep so that it works
> properly on Windows (where ':' can denote a drive letter)

Ah yes, I had not thought of that, this would be annoying and I don't
think the pathsep helps, since the script is an arbitrary path file it
can contain pathseps. A second — optional as well — positional parameter
would avoid that problem, would not be more typing and would remove the
need for some of the munging in the type callable.

But now that'd look weird with -m module, unless `-m` is not an
option-with-value but a boolean flag saying to treat the first
positional parameter as a module instead of a script.

That could work nicely, it'd depart somewhat from Python's own
semantics but not too much.

Sasha Hart

Mar 29, 2012, 4:16:46 PM

to Masklinn, Web SIG

My 2c on the idea - I like this as a way to reduce friction a little
for beginners who just want to see their web apps run and may not be
comfortable enough to install something else. It would be nice to have
a one-liner with no non-stdlib dependencies, like 'python -m
SimpleHTTPServer' for running WSGI. But I probably won't be using it
much: if you can type (say) 'pip install gunicorn' then 'gunicorn app'
is ridiculously easy (not to specifically promote gunicorn, it's just
an example).

On Thu, Mar 29, 2012 at 1:14 PM, Masklinn <mask...@masklinn.net> wrote:
>
> On 2012-03-29, at 19:46 , PJ Eby wrote:
> > * drop the ':' separator syntax, or else use os.pathsep so that it works
> > properly on Windows (where ':' can denote a drive letter)
>
> Ah yes, I had not thought of that, this would be annoying and I don't
> think the pathsep helps, since the script is an arbitrary path file it
> can contain pathseps. A second — optional as well — positional parameter
> would avoid that problem, would not be more typing and would remove the
> need for some of the munging in the type callable.
>
> But now that'd look weird with -m module, unless `-m` is not an
> option-with-value but a boolean flag saying to treat the first
> positional parameter as a module instead of a script.

If the colon can be made safe for Windows it might be desirable, since
it looks fine and reflects existing practice in a number of places.
But aside from that, I really hope os.pathsep won't be used, because
then we will have a format for naming a WSGI app object which not only
differs from established practice, but (worse) depends on the platform
in a non-obvious way (i.e., not just like using os.sep when specifying
paths on platform-specific file system, which is normal for people who
type out file paths)

As I understand (please correct me if I'm wrong), the big obvious
problem for Windows is that simply splitting on ':' will do something
dumb if the path is something like 'C:\app.py' or
'D:\work\company.apps:foo'. That parsing issue can be worked around
for normal cases, but the ambiguous corner cases like 'c:app' are
confusing (that could be either a file 'app' on the root of the C:
drive, or a callable named 'app' in the module c - unfortunately the \
is not required to specify the root).

So if you want to both keep the colon and want to accept all strings
which DOS would accept before the colon, you must disambiguate which
one you want to parse. But if you can part with the colon OR with the
'c:app' corner case, it can be managed.
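
To make the parsing problem concrete, here is a small sketch of what a plain split on the first ':' does to the example strings above. `naive_split` is a name invented for this illustration, not something from the patch.

```python
def naive_split(spec, default="application"):
    """Split 'target[:name]' on the first colon, with no Windows awareness."""
    target, separator, name = spec.partition(":")
    return target, (name if separator else default)


for spec in ("blog:app", r"C:\app.py", r"D:\work\company.apps:foo", "c:app"):
    print(spec, "->", naive_split(spec))
```

The drive letter ends up as the "target" for both Windows paths, and 'c:app' parses cleanly even though nothing says whether a drive or a module was meant.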

Here are some alternative suggestions, in case they are helpful.

(A) accept only one of the two types of input at all (e.g. insist only
on importable module name - it looks like this is what gunicorn does
and I like it a lot - short of having your code installed on system
PYTHONPATH, or using virtualenv PYTHONPATH, you can also do this if
you are in the directory of the script/package)

(B) use --file=c:\app.py or similar to disambiguate what kind of
parsing to do - then the typical case can be the importable and users
who really want to refer to files explicitly can do so

(C) make an executive decision to interpret 'c:app' as object 'app' in
module 'c', require roots to be specified like 'c:\',
and split from the rightmost : not trailed by \. I believe this only
makes a problem in the incredibly specific pathological case where a
DOS-savvy windows user is trying to serve a file without an extension
directly out of root, while naming root unusually without using the
conventional \. and that rare case would generally just generate a
'huh?' - with the exception that the diabolical user made
single-letter modules with the same letters as drives and dumped them
on PYTHONPATH... I think this would work, just be more complex than
the (A) and (B) solution which seems to jibe more with the rest of
Python. but if you want to allow files to be specified freely in the
same spot that modules might be specified...

(D) give up colon: first arg is a filepath-or-pythonmodulepath, and
use something like --name=app (defaults to 'application' if not
specified) to add the app name to look for in the module... I'm not
too keen on this, it is almost as much extra typing as 'pip install'
for a very typical case. The colon is nice because it's one keystroke
and you can expect to use it

(E) give up colon: first arg is a filepath-or-pythonmodulepath, and
optional second arg is what would have come after the colon. I think
this is your suggestion and it's not unreasonable at all. If we are
talking about 'python -m wsgiref.simple_server c:app app' then the
second positional arg is definitely not the weirdest looking part. I
just question whether it's really necessary - why not either drop the
implicit specification of a filename (A/B) or enforce backslash after
root on windows (C)?
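
As a rough sketch of what option (C) would mean in code (splitting at the rightmost ':' that is not followed by a backslash), something like the following might do. The function name and the default are assumptions for illustration, and a drive written with a forward slash ('C:/app.py') would still be mis-split by this version.

```python
def split_rightmost_colon(spec, default="application"):
    """Split at the rightmost ':' that is not immediately followed by a backslash."""
    index = spec.rfind(":")
    while index != -1:
        if not spec.startswith("\\", index + 1):
            # Everything before the colon is the target, the rest is the name.
            return spec[:index], spec[index + 1:] or default
        # This colon introduces a drive root such as 'C:\'; keep looking left.
        index = spec.rfind(":", 0, index)
    return spec, default


for spec in (r"C:\app.py", r"D:\work\company.apps:foo", "c:app", "blog"):
    print(spec, "->", split_rightmost_colon(spec))
```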

PJ Eby

Mar 29, 2012, 10:25:04 PM

to Sasha Hart, Web SIG

On Thu, Mar 29, 2012 at 4:16 PM, Sasha Hart <s...@sashahart.net> wrote:

> If the colon can be made safe for Windows it might be desirable, since
> it looks fine and reflects existing practice in a number of places.

For *modules*, yes. For scripts, no.

But for module names, there's no conflict.

> (A) accept only one of the two types of input at all (e.g. insist only
> on importable module name - it looks like this is what gunicorn does
> and I like it a lot - short of having your code installed on system
> PYTHONPATH, or using virtualenv PYTHONPATH, you can also do this if
> you are in the directory of the script/package)

This would tend to be my preference, if we don't support both.

> (B) use --file=c:\app.py or similar to disambiguate what kind of
> parsing to do - then the typical case can be the importable and users
> who really want to refer to files explicitly can do so

This is my preference if we do support both - to explicitly separate modules from scripts somehow.

> (E) give up colon: first arg is a filepath-or-pythonmodulepath, and
> optional second arg is what would have come after the colon. I think
> this is your suggestion and it's not unreasonable at all. If we are
> talking about 'python -m wsgiref.simple_server c:app app' then the
> second positional arg is definitely not the weirdest looking part. I
> just question whether it's really necessary - why not either drop the
> implicit specification of a filename (A/B) or enforce backslash after
> root on windows (C)?

I don't like accepting both scripts and modules without an explicit marking as to which you're providing. People confuse the two concepts enough already.

Masklinn

Mar 30, 2012, 4:31:52 AM

to Web SIG

On 2012-03-30, at 04:25 , PJ Eby wrote:
>
>> (B) use --file=c:\app.py or similar to disambiguate what kind of
>> parsing to do - then the typical case can be the importable and users
>> who really want to refer to files explicitly can do so
>
> This is my preference if we do support both - to explicitly separate
> modules from scripts somehow.

Earlier I proposed a -m switch (inspired from Python), although the
repetition of -m may be slightly weird I think it could work nicely.

In either case, the colon separator would be removed and the
application's variable name would be a positional following the
script-or-module.

Sasha Hart

Mar 30, 2012, 2:22:39 PM

to Masklinn, Web SIG

On Fri, Mar 30, 2012 at 3:31 AM, Masklinn <mask...@masklinn.net> wrote:

> On 2012-03-30, at 04:25 , PJ Eby wrote:
>>
>>> (B) use --file=c:\app.py or similar to disambiguate what kind of
>>> parsing to do - then the typical case can be the importable and users
>>> who really want to refer to files explicitly can do so
>>
>> This is my preference if we do support both - to explicitly separate
>> modules from scripts somehow.
>
> Earlier I proposed a -m switch (inspired from Python), although the
> repetition of -m may be slightly weird I think it could work nicely.
>
> In either case, the colon separator would be removed and the
> application's variable name would be a positional following the
> script-or-module.

I am finding more reasons to dislike that -m:

    python -m wsgiref.simple_server -m blog app

Beyond looking a little stuttery, it's really unclear. Anyone could be forgiven for thinking that -m meant the same thing in both cases, took the same kinds of arguments, could be exchanged for any other -m clause. But wsgiref.simple_server is not at all doing what Python is doing so I see no gain of understanding by reusing the convention. python -m doesn't take a second positional argument, either. You can't write '-m blog app -m wsgiref.simple_server' or '-m blog -m wsgiref.simple_server blog', but you have to understand lots of specifics to see why. I think it's just too confusing.

On reflection, I feel strongly that a module name should be the default positional arg, not a filename. I agree with PJ Eby that pointing directly at a file encourages script/module confusion. I would add that it encourages hardcoding file paths rather than module names, which is brittle and not good for the WSGI world (for example, it bypasses virtualenvs and breaks any time a different deploy directory structure is used). Of course, this also means no '-m'. Then the typical use case is really just

    python -m wsgiref.simple_server blog

A second positional arg is both a new convention and not an explicit one, where I would prefer either an existing implicit convention or a new explicit one.

I think PJ Eby is right that the colon convention is only for modules, and I think following gunicorn's lead here would result in a nicer interface than forcing (say) --module

    python -m wsgiref.simple_server blog:app

If there is a need to point at a filename, I agree that it should be done explicitly.

    python -m wsgiref.simple_server --file=~/app.py

(or whatever the flag should be called). To me this seems like a small cost to allow the colon by default without possibility of confusion or overly fancy parsing.
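
A hedged sketch of how that interface might be wired up, with the positional argument read as an importable `module[:callable]` and `--file` as the explicit escape hatch. All function names here are invented for illustration, `--file` is only the tentative flag name used above, and in this sketch a file can only supply a callable named `application`.

```python
import argparse
import importlib


def build_parser():
    parser = argparse.ArgumentParser(prog="python -m wsgiref.simple_server")
    parser.add_argument("target", nargs="?",
                        help="importable module[:callable]")
    parser.add_argument("--file",
                        help="path to a script defining 'application'")
    return parser


def load_script(path):
    """Exec a file by path and return its globals (not a real import)."""
    # "__wsgi_script__" is a made-up placeholder module name.
    namespace = {"__name__": "__wsgi_script__", "__file__": path}
    with open(path) as source:
        exec(compile(source.read(), path, "exec"), namespace)
    return namespace


def resolve_application(argv):
    parser = build_parser()
    args = parser.parse_args(argv)
    if (args.target is None) == (args.file is None):
        parser.error("give exactly one of module[:callable] or --file")
    if args.file is not None:
        return load_script(args.file)["application"]
    module_name, _, attribute = args.target.partition(":")
    module = importlib.import_module(module_name)
    return getattr(module, attribute or "application")
```

With this, `resolve_application(["blog:app"])` imports `blog` through the normal import system (so the current sys.path and any virtualenv apply), while `resolve_application(["--file=/path/to/app.py"])` never goes through the import machinery at all.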

I do really like the idea of having a quick WSGI runner in the stdlib, I hope it happens and would be happy to help. New users should be able to see results with a one-liner local server and a quick server deploy tutorial BEFORE they have to implement a WSGI recipe or understand WSGI. Although it's not rocket science, it's just easier for everyone when people who don't (yet) care about WSGI aren't forced to deal with it directly, and this would help a little with that.

Masklinn

Mar 30, 2012, 3:20:57 PM

to Sasha Hart, Web SIG

On 2012-03-30, at 20:22 , Sasha Hart wrote:
>
> I am finding more reasons to dislike that -m:
>
> python -m wsgiref.simple_server -m blog app
>
> Beyond looking a little stuttery, it's really unclear. Anyone could be
> forgiven for thinking that -m meant the same thing in both cases

And it does. In both cases it means "use this module for execution". Hence
`-m`, as a shorthand for `--module`.

> , took the
> same kinds of arguments

It does as well, both take a Python module on the path.

> , could be exchanged for any other -m clause. But
> wsgiref.simple_server is not at all doing what Python is doing

Of course not, if it did the exact same thing it would be redundant.

> so I see no
> gain of understanding by reusing the convention. python -m doesn't take a
> second positional argument, either.

I'm not sure why that would matter.

> You can't write '-m blog app -m
> wsgiref.simple_server' or '-m blog -m wsgiref.simple_server blog'

Naturally, because this makes no sense at all, the tool being invoked
to start the server is all of `python -mwsgiref.simple_server`. But
that's the very basics of the -m option.

> , but you
> have to understand lots of specifics to see why.

What specifics do you have to understand beyond the normal behavior of `-m`?

I fail to see why that would be any more troubling than
`python -mcalendar -m 6`. Or require any more specifics.

> On reflection, I feel strongly that a module name should be the default
> positional arg, not a filename. I agree with PJ Eby that pointing directly
> at a file encourages script/module confusion. I would add that it
> encourages hardcoding file paths rather than module names, which is brittle
> and not good for the WSGI world (for example, it bypasses virtualenvs and
> breaks any time a different deploy directory structure is used).

Not sure how that makes sense, it uses the Python instance and site-package
in the virtualenv, there is nothing to bypass.

> Of course,
> this also means no '-m'. Then the typical use case is really just
>
> python -m wsgiref.simple_server blog
>
> A second positional arg is both a new convention and not an explicit one,
> where I would prefer either an existing implicit convention or a new
> explicit one.

What is not explicit in having an explicit argument, that it's a positional
one instead of an option? How is a colon "more explicit"?

> I think PJ Eby is right that the colon convention is only for modules, and
> I think following gunicorn's lead here would result in a nicer interface
> than forcing (say) --module
>
> python -m wsgiref.simple_server blog:app

The colon is no more explicit than a second positional argument. In fact,
it is significantly less so since it can not be separately and clearly
documented and the one positional parameter needs to document its parsing
rules instead.

> If there is a need to point at a filename, I agree that it should be done
> explicitly.
>
> python -m wsgiref.simple_server --file=~/app.py
>
> (or whatever the flag should be called). To me this seems like a small cost
> to allow the colon by default without possibility of confusion or overly
> fancy parsing.

1. It also does not work considering you can't specify the application's
name in that scheme, so piling on yet more complexity would be
required and there would be two completely different schemes for
specifying an application name. I don't find this appealing.

2. You seem to have asserted from the start that the default should be
mounting modules, but I have seen no evidence or argument in favor of
that so far.

Defaulting to scripts not only works with both local modules and
arbitrary files and follows CPython's (and most tools') own behavior,
but would also allow using -mwsgiref.simple_server as a shebang
line. I find this to have quite a lot of value.

Geoffrey Spear

Mar 30, 2012, 3:58:52 PM

to Masklinn, Web SIG

On Fri, Mar 30, 2012 at 3:20 PM, Masklinn <mask...@masklinn.net> wrote:
> 2. You seem to have asserted from the start that the default should be
> mounting modules, but I have seen no evidence or argument in favor of
> that so far.
>
> Defaulting to scripts not only works with both local modules and
> arbitrary files and follows CPython's (and most tools') own behavior,
> but would also allow using -mwsgiref.simple_server as a shebang
> line. I find this to have quite a lot of value.

I may be dense, but is there actually a use case for using a WSGI
application from a script? Presumably a script that defines a WSGI
application would also run it.

Masklinn

Mar 30, 2012, 4:09:12 PM

to Web SIG

On 2012-03-30, at 21:58 , Geoffrey Spear wrote:
> On Fri, Mar 30, 2012 at 3:20 PM, Masklinn <mask...@masklinn.net> wrote:
>> 2. You seem to have asserted from the start that the default should be
>> mounting modules, but I have seen no evidence or argument in favor of
>> that so far.
>>
>> Defaulting to scripts not only works with both local modules and
>> arbitrary files and follows CPython's (and most tools') own behavior,
>> but would also allow using -mwsgiref.simple_server as a shebang
>> line. I find this to have quite a lot of value.
>
> I may be dense, but is there actually a use case for using a WSGI
> application from a script? Presumably a script that defines a WSGI
> application would also run it.

I would recommend going back to the original email of the thread
where I described precisely the issue I have with this. And of course
you could have the exact same objection about a module.

The point is very much not to need the overhead of defining a
(conditional) import and usage of simple_server, and keep the
script/module itself focused on its job of defining the application.

It also frees up the __main__ runner for other behaviors such as
testing or CLI tools if need be.

Graham Dumpleton

Mar 30, 2012, 5:12:23 PM

to Geoffrey Spear, Web SIG

On 31 March 2012 06:58, Geoffrey Spear <geoff...@gmail.com> wrote:
> On Fri, Mar 30, 2012 at 3:20 PM, Masklinn <mask...@masklinn.net> wrote:
>> 2. You seem to have asserted from the start that the default should be
>> mounting modules, but I have seen no evidence or argument in favor of
>> that so far.
>>
>> Defaulting to scripts not only works with both local modules and
>> arbitrary files and follows CPython's (and most tools') own behavior,
>> but would also allow using -mwsgiref.simple_server as a shebang
>> line. I find this to have quite a lot of value.
>
> I may be dense, but is there actually a use case for using a WSGI
> application from a script? Presumably a script that defines a WSGI
> application would also run it.

Some history for you.

Seeing the file containing a WSGI application entry point as a file
rather than a module derives from how Apache works.

Take for example CGI under Apache, one can say for a directory context:

    AddHandler cgi-script .py

What this means is that any files with a .py extension are executed as
CGI scripts. Thus, each would have to be an executable file and have an
appropriate #! line which can resolve the Python interpreter to use.
In this case the extension used is actually irrelevant.

When mod_python came along it allowed one instead to say:

    AddHandler python-script .py

The way mod_python then originally worked was that when it resolved a
URL to a directory containing the target .py file, it would add that
directory to sys.path, import the module based on the basename of the
target file. It would then execute the entry point callable within the
loaded module.

No #! line was needed, nor did the file need to be executable. The first
wasn't needed because mod_python dictated what Python version was
used.

The problem with what mod_python did was the AddHandler can span
multiple directories. As target files in each directory were accessed,
each directory would get added into sys.path to be able to import it.

Because these are normal file system directories and treated as
separate module directories and not part of an overall package
structure, there was nothing to stop you having the same name file in
each directory. It was common for example to have:

    DirectoryIndex index.py

This means that if the directory itself was the target, it would use
the index.py in the directory as means of generating the directory
index.

If more than one directory was added to sys.path containing an
index.py file, you can only have one loaded as a module, not both.

Thus you ended up with an in memory instance of 'index' module being
used rather than the second one encountered, or depending on sys.path
ordering, you could import the 'index' module from the wrong
directory. Basically, things were a bit unpredictable if you ever used
the same file name more than once.
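
The effect is easy to reproduce outside Apache. The following is only a simulation of the situation described, not mod_python code: it creates two throwaway directories (the names `site_a` and `site_b` are invented), puts an index.py in each, and adds them to sys.path one after the other.

```python
import importlib
import os
import sys
import tempfile


def make_site(root, name, marker):
    """Create a directory holding an index.py that records where it lives."""
    directory = os.path.join(root, name)
    os.mkdir(directory)
    with open(os.path.join(directory, "index.py"), "w") as handle:
        handle.write("WHERE = %r\n" % marker)
    return directory


sys.modules.pop("index", None)  # start from a clean slate
with tempfile.TemporaryDirectory() as root:
    first = make_site(root, "site_a", "a")
    second = make_site(root, "site_b", "b")

    sys.path.insert(0, first)
    first_index = importlib.import_module("index")

    # The second directory is now ahead of the first on sys.path, yet the
    # import below is answered from the sys.modules cache.
    sys.path.insert(0, second)
    second_index = importlib.import_module("index")

    same_module = first_index is second_index
    seen = second_index.WHERE

    # Undo the global changes made for the demonstration.
    sys.path.remove(first)
    sys.path.remove(second)
    sys.modules.pop("index", None)

print(same_module, seen)
```

Both imports return the module from the first directory, so a request for the second directory's index would be served by the wrong code.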

There were various other things that could go wrong as well.

In later versions of mod_python the whole module importing system was
rewritten to avoid adding directories into sys.path. Instead a custom
module importer was used with special lookup rules to find modules in
directories itself.

Further, when modules were loaded, the __name__ of the module was not
just the basename of the file, but a magic string taking into account
the full file path name. By doing this, even though index.py may occur
in separate places, they would be distinct modules in memory.

The complexity of still allowing relative module imports from the same
directory to simulate things as if directory was in sys.path was
frightening though. Add to that that mod_python had a reloading
mechanism which could look not just at the immediate file, but all sub
modules imported from the directories managed by the mod_python custom
module importer and also trigger a reload when one of the used modules
was changed and not just the top level one.

Now when doing mod_wsgi, a similar method of loading each file
separately with a __name__ based on file system path was used to
ensure each was distinct when same file name used in different
directories.
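
For readers who have not seen this done, a minimal sketch of the general technique follows: build a module object in memory, give it a `__name__` derived from the file's path, and exec the file's code into it. This is not mod_wsgi's actual code; the `_wsgi_script_` prefix, the use of a SHA-256 digest, and the function name are assumptions made for illustration.

```python
import hashlib
import os
import types


def load_wsgi_script(path):
    """Load a file by path into a fresh module with a path-derived name."""
    path = os.path.abspath(path)
    # Hypothetical naming scheme: a fixed prefix plus a digest of the path,
    # so that index.wsgi in two directories yields two distinct names.
    digest = hashlib.sha256(path.encode("utf-8")).hexdigest()
    module = types.ModuleType("_wsgi_script_" + digest)
    module.__file__ = path
    with open(path, "rb") as source:
        code = compile(source.read(), path, "exec")
    exec(code, module.__dict__)
    return module
```

Neither a .py extension nor a sys.path entry is needed, and `load_wsgi_script(path).application` is then the callable to serve. The sketch does not register the module in sys.modules.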

What mod_wsgi didn't do though was replicate the custom module
importer that mod_python had as that really was a nightmare.

This meant that relative module imports from same directory would not
work. If someone really wanted that, they would need to add the
directory to sys.path themselves.

Once they did that though, because the target file as loaded by
mod_wsgi had a __name__ which didn't match the basename for file, then
if someone tried to import that module file back into something else,
you would end up with two copies in memory. The first being the magic
one mod_wsgi loaded as file and the other loaded as module.

To make it more obvious that they were treated a bit differently, and
to avoid people making this mistake, it was promoted to use a .wsgi
extension for the WSGI script file rather than .py. That way people
would not go inadvertently importing it a second time.

Further, because of the way that the .wsgi script file was loaded,
ie., not as regular module import, and with __name__ being special
there were certain things you couldn't do in it.

One example was that you could not put class definitions for objects
in it which you then pickled up instances of. This is because when
unpickling, Python would not know how to automatically import the
module containing the class definitions, because the __name__ would
not have any meaning to the Python process doing the unpickling.
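
A small simulation of that failure, under stated assumptions: the magic module name below is made up, the "script" is an inline string rather than a real .wsgi file, and removing the entry from sys.modules stands in for unpickling in a different process.

```python
import pickle
import sys
import types

MAGIC_NAME = "_wsgi_script_0123abcd"  # hypothetical path-derived name

SOURCE = """
class Session(object):
    def __init__(self, user):
        self.user = user
"""

# Load the "script" the way a server might: into a module whose __name__
# is the magic string, registered in sys.modules for this process only.
module = types.ModuleType(MAGIC_NAME)
exec(compile(SOURCE, "app.wsgi", "exec"), module.__dict__)
sys.modules[MAGIC_NAME] = module

payload = pickle.dumps(module.Session("alice"))

# Pretend we are now in another process that never loaded the script.
del sys.modules[MAGIC_NAME]
try:
    pickle.loads(payload)
    outcome = "unpickled"
except ImportError as exc:
    outcome = "failed: %s" % exc

print(outcome)
```

The pickle records only the module name and class name, so the second process has nothing it can import to find `Session` again.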

So that is the history of why there is a distinction between WSGI
script file and module containing a WSGI application in Apache at
least. Although it is called 'script file', it is really only 'file'
as it isn't executable itself in the sense of what people generally
talk about when they say 'script'.

FWIW, in the past when pushing the idea of a WSGI script file being
the lowest common denominator, part of the reason I found I couldn't
get it accepted is that some people simply didn't understand how in
Python to load an arbitrary file by path name and construct a module
for it in memory, with magic __name__. They seemed to think that the
only way to import a code file was for it to have a .py extension and
for the directory to be in sys.path. So ignorance of how to do it
meant I got pushback from some people.
:-(

Graham

PJ Eby

Mar 30, 2012, 11:27:30 PM

to Sasha Hart, Web SIG

On Fri, Mar 30, 2012 at 2:22 PM, Sasha Hart <s...@sashahart.net> wrote:

> I do really like the idea of having a quick WSGI runner in the stdlib,

What's kind of funny is that this was actually one of the original use cases that resulted in the invention of WSGI; back in the early 2000's, PEAK had its own internal protocol called "runCGI", and part of the idea was that we had a command line tool that could run things implementing that interface from the command line, with servers for fastcgi, cgi, SimpleHTTPServer, and so on.

Ah well, that's a bit off-topic. But mainly, it's the reason I never thought of actually making wsgiref.simple_server do that stuff; I already had a runner in PEAK that would do that kind of thing and launch a browser too. Despite inspiring simple_server, it completely slipped my mind that it'd be a good idea to put that stuff in wsgiref, too!

Regarding modules vs. files, I don't really care that much which way the capability is spelled, as long as the file vs. module distinction is explicit. "-m " isn't a lot to add to a command line, and neither is "-f ". If there's no consensus, just require that one or the other be specified, and inconvenience both groups of people equally. ;-)

PJ Eby

Mar 30, 2012, 11:36:12 PM

to Graham Dumpleton, Web SIG

On Fri, Mar 30, 2012 at 5:12 PM, Graham Dumpleton <graham.d...@gmail.com> wrote:

> Now when doing mod_wsgi, a similar method of loading each file
> separately with a __name__ based on file system path was used to
> ensure each was distinct when same file name used in different
> directories.

Why give them a __name__ at all? Aren't they scripts, rather than modules? ISTM that not having a __name__ would also let things like pickles fail faster. That is, code that expected a module rather than a script would break right away.

> FWIW, in the past when pushing the idea of a WSGI script file being
> the lowest common denominator, part of the reason I found I couldn't
> get it accepted is that some people simply didn't understand how in
> Python to load an arbitrary file by path name and construct a module
> for it in memory, with magic __name__. They seemed to think that the
> only way to import a code file was for it to have a .py extension and
> for the directory to be in sys.path. So ignorance of how to do it
> meant I got pushback from some people.

Who were you trying to get acceptance from? Web-SIG or Python-Dev? Framework devs or end-users? Is there a PEP?

If implementation is a problem for people, could we just include a wsgiref utility for it?

Graham Dumpleton

Mar 30, 2012, 11:57:04 PM

to PJ Eby, Web SIG

On 31 March 2012 14:36, PJ Eby <p...@telecommunity.com> wrote:
> On Fri, Mar 30, 2012 at 5:12 PM, Graham Dumpleton
> <graham.d...@gmail.com> wrote:
>>
>> Now when doing mod_wsgi, a similar method of loading each file
>> separately with a __name__ based on file system path was used to
>> ensure each was distinct when same file name used in different
>> directories.
>
>
> Why give them a __name__ at all? Aren't they scripts, rather than modules?
> ISTM that not having a __name__ would also let things like pickles fail
> faster. That is, code that expected a module rather than a script would
> break right away.

Because not having a __name__ attribute at all would make:

    if __name__ == '__main__':
        ...

fail straight away and people quite often had that in scripts so they
could run it directly as well with a pure WSGI server.

>> FWIW, in the past when pushing the idea of a WSGI script file being
>> the lowest common denominator, part of the reason I found I couldn't
>> get it accepted is that some people simply didn't understand how in
>> Python to load an arbitrary file by path name and construct a module
>> for it in memory, with magic __name__. They seemed to think that the
>> only way to import a code file was for it to have a .py extension and
>> for the directory to be in sys.path. So ignorance of how to do it
>> meant I got pushback from some people.
>
> Who were you trying to get acceptance from? Web-SIG or Python-Dev?
> Framework devs or end-users? Is there a PEP?

I brought it up on the WEB-SIG. It may have been bad timing amongst
all the other discussions that went around in circles at the time on
the WEB-SIG. Also mentioned it in passing to some WSGI server
developers and other people when discussing web stuff at meet ups or
otherwise.

PJ Eby

Apr 1, 2012, 12:03:17 AM

to Graham Dumpleton, Web SIG

On Fri, Mar 30, 2012 at 11:57 PM, Graham Dumpleton <graham.d...@gmail.com> wrote:

> On 31 March 2012 14:36, PJ Eby <p...@telecommunity.com> wrote:
>> Why give them a __name__ at all? Aren't they scripts, rather than modules?
>> ISTM that not having a __name__ would also let things like pickles fail
>> faster. That is, code that expected a module rather than a script would
>> break right away.
>
> Because not having a __name__ attribute at all would make:
>
>     if __name__ == '__main__':
>         ...
>
> fail straight away and people quite often had that in scripts so they
> could run it directly as well with a pure WSGI server.

Well, they could always replace it with "if globals().get('__name__', '__main__')== '__main__':", I suppose. ;-)
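
A small sketch of how that guard behaves, assuming the script is run with exec() in a plain dict namespace. The SCRIPT text, the ran_as_main variable and the "_wsgi_hello_" name are invented for the illustration.

```python
SCRIPT = """
if globals().get('__name__', '__main__') == '__main__':
    ran_as_main = True
else:
    ran_as_main = False
"""

# A namespace with no '__name__' key: the guard falls back to '__main__'.
bare = {}
exec(SCRIPT, bare)

# A namespace where a loader has assigned its own name (a made-up value).
named = {"__name__": "_wsgi_hello_"}
exec(SCRIPT, named)

print(bare["ran_as_main"], named["ran_as_main"])
```

With this spelling the script treats a missing __name__ the same way as being run directly, so it only skips the guarded block when a loader explicitly supplies some other name.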

Masklinn

Apr 1, 2012, 10:32:00 AM

to Web SIG

On 2012-03-31, at 05:27 , PJ Eby wrote:

> On Fri, Mar 30, 2012 at 2:22 PM, Sasha Hart <s...@sashahart.net> wrote:
>
>> I do really like the idea of having a quick WSGI runner in the stdlib,
>>

> Regarding modules vs. files, I don't really care that much which way the
> capability is spelled, as long as the file vs. module distinction is
> explicit. "-m " isn't a lot to add to a command line, and neither is "-f
> ". If there's no consensus, just require that one or the other be
> specified, and inconvenience both groups of people equally. ;-)

Let's go with that. I'm not sure if I should be using sub-commands or
options so for now I've used options. UI changes from initial patch:

* Exclusive options for getting the application callable from a script
(via `exec`) or a module (using `importlib.import_module`)
* Application name specification has been moved to an option (default:
`application`)
* Option to serve the request only once (default: forever) when mounting
a custom application
* Option to open a browser window/tab to the (host, port) specified
(which still default to '' and 8000 respectively)

> python -mwsgiref.simple_server -h
usage: simple_server.py [-h] [-H HOST] [-p PORT] [-s SCRIPT | -m MODULE]
                        [-a APP] [-1] [-b]

Mount and serve a WSGI application. If no script or module is specified,
mounts wsgiref.simple_server.demo_app

optional arguments:
  -h, --help            show this help message and exit
  -H HOST, --host HOST  Host to listen on
  -p PORT, --port PORT  Port to listen on (defaults to 8000)
  -s SCRIPT, --script SCRIPT
                        WSGI script file to execute and get the application
                        from
  -m MODULE, --module MODULE
                        Python module to import and get the application from

Script or module options:
  These options do not apply when mounting the demo_app

  -a APP, --app APP, --app-name APP
                        Name of the application variable in the script or
                        module, application if none is provided
  -1, --once            Only handles a single request, then exits
  -b, --browse          Launch browser to the served URL
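
The interface in the help text above could be declared with argparse roughly as follows. This is a sketch reconstructed from the help output, not the code in the attached patch; the build_parser name and the variable names are assumptions.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(
        prog="simple_server.py",
        description="Mount and serve a WSGI application. If no script or "
                    "module is specified, mounts "
                    "wsgiref.simple_server.demo_app",
    )
    parser.add_argument("-H", "--host", default="", help="Host to listen on")
    parser.add_argument("-p", "--port", type=int, default=8000,
                        help="Port to listen on (defaults to 8000)")

    # -s and -m are exclusive, so the file vs. module choice stays explicit.
    source = parser.add_mutually_exclusive_group()
    source.add_argument("-s", "--script",
                        help="WSGI script file to execute and get the "
                             "application from")
    source.add_argument("-m", "--module",
                        help="Python module to import and get the "
                             "application from")

    extra = parser.add_argument_group(
        "Script or module options",
        "These options do not apply when mounting the demo_app",
    )
    extra.add_argument("-a", "--app", "--app-name", dest="app",
                       default="application",
                       help="Name of the application variable in the script "
                            "or module, application if none is provided")
    extra.add_argument("-1", "--once", action="store_true",
                       help="Only handles a single request, then exits")
    extra.add_argument("-b", "--browse", action="store_true",
                       help="Launch browser to the served URL")
    return parser


# Example: serve blog.app on port 8080 (the "blog" module is hypothetical).
args = build_parser().parse_args(["-m", "blog", "-a", "app", "-p", "8080"])
print(args.module, args.app, args.port, args.once)
```

Passing both -s and -m makes argparse exit with a usage error, which gives the "one or the other" behaviour without any extra checking code.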

simple_server.patch

Sasha Hart

Apr 2, 2012, 12:54:36 AM

to Masklinn, Web SIG

On Fri, Mar 30, 2012 at 2:20 PM, Masklinn <mask...@masklinn.net> wrote:

On 2012-03-30, at 20:22 , Sasha Hart wrote:
>
> I am finding more reasons to dislike that -m:
>
> python -m wsgiref.simple_server -m blog app
>
> Beyond looking a little stuttery, it's really unclear. Anyone could be
> forgiven for thinking that -m meant the same thing in both cases

And it does. In both cases it means "use this module for execution". Hence
`-m`, as a shorthand for `--module`.

I don't agree; there are real differences. python -m runs e.g. what is in __main__.py, and this is effectively a script: the interface is defined by things like sys.argv, environment variables, exit code, and stdin/stdout. yourtool -m asks a module to put a callable into a namespace, and the name of the callable must be known to decide which callable to use. A simple HTTP server must then run and repeatedly call the callable according to the WSGI convention, not a script interface. Typically there is some kind of event loop, etc. The things 'executed' are different, and they are 'executed' in quite different ways. So while you can use -m (or -xyz or -rm-rf), my opinion is that it's a little more confusing and a little more long-winded than makes sense for a clean one-liner for newbies, partly because I think it suggests a misleading parallel.

But this is just feedback; if you don't like it, then just get more feedback. My vote is for whatever gets consensus as easiest for users who have a use for the thing. If you get a patch in, great. If I don't like it then I will just recommend that (still pretty trivial, conceptually clear) wsgiref recipe or gunicorn or whatever has nailed the extreme simplicity required for this case where you are avoiding a 'real server' deploy.

> so I see no gain of understanding by reusing the convention. python -m doesn't take a
> second positional argument, either.

I'm not sure why that would matter.

Why it could matter (for what it's worth) is that two different calling conventions are more for a newbie to keep straight, for a '-m' which is supposed to mean one thing.

> You can't write '-m blog app -m
> wsgiref.simple_server' or '-m blog -m wsgiref.simple_server blog'

Naturally, because this makes no sense at all, the tool being invoked
to start the server is all of `python -mwsgiref.simple_server`. But
that's the very basics of the -m option.

It is only natural for people who have strong domain knowledge of Python, for whom the proposed feature is redundant. For people who don't care about Python or are just starting, it is not safe to assume detailed knowledge of the -m option to the python executable. Many utilities allow the same flag to be repeated to supply multiple arguments, for example; the interpretation of -m and the rest of the command line here is not unprecedented, but also not just obvious to every bash user. If this is easy, how hard is it to paste in the wsgiref recipe or run gunicorn? Take it or leave it, my input is that this has no reason to exist in stdlib unless it is at least as simple. But as I said before, if you get a patch in then that's great.

I fail to see why that would be any more troubling than
`python -mcalendar -m 6`. Or require any more specifics.

I think 'draw six calendars across' is not at all confusable with 'run the calendar module as a script' because that utility is so simple. The one under discussion is at least somewhat more complex, since you are asking one module to run as a script which imports another module and picks out a callable and starts an HTTP/WSGI server. We can keep track of that without any trouble, but we are also subscribers to Web-SIG.

I will likely never use that calendar one-liner in my life, nor recommend it to any newbie I am trying to help. Whether it is good or bad, it matters very little. If you get a patch into stdlib with really simple syntax then I will evangelize it right after 'hello world'. If not, there are other options which are also okay.

> On reflection, I feel strongly that a module name should be the default
> positional arg, not a filename. I agree with PJ Eby that pointing directly
> at a file encourages script/module confusion. I would add that it
> encourages hardcoding file paths rather than module names, which is brittle
> and not good for the WSGI world (for example, it bypasses virtualenvs and
> breaks any time a different deploy directory structure is used).

Not sure how that makes sense, it uses the Python instance and site-packages
in the virtualenv, there is nothing to bypass.

The problem is that you get scripts coded directly against Z:\WORK\FROB\FROB2\FARPLE.PY instead of correctly searching for farple on PYTHONPATH (thus separating the concerns of how/where to install farple from ones of how to get it). If you write a script which hardcodes module paths, it is asking for exactly the same file regardless of what virtualenv you are in.

I fully understand why the python interpreter takes file arguments. In web apps we are not talking about just running a file through the interpreter any more. WSGI doesn't involve 'running' scripts, but rather importing modules which contain WSGI application callables. Here we are talking about how to tell python to import a file by its filename, when Python itself already provides a clearly better way of finding things. Now I reckon you are seasoned and you know about those subtleties, but if I am just switching from PHP and you teach me to specify this with file paths then that is setting me up for frustration down the line, when I either have to deal with correct imports or I hit the wall of the 'import from py files' approach.
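
To illustrate the point about letting Python locate the module, here is a small sketch using a stdlib module as the stand-in (a real project would use its own module name, such as the hypothetical farple above). The dotted name stays the same everywhere; only the file it resolves to changes with the interpreter or virtualenv in use.

```python
import importlib.util
import sys

# Ask the import system where the dotted name lives for *this* interpreter.
spec = importlib.util.find_spec("wsgiref.simple_server")

print(spec.name)    # the portable part: the dotted module name
print(spec.origin)  # the file it maps to in the current environment
print(sys.prefix)   # changes when a different virtualenv is active
```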

> Of course, this also means no '-m'. Then the typical use case is really just
>
> python -m wsgiref.simple_server blog
>
> A second positional arg is both a new convention and not an explicit one,
> where I would prefer either an existing implicit convention or a new
> explicit one.

What is not explicit in having an explicit argument, that it's a positional
one instead of an option? How is a colon "more explicit"?

Well, that isn't what I said... I said that I would prefer EITHER an existing implicit convention (e.g. the widely used colon, which also made sense to me from the moment I saw it) OR an explicit convention (e.g. not allowing a second positional argument, only a labeled argument if a second argument were needed). The explicit one is wordy but discoverable. The implicit one is clean and allows me to transfer knowledge from myriad other tools using the same convention. The second positional arg has the disadvantages of both: it is not discoverable, as it lacks a label, and it does not benefit from sharing a convention with other tools (the latter has a little relevance to me when I wish to teach others, as well).

This is why I was -1 on the second positional arg, and +1 on EITHER of the other two options, which have different pluses and minuses.

> I think PJ Eby is right that the colon convention is only for modules, and
> I think following gunicorn's lead here would result in a nicer interface
> than forcing (say) --module
>
> python -m wsgiref.simple_server blog:app

The colon is no more explicit than a second positional argument. In fact,
it is significantly less so since it can not be separately and clearly
documented and the one positional parameter needs to document its parsing
rules instead.

I am sure that is slightly easier to write, anyway. This attribution of 'explicit' to the colon convention is probably a misunderstanding; I never said that (although I find it strange to say that the second positional arg is in any way more explicit, particularly when it is inventing a convention for referring to WSGI callables). The virtue of the colon syntax for me is that it fully covers the case which needs to be supported without any --args (simple, clean looking) and while allowing transfer of knowledge between this and just about every other tool out there which names WSGI callables. It's close to the simple Python notation for referring to the object - perhaps it could be better though?
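
For what it's worth, resolving the colon form takes only a few lines. The sketch below assumes a spec of the shape module[:callable] and falls back to "application" when no name is given; resolve_app is a made-up helper name, and this is not gunicorn's actual parsing code.

```python
import importlib


def resolve_app(spec, default_name="application"):
    """Resolve 'package.module[:callable]' to a WSGI callable."""
    module_name, _, attr = spec.partition(":")
    module = importlib.import_module(module_name)
    return getattr(module, attr or default_name)


# Usage with a module that ships with Python.
app = resolve_app("wsgiref.simple_server:demo_app")
print(app.__name__)
```

The part before the colon is an ordinary dotted import path, so it is found the same way any import would be, and the part after it is a plain attribute lookup.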

> If there is a need to point at a filename, I agree that it should be done
> explicitly.
>
> python -m wsgiref.simple_server --file=~/app.py
>
> (or whatever the flag should be called). To me this seems like a small cost
> to allow the colon by default without possibility of confusion or overly
> fancy parsing.

1. It also does not work considering you can't specify the application's
name in that scheme, so piling on yet more complexity would be
required and there would be two completely different schemes for
specifying an application name. I don't find this appealing.

I guess the problem is that you are indirectly specifying the namespace, via a filename to be imported, and then directly specifying the name. All this (including cross-platform issues) is so much easier if you just use Python's conventions for finding and referencing modules and names in namespaces. But for whatever reason, you must import Python modules by filename (something I'll never do again, for reasons already laid out). Yet WSGI requires a name to be specified, so you end up with some kind of awkward hybrid specification. I certainly do not suggest trying to invent a new convention, but if you want to then you just need to pick a portable delimiter for the command line. Maybe a second positional arg or --app= for the cases where you are importing from a file and do not want 'application' and must have a one-liner rather than setting up the project. I have no reason to care because I will never use or teach this case.

I think this is the interesting question: what convention already exists in Python for naming a particular object in an imported module?

2. You seem to have asserted from the start that the default should be
mounting modules, but I have seen no evidence or argument in favor of
that so far.

In a way you do not have any choice but to 'mount modules': WSGI does not provide a mechanism to 'run a program' without picking a callable Python object out of a namespace, implying an import.

My intention was to offer feedback, on the assumption that the idea was being thrashed out publicly. I believe I actually have offered some supporting reasoning for why I like the module path better as a default, even though you reject it. But ultimately, much of this is or verges on bikeshedding. Now our exchange has taken on a negative tone that I would not have chosen. I have written to clarify some misunderstandings, but I will not bother offering further unwanted feedback. I hope you will look for more input and develop a better consensus.

Defaulting to scripts not only works with both local modules and
arbitrary files and follows CPython's (and most tools') own behavior,
but would also allow using -mwsgiref.simple_server as a shebang
line. I find this to have quite a lot of value.

The python interpreter runs scripts, i.e. processes which interface with sys.argv and stuff like that. So of course it needs to take file paths. But imports are not typically done with file paths, and WSGI app-finding is either imperatively about importing or declaratively about Python namespaces, not about file paths or scripts, as in CGI. Python imports are conventionally done by a dotted path like frob.bar rather than a filename, because filenames have problems (I think this is why we don't "import c:\work\frob\farple.py" at the top of our files, or run "python -m /usr/lib/python3.2/wsgiref.py simple_server"). You must also specify a name for the app object, and it happens that the conventions around specifying names in namespaces dovetail pretty closely with the conventions around imports (e.g. from frob import app, import frob.app). Given that the operation required is an import and then a lookup, it seems more natural to me to use Python notation or something trivially related to it, rather than OS-specific filesystem notation. But I am sure this is just another disagreement and that's fine with me.

I would personally not +x a module file just to serve an app with wsgiref from the hashbang line; it's clever but I can't come up with any real benefit. A case where I'm serving with wsgiref already has to be pretty trivial and I'm not going to couple to it *from inside the module itself* when it is so darned easy to just run the module from several nice python test servers (also portable and I can use autoreload, etc.) But if this is desired by many others, I'd agree it's a good factor to consider.

Cheers
Sasha

Graham Dumpleton

Apr 2, 2012, 1:08:06 AM

to Sasha Hart, Web SIG

On 2 April 2012 14:54, Sasha Hart <s...@sashahart.net> wrote:
> I would personally not +x a module file just to serve an app with wsgiref
> from the hashbang line; it's clever but I can't come up with any real
> benefit. A case where I'm serving with wsgiref already has to be pretty
> trivial and I'm not going to couple to it *from inside the module itself*
> when it is so darned easy to just run the module from several nice python
> test servers (also portable and I can use autoreload, etc.) But if this is
> desired by many others, I'd agree it's a good factor to consider.

When using CGI or FASTCGI, with a hosting system where an executable
script needs to be supplied, it is beneficial to be able to say
something like:

#!/usr/bin/env python -m cgi2wsgi

#!/usr/bin/env python -m fcgi2wsgi

where the rest of the script is just the WSGI application.

I have implemented this for CGI as an example at:

https://github.com/GrahamDumpleton/cgi2wsgi

I have done it for FASTCGI using flup as well before but that isn't
available anywhere.

Graham
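
As a rough idea of what such a CGI bridge has to do, here is a sketch built on the stdlib wsgiref.handlers module. It is not the cgi2wsgi implementation linked above; the serve_cgi_request helper and the sample environ values are assumptions, and in-memory streams stand in for the real CGI stdin/stdout so the example can run anywhere.

```python
import io
import wsgiref.handlers


def application(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello\n"]


def serve_cgi_request(app, environ, stdin, stdout, stderr):
    """Run `app` for a single CGI-style request using explicit streams."""
    handler = wsgiref.handlers.BaseCGIHandler(
        stdin, stdout, stderr, environ,
        multithread=False, multiprocess=True,
    )
    handler.run(app)


# Under a real web server, wsgiref.handlers.CGIHandler().run(application)
# would pick up os.environ and sys.stdin/sys.stdout by itself.
out = io.BytesIO()
environ = {
    "REQUEST_METHOD": "GET",
    "SCRIPT_NAME": "",
    "PATH_INFO": "/",
    "SERVER_NAME": "localhost",
    "SERVER_PORT": "80",
    "SERVER_PROTOCOL": "HTTP/1.0",
}
serve_cgi_request(application, environ, io.BytesIO(), out, io.StringIO())
print(out.getvalue().decode("iso-8859-1"))
```

A runner invoked from the #! line would first load the script file to find its application callable and then hand it to a handler in this way.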

Graham Dumpleton

Apr 2, 2012, 1:21:14 AM

to Sasha Hart, Web SIG

On 2 April 2012 15:08, Graham Dumpleton <graham.d...@gmail.com> wrote:
> On 2 April 2012 14:54, Sasha Hart <s...@sashahart.net> wrote:
>> I would personally not +x a module file just to serve an app with wsgiref
>> from the hashbang line; it's clever but I can't come up with any real
>> benefit. A case where I'm serving with wsgiref already has to be pretty
>> trivial and I'm not going to couple to it *from inside the module itself*
>> when it is so darned easy to just run the module from several nice python
>> test servers (also portable and I can use autoreload, etc.) But if this is
>> desired by many others, I'd agree it's a good factor to consider.
>
> When using CGI or FASTCGI, with a hosting system where an executable
> script needs to be supplied, it is beneficial to be able to say
> something like:
>
> #!/usr/bin/env python -m cgi2wsgi
>
> #!/usr/bin/env python -m fcgi2wsgi
>
> where the rest of the script is just the WSGI application.
>
> I have implemented this for CGI as an example at:
>
> https://github.com/GrahamDumpleton/cgi2wsgi
>
> I have done it for FASTCGI using flup as well before but that isn't
> available anywhere.

I should probably add though that this is not the best way it could be
done for FASTCGI. For FASTCGI you are better off using the FASTCGI
implementation's wrapper mechanism as an intermediary, with it handling
the loading. This is the approach that PHP under FASTCGI uses and why it
is so easy for users, namely because system admins set it up with
wrapper support. You don't see such niceties for Python, where a system
admin would set things up so that a .wsgi script file is understood to
be a Python WSGI application with no extra magic needing to be added to
it by the user, even though that is not difficult in principle. That is
why users need to resort to a #! line and a low-level FASTCGI script in
the first place.
