Currently, calling wsgiref.simple_server simply mounts the (bundled)
demo app.
I think that's a bit of a lost opportunity: the community seems to have
mostly standardized on a wsgi script providing an application callable
in its global namespace (though details may differ, mod_wsgi does not
care for the script's name and mandates an `application` callable while
e.g. gunicorn wants a Python module and the callable name must be
configured), and it would be nice if simple_server could take such a
script and mount the application provided:
* This would allow testing that the script has no error without having
to go through mounting it in e.g. mod_wsgi
* It would make trivial/test applications (e.g. dynamic responders to
local JS) simpler to bootstrap as there would be no need for the
half-dozen lines of wsgiref.simple_server bootstrapping and "hard"
dependency on wsgiref,

    from wsgiref.simple_server import make_server

    def application(environ, start_response):
        'code'

    if __name__ == '__main__':
        httpd = make_server('', 8000, application)
        httpd.serve_forever()

could become:

    def application(environ, start_response):
        'code'

Since wsgiref already supports `python -mwsgiref.simple_server`, the
changes would be pretty simple:
* an optional positional argument of the form `script[:app]`, the script
is exec'd, the application (called "application" by default) is
extracted and then mounted in simple_server. If no script is specified,
just mount `demo_app` as before
* Add -H/--host and -p/--port options to set, respectively, the hostname
and the port to bind the server to.
* The current -msimple_server uses `handle_request` and only replies once.
To increase the usability of the CLI tool, use `serve_forever` *when and
only when the mounted application is not demo_app*. In that case, also
avoid opening the hardcoded example URL on launch.
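
A minimal sketch of the behaviour proposed above, not the attached patch itself. The names `load_application` and `serve`, and the `__wsgi_script__` module name given to the script's namespace, are assumptions made for illustration. The split on ':' is deliberately naive and ignores Windows drive letters.

```python
from wsgiref.simple_server import demo_app, make_server


def load_application(spec=None, default_name='application'):
    """Return the WSGI callable named by a `script[:app]` spec.

    With no spec, fall back to wsgiref's bundled demo_app.
    """
    if spec is None:
        return demo_app
    # Naive split on the last ':'; a Windows drive letter would confuse it.
    script, sep, name = spec.rpartition(':')
    if not sep:
        script, name = spec, default_name
    # Execute the script in a fresh namespace and pull the callable out of it.
    namespace = {'__name__': '__wsgi_script__', '__file__': script}
    with open(script) as source:
        exec(compile(source.read(), script, 'exec'), namespace)
    return namespace[name or default_name]


def serve(spec=None, host='', port=8000):
    """Serve demo_app for a single request, anything else forever."""
    app = load_application(spec)
    httpd = make_server(host, port, app)
    if app is demo_app:
        httpd.handle_request()
    else:
        httpd.serve_forever()
```

With something like this, `serve('app.wsgi')` would mount `application` from the script, and `serve('app.wsgi:other')` a callable named `other`.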
This way the current sanity test/"PHPInfo" demo app works as it did before,
but it becomes possible to very easily serve a WSGI script with almost no
overhead in the script itself.
Attachment: patch performing the above-specified alterations, using
argparse for arguments parsing and generation of help.
Apache/mod_wsgi only defaults to 'application', it is configurable.
As for the rest of the proposal, I tried to push the same idea a few
years back with intent that all WSGI servers would provide a similar
mechanism to that you describe as a lowest common denominator, but I
didn't get anywhere at the time.
Although people now perhaps appreciate more that a single approach
would be better, it has ballooned now into a much bigger goal with the
discussions on a common deployment mechanism. For example, a WARP file
(Python WAR file equivalent) being the latest idea. This was touched
on again at Web Summit at PyCon this year.
Graham
> _______________________________________________
> Web-SIG mailing list
> Web...@python.org
> Web SIG: http://www.python.org/sigs/web-sig
> Unsubscribe: http://mail.python.org/mailman/options/web-sig/graham.dumpleton%40gmail.com
>
OK, I didn't see any note on the subject so I didn't think that was the
case. Sorry. That's what my (revised) proposal supports anyway,
defaulting to 'application' but overridable.
Should these options be activated only when not mounting the demo, or
all the time (meaning they'd default to single and open respectively, in
order not to change the current behavior)?
> * Allow use of a module instead of a script to obtain the application
How would the runner make the difference? Do it as Python does, a
positional script file and a -m option for a module?
> * drop the ':' separator syntax, or else use os.pathsep so that it works
> properly on Windows (where ':' can denote a drive letter)
Ah yes, I had not thought of that. This would be annoying, and I don't
think pathsep helps: since the script is an arbitrary file path, it
can itself contain pathseps. A second positional parameter (optional as
well) would avoid that problem, would not be more typing, and would remove
the need for some of the munging in the type callable.
But now that'd look weird with -m module, unless `-m` is not an
option-with-value but a boolean flag saying to treat the first
positional parameter as a module instead of a script.
That could work nicely, it'd depart somewhat from Python's own
semantics but not too much.
On Thu, Mar 29, 2012 at 1:14 PM, Masklinn <mask...@masklinn.net> wrote:
>
> On 2012-03-29, at 19:46 , PJ Eby wrote:
> > * drop the ':' separator syntax, or else use os.pathsep so that it works
> > properly on Windows (where ':' can denote a drive letter)
>
> Ah yes, I had not thought of that, this would be annoying and I don't
> think the pathsep helps, since the script is an arbitrary path file it
> can contain pathseps. A second — optional as well — positional parameter
> would avoid that problem, would not be more typing and would remove the
> need for some of the munging in the type callable.
>
> But now that'd look weird with -m module, unless `-m` is not an
> option-with-value but a boolean flag saying to treat the first
> positional parameter as a module instead of a script.
If the colon can be made safe for Windows it might be desirable, since
it looks fine and reflects existing practice in a number of places.
But aside from that, I really hope os.pathsep won't be used, because
then we will have a format for naming a WSGI app object which not only
differs from established practice, but (worse) depends on the platform
in a non-obvious way (i.e., not just like using os.sep when specifying
paths on a platform-specific file system, which is normal for people who
type out file paths).
As I understand (please correct me if I'm wrong), the big obvious
problem for Windows is that simply splitting on ':' will do something
dumb if the path is something like 'C:\app.py' or
'D:\work\company.apps:foo'. That parsing issue can be worked around
for normal cases, but the ambiguous corner cases like 'c:app' are
confusing (that could be either a file 'app' on the root of the C:
drive, or a callable named 'app' in the module c - unfortunately the \
is not required to specify the root).
So if you want to both keep the colon and want to accept all strings
which DOS would accept before the colon, you must disambiguate which
one you want to parse. But if you can part with the colon OR with the
'c:app' corner case, it can be managed.
Here are some alternative suggestions, in case they are helpful.
(A) accept only one of the two types of input at all (e.g. insist only
on importable module name - it looks like this is what gunicorn does
and I like it a lot - short of having your code installed on system
PYTHONPATH, or using virtualenv PYTHONPATH, you can also do this if
you are in the directory of the script/package)
(B) use --file=c:\app.py or similar to disambiguate what kind of
parsing to do - then the typical case can be the importable and users
who really want to refer to files explicitly can do so
(C) make an executive decision to interpret 'c:app' as object 'app' in
module 'c', require roots to be specified like 'c:\',
and split from the rightmost : not trailed by \. I believe this only
causes a problem in the incredibly specific pathological case where a
DOS-savvy Windows user is trying to serve a file without an extension
directly out of root, while naming root unusually without using the
conventional \, and that rare case would generally just generate a
'huh?' - with the exception that the diabolical user made
single-letter modules with the same letters as drives and dumped them
on PYTHONPATH... I think this would work, just be more complex than
the (A) and (B) solutions, which seem to jibe more with the rest of
Python. But if you want to allow files to be specified freely in the
same spot that modules might be specified...
(D) give up colon: first arg is a filepath-or-pythonmodulepath, and
use something like --name=app (defaults to 'application' if not
specified) to add the app name to look for in the module... I'm not
too keen on this, it is almost as much extra typing as 'pip install'
for a very typical case. The colon is nice because it's one keystroke
and you can expect to use it
(E) give up colon: first arg is a filepath-or-pythonmodulepath, and
optional second arg is what would have come after the colon. I think
this is your suggestion and it's not unreasonable at all. If we are
talking about 'python -m wsgiref.simple_server c:app app' then the
second positional arg is definitely not the weirdest looking part. I
just question whether it's really necessary - why not either drop the
implicit specification of a filename (A/B) or enforce backslash after
root on windows (C)?
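
For what it's worth, the parsing rule in (C) can be sketched in a few lines. This is only an illustration under the assumptions stated in (C); the function name `split_app_spec` and the 'application' default are hypothetical, and forward-slash roots such as 'C:/' are not handled.

```python
def split_app_spec(spec, default_name='application'):
    """Split 'target:name' at the rightmost ':' not followed by a backslash."""
    index = spec.rfind(':')
    # Skip any colon immediately followed by a backslash (a drive root).
    while index != -1 and spec[index + 1:index + 2] == '\\':
        index = spec.rfind(':', 0, index)
    if index == -1:
        # No usable colon: the whole string is the target.
        return spec, default_name
    return spec[:index], spec[index + 1:] or default_name


for example in (r'C:\app.py', r'D:\work\company.apps:foo', 'c:app', 'blog'):
    print(example, '->', split_app_spec(example))
```

Under this rule 'c:app' is always read as object 'app' in module 'c', which is exactly the executive decision described in (C).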
On 2012-03-30, at 04:25 , PJ Eby wrote:
>
>> (B) use --file=c:\app.py or similar to disambiguate what kind of
>> parsing to do - then the typical case can be the importable and users
>> who really want to refer to files explicitly can do so
>
> This is my preference if we do support both - to explicitly separate
> modules from scripts somehow.

Earlier I proposed a -m switch (inspired by Python's). Although the
repetition of -m may be slightly weird, I think it could work nicely.
In either case, the colon separator would be removed and the
application's variable name would be a positional following the
script-or-module.
And it does. In both cases it means "use this module for execution". Hence
`-m`, as a shorthand for `--module`.
> , took the
> same kinds of arguments
It does as well, both take a Python module on the path.
> , could be exchanged for any other -m clause. But
> wsgiref.simple_server is not at all doing what Python is doing
Of course not, if it did the exact same thing it would be redundant.
> so I see no
> gain of understanding by reusing the convention. python -m doesn't take a
> second positional argument, either.
I'm not sure why that would matter.
> You can't write '-m blog app -m
> wsgiref.simple_server' or '-m blog -m wsgiref.simple_server blog'
Naturally, because this makes no sense at all, the tool being invoked
to start the server is all of `python -mwsgiref.simple_server`. But
that's the very basics of the -m option.
> , but you
> have to understand lots of specifics to see why.
What specific do you have to understand beyond the normal behavior of `-m`?
I fail to see why that would be any more troubling than
`python -mcalendar -m 6`. Or require any more specifics.
> On reflection, I feel strongly that a module name should be the default
> positional arg, not a filename. I agree with PJ Eby that pointing directly
> at a file encourages script/module confusion. I would add that it
> encourages hardcoding file paths rather than module names, which is brittle
> and not good for the WSGI world (for example, it bypasses virtualenvs and
> breaks any time a different deploy directory structure is used).
Not sure how that makes sense, it uses the Python instance and site-package
in the virtualenv, there is nothing to bypass.
> Of course,
> this also means no '-m'. Then the typical use case is really just
>
> python -m wsgiref.simple_server blog
>
> A second positional arg is both a new convention and not an explicit one,
> where I would prefer either an existing implicit convention or a new
> explicit one.
What is not explicit in having an explicit argument, that it's a positional
one instead of an option? How is a colon "more explicit"?
> I think PJ Eby is right that the colon convention is only for modules, and
> I think following gunicorn's lead here would result in a nicer interface
> than forcing (say) --module
>
> python -m wsgiref.simple_server blog:app
The colon is no more explicit than a second positional argument. In fact,
it is significantly less so since it can not be separately and clearly
documented and the one positional parameter needs to document its parsing
rules instead.
> If there is a need to point at a filename, I agree that it should be done
> explicitly.
>
> python -m wsgiref.simple_server --file=~/app.py
>
> (or whatever the flag should be called). To me this seems like a small cost
> to allow the colon by default without possibility of confusion or overly
> fancy parsing.
1. It also does not work considering you can't specify the application's
name in that scheme, so piling on yet more complexity would be
required and there would be two completely different schemes for
specifying an application name. I don't find this appealing.
2. You seem to have asserted from the start that the default should be
mounting modules, but I have seen no evidence or argument in favor of
that so far.
Defaulting to scripts not only works with both local modules and
arbitrary files and follows CPython's (and most tools') own behavior,
but would also allow using -mwsgiref.simple_server as a shebang
line. I find this to have quite a lot of value.
I may be dense, but is there actually a use case for using a WSGI
application from a script? Presumably a script that defines a WSGI
application would also run it.
I would recommend going back to the original email of the thread
where I described precisely the issue I have with this. And of course
you could have the exact same objection about a module.
The point is very much not to need the overhead of defining a
(conditional) import and usage of simple_server, and keep the
script/module itself focused on its job of defining the application.
It also frees up the __main__ runner for other behaviors such as
testing or CLI tools if need be.
Some history for you.
Seeing the file containing a WSGI application entry point as a file
rather than a module derives from how Apache works.
Take for example CGI under Apache. One can say for a directory context:

    AddHandler cgi-script .py

What this means is that any file with a .py extension is executed as
a CGI script. Thus, it would have to be an executable file and have an
appropriate #! line which can resolve the Python interpreter to use.
In this case the extension used is actually irrelevant.
When mod_python came along it allowed one instead to say:

    AddHandler python-script .py

The way mod_python then originally worked was that when it resolved a
URL to a directory containing the target .py file, it would add that
directory to sys.path and import the module based on the basename of the
target file. It would then execute the entry point callable within the
loaded module.
No #! line was needed, nor did the file need to be executable. The first
wasn't needed because mod_python dictated what Python version was
used.
The problem with what mod_python did was that the AddHandler can span
multiple directories. As target files in each directory were accessed,
each directory would get added into sys.path to be able to import it.
Because these are normal file system directories and treated as
separate module directories and not part of an overall package
structure, there was nothing to stop you having the same name file in
each directory. It was common for example to have:

    DirectoryIndex index.py

This means that if the directory itself was the target, it would use
the index.py in the directory as the means of generating the directory
index.
If more than one directory was added to sys.path containing an
index.py file, you can only have one loaded as a module, not both.
Thus you ended up with an in memory instance of 'index' module being
used rather than the second one encountered, or depending on sys.path
ordering, you could import the 'index' module from the wrong
directory. Basically, things were a bit unpredictable if you ever used
the same file name more than once.
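
The collision described above is easy to reproduce outside Apache. The following is a rough sketch that only mimics the sys.path approach, it is not mod_python code; the helper names `make_site` and `import_index_from` and the temporary directories are made up for the demonstration.

```python
import importlib
import os
import sys
import tempfile


def make_site(label):
    """Create a throwaway directory holding an index.py that records its label."""
    directory = tempfile.mkdtemp()
    with open(os.path.join(directory, 'index.py'), 'w') as handle:
        handle.write('LABEL = %r\n' % label)
    return directory


def import_index_from(directory):
    """Put the directory on sys.path, then import 'index' by its basename."""
    if directory not in sys.path:
        sys.path.insert(0, directory)
    importlib.invalidate_caches()
    return importlib.import_module('index')


first_dir, second_dir = make_site('first'), make_site('second')
first = import_index_from(first_dir)
second = import_index_from(second_dir)
# Both lookups yield the module object cached in sys.modules under 'index',
# so the second directory's index.py is never loaded.
print(first.LABEL, second.LABEL, first is second)
```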
There were various other things that could go wrong as well.
In later versions of mod_python the whole module importing system was
rewritten to avoid adding directories into sys.path. Instead a custom
module importer was used with special lookup rules to find modules in
directories itself.
Further, when modules were loaded, the __name__ of the module was not
just the basename of the file, but a magic string taking into account
the full file path name. By doing this, even though index.py may occur
in separate places, they would be distinct modules in memory.
The complexity of still allowing relative module imports from the same
directory to simulate things as if directory was in sys.path was
frightening though. Add to that that mod_python had a reloading
mechanism which could look not just at the immediate file, but all sub
modules imported from the directories managed by the mod_python custom
module importer and also trigger a reload when one of the used modules
was changed and not just the top level one.
Now when doing mod_wsgi, a similar method of loading each file
separately with a __name__ based on file system path was used to
ensure each was distinct when same file name used in different
directories.
What mod_wsgi didn't do though was replicate the custom module
importer that mod_python had as that really was a nightmare.
This meant that relative module imports from the same directory would not
work. If someone really wanted that, they would need to add the
directory to sys.path themselves.
Once they did that though, because the target file as loaded by
mod_wsgi had a __name__ which didn't match the basename of the file, then
if someone tried to import that module file back into something else,
you would end up with two copies in memory. The first being the magic
one mod_wsgi loaded as file and the other loaded as module.
To make it more obvious that they were treated a bit differently, and
to avoid people making this mistake, it was promoted to use a .wsgi
extension for the WSGI script file rather than .py. That way people
would not go inadvertently importing it a second time.
Further, because of the way that the .wsgi script file was loaded,
i.e., not as a regular module import, and with __name__ being special,
there were certain things you couldn't do in it.
One example was that you could not put class definitions for objects
in it which you then pickled up instances of. This is because when
unpickling, Python would not know how to automatically import the
module containing the class definitions, because the __name__ would not
have any meaning to the Python process doing the unpickling.
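
The pickle problem can be illustrated with a small sketch. The name scheme (a prefix plus an MD5 digest of the path) and the path `/srv/site/app.wsgi` are assumptions made for the example and are not claimed to match mod_wsgi; removing the entry from sys.modules stands in for "a different process that never loaded the script".

```python
import hashlib
import pickle
import sys
import types

SCRIPT_PATH = '/srv/site/app.wsgi'  # hypothetical path, never opened
SCRIPT_SOURCE = (
    "class Visit(object):\n"
    "    def __init__(self, count):\n"
    "        self.count = count\n"
)

# Hypothetical path-derived name; the real scheme may differ.
magic_name = '_wsgi_script_' + hashlib.md5(SCRIPT_PATH.encode('utf-8')).hexdigest()

# Load the "script" as a module registered only under the magic name.
script = types.ModuleType(magic_name)
exec(compile(SCRIPT_SOURCE, SCRIPT_PATH, 'exec'), script.__dict__)
sys.modules[magic_name] = script

# Pickling works inside the process that loaded the script...
payload = pickle.dumps(script.Visit(3))

# ...but once the magic name is unknown, nothing can import it again.
del sys.modules[magic_name]
try:
    pickle.loads(payload)
    unpickle_error = None
except ImportError as exc:
    unpickle_error = exc
print(type(unpickle_error).__name__)
```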
So that is the history of why there is a distinction between WSGI
script file and module containing a WSGI application in Apache at
least. Although it is called 'script file', it is really only 'file'
as it isn't executable itself in the sense of what people generally
talk about when they say 'script'.
FWIW, in the past when pushing the idea of a WSGI script file being
the lowest common denominator, part of the reason I found I couldn't
get it accepted is that some people simply didn't understand how in
Python to load an arbitrary file by path name and construct a module
for it in memory, with magic __name__. They seemed to think that the
only way to import a code file was for it to have a .py extension and
for the directory to be in sys.path. So ignorance of how to do it
meant I got push back from some people.
:-(
Graham
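
For readers unfamiliar with the technique mentioned above, here is one way to load an arbitrary file by path and build a module for it in memory. It is a sketch only: the function name `load_wsgi_script` and the digest-based naming scheme are assumptions, not what mod_wsgi actually does.

```python
import hashlib
import os
import sys
import tempfile
import types


def load_wsgi_script(path):
    """Build a module from any file path, whatever its extension or location."""
    path = os.path.abspath(path)
    # Hypothetical naming scheme: a prefix plus a digest of the full path,
    # so same-named files in different directories stay distinct.
    name = '_wsgi_script_' + hashlib.md5(path.encode('utf-8')).hexdigest()
    module = types.ModuleType(name)
    module.__file__ = path
    with open(path) as source:
        code = compile(source.read(), path, 'exec')
    sys.modules[name] = module
    exec(code, module.__dict__)
    return module


def write_script(body):
    """Write an index.wsgi into a fresh temporary directory."""
    path = os.path.join(tempfile.mkdtemp(), 'index.wsgi')
    with open(path, 'w') as handle:
        handle.write(body)
    return path


one = load_wsgi_script(write_script(
    "def application(environ, start_response):\n    return [b'one']\n"))
two = load_wsgi_script(write_script(
    "def application(environ, start_response):\n    return [b'two']\n"))
# Same basename, different directories, two distinct modules in memory.
print(one.__name__ != two.__name__)
```

Neither directory is ever added to sys.path, and the .wsgi extension is irrelevant to the loader.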
On 31 March 2012 14:36, PJ Eby <p...@telecommunity.com> wrote:
> Why give them a __name__ at all? Aren't they scripts, rather than modules?
> ISTM that not having a __name__ would also let things like pickles fail
> faster. That is, code that expected a module rather than a script would
> break right away.

Because not having a __name__ attribute at all would make:

    if __name__ == '__main__':
        ...

fail straight away and people quite often had that in scripts so they
could run it directly as well with a pure WSGI server.

>> FWIW, in the past when pushing the idea of a WSGI script file being
>> the lowest common denominator, part of the reason I found I couldn't
>> get it accepted is that some people simply didn't understand how in
>> Python to load an arbitrary file by path name and construct a module
>> for it in memory, with magic __name__. They seemed to think that the
>> only way to import a code file was for it to have a .py extension and
>> for the directory to be in sys.path. So, due to ignorance of the
>> solution as to how to do it meant I got a push back from some people.
>
> Who were you trying to get acceptance from? Web-SIG or Python-Dev?
> Framework devs or end-users? Is there a PEP?

I brought it up on the WEB-SIG. It may have been bad timing amongst
all the other discussions that went around in circles at the time on
the WEB-SIG. Also mentioned it in passing to some WSGI server
developers and other people when discussing web stuff at meet ups or
otherwise.
> On Fri, Mar 30, 2012 at 2:22 PM, Sasha Hart <s...@sashahart.net> wrote:
>
>> I do really like the idea of having a quick WSGI runner in the stdlib,
>>
> Regarding modules vs. files, I don't really care that much which way the
> capability is spelled, as long as the file vs. module distinction is
> explicit. "-m " isn't a lot to add to a command line, and neither is "-f
> ". If there's no consensus, just require that one or the other be
> specified, and inconvenience both groups of people equally. ;-)
Let's go with that. I'm not sure if I should be using sub-commands or
options so for now I've used options. UI changes from initial patch:
* Exclusive options for getting the application callable from a script
(via `exec`) or a module (using `importlib.import_module`)
* Application name specification has been moved to an option (default:
`application`)
* Option to serve the request only once (default: forever) when mounting
a custom application
* Option to open a browser window/tab to the (host, port) specified
(which still default to '' and 8000 respectively)
> python -mwsgiref.simple_server -h
usage: simple_server.py [-h] [-H HOST] [-p PORT] [-s SCRIPT | -m MODULE]
                        [-a APP] [-1] [-b]

Mount and serve a WSGI application. If no script or module is specified,
mounts wsgiref.simple_server.demo_app

optional arguments:
  -h, --help            show this help message and exit
  -H HOST, --host HOST  Host to listen on
  -p PORT, --port PORT  Port to listen on (defaults to 8000)
  -s SCRIPT, --script SCRIPT
                        WSGI script file to execute and get the application
                        from
  -m MODULE, --module MODULE
                        Python module to import and get the application from

Script or module options:
  These options do not apply when mounting the demo_app

  -a APP, --app APP, --app-name APP
                        Name of the application variable in the script or
                        module, application if none is provided
  -1, --once            Only handles a single request, then exits
  -b, --browse          Launch browser to the served URL
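
For anyone who wants to experiment without applying the patch, the interface above could be approximated with argparse roughly as follows. This is a sketch reconstructed from the help output, not the patch's code; `build_parser`, `resolve_application` and the `__wsgi_script__` namespace name are hypothetical.

```python
import argparse
import importlib
from wsgiref.simple_server import demo_app


def build_parser():
    parser = argparse.ArgumentParser(
        prog='simple_server.py',
        description='Mount and serve a WSGI application. If no script or '
                    'module is specified, mounts '
                    'wsgiref.simple_server.demo_app')
    parser.add_argument('-H', '--host', default='', help='Host to listen on')
    parser.add_argument('-p', '--port', type=int, default=8000,
                        help='Port to listen on (defaults to 8000)')
    # -s and -m exclude each other, as in the usage line.
    source = parser.add_mutually_exclusive_group()
    source.add_argument('-s', '--script',
                        help='WSGI script file to execute and get the '
                             'application from')
    source.add_argument('-m', '--module',
                        help='Python module to import and get the '
                             'application from')
    options = parser.add_argument_group(
        'Script or module options',
        'These options do not apply when mounting the demo_app')
    options.add_argument('-a', '--app', '--app-name', dest='app',
                         metavar='APP', default='application',
                         help='Name of the application variable in the '
                              'script or module, application if none is '
                              'provided')
    options.add_argument('-1', '--once', action='store_true',
                         help='Only handles a single request, then exits')
    options.add_argument('-b', '--browse', action='store_true',
                         help='Launch browser to the served URL')
    return parser


def resolve_application(args):
    """Pick the callable from a script (exec), a module (import) or the demo."""
    if args.script:
        namespace = {'__name__': '__wsgi_script__', '__file__': args.script}
        with open(args.script) as source:
            exec(compile(source.read(), args.script, 'exec'), namespace)
        return namespace[args.app]
    if args.module:
        return getattr(importlib.import_module(args.module), args.app)
    return demo_app
```

For instance, `build_parser().parse_args(['-m', 'blog', '-a', 'app'])` would select the `app` callable from an importable `blog` module.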
On 2012-03-30, at 20:22 , Sasha Hart wrote:
>
> I am finding more reasons to dislike that -m:
>
> python -m wsgiref.simple_server -m blog app
>
> Beyond looking a little stuttery, it's really unclear. Anyone could be
> forgiven for thinking that -m meant the same thing in both cases

And it does. In both cases it means "use this module for execution". Hence
`-m`, as a shorthand for `--module`.

> , took the
> same kinds of arguments

It does as well, both take a Python module on the path.

> , could be exchanged for any other -m clause. But
> wsgiref.simple_server is not at all doing what Python is doing

Of course not, if it did the exact same thing it would be redundant.

> so I see no
> gain of understanding by reusing the convention. python -m doesn't take a
> second positional argument, either.

I'm not sure why that would matter.

> You can't write '-m blog app -m
> wsgiref.simple_server' or '-m blog -m wsgiref.simple_server blog'

Naturally, because this makes no sense at all, the tool being invoked
to start the server is all of `python -mwsgiref.simple_server`. But
that's the very basics of the -m option.

> , but you
> have to understand lots of specifics to see why.

What specific do you have to understand beyond the normal behavior of `-m`?
I fail to see why that would be any more troubling than
`python -mcalendar -m 6`. Or require any more specifics.

> On reflection, I feel strongly that a module name should be the default
> positional arg, not a filename. I agree with PJ Eby that pointing directly
> at a file encourages script/module confusion. I would add that it
> encourages hardcoding file paths rather than module names, which is brittle
> and not good for the WSGI world (for example, it bypasses virtualenvs and
> breaks any time a different deploy directory structure is used).

Not sure how that makes sense, it uses the Python instance and site-package
in the virtualenv, there is nothing to bypass.

> Of course, this also means no '-m'. Then the typical use case is really just
>
> python -m wsgiref.simple_server blog
>
> A second positional arg is both a new convention and not an explicit one,
> where I would prefer either an existing implicit convention or a new
> explicit one.

What is not explicit in having an explicit argument, that it's a positional
one instead of an option? How is a colon "more explicit"?

> I think PJ Eby is right that the colon convention is only for modules, and
> I think following gunicorn's lead here would result in a nicer interface
> than forcing (say) --module
>
> python -m wsgiref.simple_server blog:app

The colon is no more explicit than a second positional argument. In fact,
it is significantly less so since it can not be separately and clearly
documented and the one positional parameter needs to document its parsing
rules instead.

> If there is a need to point at a filename, I agree that it should be done
> explicitly.
>
> python -m wsgiref.simple_server --file=~/app.py
>
> (or whatever the flag should be called). To me this seems like a small cost
> to allow the colon by default without possibility of confusion or overly
> fancy parsing.

1. It also does not work considering you can't specify the application's
name in that scheme, so piling on yet more complexity would be
required and there would be two completely different schemes for
specifying an application name. I don't find this appealing.
2. You seem to have asserted from the start that the default should be
mounting modules, but I have seen no evidence or argument in favor of
that so far.
Defaulting to scripts not only works with both local modules and
arbitrary files and follows CPython's (and most tools') own behavior,
but would also allow using -mwsgiref.simple_server as a shebang
line. I find this to have quite a lot of value.
When using CGI or FASTCGI, with a hosting system where an executable
script needs to be supplied, it is beneficial to be able to say
something like:

    #!/usr/bin/env python -m cgi2wsgi
    #!/usr/bin/env python -m fcgi2wsgi

where the rest of the script is just the WSGI application.
I have implemented this for CGI as an example at:
https://github.com/GrahamDumpleton/cgi2wsgi
I have done it for FASTCGI using flup as well before but that isn't
available anywhere.
Graham
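
To make the shebang idea concrete, here is a rough sketch of what such a runner module might do for CGI. It is not the code of cgi2wsgi, whose internals are not shown in this thread; `load_application` and `main` are hypothetical names, and the only library piece relied on is `wsgiref.handlers.CGIHandler`.

```python
from wsgiref.handlers import CGIHandler


def load_application(path, name='application'):
    """Execute the script at `path` and return its WSGI callable."""
    namespace = {'__name__': '__wsgi_script__', '__file__': path}
    with open(path) as source:
        # A leading '#!' line is an ordinary comment to the Python compiler.
        exec(compile(source.read(), path, 'exec'), namespace)
    return namespace[name]


def main(argv):
    """Serve one CGI request using the script named in argv[1]."""
    # When the script is run via its '#!' line, its own path is appended
    # after the interpreter arguments, so the runner finds it in argv[1].
    CGIHandler().run(load_application(argv[1]))
```

A runner module built this way would end with `main(sys.argv)` under the usual `__main__` guard, so that `python -m` on the runner followed by a script path serves that script's `application`.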
I should probably add though that this is not the best way it could be
done for FASTCGI. For FASTCGI you are better off making use of the FASTCGI
implementation's wrapper mechanism as an intermediary, with it handling the
loading. This is the approach that PHP under FASTCGI uses and why it
is so easy for users, namely because system admins set it up with
wrapper support. You don't see such niceties for Python, where a system
admin would set things up so that a .wsgi script file is understood to be a
Python WSGI application with no extra magic needing to be added to it
by the user, even though that is not difficult in principle. That is why
users need to resort to a #! line and a low level FASTCGI script in the
first place.