At PyCon there was an open space about deployment, and the idea of drop-in applications (Java-WAR-style).
I generally get pessimistic about 80% solutions, and dropping in a WAR file feels like an 80% solution to me. I’ve used the Hudson/Jenkins installer (which I think is specifically a project that got WARs on people’s minds), and in a lot of ways that installer is nice, but it’s also kind of wonky: it makes configuration unclear, it’s not always clear when it installs or configures itself through the web and when you have to do that at the system level, and it’s not clear where it puts files and data. So a great initial experience doesn’t feel like a great ongoing experience to me, and it doesn’t have to be that way. If those were necessary compromises, sure, but they aren’t. And because we don’t have WAR files, if we’re proposing to make something new, then we have every opportunity to make things better.
So the question then is what we’re trying to make. To me: we want applications that are easy to install, that are self-describing, self-configuring (or at least guide you through configuration), reliable with respect to their environment (not dependent on system tweaking), upgradable, and respectful of persistence (the data that outlives the application install). A lot of this can be done by the "container" (to use Java parlance; or "environment") — if you just have the app packaged in a nice way, the container (server environment, hosting service, etc) can handle all the system-specific things to make the application actually work.
At which point I am of course reminded of my Silver Lining project, which defines something very much like this. Silver Lining isn’t just an application format, and things aren’t fully extracted along these lines, but it’s pretty close and it addresses a lot of important issues in the lifecycle of an application. To be clear: Silver Lining is an application packaging format, a server configuration library, a cloud server management tool, a persistence management tool, and a tool to manage the application with respect to all these services over time. It is a bunch of things, maybe too many things, so it is not unreasonable to pick out a smaller subset to focus on. Maybe an easy place to start (and good for Silver Lining itself) would be to separate at least the application format (and tools to manage applications in that state, e.g., installing new libraries) from the tools that make use of such applications (deploy, etc).
Some opinions I have on this format, exemplified in Silver Lining:
Things that could be improved:
There is a description of the configuration file for apps. The environment variables are also notably part of the application’s expectations. The file layout is explained (together with a bunch of Silver Lining-specific concepts) in Development Patterns. Besides all that there is admittedly some other stuff that is only really specified in code; but in Silver Lining’s defense, specified in code is better than unspecified ;) App Engine provides another example of an application format, and would be worth using as a point of discussion or contrast (I did that myself when writing Silver Lining).
Discussing WSGI stuff with Ben Bangert at PyCon he noted that he didn’t really feel like the WSGI pieces needed that much more work, or at least that’s not where the interesting work was — the interesting work is in the tooling. An application format could provide a great basis for building this tooling. And I honestly think that the tooling has been held back more by divergent patterns of development than by the difficulty of writing the tools themselves; and a good, general application format could fix that.
cheers
James
--
-- James Mills
--
-- "Problems are solved by method"
_______________________________________________
Web-SIG mailing list
Web...@python.org
Web SIG: http://www.python.org/sigs/web-sig
Unsubscribe: http://mail.python.org/mailman/options/web-sig/python-web-sig-garchive-9074%40googlegroups.com
On 2011-04-10 16:25:21 -0700, James Mills said:
> +1 too. I would however like to see this idea developed in a generic
> and useable way. ie: No zope/twisted deps or making it fit around
> Django :)
> Ideally it should be useable by the most basic (plain old WSGI).
The following are the collected ideas of myself and a few other users
in the WebCore chat room:
https://gist.github.com/911991
Being generic (i.e. using WSGI under-the-hood) and allowing generic
port assignments for other (non-web) networked applications is a design
goal.
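For reference, "plain old WSGI" can be as small as the sketch below. It is an illustration rather than anything taken from the gist: `application`, `call_app` and `serve` are hypothetical names, and the port is assumed to be handed in by whatever container hosts the app.

```python
from wsgiref.simple_server import make_server
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """A plain WSGI callable with no framework dependencies."""
    body = b"Hello from plain old WSGI\n"
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def call_app(app, path="/"):
    """Invoke a WSGI app in-process and return (status, headers, body)."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the required WSGI keys
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], captured["headers"], body


def serve(app, port):
    """Serve the app on a port chosen by the container (assumption)."""
    with make_server("127.0.0.1", port, app) as server:
        server.serve_forever()
```

Because the callable depends only on the WSGI interface, a tool could exercise it in-process with `call_app(application)` before binding any port.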
The aversion to packaged zips is not entirely understandable to us; in
this case, a packaged copy of the application is produced via a
setup.py command, though in theory one could develop with that model
and just zip everything up in the end by hand.
Silver Lining seems to require too much in the way of hacking
(modifying .pth files, etc) to be reasonable.
— Alice.
A couple of comments:
4. It would be nice to also support web applications that provide
their own web server (for whatever reason). chroot/jail them into a
virtualenv of their own (maybe?)
6. It would be nice to also support other standard UNIX-ish logging, e.g. syslog
> Being generic (i.e. using WSGI under-the-hood) and allowing generic port
> assignments for other (non-web) networked applications is a design goal.
Good :)
cheers
James
--
-- James Mills
--
-- "Problems are solved by method"
On 2011-04-10 19:06:52 -0700, Ian Bicking said:
> There's a significant danger that you'll be creating a configuration
> management tool at that point, not simply a web application description.
Unless you have the tooling to manage the applications, there's no
point having a "standard" for them. Part of that tooling will be some
form of configuration management allowing you to determine the
requirements and configuration of an application /prior/ to
installation. Better to have an application rejected up-front ("Hey,
this needs my social insurance number? Hells no!") than after it's
already been extracted and potentially littered the landscape with its
children.
> The escape valve in Silver Lining for these sort of things is services,
> which can kind of implement anything, and presumably ad hoc services
> could be allowed for.
Generic services are useful, but not useful enough.
> You create a build process as part of the deployment (and development
> and everything else), which I think is a bad idea.
Please elaborate. There is no requirement for you to use the
"application packaging format" and associated tools (such as an
application server) during development. In fact, like 2to3, that type
of process would only slow things down to the point of uselessness.
That's not what I'm suggesting at all.
> My model does not use setup.py as the basis for the process (you could
> build a tool that uses setup.py, but it would be more a development
> methodology than a part of the packaging).
I know. And the end result is you may have to massage .pth files
yourself. If a tool requires you to, at any point during "normal
operation", hand modify internal files… that tool has failed at its
job. One does not go mucking about in your Git repo's .git/ folder, as
an example.
How do you build a release and upload it to PyPI? Upload docs to
packages.python.org? setup.py commands. It's a convenient hook with
convenient access to metadata that would make an excellent
"let's make a release!" type of command.
> Also lots of libraries don't work when zipped, and an application is
> typically an aggregate of many libraries, so zipping everything just
> adds a step that probably has to be undone later.
Of course it has to be un-done later. I had thought I had made that
quite clear in the gist. (Core Operation, point 1, possibly others.)
> If a deploy process uses zip file that's fine, but adding zipping to
> deployment processes that don't care for zip files is needless
> overhead. A directory of files is the most general case. It's also
> something a developer can manipulate, so you don't get a mismatch
> between developers of applications and people deploying applications --
> they can use the exact same system and format.
So, how do you push the updated application around? Using a full
directory tree leaves you with Rsync and SFTP, possibly various SCM
methods, but then you'd need a distinct repo (or rootless branch) just
for releasing and you've already mentioned your dislike for SCM-based
deployment models.
Zip files are universal -- to the point that most modern operating
systems treat zip files /as folders/. If you have to, consider it a
transport encoding.
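To make the "transport encoding" reading concrete, here is a rough sketch using only the standard library: the application lives as a directory on both ends, and the zip exists only in transit. The function names and the example file layout are made up for illustration, and empty directories are not preserved by this simple version.

```python
import os
import tempfile
import zipfile


def pack(app_dir, archive_path):
    """Zip a directory tree, storing paths relative to app_dir."""
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for root, _dirs, files in os.walk(app_dir):
            for name in files:
                full = os.path.join(root, name)
                archive.write(full, os.path.relpath(full, app_dir))


def unpack(archive_path, target_dir):
    """Extract the archive back into a plain directory on the server."""
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target_dir)


def demo():
    """Round-trip a tiny hypothetical app and list what arrives."""
    with tempfile.TemporaryDirectory() as work:
        src = os.path.join(work, "myapp")
        os.makedirs(os.path.join(src, "static"))
        with open(os.path.join(src, "app.py"), "w") as handle:
            handle.write("print('hi')\n")
        with open(os.path.join(src, "static", "site.css"), "w") as handle:
            handle.write("body {}\n")

        archive = os.path.join(work, "myapp.zip")
        pack(src, archive)

        deployed = os.path.join(work, "deployed")
        unpack(archive, deployed)
        return sorted(
            os.path.relpath(os.path.join(root, name), deployed)
            for root, _dirs, files in os.walk(deployed)
            for name in files
        )
```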
> The pattern that it implements is fairly simple, and in several models
> you have to lay things out somewhat manually. I think some more
> convention and tool support (e.g., in pip) would be helpful.
+1
> Though there are quite a few details, the result is more reliable,
> stable, and easier to audit than anything based on a build process
> (which any use of "dependencies" would require -- there are *no*
> dependencies in a Silver Lining package, only the files that are *part*
> of the package).
It might be just me (and the other people who seem to enjoy WebCore and
Marrow) but it is fully possible to do install-time dependencies in
such a way as things won't break accidentally. Also, you missed
Application Spec #4.
> Some notes from your link:
>
> - There seems to be both the description of a format, and a program
> based on that format, but it's not entirely clear where the boundary
> is. I think it's useful to think in terms of a format and a reference
> implementation of particular tools that use that format (development
> management tools, like installing into the format; deployment tools;
> testing tools; local serving tools; etc).
Indeed; this gist was some really quickly hacked together ideas.
> - In Silver Lining I felt no need at all for shared libraries. Some
> disk space can be saved with clever management (hard links), but only
> when it's entirely clear that it's just an optimization. Adding a
> concept like "server-packages" adds a lot of operational complexity and
> room for bugs without any real advantages.
±0
> - I try to avoid error conditions in the deployment, which is a big
> part of not having any build process involved, as build processes are a
> source of constant errors -- you can do a stage deployment, then five
> minutes later do a production deployment, and if you have a build
> process there is a significant chance that the two won't match.
I have never, in my life, encountered that particular problem. I may
be more careful than most in defining dependencies with version number
boundaries, I may be more careful in utilizing my own package
repository (vs. the public PyPI), but I don't think I'm unique in
having few to no issues in development/sandbox/production deployment
processes.
Hell, I'm still able to successfully deploy a TurboGears 0.9
application without dependency issues.
However, the package format I describe in that gist does include the
source for the dependencies as "snapshotted" during bundling. If your
application is working in development, after snapshotting it /will/
work on sandbox or production deployments.
On Apr 10, 2011, at 10:29 PM, Alice Bevan–McGregor wrote:
> However, the package format I describe in that gist does include the source for the dependencies as "snapshotted" during bundling. If your application is working in development, after snapshotting it /will/ work on sandbox or production deployments.
I wanted to chime in on this one aspect b/c I think the concept is somewhat flawed. If your application is working in development and you "snapshot" the dependencies, that is no guarantee that things will work in production. The only way to say that a snapshot or bundle is guaranteed to work is if you snapshot the entire system and make it available as a production system.
Using a real world example, say you develop your application on OS X and you deploy on Ubuntu 8.04 LTS. Right away you are dealing with two different operating systems with entirely different system calls. If you use something like lxml and simplejson, you have no choice but to repackage or install from source on the production server. While it is fair to say that generally you could avoid packages that use C, both lxml and simplejson are rather obvious choices for web development. You could use the json module and ElementTree, but if you want more speed (and who doesn't like to go fast!), lxml and simplejson are both better options.
It sounds like Ian doesn't want to have any build steps, which I think is a bad mantra. A build step lets you prepare things for deployment. A deployment package is different from a development package, and mixing the two by forcing builds on the server seems like asking for trouble. I'm not saying this is what you (Alice) are suggesting, but rather pointing out that as a model, depending on virtualenv + pip's bundling capabilities seems slightly flawed.
Personally, and I don't expect folks to take my opinions very seriously b/c I haven't offered any code, what I'd like to see is a simple format that helps install and uninstall web applications. I think it should offer hooks for running tests, learning basic status and allow simple configuration for typical sysadmin needs (logging via syslog, process management, nagios checks, etc.). Instead of focusing on what format that should take in terms of packages, it seems more effective to spend time defining a standard means of managing WSGI apps and piggyback or plain old copy some format like RPMs or dpkg.
Just my .02. Again, I haven't offered code, so feel free to ignore me. But I do hope that if there are others who suspect this model of putting source on the server is a problem, they pipe up. If I were to add a requirement it would be that Python web applications help system administrators become more effective. That means finding consistent ways of deploying apps that play well with other languages / platforms. After all, keeping a C compiler on a public server is rarely a good idea.
Eric
On 2011-04-11 00:53:02 -0700, Eric Larson said:
> Hi,
> On Apr 10, 2011, at 10:29 PM, Alice Bevan–McGregor wrote:
>
>> However, the package format I describe in that gist does include the
>> source for the dependencies as "snapshotted" during bundling. If your
>> application is working in development, after snapshotting it /will/
>> work on sandbox or production deployments.
>
> I wanted to chime in on this one aspect b/c I think the concept is
> somewhat flawed. If your application is working in development and
> you "snapshot" the dependencies, that is no guarantee that things will
> work in production. The only way to say that a snapshot or bundle is
> guaranteed to work is if you snapshot the entire system and make it
> available as a production system.
`pwaf bundle` bundles the source tarballs, effectively, of your
application and dependencies into a single file. Not unlike a certain
feature of pip.
And… wait, am I the only one who uses built-from-snapshot virtual
servers for sandbox and production deployment? I can't be the only one
who likes things to work as expected.
> Using a real world example, say you develop your application on OS X
> and you deploy on Ubuntu 8.04 LTS. Right away you are dealing with two
> different operating systems with entirely different system calls. If
> you use something like lxml and simplejson, you have no choice but to
> repackage or install from source on the production server.
Installing from source is what I was suggesting. Also, Ubuntu on a
server? All your `linux single` (root) are belong to me. ;^P
> While it is fair to say that generally you could avoid packages that
> use C, both lxml and simplejson are rather obvious choices for
> web development.
Except that json is built-in in 2.6 (admittedly with fewer features,
but I've never needed the extras) and there are alternate xml parsers,
too.
> It sounds like Ian doesn't want to have any build steps, which I think
> is a bad mantra. A build step lets you prepare things for deployment. A
> deployment package is different from a development package, and mixing
> the two by forcing builds on the server seems like asking for
> trouble.
I'm having difficulty following this statement: build steps good,
building on server bad? So I take it you know the exact target
architecture and have cross-compilers installed in your development
environment? That's not practical (or simple) at all!
> I'm not saying this is what you (Alice) are suggesting, but rather
> pointing out that as a model, depending on virtualenv + pip's bundling
> capabilities seems slightly flawed.
Virtualenv (or something utilizing a similar Python path 'chrooting'
capability) and pip using the extracted "deps" as the source for
"offline" installation actually seems quite reasonable to me. You get
the benefit of a known set of working packages (i.e. specific version
numbers, tested in development) and the ability to compile C extensions
in-place. (Because sure as hell you can't reliably compile them
before-hand if they have any form of system library dependency!)
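A rough sketch of what such an "offline" install could look like with pip's `--no-index` and `--find-links` options. The `deps` directory and the requirements file are assumptions made for the illustration; this is not a description of what `pwaf bundle` actually produces.

```python
import subprocess
import sys


def offline_install_command(deps_dir, requirements_file, python=sys.executable):
    """Build a pip command that installs only from a local directory of sources."""
    return [
        python, "-m", "pip", "install",
        "--no-index",               # never contact a package index
        "--find-links", deps_dir,   # resolve requirements from deps_dir only
        "-r", requirements_file,
    ]


def offline_install(deps_dir, requirements_file):
    """Run the install in the environment that owns the current interpreter.

    C extensions among the source packages are compiled on this machine,
    so a compiler and the relevant system headers must be present here.
    """
    subprocess.check_call(offline_install_command(deps_dir, requirements_file))
```

Run from inside a virtualenv, `offline_install("deps", "requirements.txt")` would install the pinned set without touching the network.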
> I think it should offer hooks for running tests, learning basic status
> and allow simple configuration for typical sysadmin needs (logging via
> syslog, process management, nagios checks, etc.). Instead of focusing
> on what format that should take in terms of packages, it seems more
> effective to spend time defining a standard means of managing WSGI apps
> and piggyback or plain old copy some format like RPMs or dpkg.
RPMs are terrible, dpkg is terrible. Binary package distribution, in
general, is terrible. I got the distinct impression at PyCon that
binary distributable .eggs were thought of as terrible and should be
phased out.
Also, nobody so far seems to have noticed the centralized logging
management or daemon management lines from my notes.
> Just my .02. Again, I haven't offered code, so feel free to ignore me.
> But I do hope that if there are others who suspect this model of
> putting source on the server is a problem, they pipe up. If I were to
> add a requirement it would be that Python web applications help system
> administrators become more effective. That means finding consistent
> ways of deploying apps that play well with other languages /
> platforms. After all, keeping a C compiler on a public server is rarely
> a good idea.
If you could demonstrate a fool-proof way to install packages with
system library dependencies using cross-compilation from a remote
machine, I'm all ears. ;)
— Alice.
> I use Ubuntu on all my servers, and "linux single" does not work with
> it, I can tell you ;P
The number of poorly configured Ubuntu servers I have seen (and
replaced) is staggering. Any time the barrier to entry is lowered,
quality suffers: having a compiler on the server is nothing compared to
having a complete X graphical environment running as root, with root
and a single user sharing the same password. ;^D
— Alice.
Hello,
I have a few comments:
- That file layout basically forces you to keep your development environment as close as possible to the production environment. This is especially visible if you're relying on Python C extensions. Since you don't want to have the same environment constraints as App Engine, it should be more flexible in this regard and offer a way to generate the project dependencies somewhere other than the developer's machine.
- There's no built-in support for logging configuration.
- The update_fetch feels like a hack, as it's not extensible to cover the lifecycle (hooks for shutdown, start, etc). Also, it shouldn't be an application URL, because you'd want to run a hook before starting the app or after stopping it. I guess you could accomplish that with a WSGI wrapper, but there should be a clear separation between the app and the hooks that manage the app.
- I'm not entirely clear on why you avoid a build process (WAR-like) prior to deployment. It works fine for App Engine, but you don't have its constraints.
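One way to picture the separation asked for in the third point is a small hook registry that lives next to the WSGI app rather than behind one of its URLs. This is only a sketch: the event names and the `Lifecycle` class are invented for illustration and are not part of Silver Lining.

```python
class Lifecycle:
    """Registry of named hooks that a container runs around an application."""

    # Hypothetical event names; a real format would have to standardize these.
    EVENTS = ("before_start", "after_stop", "after_update")

    def __init__(self):
        self._hooks = {event: [] for event in self.EVENTS}

    def on(self, event):
        """Decorator registering a callable for one lifecycle event."""
        if event not in self._hooks:
            raise ValueError("unknown lifecycle event: %r" % (event,))

        def register(func):
            self._hooks[event].append(func)
            return func

        return register

    def run(self, event):
        """Run every hook for the event, in registration order."""
        return [hook() for hook in self._hooks[event]]


lifecycle = Lifecycle()
log = []


@lifecycle.on("before_start")
def migrate_schema():
    log.append("migrate")


@lifecycle.on("after_stop")
def flush_caches():
    log.append("flush")
```

The container would call `lifecycle.run("before_start")` before it begins routing requests to the app, so the hooks never need to be reachable over HTTP.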
On 2011-04-11 00:53:02 -0700, Eric Larson said:Hi,On Apr 10, 2011, at 10:29 PM, Alice Bevan–McGregor wrote:However, the package format I describe in that gist does include the source for the dependencies as "snapshotted" during bundling. If your application is working in development, after snapshotting it /will/ work on sandbox or production deployments.I wanted to chime in on this one aspect b/c I think the concept is somewhat flawed. If your application is working in development and "snapshot" the dependencies that is no guarantee that things will work in production. The only way to say that snapshot or bundle is guaranteed to work is if you snapshot the entire system and make it available as a production system.
`pwaf bundle` bundles the source tarballs, effectively, of your application and dependencies into a single file. Not unlike a certain feature of pip.
And… wait, am I the only one who uses built-from-snapshot virtual servers for sandbox and production deployment? I can't be the only one who likes things to work as expected.Using a real world example, say you develop your application on OS X and you deploy on Ubuntu 8.04 LTS. Right away you are dealing with two different operating systems with entirely different system calls. If you use something like lxml and simplejson, you have no choice but to repackage or install from source on the production server.
Installing from source is what I was suggesting. Also, Ubuntu on a server? All your `linux single` (root) are belong to me. ;^P
While it is fair to say that generally you could avoid packages that don't use C, both lxml and simplejson are rather obvious choices for web development.
Except that json is built-in in 2.6 (admittedly with fewer features, but I've never needed the extras) and there are alternate xml parsers, too.
It sounds like Ian doesn't want to have any build steps which I think is a bad mantra. A build step lets you prepare things for deployment. A deployment package is different than a development package and mixing the two by forcing builds on the server or seems like asking for trouble.
I'm having difficulty following this statement: build steps good, building on server bad? So I take it you know the exact target architecture and have cross-compilers installed in your development environment? That's not practical (or simple) at all!
I'm not saying this is what you (Alice) are suggesting, but rather pointing out that as a model, depending on virtualenv + pip's bundling capabilities seems slightly flawed.
Virtualenv (or something utilizing a similar Python path 'chrooting' capability) and pip using the extracted "deps" as the source for "offline" installation actually seems quite reasonable to me. The benefit of a known set of working packages (i.e. specific version numbers, tested in development) and the ability to compile C extensions in-place. (Because sure as hell you can't reliably compile them before-hand if they have any form of system library dependency!)
I think it should offer hooks for running tests, learning basic status and allow simple configuration for typical sysadmin needs (logging via syslog, process management, nagios checks, etc.). Instead of focusing on what format that should take in terms of packages, it seems more effective to spend time defining a standard means of managing WSGI apps and piggyback or plain old copy some format like RPMs or dpkg.
RPMs are terrible, dpkg is terrible. Binary package distribution, in general, is terrible. I got the distinct impression at PyCon that binary distributable .eggs were thought of as terrible and should be phased out.
Also, nobody so far seems to have noticed the centralized logging management or deamon management lines from my notes.
Just my .02. Again, I haven't offered code, so feel free to ignore me. But I do hope that others who suspect this model of putting source on the server is a problem will pipe up. If I were to add a requirement it would be that Python web applications help system administrators become more effective. That means finding consistent ways of deploying apps that play well with other languages / platforms. After all, keeping a C compiler on a public server is rarely a good idea.
If you could demonstrate a fool-proof way to install packages with system library dependencies using cross-compilation from a remote machine, I'm all ears. ;)
— Alice.
_______________________________________________
Web-SIG mailing list
Web...@python.org
Web SIG: http://www.python.org/sigs/web-sig
We have more than 3 implementations of this idea (call it the Python Web Application Package and Format, or WAPAF), including Java's WAR files, Google App Engine, and silverlining. Let's review the WAR file, approximately:
(static files, .jsp)
WEB-INF/web.xml
WEB-INF/classes/org/example/myapplication.class
WEB-INF/lib/some-library.jar
WEB-INF/lib/145-other-libraries.jar
Build the .war file, copy to server, done (ideally). Your program should require a standard Java installation plus whatever's in the .war file. The .war file is a .zip that follows certain conventions.
In practice you might develop in and deploy exploded .war files which are exactly the same thing but unzipped.
Since it's Java there is no classes/SQLAlchemy/src/sqlalchemy/__init__.py; the path for the code always starts at classes/, not at some arbitrary set of subdirectories under classes/
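To make the "exploded versus zipped" point concrete, here is a rough Python sketch of packing an exploded application directory into an archive whose paths all start at the application root. The function name pack_exploded is made up for illustration; only the standard os and zipfile modules are used, and nothing here is part of any existing spec.

```python
import os
import zipfile


def pack_exploded(app_dir, archive_path):
    """Zip an exploded application directory, keeping paths relative to its root.

    Hypothetical helper: archive_path should live outside app_dir so the
    archive does not end up containing itself.
    """
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for folder, _subdirs, files in os.walk(app_dir):
            for name in sorted(files):
                full_path = os.path.join(folder, name)
                # Store "WEB-INF/web.xml", not an absolute path from the build machine.
                relative = os.path.relpath(full_path, app_dir)
                archive.write(full_path, relative.replace(os.sep, "/"))
    return archive_path
```

Unzipping the result gives back the same tree, which is why a tool that accepts the archive can just as easily accept the directory.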
installation. Better to have an application rejected up-front ("Hey,
this needs my social insurance number? Hells no!") than after it's
already been extracted and potentially littered the landscape with its
children.
Part of the potential win here is that the application need not litter anything. Like GAE, the server might keep all the previous versions you've uploaded and let you pick which one you want today. You shouldn't have to think about the state of the server.
> My model does not use setup.py as the basis for the process (you could
> build a tool that uses setup.py, but it would be more a development
> methodology than a part of the packaging).
I know. And the end result is you may have to massage .pth files
yourself. If a tool requires you to, at any point during "normal
operation", hand modify internal files… that tool has failed at its
job. One does not go mucking about in your Git repo's .git/ folder, as
an example.
If I read the silverlining documentation correctly the .pth is created manually in the example only because there was no 'setup.py' to 'pip install -e'. As an alternative the spec could only add particular directories to PYTHONPATH. This might be a distutils2 thing.
How do you build a release and upload it to PyPI? Upload docs to
packages.python.org? setup.py commands. It's a convenient hook with
access to metadata in a convenient way that would make an excellent
"let's make a release!" type of command.
setup.py should go away. The distutils2 talk from PyCon 2011 explains. http://blip.tv/file/4880990
On 2011-04-11 15:22:11 -0700, Ian Bicking said:
> I... think we are misunderstanding each other or something.
Something. ;)
> A nice tool that could use this format, for instance, would be a tool
> that takes an app and creates a puppet recipe to set up a server to host
> the application. A different tool (maybe better, maybe not?) would be
> a puppet plugin (if that's the terminology) that uses this format to
> tell puppet about all the requirements an application has, perhaps
> translating some notions to puppet-native concepts, or adding
> high-level recipes that set up an appropriate container (which can be as
> simple as a properly configured Nginx or Apache server).
Minuteman (loved the hat from the PyCon lightning talk), buildout,
puppet, make, bash, custom XML-RPC APIs, … there are quite a number of
ways to push something into production. Standardizing on one would
marginalize the idea, and being agnostic means there is a whole /lot/
of work to be done to add support to every tool. :/
> What I mean when I say there's a danger of becoming a configuration
> management tool, is that if you include hooks for the application to
> configure its environment you are probably stepping on the toes of
> whatever other tool you might use. And once you start down that path
> things tend to cascade.
Have a gander at the Application Spec section; what, specifically, are
you at odds with as coming from the application? I work with
specifics, not vague "don't do that!" comments.
The configuration of environment extends to:
:: static resource declaration, because a tool that manages server
configuration can do a better job 'mounting' those resources.
:: services (in your parlance, 'resources' in mine) such as "give me an
sql database".
:: recurrent tasks (a la cron) because having that centralized across
multiple applications Isn't Just a Good Idea™ -- treat this as a
'service' if you must.
> If you include something in the packaging format that indicates the
> libraries to be installed, then you are encouraging and perhaps
> requiring that the server install libraries during a deployment.
Libraries that are __bundled with the application__. I fail to see the
'badness' of this, or, really, how this differs from Silver Lining.
I'd double-check this, but cloudsilverlining.org is inaccessible from
my current location for some reason. :/
> Realistically this can't be entirely avoided, but I think it is a
> pretty workable separation to declare only those dependencies that
> can't reasonably be included directly in the application itself (e.g.,
> lxml, MySQLdb, git, and so on). In Silver Lining those dependencies
> were expressed as Debian package names, installed via dpkg, but for a
> more general system it would need to be somewhat more abstract.
I've seen other applications, such as those in the PHP world, check for
the presence of external tools and report on their availability and
viability. Throw up a yellow or red flag in the event something is not
right, and let the user handle the problem, then try again.
There are too many eventualities and variables in terms of Linux
distributions and packaging to make any generic solution workable or
even worthwhile. At least, until we have high-order AI replacing
sysadmins.
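A minimal sketch of that "check and raise a flag" approach, assuming the application simply lists the external tools it needs. The function names and the "green" flag are my own invention for illustration; shutil.which is the only real API used, and it is passed in as a parameter so the check can be exercised without touching the system.

```python
import shutil


def check_tools(required, optional=(), which=shutil.which):
    """Return {tool: flag} without installing anything.

    'red' means a required tool is missing, 'yellow' means an optional
    tool is missing, 'green' (an assumed extra state) means it was found.
    """
    report = {}
    for tool in required:
        report[tool] = "green" if which(tool) else "red"
    for tool in optional:
        report[tool] = "green" if which(tool) else "yellow"
    return report


def can_install(report):
    """Installation may proceed only when nothing is flagged red."""
    return "red" not in report.values()
```

The administrator sees the report, fixes whatever is red with the package manager of their choice, and tries again.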
> OK; then #4 is the only thing I would choose to support, as it is
> the most general and easiest for tools to support, and least likely to
> lead to different behavior with different tools. And not to just defer
> to authority, but having written a half dozen tools in this area, not
> all of them successful, I feel strongly that including dependencies is
> best -- simplest for both producer and consumer, and most reliable.
Thank you for reading what I wrote.
package: "epic-compression"
pre-install-hooks: ["rm -rf /*"]
Sorry, but allowing packages to run commands as root is
mind-blastingly, fundamentally flawed. You mention an inability to
roll back or upgrade? The above would be worse in that department.
> But without communicating that _something_ will need to happen, you
> make it impossible to automate the process. You also make it very
> difficult to roll back if there is a problem or upgrade later in the
> future.
Really, in what way?
> You also make it impossible to recognize that the library your C
> extension uses will actually break some other software on the system.
LD_PATH.
> Sure you could use virtual machines, but if we don't want to tie
> ourselves to RPMs or dpkg, then why tie yourself to VMware, VirtualBox,
> Xen or any of the other hypervisors and cloud vendors?
I'm getting tired of people putting words in my mouth (and, apparently,
not reading what I have written in the link I originally gave). Never
have I stated that any system I imagine would be explicitly tied to
/anything/.
— Alice.
> (I'm confused; I just noticed there's a
> web...@python.org and
> python-...@googlegroups.com?)
I only see one actual gmane group, gmane.comp.python.web...
pre-install-hooks: [
  "apt-get install libxml2", # the person deploying the package assumes apt-get is available
  "run-some-shell-script.sh", # the shell script might do the following on a list of URLs
  "wget http://mydomain.com/canonical/repo/dependency.tar.gz && tar zxf dependency.tar.gz && rm dependency.tar.gz"
]
Does that make some sense? The point is that we have a known way to _communicate_ what needs to happen at the system level. I agree that there isn't a fool proof way.
Let me rephrase a few things.
On 2011-04-11 17:48:14 -0700, Eric Larson said:
> pre-install-hooks: [
> "apt-get install libxml2", # the person deploying the package
> assumes apt-get is available
Assumptions are evil. You could end up with multiple third-party
applications each assuming different things. Aptitude, apt-get, brew,
emerge, ports, …
> "run-some-shell-script.sh", # the shell script might do the following
> on a list of URLs
There is zero way of tracking what that does, so out of the gate that's
a no-no, and full system chroots (not what I'm talking about in terms
of chroot) require far too much organization/duplication/management.
The 'hooks' idea listed in my original document is for callbacks into
the application. That callback would be one of:
:: A Python script to execute. (path notation)
:: A Python callable to execute. (dot-colon notation)
:: A URL within the application to GET. (url notation)
Arbitrary system-level commands are right out: Linux, UNIX, BSD,
Windows, Solaris… good luck getting even simple commands to execute
identically and predictably across platforms. The goal isn't to
rewrite buildout!
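To illustrate the three callback notations, here is a rough sketch of how an application server might tell them apart and resolve the dot-colon form. The rules used to distinguish the notations (a leading "/" means a URL inside the application, a ":" means a callable, anything else is a relative script path) are assumptions of this sketch, not something defined in the original document, and the example hook strings are hypothetical.

```python
import importlib


def classify_hook(spec):
    """Guess which of the three notations a hook string uses (assumed rules)."""
    if spec.startswith("/"):
        return "url"        # URL within the application, e.g. "/_hooks/upgrade"
    if ":" in spec:
        return "callable"   # dot-colon notation, e.g. "myapp.hooks:upgrade"
    return "script"         # path notation, e.g. "scripts/upgrade.py"


def load_callable(spec):
    """Resolve 'package.module:attribute' to the object it names."""
    module_name, _, attribute = spec.partition(":")
    target = importlib.import_module(module_name)
    for part in attribute.split("."):
        target = getattr(target, part)
    return target
```

In every case the server calls back into the application; it never runs a system-level command on the application's behalf.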
> Just b/c a command like apt-get is used it doesn't mean it is used as
> root. The point is not that you can install things via the package, but
> rather that you provide the system a way to install things as needed
> that the system can control.
A methodology of testing for the presence and capability of specific
services (resources) is far more useful than rewriting buildout. "I
need an SQL database of some kind." "I need this C library within
these version boundaries." Etc. Those are reasonable predicates for
installation. You can combine this application format with buildout,
puppet, or brew-likes if you want to, though.
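As a sketch of what a "within these version boundaries" predicate could look like, assuming versions are plain dotted integers. The function names are hypothetical, and how the installed version of a library gets discovered is deliberately left out, since that part is platform-specific.

```python
def parse_version(text):
    """Turn '2.7.8' into (2, 7, 8); only plain dotted integers are handled."""
    return tuple(int(part) for part in text.split("."))


def within_bounds(found, minimum=None, maximum=None):
    """True if minimum <= found < maximum, treating a missing bound as open."""
    version = parse_version(found)
    if minimum is not None and version < parse_version(minimum):
        return False
    if maximum is not None and version >= parse_version(maximum):
        return False
    return True
```

The predicate only answers yes or no; what to do about a "no" stays with the administrator or whatever tool they already use.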
Personally, I'd rather not re-invent the wheel of a Linux distribution,
thanks. I wouldn't even want an application server to touch
system-wide configurations other than web server configurations for the
applications hosted therein.
> If you start telling the system what is supported then as a spec you
> have to support too many actions:
>
> pre-install-hooks: [
> ('install', ['libxml2', 'libxslt']),
> ('download', 'foo-library.tar.gz'),
> ('extract', 'foo-library.tar.gz'),
> ...
> # the idea being
> ($action, $args)
> ]
I define no actions, only a callback.
> This is a pain in the neck as a protocol.
Unfortunately for your argument this is a protocol you invented, not
one that I defined.
> It is much simpler to have a list of "pre-install-hooks" and let the
> hosting system that is installing the package deal with those. If your
> system wants to run commands, you have the ability to do so. If you
> want to list package names that you install, go for it. If you have a
> tool that you want to use that the package can provide arguments, that
> is fine too. From the standpoint of a spec / API / package format, you
> don't really control the tool that acts on the package.
Bing. You finally understand what I defined.
> This is the same problem that setuptools has. There isn't a record of
> what was installed.
That's a tool-level problem unrelated to application packaging. For a
good example of a Python application that /does/ manage packages, file
tracking, etc. have a look at Gentoo's Portage system.
> It is safe to assume a deployed server has some software installed
> (nginx, postgres, wget, vim, etc.) and those requirements should
> usually be defined by some system administrator.
No application honestly cares what front-end web server it is running
on unless it makes extensive use of very specific plugins (like Nginx's
push notification service). Again, most of this is outside the scope
of an application container format. Do your applications honestly need
access to vim?
Also, assume nothing.
> When an application requires that you install some library, it is
> helpful to that sysadmin because that person has some options when
> something is meant to be deployed:
>
> 1. If the library is incompatible and will break some other piece of
> software, you can know and stop the deployment right there
That's what the "sandbox" is for. I've been running Gentoo servers
with 'slotting' mechanisms for > 10 years, now, and having multiple
installed libraries that are incompatible with one-another is not
unusual, unheard of, or difficult. (Three versions of PHP, three of
Python, etc.)
> 2. If the application is going to be moved to another server, the
> sysadmin can go ahead and add that app's requirements to their own
> config (puppet class for example)
Puppet, buildout, etc. is, again, outside the scope. And if the
application already defines requirements, what config file are you
updating and duplicating the data needlessly within?
> 3. If two applications are running on the same machine, they may have
> inconsistent library requirements
That's what the "sandbox" is for.
> 4. If an application does fail and you need to roll back to a previous
> version, you can also roll back the system library that was installed
> with the application
That's what the "sandbox" is for.
> Yes you can use different LD_PATHS for your sandboxed environment, but
> that is going to be up to the system administrator. By simply listing
> those dependencies you can let them keep their system according to
> their requirements.
See my above note on detecting vs. installing.
> You never once said anything about virtual machines either. I feel that
> it is a natural progression though when you define a package that has
> an impact on the system requirements since if your application needs
> some library to run and you are under the assumption you have a
> "sandbox", then you might as well install things systemwide, which is a
> perfectly valid model when you have a cloud infrastructure or
> hypervisor.
You assume a natural progression where one does not exist. System
packaging and virtual machines aren't even remotely related to
each-other; this is all needless rhetoric.
These applications do /not/ have an impact on the underlying system
because they are, by definition, in isolated sandboxes.
> It just shouldn't be the assumption of the package format.
A sandbox isn't an assumption, it's a requirement. Very different beasts.
> Likewise, I sincerely hope that we can define a format that could make
> deployment easy for everyone involved. I'm convinced the deployment
> pain is really just a matter of incorrect assumptions between sysadmin
> and developers. This kind of format seems like an excellent place to
> put application assumptions and state requirements so the sysadmin side
> can easily handle them in a way that works within their constraints.
+1, but executing arbitrary commands (root or otherwise) is /not/ the
way to do it. Executing package managers directly is /not/ the way to
do it. Having a clear collection of predicates (app±version,
lib±version, pkg±version, etc.) is The Right™ way to do it.
If you want a specific version of Apache to go with your application,
or a brand new MySQL installation, use buildout. An application
server's role is to mediate between these services and the installed
application, and let the sysadmin do his job.
— Alice.
As an aside, who here doesn't run their production software on a
homogenous hosting environment? Having unorganized servers of any kind
will lead to Bad Stuff™ eventually.
Mine is Gentoo + Nginx + FastCGI PHP 4 & 5 + Python 2.6, 2.7, 3.1 +
[MySQL + MongoDB, db servers only] + dcron + metalog + reiserfs + … all
kept up-to-date and in sync across all servers… hell, I even have
"application" configurations in Nginx which are generic and reusable,
and shared between servers.
> While initially reluctant to use zip files, after further discussion
> and thought they seem fine to me, so long as any tool that takes a zip
> file can also take a directory. The reverse might not be true -- for
> instance, I'd like a way to install or update a library for (and
> inside) an application, but I doubt I would make pip rewrite zip files
> to do this ;) But it could certainly work on directories. Supporting
> both isn't a big deal except that you can't do symlinks in a zip file.
I'm not talking about using zip files as per eggs, where the code is
maintained within the zip file during execution. It is merely a
packaging format with the software itself extracted from the zip during
installation / upgrade. A transitory container format. (Folders in
the end.)
Symlinks are an OS-specific feature, so those are out as a core
requirement. ;)
> I don't think we're talking about something like a buildout recipe.
> Well, Eric kind of brought something like that up... but otherwise I
> think the consensus is in that direction.
Ambiguous statements FTW, but I think I know what you meant. ;)
> So specifically if you need something like lxml the application
> specifies that somehow, but doesn't specify *how* that library is
> acquired. There is some disagreement on whether this is generally
> true, or only true for libraries that are not portable.
+1
I think something along the lines of autoconf (those lovely ./configure
scripts you run when building GNU-style software from source) with
published base 'checkers' (predicates as I referred to them previously)
would be great. A clear way for an application to declare a
dependency, have the application server check those dependencies, then
notify the administrator installing the package.
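For Python-level dependencies such as lxml, one possible base 'checker' is sketched below. It only reports what is missing and says nothing about how to acquire it. The function name is made up, and the sketch assumes the application declares top-level module names.

```python
import importlib.util


def missing_modules(declared):
    """Return the declared top-level module names that cannot be found here.

    Nothing is imported or installed; find_spec() only looks the name up.
    """
    return [name for name in declared if importlib.util.find_spec(name) is None]
```

An application server could run this before installation and show the resulting list to the administrator.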
I've seen several Python libraries that include the C library code that
they expose; while not so terribly efficient (i.e. you can't install
the C library once, then share it amongst venvs), it is effective for
small packages.
Larger (i.e. global or application-local) would require the
intervention of a systems administrator.
> Something like a database takes this a bit further. We haven't really
> discussed it, but I think this is where it gets interesting. Silver
> Lining has one model for this. The general rule in Silver Lining is
> that you can't have anything with persistence without asking for it as
> a service, including an area to write files (except temporary files?)
+1
Databases are slightly more difficult; an application could ask for:
:: (Very Generic) A PEP-249 database connection.
:: (Generic) A relational database connection string.
:: (Specific) A connection string to a specific vendor of database.
:: (Odd) A NoSQL database connection string.
I've been making heavy use of MongoDB over the last year and a half,
but AFAIK each NoSQL database engine does its own thing API-wise. (Then
there are ORMs on top of that, but passing a connection string like
mysql://user:pass@host/db or mongo://host/db is pretty universal.)
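Connection strings of that shape can be taken apart with the standard library, which is part of why they travel well between tools. A minimal sketch follows; the dictionary keys are my own naming, not an agreed format.

```python
from urllib.parse import urlsplit


def parse_connection(url):
    """Split a connection string like mysql://user:pass@host/db into its parts."""
    parts = urlsplit(url)
    return {
        "engine": parts.scheme,
        "user": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": parts.port,
        # Drop the leading "/" of the path; an empty path means no database was named.
        "database": parts.path.lstrip("/") or None,
    }
```

The application server can hand the whole string to the application, or pick out the engine to decide which kind of database to provision.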
It is my intention to write an application server that is capable of
creating and securing databases on-the-fly. This would require fairly
high-level privileges in the database engine, but would result in far
more "plug-and-play" configuration. Obviously when deleting an
application you will have the opportunity to delete the database and
associated user.
> I assume everyone agrees that an application can't write to its own
> files (but of course it could execfile something in another location).
+1; that _almost_ goes without saying. :) At the same time, an
application server /must not/ require root access to do its work, thus
no mandating of (real) chroots, on-the-fly user creation, etc.
There are ways around almost all security policies, but where possible
setting the read-only flag (Windows) or removing write (chmod -w on
POSIX systems) should be enough to prevent casual abuse.
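A small sketch of the "remove write" step, assuming the application has been extracted into its own folder. The function name is hypothetical. On Windows, os.chmod can only toggle the read-only flag, so the effect there is coarser than on POSIX systems.

```python
import os
import stat


def make_read_only(root):
    """Clear every write bit on the files and folders under root, and on root itself."""
    write_bits = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH
    # Walk bottom-up so each folder is handled after its contents.
    for folder, _subdirs, files in os.walk(root, topdown=False):
        for name in files:
            path = os.path.join(folder, name)
            os.chmod(path, stat.S_IMODE(os.stat(path).st_mode) & ~write_bits)
        os.chmod(folder, stat.S_IMODE(os.stat(folder).st_mode) & ~write_bits)
```

This needs no root access, only ownership of the files, which fits the requirement above.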
> I suspect there's some disagreement about how the Python environment
> gets setup, specifically sys.path and any other application-specific
> customizations (e.g., I've set environ['DJANGO_SETTINGS_MODULE'] in
> silvercustomize.py, and find it helpful).
Similar to Paste's "here" variable for INI files, having some method of
the application defining environment variables with base path
references would be needed.
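Something like the following sketch is what I have in mind, borrowing the %(here)s spelling from Paste's INI files. The function name and the variable names in the usage below are hypothetical, and a value containing a literal percent sign would need escaping as %%.

```python
import os


def expand_environment(declared, here):
    """Fill in %(here)s in each declared value with the application's base path."""
    base = os.path.abspath(here)
    return {name: value % {"here": base} for name, value in declared.items()}
```

For example, an application extracted to /srv/apps/myapp that declares UPLOAD_DIR as "%(here)s/uploads" would see that variable expanded against its own base path, while a value like DJANGO_SETTINGS_MODULE with no reference in it passes through unchanged.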
I've tossed out my idea of sharing dependencies, BTW, so a simple
extraction of the zipped application into one package folder (linked in
using a .pth file) with the dependencies installed into an app-packages
folder in the path (like site-packages) would be ideal. At least, for
me. ;)
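Roughly, the linking step could look like the sketch below. The app-packages folder name comes from the description above; the function name and the application.pth file name are made up. It relies on site.addsitedir, which adds a directory to sys.path and processes any .pth files found in it.

```python
import os
import site
import sys


def link_application(env_dir, app_dir):
    """Write a .pth file into env_dir/app-packages that points at app_dir, then activate it."""
    packages_dir = os.path.join(env_dir, "app-packages")
    os.makedirs(packages_dir, exist_ok=True)
    with open(os.path.join(packages_dir, "application.pth"), "w") as pth:
        pth.write(os.path.abspath(app_dir) + "\n")
    # addsitedir() appends packages_dir to sys.path and reads the .pth files in it,
    # so the application's package folder ends up importable as well.
    site.addsitedir(packages_dir)
    return packages_dir
```

The tool writes the .pth file itself, so nobody has to massage it by hand.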
> Describing the scope of this, it seems kind of boring. In, for
> example, App Engine you do all your setup in your runner -- I find this
> deeply annoying because it makes the runner the only entry point, and
> thus makes testing, scripts, etc. hard.
I agree; that's a short-sighted approach to an application container
format. There should be some way to advertise a test suite and, for
example, have the suite run before installation or during upgrade.
(Rolling back the upgrade process thus far if there is a failure.)
My shiny end goal would be a form of continuous deployment: a git-based
application which gets a post-commit notification, pulls the latest,
runs the tests, rolls back on failure or fully deploys the update on
success.
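The control flow of that idea, stripped down to a sketch: the state dictionary is a stand-in for whatever really records the active version (a symlink, a config entry), and run_tests is any callable supplied by the server that returns True when the advertised suite passes. Both names are hypothetical.

```python
def deploy(candidate, state, run_tests):
    """Activate candidate only if its tests pass; otherwise keep the current version."""
    previous = state.get("active")
    state["active"] = candidate          # tentatively switch over
    try:
        passed = bool(run_tests(candidate))
    except Exception:
        passed = False                   # a crashing suite counts as a failure
    if not passed:
        state["active"] = previous       # roll back to what was running before
    return passed
```

A post-commit notification would then boil down to pulling the latest code and calling deploy() with it.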
> We would start with just WSGI. Other things could follow, but I don't
> see any reason to worry about that now. Maybe we should just punt on
> aggregate applications now too. I don't feel like there's anything we
> would do that would prevent other kinds of runtime models (besides the
> starting point, container-controlled WSGI), and the places to add
> support for new things are obvious enough (e.g., something like Silver
> Lining's platform setting). I would define a server with accompanying
> daemon processes as an "aggregate".
Since in my model the application server does not proxy requests to the
instantiated applications (each running in its own process), I'm not
sure I'm interpreting what you mean by an aggregate application
properly.
If "my" application server managed Nginx or Apache configurations,
dispatch to applications based on base path would be very easy to do
while still keeping the applications isolated.
> An important distinction to make, I believe, is application concerns
> and deployment concerns. For instance, what you do with logging is a
> deployment concern. Generating logging messages is of course an
> application concern. In practice these are often conflated, especially
> in the case of bespoke applications where the only person deploying the
> application is the person (or team) developing the application. It
> shouldn't be annoying for these users, though. Maybe it makes sense
> for people to be able to include tool-specific default settings in an
> application -- things that could be overridden, but especially for the
> case when the application is not widely reused it could be useful. (An
> example where Silver Lining gets it all backwards is I created a
> [production] section in app.ini when the very concept of "production"
> is not meaningful in that context -- but these kinds of named profiles
> would make sense for actual application deployment tools.)
Having an application define default logging levels for different
scopes would be very useful. The application server could take those
defaults, and allow an administrator to modify them or define
additional scopes quite easily.
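As a rough sketch of that idea, assuming "scopes" map onto standard logging logger names and that both the application and the administrator supply a plain name-to-level mapping (the function name is made up for illustration):

```python
import logging


def apply_log_levels(app_defaults, admin_overrides=None):
    """Merge per-scope log levels and apply them; return the merged mapping.

    Both arguments map logger names ("scopes") to level names such as
    "INFO". Administrator settings win over the application's defaults,
    and the administrator may add scopes the application never mentioned.
    """
    merged = dict(app_defaults)
    merged.update(admin_overrides or {})
    for scope, level in merged.items():
        logging.getLogger(scope).setLevel(level)
    return merged
```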
> There's actually a kind of layered way of thinking of this:
>
> 1. The first, maybe most important part, is how you get a proper Python
> environment. That includes sys.path of course, with all the
> accompanying libraries, but it also includes environment description.
Virtualenv-like, with the application itself linked in via a .pth file
(a la setup.py develop, allowing inline upgrades via SCM) and
dependencies extracted from the zip distributable into an app-packages
folder a la site-packages.
I don't install global Python modules on any of my servers, so the
--no-site-packages option is somewhat unnecessary for me, but having
something similar would be useful, too. Unfortunately, that one
feature seems to require a lot of additional work.
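For illustration, the .pth linking described above can be sketched with the standard site module. The file name "application.pth" and the function name are arbitrary choices for this sketch, and env_dir stands in for the app-packages folder:

```python
import os
import site


def link_application(env_dir, app_source_dir):
    """Write a .pth file in env_dir pointing at app_source_dir and activate it.

    env_dir plays the role of the "app-packages" folder; the application
    source stays where it is (e.g. an SCM checkout) and is only linked in.
    """
    pth_path = os.path.join(env_dir, "application.pth")
    with open(pth_path, "w") as handle:
        handle.write(os.path.abspath(app_source_dir) + "\n")
    # addsitedir() puts env_dir on sys.path and processes the .pth files
    # it contains, which in turn adds app_source_dir to sys.path.
    site.addsitedir(env_dir)
    return pth_path
```

Because only a path is recorded, updating the checkout in place changes what gets imported without reinstalling anything.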
> In Silver Lining there's two stages -- first, set some environmental
> variables (both general ones like $SILVER_CANONICAL_HOST and
> service-specific ones like $CONFIG_MYSQL_DBNAME), then get sys.path
> proper, then import silvercustomize by which an environment can do any
> more customization it wants (e.g., set $DJANGO_SETTINGS_MODULE)
Environment variables are typeless (raw strings) and thus less than
optimal for sharing rich configurations.
Host names depend on how the application is mounted, and a single
application may be mounted to multiple domains or paths, so utilizing
the front end web server's rewriting capability is probably the best
solution for that.
What about multiple database connections? Environment variables are
also not so good for repeated values.
A /few/ environment variables are a good idea, though:
:: TMPDIR — when don't you need temporary files?
:: APP_CONFIG_PATH — the path to a YAML file containing the real configuration.
The configuration file would even include a dict-based logging
configuration routing all messages to the parent app server for final
delivery, removing the need for per-app logging files, etc.
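A minimal sketch of what the application side of that might look like. Assumptions: the file is JSON rather than YAML (only so the example needs nothing outside the standard library), it has an optional "logging" key in dictConfig format, and the function name is invented for illustration. Routing records to the parent app server is left out; that would be a handler named inside the logging section.

```python
import json
import logging.config
import os


def load_app_config(environ=os.environ):
    """Read the file named by APP_CONFIG_PATH and configure logging from it.

    Assumption: the file holds a JSON object, optionally with a "logging"
    key containing a dictConfig-style mapping. Unlike environment
    variables, values keep their types and lists can repeat (e.g. several
    database connections).
    """
    with open(environ["APP_CONFIG_PATH"]) as handle:
        config = json.load(handle)
    if "logging" in config:
        logging.config.dictConfig(config["logging"])
    return config
```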
> 2. Define some basic generic metadata. "app_name" being the most obvious one.
The standard Python setup metadata is pretty good:
:: Application title.
:: Application (package) name.
:: Short description.
:: Long description / documentation.
:: Author information.
:: License.
:: Source information (URL, download URL).
:: Dependencies.
:: Entry point-style hooks. (Post-install, pre/post upgrade,
pre-removal, etc.)
Likely others.
> 3. Define how to get the WSGI app. This is WSGI specific, but (1) is
> *not* WSGI specific (it's only Python specific, and would apply well to
> other platforms)
I could imagine there would be multiple "application types":
:: WSGI application. Define a package dot-notation entry point to a
WSGI application factory.
:: Networked daemon. This would allow deployment of Twisted services,
for example. Define a package dot-notation entry point to the 'main'
callable.
Again, there are likely others, but those are the big two. In both of
these cases the configuration (loaded automatically) could be passed as
a dict to the callable.
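Roughly, the container side of that could look like the sketch below. The "package.module:callable" spelling with a colon is an assumption made for this example, as is the function name; the same loader would serve both the WSGI factory and the daemon 'main' case.

```python
import importlib


def load_application(target, config):
    """Resolve "package.module:factory" and call the factory with config.

    The part before the colon is imported as a module; the part after it
    is looked up attribute by attribute, then called with the config dict.
    """
    module_name, _, attr_path = target.partition(":")
    if not attr_path:
        raise ValueError("expected 'package.module:callable', got %r" % target)
    obj = importlib.import_module(module_name)
    for name in attr_path.split("."):
        obj = getattr(obj, name)
    return obj(config)
```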
> 4. Define some *web specific* metadata, like static files to serve.
> This isn't necessarily WSGI or even Python specific (not that we should
> bend backwards to be agnostic -- but in practice I think we'd have to
> bend backwards to make it Python-specific).
Explicitly defining the paths to static files is not just a good idea,
it's The Slaw™.
> 5. Define some lifecycle metadata, like update_fetch. These are
> generally commands to invoke. IMHO these can be ad hoc, but exist in
> the scope of (1) and a full "environment". So it's not radically
> different than anything else the app does, it's just we declare
> specific times these actions happen.
Script name, dot-notation callable, or URL. I see those as the 'big
three' to support. Using a dot-notation callable has the same benefit
as my comments to #3.
The URL would be relative to wherever the application is mounted within
a domain, of course.
> 6. Define services (or "resources" or whatever -- the name "resource"
> doesn't make as much sense to me, but that's bike shedding). These are
> things the app can't provide for itself, but requires (or perhaps only
> wants; e.g., an app might be able to use SQLite, but could also use
> PostgreSQL). While the list of services will increase over time,
> without a basic list most apps can't run at all. We also need a core
> set as a kind of reference implementation of what a fully-specified
> service *is*.
I touched on this up above; any DBAPI compliant database or various
configuration strings. (I'd implement this as a string-like object
with accessor properties so you can pass it to SQLAlchemy straight, or
dissect it to do something custom.)
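A quick sketch of such a string-like object, using Python 3's urllib.parse. The class and property names are illustrative only, not an existing API:

```python
from urllib.parse import urlsplit


class ConnectionString(str):
    """A str subclass that also exposes the parts of a database URL.

    It can be handed to anything expecting a plain connection string,
    or taken apart through the properties below.
    """

    @property
    def _parts(self):
        return urlsplit(str(self))

    @property
    def scheme(self):
        return self._parts.scheme

    @property
    def username(self):
        return self._parts.username

    @property
    def password(self):
        return self._parts.password

    @property
    def host(self):
        return self._parts.hostname

    @property
    def port(self):
        return self._parts.port

    @property
    def database(self):
        # The path component is "/dbname"; strip the leading slash.
        return self._parts.path.lstrip("/")
```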
More below.
> 7. In Silver Lining I've distinguished active services (like a running
> database) from passive resources (like an installed binary library). I
> don't see a reason to conflate these, as they are so very different.
> Maybe this is part of why "resource" strikes me as an odd name for
> something like a database.
You hit the terminology perfectly: active services (such as databases)
are just that, services. Installed binary libraries are resources. :)
> So... there's kind of some thoughts about process.
Good stuff.
— Alice.
_______________________________________________
Web-SIG mailing list
Web...@python.org
Web SIG: http://www.python.org/sigs/web-sig
Unsubscribe: http://mail.python.org/mailman/options/web-sig/python-web-sig-garchive-9074%40googlegroups.com
Why can't it be a path to a WSGI script file? This actually works more
universally, as it also works for servers which map URLs to file-based
resources. It also allows extensions other than .py and lets the
basename of the file be arbitrary, both of which help with those same
servers. And it allows WSGI script files with the same name to exist in
multiple locations managed by the same server without having to create
an overarching package structure with __init__.py files everywhere.
For WSGI servers which currently require a dotted path, eg gunicorn:
gunicorn myapp
Then it changes to also allow:
gunicorn --script myapp.wsgi
The server just has to construct a new Python module with a __name__
which relates to the absolute file system path and exec code within
that context to create the module itself. Nothing too difficult.
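A minimal sketch of that loading step. The module naming scheme (a hash of the absolute path) and the function name are assumptions for illustration, not how any particular server does it:

```python
import hashlib
import os
import types


def load_wsgi_script(path, callable_name="application"):
    """Execute a WSGI script file as a fresh module and return its callable.

    The module name is derived from the absolute path, so two scripts with
    the same basename in different directories do not collide. Neither the
    current working directory nor sys.path is involved.
    """
    path = os.path.abspath(path)
    digest = hashlib.sha256(path.encode("utf-8")).hexdigest()
    module = types.ModuleType("_wsgi_script_" + digest)
    module.__file__ = path
    with open(path) as handle:
        code = compile(handle.read(), path, "exec")
    exec(code, module.__dict__)
    return getattr(module, callable_name)
```

The file name needs no .py extension and may contain characters such as dashes, since it is never imported by name.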
Because the WSGI script file is identified by explicit filesystem
path, you don't have to worry about what current working directory is
or otherwise set sys.path to allow it to be imported initially. The
WSGI script file then can itself even be responsible for further setup
of sys.path as appropriate and so be more self contained and not
dependent on an external launch system.
I have also always seen it as a PITA that for various WSGI servers you
always had to do:
python myapp.py
and at the end of myapp.py add boilerplate like:
from wsgiref.simple_server import make_server
httpd = make_server('', 8000, application)
print("Serving on port 8000...")
httpd.serve_forever()
Use a different server which required such boilerplate and you had to change it.
Even where WSGI servers allowed you to specify a Python module as a
command line argument, the options all differed and you also needed to
know where the WSGI server was installed to run it.
Using a WSGI script file as the lowest common denominator, it would
also be nice to be able to do something like:
python -m gunicorn.server myapp.wsgi
python -m wsgiref.server myapp.wsgi
Ie., use the '-m' option for Python command line to have the installed
module act as the processor for the WSGI script file, thereby avoiding
the need to modify the script. This lowest common denominator option
could handle a few common options which all servers would need to
accept such as listener host, port and perhaps even concepts of
processes/threads.
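To sketch what such a lowest-common-denominator runner might look like on top of wsgiref: the option names (--host, --port, --callable) and the function names below are assumptions for illustration, and no such module ships with wsgiref or gunicorn today.

```python
import argparse
from wsgiref.simple_server import make_server


def parse_args(argv=None):
    """Parse the small set of options every runner would share."""
    parser = argparse.ArgumentParser(description="Serve a WSGI script file.")
    parser.add_argument("script", help="path to the WSGI script file")
    parser.add_argument("--host", default="localhost")
    parser.add_argument("--port", type=int, default=8000)
    parser.add_argument("--callable", dest="callable_name",
                        default="application")
    return parser.parse_args(argv)


def load_script(path, callable_name):
    """Execute the script file in a fresh namespace and return its callable."""
    namespace = {"__file__": path, "__name__": "__wsgi_script__"}
    with open(path) as handle:
        exec(compile(handle.read(), path, "exec"), namespace)
    return namespace[callable_name]


def main(argv=None):
    """Entry point: load the script and serve it until interrupted."""
    options = parse_args(argv)
    app = load_script(options.script, options.callable_name)
    httpd = make_server(options.host, options.port, app)
    print("Serving %s on %s:%d..." % (options.script, options.host,
                                      options.port))
    httpd.serve_forever()
```

A server package would call main() from its own __main__.py so that "python -m <package> myapp.wsgi --port 8080" works, leaving the script file itself free of any server-specific boilerplate.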
If you really wanted to tie the script to a particular method, but
still make it easy to use something else instead, then do it with a #!
line.
#!/usr/bin/env python -m gunicorn -- --host localhost --port 8000
with the rest of the file being the normal WSGI script file contents,
without any special __main__ section as that is handled by the #!
line.
FWIW, I did bring this up a couple of years back, but then there was
little interest back then in trying to standardise deployment setup so
there was some measure of commonality between WSGI servers.
Graham
> On 14 April 2011 16:57, Alice Bevan–McGregor <al...@gothcandy.com> wrote:
>>> 3. Define how to get the WSGI app. This is WSGI specific, but (1) is
>>> *not* WSGI specific (it's only Python specific, and would apply well to
>>> other platforms)
>>
>> I could imagine there would be multiple "application types":
>>
>> :: WSGI application. Define a package dot-notation entry point to a WSGI
>> application factory.
>
> Why can't it be a path to a WSGI script file. This actually works more
> universally as it works for servers which map URLs to file based
> resources as well. Also allows alternate extensions than .py and also
> allows basename of file name to be arbitrarily named, both of which
> help with those same servers which map URLs to file base resources. It
> also allows same name WSGI script file to exist in multiple locations
> managed by same server without having to create an overarching package
> structure with __init__.py files everywhere.
>
+1 for this
uWSGI started with module-based configuration only (like gunicorn), but
I added support for wsgi-file as soon as I realized that the file-based approach is a lot more useful/handy
(no need to mess with the PYTHONPATH or add __init__.py files all over the place, as Graham said).
Pinax (as an example) has a deploy/pinax.wsgi file that you can use as an entry point for your app independently of your filesystem/PYTHONPATH choices.
It has worked 100% of the time and without pain for users (at least for my company, where we host hundreds of WSGI apps). I cannot say the same for the module approach
(yes, a lot of users are not very comfortable with PYTHONPATH/sys.path... probably they should change jobs, but why ruin their lives when we already have a solution
that has been working for years :P )
--
Roberto De Ioris
http://unbit.it
I suspect you're thinking a little too low-level.
On 2011-04-14 00:53:09 -0700, Graham Dumpleton said:
> On 14 April 2011 16:57, Alice Bevan–McGregor
> <al...@gothcandy.com> wrote:
>>> 3. Define how to get the WSGI app. This is WSGI specific, but (1) is
>>> *not* WSGI specific (it's only Python specific, and would apply well to
>>> other platforms)
>>
>> I could imagine there would be multiple "application types":
>>
>> :: WSGI application. Define a package dot-notation entry point to a
>> WSGI application factory.
>
> Why can't it be a path to a WSGI script file?
No reason it couldn't be.
app.type = wsgi
app.target = /myapp.wsgi:application
(Paths relative to the folder the application is installed into, and
dots after a slash are filename parts, not module separators.)
But then, how do you configure it? Using a factory (which is passed
the from-appserver configuration) makes a lot of sense.
> This actually works more universally as it works for servers which map
> URLs to file based
> resources as well.
First, .wsgi files (after a few quick Google searches) are only used by
mod_wsgi. I wouldn't call that "universal", unless you can point out
the other major web servers that support that format.
You'll have to describe the "map URLs to file based resources" issue,
since every web server I've ever encountered (Apache, Nginx, Lighttpd,
etc.) works that way. Only if someone is willing to get really hokey
with the system described thus far would any application-scope web
servers be running.
> Also allows alternate extensions than .py and also allows basename of
> file name to be arbitrarily named, both of which help with those same
> servers which map URLs to file base resources.
Again, you'll have to elaborate or at least point to some existing
documentation on this.
I've never encountered a problem with that, nor do any of my scripts
end in .py.
> It also allows same name WSGI script file to exist in multiple
> locations managed by same server without having to create an
> overarching package structure with __init__.py files everywhere.
Packages aren't a bad thing. In fact, as described so far, a top level
package is required.
> For WSGI servers which currently require a dotted path, eg gunicorn:
See my note above; choice of Python-level HTTP interface is not up to
the application, though by all means there should be some simple way to
"launch" a development server.
> The WSGI script file then can itself even be responsible for further
> setup of sys.path as appropriate and so be more self contained and not
> dependent on an external launch system.
The -point- (AFAIK/IMHO) is to be dependent on an external launch system.
> and at the end of myapp.py add boilerplate like:
>
> from wsgiref.simple_server import make_server
>
> httpd = make_server('', 8000, application)
> print("Serving on port 8000...")
> httpd.serve_forever()
Again, I've never described anything that would require that nonsense.
WSGI callable, preferably a factory callable, that's it.
> Use a different server which required such boilerplate and you had to
> change it.
Not the problem of the application.
> Using a WSGI script file as the lowest common denominator, it would
> also be nice to be able to do something like:
>
> python -m gunicorn.server myapp.wsgi
> python -m wsgiref.server myapp.wsgi
Not a half bad idea, but again, no reason to restrict it to .wsgi
files. (That's also a completely different problem than the
"application format" currently under discussion.)
I've written and rewritten my dot-colon-notation system enough that it
supports:
:: /path[/sub[...]][:object[.property]] (even if it has to execfile it)
:: package[.module[...]][/folder[...]][:object[.property]]
I think that syntax pretty much covers everything, including .wsgi
files (/path/to/foo.wsgi:application). The implementation of the above
is fully unit tested, and I really don't mind people stealing it. ;)
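For readers who want a feel for the notation, here is a much-simplified sketch of splitting such a target into its parts. It is not the implementation mentioned above; the function name and the returned tuple shape are invented for illustration, and it only separates the location from the attribute path without importing or executing anything.

```python
def parse_target(target):
    """Split a dot-colon target into (kind, location, attribute_path).

    kind is "path" when the target starts with a slash (a file to
    execute), otherwise "package" (something to import). attribute_path
    is the list of names after the colon, possibly empty. Splitting on
    the first colon means paths that themselves contain a colon are not
    handled by this sketch.
    """
    location, _, attrs = target.partition(":")
    kind = "path" if location.startswith("/") else "package"
    return kind, location, attrs.split(".") if attrs else []
```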
— Alice.
Exactly, I am trying to walk before running. Things always fall down
here because people try and take too large a leap rather than an
incremental approach, solving one small problem at a time.
Thus please don't think that because I am replying to your message
that I am specifically commenting about your plans. See this as a side
comment and don't try and evaluate it only in the context of your
ideas.
> On 2011-04-14 00:53:09 -0700, Graham Dumpleton said:
>
>> On 14 April 2011 16:57, Alice Bevan–McGregor <al...@gothcandy.com> wrote:
>>>>
>>>> 3. Define how to get the WSGI app. This is WSGI specific, but (1) is
>>>> *not* WSGI specific (it's only Python specific, and would apply well to
>>>> other platforms)
>>>
>>> I could imagine there would be multiple "application types":
>>>
>>> :: WSGI application. Define a package dot-notation entry point to a WSGI
>>> application factory.
>>
>> Why can't it be a path to a WSGI script file?
>
> No reason it couldn't be.
>
> app.type = wsgi
> app.target = /myapp.wsgi:application
>
> (Paths relative to the folder the application is installed into, and dots
> after a slash are filename parts, not module separators.)
>
> But then, how do you configure it? Using a factory (which is passed the
> from-appserver configuration) makes a lot of sense.
>
>> This actually works more universally as it works for servers which map
>> URLs to file based
>> resources as well.
>
> First, .wsgi files (after a few quick Google searches) are only used by
> mod_wsgi. I wouldn't call that "universal", unless you can point out the
> other major web servers that support that format.
The WSGI module for nginx used them, as does uWSGI, and either Phusion
Passenger or the new Mongrel WSGI support relies on a script file.
You also have CGI, FASTCGI, SCGI and AJP using script files.
Don't get hung up on the extension of .wsgi, it is the concept of a
script file which is stored in the file system in an arbitrary
location to which a URL maps.
> You'll have to describe the "map URLs to file based resources" issue, since
> every web server I've ever encountered (Apache, Nginx, Lighttpd, etc.) works
> that way.
Which supports what I am saying, but you for some reason decided to
focus on '.wsgi' as an extension which wasn't the point.
> Only if someone is willing to get really hokey with the system
> described thus far would any application-scope web servers be running.
Forget for a moment trying to tie this to your larger designs and see
it as more of a basic underlying concept. Ie., the baby step before
you try and run.
>> Also allows alternate extensions than .py and also allows basename of file
>> name to be arbitrarily named, both of which help with those same servers
>> which map URLs to file base resources.
>
> Again, you'll have to elaborate or at least point to some existing
> documentation on this.
>
> I've never encountered a problem with that, nor do any of my scripts end in
> .py.
Lack of an extension is fine if you have configured Apache with a
dedicated cgi-bin or fastcgi-bin directory where an extension is
irrelevant because you have:
SetHandler cgi-script
But many Apache server configurations use:
AddHandler cgi-script .py
Ie., handler dispatch is based off extension, the .py extension quite
often being associated with CGI script execution.
You often see:
AddHandler fcgid-script .fcgid
Which says certain resource is to be started up as FASTCGI process.
For both these it expects those scripts to be self contained programs
which fire up the mechanics of interfacing with CGI or FASTCGI
protocols.
This means that you usually have to stick that boilerplate at the end
of the script.
This is where FASTCGI deployment usually sucks badly, though. It is
put on the user to get the boilerplate and the remainder of the WSGI
script perfect from the outset. If you don't, because FASTCGI
technically doesn't allow for stdout/stderr at the point of startup, an
error on import is lost and the user has no idea. So many times you
see people whinging about setting up stuff on the likes of DreamHost
because of FASTCGI being a pain like this.
In the PHP world they don't have to deal with this boilerplate
nonsense. Instead there is a PHP launcher script associated with
FASTCGI module. So you have:
AddHandler fcgid-script .php
but also a mapping in the FASTCGI module configuration that says rather
than execute the .php script, it runs the launcher script instead. That way
the launcher script can get everything set up properly to then call
into the script.
Nothing exists for Python like that, but if it did then it would make no
sense to use .py because of the mapping that extension often already
has in Apache. In that case you would have the .wsgi script file mapped to
FASTCGI but FASTCGI configured to run a WSGI launcher. That launcher
script would setup stdout/stderr, ensure flup is loaded properly and
only then load the WSGI script file and execute. This way the system
administrators could ensure the launcher is working and users only
have to worry about dumping a WSGI script file with right extension in
a directory and it will work without all the pain. Also allows the
system admins to properly control number of processes/threads whereas
at present users can override what system admins would like to
restrict them to.
So, the concept of a script file simply works better with Apache and to
some degree other servers. This is because of how such servers
determine what handler to use from the extension.
As to the file name, you can't stop people using arbitrary stuff in
file names, ie., dashes as a prime example. So when using servers
which map URLs to file system resources you have to deal with it.
>> It also allows same name WSGI script file to exist in multiple locations
>> managed by same server without having to create an overarching package
>> structure with __init__.py files everywhere.
>
> Packages aren't a bad thing. In fact, as described so far, a top level
> package is required.
You are thinking ahead to your bigger ideas. That isn't what I am
talking about. When a web server maps URLs to resources within a
hierarchical directory structure, you can't have that structure be a
package with __init__.py files in each directory. It just doesn't work,
as all the scripts could be totally unrelated and not part of one
application.
Graham
I want to give a big +1 for Graham's suggestion. Using a script is a great way to make the communication between the larger system and the application trivial. The larger system needs to know how to run the app. If it is a script then you just run the script. There should still be some information regarding apache/nginx config if necessary, but basing that on the expectation there is a single script is a better approach than presuming a config can provide enough information to eventually create some script that apache/nginx/etc. might need to use.
Eric
Thanks.
--------------------------------------
Randy Syring
Intelicom
Direct: 502-276-0459
Office: 502-212-9913
For the wages of sin is death, but the
free gift of God is eternal life in
Christ Jesus our Lord (Rom 6:23)
On 2011-04-13 18:16:36 -0700, Ian Bicking said:I'm not talking about using zip files as per eggs, where the code is maintained within the zip file during execution. It is merely a packaging format with the software itself extracted from the zip during installation / upgrade. A transitory container format. (Folders in the end.)
While initially reluctant to use zip files, after further discussion and thought they seem fine to me, so long as any tool that takes a zip file can also take a directory. The reverse might not be true -- for instance, I'd like a way to install or update a library for (and inside) an application, but I doubt I would make pip rewrite zip files to do this ;) But it could certainly work on directories. Supporting both isn't a big deal except that you can't do symlinks in a zip file.
Symlinks are an OS-specific feature, so those are out as a core requirement. ;)Ambiguous statements FTW, but I think I know what you meant. ;)
I don't think we're talking about something like a buildout recipe. Well, Eric kind of brought something like that up... but otherwise I think the consensus is in that direction.
+1
So specifically if you need something like lxml the application specifies that somehow, but doesn't specify *how* that library is acquired. There is some disagreement on whether this is generally true, or only true for libraries that are not portable.
I think something along the lines of autoconf (those lovely ./configure scripts you run when building GNU-style software from source) with published base 'checkers' (predicates as I referred to them previously) would be great. A clear way for an application to declare a dependency, have the application server check those dependencies, then notify the administrator installing the package.
I've seen several Python libraries that include the C library code that they expose; while not so terribly efficient (i.e. you can't install the C library once, then share it amongst venvs), it is effective for small packages.
Larger (i.e. global or application-local) would require the intervention of a systems administrator.+1
Something like a database takes this a bit further. We haven't really discussed it, but I think this is where it gets interesting. Silver Lining has one model for this. The general rule in Silver Lining is that you can't have anything with persistence without asking for it as a service, including an area to write files (except temporary files?)
Databases are slightly more difficult; an application could ask for:
:: (Very Generic) A PEP-249 database connection.
:: (Generic) A relational database connection string.
:: (Specific) A connection string to a specific vendor of database.
:: (Odd) A NoSQL database connection string.
I've been making heavy use of MongoDB over the last year and a half, but AFIK each NoSQL database engine does its own thing API-wise. (Then there are ORMs on top of that, but passing a connection string like mysql://user:pass@host/db or mongo://host/db is pretty universal.)
It is my intention to write an application server that is capable of creating and securing databases on-the-fly. This would require fairly high-level privileges in the database engine, but would result in far more "plug-and-play" configuration. Obviously when deleting an application you will have the opportunity to delete the database and associated user.
Similar to Paste's "here" variable for INI files, having some method of the application defining environment variables with base path references would be needed.
I suspect there's some disagreement about how the Python environment gets setup, specifically sys.path and any other application-specific customizations (e.g., I've set environ['DJANGO_SETTINGS_MODULE'] in silvercustomize.py, and find it helpful).
I've tossed out my idea of sharing dependencies, BTW, so a simple extraction of the zipped application into one package folder (linked in using a .pth file) with the dependencies installed into an app-packages folder in the path (like site-packages) would be ideal. At least, for me. ;)I agree; that's a short-sighted approach to an application container format. There should be some way to advertise a test suite and, for example, have the suite run before installation or during upgrade. (Rolling back the upgrade process thus far if there is a failure.)
Describing the scope of this, it seems kind of boring. In, for example, App Engine you do all your setup in your runner -- I find this deeply annoying because it makes the runner the only entry point, and thus makes testing, scripts, etc. hard.
My shiny end goal would be a form of continuous deployment: a git-based application which gets a post-commit notification, pulls the latest, runs the tests, rolls back on failure or fully deploys the update on success.Since in my model the application server does not proxy requests to the instantiated applications (each running in its own process), I'm not sure I'm interpreting what you mean by an aggregate application properly.
We would start with just WSGI. Other things could follow, but I don't see any reason to worry about that now. Maybe we should just punt on aggregate applications now too. I don't feel like there's anything we would do that would prevent other kinds of runtime models (besides the starting point, container-controlled WSGI), and the places to add support for new things are obvious enough (e.g., something like Silver Lining's platform setting). I would define a server with accompanying daemon processes as an "aggregate".
If "my" application server managed Nginx or Apache configurations, dispatch to applications based on base path would be very easy to do while still keeping the applications isolated.
Having an application define default logging levels for different scopes would be very useful. The application server could take those defaults, and allow an administrator to modify them or define additional scopes quite easily.
An important distinction to make, I believe, is application concerns and deployment concerns. For instance, what you do with logging is a deployment concern. Generating logging messages is of course an application concern. In practice these are often conflated, especially in the case of bespoke applications where the only person deploying the application is the person (or team) developing the application. It shouldn't be annoying for these users, though. Maybe it makes sense for people to be able to include tool-specific default settings in an application -- things that could be overridden, but especially for the case when the application is not widely reused it could be useful. (An example where Silver Lining gets is all backwards is I created a [production] section in app.ini when the very concept of "production" is not meaningful in that context -- but these kind of named profiles would make sense for actual application deployment tools.)
Virtualenv-like, with the application itself linked in via a .pth file (a la setup.py develop, allowing inline upgrades via SCM) and dependencies extracted from the zip distributable into an app-packages folder a la site-packages.
There's actually a kind of layered way of thinking of this:
1. The first, maybe most important part, is how you get a proper Python environment. That includes sys.path of course, with all the accompanying libraries, but it also includes environment description.
I don't install global Python modules on any of my servers, so the --no-site-packages option is somewhat unnecessary for me, but having something similar would be useful, too. Unfortunately, that one feature seems to require a lot of additional work.Environment variables are typeless (raw strings) and thus less than optimum for sharing rich configurations.
In Silver Lining there's two stages -- first, set some environmental variables (both general ones like $SILVER_CANONICAL_HOST and service-specific ones like $CONFIG_MYSQL_DBNAME), then get sys.path proper, then import silvercustomize by which an environment can do any more customization it wants (e.g., set $DJANGO_SETTINGS_MODULE)
Host names depend on how the application is mounted, and a single application may be mounted to multiple domains or paths, so utilizing the front end web server's rewriting capability is probably the best solution for that.
What about multiple database connections? Environment variables are also not so good for repeated values.
A /few/ environment variables are a good idea, though:
:: TMPDIR — when don't you need temporary files?
:: APP_CONFIG_PATH — the path to a YAML file containing the real configuration.
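As a rough sketch of how a container-side bootstrap might consume that variable, the snippet below reads APP_CONFIG_PATH and parses the file it names. JSON stands in for YAML purely so the example runs on the standard library alone (a YAML parser would be a third-party dependency); the function name load_app_config and the file contents are hypothetical.

```python
import json
import os
import tempfile


def load_app_config(environ=os.environ):
    """Read the configuration file named by APP_CONFIG_PATH.

    JSON is used instead of YAML so the sketch needs only the standard
    library; the function name is hypothetical, not part of any spec.
    """
    path = environ.get("APP_CONFIG_PATH")
    if path is None:
        raise RuntimeError("APP_CONFIG_PATH is not set")
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


# Usage: play the part of the app server by writing a config file, then
# load it the way the application side would. Repeated values (two
# databases) are easy to express here, unlike in flat environment variables.
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as tmp:
    json.dump({"databases": [{"name": "main"}, {"name": "legacy"}]}, tmp)

config = load_app_config({"APP_CONFIG_PATH": tmp.name})
os.remove(tmp.name)
```

The environment carries only a path, so the configuration itself can be as structured as the file format allows.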
The configuration file would even include a dict-based logging configuration routing all messages to the parent app server for final delivery, removing the need for per-app logging files, etc.
The standard Python setup metadata is pretty good:
2. Define some basic generic metadata. "app_name" being the most obvious one.
:: Application title.
:: Short description.
:: Long description / documentation.
:: Author information.
:: License.
:: Source information (URL, download URL).
:: Application (package) name.
:: Dependencies.
:: Entry point-style hooks. (Post-install, pre/post upgrade, pre-removal, etc.)
Likely others.
I could imagine there would be multiple "application types":
3. Define how to get the WSGI app. This is WSGI specific, but (1) is *not* WSGI specific (it's only Python specific, and would apply well to other platforms)
:: WSGI application. Define a package dot-notation entry point to a WSGI application factory.
:: Networked daemon. This would allow deployment of Twisted services, for example. Define a package dot-notation entry point to the 'main' callable.
Again, there are likely others, but those are the big two. In both of these cases the configuration (loaded automatically) could be passed as a dict to the callable.
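To make the "configuration passed as a dict to the callable" idea concrete, here is a minimal sketch of a WSGI application factory. The factory name make_app and the "greeting" key are invented for illustration; nothing in the discussion fixes either.

```python
from wsgiref.util import setup_testing_defaults


def make_app(config):
    """Hypothetical application factory: takes the already-loaded
    configuration dict and returns a WSGI application."""
    greeting = config.get("greeting", "Hello")

    def app(environ, start_response):
        body = f"{greeting} from {environ['PATH_INFO']}\n".encode("utf-8")
        start_response("200 OK", [
            ("Content-Type", "text/plain; charset=utf-8"),
            ("Content-Length", str(len(body))),
        ])
        return [body]

    return app


# Usage: a container would resolve the entry point, call it with the
# configuration, and hand the result to its WSGI server. Here the app is
# called directly with a minimal test environ.
application = make_app({"greeting": "Hi"})
environ = {}
setup_testing_defaults(environ)  # fills in PATH_INFO="/" and other keys
statuses = []
response = application(environ, lambda status, headers: statuses.append(status))
```

A networked daemon entry point could take the same dict; only what the callable returns (or does) would differ.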
Explicitly defining the paths to static files is not just a good idea, it's The Slaw™.
4. Define some *web specific* metadata, like static files to serve. This isn't necessarily WSGI or even Python specific (not that we should bend backwards to be agnostic -- but in practice I think we'd have to bend backwards to make it Python-specific).
Script name, dot-notation callable, or URL. I see those as the 'big three' to support. Using a dot-notation callable has the same benefit as my comments to #3.
5. Define some lifecycle metadata, like update_fetch. These are generally commands to invoke. IMHO these can be ad hoc, but exist in the scope of (1) and a full "environment". So it's not radically different than anything else the app does, it's just we declare specific times these actions happen.
The URL would be relative to wherever the application is mounted within a domain, of course.
I touched on this up above; any DBAPI compliant database or various configuration strings. (I'd implement this as a string-like object with accessor properties so you can pass it to SQLAlchemy straight, or dissect it to do something custom.)
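One way such a string-like object could look, as a sketch only: a str subclass whose properties expose the parsed parts. The class name ServiceURL and the example URL are made up here; the point is just that the object remains an ordinary string for any library that expects one.

```python
from urllib.parse import urlsplit


class ServiceURL(str):
    """A str subclass with accessor properties (hypothetical name).

    Because it is a real string, it can be handed unchanged to anything
    that expects a URL; the properties serve code that wants the parts.
    """

    @property
    def _parts(self):
        return urlsplit(str(self))

    @property
    def scheme(self):
        return self._parts.scheme

    @property
    def username(self):
        return self._parts.username

    @property
    def password(self):
        return self._parts.password

    @property
    def hostname(self):
        return self._parts.hostname

    @property
    def port(self):
        return self._parts.port

    @property
    def database(self):
        # The path component is "/name"; strip the leading slash.
        return self._parts.path.lstrip("/")


url = ServiceURL("postgresql://scott:tiger@db.example.com:5432/blog")
```

Here url.hostname and url.database give the pieces for custom handling, while url itself can still be passed wherever a plain connection string is accepted.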
6. Define services (or "resources" or whatever -- the name "resource" doesn't make as much sense to me, but that's bike shedding). These are things the app can't provide for itself, but requires (or perhaps only wants; e.g., an app might be able to use SQLite, but could also use PostgreSQL). While the list of services will increase over time, without a basic list most apps can't run at all. We also need a core set as a kind of reference implementation of what a fully-specified service *is*.
> Just wondering if Windows/IIS is being kept in mind as this discussion
> is going on. I am having a hard time conceptualizing the things being
> discussed, so can't really tell myself.
I'm trying pretty hard to ensure that non-compatible OS features don't
make it in here. Things like symlinks, chroots, etc.
— Alice.
Correct. pysetup will replace python setup.py, and using extra commands
(site-specific or project-specific) will even be easier than with
distutils.
Regards
As an aside, I wonder why people use dot+colon notation instead of just
dots to reference callables. In distutils2 for example we resolve
dotted names to find command classes, command hooks and compilers. So
what’s the benefit, marginally easier parsing?
Regards
An opportunity of using a colon is that it allows::
dotted.module.name:expression
where expression may be more than just a name::
foo.bar:Bar()
Jim
--
Jim Fulton
http://www.linkedin.com/in/jimfulton
> On Fri, Apr 15, 2011 at 1:32 PM, Éric Araujo
> <mer...@netwok.org> wrote:
>> As an aside, I wonder why people use dot+colon notation instead of just
>> dots to reference callables. In distutils2 for example we resolve
>> dotted names to find command classes, command hooks and compilers. So
>> what’s the benefit, marginally easier parsing?
>
> An opportunity of using a colon is that it allows::
>
> dotted.module.name:expression
>
> where expression may be more than just a name::
>
> foo.bar:Bar()
Or foo.bar:Baz.factory.
I wouldn't go so far as to eval() what's after the colon. The real
difference is this:
    [foo.bar]:[Baz.factory]
        |          ^- Attribute lookup.
        ^- Module lookup.
You can't do this:
import foo.bar.Baz.factory
Thus the difference. However, the syntax is actually more flexible than that:
    [foo.bar]/[subfolder/file]
        |          ^- Sub-path.
        ^- Module.

    /[foo/bar]
        ^- Just path.
— Alice.
> I think there's a general concept we should have, which I'll call a
> "script" -- but basically it's a script to run (__main__-style), a
> callable to call (module:name), or a URL to fetch internally.
Agreed. The reference notation I mentioned in my reply to Graham, with
the addition of URI syntax, covers all of those options.
> I want to keep this distinct from anything long-running, which is a
> much more complex deal.
The primary application is only potentially long-running. (You could,
in theory, deploy an app as CGI, but that way lies madness.) However,
the reference syntax mentioned (excepting URL) works well for
identifying this.
> I think given the three options, and for general simplicity, the script
> can be successful or have an error (for Python code: exception or no;
> for __main__: zero exit code or no; for a URL: 2xx code or no), and can
> return some text (which may only be informational, not structured?)
For the simple cases (script / callable), it's pretty easy to trap
STDOUT and STDERR, deliver INFO log messages to STDOUT, everything else
to STDERR, then display that to the administrator in some form. Same
for HTTP, except that it can include full HTML formatting information.
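For the callable case, trapping output and reducing the result to "succeeded or not" might look roughly like the sketch below. The helper name run_script and its three-part return value are assumptions made for illustration, not anything agreed on in this thread.

```python
import contextlib
import io
import traceback


def run_script(func):
    """Call `func`, capturing what it prints; success means no exception.

    Returns (ok, stdout_text, stderr_text). The name and return shape
    are illustrative assumptions.
    """
    out, err = io.StringIO(), io.StringIO()
    ok = True
    with contextlib.redirect_stdout(out), contextlib.redirect_stderr(err):
        try:
            func()
        except Exception:
            ok = False
            traceback.print_exc()  # written to the captured stderr
    return ok, out.getvalue(), err.getvalue()


def post_install():
    print("created 3 tables")


def broken_upgrade():
    raise ImportError("No module named 'missinglib'")


ok, text, _ = run_script(post_install)
failed_ok, _, errors = run_script(broken_upgrade)
```

The captured text is what would be stored and shown to the administrator; the traceback is kept as-is rather than normalized into some richer error type.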
> An application configuration could refer to scripts under different
> names, to be invoked at different stages.
A la the already mentioned post-install, pre-upgrade, post-upgrade,
pre-removal, and cron-like. Any others?
> There could be an optional self-test script, where the application
> could do a last self-check -- import whatever it wanted, check db
> settings, etc. Of course we'd want to know what it needed *before* the
> self-check to try to provide it, but double-checking is of course good
> too.
Unit and functional tests are the most obvious. In which case we'll
need to be able to provide a localhost-only 'mounted' location for the
application even though it hasn't been installed yet.
> One advantage to a separate script instead of just one
> script-on-install is that you can more easily indicate *why* the
> installation failed. For instance, script-on-install might fail
> because it can't create the database tables it needs, which is a
> different kind of error than a library not being installed, or being
> fundamentally incompatible with the container it is in. In some sense
> maybe that's because we aren't proposing a rich error system -- but
> realistically a lot of these errors will be TypeError, ImportError,
> etc., and trying to normalize those errors to some richer meaning is
> unlikely to be done effectively (especially since error cases are hard
> to test, since they are the things you weren't expecting).
Humans are potentially better at reading tracebacks than machines are,
so my previous logging idea (script output stored and displayed to the
administrator in a readable form) combined with a modicum of reasonable
exception handling within the script should lead to fairly clear errors.
> Categorizing services seems unnecessary.
The description of the different database options were for
illustration, not actual separation and categorization.
> I'd like to see maybe an | operator, and a distinction between required
> and optional services. E.g.:
No need for some new operator, YAML already supports lists.
services:
- [mysql, postgresql, dburl]
Or:
services:
required:
- files
optional:
- [mysql, postgresql]
> And then there's a lot more you could do... which one do you prefer,
> for instance.
The order of services within one of these lists would indicate
preference, thus MySQL is preferred over PostgreSQL in the second
example, above.
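A container could interpret that structure along these lines. This is only a sketch: the function name choose_services and the idea of passing in a set of available service names are assumptions, and the dict literal mirrors the second YAML example above.

```python
def choose_services(declared, available):
    """Pick one concrete service per declared entry.

    `declared` has "required" and "optional" lists; each item is either
    a service name or a list of alternatives in order of preference.
    """
    chosen = []
    for level in ("required", "optional"):
        for entry in declared.get(level, []):
            alternatives = [entry] if isinstance(entry, str) else list(entry)
            # The first alternative the container can provide wins.
            match = next((name for name in alternatives if name in available), None)
            if match is not None:
                chosen.append(match)
            elif level == "required":
                raise LookupError(f"no provider for required service: {alternatives}")
    return chosen


declared = {"required": ["files"], "optional": [["mysql", "postgresql"]]}
both = choose_services(declared, {"files", "mysql", "postgresql"})
only_pg = choose_services(declared, {"files", "postgresql"})
no_db = choose_services(declared, {"files"})
```

With both databases on offer the application gets MySQL, since it is listed first; with only PostgreSQL available it falls back to that, and with neither the optional entry is simply skipped.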
> Tricky things:
> - You need something funny like multiple databases. This is very
> service-specific anyway, and there might sometimes need to be a way to
> configure the service. It's also a fairly obscure need.
I'm not convinced that connecting to a legacy database /and/ current
database is that obscure. It's also not as hard as Django makes it
look (with a 1M SLoC change to add support)… WebCore added support in
three lines.
> - You need multiple applications to share data. This is hard, not sure
> how to handle it. Maybe punt for now.
That's what higher-level APIs are for. ;)
> You mean, the application provides its own HTTP server? I certainly
> wouldn't expect that...?
Nor would I; running an HTTP server would be daft. Running mod_wsgi,
FastCGI on-disk sockets, or other persistent connector makes far more
sense, and is what I plan.
Unless you have a very, very specific need (i.e. Tornado), running a
Python HTTP server in production then HTTP proxying to it is
inefficient and a terrible idea. (Easy deployment model, terrible
overhead/performance.)
> Anyway, in terms of aggregate, I mean something like a "site" that is
> made up of many "applications", and maybe those applications are
> interdependent in some fashion. That adds lots of complications, and
> though there's lots of use cases for that I think it's easier to think
> in terms of apps as simpler building blocks for now.
That's not complicated at all; I do those types of aggregate sites
fairly regularly. E.g.
/ - CMS
/location - Location & image database.
/resource - Business database.
/admin - Flex administration interface.
That's done at the Nginx/Apache level, where it's most efficient to do
so, not in Python.
> Sure; these would be tool options, and if you set everything up you are
> requiring the deployer to invoke the tools correctly to get everything
> in place. Which is a fine starting point before formalizing anything.
What? Not even close—the person deploying an application is relying on
the application server/service to configure the web server of choice;
there is no need for deployer action after the initial "Nginx, include
all .conf files from folder X" where folder X is managed by the app
server. (That's one line in /etc/nginx/nginx.conf.)
> Hm... I guess this is an ordering question. You could import logging
> and setup defaults, but that doesn't give the container a chance to
> overwrite those defaults. You could have the container setup logging,
> then make sure the app sets defaults only when the container hasn't --
> but I'm not sure if it's easy to use the logging module that way.
The logging configuration, in dict form, is passed from the app server
to the container. The default logging levels are read by the app
server from the container. It's trivially easy, esp. when INI and YAML
files can be programmatically created.
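For reference, a dict-form logging configuration applied with the standard library looks roughly like this. The logger name "myapp" and the handler name "appserver" are placeholders, and an in-memory stream stands in for whatever channel would actually carry messages back to the app server.

```python
import io
import logging
import logging.config

stream = io.StringIO()  # stand-in for the channel back to the app server

LOGGING = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        "plain": {"format": "%(levelname)s %(name)s: %(message)s"},
    },
    "handlers": {
        "appserver": {
            "class": "logging.StreamHandler",
            "formatter": "plain",
            "stream": stream,
        },
    },
    "loggers": {
        # Default level for the application; a deployer could override it.
        "myapp": {"level": "INFO", "handlers": ["appserver"], "propagate": False},
    },
}

logging.config.dictConfig(LOGGING)

log = logging.getLogger("myapp")
log.debug("not delivered: below the configured level")
log.info("started")
captured = stream.getvalue()
```

Because the whole configuration is plain data, it can live in the same file as the rest of the settings and be generated or rewritten by the app server.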
> Well, maybe that's not hard -- if you have something like
> silvercustomize.py that is always imported, and imported fairly early
> on, then have the container overwrite logging settings before it *does*
> anything (e.g., sends a request) then you should be okay?
Indeed; container-setup.py or whatever.
> Rich configurations are problematic in their own ways. While the
> str-key/str-value of os.environ is somewhat limited, I wouldn't want
> anything richer than JSON (list, dict, str, numbers, bools).
JSON is a subset of YAML. I honestly believe YAML meets the
requirements for richness, simplicity, flexibility, and portability
that a configuration format really needs.
> And then we have to figure out a place to drop the configuration.
> Because we are configuring the *process*, not a particular application
> or request handler, a callable isn't great (unless we expect the
> callable to drop the config somewhere and other things to pick it up?)
I've already mentioned an environment variable identifying the path to
the on-disk configuration file—APP_CONFIG_PATH—which would then be read
in and acted upon by the container-setup.py file which is initially
imported before the rest of the application. Also, the application
factory idea of passing the already read-in configuration dictionary is
quite handy, here.
> I found at least giving one valid hostname (and yes, should include a
> path) was important for many applications. E.g., a bunch of apps have
> tendencies to put hostnames in the database.
Luckily, that's a bad habit we can discourage. ;)
> I'm not psyched about pointing to a file, though I guess it could work
> -- it's another kind of peculiar
> drop-the-config-somewhere-and-wait-for-someone-to-pick-it-up. At least
> dropping it directly in os.environ is easy to use directly (many things
> allow os.environ interpolation already) and doesn't require any
> temporary files. Maybe there's a middle ground.
Picked up by the container-setup.py site-customize script. What's the
limit on the size of a variable in the environ? (Also, that memory
gets permanently allocated for the life of the application; not very
efficient if we're just going to convert it to a rich internal
structure.)
> :: Application (package) name.
>
> This doesn't seem meaningful to me -- there's no need for a one-to-one
> mapping between these applications and a particular package. Unless
> you mean some attempt at a unique name that can be used for indexing?
You're mixing something up, here. Each application is a single primary
package with dependencies. One container per application.
> It would also need a way to specify things like what port to run on
Automatically allocated by the app server.
> public or private interface
Chosen by the deployer during deployment time configuration.
> maybe indicate if something like what proxying is valid (if any)
If it's WSGI, it's irrelevant. If it's a network service, it shouldn't
be HTTP.
> maybe process management parameters
For WSGI apps, it's transparent. Each app server would have its own
preference (e.g. mine will prefer FastCGI on-disk sockets) and the
application will be blissfully unaware of that.
> ways to inspect the process itself (since *maybe* you can't send
> internal HTTP requests into it), etc.
Interesting idea, not sure how that would be implemented or used, though.
> PHP! ;)
PHP can be deployed as a WSGI application. :P
> I'm not personally that happy with how App Engine does it, as an
> example -- it requires a regex-based dispatch.
Regex dispatch is terrible. (I've actually encountered Python's 56KiB
regular expression size limit on one project!) Simply exporting
folders as "top level" webroots would be sufficient, methinks.
> Anything "string-like" or otherwise fancy requires more support
> libraries for the application to actually be able to make use of the
> environment. Maybe necessary, but it should be done with great
> reluctance IMHO.
I've had great success with string-likes in WebCore/Marrow and
TurboMail for things like e-mail address lists, e-mail addresses, and
URLs.
The reason setuptools uses ':' is that it allows you to unambiguously
reference object attributes, e.g.:
some.module:SomeClass.some_method_or_attribute
(It doesn't allow expressions, just dotted "paths".)
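A resolver for that explicit form is short. The sketch below is not setuptools' implementation, just an illustration of the rule: import what is left of the colon, then walk attributes for what is right of it. The function name resolve is made up.

```python
import importlib
import json
import os.path
from functools import reduce


def resolve(reference):
    """Resolve 'package.module:attr.path' to an object.

    The part before the colon is imported as a module; the part after
    it is walked with getattr. No expressions are evaluated.
    """
    module_name, _, attr_path = reference.partition(":")
    obj = importlib.import_module(module_name)
    if attr_path:
        obj = reduce(getattr, attr_path.split("."), obj)
    return obj


join = resolve("os.path:join")
decode = resolve("json:JSONDecoder.decode")
module_only = resolve("json")
```

There is no guessing about where the module ends and the attributes begin, so a failure is either an ImportError for the module or an AttributeError for the path.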
I advocated using the just-dotted notation. These references are found in
configurations, usually constructed by users of the components rather than
implementors of the components. This is different than for entry points,
where the entry point specification uses module:object, but is provided by
the package maintainer.
These end users don't really care if the object identified is a class or
function in a module, a nested attribute on a class, or anything else, so
long as it does what it's advertised to do. By not pushing implementation
details into the identifier, the package maintainer is free to change the
implementation in more ways, without creating backward incompatibility.
Jim's note about having an expression after the colon is interesting;
not sure if that's a helpful case for packaging's use or not.
-Fred
--
Fred L. Drake, Jr. <fdrake at acm.org>
"Give me the luxuries of life and I will willingly do without the necessities."
--Frank Lloyd Wright
But you can certainly imagine a function `foo` which accepts
"foo.bar.Baz.factory" and returns the appropriate object. The ":"
doesn't really buy you anything.
Jean-Paul
That would be one advantage of using entry points
instead. ;-) (i.e., the user doesn't specify the object location,
the package author does.)
Note, however, that one must perform considerably more work to
resolve a name, when you don't know whether each part of the name is
a module or an attribute.
Either you have to get an AttributeError first, and then fall back to
importing, or get an ImportError first, and fall back to getattr.
If the syntax is explicit, OTOH, then you don't have to guess,
thereby saving lots of work and wasteful exceptions.
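For comparison, here is roughly what resolving a dots-only name involves, using the import-first fallback order. It is a sketch with a made-up function name, not code from any of the libraries mentioned.

```python
import importlib
import json


def resolve_dotted(name):
    """Resolve a dots-only name by trying the longest module prefix first.

    Each failed import costs an ImportError before the remaining parts
    are treated as attributes; this is the guesswork that an explicit
    separator avoids.
    """
    parts = name.split(".")
    for cut in range(len(parts), 0, -1):
        try:
            obj = importlib.import_module(".".join(parts[:cut]))
        except ImportError:
            continue  # not a module; try a shorter prefix
        for attr in parts[cut:]:
            obj = getattr(obj, attr)
        return obj
    raise ImportError(f"cannot resolve {name!r}")


decode = resolve_dotted("json.JSONDecoder.decode")
```

Resolving this one name takes two failed imports before the third attempt succeeds. The blanket except also hides a genuine ImportError raised inside a module that does exist, which is another cost of guessing.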
Definitely! I'm certainly all in favor of having something very akin to entry
points, but I'm not sure where that stands in the current plans. I'm not so
worried about the efficiency, but making it explicit the way entry points do
is a clear win.
And more extensible to additional resolution methods in the future.
> The file format discussion seems utterly pointless.
That's a pity.
> If you want the format to specify cron jobs and services and non-wsgi
> servers, why not go the whole way and use the Linux filesystem
> hierarchy standard. The entry point is an executable called `init`,
> configuration goes in /etc/, cron jobs go in /etc/cron.d etc. This
> should be flexible enough.
Because that would be… less than good. Let me illustrate:
a) The LFS is intended for complete operating system installations.
b) You sure as hell wouldn't want the init process to be Python.
c) Operating-system specific features are a no-go for portability.
d) We don't want developers to have to suddenly become sysadmins, too.
e) /etc is terrible for configuration organization.
There are other, lower-level reasons not to do that.
One big point is that the application server / container writes a
single configuration file which is then read in by the application.
One file, not a tree of them.
> I hope most applications won't need to look at the contents of app.yaml
> (the application container config) at all.
No-one has said that an application /would/ have to look at the
application metadata, or that after installation the file was anywhere
app accessible, even.
> Paste Deploy configures logging by passing the .ini to logging before
> invoking the app's entry point. This is the application container
> configuring the logging.
I've already defined that. RTFM or many ML messages about logging.
> For example a cool application container feature would be to have a
> little web application that manipulated logging configuration in a
> database, or reconfigured logging between requests without restarting
> the application.
The former is already defined. That's what the application server
does, database or no. The latter is broadly unnecessary, but easily
implementable within the application you are deploying.
> One way to pass 'services' information would be to specify a support
> package with abstract base classes and have a procedure for proposing
> new standard services to the web-sig. The container would have to
> populate a registry of named implementations of those services it is
> able to support:
That seems… excessive and ugly. You would also have code mixing
between the application server level and application level which will
encourage nothing but madness. Simple, named services with optional
configurations are more than enough.
> I would really like to see a basic specification with no support for
> services or 'spending an hour running apt-get to reconfigure the server
> before eventually getting around to running the application', and a
> procedure for extending the format.
apt-get has already been thrown out, and was, in fact, never part of
the quick summary I made, either.
> On 2011-04-18 14:11:21 -0700, Daniel Holth said:
>
>> If you want the format to specify cron jobs and services and non-wsgi servers, why not go the whole way and use the Linux filesystem hierarchy standard. The entry point is an executable called `init`, configuration goes in /etc/, cron jobs go in /etc/cron.d etc. This should be flexible enough.
>
> Because that would be… less than good. Let me illustrate:
>
> a) The LFS is intended for complete operating system installations.
>
> b) You sure as hell wouldn't want the init process to be Python.
>
> c) Operating-system specific features are a no-go for portability.
>
> d) We don't want developers to have to suddenly become sysadmins, too.
>
> e) /etc is terrible for configuration organization.
>
> There are other, lower-level reasons not to do that.
>
> One big point is that the application server / container writes a single configuration file which is then read in by the application. One file, not a tree of them.
So, I'm going to throw this out there. Instead of assuming "/etc" always means "the root of the filesystem" we should consider it the "root of the sandbox" where the system providing the "sandbox" defines what that is. It is _a_ filesystem in that there is a place that an application will be run. For argument's sake, we'll say it is a directory on some server. Now, within that directory we choose to take some known bits from the LFS standard such as /etc, /bin, /var, etc for the placement of our application.
With that in mind, I think using things like LFS makes a ton of sense. We can piggy back or copy (since previous discussions for .debs or rpms seem not to sit well... even though they would fit this model very well...) systems like RPM rather directly and hopefully allow our Python web apps to play very nicely with applications in other languages.
Please do not get hung up on the fact that I've said RPMs here. The fact is distros have been doing package management for quite a long while. It is insanely convenient to say apt-get install couchdb and when it is done, having a couchdb server running. Copying the model seems like a good option in that we get to learn from the mistakes of others while inheriting a wild variety of tools and concepts.
Eric
> — Alice.
That depends on how you define the F in RTFM. In this instance, I
meant "read the fine manual". ;)
You can understand my frustration, however, that > 10% of the posts in
this thread demonstrate a lack of understanding of (or lack of even a
cursory glance at) a) my initial post and associated document, and b)
the rest of the mailing list posts.
Asking for things already agreed upon or questions already resolved
wastes everyone's time.
On 2011-04-18 16:46:12 -0700, Eric Larson said:
> Instead of assuming "/etc" always means "the root of the filesystem" we
> should consider it the "root of the sandbox" where the system providing
> the "sandbox" defines what that is.
While /etc certainly wouldn't be the root of anything (insert sarcastic
smiley here ;), it was already agreed upon that / would refer to the
application container root, not system root. I share Ian's sentiment,
see: (search for 'root' on that page)
http://mail.python.org/pipermail/web-sig/2011-April/005041.html
> It is _a_ filesystem in that there is a place that an application will
> be run. For argument's sake, we'll say it is a directory on some
> server. Now, within that directory we choose to take some known bits
> from the LFS standard such as /etc, /bin, /var, etc for the placement
> of our application.
Again, not such a great idea.
> With that in mind, I think using things like LFS makes a ton of sense.
> We can piggy back or copy (since previous discussions for .debs or rpms
> seem not to sit well... even though they would fit this model very
> well...) systems like RPM rather directly and hopefully allow our
> Python web apps to play very nicely with applications in other
> languages.
I can't fully grok this paragraph. FHS (my bad calling it LFS
earlier!) = good because we won't confuse systems administrators and it
matches other binary packaging models?
I doubt an isolated web application will have a need for more than 6%
(3) of these:
http://en.wikipedia.org/wiki/Filesystem_Hierarchy_Standard#Directory_structure
While I personally have a FHS-like application deployment model using
Git, I would rather not see that level of complexity as a requirement
for deploying basic applications.
> Please do not get hung up on the fact that I've said RPMs here. The
> fact is distros have been doing package management for quite a long
> while. It is insanely convenient to say apt-get install couchdb and
> when it is done, having a couchdb server running.
It may be convenient, but it's also quite the risk. You're letting
someone else configure your server. Also, do binary installation
systems automatically start the service post-installation before you
can configure them? I have difficulty believing that, which means a
whole whack-ton of effort under a systems administrator hat has been
glossed over.
> Copying the model seems like a good option in that we get to learn from
> the mistakes of others while inheriting a wild variety of tools and
> concepts.
The on-disk structure which the application lives within (the
"application container") is up to the application server in use. The
underlying application should, and, IMHO, -must- be agnostic to it.
Passing paths to configuration files, TMPDIR, etc. in the environment
is a fairly trivial way to do that, at which point the FHS discussion
is nearly moot.
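
A minimal sketch of that idea, under stated assumptions: the application server exports the location of the single configuration file and a scratch directory, and the application reads them without knowing anything about the container's layout. APP_CONFIG is a hypothetical variable name chosen for illustration (nothing in this thread settles on one), and locate_resources is likewise a made-up helper; TMPDIR is the conventional variable.

```python
import os
import tempfile


def locate_resources(environ=None):
    """Return (config_path, tmp_dir) as supplied by the application server.

    APP_CONFIG is a hypothetical name used only for this illustration.
    """
    environ = os.environ if environ is None else environ
    # None means the server did not tell us where the configuration lives.
    config_path = environ.get("APP_CONFIG")
    # Fall back to the platform default when TMPDIR is not set.
    tmp_dir = environ.get("TMPDIR") or tempfile.gettempdir()
    return config_path, tmp_dir


config_path, tmp_dir = locate_resources(
    {"APP_CONFIG": "/srv/app/etc/app.conf", "TMPDIR": "/srv/app/tmp"}
)
```

The application never hard-codes /etc or any other location, so the same redistributable can run under containers that lay out their directories differently.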
If you want a complete (complete enough for a simple web application)
FHS structure within the redistributable, I don't see the point of
having that many empty directories. ;)
As an aside, I -do- have an application in production using a FHS-like
file structure:
https://gist.github.com/926617
But again, I'm not suggesting something like that for the
redistributable application!
I stumbled across https://apphosted.com as more web application package and format 'prior art'. It appears to be an App Engine competitor. According to their API documentation, their deployment format is an archive containing a single directory with your WSGI program and a metro.config. They put the database configuration in a settings.py written into the application's root with defined DB_URI, etc.
On Wednesday, April 27, 2011 at 5:46 PM, Ian Bicking wrote:
On Wed, Apr 27, 2011 at 5:21 PM, Daniel Holth <dho...@gmail.com> wrote:
> I stumbled across https://apphosted.com as more web application package and format 'prior art'. It appears to be an App Engine competitor. According to their API documentation, their deployment format is an archive containing a single directory with your WSGI program and a metro.config. They put the database configuration in a settings.py written into the application's root with defined DB_URI, etc.
There's something that bothers me about using settings.py, though I guess it's not that different from a YAML file or whatever, though with a cleverness danger. Conveniently you could do sys.modules['settings'] = new.module('settings') and avoid ever making a real file.
Using the name "settings" *specifically* is likely to cause name clashes with existing Django applications.
Ian
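
For readers on current Python: the `new` module mentioned above existed only in Python 2, and `types.ModuleType` does the same job. The sketch below shows the in-memory settings trick under that substitution. The install_settings helper and the placeholder DB_URI value are assumptions for illustration, not anything a hosting service is known to do.

```python
import sys
import types


def install_settings(values, name="settings"):
    """Register an in-memory module so that `import <name>` finds it."""
    module = types.ModuleType(name)
    for key, value in values.items():
        setattr(module, key, value)
    # Anything placed in sys.modules is returned by later import statements.
    sys.modules[name] = module
    return module


install_settings({"DB_URI": "sqlite:///:memory:"})  # placeholder value

import settings  # resolved from sys.modules, no settings.py on disk

print(settings.DB_URI)
```

Because the lookup goes through sys.modules, this collides with any real module of the same name, which is the name-clash concern raised above; passing a different name to the helper sidesteps it.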
> At 04:11 PM 4/15/2011 -0400, Fred Drake wrote:
>> These end users don't really care if the object identified is a class or
>> function in module, a nested attribute on a class, or anything else, so
>> long as it does what it's advertised to do. By not pushing implementation
>> details into the identifier, the package maintainer is free to change the
>> implementation in more ways, without creating backward incompatibility.
>
> That would be one advantage of using entry points
> instead. ;-) (i.e., the user doesn't specify the object location,
> the package author does.)
>
> Note, however, that one must perform considerably more work to
> resolve a name, when you don't know whether each part of the name is
> a module or an attribute.
Not if, as you mention, you use an explicit format. The format my
resolver code uses (and this code is utilized in marrow.mailer for
manager/transport lookup, marrow.server.http's command-line script to
resolve WSGI applications, and marrow.templating to resolve templates)
covers the following:
:: <object>
:: entrypoint_name
:: ../relative/path/to/something
:: ./relative/path/to/something
:: /absolute/path/to/something
:: package.relative/path/to/something
:: package.absolute.path
:: package.submodule:object
:: package.submodule:object.attribute
What is allowed on any given resolution depends on if the resolver
request is looking for an on-disk path or object.
Using the above as an example, you can define the use of the SMTP
transport within marrow.mailer in three ways:
from marrow.mailer.transport.smtp import SMTPTransport
config = dict(transport=SMTPTransport) # direct reference
config = dict(transport="smtp") # entry point
config = dict( # object lookup
    transport = "marrow.mailer.transport.smtp:SMTPTransport"
)
When configuring m.s.http to load an app, you can:
# p-code
HTTPServer.serve("project.application:WSGIApp.factory")
When choosing templates, OTOH, you can do the following:
return "./templates/foo.html", dict()
return "/var/www/foo.html", dict()
return "myapp.templates.foo", dict()
return "myapp/templates/foo.html", dict()
return "myapp.templates:email.welcome", dict()
> Either you have to get an AttributeError first, and then fall back to
> importing, or get an ImportError first, and fall back to getattr.
If you examine the above closely, the differing formats are easily
identifiable using a few == and 'in' conditionals:
if not isinstance(ref, basestring):
    return ref # already an object, nothing to resolve
if ref[0] == '.': pass # relative
if ref[0] == '/': pass # absolute
if '/' not in ref and '.' not in ref and ':' not in ref:
    pass # entrypoint
if ':' in ref:
    import_, _, attrs = ref.partition(':')
    # a non-empty fromlist makes __import__ return the submodule itself
    base = __import__(import_, fromlist=['__name__'])
    for attr in attrs.split('.'):
        base = getattr(base, attr)
    return base # the resolved object, not the last attribute name
if '/' in ref:
    import_, _, path = ref.partition('/')
    pass # use pkg_resources + path to pull file from package
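
A runnable Python 3 sketch of the same dispatch follows. It is not the marrow resolver itself: the function names classify and load_object and the category labels are hypothetical, and only the string tests and the colon-separated object lookup are taken from the fragment above.

```python
import importlib


def classify(ref):
    """Name the kind of reference using only cheap string tests."""
    if not isinstance(ref, str):
        return "object"
    if ref.startswith("."):
        return "relative path"
    if ref.startswith("/"):
        return "absolute path"
    if ":" in ref:
        return "object lookup"
    if "/" in ref:
        return "package path"
    if "." in ref:
        return "dotted name"
    return "entry point"


def load_object(ref):
    """Resolve 'package.module:attr.subattr' to the object it names."""
    module_name, _, attrs = ref.partition(":")
    # import_module returns the leaf module, unlike a bare __import__ call.
    obj = importlib.import_module(module_name)
    for attr in attrs.split("."):
        obj = getattr(obj, attr)
    return obj


join = load_object("os.path:join")
```

Because the colon marks where the module ends and the attribute chain begins, the lookup needs exactly one import and no exception-driven fallback.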
> If the syntax is explicit, OTOH, then you don't have to guess, thereby
> saving lots of work and wasteful exceptions.
:)
— Alice.