cgi module and awp


Watt

Dec 12, 2008, 12:45:06 PM
to antiweb
I am trying to understand the modules part. I set up CGI and it
was working, following your instructions. A few days ago I went
back and tried to set up FastCGI as well. I have FastCGI in my Perl
environment, and I assumed that AW would call it in a Perl shell.
From the manual, I understand that the Perl process terminates
regardless, and I can more or less see that in the module file. However,
I could only follow as far as a CGI conn being created and added to the
table to be passed to the C code. My question is: how does AW actually
invoke the perl command? I think that would have a great impact on
extending AW by way of Perl.

Also, is there any timeline for updating the awp manual at the moment?
Or do you plan to write a book on the subject? I think that would be a
great idea. I am still not clear on how I can take care of basic
things like session management, routing, creating dynamic content,
etc. Thanks for your help.



Watt P.

do...@hcsw.org

Dec 14, 2008, 7:48:42 PM
to ant...@googlegroups.com
Hi Watt,

On Fri, Dec 12, 2008 at 09:45:06AM -0800 or thereabouts, Watt wrote:
> i am trying to understand the modules part. i setup the cgi and it
> was working before as of your instruction. A few days back i went
> back and try further to setup fast cgi. i have fast cgi in my perl
> env and I conceptually thought that aw would call that in perl shell.

Thanks for your question. FastCGI is very different from regular CGI.
I don't like FastCGI because it is an over-engineered binary protocol.
I have no plans to support FastCGI at this time but eventually I plan
for Antiweb to be able to reverse proxy HTTP requests which should
give most of the same benefits (including allowing you to reverse
proxy requests to a FastCGI capable server such as nginx).

> from the manual, i understand that perl process would terminate
> regardless and i kind of see that in the module file. however, i just
> could follow as far as a cgi con being created and added to the table
> to be passed to c code. the question i have is, how does aw actually
> invoke the perl command? i think that would have a great impact on
> extending aw by the way of perl.

Sure, let's walk through this.

1) The connection is transferred to a worker from the hub.
2) The worker parses the message from the hub with code in the
worker-unix-connect function. This code calls aw_accept_conn
to perform the recvmsg(2), creates a closure with the
worker-handle-http function, and adds this closure to the
main connection hash-table with the add-to-conn-table function.
3) When the HTTP connection has data on it, the above closure is
invoked. This will happen immediately (before we re-enter the
event loop) because a full HTTP request is always transferred
from the hub along with the socket.

If we look at the worker-handle-http function, we will see
that it uses the fsm macro (short for "finite state machine") to
create the closure. The closure uses a shared buffer to avoid
consing memory for the HTTP request and responses (we can only
do this because we don't use threads) and then calls http-user-dispatch.
4) http-user-dispatch is defined as follows:

;; Must call compile-http-user-dispatch before dispatching.
;; http-buf can be a shared buffer, so will be mutilated between closure invocations.
(defun http-user-dispatch #1=(keepalive-closure c http-buf)
  (declare (ignore . #1#))
  (error "http-user-dispatch not compiled"))

But at run-time, its symbol-function value will be replaced
by your real HTTP dispatch function by compile-http-user-dispatch.
The HTTP dispatch function is mostly built by the
http-user-dispatch-macro-macrolet-wrapper macro which also
uses the http-user-dispatch-macro and http-user-dispatch-macro-request-handler
helper functions. We see that this macrolet-wrapper gives us all sorts
of different things that can be done by the HTTP dispatch code,
including err-and-linger, err-and-keepalive, add-header-to-response,
redir, etc. These are used by modules defined in modules.lisp
and by your own custom code that can live inside your
worker conf files.
5) The modules are processed by http-user-dispatch-macro-request-handler.
If you have a :cgi xconf in your worker conf file, for example:

:cgi (pl php)

then AW's mod-cgi in modules.lisp will find it with this code:
(if (xconf-get handler :cgi) ...)

It will process your list of file extensions (pl, php, etc) with
this code:

(when-match (,(list-of-file-extensions-to-regex-with-path-info (xconf-get handler :cgi)) http-path)
...)

The list-of-file-extensions-to-regex-with-path-info function builds
up a lambda-wrapped regular expression used to parse CGI files that
match your extensions. It also handles PATH_INFO efficiently.
6) If mod-cgi determines that this is a request to a CGI file, it tests
that the file exists, is readable, and is executable. If any of these
checks fails, an error is sent to the client. But if it is OK, then AW
builds up a bunch of strings and other parameters to be passed to the
aw_build_cgi_conn() C function defined in libantiweb.c. Note that
none of these strings cons memory because they are stack allocated.
The code to do this is pretty messy and I plan on cleaning this up at
some point.
7) aw_build_cgi_conn() runs. This function fork()s Antiweb into 2
processes. It is careful to close all sockets and epoll descriptors
in the child process (kqueue descriptors never transfer on fork()),
either by manually closing them or by relying on close-on-exec
(FD_CLOEXEC). The only socket that isn't closed is the actual socket
the HTTP request came in on. That socket is installed as the standard
output descriptor of the child process. The child will then install a
number of environment variables for the CGI script to make use of.
Finally, the child process will execv(2) your CGI script,
overlaying all of the Antiweb code and data in the child.

A) If the request was a GET request, the parent will now close the
socket, leaving it only open in the child process.
B) If the request was a POST request, the parent will install a
pipe in the child's standard input descriptor and continue to
read CONTENT_LENGTH bytes from the socket and forward them over
the pipe. The code to do this will also be used when we
implement reverse proxying.
8) The CGI script will read all its input (if it was a POST) and then
will write out its output, and then it will exit.
9) AW will reap the zombie caused by this exiting process and clean
up the proxy data structures (if it was a POST). To clean up the
proxy, the conn data structure's resources are closed immediately
and the conn structures themselves are put into a zombie mode. This
is different from unix zombies; it just means that we will
reclaim these data structures on the next pass through the conn list.
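
Steps 7 through 9 can be sketched in a few lines. This is a minimal Python illustration of the fork/dup2/exec pattern described above, not Antiweb's actual code (which is C, in aw_build_cgi_conn()). The names spawn_cgi and run_cgi_once are hypothetical, a socketpair stands in for the client connection, and the POST body is assumed to be already in memory and is written to the pipe in one blocking call rather than being forwarded from the socket by the event loop. It assumes a Unix system.

```python
import os
import socket
import sys


def spawn_cgi(conn, script_argv, cgi_env, body=None):
    """Fork and exec a CGI script with conn as its stdout.

    Returns (pid, stdin_fd); stdin_fd is None unless a body was given.
    """
    stdin_r = stdin_w = None
    if body is not None:
        # POST: the child reads the request body from a pipe on stdin.
        stdin_r, stdin_w = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child. Python marks sockets and pipes close-on-exec by default,
        # so only the descriptors we dup2() below survive the exec.
        try:
            os.dup2(conn.fileno(), 1)  # request socket becomes stdout
            if stdin_r is not None:
                os.dup2(stdin_r, 0)  # pipe read end becomes stdin
            env = dict(os.environ)
            env.update(cgi_env)  # CGI variables for the script
            os.execve(script_argv[0], script_argv, env)
        finally:
            os._exit(127)  # only reached if the exec failed
    # Parent: the socket now stays open only in the child.
    conn.close()
    if stdin_r is not None:
        os.close(stdin_r)
    return pid, stdin_w


def run_cgi_once(script_argv, cgi_env, body=None):
    """Run one request over a socketpair; return (output, exit_code)."""
    server_side, client_side = socket.socketpair()
    pid, stdin_w = spawn_cgi(server_side, script_argv, cgi_env, body)
    if stdin_w is not None:
        # Simplification: write the whole body, then close the pipe.
        with os.fdopen(stdin_w, "wb") as pipe:
            pipe.write(body)
    chunks = []
    while True:
        chunk = client_side.recv(4096)
        if not chunk:  # EOF: the child has closed its stdout
            break
        chunks.append(chunk)
    client_side.close()
    _, status = os.waitpid(pid, 0)  # reap the exited child (step 9)
    return b"".join(chunks), os.WEXITSTATUS(status)


if __name__ == "__main__":
    script = "import os; print(os.environ['REQUEST_METHOD'])"
    output, code = run_cgi_once([sys.executable, "-c", script],
                                {"REQUEST_METHOD": "GET"})
    print(output, code)
```

Because the script is exec'd directly, whatever interpreter it needs (perl, for example) is started by the kernel from the script's #! line; the server itself never has to know about Perl.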

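Going back to step 5, the idea of turning the :cgi extension list into a single regular expression that also splits off PATH_INFO can be illustrated as follows. This is a Python sketch of the general technique only. It is not the output of list-of-file-extensions-to-regex-with-path-info, and the function names and the exact pattern are assumptions made for the example.

```python
import re


def extensions_to_regex(extensions):
    """Build a regex that splits a request path into script and PATH_INFO."""
    alternatives = "|".join(re.escape(ext) for ext in extensions)
    # Group 1: shortest prefix ending in ".<ext>".
    # Group 2: optional PATH_INFO, which must start with "/".
    return re.compile(r"^(.+?\.(?:%s))(/.*)?$" % alternatives)


def match_cgi_path(pattern, http_path):
    """Return (script_path, path_info), or None if this is not a CGI path."""
    m = pattern.match(http_path)
    if m is None:
        return None
    return m.group(1), m.group(2) or ""


if __name__ == "__main__":
    cgi_pattern = extensions_to_regex(["pl", "php"])
    print(match_cgi_path(cgi_pattern, "/cgi/test.pl/extra/info"))
    print(match_cgi_path(cgi_pattern, "/index.html"))
```

Compiling the pattern once, when the conf is loaded, means each request costs a single regex match no matter how many extensions are listed.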

> also, is there any timeline to update the awp manual at the moment?
> or if you plan to write a book on the subject, which i think it is a
> great idea? i am still not clear on how i can take care of basic
> things like session management, routing, creating dynamic content,
> etc.

Sorry, no definite timeline yet, but this will happen eventually.
Often you want to handle a request in the AW process instead of
sending it off to a CGI process. This allows you to maintain
keepalive connections, use an already open BerkeleyDB environment
(BDB is multi-process AND multi-thread), and, in general, take
advantage of all the great things lisp has to offer with respect
to web development.

The currently undocumented get-handler and post-handler xconfs
along with some :rewrite rules are how I am generating dynamic
content at the moment. I will try to add some basic docs on
these xconfs in BETA11 (I'm going to release BETA10 tonight probably).

Hope this helps,

Doug
