The design thought behind crary was that dispatch handlers could and
should be chained together to create full sites. For instance, a
dispatch handler could check whether the URL path starts with "/images/"
and, if it does, pass the request to crary_dir_listing to serve the
static image directory. However, at the moment there is no good way to
tell modules like crary_dir_listing to ignore the "/images/" part of
the URL. Furthermore, if the first dispatch handler passed the request
to a second dispatching handler, there is no good way to keep the
second handler from also having to deal with the "/images/" part of
the URL.
Proposed solution:
I recently read the Django tutorial, and they solve this problem by
explicitly passing the path part of the URL around, removing the parts
that have already been dispatched upon. I propose to do the same
thing: crary will now call each handler with two leading arguments,
the request (as it currently does) and a string representing the part
of the URL path that hasn't yet been processed. (By "path" I mean only
the path: no server, query string arguments, etc.)
Obviously, the path should be used only for dispatching, not for URL
creation by modules, as it can and will be partial.
Example:
    handler(Req, "/images" ++ PathRest) ->
        crary_dir_listing:handler(Req, PathRest,
                                  "/var/www/www.example.com/static/img");
    handler(_Req, _) ->
        throw(not_found).
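To illustrate the chaining case, here is a minimal sketch of two dispatch levels, each stripping the prefix it matched before passing the rest along. The module name, the "/static" and "/css" prefixes, and the serve/3 stand-in are all made up for illustration; a real handler would call something like crary_dir_listing instead.

```erlang
-module(dispatch_sketch).
-export([site/2]).

%% First level: strips "/static" and hands the remainder to static/2.
site(Req, "/static" ++ Rest) -> static(Req, Rest);
site(_Req, _Path) -> throw(not_found).

%% Second level: never sees the "/static" prefix.
static(Req, "/images" ++ Rest) -> serve(Req, Rest, "img");
static(Req, "/css" ++ Rest) -> serve(Req, Rest, "css");
static(_Req, _Path) -> throw(not_found).

%% Hypothetical stand-in for a real file-serving handler; it only
%% reports which directory and remaining path it was given.
serve(_Req, Rest, Dir) -> {serve, Dir, Rest}.
```

Note that a plain prefix pattern such as "/images" ++ Rest also matches "/imagesfoo", so a real dispatcher would probably want to check that the remainder is empty or starts with "/".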
or you can find them in rst format with the source.
Though they are incomplete and surely need to be revised a lot.
cheers,
filippo
+ are there standardized semantics regarding the behavior if the
application side throws an exception?
+ could the HTTP headers in the Environ (eg "HTTP_*") be either
upper-cased or lower-cased (standardize on one)? rationale: headers are
case insensitive, so an application wishing to conform to the case
insensitivity will have to manually loop over the proplist, even if the
proplist in the future supported dict style backing.
+ at the moment proplist is hard coded to using a linked list and O(n) avg
lookup. there's no reason why a map "interface" couldn't be built that
proplist could use, however such a proposal was largely ignored by the
wider erlang community. would it be possible to allow the passed in type
to be something such as a "map" interface rather than "proplist" so that
more efficient data structures could be used if/when desired?
+ why require the redundant status (eg {200, "Ok"})? crary has code that
will accept just the integer 200 or the atom 'ok' and automatically do the
right thing. it seems to me this reduces the potential for typos and
creates cleaner code (who really needs to see the integer 200?)
+ the current spec requires that the server send correct headers,
including for HTTP extensions. this seems to imply that all ewgi servers
will have to understand enough about the semantics of WebDAV and any other
HTTP extensions to automatically supplement their headers. i disagree that this
is what a server should be responsible for. automatically adding the
"date" and "server" is a really good thing, but i'd argue that it should
be limited to these, or limited to a finite, well documented list of such
headers.
+ The spec seems to imply--but not directly state--that StartResponse can
be called multiple times (eg it is buffered in case an "ok" turns into an
"internal server error"). What are the semantics of duplicate calls? Is
the first call effectively ignored? Are the calls merged? What is the
semantic of no StartResponse call being made?
+ Is it acceptable for StartResponse to be called from a process other
than the Application that it was originally passed to? Likewise, from a
different erlang vm?
+ From what i understand, erlang makes no guarantees that lambdas are
compatible across releases. As i understand it, by using lambdas for the
Application and StartResponse (as well as the one in the header if set),
this will make it unsafe for users to upgrade erlang releases in a hot
environment if the lambdas are passed across VMs. this may not be a
problem, but should be considered. the spec however did suggest that there
may be applications that pass the request to a separate VM; if lambdas are
still used by ewgi, it might be prudent to add a warning not to pass the
lambdas across VMs, as it would be very tempting to do this.
+ Will there be any ewgi standardized support for the server timing out
and killing the Application? This could be highly useful to prevent, or
limit the impact of, certain forms of DoS attacks. If there is, would the server
kill the Application in such a way that (either explicitly or implicitly)
any worker processes (links) that the Application starts will be killed?
Thanks
sRp
Scott R Parish wrote:
> Nice idea! I'm very interested in seeing what the streaming part looks
> like (both ways: server streaming response back to http client and http
> client streaming a PUT or POST). Here's some notes and questions i had on
> my first pass through the document:
For the streaming part, Hunter Morris did all the work, and you'll
find something in his slides here:
http://blog.smarkets.com/2008/05/22/talking-smak-london-erlang-user-group-presentation/
>
> + are there standardized semantics regarding the behavior if the
> application side throws an exception?
The mochiweb implementation catches exceptions thrown by applications
and returns a 500 Internal Server Error.
IMHO this should be the standardized behaviour. What do you think?
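As a rough sketch of that behaviour (not the actual mochiweb implementation code), a server could wrap the application call like this. The application shape used here, a fun taking the Environ and returning {Status, Headers, Body}, is only an assumption made for the example, as are the module and function names.

```erlang
-module(safe_app_sketch).
-export([call/2]).

%% Runs App(Environ); any throw, error or exit raised by the
%% application is turned into a 500 response instead of propagating.
%% The {Status, Headers, Body} result shape is assumed for illustration.
call(App, Environ) ->
    try
        App(Environ)
    catch
        _Class:_Reason ->
            {{500, "Internal Server Error"},
             [{"Content-Type", "text/plain"}],
             "Internal Server Error"}
    end.
```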
>
> + could the HTTP headers in the Environ (eg "HTTP_*") be either
> upper-cased or lower-cased (standardize on one)? rationale: headers are
> case insensitive, so an application wishing to conform to the case
> insensitivity will have to manually loop over the proplist, even if the
> proplist in the future supported dict style backing.
All HTTP headers are normalized to upper case, while ewgi-specific
fields are named something like "ewgi.fieldname".
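A minimal sketch of such a normalization, so that applications can look headers up without caring about the case the client used. The "HTTP_" prefix comes from the "HTTP_*" keys mentioned above; turning "-" into "_" is an assumption borrowed from the CGI/WSGI convention, and the module and function names are hypothetical. string:uppercase/1 needs OTP 20 or later.

```erlang
-module(header_sketch).
-export([normalize/1, lookup/2]).

%% "Content-Type" -> "HTTP_CONTENT_TYPE"
%% (dash-to-underscore is assumed, following the CGI convention)
normalize(Name) ->
    "HTTP_" ++ [dash_to_underscore(C) || C <- string:uppercase(Name)].

dash_to_underscore($-) -> $_;
dash_to_underscore(C) -> C.

%% Case-insensitive lookup in an Environ whose keys are already normalized.
lookup(Name, Environ) ->
    proplists:get_value(normalize(Name), Environ).
```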
>
> + at the moment proplist is hard coded to using a linked list and O(n) avg
> lookup. there's no reason why a map "interface" couldn't be built that
> proplist could use, however such a proposal was largely ignored by the
> wider erlang community. would it be possible to allow the passed in type
> to be something such as a "map" interface rather than "proplist" so that
> more efficient data structures could be used if/when desired?
Here I'm open to suggestions. I chose proplists because I come from
Python and they are the thing that most resembles Python dicts (in the
API, not the implementation).
IMHO the right thing to do is to define a data type in the specs and
strongly suggest accessing the data structure only through its API, so
we can change it in future versions of the specs without much trouble.
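To make the idea concrete, here is a small sketch of an opaque Environ type with an access API. It is backed by a proplist today, but callers never touch the list directly, so the backing could later be swapped (for a dict, say) without changing them. The module name, function names and the {environ, ...} wrapper are all hypothetical and not part of the spec.

```erlang
-module(environ_sketch).
-export([from_list/1, fetch/2, fetch/3, store/3]).

%% Wrap a list of {Key, Value} pairs; the tuple tag keeps callers from
%% treating the Environ as a bare proplist.
from_list(Pairs) -> {environ, Pairs}.

fetch(Key, Env) -> fetch(Key, Env, undefined).

fetch(Key, {environ, Pairs}, Default) ->
    proplists:get_value(Key, Pairs, Default).

%% Replace any existing entry for Key.
store(Key, Value, {environ, Pairs}) ->
    {environ, [{Key, Value} | proplists:delete(Key, Pairs)]}.
```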
>
> + why require the redundant status (eg {200, "Ok"})? crary has code that
> will accept just the integer 200 or the atom 'ok' and automatically do the
> right thing. it seems to me this reduces the potential for typos and
> creates cleaner code (who really needs to see the integer 200?)
In mochiweb, for example, you start the response with the complete
status line "200 OK", while in yaws you set it with e.g. {status, 404}
and there is no need to set the status when it is 200.
I chose one representation and added macros to simplify coding :-) so
e.g. ?OK corresponds to {200, "OK"}.
They are defined in include/ewgi.hrl
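For comparison, the shorthand Scott describes could be layered on top of the {Code, Reason} representation with a small normalizing function. This is only a sketch: it is neither crary's nor ewgi's actual code, and the atoms and reason phrases covered are just a few examples.

```erlang
-module(status_sketch).
-export([normalize/1]).

%% Accepts a full {Code, Reason} pair, a bare integer, or an atom,
%% and always returns {Code, Reason}. Unlisted atoms are not handled.
normalize({Code, Reason}) when is_integer(Code), is_list(Reason) ->
    {Code, Reason};
normalize(Code) when is_integer(Code) ->
    {Code, reason(Code)};
normalize(ok) -> normalize(200);
normalize(not_found) -> normalize(404);
normalize(internal_server_error) -> normalize(500).

reason(200) -> "OK";
reason(404) -> "Not Found";
reason(500) -> "Internal Server Error";
reason(_) -> "Unknown".
```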
>
> + the current spec requires that the server send correct headers,
> including for HTTP extensions. this seems to imply that all ewgi servers
> will have to understand enough about the semantics of WebDAV and any other
> HTTP extensions to automatically supplement their headers. i disagree that this
> is what a server should be responsible for. automatically adding the
> "date" and "server" is a really good thing, but i'd argue that it should
> be limited to these, or limited to a finite, well documented list of such
> headers.
That's correct. IMHO ewgi should know HTTP.
ewgi must guarantee that the HTTP headers are correctly set;
otherwise you would send the client an incorrect response.
In practice, however, this means that most of the time the ewgi server
doesn't have to do any checks, since the underlying server (mochiweb,
yaws, or crary) already does this.
I hadn't thought of WebDAV and other extensions over ewgi. Maybe we
could think of future extensions, but for now I'd be happy to have a
well defined spec for HTTP and some working and tested implementations,
so if you want e.g. to take care of the crary one you are welcome ;-)
>
> + The spec seems to imply--but not directly state--that StartResponse can
> be called multiple times (eg it is buffered in case an "ok" turns into an
> "internal server error"). What are the semantics of duplicate calls? Is
> the first call effectively ignored? Are the calls merged? What is the
> semantic of no StartResponse call being made?
This part has to be clarified in the specs (like a lot of
other things :-) ).
StartResponse must not send the headers until the first
chunk of the body is returned. It must buffer headers until the last
possible moment. The case of no StartResponse call should, I think,
result in an internal server error. It's like answering without sending
the headers. The alternative could be to treat no headers as a 200
response like yaws does, but I don't like it very much.
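One possible reading of this buffering rule, written as a pure state machine so it can be tried out without a socket. It assumes that the last StartResponse call made before the first body chunk wins, which is something the specs still have to settle, and that writing a body with no StartResponse call yields a 500. The module, the state tuple and the output terms are all hypothetical.

```erlang
-module(start_response_sketch).
-export([new/0, start_response/3, write/2]).

%% State: {PendingStatusAndHeaders | undefined, sent | not_sent, Output}
new() -> {undefined, not_sent, []}.

%% Before the first chunk, a new call replaces the buffered headers
%% (assumed semantics). After the headers went out it is an error.
start_response({_Old, not_sent, Out}, Status, Headers) ->
    {{Status, Headers}, not_sent, Out};
start_response({_Pending, sent, _Out}, _Status, _Headers) ->
    erlang:error(headers_already_sent).

%% The first body chunk flushes the buffered headers; with nothing
%% buffered, the response becomes a 500 and the chunk is dropped.
write({undefined, not_sent, Out}, _Chunk) ->
    {undefined, sent, Out ++ [{head, {500, "Internal Server Error"}, []}]};
write({{Status, Headers} = Pending, not_sent, Out}, Chunk) ->
    {Pending, sent, Out ++ [{head, Status, Headers}, {body, Chunk}]};
write({Pending, sent, Out}, Chunk) ->
    {Pending, sent, Out ++ [{body, Chunk}]}.
```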
>
> + Is it acceptable for StartResponse to be called from a process other
> than the Application that it was originally passed to? Likewise, from a
> different erlang vm?
My idea was exactly this. Applications and middleware may reside on
different nodes from the server. The current mochiweb implementation
does not support this, but it should not be difficult to fix.
IMHO this is a must-have feature when working in Erlang :-)
>
> + From what i understand, erlang makes no guarantees that lambdas are
> compatible across releases. As i understand it, by using lambdas for the
> Application and StartResponse (as well as the one in the header if set),
> this will make it unsafe for users to upgrade erlang releases in a hot
> environment if the lambdas are passed across VMs. this may not be a
> problem, but should be considered. the spec however did suggest that there
> may be applications that pass the request to a separate VM; if lambdas are
> still used by ewgi, it might be prudent to add a warning not to pass the
> lambdas across VMs, as it would be very tempting to do this.
You are right. This should be stated clearly in the specs.
>
> + Will there be any ewgi standardized support for the server timing out
> and killing the Application? This could be highly useful to prevent, or
> limit the impact of, certain forms of DoS attacks. If there is, would the server
> kill the Application in such a way that (either explicitly or implicitly)
> any worker processes (links) that the Application starts will be killed?
IMHO the ewgi specs should remain as simple as possible, and this feature
could be implemented as an ewgi middleware module if not already present
in the underlying server.
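A rough sketch of what the core of such a middleware could look like: run the application in its own monitored process and kill it if it takes too long. The module and function names are hypothetical, and Fun stands in for whatever calls the application. Because the worker is killed with exit(Pid, kill), any processes it linked to receive an exit signal as well and die unless they trap exits.

```erlang
-module(timeout_sketch).
-export([with_timeout/2]).

%% Runs Fun in a separate monitored process and waits at most TimeoutMs.
%% Returns {ok, Result}, {error, timeout}, or {error, CrashReason}.
with_timeout(Fun, TimeoutMs) ->
    Parent = self(),
    Ref = make_ref(),
    {Pid, MRef} = spawn_monitor(fun() -> Parent ! {Ref, Fun()} end),
    receive
        {Ref, Result} ->
            erlang:demonitor(MRef, [flush]),
            {ok, Result};
        {'DOWN', MRef, process, Pid, Reason} ->
            {error, Reason}
    after TimeoutMs ->
        exit(Pid, kill),
        erlang:demonitor(MRef, [flush]),
        %% Drop a result that may have arrived just before the kill.
        receive {Ref, _Late} -> ok after 0 -> ok end,
        {error, timeout}
    end.
```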
Thanks again. Any comment or suggestion is welcome.
cheers,
filippo