JSGI | Problem with response headers and middleware

Daniel Friesen

Jul 15, 2009, 2:55:07 AM
to serv...@googlegroups.com
While working on an implementation of JSGI I noticed a small problem in
the spec when it comes to headers and middleware.

The spec doesn't say anything about the case sensitivity of header names,
while the HTTP spec says that header names are case-insensitive.

Thus, all these forms are valid for saying the same thing:
[ 200, { "Content-Type": "text/plain" }, ["..."] ]
[ 200, { "CONTENT-TYPE": "text/plain" }, ["..."] ]
[ 200, { "content-type": "text/plain" }, ["..."] ]
[ 200, { "Content-type": "text/plain" }, ["..."] ]
...

That's alright when it comes to creating the output, but the real issue
is with middleware.

exports.app = function app(env) {
    ...
}

// Middleware that makes sure all text types have a default
// charset=UTF-8 so stuff doesn't default back to Latin1
function UnicodeText(app) {
    return function(env) {
        var res = app(env);
        // Only finds the header if the app used exactly this casing
        var ContentType = res[1]["Content-Type"];
        if( ContentType &&
            ContentType.match(/^(text\/[^;]+|application\/xhtml\+xml)$/) ) {
            res[1]["Content-Type"] = ContentType + '; charset=UTF-8';
        }
        return res;
    };
}

exports.development = function(app) {
    return UnicodeText(app);
};

But the problem here is those headers. How is this piece of middleware
supposed to know which header names to look for? It's perfectly valid for
the app to use { 'CONTENT-TYPE': 'text/html' } instead.
And we don't have any sort of intermediate processing of the response, so
there is nowhere for implementations to inject a little bit of header
name cleanup between one piece of middleware and the next, or between
middleware and the app.
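
A minimal sketch of the kind of cleanup step described above, assuming lower-case is picked as the canonical form. `lowerCaseKeys` and `NormalizeHeaders` are hypothetical names, not part of JSGI, and nothing in the spec currently runs such a step between layers:

```javascript
// Hypothetical helper: returns a copy of `headers` with every key lower-cased.
// If two keys differ only by case, the one enumerated last wins.
function lowerCaseKeys(headers) {
    var out = {}, name;
    for (name in headers) {
        if (Object.prototype.hasOwnProperty.call(headers, name)) {
            out[name.toLowerCase()] = headers[name];
        }
    }
    return out;
}

// Hypothetical wrapper: normalizes the header keys of whatever `app` returns.
function NormalizeHeaders(app) {
    return function (env) {
        var res = app(env);
        return [res[0], lowerCaseKeys(res[1]), res[2]];
    };
}

// Usage: the inner app shouts its header name, the outer layer still finds it.
var wrapped = NormalizeHeaders(function (env) {
    return [200, { "CONTENT-TYPE": "text/html" }, ["..."]];
});
var normalized = wrapped({});
// normalized[1]["content-type"] is "text/html"
```

This only helps if something applies the wrapper around every layer, which is exactly the hook that is missing.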

--
~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]

Ash Berlin

Jul 15, 2009, 1:57:16 PM
to serv...@googlegroups.com

So the HTTP Headers module I ported from the Perl version, I *think*,
gets round this problem, or at least half of it:

http://redmine.flusspferd.org/repositories/entry/flusspferd/src/js/http/headers.js#L91
and #L151

But it still doesn't get round it fully, and I'm not sure we (i.e.
using my version) can fully get round it without round-tripping it via
to-string and back, or with a __noSuchProperty__ handler (which isn't
possible yet).


Wes Garland

Jul 15, 2009, 2:46:11 PM
to serv...@googlegroups.com
You could always traverse the list in "debug mode" and throw an exception if property names don't conform to a particular regular expression.

Ugly, but functional.

Then, on JS engines capable of doing so, have something like a no-such-property check when inserting. This would be easy from jsapi.

Suggested property validating regex, something like ([A-Z][a-z0-9_]*)(-[A-Z][a-z0-9_]*)*
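
A rough sketch of that debug-mode check, assuming the pattern is anchored so that the whole name has to match. `ValidateHeaderNames` is a hypothetical name used only for illustration:

```javascript
// The suggested pattern, anchored at both ends.
var HEADER_NAME = /^([A-Z][a-z0-9_]*)(-[A-Z][a-z0-9_]*)*$/;

// Hypothetical debug-mode middleware: throws on any non-conforming header name.
function ValidateHeaderNames(app) {
    return function (env) {
        var res = app(env), name;
        for (name in res[1]) {
            if (!HEADER_NAME.test(name)) {
                throw new Error("Non-conforming header name: " + name);
            }
        }
        return res;
    };
}

// Usage: a conforming app passes through, a lower-cased key is rejected.
var good = ValidateHeaderNames(function (env) {
    return [200, { "Content-Type": "text/plain" }, ["..."]];
});
var bad = ValidateHeaderNames(function (env) {
    return [200, { "content-type": "text/plain" }, ["..."]];
});

var goodResult = good({});
var badError = null;
try {
    bad({});
} catch (e) {
    badError = e; // "Non-conforming header name: content-type"
}
```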

Wes
--
Wesley W. Garland
Director, Product Development
PageMail, Inc.
+1 613 542 2787 x 102

Daniel Friesen

Jul 15, 2009, 3:18:41 PM
to serv...@googlegroups.com
I thought something like that would be alright myself (actually my idea
was to /require/ the header object to have keys in that format, and
middleware that thinks it might output bad header keys should clean them
up before returning). Then I realized that the common format people use
for headers can't be captured by a simple regex like that.
There are 3 headers that don't follow that case pattern:
WWW-Authenticate, ETag, and Content-MD5
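
A quick check of those three names against the proposed pattern (anchored here so that the whole name must match, which is an assumption about how it would be applied):

```javascript
var pattern = /^([A-Z][a-z0-9_]*)(-[A-Z][a-z0-9_]*)*$/;
var names = ["Content-Type", "WWW-Authenticate", "ETag", "Content-MD5"];

// Collect every name the pattern refuses.
var rejected = names.filter(function (name) {
    return !pattern.test(name);
});
// rejected is ["WWW-Authenticate", "ETag", "Content-MD5"]
```

Each one fails on a run of consecutive capitals, which the one-capital-per-word pattern cannot express.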

~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]


Wes Garland

Jul 15, 2009, 3:23:47 PM
to serv...@googlegroups.com
> There are 3 headers that don't follow that case pattern:
> WWW-Authenticate, ETag, and Content-MD5

Good point. That completely breaks my suggestion.

This is an ugly problem which shows that the selected datatype should not be a plain object.

Wes

Daniel Friesen

Jul 15, 2009, 3:35:58 PM
to serv...@googlegroups.com
^_^ At least be happy we're not in Ruby writing protocol-only...

Ruby makes it tougher with keys.
{ 0 => 'a', "0" => 'b' }
{ "foo" => 'a', :foo => 'b' }

I always loved symbols from Common Lisp... but hated how Ruby's hashes
make them highly unpleasant to use. 90% of tools like YAML/JSON use
strings instead of symbols, which makes it pointless to use symbols.

It's a little easier in JS since everything gets cast to a string.
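
For comparison with the Ruby hashes above, a small illustration of that coercion on a plain object. Case is still significant, which is the whole problem with header names:

```javascript
var o = {};
o[0] = 'a';
o["0"] = 'b';          // same key as o[0], so this overwrites 'a'
var keys = Object.keys(o);
// keys is ["0"] and o[0] is 'b'

var h = {};
h["Content-Type"] = 'text/html';
h["content-type"] = 'text/plain';
var headerKeys = Object.keys(h);
// headerKeys has two entries: coercion does nothing about casing
```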


Of course... The whole reason that Rack doesn't bother making this part
of the protocol clean is probably something along the lines of just
making everyone use the response object instead since Rack is both the
protocol and the implementation.

Alexandre.Morgaut

Jul 16, 2009, 6:13:08 AM
to serverjs
Personally, I would have written this piece of code differently:

// Middleware that makes sure all text types have a default
// charset=UTF-8 so stuff doesn't default back to Latin1

function UnicodeText(app) {
    return function(env) {
        var result, headers, field;
        result = app(env);
        headers = result[1];
        for (field in headers) {
            // Compare the header *name*, ignoring case
            if ('content-type' == field.toLowerCase() &&
                headers[field].match(/^(text\/[^;]+|application\/xhtml\+xml)$/)) {
                headers[field] += '; charset=UTF-8';
                return result;
            }
        }
        return result;
    };
}

Alexandre.Morgaut

Jul 16, 2009, 8:39:28 AM
to serverjs
I would have written it differently to make it work:

// Middleware that makes sure all text types have a default
// charset=UTF-8 so stuff doesn't default back to Latin1

function UnicodeText(app) {
    return function(env) {
        var result, header, field;
        result = app(env);
        // we might also add this if the server sends 203 responses in
        // some circumstances
        // if (203 == result[0]) { return result; }
        header = result[1];
        for (field in header) {
            if ('content-type' == field.toLowerCase()) {
                if (header[field].match(/^(text\/[^;]+|application\/xhtml\+xml)$/)) {
                    header[field] += '; charset=UTF-8';
                }
                // Stop at the first Content-Type, modified or not
                return result;
            }
        }
        return result;
    };
}

Alexandre.Morgaut

Jul 16, 2009, 9:11:13 AM
to serverjs
Sorry to reply twice

The first response looked lost,
but the second one is a little bit better (returning the result faster
when the content type isn't modified) ;-)

I didn't look closely at the regex and I didn't test it either, but
it clearly can't catch all textual media types. It misses, for
example:
- 'application/javascript',
- 'application/ecmascript',
- 'application/json',
- 'application/xml',
- 'application/xml-dtd',
- 'application/xml-external-parsed-entity',
- 'application/sgml'
and others that are not registered, like:
- 'application/x-httpd-php-source',
- 'application/json-rpc'
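
One way the check could be widened, as a sketch only. The extra types are just the ones listed above, and `isTextualType` is a hypothetical name; whether every one of them should really get a charset parameter is a separate question:

```javascript
// Hypothetical list: the non-text/* types named in this message.
var TEXTUAL_TYPES = [
    'application/xhtml+xml',
    'application/javascript',
    'application/ecmascript',
    'application/json',
    'application/xml',
    'application/xml-dtd',
    'application/xml-external-parsed-entity',
    'application/sgml',
    'application/x-httpd-php-source',
    'application/json-rpc'
];

// True for a bare textual media type, i.e. one without any parameters yet.
function isTextualType(contentType) {
    if (contentType.indexOf(';') !== -1) {
        return false; // already has parameters, leave it alone
    }
    var type = contentType.toLowerCase(); // media type names are case-insensitive
    return /^text\/.+$/.test(type) || TEXTUAL_TYPES.indexOf(type) !== -1;
}

var checks = [
    isTextualType('text/html'),                 // true
    isTextualType('Application/JSON'),          // true
    isTextualType('image/png'),                 // false
    isTextualType('text/html; charset=latin1')  // false
];
```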

Wes Garland

Jul 16, 2009, 10:05:30 AM
to serv...@googlegroups.com
The problem with this approach is that it does not handle the case when two different components set the same header, incompatibly, with conflicting values.

If we don't re-think the header-setting API, I can guarantee you it will be a source of bugs in the long term.
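
A small illustration of that failure mode with plain-object headers. `ForcePlainText` is a hypothetical piece of middleware invented for this example:

```javascript
// An app that happens to use lower-case header names.
function inner(env) {
    return [200, { "content-type": "text/html" }, ["..."]];
}

// Hypothetical middleware that sets the same header with different casing.
function ForcePlainText(app) {
    return function (env) {
        var res = app(env);
        res[1]["Content-Type"] = "text/plain";
        return res;
    };
}

var conflicted = ForcePlainText(inner)({});
var conflictingNames = Object.keys(conflicted[1]);
// conflictingNames has two entries, "content-type" and "Content-Type",
// carrying different values for what HTTP treats as one header.
```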

Wes

mikewse

Sep 7, 2009, 10:04:11 AM
to CommonJS
[didn't notice this thread before]

Yes, I agree with Wes. Lacking "ES 4/6+ catch-all property
accessors", the easiest thing is probably to provide some form of
set/get API to gain some control over header assignments. Then any desired
algorithm can be hidden behind this API, e.g. doing a case-insensitive
match or using an extra internal field with fixed case for the actual
matching.
FWIW, this is roughly how Apache Tomcat does things: multiple header
assignments with different casing overwrite each other, no case
correction takes place, and the first "casing style" wins when sending to
the client.
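
A minimal sketch of such a set/get API, following the behaviour described above: matching is case-insensitive, later assignments overwrite earlier ones, and the first casing seen is the one emitted. `HeaderMap` and its method names are hypothetical, not taken from any spec or from Tomcat:

```javascript
// Hypothetical header container with case-insensitive get/set.
function HeaderMap() {
    this._names = Object.create(null);  // lower-cased name -> name as first written
    this._values = Object.create(null); // lower-cased name -> current value
}

HeaderMap.prototype.set = function (name, value) {
    var key = name.toLowerCase();
    if (!(key in this._names)) {
        this._names[key] = name; // first casing style wins
    }
    this._values[key] = value;   // later assignments overwrite the value
};

HeaderMap.prototype.get = function (name) {
    return this._values[name.toLowerCase()]; // undefined if never set
};

// Plain object suitable for the headers slot of a response array.
HeaderMap.prototype.toObject = function () {
    var out = {}, key;
    for (key in this._values) {
        out[this._names[key]] = this._values[key];
    }
    return out;
};

// Usage
var map = new HeaderMap();
map.set("Content-Type", "text/html");
map.set("CONTENT-TYPE", "text/plain");
var current = map.get("content-type"); // "text/plain"
var emitted = map.toObject();          // { "Content-Type": "text/plain" }
```

Middleware would then call `get` and `set` instead of indexing the object directly, so the casing question is decided in one place.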

Best regards
Mike
