URL case sensitivity


Victor Song

Apr 28, 2011, 10:21:05 PM
to expre...@googlegroups.com
Are URLs case sensitive in Express? The HTML4 spec says so (except for machine names); I did not find anything in the HTML5 spec.

Laurie Harper

Apr 29, 2011, 1:24:12 AM
to expre...@googlegroups.com
The HTML recommendation doesn't specify URLs; that is covered by RFC 1738 (for the general format) and RFC 2616 (for HTTP URLs). But regardless of what is specified where, the answer is that Express treats the path part of HTTP URLs case-insensitively -- that is, in the URL "http://hostname/path/to/something", the "path/to/something" part is not case-sensitive.

L.


--
Laurie Harper
http://laurie.holoweb.net/

Victor Song

Apr 29, 2011, 8:36:35 AM
to expre...@googlegroups.com
Thanks for the clarification. I first noticed this in a W3C doc at http://www.w3.org/TR/WD-html40-970708/htmlweb.html

It seems section 3.2.3 of RFC 2616 says the "path/to/something" part should be case sensitive: http://www.ietf.org/rfc/rfc2616.txt
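To make the comparison rule concrete, here is a rough sketch of an equivalence check in the spirit of RFC 2616 section 3.2.3 (scheme and host compared case-insensitively, default port ignored, an empty path treated as "/", everything else compared case-sensitively). The helper name `sameHttpUrl` is hypothetical, and the sketch leans on Node's built-in `URL` parser rather than implementing the RFC directly, so it does not cover the section's percent-encoding rule.

```javascript
// Sketch only: sameHttpUrl is a hypothetical helper, not part of Express or Node.
// The WHATWG URL parser lowercases the scheme and host, drops the default
// port, and turns an empty path into "/", but leaves the path's case untouched.
function sameHttpUrl(x, y) {
  return new URL(x).href === new URL(y).href;
}

// Scheme, host, and default port differences do not matter.
console.log(sameHttpUrl("HTTP://EXAMPLE.com:80/path/to/something",
                        "http://example.com/path/to/something")); // true

// A difference in path case makes the URLs non-equivalent.
console.log(sameHttpUrl("http://example.com/path/to/something",
                        "http://example.com/Path/To/Something")); // false
```

Note that this only answers "are these two URLs equivalent?", which is a separate question from which resource a server chooses to serve for each of them.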

Joshua Cohen

Apr 29, 2011, 8:49:27 AM
to expre...@googlegroups.com
I think Express does the right thing. In cases like this I tend to fall back to the robustness principle: "Be conservative in what you send; be liberal in what you accept." Since that part of the RFC is a SHOULD and not a MUST, unless there's a compelling reason to differentiate, it seems more troublesome than it's worth to do so.


Victor Song

Apr 29, 2011, 10:46:52 AM
to expre...@googlegroups.com
I improved my English by reading RFC2119.

Is the URL case insensitivity determined by Express, Connect, or Node? It would be helpful if design decisions like this were documented formally. Developers could then follow the same design when working on subsystems. For example, caches should also treat URLs case-insensitively.

Another issue on doc: I wonder if API docs of Node/Connect/Express can show inherited methods, as JavaDoc does. A JS example.
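The caching remark above could be sketched as follows: a cache that wants to agree with case-insensitive routing can normalize the path before using it as a key. Everything here is an assumption made for illustration: the helper name `cacheKey`, the placeholder base URL, and the choice to lowercase only the path while leaving the query string as-is (whether query parameters should also be folded depends on the application).

```javascript
// Hypothetical helper: builds a cache key that ignores case in the path only.
// The base URL is a placeholder so that relative request URLs can be parsed.
function cacheKey(requestUrl) {
  const parsed = new URL(requestUrl, "http://placeholder.invalid");
  return parsed.pathname.toLowerCase() + parsed.search;
}

const cache = new Map();
cache.set(cacheKey("/Users/42?Sort=Name"), "<rendered page>");

// The same path in a different case hits the same entry.
console.log(cache.has(cacheKey("/users/42?Sort=Name"))); // true

// The query string is still compared exactly, by design of this sketch.
console.log(cache.has(cacheKey("/users/42?sort=name"))); // false
```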

TJ Holowaychuk

Apr 29, 2011, 11:51:50 AM
to expre...@googlegroups.com
the regexps generated by Express set the "i" flag so they are case-insensitive. mainly a personal preference, but it could be something that we could allow toggling with an option, though I imagine most people will maintain lowercased pathnames and prefer the way it is now
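As an illustration of what the "i" flag does here, the following is a heavily simplified stand-in for a path compiler. It is not Express's actual code: the function name `pathToRegExp`, its `caseSensitive` parameter, and the handling of `:name` placeholders are assumptions made for the sketch.

```javascript
// Hypothetical, simplified path compiler (not the real Express implementation).
function pathToRegExp(path, caseSensitive = false) {
  const pattern = path
    .replace(/[.*+?^${}()|[\]\\]/g, "\\$&") // escape regex metacharacters
    .replace(/:(\w+)/g, "([^/]+)");          // ":name" becomes a capture group
  // The "i" flag is what makes matching ignore case.
  return new RegExp("^" + pattern + "\\/?$", caseSensitive ? "" : "i");
}

const insensitive = pathToRegExp("/users/:id");
const sensitive = pathToRegExp("/users/:id", true);

console.log(insensitive.test("/Users/42")); // true
console.log(sensitive.test("/Users/42"));   // false

// Captured values keep the case the client sent.
console.log(insensitive.exec("/USERS/Bob")[1]); // "Bob"
```

A toggle like the one floated above would then amount to choosing whether the flag is passed.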

-- 
TJ Holowaychuk


Andrew W. Donoho

Apr 29, 2011, 12:05:45 PM
to expre...@googlegroups.com

On Apr 29, 2011, at 10:51 , TJ Holowaychuk wrote:

> the regexps generated by Express set the "i" flag so they are case-insensitive. mainly a personal preference, but it could be something that we could allow toggling with an option, though I imagine most people will maintain lowercased pathnames and prefer the way it is now


Gentlefolk,

While an IETF "SHOULD" recommendation can be ignored, what is the reason to ignore it? Why is it a virtue that Express overrules a "SHOULD" recommendation? Why should developers even have to be aware of this issue?

Being liberal in what you accept does not mean you should change the interpretation of URLs. Doing so does lead to subtle bugs.

Anon,
Andrew

Laurie Harper

Apr 29, 2011, 3:38:22 PM
to expre...@googlegroups.com
On 2011-04-29, at 8:36 AM, Victor Song wrote:
> Thanks for the clarification. I first noticed this in a W3C doc at http://www.w3.org/TR/WD-html40-970708/htmlweb.html
>
> It seems section 3.2.3 of RFC 2616 says the "path/to/something" part should be case sensitive: http://www.ietf.org/rfc/rfc2616.txt

For purposes of comparing two http URLs for equivalence, yes. That doesn't mean that URLs that are not equivalent can't map to the same resource. 

On 2011-04-29, at 12:05 PM, Andrew W. Donoho wrote:
> While an IETF "SHOULD" recommendation can be ignored, what is the reason to ignore it? Why is it a virtue that Express overrules a "SHOULD" recommendation? Why should developers even have to be aware of this issue?
>
> Being liberal in what you accept does not mean you should change the interpretation of URLs. This way does lead to subtle bugs.

Again, what was cited describes the semantics for comparing URLs, *not* for interpreting them. Traditionally, Windows-based web servers have always treated URLs in a case-insensitive manner, since the underlying filesystem is case insensitive; *nix-based servers generally treat URLs as case sensitive, since case makes a difference at the filesystem level.

The point being, it is the job of the web server / web application to determine what resource corresponds to a particular URL. Mapping URLs which vary only by case to the *same* URL is perfectly within the language and intent of the relevant specs.
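The distinction between comparing URLs and mapping them to resources can be shown with a small sketch. The `resources` table and the `resolve` function are hypothetical names invented for the example; they only stand in for whatever lookup a server performs.

```javascript
// Comparison: as plain strings, these two paths are not equivalent.
const lower = "/path/to/something";
const mixed = "/Path/To/Something";
console.log(lower === mixed); // false

// Mapping: a server is still free to resolve both to the same resource.
// Hypothetical lookup table keyed by the lowercased path.
const resources = new Map([["/path/to/something", "something.html"]]);

function resolve(path) {
  return resources.get(path.toLowerCase());
}

console.log(resolve(lower)); // "something.html"
console.log(resolve(mixed)); // "something.html"
```

The two paths stay distinct identifiers; only the server's choice of resource coincides.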

Andrew W. Donoho

Apr 29, 2011, 6:37:31 PM
to expre...@googlegroups.com

On Apr 29, 2011, at 14:38, Laurie Harper wrote:

> The point being, it is the job of the web server / web application to determine what resource corresponds to a particular URL. Mapping URLs which vary only by case to the *same* URL is perfectly within the language and intent of the relevant specs.

Laurie,

You did not actually answer my question. You stated what you could do and still remain in conformance with the spec.

I asked: "Why is it a virtue that Express overrules a "SHOULD" recommendation?"

In my view, unless there is a good implementation reason to override a "SHOULD" recommendation, then implementers should not do so. In this case, we are told that this recommendation is being overridden by applying a single option to the regex engine. If the implementation did nothing it would be in conformance with the standard's "SHOULD" recommendation.

IOW, the implementation is affirmatively trying to ignore the "SHOULD" recommendation. Why is this a virtue? 


Anon,
Andrew

Joshua Cohen

Apr 29, 2011, 6:53:44 PM
to expre...@googlegroups.com
I can see the case for making this configurable at the app level. My personal preference for not wanting to force case sensitivity on users is just that. If I (or Express) was on the other side I'd certainly like that flexibility.

Of course, at this point changing the default to match the RFC would have implications with regard to backwards compatibility with existing deployed applications (not that maintainers would have to do much to restore the previous behavior).

TJ Holowaychuk

Apr 29, 2011, 6:55:33 PM
to expre...@googlegroups.com
Yeah same here.. I'm 100% fine with an option, but it's not something I would use personally.

-- 
TJ Holowaychuk

Laurie Harper

Apr 29, 2011, 7:30:14 PM
to expre...@googlegroups.com
On 2011-04-29, at 6:37 PM, Andrew W. Donoho wrote:
> On Apr 29, 2011, at 14:38, Laurie Harper wrote:
>> The point being, it is the job of the web server / web application to determine what resource corresponds to a particular URL. Mapping URLs which vary only by case to the *same* URL is perfectly within the language and intent of the relevant specs.
>
> Laurie,
>
> You did not actually answer my question. You stated what you could do and still remain in conformance with the spec.
>
> I asked: "Why is it a virtue that Express overrules a "SHOULD" recommendation?"

Actually, I did answer that, but you trimmed that part of my response. To reiterate and clarify: the spec says that, when *comparing* two HTTP URLs, they are considered identical only if they have matching case. It does not say that URLs which do differ in case may not resolve to the same resource. Express isn't comparing URLs for equivalence, it's mapping URLs to resources. No specifications are being violated, in spirit or intent, in any way as far as I can see.

BTW, if you don't like the way Express matches URLs by default (which is perfectly valid) you are free to change the behaviour. Should you want to do so:

 * currently: use regular expressions instead of strings to specify your paths; Express will use them as-is, and you have full control (or you could fork and modify Connect's 'route' middleware)

 * in the future: TJ has already said he'd be happy to support a config option to disable case insensitivity in route matching; if that gets implemented, you can just flip the switch.
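The first option above can be sketched with a toy route table. This is not Express code: `compile`, `match`, and the route names are hypothetical, and the only behaviour borrowed from the thread is that string paths are compiled with the "i" flag while regular expressions are used as-is. In an actual Express app, the suggestion amounts to passing a RegExp instead of a string as the route path.

```javascript
// Hypothetical route compiler mirroring the behaviour described in the thread:
// strings are compiled case-insensitively, RegExp objects are left untouched.
function compile(path) {
  if (path instanceof RegExp) return path; // used as-is, author keeps full control
  const escaped = path.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp("^" + escaped + "\\/?$", "i");
}

// Returns the name of the first matching route, or null if none matches.
function match(routes, url) {
  const hit = routes.find((route) => route.regexp.test(url));
  return hit ? hit.name : null;
}

const routes = [
  { name: "strict", regexp: compile(/^\/Reports\/?$/) }, // no "i" flag: case-sensitive
  { name: "loose", regexp: compile("/about") },          // string: case-insensitive
];

console.log(match(routes, "/Reports")); // "strict"
console.log(match(routes, "/reports")); // null
console.log(match(routes, "/ABOUT"));   // "loose"
```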

TJ Holowaychuk

Apr 29, 2011, 7:31:52 PM
to expre...@googlegroups.com
I'll add it right now so I don't forget.

-- 
TJ Holowaychuk

TJ Holowaychuk

Apr 29, 2011, 7:42:05 PM
to expre...@googlegroups.com

Victor Song

Apr 29, 2011, 10:19:18 PM
to expre...@googlegroups.com
RFC 3986 states that non-equivalent URIs can identify the same resource. W3C specs do not seem to be intuitive reading material. Express does seem standards-compliant on this issue. However, the design choices should be documented. Thanks for another example of documentation that is needed:
> use regular expressions instead of strings to specify your paths; Express will use them as-is, and you have full control
