I've opened a ticket for tracking:
http://sinatra.lighthouseapp.com/projects/9779-sinatra/tickets/147
When I get a sec, I'll try removing the URL encoding logic and see
where the specs fail.
Thanks,
Ryan
I believe servers are supposed to process that as "/this/path/has/two/components".
Ryan
You're sure? What would be the point of encoding the '/' if you can't
make it different from the semantic use of '/' as the separator
between path segments?
I think it's a path with two components, "this/path" and "has/two/components".
http://labs.apache.org/webarch/uri/rfc/rfc3986.html#path
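To make the two readings concrete, here is a minimal Ruby sketch. The helper names are hypothetical, and the raw path "/this%2Fpath/has%2Ftwo%2Fcomponents" is an assumed example chosen to match the components discussed above. It only shows that the order of splitting and percent-decoding decides whether an encoded slash survives as data.

```ruby
# Hypothetical helper: decode %XX escapes in a string.
# Works on bytes, then relabels the result as UTF-8.
def percent_decode(str)
  str.b.gsub(/%([0-9A-Fa-f]{2})/) { $1.hex.chr }.force_encoding("UTF-8")
end

# Reading 1: split on the literal "/" first, then decode each segment.
# An encoded "%2F" stays inside its segment.
def segments_split_then_decode(raw_path)
  raw_path.split("/").reject(&:empty?).map { |seg| percent_decode(seg) }
end

# Reading 2: decode the whole path first, then split.
# An encoded "%2F" becomes an ordinary separator.
def segments_decode_then_split(raw_path)
  percent_decode(raw_path).split("/").reject(&:empty?)
end

raw = "/this%2Fpath/has%2Ftwo%2Fcomponents" # assumed example path

p segments_split_then_decode(raw)
p segments_decode_then_split(raw)
```

The first call yields two segments and the second yields five, which is the whole disagreement in miniature.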
Sam
I'm definitely not sure. I didn't realize that slashes after the first
had any semantic value to be honest. I thought the entire path part
was an opaque value. But browsers obviously have some smarts here
since they're capable of traversing relative URLs like "../foo/bar".
> I think it's a path with two components, "this/path" and "has/two/components".
>
> http://labs.apache.org/webarch/uri/rfc/rfc3986.html#path
Interesting. And it looks like Apache at least considers the URLs to
be different:
http://labs.apache.org/webarch%2Furi%2Frfc%2Frfc3986.html#path
http://labs.apache.org/webarch/uri/rfc/rfc3986.html#path
Nginx treats them as the same (i.e., "foo/bar" and "foo%2Fbar" are equivalent):
http://tomayko.com/misc%2Fbob/
http://tomayko.com/misc/bob/
I haven't tried other servers.
Still, I can't think of any practical advantage in treating them
differently. Can you? IMO, embedding slashes in the path part is
begging for trouble. If that's the only issue with comparing routes
after decoding and it fixes all of the other issues, I'd be very much
for accepting the limitation of not being able to embed slashes in
path segments.
Thanks,
Ryan
It's a corner case, but I can think of reasons to put slashes in paths
for stuff I'm working on. One reason might be embedding entire URIs as a
path component (rather than as a query parameter).
I assume I can do that with Sinatra, though; it's just that I can't match
it with routes? That'd be OK, and if somebody was dead set, they could
do their own routing as long as they can get the undecoded path
component of the URI.
I tried to confirm this by looking for the docs on how route matching
is done, and couldn't find them. It looks like routes can be a RegEx,
or a vaguely glob-like pattern? Except that unlike real globs, the *
can match a / character? So ** isn't used for matching multiple path
levels?
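For what it's worth, here is how I would guess such patterns could be turned into a regex. This is only a sketch under my own assumptions (":name" captures one segment without "/", "*" captures anything including "/"); compile_route and match_route are hypothetical names, not Sinatra's actual code.

```ruby
# Hypothetical compiler from a glob-like route pattern to a Regexp.
# Assumption: ":name" captures a single path segment (no "/"),
# while "*" captures any run of characters, "/" included.
def compile_route(pattern)
  keys = []
  source = pattern.gsub(/:\w+|\*|[^:*]+/) do |token|
    case token
    when "*"
      keys << "splat"
      "(.*?)"
    when /\A:(\w+)\z/
      keys << $1
      "([^/]+)"
    else
      Regexp.escape(token) # literal text between placeholders
    end
  end
  [Regexp.new("\\A#{source}\\z"), keys]
end

# Returns a hash of captured params, or nil when the path does not match.
def match_route(pattern, path)
  regex, keys = compile_route(pattern)
  m = regex.match(path)
  m && keys.zip(m.captures).to_h
end

p match_route("/hello/:name", "/hello/foo")
p match_route("/hello/:name", "/foo")
p match_route("/files/*", "/files/a/b/c")
```

Under these assumptions a single "*" already spans several path levels, so a separate "**" would have nothing left to do.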
I've only found examples, and I believe that at least one example in
the intro must have a typo:
  get '/hello/:name' do
    # matches "GET /foo" and "GET /bar"
    # params[:name] is 'foo' or 'bar'
  end
- http://www.sinatrarb.com/intro.html
The GET commands must be missing a leading /hello?
Cheers,
Sam
Well, there's actually nothing more "RESTful" about stuffing things
like this into the path part than the query string. URLs that are
simple and meaningful definitely have value but there's nothing in
REST that favors the path part of a URL over the query string. I'd
stick to the query string for values like this or replace the slash
with a dash or other character. You're begging for compatibility
issues with a scheme like that.
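As a rough illustration of the query-string approach, here is a small Ruby sketch using the standard uri library. The "/lookup" path, the "url" parameter name, and the embedded URI are made up for the example.

```ruby
require "uri"

# Hypothetical embedded value: a full URI we want to pass along.
embedded = "http://example.org/a/b?x=1"

# Encode it as a query parameter; "/" and ":" are escaped, so the
# path part of the request contains no extra slashes.
query = URI.encode_www_form("url" => embedded)
request_target = "/lookup?#{query}" # "/lookup" is an assumed route

# On the receiving side, decoding the query string recovers the value.
params = URI.decode_www_form(query).to_h

puts request_target
puts params["url"]
```

The value round-trips unchanged, and nothing here depends on how a given server treats "%2F" in the path.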
As much as I dislike the idea of making it hard to take advantage of
features of the underlying specs (HTTP, URIs, etc.), this still feels
like a really good trade-off that fixes a bunch of practical issues
for the cost of one corner-case feature that's also a bad practice.
Thanks,
Ryan