How to collect the tail of a variable length URL with encoded slashes?

XDF

Apr 24, 2011, 11:13:59 PM
to Mojolicious
I have built an application that uses wildcard placeholders to match
variable length URLs. I have these routing rules:

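my $r = $self->routes;
# First segment: language code
my $l = $r->route('/:lang', lang => qr/en|fi/);
# Second segment: action name
my $v = $l->route('/:action', action => qr/browse|inspect|show/);
# Everything after the action is captured by the "path" wildcard placeholder
$v->route('(*path)')->to(controller => 'view');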

The tail of the URL is successfully collected into $stash->{path}, so
the rules work OK. The issue I'm having is this: if the tail contains
encoded slashes, there is no way to distinguish them from the real URL
slashes. For example, with the URL "/en/browse/fnum/f%2F1.4/fnum/f%2F1.8/"
I get the $stash->{path} value "fnum/f/1.4/fnum/f/1.8/", so all the
slashes look the same and there is no way to correctly split the value
into key-value pairs. It should be "fnum -> f/1.4, fnum -> f/1.8", not
"fnum -> f, 1.4 -> fnum, f -> 1.8".

Is there a way to collect the tail of a variable length URL so that
encoded slashes are not mixed with real URL slashes? Of course I could
get the raw URL via $self->req->url->path->to_string and process it
from there but that seems to be an ugly way out.
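
To illustrate the splitting being asked for, here is a minimal sketch in plain Perl. It assumes the raw, still-encoded tail is somehow available as a string (whether Mojolicious exposes it is exactly the open question). The helper names decode_segment and parse_raw_tail are made up for this example and are not part of any framework.

```perl
use strict;
use warnings;

# Percent-decode a single path segment (core Perl only).
sub decode_segment {
    my ($segment) = @_;
    $segment =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $segment;
}

# Split a raw (still encoded) tail into [key, value] pairs.
# Splitting happens BEFORE decoding, so %2F never acts as a separator.
sub parse_raw_tail {
    my ($raw_tail) = @_;
    my @segments = grep { length } split m{/}, $raw_tail;
    my @pairs;
    while (@segments >= 2) {
        my $key   = decode_segment(shift @segments);
        my $value = decode_segment(shift @segments);
        push @pairs, [$key, $value];
    }
    return \@pairs;
}

my $pairs = parse_raw_tail('fnum/f%2F1.4/fnum/f%2F1.8/');
print "$_->[0] -> $_->[1]\n" for @$pairs;
```

The key point is the order of operations: split on literal slashes first, decode each segment afterwards. A trailing segment without a partner is silently ignored in this sketch.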

Sebastian Riedel

Apr 24, 2011, 11:21:49 PM
to mojol...@googlegroups.com
Even with $self->req->url->path you should not be able to get this information.
And considering our recent security vulnerability (which was in fact related to this), as well as the portability problems, I very much doubt we will support this in the future.

--
Sebastian Riedel
http://mojolicio.us
http://twitter.com/kraih
http://blog.kraih.com

XDF

Apr 27, 2011, 10:28:33 PM
to Mojolicious

Thank you for the reply.

I think that the raw unprocessed url should be available somewhere,
for example at $self->req->url->raw. I don't mind that it is not
available for standard routing but it should be readable somewhere in
the system.

I have found a workaround for my system. The URLs are generated by the
system and the key-value pairs in the URLs have predefined keys, so I
was able to write a hack that merges the URL parts so that the correct
key-value pairs can be found. Strangely, double-encoding the slashes
didn't work: they still got automatically decoded back (Mojolicious
version 1.16).
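
The actual hack is not shown in this thread; the following is only a rough sketch of how such a merge could look, assuming the keys are known in advance. The key list (fnum, iso, lens) and the function name merge_decoded_tail are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical set of keys the application generates.
my %KNOWN_KEY = map { $_ => 1 } qw(fnum iso lens);

# Rebuild [key, value] pairs from an already decoded tail such as
# "fnum/f/1.4/fnum/f/1.8/", where encoded and real slashes look alike.
sub merge_decoded_tail {
    my ($decoded_tail) = @_;
    my @pairs;
    for my $part (grep { length } split m{/}, $decoded_tail) {
        # A known key starts a new pair, unless the previous key
        # has not received any value yet.
        if ($KNOWN_KEY{$part} && (!@pairs || defined $pairs[-1][1])) {
            push @pairs, [$part, undef];
        }
        elsif (@pairs) {
            # Anything else belongs to the current value; rejoin with "/".
            $pairs[-1][1] = defined $pairs[-1][1]
                ? join('/', $pairs[-1][1], $part)
                : $part;
        }
        # Parts that appear before the first known key are ignored.
    }
    return \@pairs;
}

my $pairs = merge_decoded_tail('fnum/f/1.4/fnum/f/1.8/');
print "$_->[0] -> $_->[1]\n" for @$pairs;
```

This approach is inherently ambiguous: a value that contains a slash followed by something equal to a known key would be split in the wrong place, so it only works when the generated values can never look like keys.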

Generally, encoded slashes seem to be an issue in many applications,
frameworks and web servers.

Ben van Staveren

Apr 28, 2011, 12:52:11 AM
to mojol...@googlegroups.com

> I have found a workaround for my system. The urls are generated by the
> system and the key-value -pairs in urls have predefined keys. So I was
> able to write a hack to merge the url parts so that the correct key-
> value -pairs can be found. Strangely double-encoding of slashes didn't
> work: they still got automatically decoded back (Mojolicious version
> 1.16).
There's a reason they still get decoded :) Security-wise, you want to decode
anything that comes in, and double-encoding is a known trick to get past
simplistic decoding. What you're asking for is basically also asking for
security trouble ;)


XDF

Apr 28, 2011, 5:23:33 AM
to Mojolicious

Actually, what I'm asking for is a read-only string containing the raw
URL :-D As I understand it, most web systems have eventually added a
configuration option to disable slash decoding (= allow encoded slashes
and pass them through unmodified) or at least a way to access the raw
URL. The Catalyst CGI engine has "use_request_uri_for_path" to bypass
decoding. Apache has "AllowEncodedSlashes" to bypass the basic security
check. Tomcat has
org.apache.tomcat.util.buf.UDecoder.ALLOW_ENCODED_SLASH. PSGI has
REQUEST_URI, which is "the undecoded, raw request URL line", adding
"this value SHOULD NOT be decoded by servers".

In fact, if I run Mojolicious with plackup, I get the raw URI at
$self->req->env->{REQUEST_URI}. So I now have two workarounds: do the
hack post-processing of the URI as explained in my previous post, or
run Mojolicious via PSGI, at least in production. By the way, running
Mojolicious on PSGI is both easy and cool, and the documentation gets
it right: "Mojolicious applications are ridiculously simple to deploy
with Plack".
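
For illustration, a small sketch of pulling the still-encoded tail out of a PSGI-style environment hash. It uses a plain hash so it runs on its own; inside a controller the hash would be whatever $self->req->env returns under plackup, as described above. The function name raw_tail is made up, and the prefix pattern simply mirrors the routing rules from the first post.

```perl
use strict;
use warnings;

# Return the raw (still percent-encoded) tail that follows the
# "/:lang/:action/" prefix, or undef if REQUEST_URI is missing
# or does not match the expected prefix.
sub raw_tail {
    my ($env) = @_;
    my $uri = $env->{REQUEST_URI};
    return undef unless defined $uri;

    # REQUEST_URI includes the query string; drop it (and any fragment).
    $uri =~ s/[?#].*\z//s;

    return $uri =~ m{\A/(?:en|fi)/(?:browse|inspect|show)/(.*)\z}s
        ? $1
        : undef;
}

my %env = (REQUEST_URI => '/en/browse/fnum/f%2F1.4/fnum/f%2F1.8/?page=2');
print raw_tail(\%env), "\n";
```

Since REQUEST_URI is only populated when running under PSGI in this setup, the undef return value gives the caller a place to fall back to the decoded path.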