An EndsWithSlash Controller

beppu

Jul 5, 2009, 3:08:55 AM
to squatting-framework
I was watching my access logs today, and I saw someone come in to one
of my sites from Google and then try to guess a URL based on the
path hierarchy.

He did something like this:

GET /foo/bar/baz => 200
GET /foo/bar/ => 404!

The problem with the second request was that he put a trailing slash
at the end, but the controller that he wanted matches a very similar
path that DOES NOT have the trailing slash. So then he received an
uninformative 404 response which prompted him to leave my site.

To prevent this from happening in the future, I came up with this
simple controller:

  C(
    EndsWithSlash => [ '/(.*)/' ],
    get => sub {
      my ($self, $path) = @_;
      $self->redirect("/$path", 301);
    }
  ),

This is a controller you can put towards the end of your controller
list, and it'll catch URLs that end with '/' and perform a 301
redirect to a version of the URL that doesn't end with '/'.
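
For illustration, here is a standalone sketch of what the pattern does
outside of Squatting. It assumes the URL pattern is matched against the
entire request path (anchored at both ends); redirect_target is a
hypothetical helper for this example, not part of Squatting.

```perl
use strict;
use warnings;

# Assumption: the controller's pattern is applied to the whole path,
# so it is anchored with ^ and $ here.
my $pattern = qr{^/(.*)/$};

# Hypothetical helper: returns the path to redirect to,
# or undef if the path does not end with a slash.
sub redirect_target {
    my ($path) = @_;
    return $path =~ $pattern ? "/$1" : undef;
}

for my $path ('/foo/bar/', '/foo/bar/baz') {
    my $target = redirect_target($path);
    printf "%-14s => %s\n", $path,
        defined $target ? "301 to $target" : 'not handled here';
}
```

Only the path with the trailing slash is caught; the captured text
becomes the redirect target with the slash removed.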

I thought this was generally useful and thought I'd share it w/ the
rest of you.

--beppu

Terrence Brannon

Jul 5, 2009, 3:52:29 AM
to squatting...@googlegroups.com

beppu wrote:
> I was watching my access logs today, and I saw someone come in to one
> of my sites from google and then try to guess a URL based on
> hierarchy.
>
> He did something like this:
>
> GET /foo/bar/baz => 200
> GET /foo/bar/ => 404!
>
> The problem with the second request was that he put a trailing slash
> at the end, but the controller that he wanted matches a very similar
> path that DOES NOT have the trailing slash. So then he received an
> uninformative 404 response which prompted him to leave my site
>

> To prevent this from happening in the future, I came up with this
> simple controller:
>
>   C(
>     EndsWithSlash => [ '/(.*)/' ],
>

This allows paths like "//". Maybe it should be /(.+)/
I looked in Regexp::Common and did not find a path regexp:
<http://search.cpan.org/~abigail/Regexp-Common-2.122/lib/Regexp/Common/URI.pm>


>     get => sub {
>       my ($self, $path) = @_;
>       $self->redirect("/$path", 301);
>     }
>   ),
>
> This is a controller you can put towards the end of your controller
> list, and it'll catch URLs that end with '/' and perform a 301
> redirect to a version of the URL that doesn't end with '/'.
>

Well, the DOCUMENT_ROOT feature of Apache would catch a URL with a
trailing / and try to find index.html in that directory.

Perhaps that should be option 1?

I have the same issue with the Unix command cp --archive:

if you put a trailing slash on the end, you get one behavior; otherwise
you get another.

beppu

Jul 5, 2009, 4:24:23 AM
to squatting-framework


On Jul 5, 12:52 am, Terrence Brannon <metap...@gmail.com> wrote:

> >   C(
> >     EndsWithSlash => [ '/(.*)/' ],
>
> this allows paths like this "//"  .. maybe it should be /(.+)/

Good point. You're right about the regex.
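
The difference between the two patterns can be checked with a small
standalone sketch. It assumes both patterns are matched against the
whole path; matches is a hypothetical helper for this example only.

```perl
use strict;
use warnings;

# Assumption: patterns are anchored at both ends of the path.
my $star = qr{^/(.*)/$};   # the pattern as originally posted
my $plus = qr{^/(.+)/$};   # the suggested replacement

# Hypothetical helper: 1 if $path matches $pattern, 0 otherwise.
sub matches {
    my ($pattern, $path) = @_;
    return $path =~ $pattern ? 1 : 0;
}

for my $path ('//', '/foo/') {
    printf "%-6s (.*): %-8s (.+): %s\n", $path,
        matches($star, $path) ? 'match' : 'no match',
        matches($plus, $path) ? 'match' : 'no match';
}
```

With (.*), the path "//" matches with an empty capture, so it would be
redirected to "/". With (.+), "//" is not matched at all, while
"/foo/" is still caught by both.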

> >     get => sub {
> >       my ($self, $path) = @_;
> >       $self->redirect("/$path", 301);
> >     }
> >   ),
>
> > This is a controller you can put towards the end of your controller
> > list, and it'll catch URLs that end with '/' and perform a 301
> > redirect to a version of the URL that doesn't end with '/'.
>
> Well the DOCUMENT_ROOT feature of apache would catch a URL with / and
> try to find index.html  in that directory.
>
> Perhaps that should be option 1?

You're thinking of mod_dir which does something similar in spirit:

http://httpd.apache.org/docs/2.2/mod/mod_dir.html

However, mod_dir's DirectorySlash directive *adds* a '/' instead of
removing it from the path.

--beppu

beppu

Jul 5, 2009, 4:35:06 AM
to squatting-framework
One sec... I didn't read your response carefully enough before I
responded.

We're both talking about mod_dir-related features, but you were
thinking of DirectoryIndex while I was thinking of DirectorySlash.
Neither can really solve my problem, because I'm trying to *remove*
something from the URL, but both DirectoryIndex and DirectorySlash
*add* things to the URL.

I think mod_rewrite is capable of solving my problem, but honestly,
I'd rather not go there if I don't have to.

--beppu

Terrence Brannon

Jul 5, 2009, 4:38:20 AM
to squatting...@googlegroups.com

beppu wrote:
> One sec... I didn't read your response carefully enough before I
> responded.
>
> We're both talking about mod_dir-related features, but you were
> thinking of DirectoryIndex while I was thinking of DirectorySlash.
>

ah yes!


> Neither can really solve my problem, because I'm trying to *remove*
> something from the URL,
>

I think you are over-idiot-proofing your site and, in the process,
making it counter-intuitive. The mapping between path syntax and
behavior is becoming less straightforward now.

It's kind of like the mess we are in with HTML because the web browser
people came up with permissive parsers.

beppu

Jul 5, 2009, 6:03:59 AM
to squatting-framework


On Jul 5, 1:38 am, Terrence Brannon <metap...@gmail.com> wrote:
> > Neither can really solve my problem, because I'm trying to *remove*
> > something from the URL,
>
> i think you are over idiot-proofing your site and in the process, you
> are making it counter-intuitive.
> the mapping between path syntax and behavior is becoming less
> straightforward now.

I'm going to respectfully disagree. The guy came in from Google,
landed on /foo/bar/baz, and got some content. He then tried to get
more general information by typing in /foo/bar/. That was a *GOOD*
guess on his part, because if he were accessing a static,
Apache-powered site, that would've been the correct path. However, he
happened to be on a Squatting-powered site where all the URLs are
virtual and '/foo/bar/' meant nothing to my app, but '/foo/bar'
would've given him exactly what he was looking for. He was just one
character off, and I knew exactly what he meant. Why not give him the
content he was looking for by kindly redirecting him?

He wasn't being an idiot. It was moderately clever to guess that
'/foo/bar/' might give him more general info. It happened to be an
incorrect guess at the time, but as the author of the web app, I can
change that. By adding that one EndsWithSlash controller, '/foo/bar/'
is now a correct guess as far as my web app is concerned. (There's no
standards body for URL patterns -- it's totally arbitrary and
rightfully so.)


> it's kind of like the mess we are in with HTML because the web browser
> people came up with
> permissive parsers.

URL correctness and HTML correctness are completely different issues
due to who gets to define correctness and who gets to interpret
correctness.

URL correctness is defined by the coder of the web app and is
interpreted by the web app. It's pretty arbitrary, and there is no
possibility for conflict, because there's only 1 party involved -- in
this case, me.

However, HTML correctness is defined by the W3C and is interpreted by
various (permissive) web browsers. Now, we have plenty of room for
conflict, because the number of participants == 1 +
scalar(@browser_vendors). Every browser vendor has deviated from the
standard in one way or another, and we know how that turned out.

So getting back to URLs, if I choose to strip off trailing slashes in
URLs to my web app, it's literally correct, because I say so. It's my
app, my house, my rules, etc. ;-) Furthermore, no one else is
affected by my decision. No one else has to strip trailing slashes
from their URLs if it doesn't make sense for their site. My decision
and its repercussions are isolated to my site alone.

It's not like HTML where one major browser vendor made closing tags
optional and now every browser vendor has to follow suit. That
situation has a totally different dynamic.

--beppu

Terrence Brannon

Jul 5, 2009, 8:19:50 PM
to squatting...@googlegroups.com
Of course the biggest question/issue: why did he have to guess at all?
Where is the breadcrumb/navbar?


beppu

Jul 6, 2009, 8:17:47 PM
to squatting-framework
It was there in plain sight, in a big font, with not many other links
around it. I think he just felt like typing.

Scott Walters

Jul 6, 2009, 8:49:45 PM
to squatting...@googlegroups.com
Pardon me butting in, but it's a few lines of code, and it's
completely optional. It's stuff you can paste into your Squatting
app, not stuff you get thrust upon you as part of a detail. Win, win.
Put it in the cookbook and let people cook with it if they please =)

-scott