* Currently, many request handlers match very liberally. For example,
any request that matches "/admin.*" will be handled by the admin module,
including "/adminlskjdfkljsd". Is this intentional? If not, shouldn't
this be fixed?
* Many handlers allow a trailing /, for example "/newticket" and
"/newticket/". What is the reason for this? Noah has mentioned this
being a convenience for people entering URLs that don't know what they
are doing. Are there any other reasons?
* The "matching precision" is very variable between handlers. Some
handlers match only precise URLs (e.g. "/tickets/([0-9]+)$"), others are
more graceful (e.g. "/report(?:/([0-9]+))?", which will show the list of
available reports for "/report/bogus").
Is there a rule for how precise a match_request() should be? I see two
possibilities:
* Be strict in what a handler accepts. This has the advantage of
giving a more precise answer for malformed URLs (usually a "No handler
matched request to ..."). But it is less forgiving for user-entered
URLs, and might therefore have usability issues. If this option should
be followed, then a trailing / should probably not be accepted.
* Be forgiving, and return a reasonable result for any URL matching at
least the handler's root component. For example, in the case of the
timeline, show the timeline for any request matching "/timeline/.*". The
advantage is that users are less likely to be shown an error message.
But this is technically less correct.
Opinions? Which rule should I follow?
-- Remy
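
To make the two rules above concrete, here is a minimal Python sketch. It assumes a handler tests the request path with re.match. The two "loose" patterns are the ones quoted in the message; the anchored "strict" variants and the helper name matches() are hypothetical and only serve the illustration.

```python
import re

# Patterns quoted in the thread.
REPORT_LOOSE = re.compile(r'/report(?:/([0-9]+))?')
ADMIN_LOOSE = re.compile(r'/admin.*')

# Hypothetical strict variants: anchoring with $ rejects trailing garbage.
REPORT_STRICT = re.compile(r'/report(?:/([0-9]+))?$')
ADMIN_STRICT = re.compile(r'/admin(?:/.*)?$')


def matches(pattern, path):
    """Return True if the pattern accepts the path from its start."""
    return pattern.match(path) is not None


for path in ('/report', '/report/7', '/report/bogus', '/adminlskjdfkljsd'):
    print(path,
          'loose:', matches(REPORT_LOOSE, path) or matches(ADMIN_LOOSE, path),
          'strict:', matches(REPORT_STRICT, path) or matches(ADMIN_STRICT, path))
```

With the loose report pattern, "/report/bogus" matches but the numeric group stays empty, which is why the report list is shown. The anchored variant rejects it, so the request would fall through to the "No handler matched" error.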
Not intentional AFAIK, should be fixed.
> * Many handlers allow a trailing /, for example "/newticket" and
> "/newticket/". What is the reason for this? Noah has mentioned this
> being a convenience for people entering URLs that don't know what
> they are doing. Are there any other reasons?
The problem here is that if you don't allow a trailing slash, you
should at least automatically redirect from e.g. /foo/ to /foo (most
of the web does this the other way around, but hey).
So if we get more strict here, we need to add some kind of
slash-stripping redirector thing.
> * The "matching precision" is very variable between handlers. Some
> handlers match only precise URLs (e.g. "/tickets/([0-9]+)$"), others
> are more graceful (e.g. "/report(?:/([0-9]+))?", which will show the
> list of available reports for "/report/bogus").
>
> Is there a rule for how precise a match_request() should be? I see
> two possibilities:
>
> * Be strict in what a handler accepts. This has the advantage of
> giving a more precise answer for malformed URLs (usually a "No
> handler matched request to ..."). But it is less forgiving for user-
> entered URLs, and might therefore have usability issues. If this
> option should be followed, then a trailing / should probably not be
> accepted.
>
> * Be forgiving, and return a reasonable result for any URL matching
> at least the handler's root component. For example, in the case of
> the timeline, show the timeline for any request matching
> "/timeline/.*". The advantage is that users are less likely to be
> shown an error message. But this is technically less correct.
>
> Opinions? Which rule should I follow?
I very much prefer strict. The rule should be not to expose the same
resource/representation under multiple different URIs. So even if
there are valid convenience features such as allowing both /foo/ and
/foo, one needs to redirect (as in 301) to the other.
One issue here is that changes in this space may break URIs out there
on the web (bookmarks, search indices, links etc). That's something we
need to be very careful about: we're not just "breaking" the Trac
site, but the sites of all the Trac users out there. :P
Cheers,
Chris
--
Christopher Lenz
cmlenz at gmx.de
http://www.cmlenz.net/
Ok, I'll do that.
>> * Many handlers allow a trailing /, for example "/newticket" and
>> "/newticket/". What is the reason for this? Noah has mentioned this
>> being a convenience for people entering URLs that don't know what
>> they are doing. Are there any other reasons?
>
> The problem here is that if you don't allow a trailing slash, you
> should at least automatically redirect from e.g. /foo/ to /foo (most
> of the web does this the other way around, but hey).
I think some web servers give a slightly different meaning to each.
/foo/ means "foo" is a directory, and /foo that "foo" is a "file". IIRC
Apache does it that way. But in the era of dynamic web services, this
doesn't make sense anymore, and /foo seems more natural.
> So if we get more strict here, we need to add some kind of
> slash-stripping redirector thing.
Ok.
>> Opinions? Which rule should I follow?
>
> I very much prefer strict. The rule should be not to expose the same
> resource/representation under multiple different URIs. So even if
> there are valid convenience features such as allowing both /foo/ and
> /foo, one needs to redirect (as in 301) to the other.
This makes perfect sense.
> One issue here is that changes in this space may break URIs out there
> on the web (bookmarks, search indices, links etc). That's something we
> need to be very careful about: we're not just "breaking" the Trac
> site, but the sites of all the Trac users out there. :P
Breaking invalid URLs like "/adminlskjdfkljsd" should not be a problem.
Breaking e.g. "/timeline/bogus/postfix" might or might not be a problem,
I don't know. The only breakage we really care about is bookmarks and
links, as search indices tend to be rebuilt quite often.
I'll try to do a search for links to Trac sites, and see if any invalid
URLs pop up.
Thanks for your comments.
-- Remy
I thought about adding this as a core system, but there are places where a
trailing / is probably legal. The biggest one is attachment URLs, since / is
a legal filename character on some OSes. Granted, none of those are major ones,
so it would probably be fine to add this at the level of the URL dispatcher.
--Noah
The attachment module treats e.g. "/attachment/wiki/Projects/Trac/" and
"/attachment/wiki/Projects/Trac" differently. The former shows the list
of attachments for the Projects/Trac page, and the second shows the
attachment with the name "Trac" attached to the page Projects.
Maybe the file name and the path should be reversed:
/attachment/wiki/MyFile/Projects/Trac
would open the attachment named "MyFile" for page "Projects/Trac".
-- Remy
> -----Original Message-----
> From: trac...@googlegroups.com [mailto:trac...@googlegroups.com] On
> Behalf Of Remy Blank
> Sent: Tuesday, September 02, 2008 2:52 PM
> To: trac...@googlegroups.com
> Subject: [Trac-dev] Re: Request match "precision" in match_request()
>
This breaks hierarchical semantics and grouping though.
--Noah
>> The attachment module treats e.g. "/attachment/wiki/Projects/Trac/" and
>> "/attachment/wiki/Projects/Trac" differently. The former shows the list
>> of attachments for the Projects/Trac page, and the second shows the
>> attachment with the name "Trac" attached to the page Projects.
>>
>> Maybe the file name and the path should be reversed:
>>
>> /attachment/wiki/MyFile/Projects/Trac
>>
>> would open the attachment named "MyFile" for page "Projects/Trac".
>
> This breaks hierarchical semantics and grouping though.
Yes, and it doesn't work anyway, as there's no way to specify that you
want the list of attachments. Note to self: don't post stupid ideas.
What would work is to have two "roots": /attachment for showing
individual attachments, and /attachments for showing lists of attachments.
Oh, I see Ted has already made this suggestion. Take this as a +1, then.
-- Remy
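
A rough sketch of the two-roots idea, in Python. The patterns, the tuple layout and the function name classify() are assumptions made for illustration; they are not taken from the attachment module.

```python
import re

# Hypothetical patterns: one root per meaning, so a trailing slash
# no longer has to carry any semantics.
ATTACHMENT = re.compile(r'/attachment/([^/]+)/(.+)/([^/]+)$')  # realm, parent, filename
ATTACHMENTS = re.compile(r'/attachments/([^/]+)/(.*[^/])$')    # realm, parent


def classify(path):
    """Return ('list', realm, parent, None), ('file', realm, parent, name) or None."""
    m = ATTACHMENTS.match(path)
    if m:
        realm, parent = m.groups()
        return ('list', realm, parent, None)
    m = ATTACHMENT.match(path)
    if m:
        realm, parent, filename = m.groups()
        return ('file', realm, parent, filename)
    return None


print(classify('/attachments/wiki/Projects/Trac'))
print(classify('/attachment/wiki/Projects/Trac'))
```

In this sketch the last path component under /attachment is always the file name, and everything under /attachments is the parent page, so both patterns can reject a trailing slash outright.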
Could you elaborate why the second redirection could be problematic?
IIRC, wiki pages with a / at the end have it stripped, so there should
not be any in the DB.
-- Remy
Given /attachment/wiki/Page/Foo/Bar, which do you have?
* wiki page Page with attachment Foo/Bar
* wiki page Page/Foo with attachment Bar
* list of attachments on wiki page Page/Foo/Bar
The first option is probably safe to make illegal since / isn't
generally used in file names, but the ambiguity of the last two is the
problem. The best option is probably to use a different character as
the separator between parent and attachment or move the list. Both
have bad URL semantics so I am all ears if someone has a better option.
--Noah
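
The ambiguity can be shown mechanically. The following Python sketch lists every reading of such a path; the function name interpretations() and the fixed "/attachment/wiki/" prefix are assumptions for the example, not existing Trac code.

```python
def interpretations(path, prefix='/attachment/wiki/'):
    """Return every (page, attachment) reading of the path.

    An attachment of None stands for "list of attachments on that page".
    """
    if not path.startswith(prefix):
        raise ValueError('unexpected path: %r' % path)
    parts = path[len(prefix):].split('/')
    readings = []
    # Try every position of the boundary between page name and file name.
    for i in range(1, len(parts)):
        readings.append(('/'.join(parts[:i]), '/'.join(parts[i:])))
    # The whole remainder can also be a page name.
    readings.append(('/'.join(parts), None))
    return readings


for page, attachment in interpretations('/attachment/wiki/Page/Foo/Bar'):
    print(page, attachment)
```

For the example path this yields exactly the three readings listed above, and nothing in the URL itself says which one is meant.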
Ted
I have attached a patch to http://trac.edgewall.org/ticket/4878 that
does exactly that: it redirects URLs *for which no handler has been
found* by stripping the / at the end. This ensures that handlers that do
manage trailing slashes differently (like attachments) continue working
as before.
Comments?
-- Remy
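
For readers without the patch at hand, the behaviour described above can be sketched roughly as follows. This is not the code attached to the ticket: the handler table, the return values and the name dispatch() are hypothetical, and stripping all trailing slashes at once (rather than exactly one) is an assumption.

```python
import re

# Hypothetical handler table of (name, pattern); not Trac's real dispatcher.
HANDLERS = [
    ('newticket', re.compile(r'/newticket$')),
    # This handler accepts a trailing slash itself, as the attachment module does.
    ('attachment', re.compile(r'/attachment/.+$')),
]


def dispatch(path):
    """Return ('handle', name), ('redirect', new_path) or ('error', path)."""
    for name, pattern in HANDLERS:
        if pattern.match(path):
            return ('handle', name)
    # Only reached when no handler matched: retry without the trailing slash.
    if path.endswith('/'):
        stripped = path.rstrip('/')  # assumption: removes all trailing slashes
        if stripped:
            return ('redirect', stripped)
    return ('error', path)


print(dispatch('/newticket/'))
print(dispatch('/attachment/wiki/Projects/Trac/'))
```

Because the redirect is only a fallback, "/attachment/wiki/Projects/Trac/" is still handled directly, while "/newticket/" is redirected to "/newticket".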