My question is whether it was a design decision to leave URLs
un-decoded, and, if so, what the rationale is. I'm not necessarily
disagreeing with such a decision. :)
Jim
--
Jim Fulton
If you do this, could you expose the decoded values under separate names,
such as unicode_path_info, unicode_script_name, etc.? We've already
solidified a lot of code around the fact that the existing values are not
decoded.
- C
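
A minimal sketch of the suggestion above, not an existing API: the raw values stay untouched and decoded copies are added under separate keys. The key names unicode_path_info and unicode_script_name come from the message; the helper name add_unicode_keys, the charset parameter, the UTF-8 default, and the assumption that the raw values are byte strings are all hypothetical.

```python
def add_unicode_keys(environ, charset="utf-8"):
    """Return a copy of environ with decoded copies of the path keys.

    Hypothetical helper: assumes PATH_INFO and SCRIPT_NAME hold raw bytes.
    """
    result = dict(environ)
    for key in ("PATH_INFO", "SCRIPT_NAME"):
        raw = environ.get(key, b"")
        # Leave the raw value alone so existing code keeps working;
        # undecodable bytes become U+FFFD instead of raising.
        result["unicode_" + key.lower()] = raw.decode(charset, "replace")
    return result


environ = {"PATH_INFO": "/bête".encode("utf-8"), "SCRIPT_NAME": b""}
decoded = add_unicode_keys(environ)
print(decoded["PATH_INFO"], decoded["unicode_path_info"])
```

Code that already relies on the undecoded values keeps reading PATH_INFO, while new code can opt in to the decoded keys.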
That would be inconsistent with RFC 3986, which specifies UTF-8.
Jim
--
Jim Fulton
I guess it really depends on What The World Actually Does, and I'm not
sure in this case. For instance, I'm pretty sure QUERY_STRING is
encoded with the page encoding, so presumably a URL could look like
/UTF8-urlencoded-data?latin1-urlencoded-data -- which of course may
actually be the case (after all, the browser doesn't generate the
path). Also, what happens when you have <a href="/bête"> or something
in a page? The browser encodes unsafe characters in these cases.
So... I'm hoping someone who has experience with the more challenging
situations with encodings could say what happens.
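
To make the mixed-encoding case concrete, here is a small sketch using Python's standard urllib.parse. It shows that "/bête" percent-encodes differently under UTF-8 and Latin-1, and that a URL whose path and query use different charsets has to be decoded piece by piece. The helper decode_url and its default charsets are assumptions for illustration only; it says nothing about what browsers actually send.

```python
from urllib.parse import quote, unquote, urlsplit

PATH = "/bête"

# The same path percent-encodes differently depending on the charset.
utf8_form = quote(PATH, encoding="utf-8")
latin1_form = quote(PATH, encoding="latin-1")


def decode_url(url, path_charset="utf-8", query_charset="latin-1"):
    """Split a URL and decode its path and query with separate charsets."""
    parts = urlsplit(url)
    path = unquote(parts.path, encoding=path_charset, errors="replace")
    query = unquote(parts.query, encoding=query_charset, errors="replace")
    return path, query


# A URL with a UTF-8 path and a Latin-1 query string.
mixed = utf8_form + "?q=" + quote("bête", encoding="latin-1")

print(utf8_form, latin1_form)
print(decode_url(mixed))
# Decoding the Latin-1 form as UTF-8 does not round-trip.
print(unquote(latin1_form, encoding="utf-8", errors="replace") == PATH)
```

The point of the sketch is that nothing in the encoded URL itself says which charset produced it, so the decoder has to know or guess.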