url.Parse incorrectly unescaping %2F


Jens Alfke

Oct 24, 2013, 5:37:47 PM
to golang-nuts
I’m having trouble with url.Parse changing escaped “/” characters (“%2F”) back into naked slashes. This alters the meaning of the URL, causing my HTTP handlers to break.

Here’s a playground example: http://play.golang.org/p/3q4yN7QsGf
The tl;dr version: Calling
url.Parse("http://example.com/AC%2FDC/Back%20In%20Black")
results in the URL
http://example.com/AC/DC/Back%20In%20Black
where the %2F has incorrectly been unescaped into a “/”.

This will cause trouble when an HTTP handler breaks the path apart. Imagine a REST server that returns info about a record album given a GET to the path “/artist/albumname”. It works great until you look up AC/DC: you encode the artist as “AC%2FDC” and plug that into a URL, but the Go side unescapes the %2F, and the path components end up as [“AC”, “DC”, “Back In Black”], which of course doesn’t work.

I can’t think of a workaround for this, because the URL is broken way before it ever gets to my handler. Is there some other way to tweeze the original URL string out of the http.Request?

—Jens
Message has been deleted

Ian Davis

Oct 25, 2013, 5:15:08 AM
to golan...@googlegroups.com
Go is doing the right thing here. %2F is not escaping the / character,
it is encoding it. a%2Fb and a/b are identical URIs.

The definition of the path component in a URI is:

A path consists of a sequence of path segments separated by a slash
("/") character




>
> —Jens
>
> --
> You received this message because you are subscribed to the Google Groups
> "golang-nuts" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to golang-nuts...@googlegroups.com.
> For more options, visit https://groups.google.com/groups/opt_out.
Message has been deleted

Ian Davis

Oct 25, 2013, 10:25:40 AM
to golan...@googlegroups.com
 
On Fri, Oct 25, 2013, at 03:18 PM, Islan Dberry wrote:
> > Go is doing the right thing here. %2F is not escaping the / character,
> > it is encoding it. a%2Fb and a/b are identical URIs.
>
> The paths are not identical. The RFC (http://tools.ietf.org/html/rfc3986#section-2.4) states that the components and sub-components of a URI are separated before decoding.
>
> The path "a%2fb" is separated to the sub-component ["a%2fb"] and decoded to ["a/b"].
>
> The path "a/b" is separated to the sub-components ["a", "b"] and decoded to ["a", "b"].
>
> The godoc for net/url URL type describes how to work around this issue with the URL parser.

OK, my mistake
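
The separate-then-decode order quoted from the RFC can be sketched in Go as follows. This is an illustration under assumptions, not the workaround the godoc described at the time: it relies on url.PathUnescape (Go 1.8+) and URL.EscapedPath (Go 1.5+), and splitSegments is a hypothetical name.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// splitSegments (hypothetical helper) splits an escaped path on literal
// "/" first and only then percent-decodes each segment, so an encoded
// %2F ends up inside a segment instead of splitting it.
func splitSegments(escaped string) ([]string, error) {
	parts := strings.Split(strings.Trim(escaped, "/"), "/")
	for i, p := range parts {
		s, err := url.PathUnescape(p)
		if err != nil {
			return nil, err
		}
		parts[i] = s
	}
	return parts, nil
}

func main() {
	u, err := url.Parse("http://example.com/AC%2FDC/Back%20In%20Black")
	if err != nil {
		panic(err)
	}
	segs, err := splitSegments(u.EscapedPath())
	if err != nil {
		panic(err)
	}
	fmt.Printf("%q\n", segs) // ["AC/DC" "Back In Black"]
}
```

Note that this sketch returns a single empty segment for an empty path; a real handler would probably want to treat that case separately.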

 
 

Francesc Campoy Flores

Oct 25, 2013, 10:51:22 AM
to Jens Alfke, golang-nuts
I'm not completely sure if the behavior you expect is the correct one, or if you should file a bug.

But you can implement what you expect very easily with the current url package.




--
Francesc

Toni Cárdenas

Oct 25, 2013, 11:00:36 AM
to golan...@googlegroups.com
Message has been deleted

Toni Cárdenas

Oct 25, 2013, 11:55:26 AM
to golan...@googlegroups.com
That happens already. I suppose mux does that somewhere.

On Friday, October 25, 2013 5:35:37 PM UTC+2, Islan Dberry wrote:


On Friday, October 25, 2013 8:00:36 AM UTC-7, Toni Cárdenas wrote:

This is nice. A possible improvement is to have the mux decode the path variables. Using the OP's example URL and the pattern "/{artist}/{title}", the mux will set the path variables to artist="AC/DC" and title="Back In Black".  
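
As a rough illustration of the idea of decoding path variables after matching, here is a minimal sketch. It is not the API of any real mux package: matchVars is a hypothetical helper, and it assumes Go 1.8 or later for url.PathUnescape.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// matchVars is a hypothetical helper, not part of any real router.
// It matches an escaped path against a pattern such as
// "/{artist}/{title}" segment by segment and returns the decoded
// value of each {name} placeholder.
func matchVars(pattern, escapedPath string) (map[string]string, bool) {
	pp := strings.Split(strings.Trim(pattern, "/"), "/")
	sp := strings.Split(strings.Trim(escapedPath, "/"), "/")
	if len(pp) != len(sp) {
		return nil, false
	}
	vars := make(map[string]string)
	for i, p := range pp {
		isVar := len(p) >= 2 && strings.HasPrefix(p, "{") && strings.HasSuffix(p, "}")
		if isVar {
			// Decode only after the path has been split on "/".
			v, err := url.PathUnescape(sp[i])
			if err != nil {
				return nil, false
			}
			vars[p[1:len(p)-1]] = v
		} else if p != sp[i] {
			// Literal segments are compared in their escaped form.
			return nil, false
		}
	}
	return vars, true
}

func main() {
	vars, ok := matchVars("/{artist}/{title}", "/AC%2FDC/Back%20In%20Black")
	fmt.Println(ok, vars["artist"], vars["title"]) // true AC/DC Back In Black
}
```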