URL Param Encoding and Decoding and the case of the ever-present equals sign

noahca...@gmail.com

Nov 17, 2013, 6:35:24 PM
to golan...@googlegroups.com
Hello,

I've recently been looking at the goamz library to manipulate resources on S3.  The library has good support for general CRUD on an S3 bucket.  However, S3 has since evolved to include things like ACLs, Versioning, Static Hosting, etc.  In order to manipulate these resources, one must access what AWS calls sub-resources.

These sub-resources take the form of /?resource, for example /?website.  Creating a URL instance to access this is quite feasible with the Go standard library.  However, goamz needs to parse a URL to generate an Authorization code based on AWS-specific canonicalization rules.  This signature has very specific requirements, and the client must reproduce exactly what the server computes.

In goamz today, you provide a URL in the form of /?website.  In the process of decoding the Values of the URL, a '=' character is appended to the key, as in '/?website='.  The extra = causes the Authorization code not to match what the server generates, so with goamz today this is not possible without working around the net/url package.  It seems like a waste not to improve the standard library to handle query params that are just keys with no value, and to include the = or not based on the presence of a value.  Having some time on the airplane, I was able to control the inclusion of the '=' based on whether a value is present.  Something along the lines of:

Values{"foo": {}, "bar": {}} will render to "foo&bar"
Values{"foo": {""}, "bar": {""}} will render to "foo=&bar="

Is there interest in updating the Encode() func to handle these types of URLs?  Parsing would also need to be updated so that a parsed query string re-encodes exactly as it was found.  I'm happy to share what I have, and I'm also happy if there is an existing way to do it that I may have overlooked.

Thank you,
-Noah

Gijs

Nov 18, 2013, 11:48:11 AM
to golan...@googlegroups.com, noahca...@gmail.com
I might be mistaken, but does it really matter? Both versions of the signature process (http://docs.aws.amazon.com/general/latest/gr/signing_aws_api_requests.html) create a canonicalized version of the query string in which every empty value has an '=' appended:

Separate parameter names from their values with the equal sign character (=) (ASCII character 61), even if the value is empty.

So maybe, if you're encountering errors, there's a bug in goamz's signing process?

- Gijs

Dave Cheney

Nov 18, 2013, 7:32:43 PM
to Gijs, golang-nuts, noahca...@gmail.com
Can you please show a sample request which is failing?

Noah Campbell

Nov 20, 2013, 12:06:23 AM
to Dave Cheney, Gijs, golang-nuts
I’m working on getting the failing example. In doing so I’ve gone back through and looked at the process for signing a request. It seems that the query params are handled in the signing routine within goamz: http://bazaar.launchpad.net/~goamz/goamz/trunk/view/head:/s3/sign.go#L84

I overlooked that goamz has the capability to set the params for the request. I think my original position still holds water. To highlight the issue at its most basic level, I put together this: http://play.golang.org/p/jMmSc6O0Yg

While they are semantically equivalent, the signing algorithm used by Amazon (and its clones) must rely on byte-for-byte equivalence.
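
The linked playground snippet is not reproduced here. As a rough sketch of the same kind of round trip, assuming nothing beyond net/url (`roundTrip` is a hypothetical helper name), parsing a bare key and encoding it again does not give back the original bytes:

```go
package main

import (
	"fmt"
	"net/url"
)

// roundTrip parses a raw query string with net/url and encodes it again.
func roundTrip(raw string) (string, error) {
	v, err := url.ParseQuery(raw)
	if err != nil {
		return "", err
	}
	return v.Encode(), nil
}

func main() {
	for _, raw := range []string{"website", "website="} {
		out, err := roundTrip(raw)
		if err != nil {
			fmt.Println("parse error:", err)
			continue
		}
		// "website" comes back as "website=", so it is not byte-for-byte stable.
		fmt.Printf("%q -> %q (unchanged: %t)\n", raw, out, raw == out)
	}
}
```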

Anyway, I think the ability for a querystring like ?website to be parsed and encoded into the same byte for byte representation would be useful within the standard library.

Let me know if you need a more concrete example. I also have the change made in my local golang dev branch if you’re interested.

Thank you,
-Noah

Dave Cheney

Nov 20, 2013, 12:25:09 AM
to Noah Campbell, Gijs, golang-nuts
I think there are two problems here.

The first is ParseQuery("website") IMO should fail. My reading of the
RFC says that http query strings are pairs of values concatenated with
a =. Put another way, `website=` is map[string]string { "website": ""
} , `website` is not valid.

The second is that `website` is being silently expanded to `website=`
inside goamz. I think this is of lesser importance if the first issue
is resolved.

Cheers

Dave

Noah Campbell

Nov 20, 2013, 4:57:02 PM
to Dave Cheney, Gijs, golang-nuts
Hi Dave,

I hear what you're saying about the format being odd.  However, even if I were able to influence AWS to make the change, it'd be years before the API is deprecated, given they still support the original API from when the service launched.

You suggested this format is a violation of the RFC, but I wasn't able to find an RFC stating that a query had to be key/value pairs exclusively.  RFC 3986 says that the query is everything in between the ? and the #.  But, you're probably more familiar with this than I am.  Regardless of the RFC, it doesn't address the reality of AWS not changing their interface any time soon.

I'm sure you also meant to point out Values is a map[string][]string, not map[string]string.  http://golang.org/pkg/net/url/#Values.  In this case there is room for a key in a query to exist without any values.  In my local version, it results in:  {"website": {}} for no value and {"website": {""}} for an empty string value.  This provides enough information to correctly Encode the query string the way it was found.
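
To make the distinction concrete, here is a small sketch of a parser that keeps it. This is not the local change described above and not how net/url behaves; `parseKeepBare` is a hypothetical name, and the only assumption is that a key without "=" should map to an empty slice while "key=" maps to a slice holding one empty string.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// parseKeepBare is a hypothetical parser, not the net/url implementation.
// "website" yields {"website": {}} and "website=" yields {"website": {""}}.
func parseKeepBare(query string) (url.Values, error) {
	v := url.Values{}
	for _, part := range strings.Split(query, "&") {
		if part == "" {
			continue
		}
		name, rawVal, hasEq := strings.Cut(part, "=")
		key, err := url.QueryUnescape(name)
		if err != nil {
			return nil, err
		}
		if !hasEq {
			// Bare key: record it without adding any value.
			if _, seen := v[key]; !seen {
				v[key] = []string{}
			}
			continue
		}
		val, err := url.QueryUnescape(rawVal)
		if err != nil {
			return nil, err
		}
		v[key] = append(v[key], val)
	}
	return v, nil
}

func main() {
	for _, raw := range []string{"website", "website="} {
		v, err := parseKeepBare(raw)
		if err != nil {
			fmt.Println("parse error:", err)
			continue
		}
		_, present := v["website"]
		fmt.Printf("%q: present=%t values=%d\n", raw, present, len(v["website"]))
	}
}
```

Note that Values.Get returns "" in both cases, so code that only reads values would not see a difference; only the encoder needs to look at the slice length.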

I would like to see Go be robust enough to handle these funny edge cases so we don't see folks working around them.  The change I have doesn't break existing functionality according to the unit tests in net/url.

Thoughts?

-Noah

Dave Cheney

Nov 20, 2013, 5:01:59 PM
to Noah Campbell, Gijs, golang-nuts
Hi Noah,

Can you please tell me the full URL you are trying to sign with goamz?

Cheers

Dave

Noah Campbell

Nov 20, 2013, 5:34:09 PM
to Dave Cheney, Gijs, golang-nuts
Here’s the path that needs to be signed (whitespace included)

"PUT


Wed, 20 Nov 2013 04:46:50 UTC
/my-s3-bucket/?website"

In goamz I created a PutConfiguration func that takes “/?website” as the input.

The resulting url includes /my-s3-bucket/?website= which causes Amazon to reject the request because the signatures don’t match.
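
A minimal sketch of why the extra "=" matters, assuming the original S3 signing scheme of base64(HMAC-SHA1(secret, string-to-sign)). The secret below is a placeholder and `sign` is a hypothetical helper, not goamz code. The string to sign is laid out as shown above: verb, two empty lines (Content-MD5 and Content-Type), date, then the resource.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha1"
	"encoding/base64"
	"fmt"
)

// sign returns base64(HMAC-SHA1(secret, stringToSign)).
func sign(secret, stringToSign string) string {
	mac := hmac.New(sha1.New, []byte(secret))
	mac.Write([]byte(stringToSign))
	return base64.StdEncoding.EncodeToString(mac.Sum(nil))
}

func main() {
	const secret = "EXAMPLE-SECRET-KEY" // placeholder, not a real credential

	// Verb, empty Content-MD5, empty Content-Type, date; the resource follows.
	prefix := "PUT\n\n\nWed, 20 Nov 2013 04:46:50 UTC\n"

	bare := sign(secret, prefix+"/my-s3-bucket/?website")
	withEq := sign(secret, prefix+"/my-s3-bucket/?website=")

	fmt.Println("?website  ->", bare)
	fmt.Println("?website= ->", withEq)
	fmt.Println("match:", bare == withEq) // one extra byte changes the whole signature
}
```

If the client signs one form and the server canonicalizes to the other, the two sides compute different signatures and the request is rejected.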

-Noah

Dave Cheney

Nov 20, 2013, 5:35:52 PM
to Noah Campbell, Gijs, golang-nuts
Hi Noah,

Who is generating that URL? If it is required by amazon, can you show me the documentation for that. 

Cheers

Dave

Noah Campbell

Nov 20, 2013, 5:41:59 PM
to Dave Cheney, Gijs, golang-nuts

Dave Cheney

Nov 20, 2013, 6:38:54 PM
to Noah Campbell, Gijs, golang-nuts
Thanks Noah,

I see where that query param is coming from,

"If the request addresses a subresource, such as ?versioning,
?location, ?acl, ?torrent, ?lifecycle, or ?versionid, append the
subresource, its value if it has one, and the question mark. Note that
in case of multiple subresources, subresources must be
lexicographically sorted by subresource name and separated by '&',
e.g., ?acl&versionId=value."

I think there are a few things going on here.

1. according to RFC 3986 the data between the ? and # is the query
section of the request, however 3986 does not define the syntax of the
query section.

2. RFC 1738 does define the form of the HTTP URL (it's been too long
and I can't remember the exact differences between URLs and URIs)
query parameter, but calls it search (I guess that is another way of
saying query). But it doesn't call it a query parameter.

3. RFC 2616 doesn't give much detail here either.

4. This area is under tested in the standard library, I'm mulling over
raising an issue about this.

Let me raise an issue on goamz (you can do this also, but it means
having a launchpad account) and consult with the goamz author. Thanks
for your patience.

Cheers

Dave

Noah Campbell

Dec 2, 2013, 4:46:51 PM
to Dave Cheney, Gijs, golang-nuts
Hi Dave,

Now that Thanksgiving is out of the way, I can give some attention to this.  I didn't see an issue raised on goamz, at least in a quick pass of the bug list.  Did you speak with the maintainers on a back channel?

goamz can be made to work, but this seems like an issue that is not specific to goamz.  As you pointed out, what goes into a query is pretty opaque.  I agree the key/value combo is definitely the norm, but nothing in the spec directly calls out what Amazon is doing as incorrect.

My intent is to make sure the query parameters are stable, so that if you include an Amazon "subresource" in a URL, it is honored when Encoded.  It is not difficult to include this behavior and I think it is more in line with what an uninitiated user would expect.  It would be less surprising, in my opinion.

It is also right after the 1.2 release, so it can be put in place early in the dev cycle (assuming it makes sense to add this).

Thanks,
-Noah


Kevin Gillette

Dec 2, 2013, 5:27:58 PM
to golan...@googlegroups.com, Noah Campbell, Gijs
Whether or not it's defined in the RFC, valueless query parameters (including omission of the equal-sign, such as "?x&y&z") are both common and typically well supported in the realm of server-side software.

Noah Campbell

Dec 2, 2013, 5:30:13 PM
to Kevin Gillette, golang-nuts, Gijs
I would imagine they would be, but I didn't want to project my opinion.  Can you give an example server side system that uses this convention?  Also, do you know if they would choke if "?x=&y=&z=" were presented?

-Noah

Gareth

Dec 2, 2013, 5:52:58 PM
to golan...@googlegroups.com, noahca...@gmail.com
fwiw I recently branched the goamz library to add support for the s3 multi-delete operation, which expects request URLs to end in /?delete

I had no issues sending and signing URLs ending in /?delete= instead; I just had to make sure that goamz knew to include the delete parameter when computing the signature.

Here's my current merge proposal for the change I wrote/used:

Gareth.

Kevin Gillette

Dec 2, 2013, 5:57:57 PM
to golan...@googlegroups.com, Kevin Gillette, Gijs, noahca...@gmail.com
On Monday, December 2, 2013 3:30:13 PM UTC-7, Noah Campbell wrote:
> I would imagine they would be, but I didn't want to project my opinion.  Can you give an example server side system that uses this convention?  Also, do you know if they would choke if "?x=&y=&z=" were presented?

I haven't encountered any server side systems that would choke on either "?x&y&z" or "?x=&y=&z=". Simply put, they'd be rather non-robust if they didn't handle both forms gracefully. Some systems are unable to distinguish between the two forms, while others, for example, would consider all of the former to be nil valued and all of the latter to be empty-string valued. Go treats both forms as being identical.
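
A quick way to check the last claim, as a sketch that uses only net/url and reflect (`sameParse` is a hypothetical helper name, and this is not the linked playground code):

```go
package main

import (
	"fmt"
	"net/url"
	"reflect"
)

// sameParse reports whether two raw query strings parse to identical url.Values.
func sameParse(a, b string) (bool, error) {
	va, err := url.ParseQuery(a)
	if err != nil {
		return false, err
	}
	vb, err := url.ParseQuery(b)
	if err != nil {
		return false, err
	}
	return reflect.DeepEqual(va, vb), nil
}

func main() {
	same, err := sameParse("x&y&z", "x=&y=&z=")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	// Both forms parse to one empty string per key.
	fmt.Println("identical after parsing:", same)
}
```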

See http://play.golang.org/p/RGHKn4sBfP for a demo of this (though it won't run in the playground)

Kevin Gillette

Dec 2, 2013, 6:03:34 PM
to golan...@googlegroups.com, Kevin Gillette, Gijs, noahca...@gmail.com
Here's an equivalent that does run in the playground: http://play.golang.org/p/A8Q2N_OAuC