Space in parameter name or value must be encoded as %20. The spec doesn't say how it must be encoded in the URL itself which I guess leaves + as a valid option. But for the Signature Base String, only %20 can be used. If an actual + is in the name or value, it must be encoded as %2B.
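For illustration only, the rule above (space as %20, a literal + as %2B) can be sketched as a small Ruby encoder. The name `oauth_encode` is hypothetical, and the unreserved set (letters, digits, `-`, `.`, `_`, `~`) is the one OAuth takes from RFC 3986; treat this as a sketch under those assumptions rather than a reference implementation.

```ruby
# Hypothetical helper: percent-encode every byte that is not an unreserved
# character (A-Z, a-z, 0-9, "-", ".", "_", "~").
def oauth_encode(value)
  value.to_s.each_byte.map { |b|
    c = b.chr
    c.match?(/[A-Za-z0-9\-._~]/) ? c : format('%%%02X', b)
  }.join
end

puts oauth_encode('Big Daddy')  # the space is written as %20, never "+"
puts oauth_encode('Big+Daddy')  # a literal plus is written as %2B
```

Working byte by byte means multi-byte UTF-8 characters come out as one %XX triplet per byte.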
> -----Original Message-----
> From: oauth@googlegroups.com [mailto:oauth@googlegroups.com] On Behalf Of Rod Begbie
> Sent: Sunday, November 11, 2007 1:42 AM
> To: oauth@googlegroups.com
> Subject: [oauth] Space encoding
>
> I'm seeing a problem in both Twitter's and termie's endpoints, and
> want to bring it up.
>
> My JavaScript client is now 95% done, and can now get tokens and
> mostly authenticate against servers.
>
> However, I'm getting signature errors when I make calls containing
> spaces.
BTW, because most URL encoding functions are not as restrictive as OAuth, I use my own functions. It is such a simple process that you might want to consider just writing your own URL encode. Here are two in C++ and C# that I use. The nice thing about the C++ one is that it has a double-encoding feature, which lets me write parameters directly into the Signature Base String instead of using an intermediate string. It saves two copies: one for concatenating the parameters, and a second for URL-encoding the normalized string. I just append everything into a stringstream.
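public static string urlEncode(string value_)
{
    // _unreservedChars is assumed to be defined elsewhere as the OAuth
    // unreserved set: A-Z, a-z, 0-9, '-', '.', '_', '~'.
    string result = "";
    foreach (char symbol in value_)
    {
        if (_unreservedChars.IndexOf(symbol) != -1) { result += symbol; }
        // Correct for ASCII input only; other text must be converted to UTF-8 bytes first.
        else { result += '%' + String.Format("{0:X2}", (int)symbol); }
    }
    return result;
}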
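The C++ version with the double-encoding feature is not shown in this message, so the following is only a Ruby sketch of the idea as described: encoding each name and value twice and writing pre-encoded separators gives the same result as building the normalized string and then encoding it as a whole. All function names are hypothetical, and the parameter sorting is simplified.

```ruby
# Hypothetical strict encoder (unreserved: A-Z a-z 0-9 - . _ ~).
def oauth_encode(value)
  value.to_s.each_byte.map { |b|
    c = b.chr
    c.match?(/[A-Za-z0-9\-._~]/) ? c : format('%%%02X', b)
  }.join
end

# Two passes: build the normalized parameter string, then encode all of it.
def base_string_two_pass(method, url, params)
  normalized = params.sort.map { |k, v| "#{oauth_encode(k)}=#{oauth_encode(v)}" }.join('&')
  [method.upcase, oauth_encode(url), oauth_encode(normalized)].join('&')
end

# One pass: append double-encoded names and values directly, using
# "%3D" and "%26" as the already-encoded "=" and "&" separators.
def base_string_single_pass(method, url, params)
  out = [method.upcase, oauth_encode(url)].join('&') + '&'
  params.sort.each_with_index do |(k, v), i|
    out << '%26' unless i.zero?
    out << oauth_encode(oauth_encode(k)) << '%3D' << oauth_encode(oauth_encode(v))
  end
  out
end

params = { 'status' => 'hello world+1', 'a' => 'b' }
puts base_string_single_pass('post', 'http://example.com/update', params)
```

The two functions agree because a second encoding pass leaves unreserved characters alone and only turns each `%` into `%25`, which is exactly what encoding the whole normalized string does.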
Note also that OAuth (section 5.4.1) requires the restrictive encoding for parameters in an Authorization header, although it seems like a good idea for servers to accept any ordinary percent encoding.
That is fine as long as the Signature Base String uses only the spec's encoding. There is an important difference between reading the parameter values and building the Signature Base String. Looser parsing of encodings on the server is OK, as long as the values are re-encoded using the OAuth flavor of encoding when the SBS is built.
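A minimal Ruby sketch of that server-side approach, assuming form-style decoding (where `+` means space) is appropriate for the place the value came from: decode leniently, then re-encode strictly before the value goes into the Signature Base String. `oauth_encode` and `canonical_for_signature` are hypothetical names.

```ruby
require 'uri'

# Hypothetical strict encoder (unreserved: A-Z a-z 0-9 - . _ ~).
def oauth_encode(value)
  value.to_s.each_byte.map { |b|
    c = b.chr
    c.match?(/[A-Za-z0-9\-._~]/) ? c : format('%%%02X', b)
  }.join
end

# Accept "+" or "%20" (and upper- or lower-case hex) on the wire,
# then produce the single spelling used for signing.
def canonical_for_signature(raw)
  oauth_encode(URI.decode_www_form_component(raw))
end

%w[Big+Daddy Big%20Daddy Big%2BDaddy a%7eb].each do |raw|
  puts "#{raw} -> #{canonical_for_signature(raw)}"
end
```

Whether `+` should be read as a space for values taken from an Authorization header, rather than from a query string or form body, is a separate question that this sketch does not settle.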
> -----Original Message-----
> From: oauth@googlegroups.com [mailto:oauth@googlegroups.com] On Behalf Of John Kristian
> Sent: Monday, November 12, 2007 10:46 AM
> To: OAuth
> Subject: [oauth] Re: Space encoding
>
> Note also that OAuth (section 5.4.1) requires the restrictive encoding
> for parameters in an Authorization header. Although it seems like a
> good idea for servers to accept any ordinary percent encoding.
>
> -----Original Message-----
> From: oauth@googlegroups.com [mailto:oauth@googlegroups.com] On Behalf Of Eran Hammer-Lahav
> Sent: Sunday, November 11, 2007 5:40 AM
> To: oauth@googlegroups.com
> Subject: [oauth] Re: Space encoding
>
> Space in parameter name or value must be encoded as %20. The spec doesn't
> say how it must be encoded in the URL itself which I guess leaves + as a
> valid option. But for the Signature Base String, only %20 can be used. If
> an actual + is in the name or value, it must be encoded as %2B.
>
> EHL
On Nov 11, 2007 5:50 AM, Eran Hammer-Lahav <hammerla...@gmail.com> wrote:
> BTW, because most URL encoding functions are not as restrictive as OAuth,
> I use my own functions. It is such a simple process that you might want to
> consider just writing your own URL encode. Here are two in C++ and C# that
> I use. The nice thing about the C++ one is that it has the double encoded
> feature which allows me to write parameters directly into the Signature
> Base String instead of using an intermediate string. It saves two copies,
> one of the parameters concatenation, and second of the URL encoding of the
> normalized string. I just append everything into a stringstream.
On Nov 12, 2:48 pm, "Blaine Cook" <rom...@gmail.com> wrote:
> ... is there a common behaviour that most URL
> encoding functions exhibit?
In Java, java.net.URLEncoder.encode encodes the characters from space through tilde as
+%21%22%23%24%25%26%27%28%29*%2B%2C-.%2F0123456789%3A%3B%3C%3D%3E%3F%40ABCDEFGHIJKLMNOPQRSTUVWXYZ%5B%5C%5D%5E_%60abcdefghijklmnopqrstuvwxyz%7B%7C%7D%7E
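As a rough cross-check in Ruby, `URI.encode_www_form_component` is documented to follow the same form-style rules as the Java output above (space becomes `+`; only `*`, `-`, `.`, `_` and alphanumerics pass through). The sketch below compares it with a hypothetical strict encoder, `oauth_encode`, over the same space-through-tilde range and lists the characters on which the two disagree.

```ruby
require 'uri'

# Hypothetical strict encoder (unreserved: A-Z a-z 0-9 - . _ ~).
def oauth_encode(value)
  value.to_s.each_byte.map { |b|
    c = b.chr
    c.match?(/[A-Za-z0-9\-._~]/) ? c : format('%%%02X', b)
  }.join
end

# Every printable ASCII character, space through tilde.
printable = (0x20..0x7E).map(&:chr)

# Keep only the characters the two encoders spell differently.
differing = printable.reject { |c| URI.encode_www_form_component(c) == oauth_encode(c) }

differing.each do |c|
  puts "#{c.inspect}: form-style #{URI.encode_www_form_component(c)}, OAuth #{oauth_encode(c)}"
end
```

Only three characters should show up: the space, the asterisk, and the tilde.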
On Nov 12, 1:32 pm, "Gabe Wachob" <gabe.wac...@amsoft.net> wrote:
This "+" vs. "%20" discussion is a permathread in standards discussions w/r/t URIs that get placed in various contexts and need to be compared with other URIs or string fragments.
This may NOT be a bug in OAuth. It is the result of a bug in the state of the world of specs: because the *unencoding* of ' ' in URIs is handled differently by different processors, OAuth is simply trying to nail this ambiguity down. Whether this ambiguity is a problem for OAuth or not is something I discuss below (I'm leaning towards thinking it's *not* a problem).
The issue arises if you are handed a URI and asked to compare it (or some part of it) with a string which is NOT encoded (from a URI perspective). For example, if you have a name that is encoded in a query string - comparing that name to a name not embedded in a URI may be ambiguous:
If you say "Big Daddy" (common practice/CGI forms), then

* I have to use http://example.com?Big%2BDaddy to encode "Big+Daddy"
* Your out-of-the-box libraries will extract the query string as "Big Daddy" with no special processing
* By implication 'http://example.com?Big%20Daddy' and 'http://example.com?Big+Daddy' are logically equivalent because the query string un-encodes to the same thing and the two URIs are the same otherwise. This also means there are two perfectly equivalent ways of encoding 'Big Daddy'.

We have to confirm/reject that this is an issue for OAuth - this is the central issue. It really comes down to constructing the signature base string that gets signed (section 9.1.3) - because there can be no ambiguity about creating the string that gets signed. That being said, I'm not actually seeing an issue there right now because the component parts can be derived from the incoming HTTP request, outside the consumer & token secrets. *Someone else needs to review this as well though.*
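The equivalence claimed in the bullets above can be checked with a small Ruby sketch. It assumes the CGI-forms reading of `+`, which is what `URI.decode_www_form_component` implements; `query_text` is a hypothetical helper that takes everything after the first `?`.

```ruby
require 'uri'

# Hypothetical helper: return the decoded text after the first "?".
def query_text(uri_string)
  raw = uri_string.split('?', 2)[1].to_s
  URI.decode_www_form_component(raw)
end

[
  'http://example.com?Big%20Daddy',
  'http://example.com?Big+Daddy',
  'http://example.com?Big%2BDaddy'
].each do |uri|
  puts "#{uri} -> #{query_text(uri).inspect}"
end
```

Under that reading, the first two URIs carry the same text and only the third carries a literal plus sign.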
I hope this directs the analysis.
-Gabe
_____
From: oauth@googlegroups.com [mailto:oauth@googlegroups.com] On Behalf Of Blaine Cook
Sent: Monday, November 12, 2007 2:49 PM
To: oauth@googlegroups.com
Subject: [oauth] Re: Space encoding
I consider this a bug (in OAuth) - is there a common behaviour that most URL encoding functions exhibit?
Implementing a URL encode function should *NOT* be part of implementing OAuth. My implementation currently does:
CGI.escape(value.to_s).gsub("%7E", "~")
but if we stick with the currently more-restrictive encoding, I could do:
in Ruby. The question is, do we want to change the encoding to be whatever the "majority of programming languages do" - the OAuth spec is not the place to decide that URL encoding has been done wrong in all popular programming languages. They're simply not going to change for us.
b.
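The second snippet in the message above did not survive in the archive and is not reconstructed here. Purely as an illustration of the "patch up a library encoder" approach, here is a Ruby sketch that starts from `URI.encode_www_form_component` (swapped in for `CGI.escape`, which is an assumption on my part) and rewrites the three places where form-style output differs from the OAuth encoding. The function name is hypothetical.

```ruby
require 'uri'

# Hypothetical wrapper: form-style encoding, then three fix-ups.
def oauth_encode_via_library(value)
  URI.encode_www_form_component(value.to_s)
     .gsub('+', '%20')   # a "+" in the output can only have come from a space
     .gsub('*', '%2A')   # form-style encoding leaves "*" alone; OAuth encodes it
     .gsub('%7E', '~')   # form-style encoding escapes "~"; OAuth leaves it alone
end

puts oauth_encode_via_library('Big Daddy+1 *~')
```

The first substitution is safe because a literal `+` in the input has already been turned into `%2B` by the library call before the fix-ups run.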
URL encoding and normalization are a problem in general. There is no standard for doing either in a consistent way (i.e. that will always produce identical strings).
I agreed with this argument when we had a problem with base64 implementations. In that case, there was a valid point in not requiring everyone to implement base64 on their own rather than use the tools available. But in the base64 case it had no impact on the SBS, all the RFCs require that the server be able to read any line length (basically ignoring all non-base64 characters), and overall the RFC we used reflected general practice.
With URL encoding, as my code sample shows, it is super easy, and most importantly, consistent and clear. Your own example shows just how simple it is for you to "fix" Ruby's implementation.
EHL