Nested Parameters Spec


Josh Peek

Jan 26, 2009, 4:06:09 PM
to Rack Development
Now that we are set on adding nested params to core, let's discuss the
API details before we start the implementation battle.

English API:

* Unless the parameters make use of "[]" in the key, it should
function identically to the current parser
* For now, all keys will be strings. Save indifferent string/symbol
access for another day.
* Empty brackets signal the start of an array of keys.
"foo[]=bar&foo[]=baz" => {"foo" => ["bar", "baz"]}
* A key within brackets signals the start of a hash.
"foo[bar]=1&foo[baz]=2" => {"foo" => {"bar" => "1", "baz" => "2"}}
* Deep levels of nesting should follow the same rules
* Rack::Multipart.parse should build on top of Utils.parse_query so it
can take advantage of nested params as well
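
The rules above can be sketched in a few lines of Ruby. This is only an illustration under assumptions, not the planned Rack implementation: `NestedQuerySketch`, `parse`, and `insert` are hypothetical names, repeated plain keys simply overwrite each other here, and using one key as both an array and a hash raises a TypeError.

```ruby
require "uri"

# Hypothetical sketch of the rules listed above; not Rack's parser.
module NestedQuerySketch
  module_function

  # Parse a query string into nested hashes and arrays with string keys.
  def parse(query)
    query.split("&").each_with_object({}) do |pair, params|
      next if pair.empty?
      key, value = pair.split("=", 2).map { |part| URI.decode_www_form_component(part) }
      # "x[y][]" is split into the tokens ["x", "y", "[]"].
      tokens = key.scan(/[^\[\]]+|\[\]/)
      insert(params, tokens, value || "") unless tokens.empty?
    end
  end

  # Store value in node under the path described by tokens.
  def insert(node, tokens, value)
    token, *rest = tokens
    if rest.empty?
      node[token] = value # plain key: the last value wins in this sketch
    elsif rest.first == "[]"
      list = (node[token] ||= [])
      raise TypeError, "expected Array at #{token.inspect}" unless list.is_a?(Array)
      after = rest.drop(1)
      if after.empty?
        list << value
      elsif list.last.is_a?(Hash) && !list.last.key?(after.first)
        # Keep filling the current hash until one of its keys repeats.
        insert(list.last, after, value)
      else
        list << insert({}, after, value)
      end
    else
      hash = (node[token] ||= {})
      raise TypeError, "expected Hash at #{token.inspect}" unless hash.is_a?(Hash)
      insert(hash, rest, value)
    end
    node
  end
end
```

With this sketch, `NestedQuerySketch.parse("foo[bar]=1&foo[baz]=2")` returns `{"foo" => {"bar" => "1", "baz" => "2"}}`.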

Some actual code:
(Most of these were ported from Rails unit tests)

Rack::Utils.parse_query("x[y][z]=10").
should.equal "x" => {"y" => {"z" => "10"}}
Rack::Utils.parse_query("x[y][z][]=10").
should.equal "x" => {"y" => {"z" => ["10"]}}
Rack::Utils.parse_query("x[y][z][]=10&x[y][z][]=5").
should.equal "x" => {"y" => {"z" => ["10", "5"]}}
Rack::Utils.parse_query("x[y][][z]=10").
should.equal "x" => {"y" => [{"z" => "10"}]}
Rack::Utils.parse_query("x[y][][z]=10&x[y][][w]=10").
should.equal "x" => {"y" => [{"z" => "10", "w" => "10"}]}
Rack::Utils.parse_query("x[y][][v][w]=10").
should.equal "x" => {"y" => [{"v" => {"w" => "10"}}]}
Rack::Utils.parse_query("x[y][][z]=10&x[y][][v][w]=10").
should.equal "x" => {"y" => [{"z" => "10", "v" => {"w" => "10"}}]}
Rack::Utils.parse_query("x[y][][z]=10&x[y][][z]=20").
should.equal "x" => {"y" => [{"z" => "10"}, {"z" => "20"}]}
Rack::Utils.parse_query("x[y][][z]=10&x[y][][w]=a&x[y][][z]=20&x[y][][w]=b").
should.equal "x" => {"y" => [{"z" => "10", "w" => "a"}, {"z" => "20", "w" => "b"}]}
Rack::Utils.parse_query("foo=bar&baz=").
should.equal "foo" => "bar", "baz" => ""
Rack::Utils.parse_query("foo=bar&baz[]=1&baz[]=2&baz[]=3").
should.equal "foo" => "bar", "baz" => ["1", "2", "3"]
Rack::Utils.parse_query("foo[]=bar&baz[]=1&baz[]=2&baz[]=3").
should.equal "foo" => ["bar"], "baz" => ["1", "2", "3"]

Questions, comments, objections, additional specs you would like to
have supported?

Christian Neukirchen

Jan 26, 2009, 4:33:57 PM
to rack-...@googlegroups.com
Josh Peek <jo...@joshpeek.com> writes:

> * Unless the parameters make use of "[]" in the key, it should
> function identical the current parser
> * For now, all keys will be strings. Save indifferent string/symbol
> access for another day.
> * Empty brackets signals the start of an array of keys.
> "foo[]=bar&foo[]=baz" => {"foo" => ["bar", "baz"]}

#parse_query already does this (in conformance with CGI semantics):
>> Rack::Utils.parse_query("foo=1&foo=2")
=> {"foo"=>["1", "2"]}

(CGI.parse always returns Arrays, btw.)

> * A key within brackets signals the start of a hash.
> "foo[bar]=1&foo[baz]=2" => {"foo" => {"bar" => "1", "baz" => "2"}}
> * Deep levels of nesting should follow the same rules

What is foo&foo=1&foo[]=2&foo[bar]=3?

> * Rack::Multipart.parse should build on top of Utils.parse_query so it
> can take advantage of nested params as well

--
Christian Neukirchen <chneuk...@gmail.com> http://chneukirchen.org

Joshua Peek

Jan 26, 2009, 5:01:09 PM
to rack-...@googlegroups.com
On Mon, Jan 26, 2009 at 3:33 PM, Christian Neukirchen
<chneuk...@gmail.com> wrote:
> #parse_query already does this (in conformance with CGI semantics):
>>> Rack::Utils.parse_query("foo=1&foo=2")
> => {"foo"=>["1", "2"]}

Interesting. So we could probably strip out the empty brackets, so
"foo=1&foo=2" and "foo[]=1&foo[]=2" would have the same output.

> What is foo&foo=1&foo[]=2&foo[bar]=3?

For foo&foo=1&foo[]=2&foo, I would guess { "foo" => ["", "1", "2"] }.

Not sure what to do about cases that have an array and hash: foo[]=1&foo[bar]=2

--
Joshua Peek

Joshua Peek

Jan 26, 2009, 5:04:34 PM
to rack-...@googlegroups.com
On Mon, Jan 26, 2009 at 3:33 PM, Christian Neukirchen
<chneuk...@gmail.com> wrote:
> What is foo&foo=1&foo[]=2&foo[bar]=3?

Tested on Rails:

TypeError: Conflicting types for parameter containers. Expected an
instance of Array but found an instance of Hash. This can be caused by
colliding Array and Hash parameters like qs[]=value&qs[key]=value.
(The parameters received were {"bar"=>"3"}.)

--
Joshua Peek

Matt Todd

Jan 26, 2009, 5:30:26 PM
to rack-...@googlegroups.com
Technically it's a 400 Bad Request, right?

I'm not in favor of the CGI.parse results style... Also, I prefer to only treat keys with [] in them as arrays and [*] as hashes. Otherwise, they should overwrite each other (the latter taking precedence).

Is there a way we can see what a browser prefers to deal with? Usually it just goes with whatever name the developer set on the form field...

Matt


--
Matt Todd
Highgroove Studios
www.highgroove.com
cell: 404-314-2612
blog: maraby.org

Scout - Web Monitoring and Reporting Software
www.scoutapp.com

Scytrin dai Kinthra

Jan 26, 2009, 10:30:23 PM
to rack-...@googlegroups.com
I'd say a strict raising of a parse or argument error. There is no way
to predict the order of query keys for preference on a data type at a
given key. The decision to parse the query pairs in this fashion is
purely implementation (Rack, now, in this case) specific. In the
simple parsing method this does not cause any fault.
If an application or middleware chooses to recover and return a
400 response, it can do so by way of a rescue. Possibly a fallback to
#simple_parse could be made, but this feels as if it violates the
pols.
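
A minimal sketch of that recovery path, assuming a strict parser that raises on conflicting key types. `ParameterTypeError`, `STRICT_PARSER`, `BadQueryGuard`, and the `"sketch.params"` env key are hypothetical names for illustration; none of them exist in Rack, and the parser is only a stand-in that checks top-level keys.

```ruby
# Hypothetical error class; Rack does not define it.
class ParameterTypeError < TypeError; end

# Stand-in for a strict parser: it only records whether each top-level key is
# used as a scalar ("foo"), an array ("foo[]"), or a hash ("foo[bar]"), and
# raises as soon as one key is used in two different ways.
STRICT_PARSER = lambda do |query|
  query.split("&").each_with_object({}) do |pair, kinds|
    key = pair.split("=", 2).first.to_s
    match = key.match(/\A([^\[\]]+)(\[\]|\[[^\[\]]+\])?/)
    next unless match
    name = match[1]
    kind = case match[2]
           when nil then :scalar
           when "[]" then :array
           else :hash
           end
    if kinds.key?(name) && kinds[name] != kind
      raise ParameterTypeError, "#{name} used as both #{kinds[name]} and #{kind}"
    end
    kinds[name] = kind
  end
end

# Middleware-style wrapper: parse strictly, answer 400 when parsing fails.
class BadQueryGuard
  def initialize(app, parser)
    @app = app
    @parser = parser
  end

  def call(env)
    begin
      env["sketch.params"] = @parser.call(env["QUERY_STRING"].to_s)
    rescue ParameterTypeError => e
      return [400, { "Content-Type" => "text/plain" }, ["Bad Request: #{e.message}\n"]]
    end
    @app.call(env)
  end
end
```

The parser itself stays strict; whether a conflict becomes a 400 is decided by whoever wraps it.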

I'm inclined to treat any key with /(\[\w*\])+$/ as a hash, as arrays
can already be generated by simply repeating the key in the query
string. The only catch for me would be whether to store the value at ''
or nil.
All the /\[\]$/ suffix does, as specified at this point, is generate
an explicit array. Which, reminding me of the php manner of arrays and
hashes being associative lists, does not seem to follow the pols as
well. Well, seems pols now with the general behavior being common
place.

I'm actually in favor of Rack::Utils.parse_query returning values as
arrays, where '&foo=bar' would result in {'foo'=>['bar']} akin to
CGI.parse.

--
stadik.net

Michael Fellinger

Jan 26, 2009, 10:39:15 PM
to rack-...@googlegroups.com
On Tue, Jan 27, 2009 at 12:30 PM, Scytrin dai Kinthra <scy...@gmail.com> wrote:
>
> I'd say a strict raising of a parse or argument error. There is no way
> to predict the order of query keys for preference on a data type at a
> given key. The decision to parse the query pairs in this fashion is
> purely implementation (Rack, now, in this case) specific. In the
> simple parsing method this does not cause any fault.
> If a application or middleware should choose to recover and throw a
> 400 response it can by way of a rescue. Possibly a fall back to
> #simple_parse could be made, but this feels as if it violates the
> pols.
>
> I'm inclined to treat any key with /(\[\w*\])+$/ as a hash, as arrays
> can already be generated by simply repeating the key in the query
> string. The only catch for me would whether to store the value at ''
> or nil.

It might be very beneficial if we could expand this to include at
least Unicode keys/values, if not any other encoding, since 1.8 has such
inferior support.

Scytrin dai Kinthra

Jan 26, 2009, 10:56:44 PM
to rack-...@googlegroups.com
I'm all for unicode support, as long as uri decoding and encoding
treat it right.

--
stadik.net

James Tucker

Jan 27, 2009, 4:39:24 AM
to rack-...@googlegroups.com

Could we spec specific failure conditions for bad data?

Joshua Peek

Jan 27, 2009, 3:55:20 PM
to rack-...@googlegroups.com
On Mon, Jan 26, 2009 at 3:33 PM, Christian Neukirchen
<chneuk...@gmail.com> wrote:
> What is foo&foo=1&foo[]=2&foo[bar]=3?

Learned something new. Using [] explicitly forces an array.

"foo=1" => {"foo" => "1"}
"foo=1&foo=2" => {"foo" => ["1", "2"]}

"foo[]=1" => {"foo" => ["1"]}
"foo[]=1&foo[]=2" => {"foo" => ["1", "2"]}

This is really useful for checkboxes when you always want an array of
the selected options, whether the user selects zero, one, or more.
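
For comparison, here is a sketch of the normalization an application would otherwise have to do when the field name has no "[]" suffix and the value may arrive as nil, a string, or an array. The helper name `selected` is hypothetical.

```ruby
# Hypothetical helper: always hand back an array of the selected options,
# whatever shape the parser produced for that key.
def selected(params, name)
  # Array(nil) is [], Array("1") is ["1"], and an existing array is kept.
  Array(params[name])
end

selected({}, "foo")                      # nothing checked
selected({ "foo" => "1" }, "foo")        # "foo=1"
selected({ "foo" => ["1", "2"] }, "foo") # "foo=1&foo=2" or "foo[]=1&foo[]=2"
```

With "foo[]" in the form, the parser already returns an array for a single checked box, so this step is only needed for the zero-selection case.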


--
Joshua Peek

Matt Todd

Jan 27, 2009, 4:11:08 PM
to rack-...@googlegroups.com
This is as I would expect.

I have a little distaste for the "foo=1&foo=2" resulting in "foo" => ["1", "2"]. It seems a little counterintuitive to me.

Matt


Joshua Peek

Jan 27, 2009, 4:15:44 PM
to rack-...@googlegroups.com
Updated spec.

It's a gist now, so people should be able to fork it.

http://gist.github.com/53541

Scytrin dai Kinthra

Jan 27, 2009, 7:46:20 PM
to rack-...@googlegroups.com
you'd prefer clobbering to occur?

--
stadik.net

Matt Todd

Jan 27, 2009, 10:43:25 PM
to rack-...@googlegroups.com
> you'd prefer clobbering to occur?

Indeed, it would seem so.

Unfortunately, I can't easily justify *either* behavior...

I think I like the clobbering idea because, since we're rendering the query parameters as a hash, clobbering occurs naturally if you try to assign two different values sequentially to the same key.

Unfortunately, the sequential nature is the difficult part since the query string is sent as a whole, not in part or sequentially. So it's at this point where we make our interpretation.
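
The two behaviors under discussion can be put side by side. This is a sketch, not what Rack does: `clobber` and `accumulate` are hypothetical helper names, and the pairs are taken in the order they appear in the query string.

```ruby
require "uri"

# Clobbering: a later value replaces an earlier one for the same key.
def clobber(pairs)
  pairs.each_with_object({}) { |(key, value), params| params[key] = value }
end

# Accumulating: repeated keys are folded into an array, single keys stay scalar.
def accumulate(pairs)
  pairs.each_with_object({}) do |(key, value), params|
    if params.key?(key)
      params[key] = Array(params[key]) << value
    else
      params[key] = value
    end
  end
end

pairs = URI.decode_www_form("foo=1&foo=2") # [["foo", "1"], ["foo", "2"]]
clobber(pairs)    # => {"foo" => "2"}
accumulate(pairs) # => {"foo" => ["1", "2"]}
```

Both depend on reading the pairs in order; clobbering just makes that dependence decide which value survives.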

Sorry if I'm overcomplicating things... I'm trying to be able to properly justify my notions.

Matt

Scytrin dai Kinthra

Jan 28, 2009, 12:33:26 AM
to rack-...@googlegroups.com
Which is one of the reasons I'm more in favor of using the [] suffix
as a nil or empty string key, and [*] as defined in the spec.
And creation by default of 0..* values under a given key as an array,
rather than a single value.
Simplification and the reduction of clobbering through mixed keys. It
seems a total application of pols in context of unordered keys.

--
stadik.net

Matt Todd

Jan 28, 2009, 10:11:24 AM
to rack-...@googlegroups.com
Is this a pattern you've seen being used?

Matt

Scytrin dai Kinthra

Jan 28, 2009, 12:23:27 PM
to rack-...@googlegroups.com
In most of the previous, non-framework query string parsing
implementations I've used, all values were returned as arrays.
It wasn't until Rack that I had to deal with single value or array
type situations. And in that context, if we were setting up an
additional suffix to allow the generation of a hash within the hash of
parameters, we'd only be adding another conditional case of Array vs
Hash on top of String vs Array. And since there's no real nice way to
transform an array into a hash, while you can fold a scalar into a
list, we run into the type of icky situations that I abhor.

I honestly would love to treat parsed cgi parameters as a simple data
store. If there's nothing at a key, nil. If there's one or more values
at a key (including an empty string) return an Array of the values. I
can totally deal with the additional type check of a hash on top of
that, but I _hate_ edge cases surrounding a possible single
accidentally omitted character.
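
A minimal sketch of that "simple data store" shape, assuming flat keys only. `parse_as_lists` is a hypothetical name; it relies on `URI.decode_www_form` from Ruby's standard library rather than anything in Rack.

```ruby
require "uri"

# Hypothetical helper: every key that appears maps to an Array of its values,
# and a key that never appears is simply absent (so lookup returns nil).
def parse_as_lists(query)
  URI.decode_www_form(query).each_with_object({}) do |(key, value), params|
    (params[key] ||= []) << value
  end
end

params = parse_as_lists("foo=1&foo=2&baz=")
params["foo"]     # => ["1", "2"]
params["baz"]     # => [""]
params["missing"] # => nil
```

There is only one value type to check for, at the cost of a `.first` for every single-valued parameter.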

--
stadik.net

Christian Neukirchen

Jan 28, 2009, 2:40:42 PM
to rack-...@googlegroups.com
Scytrin dai Kinthra <scy...@gmail.com> writes:

> If there's one or more values
> at a key (including an empty string) return an Array of the values.

+1, the current behavior is actually a misunderstanding on my part. ;-)

Matt Todd

Jan 28, 2009, 3:16:56 PM
to rack-...@googlegroups.com
So your preferred API for accessing parameter values is something like this (specifically when it's only been specified as foo=bar):

params["foo"].first # => "bar"

Instead of:

params["foo"] #=> "bar"

Is this correct?

I think, internally, in web applications, this is counterintuitive.

Perhaps we should settle on a policy similar to: if there are any questions about what should be done with the data, put it in an array of values. That way we still get a hash-like experience but for questionable data we pack it into an array.

Matt

Christian Neukirchen

Jan 28, 2009, 3:28:07 PM
to rack-...@googlegroups.com
Matt Todd <chio...@gmail.com> writes:

> Perhaps we should settle on a policy similar to: if there are any questions
> about what should be done with the data, put it in an array of values. That way
> we still get a hash-like experience but for questionable data we pack it into
> an array.

Or maybe req.params["foo"].first, but req["foo"]? That's what cgi.rb does...
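
A sketch of how the two access styles could coexist on one object. `RequestSketch` is a hypothetical class for illustration, not Rack::Request, and it assumes the parser has already produced a hash of arrays.

```ruby
# Hypothetical wrapper: params keeps every value as an array,
# while [] is a shortcut for the first value of a key.
class RequestSketch
  attr_reader :params

  def initialize(params)
    @params = params
  end

  # Returns the first value for key, or nil when the key is absent.
  def [](key)
    Array(@params[key]).first
  end
end

req = RequestSketch.new({ "foo" => ["bar", "baz"] })
req.params["foo"] # => ["bar", "baz"]
req["foo"]        # => "bar"
req["missing"]    # => nil
```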

Josh Peek

Jan 28, 2009, 3:30:08 PM
to Rack Development
Updated gist without clobbering:

Rack::Utils.parse_query("x[y]=1&x[y]=2").
should.equal "x" => {"y" => ["1", "2"]}
Rack::Utils.parse_query("x[y][]=1&x[y][]=2").
should.equal "x" => {"y" => ["1", "2"]}

http://gist.github.com/53541

On Jan 28, 11:23 am, Scytrin dai Kinthra <scyt...@gmail.com> wrote:
> I honestly would love to treat parsed cgi parameters as a simple data
> store. If there's nothing at a key, nil. If there's one or more values
> at a key (including an empty string) return an Array of the values.

I think I'm misunderstanding, but are you saying "foo=bar": {"foo" =>
["bar"]}? I would find that frustrating to work with.

Magnus Holm

Jan 28, 2009, 3:33:52 PM
to rack-...@googlegroups.com
What we can do is provide a simple_params, which will simply split it into a Hash of {String => Array} if you need the raw params without caring about encoding or splitting.

//Magnus Holm

Matt Todd

Jan 28, 2009, 3:39:31 PM
to rack-...@googlegroups.com
I agree with Josh here.

I simply don't understand why we would want to prefer a more complicated method in terms of the usage API.

Matt

Joshua Peek

Jan 29, 2009, 2:00:24 PM
to rack-...@googlegroups.com