ParamsParser and request body streams

160 views
Skip to first unread message

brianmario

unread,
May 15, 2009, 2:31:49 PM5/15/09
to Ruby on Rails: Core
Hey everyone,
I was wondering what the possability was of changing these lines:

body = request.raw_post
Hash.from_xml(body)
or
data = ActiveSupport::JSON.decode(body)

to instead be handed the env['rack.input'] IO itself?

Being as though both calls (from_xml and decode) are handled by rails
elsewhere, we could move the responsibility of parsing of the request
body IO to the swappable XML/JSON backends.

I'm asking because I'm looking to integrate yajl-ruby as one of the
JSON backends for ActiveSupport in Rails 3. It's capable of parsing
JSON right off the IO as a stream, and this change would allow it to
perform it's best (and keep memory usage very low for large request
bodies).

As for doing it for the XML backends as well, it opens up the
possibility of doing the type of stream parsing for XML.

Comments?

Michael Koziarski

unread,
May 15, 2009, 7:34:34 PM5/15/09
to rubyonra...@googlegroups.com
> Comments?

This sounds like a great idea to me, without much in the way of downsides.

So give it a go and see what breaks.


--
Cheers

Koz

Joshua Peek

unread,
May 15, 2009, 7:53:38 PM5/15/09
to rubyonra...@googlegroups.com
Check out how the current ParamsParser middleware works.

You can use whatever parser to extract the request params in your
middleware and Rails will be fooled into using that instead.

--
Joshua Peek

brianmario

unread,
May 15, 2009, 7:59:30 PM5/15/09
to Ruby on Rails: Core
The problem is that the ParamsParser reads the entire body into a
string (from what I can tell by the line "body = request.raw_post")
before handing it to the parsing backend. My suggestion is that it
hand the IO object (env['rack.input'] I assume?) to the parser
instead. This way the parser can either read the entire string, then
parse - OR if it supports parsing as a stream, start doing so directly
off the IO.

Joshua Peek

unread,
May 16, 2009, 1:37:02 AM5/16/09
to rubyonra...@googlegroups.com
On Fri, May 15, 2009 at 6:59 PM, brianmario <senio...@gmail.com> wrote:
>
> The problem is that the ParamsParser reads the entire body into a
> string (from what I can tell by the line "body = request.raw_post")
> before handing it to the parsing backend. My suggestion is that it
> hand the IO object (env['rack.input'] I assume?) to the parser
> instead. This way the parser can either read the entire string, then
> parse - OR if it supports parsing as a stream, start doing so directly
> off the IO.

Ah, sure. I was thinking about something different.

Yeah, sure we could just hand off the raw IO object to the parser instead.

I guess the real patch would allow "ActiveSupport::JSON.decode" to
accept an IO object as well. Then we could just pass that directly in.

--
Joshua Peek

Michael Koziarski

unread,
May 16, 2009, 1:40:31 AM5/16/09
to rubyonra...@googlegroups.com
> I guess the real patch would allow "ActiveSupport::JSON.decode" to
> accept an IO object as well. Then we could just pass that directly in.

yeah, ideally both the JSON and XML parsers would accept an IO, and
the 'read into a string' logic would live in the implementations which
don't support streaming.

--
Cheers

Koz

brianmario

unread,
May 16, 2009, 3:54:52 AM5/16/09
to Ruby on Rails: Core
Exactly.

What's the next step here?
I can fork Rails and work on a patch, though I'm unsure how many/which
tests will need to be refactored.

Mislav Marohnić

unread,
May 16, 2009, 12:44:53 PM5/16/09
to rubyonra...@googlegroups.com
On Sat, May 16, 2009 at 09:54, brianmario <senio...@gmail.com> wrote:

Exactly.

What's the next step here?
I can fork Rails and work on a patch, though I'm unsure how many/which
tests will need to be refactored.

I don't think existing tests need changes. New tests have to be added:
  1. that JSON parser works with an IO;
  2. that XML parser works with an IO;
  3. that the ParamsParser middleware doesn't read the entire stream into a string itself.

brianmario

unread,
May 16, 2009, 7:01:21 PM5/16/09
to Ruby on Rails: Core
Ok I forked and patched the ParamsParser to just pass request.body to
the parsers. I did it for JSON, XML and YAML (using YAML.load_stream
instead of just load).
I also patched the XmlMini and JSON decoders for parsing from an IO,
in addition to a string. And as a result (like you said), I didn't
have to refactor any tests. Just added the ones regarding parsing from
an IO.
The one test I haven't figured out how to write, is the "that the
ParamsParser middleware doesn't read the entire stream into a string
itself." test. I can imagine how I might do it using rspec/mocha but
no idea using Test::Unit.

Here's the commit:
http://github.com/brianmario/rails/commit/c63703489eb1f3f4dd96885e1e223126e5208638

Feedback?

Also, should I make a lighthouse ticket for this?

-Brian

On May 16, 9:44 am, Mislav Marohnić <mislav.maroh...@gmail.com> wrote:
> On Sat, May 16, 2009 at 09:54, brianmario <seniorlo...@gmail.com> wrote:
>
> > Exactly.
>
> > What's the next step here?
> > I can fork Rails and work on a patch, though I'm unsure how many/which
> > tests will need to be refactored.
>
> I don't think existing tests need changes. New tests have to be added:
>
>    1. that JSON parser works with an IO;
>    2. that XML parser works with an IO;
>    3. that the ParamsParser middleware doesn't read the entire stream into a
>    string itself.

Joshua Peek

unread,
May 16, 2009, 10:22:58 PM5/16/09
to rubyonra...@googlegroups.com
Yeah, create a LH patch and assign it to me plz
--
Joshua Peek

brianmario

unread,
May 16, 2009, 11:34:50 PM5/16/09
to Ruby on Rails: Core
Done: https://rails.lighthouseapp.com/projects/8994/tickets/2659-paramsparser-and-request-body-streams#ticket-2659-3

Let me know if you need anything else.

-Brian

On May 16, 7:22 pm, Joshua Peek <j...@joshpeek.com> wrote:
> Yeah, create a LH patch and assign it to me plz
>
>
>
>
>
> On Sat, May 16, 2009 at 6:01 PM, brianmario <seniorlo...@gmail.com> wrote:
>
> > Ok I forked and patched the ParamsParser to just pass request.body to
> > the parsers. I did it for JSON, XML and YAML (using YAML.load_stream
> > instead of just load).
> > I also patched the XmlMini and JSON decoders for parsing from an IO,
> > in addition to a string. And as a result (like you said), I didn't
> > have to refactor any tests. Just added the ones regarding parsing from
> > an IO.
> > The one test I haven't figured out how to write, is the "that the
> > ParamsParser middleware doesn't read the entire stream into a string
> > itself." test. I can imagine how I might do it using rspec/mocha but
> > no idea using Test::Unit.
>
> > Here's the commit:
> >http://github.com/brianmario/rails/commit/c63703489eb1f3f4dd96885e1e2...

brianmario

unread,
May 17, 2009, 12:54:28 PM5/17/09
to Ruby on Rails: Core
Thanks for committing that, but what was the reason you removed the
yajl JSON backend and updated JSON test?
I should have noted somewhere that the JSON test and yajl.rb backend
were originally written by Rick Olson, and I made modifications to
support the ability to be passed an IO.

-Brian

On May 16, 8:34 pm, brianmario <seniorlo...@gmail.com> wrote:
> Done:https://rails.lighthouseapp.com/projects/8994/tickets/2659-paramspars...

brianmario

unread,
May 30, 2009, 10:41:06 PM5/30/09
to Ruby on Rails: Core
I've updated the LH ticket with another patch - finishing off the
changes to actually parse from the IO (instead of converting to a
string first) for the rexml, libxml and nokogiri backends.
Should this be in another ticket?

-Brian
Reply all
Reply to author
Forward
0 new messages