Bug report and unit test for infinite loop parsing Content-Disposition header

Paul Rogers

May 4, 2012, 5:37:44 PM
to Rack Development
Hi,

I created this

g...@github.com:paulrogers/rack.git

showing a test that seems to have an infinite loop issue when parsing
a multipart form.

You can run the test using

bacon -I./lib:./test -a -t 'Rack::Multipart'

What seems to happen is that when parsing a header like this

Content-Disposition: inline; name=xml_product_config;
filename=XML_PRODUCT_CONFIG.xml

the regexp in the get_filename method in parser.rb seems to get stuck
in an infinite loop on the line with

if head =~ RFC2183

This happens in the tests as well as in the unit test in the attached
git commit (is that the correct term?).

I'd be grateful if someone could take a look.

Thanks,
Paul


Eric Wong

May 4, 2012, 7:34:15 PM
to rack-...@googlegroups.com
Paul Rogers <pmr1...@gmail.com> wrote:
> the regexp in the get_filename method in parser.rb seems to get stuck
> in an infinite loop on the line with
>
> if head =~ RFC2183

This is an unfortunate consequence of the backtracking regexp engine used by Ruby.

> This happens in the tests as well as in the unit test in the attached
> git commit (is that the correct term?)

I'm not a regexp/finite-automata expert, but having multiple '*' or
mixing '*'/'+' in a regexp can be problematic.
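
To put a rough number on the concern: in the original DISPPARM the value group ends in `*` while TOKEN itself ends in `+`, so a run of n token characters can be divided between the inner and outer quantifier in 2^(n-1) ways. A backtracking engine may have to try all of them before it can report that the overall match fails. The sketch below only counts those splits; `token_splits` is a hypothetical helper for illustration, not part of Rack.

```ruby
# Count the ways a run of n token characters can be cut into one or more
# non-empty pieces, i.e. the ways (TOKEN)* with TOKEN = /[...]+/ can consume it.
# token_splits is a hypothetical helper, not part of Rack.
def token_splits(n)
  return 1 if n <= 1
  # Each of the n - 1 gaps between characters is either a cut or not.
  2**(n - 1)
end

value = 'XML_PRODUCT_CONFIG.xml' # filename value from the report
puts "#{value.length} characters: #{token_splits(value.length)} possible splits"
```

With two such values in one header the counts multiply, which would be consistent with a match that looks like an infinite loop rather than a slow one.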

I think the following should fix your issue (but I'm not sure it's
correct):

diff --git a/lib/rack/multipart.rb b/lib/rack/multipart.rb
index 3777106..6849248 100644
--- a/lib/rack/multipart.rb
+++ b/lib/rack/multipart.rb
@@ -12,7 +12,7 @@ module Rack
 MULTIPART = %r|\Amultipart/.*boundary=\"?([^\";,]+)\"?|n
 TOKEN = /[^\s()<>,;:\\"\/\[\]?=]+/
 CONDISP = /Content-Disposition:\s*#{TOKEN}\s*/i
- DISPPARM = /;\s*(#{TOKEN})=("(?:\\"|[^"])*"|#{TOKEN})*/
+ DISPPARM = /;\s*(#{TOKEN})=("(?:\\"|[^"])*"|#{TOKEN})/
 RFC2183 = /^#{CONDISP}(#{DISPPARM})+$/i
 BROKEN_QUOTED = /^#{CONDISP}.*;\sfilename="(.*?)"(?:\s*$|\s*;\s*#{TOKEN}=)/i
 BROKEN_UNQUOTED = /^#{CONDISP}.*;\sfilename=(#{TOKEN})/i
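
For anyone who wants to try the patched pattern outside of Rack, here is a minimal sketch. The four constants are copied from the `+` side of the diff; `disposition_params` is a hypothetical helper, and the header is the one from the report joined onto a single line, on the assumption that the line break in the original message was mail wrapping.

```ruby
# Constants copied from the patched side of the diff above.
TOKEN    = /[^\s()<>,;:\\"\/\[\]?=]+/
CONDISP  = /Content-Disposition:\s*#{TOKEN}\s*/i
DISPPARM = /;\s*(#{TOKEN})=("(?:\\"|[^"])*"|#{TOKEN})/
RFC2183  = /^#{CONDISP}(#{DISPPARM})+$/i

# Header from the report, joined onto one line (assumption: the break was wrapping).
HEAD = 'Content-Disposition: inline; name=xml_product_config; ' \
       'filename=XML_PRODUCT_CONFIG.xml'

# Hypothetical helper, not Rack's API: return the disposition parameters
# as a Hash, or an empty Hash when the header does not match RFC2183.
def disposition_params(head)
  return {} unless head =~ RFC2183
  # DISPPARM has two capture groups, so scan yields [name, value] pairs.
  Hash[head.scan(DISPPARM)]
end

p disposition_params(HEAD)
```

A quoted value is returned with its surrounding quotes still attached, so a caller would need to strip them separately.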

Lawrence Pit

May 6, 2012, 8:39:11 PM
to rack-...@googlegroups.com

Given that the value in DISPPARM must always have at least one character (according to RFC 2183 and RFC 2045), that fix seems correct to me.

In addition I would make the TOKEN regexp non-greedy (for the BROKEN_UNQUOTED case):

TOKEN = /[^\s()<>,;:\\"\/\[\]?=]+?/

Also, why is the "@" character accepted as part of a TOKEN? It is part of the tspecials (in RFC2045), so I think it should not be accepted as a valid token character.
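
As a quick way to see the difference, the sketch below compares the character class as posted in this thread with a variant that also excludes "@". Both constant names are hypothetical, and the variant only illustrates the question above; it is not a proposed patch and does not cover the rest of the RFC 2045 token grammar (control characters, non-ASCII).

```ruby
# TOKEN as quoted in this thread, under a hypothetical name.
TOKEN_AS_POSTED = /[^\s()<>,;:\\"\/\[\]?=]+/
# Hypothetical variant that additionally rejects "@", an RFC 2045 tspecial.
TOKEN_NO_AT     = /[^\s()<>@,;:\\"\/\[\]?=]+/

sample = 'user@example'
# String#[] with a regexp returns the matched text, or nil when nothing matches.
p sample[/\A#{TOKEN_AS_POSTED}\z/] # whole string is accepted as one token
p sample[/\A#{TOKEN_NO_AT}\z/]     # rejected because of the "@"
```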


Cheers,
Lawrence

Paul Rogers

May 7, 2012, 11:04:28 AM
to Rack Development
Thanks for the responses, and sorry for the double posting, not sure
what happened there.

I also found that quoting the filename makes the tests pass. The app
I'm using this in is a mock for another service, and I'll have to check
whether the real service accepts a quoted string.

I'll also try these fixes in case they work better for me.

Thanks,

Paul