Images stalling and timing out when downloaded through BMP


Simonz

Apr 12, 2013, 3:57:25 AM
to browserm...@googlegroups.com
Hello all,

I have an odd issue where the images on a particular web site fail to download correctly, but only when the site is accessed via browsermob-proxy. After a long time (minutes) the connection times out, with BMP apparently waiting for more data that never arrives. BMP then prints:

WARN 04/12 06:47:09 o.b.p.j.h.HttpConne~ - Invalid length: Content-Length=87813 written=81469 for http://images...

The actual size of the image is indeed 87813 bytes, and it downloads fine with wget, curl, etc. The request headers going upstream are:

GET /_lib/images/xxxxx
Host: images.xxxx
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:14.0) Gecko/20100101 Firefox/14.0.1
Accept: image/png,image/*;q=0.8,*/*;q=0.5
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip, deflate
Proxy-Connection: keep-alive
Referer: xxxxx
Pragma: no-cache
Connection: Keep-Alive

The response is:

HTTP/1.1 200 OK
Server: nginx/0.7.65
Date: Fri, 12 Apr 2013 07:40:13 GMT
Content-Type: image/jpeg
Connection: keep-alive
Cache-Control: max-age=86400
Last-Modified: Thu, 18 Oct 2012 21:49:18 GMT
Accept-Ranges: bytes
ETag: "xxxxxxxxx:0"
X-Powered-By: ASP.NET
Content-Length: 87813
X-Cache: MISS from xxxxxx
X-Cache-Lookup: HIT from xxxxxxx
Via: 1.0 xxxxx (squid)

This is identical to what is sent when a normal browser accesses the image. It seems like there's something wrong with how BMP tries to download the images.
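The "Invalid length" warning earlier in this message boils down to comparing the declared Content-Length with the number of bytes actually relayed. The following is only a minimal sketch of that kind of check, not BMP's or Jetty's actual code; the class `LengthCheck` and both method names are hypothetical and exist purely for illustration.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper for illustration only; not part of BrowserMob Proxy.
final class LengthCheck {

    // Reads the stream to the end and returns how many bytes were seen.
    static long countBytes(InputStream in) throws IOException {
        byte[] buf = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buf)) != -1) {
            total += n;
        }
        return total;
    }

    // Returns a message in the spirit of the warning above,
    // or null when the declared and written lengths agree.
    static String describeMismatch(long declared, long written) {
        if (declared == written) {
            return null;
        }
        return "Invalid length: Content-Length=" + declared
                + " written=" + written
                + " (difference " + (declared - written) + " bytes)";
    }
}
```

With the numbers from the log line, `LengthCheck.describeMismatch(87813, 81469)` reports a difference of 6344 bytes, which is the tail of the image that never arrived before the timeout.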

Just wondering if anyone has insight into this?

Many thanks,

Simon

Simonz

Apr 13, 2013, 3:24:14 AM
to browserm...@googlegroups.com
Hello all,

I just wanted to follow this up since after some fairly excruciating debugging I seem to have an answer.

The problem goes away entirely if I ensure the "Proxy-Connection" header is not passed through from the client request to the outgoing request from BMP to the hosting server. I was reviewing the role of this header and it seems that, apart from being deprecated to the point where there is a bug logged to remove it from Firefox, it is never meant to be relayed past a proxy.

When I reviewed the code some more I saw that there is actually a list of such headers to mask from being sent upstream (see SeleniumProxyHandler line 59). However, the BrowserMobProxyHandler class, which overrides the same method that masks those headers in the SeleniumProxyHandler base class, does not mask them.

So, on the face of it, this seems like a bug: headers intended to stay between the browser and the proxy are being transmitted upstream to the end point. Unless there was some reason for doing this, I'd like to propose the following patch:

--- a/src/main/java/org/browsermob/proxy/BrowserMobProxyHandler.java
+++ b/src/main/java/org/browsermob/proxy/BrowserMobProxyHandler.java
@@ -214,7 +214,8 @@ public class BrowserMobProxyHandler extends SeleniumProxyHandler {
                             hasContent = true;
                         }
 
-                        httpReq.addRequestHeader(hdr, val);
+                        if(!_DontProxyHeaders.containsKey(hdr))
+                          httpReq.addRequestHeader(hdr, val);
                     }
                 }
             }

Would love to have any feedback or comment on this. I have not tested this widely yet, but will be applying it to our own code base, and since it is directly in the original Selenium code I have a feeling it should be quite safe.
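For readers without the BMP source at hand, the idea behind the patch can be sketched in isolation. This is a minimal sketch, assuming a plain map of request headers; the class `HopByHopHeaderFilter` and its header list are hypothetical, and the actual contents of `_DontProxyHeaders` in SeleniumProxyHandler may differ. Names are compared in lower case because HTTP header names are case-insensitive.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical helper for illustration only; not part of BMP or Selenium.
final class HopByHopHeaderFilter {

    // Headers meant only for the browser-to-proxy hop. This list is an
    // assumption for the sketch, not a copy of _DontProxyHeaders.
    private static final Set<String> DONT_PROXY = new HashSet<>(Arrays.asList(
            "proxy-connection", "connection", "keep-alive",
            "transfer-encoding", "te", "trailer",
            "proxy-authorization", "proxy-authenticate", "upgrade"));

    // Returns a copy of the client headers without the masked ones,
    // preserving the original order and spelling of the rest.
    static Map<String, String> filter(Map<String, String> clientHeaders) {
        Map<String, String> upstream = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : clientHeaders.entrySet()) {
            String name = e.getKey().toLowerCase(Locale.ROOT);
            if (!DONT_PROXY.contains(name)) {
                upstream.put(e.getKey(), e.getValue());
            }
        }
        return upstream;
    }
}
```

Applied to the request shown in the first message, such a filter would drop Proxy-Connection (and, with this particular list, Connection) while leaving Host, User-Agent, Accept-Encoding and the rest untouched.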

Many thanks,

Simon

Patrick Lightbody

Apr 13, 2013, 8:54:01 AM
to browserm...@googlegroups.com

Nice work! Let's do it. Please submit a pull request. Thanks.



Roy de Kleijn

Apr 13, 2013, 12:08:22 PM
to browserm...@googlegroups.com
Is it true that we can have the same issue with big flash files?

Best Regards,
Roy


Patrick Lightbody

Apr 13, 2013, 6:21:04 PM
to browserm...@googlegroups.com
I expect it could happen to any number of file types.

Simon Sadedin

Apr 13, 2013, 9:25:58 PM
to browserm...@googlegroups.com
> I expect it could happen to any number of file types.

Yes, file type has little to do with it, except that because the stall happens after quite a large amount of data (about 70 KB in my case), you won't experience it on most assets a normal web site delivers.

The particular case where I am seeing it is with images delivered from a CDN, and it appears (I'm sort of guessing here) that they are using some kind of reverse proxy on their back end to load balance. So my guess is that the "Proxy-Connection" header is somehow interfering with that layer. To be honest, I don't fully understand why it's an issue, but I'd say you're more likely to encounter this when there are further proxies chained upstream that might try to interpret the header.

Cheers,

Simon

Simon Sadedin

Apr 13, 2013, 9:35:26 PM
to browserm...@googlegroups.com
On Sat, Apr 13, 2013 at 10:54 PM, Patrick Lightbody <pat...@lightbody.net> wrote:

> Nice work! Let's do it. Please submit a pull request. Thanks.

Awesome, thanks!

Unfortunately I'm in the awkward situation where the company lawyers have told me we're not allowed to place code directly onto GitHub (something about terms of use ...). I know it's ridiculous, and given it's a one-line change, I can probably talk them down on this one, but is it possible you can humor me and apply the change directly?

Cheers,

Simon


Patrick Lightbody

Apr 21, 2013, 3:19:24 PM
to browserm...@googlegroups.com
I am about to commit this. I feel like I made this change explicitly at some point for some reason, but I can't recall why :) So let's put it in and see if anyone reports any strange anomalies.

