Link Session to Web Page?

Chad Sowald

Nov 23, 2009, 11:22:05 PM
to Fiddler
Hi Eric,

I am looking for a way to relate a session (request/response) in
Fiddler to the web page that triggered it, while many other web pages
may be loading at the same time. One idea I had is to add a custom
header, but I think this would only work for a page's base request and
not for the resources referenced by that page, so I still couldn't
tell which resources were loaded for which page.

Is there a way with Fiddler (or more specifically FiddlerCore) to
relate a session to a web page? Would I have to modify the web
browser itself to enable this since it is 'page aware'? Perhaps this
could be accomplished with a browser plug-in?

Thanks.

Jan Martinec

Nov 24, 2009, 4:32:03 AM
to Fiddler


> I am looking for a way to relate a session (request/response) in
> Fiddler to a web page that is being loaded while many other web pages
> may also be loading at the same time.
You could relate the requests by process ID, if the pages are loading
in separate processes.

(for IE8, you can start a new window+process with the -nomerge command
line switch)
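
A minimal sketch of the process-ID idea, written in Python for
illustration rather than against the FiddlerCore API. It assumes each
captured session has already been reduced to a record with a "pid" and
a "url" field; that record shape, the example URLs, and the function
name are hypothetical.

```python
from collections import defaultdict

def group_by_process(sessions):
    """Group captured request URLs by the process ID that issued them."""
    groups = defaultdict(list)
    for session in sessions:
        groups[session["pid"]].append(session["url"])
    return dict(groups)

# Hypothetical captured sessions: two browser processes, one page each.
sessions = [
    {"pid": 4120, "url": "http://example.com/"},
    {"pid": 4388, "url": "http://example.org/"},
    {"pid": 4120, "url": "http://example.com/style.css"},
]
pages = group_by_process(sessions)
```

This only separates pages when each page really does load in its own
process; two pages sharing a process end up in the same group.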

Chad Sowald

Nov 24, 2009, 10:53:59 AM
to Fiddler
Jan - do you know if this is possible to specify using the .NET
WebBrowser class (which is just a wrapper around IE)?

Thanks so much for the idea too!
~Chad

EricLaw

Nov 24, 2009, 5:14:17 PM
to Fiddler
Chad-- The process ID is always going to be the process ID of your
application which is hosting the web browser control. If you're
hosting two WebBrowser instances within your process, you won't be
able to use the ProcessID to distinguish between them.

(It's not quite correct to say that the WebBrowser class is a wrapper
around IE-- it would be more correct to say that Internet Explorer is
just a wrapper around the WebBrowser class.)

Chad Sowald

Nov 24, 2009, 8:23:23 PM
to Fiddler
So, is there any hope of relating Fiddler sessions to the specific web
pages that are loading? Sadly, I have to believe there isn't a way to
do this, as Fiddler knows nothing about the browser and vice versa.
The only hope would be a browser plugin that sets some header on each
request, which Fiddler (or FiddlerCore) could see and my own
application could then handle.

Eric, thanks for clearing up the WebBrowser/IE hierarchy!

~Chad

EricLaw

Nov 24, 2009, 8:50:33 PM
to Fiddler
Can you describe a bit more about the overall scenario you're trying
to achieve?

Tracking requests back to their origins is a non-trivial exercise,
even if you're running directly in the browser itself.

Chad Sowald

Nov 24, 2009, 9:38:43 PM
to Fiddler
If you're willing to read a bit more I'm certainly willing to write a
bit more :-)

The scenario is that I will be programmatically loading a dynamic
number of web pages at once using (currently, though it may change)
the WebBrowser .NET class. I do not show the WebBrowser at all. More
precisely, there is a queue of pages to load and at any time 'n' web
pages are loading. While all of this is going on, I'm using a proxy
(FiddlerCore, bless you!) to watch the HTTP traffic. I want to be
able to know what pages spawned what HTTP requests. Even more
precisely, I'm crawling web pages to a certain depth and (among other
reasons for relating web pages and resources) I want users of the
program to be able to click on a resource and go back to the page that
that resource came from.

- For performance reasons, I want to load web pages in parallel.
- For accuracy and completeness reasons, I don't want to rely only on
  statically parsing a web page's HTML to look for referenced
  resources, as this wouldn't tell me about more dynamically requested
  resources (via JavaScript, AJAX, or browser plugins).

Hopefully it's a bit clearer now, though the solution may not be :-)

If this is possible without a browser plugin that would be ideal. If
not, I'm still interested in a solution rather than none at all.

Thanks for your help and advice,
~Chad

EricLaw

Nov 25, 2009, 3:13:05 PM
to Fiddler
Chad--

If you're writing your own crawler application, you could just run
each WebBrowser Control instance in its own process, then
differentiate based on the process ID. Also, note that the WebBrowser
class' Navigate() method accepts a parameter which allows you to
specify additional HTTP headers; you could use this to tag each
outbound navigation with a GUID or whatever, and that additional
header actually gets applied to the subdownloads used by that page
during the rendering process. You can detect that header in
FiddlerCore (and even remove it before it hits the wire, if you want)
and then use that to associate individual requests to the control &
navigation. I'm not sure how well the additional header mechanism
works for XHR requests or those initiated by Flash; for those cases,
you may need to rely on process isolation for differentiation.
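
The tagging idea described above could be sketched roughly as follows.
This is Python for illustration only, not FiddlerCore or WebBrowser
code. The header name "X-Crawl-Tag", the registry dict, and both
function names are assumptions made for this example, and the
dictionary lookup is case-sensitive, which real HTTP header handling
is not.

```python
import uuid

TAG_HEADER = "X-Crawl-Tag"  # hypothetical header name, not from the thread

def new_navigation(registry, page_url):
    """Create a unique tag for one navigation and return the extra header line."""
    tag = str(uuid.uuid4())
    registry[tag] = page_url
    return f"{TAG_HEADER}: {tag}\r\n"

def on_request(registry, headers):
    """Remove the tag header (so it never reaches the server) and
    return the page URL it maps to, or None if the request is untagged."""
    tag = headers.pop(TAG_HEADER, None)
    return registry.get(tag)

registry = {}
header_line = new_navigation(registry, "http://www.example.com/page")

# Simulate a proxied request that carries the tag.
tag = header_line.split(": ", 1)[1].strip()
request_headers = {"Accept": "*/*", TAG_HEADER: tag}
origin = on_request(registry, request_headers)
```

Requests that arrive without the tag (for example, ones the header
mechanism does not reach) come back as None and would need another
way to be attributed, such as process isolation.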

Thanks,
Eric

Chad Sowald

Nov 26, 2009, 12:27:46 PM
to Fiddler
Hi Eric. I'll have to work more on the process isolation since I
ultimately care about XHR and Flash requests/responses, but I gave the
'additionalHeaders' in Navigate(...) a try and I'm not seeing the
expected header on any request except the one for the base page. Here
are some dumped headers to show you what I mean:

-->BASE PAGE
GET /watch?v=-hpiwPXkbVc HTTP/1.1
Accept: */*
Base-Page: http://www.youtube.com/watch?v=-hpiwPXkbVc   (my special header!)
Accept-Language: en-us
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Trident/4.0; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)
Host: www.youtube.com
Proxy-Connection: Keep-Alive

-->RESOURCE REQUEST FROM BASE PAGE
GET /yt/cssbin/www-core-vfl134429.css HTTP/1.1
Accept: */*
Referer: http://www.youtube.com/watch?v=-hpiwPXkbVc
Accept-Language: en-us
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Trident/4.0; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)
Host: s.ytimg.com
Proxy-Connection: Keep-Alive


Furthermore, I'm having trouble locating any helpful documentation on
the 'additionalHeaders' argument to verify its functionality. Since I
will probably isolate the browsers, this shouldn't matter in the end,
but it would be nice to know how that argument is supposed to work.

More interestingly, the "Referer" header is being set by the
WebBrowser control, which seems to partially resolve my issue, but
then on some requests such as:
GET /videoplayback?ip=0.0.0.0&sparams=id%2Cexpire%2Cip%2Cipbits%2Citag%2Calgorithm%2Cburst%2Cfactor&algorithm=throttle-factor&itag=34&ipbits=0&burst=40&sver=3&expire=1259276400&key=yt1&signature=6C438B374EC0CE5ADB45D8A9022A24B5668A93E2.68168074C42EB683B2E960F04C6659501D43CCAC&factor=1.25&id=fa1a62c0f5e46d57 HTTP/1.1
Accept: */*
Accept-Language: en-US
Referer: http://s.ytimg.com/yt/swf/watch-vfl134346.swf
x-flash-version: 10,0,32,18
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Trident/4.0; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)
Host: v14.lscache1.c.youtube.com
Proxy-Connection: Keep-Alive


The "Referer" is set to the resource that made the request (the Flash
SWF file), so it seems I would then have to map that SWF file back to
the base page (through that referer) and therefore know that this
request is based on the YouTube page. I'm thinking this wouldn't
always work though as the same SWF file (for example) could be loaded
on another page and then load a different resource - I now wouldn't
know which base page to use. Plus, I don't know if the Referer header
is always sent.
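
To make the ambiguity concrete, here is a rough Python sketch of
walking Referer headers back to a base page. It assumes the captured
traffic has been reduced to (url, referer) pairs and that the set of
base pages is known from the crawl queue; the example URLs and both
function names are hypothetical.

```python
from collections import defaultdict

def build_referer_map(requests):
    """Map each requested URL to the set of Referer values seen for it."""
    referers = defaultdict(set)
    for url, referer in requests:
        if referer:
            referers[url].add(referer)
    return referers

def candidate_base_pages(referers, base_pages, url):
    """Follow Referer links upward from url and collect every base page reached."""
    found, seen, stack = set(), set(), [url]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # guard against Referer cycles
        seen.add(current)
        if current in base_pages:
            found.add(current)
            continue
        stack.extend(referers.get(current, ()))
    return found

base_pages = {"http://a.example/pageA", "http://b.example/pageB"}
requests = [
    ("http://a.example/pageA", None),
    ("http://b.example/pageB", None),
    ("http://cdn.example/a.css", "http://a.example/pageA"),
    ("http://cdn.example/player.swf", "http://a.example/pageA"),
    ("http://cdn.example/player.swf", "http://b.example/pageB"),
    ("http://cdn.example/video", "http://cdn.example/player.swf"),
]
referers = build_referer_map(requests)
css_pages = candidate_base_pages(referers, base_pages, "http://cdn.example/a.css")
video_pages = candidate_base_pages(referers, base_pages, "http://cdn.example/video")
```

In this made-up capture the stylesheet resolves to a single page, but
the video request resolves to both pages because the same SWF was
loaded by each, which is exactly the case where the Referer alone is
not enough.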

So, it seems as though I have to go down the new road of process
isolation, which is completely unfamiliar to me :-)

EricLaw

Nov 27, 2009, 9:33:06 PM
to Fiddler
Hmm... in IE8, the additional header is getting propagated to the
subdownloads as well. I wonder if that's a change from IE7?

As for the Referer thing-- yeah, not all requests will have the
referer set; in the case you show, that's actually the Flash plugin
that's setting the Referer header, which is why it's not set as one
might expect. You can tell because Flash sets the "x-flash-version"
custom header on its requests.

Chad Sowald

Nov 27, 2009, 10:06:23 PM
to Fiddler
Can I ask how you know the additionalHeaders are being propagated? I
have IE8 and .NET 3.5 installed, though I use a different default
browser.

I found this page:
http://blogs.msdn.com/ie/archive/2009/03/10/more-ie8-extensibility-improvements.aspx

which appears to say that I have to force IE8 usage through the
registry (yuck) for my application. Are you accomplishing this some
other way?

Well, I've started to work on separating the browser into its own
process - having fun with redirecting output back to my crawler :-)

Thanks again,
~Chad

EricLaw

Dec 1, 2009, 10:01:23 AM
to Fiddler
Chad, if IE8 is installed, your application is using it. The registry
key controls only what document version is reported to the server.

As to how I know that the additional headers are getting propagated--
I see them in FiddlerCore in my sample application. (I also saw this
behavior when working on an IE8 issue earlier this year).
