- You could check whether the new request's cookie matches a cookie already seen for one of the page view's existing domains. This won't catch simultaneous page views that overlap domains, but it should help for the common case of Ajax requests from multiple open tabs or applications.
- Checking the User-Agent header could also eliminate requests from other applications or browsers.
- Another technique that might work is unzipping/decoding each object already in a page view and applying a regex like
http://daringfireball.net/2009/11/liberal_regex_for_matching_urls to pull out URLs that should be included in this page view if they are encountered later. Again, this won't work for multiple simultaneous loads of the same page.
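
As a rough illustration of how these three heuristics might fit together, here is a minimal Python sketch. Everything in it is an assumption made for the example: the dict layout for requests and page views, the function names, and the URL pattern, which is a much simpler stand-in for the liberal regex linked above. It is not taken from any existing HAR tool.

```python
import gzip
import re
import zlib

# Simplified URL pattern for illustration only (an assumption);
# far less thorough than the liberal regex linked above.
URL_RE = re.compile(r"https?://[^\s\"'<>()]+")


def decode_body(raw, content_encoding=""):
    """Undo gzip/deflate content encoding and return the body as text."""
    enc = content_encoding.strip().lower()
    if enc == "gzip":
        raw = gzip.decompress(raw)
    elif enc == "deflate":
        try:
            raw = zlib.decompress(raw)
        except zlib.error:
            # Some servers send a raw deflate stream without the zlib header.
            raw = zlib.decompress(raw, -zlib.MAX_WBITS)
    return raw.decode("utf-8", errors="replace")


def referenced_urls(bodies):
    """Collect URLs found in objects already assigned to a page view.

    bodies: iterable of (raw_bytes, content_encoding) pairs (hypothetical layout).
    """
    urls = set()
    for raw, enc in bodies:
        urls.update(URL_RE.findall(decode_body(raw, enc)))
    return urls


def belongs_to_page(request, page):
    """Guess whether a request belongs to a page view.

    request: {'url', 'user_agent', 'cookies'}  (hypothetical layout)
    page:    {'user_agent', 'cookies', 'bodies'}
    """
    # A different user agent suggests another browser or application.
    if request["user_agent"] != page["user_agent"]:
        return False
    # The URL was mentioned in an object already in the page view.
    if request["url"] in referenced_urls(page["bodies"]):
        return True
    # Otherwise fall back to a shared cookie name/value pair.
    return bool(request["cookies"].items() & page["cookies"].items())
```

The order of the checks is a design choice for the sketch: the user agent test only excludes, while a URL or cookie match includes. As noted above, none of these can separate two simultaneous loads of the same page.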
In general, all of this will probably be harder when the root document is missing.
On a related note, is it possible to fill in a "dummy" document when the root object is missing? Does the HAR spec allow this?
--
Ryan Witt
http://onecreativeblog.com