Let me introduce myself: I'm an advanced researcher from Hughes. As
the world's leading satellite ISP, we have a lot of experience
dealing with latency and the web.
On the subject of making web pages faster over broadband connections,
here's a problem (with thoughts on a solution) that I don't think this
group has considered.
In my view, modern web pages (especially advertising driven web pages)
suffer from too much serialization:
(a) When a page consists of an HTML file, a couple of CSS files, and a
few javascript URLs, the browser stops parsing the HTML each time it
runs into an "include" of a CSS or javascript URL and retrieves it
serially. Thus every one of these files requires a serialized
round trip to the Internet.
(b) Often javascript will dynamically calculate the advertisement
URLs. Again, these URLs are often calculated one at a time, and the
javascript may even block while it retrieves one of them. Thus a
string of several ads may end up as a serial chain of retrievals.
All of this serialization really adds to the latency. In the satellite
world we have ways of dealing with it (at least partially), but with
secure web pages these tricks don't apply.
This is a pretty big issue (especially when some of the servers, e.g.
the ad servers, have long latency during busy hours, as is often the
case). What can be done about it?
Well, here's one idea for a mechanism to free a browser to do more
parallel retrieval of a page's URLs.
How about adding a <ASYNCRETRIEVE href=aurl> tag to HTML? The tag
would instruct the browser to retrieve this URL because the browser
will eventually need the URL to complete the page.
The web server could load the beginning of the HTML with as many of
these tags as needed to cover the URLs the server knows are part of
the page. The server can learn this by parsing the page, by looking at
the recent history of retrievals of the page, etc.
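
As a rough illustration of the "parsing the page" option, here is a
minimal sketch of what the server side might do. ASYNCRETRIEVE is only
the tag proposed in this message, not part of any HTML standard, and
the function names are hypothetical. A regular-expression scan stands
in for a real HTML parser, so it will miss or misread unusual markup.

```javascript
// Hypothetical helper: ASYNCRETRIEVE is the tag proposed in this message,
// not part of any HTML standard. A regex scan stands in for a real parser.
function collectDependentUrls(html) {
  const urls = [];
  // Matches the href of <link ...> tags and the src of <script ...> tags.
  const pattern =
    /<(?:link\b[^>]*?\bhref|script\b[^>]*?\bsrc)\s*=\s*["']?([^"'\s>]+)/gi;
  let match;
  while ((match = pattern.exec(html)) !== null) {
    // Keep each URL once, in the order it first appears in the page.
    if (!urls.includes(match[1])) urls.push(match[1]);
  }
  return urls;
}

// Builds the block of tags the server would place at the top of the HTML.
function buildAsyncRetrieveTags(html) {
  return collectDependentUrls(html)
    .map((url) => `<ASYNCRETRIEVE href=${url}>`)
    .join("\n");
}
```

A server could call buildAsyncRetrieveTags(pageHtml) once per page and
cache the result, then emit it ahead of the rest of the markup.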
For dynamic URLs (e.g. ads), javascript near the beginning of the HTML
could be executed to do document.write operations of this tag. This
javascript would then run without blocking and the browser could
asynchronously retrieve the URLs, thereby avoiding serialized browser
retrievals.
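
A minimal sketch of what such a script might look like follows. The
tag is the one proposed above and does not exist in any browser today;
the function names, the ad server URL, and the way the ad URLs are
computed are all made up for illustration.

```javascript
// Hypothetical URL calculation; real ad scripts compute these differently.
function computeAdUrls(adServer, slots) {
  return slots.map(
    (slot) => `${adServer}/ad?slot=${encodeURIComponent(slot)}`
  );
}

// Writes one proposed ASYNCRETRIEVE tag per ad URL and returns immediately.
// `doc` is anything with a write(string) method; in a browser, pass `document`.
function announceAdUrls(doc, adUrls) {
  for (const url of adUrls) {
    doc.write(`<ASYNCRETRIEVE href=${url}>`);
  }
}
```

In a page this would be invoked near the top of the HTML, for example
announceAdUrls(document, computeAdUrls("https://ads.example", ["top", "side"])).
The script only announces the URLs; it never waits for any of them.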
The result could be that the web browser could, right at the beginning
of the page after the one round trip to the web server involved with
retrieving the page HTML, have a much more complete view of what it
will need to paint the page and could then retrieve much more of the
page in parallel without getting stuck by these serialization
situations.
Under best-case conditions, the browser could retrieve a very, very
complicated web page in two round trips: one for the page's HTML and
one for all of the dependent URLs, retrieved in parallel.
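
To put rough numbers on the difference, here is a back-of-the-envelope
model. It assumes every retrieval costs exactly one round trip and
ignores bandwidth, server time, and per-host connection limits, so it
is an idealization. The 600 ms round-trip time and the count of ten
blocking URLs are illustrative assumptions, not measurements.

```javascript
// Rough model, assuming every retrieval costs exactly one round trip and
// ignoring bandwidth, server time, and connection limits.
function serializedLoadMs(rttMs, blockingUrls) {
  // One round trip for the HTML, then one per blocking URL in turn.
  return rttMs * (1 + blockingUrls);
}

function parallelLoadMs(rttMs, dependentUrls) {
  // One round trip for the HTML, one more for all dependents together.
  return dependentUrls > 0 ? rttMs * 2 : rttMs;
}

// Assumed figures: 600 ms round trip, ten blocking URLs.
console.log(serializedLoadMs(600, 10)); // 6600
console.log(parallelLoadMs(600, 10));   // 1200
```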
What do you think?
Douglas M Dillon,
240-383-6846