Too Much Serialization In Modern Web Page Retrieval

Douglas M Dillon

Dec 3, 2009, 1:59:26 PM

to Make the Web Faster, dil...@hns.com

Let me introduce myself: I'm an advanced researcher at Hughes. As
the world's leading satellite ISP, we have a lot of experience
dealing with latency and the web.

In terms of making web pages faster over broadband connections,
here's a problem (with thoughts on a solution) that I don't think this
group has considered.

In my view, modern web pages (especially advertising-driven web pages)
suffer from too much serialization:
(a) With a page that consists of an HTML file, a couple of CSS files,
and a few javascript URLs, the browser stops parsing the HTML each
time it runs into an "include" of a CSS or javascript URL and
retrieves it serially. Thus every one of these files requires a
serialized round trip to the Internet.
(b) Often javascript will dynamically calculate the advertisement
URLs. Again, this calculation often occurs one URL at a time, and the
javascript may even block while it retrieves one of these URLs. Thus a
string of several ads may end up as a chain of serialized retrievals.

All of this serialization really adds to the latency. In the satellite
world we have ways of dealing with it (at least partially), but with
secure web pages these tricks don't apply.

This is a pretty big issue (especially when some of the servers, e.g.
the ad servers, have long latency during busy hours, as is often the
case). What can be done about it?

Well, here's one idea for a mechanism to free a browser to do more
parallel retrieval of a page's URLs.

How about adding a <ASYNCRETRIEVE href=aurl> tag to HTML? The tag
would instruct the browser to retrieve this URL because the browser
will eventually need the URL to complete the page.

The web server could load the beginning of the HTML with as many of
these tags as needed to cover the URLs the server knows are part of
the page. The server can learn this by parsing the page, by looking at
the recent history of retrieving the page, etc.
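
To make the "parsing the page" option concrete, here is a minimal
server-side sketch. ASYNCRETRIEVE is only the tag proposed in this
post, not part of any HTML standard, and the helper names
collectDependentUrls and addRetrieveHints are made up for
illustration. It assumes quoted attribute values and treats every
<link href> as a dependent URL.

```javascript
// Sketch only: ASYNCRETRIEVE is the hypothetical tag proposed in this post.
// collectDependentUrls and addRetrieveHints are invented helper names.

// Find the URLs of external scripts and <link> targets referenced by the page.
function collectDependentUrls(html) {
  const urls = [];
  // Matches <script ... src="..."> and <link ... href="..."> (quoted values only).
  const pattern =
    /<(?:script\b[^>]*?\bsrc|link\b[^>]*?\bhref)\s*=\s*["']([^"']+)["']/gi;
  let match;
  while ((match = pattern.exec(html)) !== null) {
    if (!urls.includes(match[1])) urls.push(match[1]); // skip duplicates
  }
  return urls;
}

// Insert one hint tag per URL right after the opening <head> tag.
// If the page has no <head>, it is returned unchanged.
function addRetrieveHints(html) {
  const hints = collectDependentUrls(html)
    .map((url) => `<ASYNCRETRIEVE href="${url}">`)
    .join("\n");
  return html.replace(/<head(\s[^>]*)?>/i, (head) => `${head}\n${hints}`);
}
```

A real server would more likely use a proper HTML parser, or the
retrieval history mentioned above, than a regular expression.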

For dynamic URLs (e.g. ads), javascript near the beginning of the HTML
could use document.write to emit this tag. This javascript would then
run without blocking and the browser could asynchronously retrieve the
URLs, thereby avoiding serialized browser retrievals.
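
A rough sketch of what such a script might look like follows.
ASYNCRETRIEVE is still the hypothetical tag proposed above, and the ad
URL format (ads.example.com, the slot and page parameters) and the
function names are invented for illustration.

```javascript
// Hypothetical sketch: ASYNCRETRIEVE is the tag proposed above, and the
// ad URL format (ads.example.com, slot and page parameters) is invented.

// Compute every ad URL up front instead of one at a time.
function computeAdUrls(slots, pageId) {
  return slots.map(
    (slot) =>
      `https://ads.example.com/serve?slot=${encodeURIComponent(slot)}` +
      `&page=${encodeURIComponent(pageId)}`
  );
}

// Build one hint tag per URL, escaping characters that are special
// inside an HTML attribute value.
function buildRetrieveTags(urls) {
  return urls
    .map((url) => url.replace(/&/g, "&amp;").replace(/"/g, "&quot;"))
    .map((url) => `<ASYNCRETRIEVE href="${url}">`)
    .join("");
}

const adTags = buildRetrieveTags(computeAdUrls(["top", "sidebar"], "home"));

// In a browser, and only while the page is still being parsed, emit the
// tags. The script itself never waits on the network.
if (typeof document !== "undefined" && document.readyState === "loading") {
  document.write(adTags);
}
```

The point of the design is that the script only computes strings. All
of the network retrieval is left to the browser, which is free to do it
in parallel.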

The result could be that the web browser, right at the beginning of
the page and after the one round trip to the web server needed to
retrieve the page HTML, would have a much more complete view of what
it will need to paint the page. It could then retrieve much more of
the page in parallel without getting stuck in these serialization
situations. Under best-case conditions, the browser could retrieve a
very, very complicated web page in two round trips: one for the page
HTML and one for all of the dependent URLs, retrieved in parallel.
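
As a back-of-the-envelope illustration of that claim, the sketch below
compares the two cases under a deliberately simplified model: every
retrieval costs exactly one round trip, and bandwidth, server time, and
per-host connection limits are ignored. The 600 ms round trip and the
count of 8 dependent URLs are assumed example values, not measurements.

```javascript
// Simplified model: every retrieval costs exactly one round trip;
// bandwidth, server time, and connection limits are ignored.

// Fully serialized: the HTML, then each dependent URL one after another.
function serializedLoadMs(rttMs, dependentCount) {
  return rttMs * (1 + dependentCount);
}

// Proposed best case: the HTML, then every dependent URL at the same time.
function parallelLoadMs(rttMs, dependentCount) {
  return rttMs * (dependentCount > 0 ? 2 : 1);
}

// Example with an assumed 600 ms round trip and 8 dependent URLs.
console.log(serializedLoadMs(600, 8)); // 5400
console.log(parallelLoadMs(600, 8)); // 1200
```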

What do you think?

Douglas M Dillon, 240-383-6846

Dan Johansson

Dec 3, 2009, 9:15:13 PM

to make-the-...@googlegroups.com, dil...@hns.com

If you pay attention to the techniques used on high-performance web sites, these things are already considered. Companies like Google and Yahoo optimize both their sites and their ad loading so that it does not block, but many other companies are behind the curve.


Viks

Dec 6, 2009, 11:19:36 AM

to Make the Web Faster

One of the best practices for improving performance is to put the CSS
includes at the top of the HTML page and the Javascript includes at
the bottom of the page. This way the browser downloads all the CSS
files it needs to paint the page up front, and by the time the
javascript files are downloaded, all the DOM elements have been
generated in memory for the javascript code to work on. This way we
can streamline the page rendering process to some extent.
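
As a small illustration of that layout, the sketch below assembles a
page with the stylesheet includes in the head and the script includes
just before the closing body tag. The buildPage helper and its
parameters are hypothetical names used only for this example.

```javascript
// Hypothetical helper: assemble a page with CSS includes at the top
// and javascript includes at the bottom.
function buildPage(cssUrls, bodyHtml, scriptUrls) {
  const links = cssUrls
    .map((url) => `<link rel="stylesheet" href="${url}">`)
    .join("\n");
  const scripts = scriptUrls
    .map((url) => `<script src="${url}"></script>`)
    .join("\n");
  return [
    "<!DOCTYPE html>",
    "<html>",
    "<head>",
    links, // stylesheets first, so they are requested early
    "</head>",
    "<body>",
    bodyHtml,
    scripts, // scripts last, after the markup they operate on
    "</body>",
    "</html>",
  ].join("\n");
}

console.log(buildPage(["site.css"], "<p>Hello</p>", ["app.js"]));
```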

->VS

On Dec 4, 7:15 am, Dan Johansson <datas...@gmail.com> wrote:
> If you do pay attention to techniques used with high performance web sites,
> these things are already considered. Companies like google and yahoo do
> optimize both sites and ad loading so that it does not block, but many other
> companies are behind the curve.

Peter L

Dec 14, 2009, 10:01:48 AM

to Make the Web Faster

HTML5 addresses the serialization somewhat with async/defer:
http://code.google.com/speed/articles/html5-performance.html. These
attributes allow a web developer to tell the browser whether or not it
needs to wait for a script to be downloaded and executed before
proceeding to download the next object.
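
For illustration, here is a small sketch that builds script tags in
each mode. The scriptTag helper and the file names are hypothetical;
the async and defer attributes themselves are standard HTML. An async
script runs as soon as it has downloaded, in no guaranteed order, while
a defer script runs after the document has been parsed, in document
order.

```javascript
// Hypothetical helper: build a <script> tag whose loading mode is
// "blocking" (no attribute), "async", or "defer".
function scriptTag(src, mode = "blocking") {
  if (!["blocking", "async", "defer"].includes(mode)) {
    throw new Error(`unknown mode: ${mode}`);
  }
  const attr = mode === "blocking" ? "" : ` ${mode}`;
  return `<script src="${src}"${attr}></script>`;
}

console.log(scriptTag("analytics.js", "async"));
// <script src="analytics.js" async></script>
console.log(scriptTag("app.js", "defer"));
// <script src="app.js" defer></script>
```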

Peter
