Wednesday, November 26, 2008

Large sites without persistent connections

As a follow-up to the last article, a couple of particularly interesting sites have come through testing over the last couple of days that are just failing horribly:

Verizon's DNS error hijack - Verizon recently decided to start monetizing fat-fingered URLs by taking over DNS failures and redirecting them to a search page. I'm using their default DNS servers (for now) for the online Pagetest, so some tests of invalid URLs have returned the Verizon search page, which ends up being a great example of a page that should be WAY faster than it is. Here are the full results of a test run I kicked off on purpose:

Here is the waterfall:

They fail pretty horribly for not using persistent connections, but the page is also a perfect candidate for image sprites, and the HTML and CSS aren't gzipped either. The whole thing really should have been done in 2 or 3 requests and could finish loading in under a second instead of a little over 3. None of the images have an Expires header either, so even repeat views take just as long.
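The three problems called out above (no persistent connections, ungzipped text, images without an Expires header) can all be spotted from the response headers. Here is a minimal Python sketch of that kind of check. It is not how Pagetest does it; the function name `audit_response`, the wording of the findings and the list of text types are all made up for illustration, and it only catches an explicit `Connection: close` (an HTTP/1.0 response with no keep-alive header would slip through).

```python
# Content types treated as "text" for the gzip check (an assumed, incomplete list).
TEXT_TYPES = (
    "text/html",
    "text/css",
    "text/javascript",
    "application/javascript",
    "application/x-javascript",
)


def audit_response(headers):
    """Return a list of findings for one response, given its headers as a dict."""
    # Header names are case-insensitive, so normalize them first.
    h = {name.lower(): value.strip() for name, value in headers.items()}
    findings = []

    # The server explicitly asked to close the connection after this response.
    if h.get("connection", "").lower() == "close":
        findings.append("no persistent connection")

    # Text responses should come back with Content-Encoding: gzip.
    content_type = h.get("content-type", "").split(";")[0].strip().lower()
    if content_type in TEXT_TYPES and "gzip" not in h.get("content-encoding", "").lower():
        findings.append("text not gzipped")

    # Images with neither Expires nor Cache-Control: max-age get re-checked on repeat views.
    cache_control = h.get("cache-control", "").lower()
    if content_type.startswith("image/") and "expires" not in h and "max-age" not in cache_control:
        findings.append("no expires header")

    return findings


if __name__ == "__main__":
    print(audit_response({"Content-Type": "text/css", "Connection": "close"}))
    print(audit_response({"Content-Type": "image/gif"}))
```

Feeding it the headers from each request in a waterfall would give a rough per-request list of the same issues.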

Yahoo Japan - This one is mostly interesting because Yahoo is notoriously good at site optimization, but somehow they seem to have missed the Japanese portal. They do pretty well on everything except for persistent connections and gzip. Here are the full results:

and the waterfall:

That JS is particularly bad: it's 85KB but could be reduced to 22KB with gzip. The biggest crime, though, is the lack of persistent connections. They could cut the load time almost in half just by enabling them.

Tuesday, November 25, 2008

Easy ways to speed up your site

Pagetest has been online for 8 months now with close to 26,000 pages tested. I generally look through the test log daily to see what the results look like, and it's been pretty frustrating: a significant number of the sites being tested could easily be twice as fast without changing the content or code, but I have no way to reach out to the owners and tell them. The test results are probably a bit overwhelming for most people and they don't know where to start. So, with that...

If you do nothing else for performance make sure you at least do these:

Persistent Connections - There is absolutely no reason that a site should not be using persistent connections, yet I see sites without them come through testing every day. Assuming you are not using a CDN and most of the content for your page is served from your server, you can get close to a 50% improvement in performance just by enabling them. Here is a sample from a test that was run recently (site chopped out to protect the innocent):

I cropped it down but the whole site continued, opening new connections for 102 requests. The orange section of each request is the time used to open the connection and all but 2 would be eliminated just by enabling persistent connections. In this case the "start render" time (time when the user first sees something on the page) would go from 3.8 seconds down to roughly 2 seconds and the time to fully load the page would go down from 16 seconds to closer to 9 seconds. This sample was even from an Apache server so there's really no reason for not having them enabled.
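To make the connection overhead concrete, here is a small Python sketch (not taken from Pagetest or from the site above). It starts a throwaway local server and fetches the same URL several times, once reusing a single connection and once opening a new connection for every request, then reports how many connections the server had to accept. The helper name `count_connections` and the tiny handler are hypothetical and exist only for this demo.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def count_connections(n_requests, persistent):
    """Fetch "/" n_requests times from a local test server and return
    how many TCP connections the server accepted."""
    opened = []  # one entry per accepted connection

    class Handler(BaseHTTPRequestHandler):
        protocol_version = "HTTP/1.1"  # lets http.server keep connections alive

        def setup(self):
            # setup() runs once per accepted connection, not once per request.
            opened.append(self.client_address)
            super().setup()

        def do_GET(self):
            body = b"ok"
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, *args):
            pass  # keep the demo quiet

    server = ThreadingHTTPServer(("127.0.0.1", 0), Handler)  # port 0 = any free port
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        if persistent:
            # One connection, reused for every request.
            conn = http.client.HTTPConnection("127.0.0.1", port)
            for _ in range(n_requests):
                conn.request("GET", "/")
                conn.getresponse().read()  # drain the body before reusing the socket
            conn.close()
        else:
            # A brand new connection for every request.
            for _ in range(n_requests):
                conn = http.client.HTTPConnection("127.0.0.1", port)
                conn.request("GET", "/", headers={"Connection": "close"})
                conn.getresponse().read()
                conn.close()
    finally:
        server.shutdown()
        server.server_close()
    return len(opened)


if __name__ == "__main__":
    print("persistent:", count_connections(10, persistent=True))
    print("new connection per request:", count_connections(10, persistent=False))
```

On a local machine the connection setup is nearly free, so this only shows the count. Over a real network every one of those extra connections costs a round trip, which is the orange section in the waterfall.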

GZIP (aka Free Money) - Just like with the persistent connections, enabling GZIP for text responses (html, css, js) is literally just a server configuration and there is very little reason for not doing it. Early versions of IE didn't react well to JS being compressed but it's easy enough to exclude them and not a good enough reason to penalize everyone else.

GZIP not only helps the end user get the pages faster, it also saves you significant bytes. If you pay for your network bandwidth, not having it enabled is throwing away free money.
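If you want a feel for how much gzip would save on your own text files before touching the server configuration, a rough Python sketch like this one will do. The function name `gzip_savings` and the sample text are made up for illustration, and a real server may use a different compression level, so treat the numbers as a ballpark.

```python
import gzip


def gzip_savings(text):
    """Return (original_bytes, gzipped_bytes, fraction_saved) for a text response."""
    raw = text.encode("utf-8")
    packed = gzip.compress(raw)
    saved = 1 - len(packed) / len(raw) if raw else 0.0
    return len(raw), len(packed), saved


if __name__ == "__main__":
    # Hypothetical sample: repetitive markup, which is typical of html, css and js.
    sample = "<div class='item'><a href='/page'>link text</a></div>\n" * 500
    original, compressed, saved = gzip_savings(sample)
    print(f"{original} bytes -> {compressed} bytes ({saved:.0%} saved)")
```

Running it against the actual html, css and js your site serves gives a better estimate than any synthetic sample.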

It's not just the small sites that have this problem either. My favorite example is CNN, which serves 523KB of uncompressed text. They could save 358KB of that just by enabling gzip compression, and since most of the text for a page is in the base page or in the js and css referenced in the head, it all has to get downloaded before the user sees anything:

The blue bars in the waterfall are where the browser is downloading uncompressed text for CNN.

Combining CSS and JS files - In case it's not obvious from the two waterfalls shown so far, it's not uncommon for a site to reference several different js and css files. There are several solutions available that will let you keep the files separate but download them all in a single request. David Artz has a pretty good write-up on mod_concat and some of the other options. This does require modifying the html to reference the combined URL instead of each file individually, but that is a very worthwhile effort given the massive improvement in the loading of your site.

The JS and CSS files usually get downloaded before anything is shown to the user, and JS in particular has a nasty habit of only being downloaded one file at a time, so anything you can do to reduce the number of each that you are loading will have a huge impact on the user.
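If a server-side option like mod_concat isn't available, one alternative is to combine the files yourself as a build step. The Python sketch below does that, which is a different approach from mod_concat (that one combines files on the server at request time). The function name `combine_files` and the file names in the usage example are hypothetical.

```python
from pathlib import Path


def combine_files(paths, out_path, separator="\n"):
    """Concatenate the given files, in order, into a single file at out_path."""
    parts = [Path(p).read_text(encoding="utf-8") for p in paths]
    # Order matters: later css rules override earlier ones, and js files
    # may depend on code defined in the files before them.
    Path(out_path).write_text(separator.join(parts) + separator, encoding="utf-8")
    return out_path


if __name__ == "__main__":
    # Hypothetical file names; replace with the files your pages reference.
    combine_files(["reset.css", "layout.css", "theme.css"], "combined.css")
    # For js, a ";\n" separator guards against a file that ends without a semicolon.
    combine_files(["lib.js", "menu.js", "ads.js"], "combined.js", separator=";\n")
```

The html then references combined.css and combined.js instead of the individual files, so the browser makes two requests instead of six.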