Wednesday, December 29, 2010

Don't Panic!

I wanted to take a quick moment to dispel any fears that may be brewing about the future of WebPagetest with my move to Google (and in case this is the first time you're hearing it, surprise! I'm at Google now working on making the web faster).

This is actually a HUGELY positive move for WebPagetest.  Google is putting some engineering resources behind the development of WebPagetest (in addition to letting me work on it full-time) so expect to see lots of great things coming.  Additionally, WebPagetest itself is still independent and not owned by either Google or AOL, so there is no risk of your favorite web performance tool going away (particularly once I migrate off of Dreamhost and into the Meenan Data Center).

We'll be sharing the roadmap for what we're planning to work on in the coming weeks.  With more developers working on it now (and not just in our spare time), if you have ever wanted to ask for something to be implemented but were afraid it was too big an effort or wouldn't get done, please feel free to post suggestions in the forums (the bigger and crazier the better - well, as long as it is reasonably related to web performance).

Wednesday, December 1, 2010

2010 State of Optimization - The Short Version

Stoyan's (always excellent) Performance Calendar went live today with an article I provided looking at the state of performance optimization in 2010 compared to 2008 (based on the tests run on WebPagetest).  I highly recommend reading the article when you get a chance - there's lots of data and graphs.  One thing that struck me was how poorly even the most basic optimizations were being followed.  I thought it would be interesting to summarize it all into a single easy-to-understand table. So, without further ado....

Percent of sites that got a passing grade for the basic optimization checks:

Optimization            Percent of pages with a passing grade (2010)
Keep-alive Enabled      85%
Compress Text           43%
Compress Images         39%
Cache Static Content    15%
Combine JS/CSS          47%
CDN Usage               12%

These aren't the advanced optimizations - they're the most basic. Only 15% of pages are effectively leveraging the browser cache! (and "passing" is pretty generous here - a score of 80 or better).

Thursday, October 28, 2010

Performance Measurement consistency and Video Capture impact

One thing I've always been concerned about is people taking a single measurement point and using that as representative of the performance of their site.  Even for a well-performing platform (where the back-end is consistent) there are a lot of variations and sometimes significant outliers (I usually recommend taking at least 3 measurements and throwing out any that look obviously wrong).

Recently I've been looking at options for moving the Dulles testers for WebPagetest out of my basement and into an actual data center (picture of the "Meenan Data Center" will be posted at some point).  There have also been some projects recently where we needed to run a whole lot of tests and it would take several days on the current infrastructure (even in Dulles where I have multiple machines backing the IE7 and IE8 configurations).  I've been looking at co-lo pricing but to do anything of reasonable scale gets pretty expensive, even with virtualization when you factor in the Microsoft tax.  It turns out that Amazon's us-east EC2 region is right next to me in Northern Virginia so it is looking like a very attractive option.

I have also been asked quite a few times about the impact of capturing video on the actual test results (and I've generally steered people away from comparing video-capture results against non-video results).  I jumped through a lot of hoops in Pagetest to minimize the impact and I measured it as "effectively 0" on my dev systems but never did any large-scale testing against the public test systems to see how they perform.

Joshua Bixby also recently blogged about some testing that Tony Perkins had been doing with EC2 and how the Micro instances were not suitable for performance testing (but small were looking good).

So, all this means it's time for some testing:

First, some basics on the configurations at play:

IE7 - The Dulles IE7 tests are run on physical machines (3Ghz P4's running XP with 512MB of RAM).  There are 4 physical machines running the tests.
IE8 - The Dulles IE8 tests are run in a VM running under VMWare ESXi 4.0 (XP with 512MB of RAM).  There are 8 VM's available for testing.
EC2 - The EC2 instances (both IE7 and IE8) are running in the us-east region on "small" instances using instance storage (Windows Server 2003 with 1.7GB of RAM).  I had 2 instances of each available for testing so any consistency would not be because of running on the same machine.

I ran 100 tests each against several sites, both with and without video capture.  These are all sites with sufficient infrastructure to provide consistent serving performance and all leverage CDN's.

I'm pretty visual so my favorite way to consume data for this kind of test is to look at a cumulative percentile graph with the load time across the X axis and the percentile along the Y axis.  The more vertical a line is, the more consistent the results and if lines are basically right on top of each other they are performing identically.
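Building the points for a cumulative percentile graph from raw measurements is straightforward: sort the load times and pair each with its rank expressed as a percentile. A minimal sketch (the function name is mine, not anything from WebPagetest):

```javascript
// Build (loadTime, percentile) points for a cumulative percentile graph.
// After sorting, the i-th smallest measurement sits at the
// ((i + 1) / n * 100) percentile.
function cumulativePercentiles(loadTimes) {
  const sorted = loadTimes.slice().sort((a, b) => a - b);
  return sorted.map((time, i) => ({
    time: time,                                  // X axis: load time
    percentile: ((i + 1) / sorted.length) * 100, // Y axis: percentile
  }));
}

// A tight cluster of results produces a nearly vertical line when plotted.
const points = cumulativePercentiles([2.1, 2.0, 2.2, 2.1, 2.05]);
```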

And the results (pretty picture time):

So, what does it mean?

  • It looks like the current Dulles IE7 machines are seeing an impact to the measurements when capturing video (at least in some cases).  
  • Both virtualized environments do NOT appear to be impacted by capturing video
  • EC2 results are generally slower than the current Dulles results (network peering is my best guess because they are using identical traffic shaping)
  • The EC2 results are quite consistent and look promising

Friday, September 10, 2010

Advertisers - Why you care about performance (speed)

One thing that has always amazed me while working on web performance is the need to push advertisers to care about the performance (speed) of their ads.  The hard part is that, the way things are structured in advertising, the ones paying the money usually aren't even aware that it's something they should care about, and it's not something they hold the agencies (that actually create the ads) or the publishers (that serve the ads) accountable for.

At some level, ads are largely accounted for at the impression level.  The publishers track the number of impressions and the advertiser pays based on the number of impressions that were served.  This is a bit simplistic because there are all sorts of variations as well as click or action-based arrangements but fundamentally they are going to be driven by the same pressures.  An interesting fact that isn't thrown around too much is that an impression is counted at the time a decision is made to serve your ad, not when the user actually sees it.

Let's think about that for a minute... What that means is you are paying every time the ad publishing system starts to serve your ad, but there is no guarantee that the ad itself ever made it to the user.  Let's say your agency and publisher ended up creating an ad creative that takes 12 seconds to load its initial payload (yes, this is a real case, and while most aren't this extreme they are usually pretty bad).  That means the user has 12 seconds to interact with the page (and even leave) before the ad you just paid for even shows up.  That is insane!

Faster ads = $$$

Getting the delivery time of the ads (at least the initial payload until something is visible) to be as fast as possible needs to be a requirement in all contracts, otherwise you are just throwing money away.  The publishers don't have any incentive to optimize the delivery since they get paid regardless.  Make sure the performance is also measured based on the users you are targeting the ads to (targeting mobile? try on an actual carrier on an actual mobile device).  Don't accept results from performance tests done from a data center on a high-bandwidth internet connection (or even worse, from someone at their desk) - you need to measure in the real world on real user connections because things can be exponentially slower without you realizing it.

Tuesday, September 7, 2010

Page Speed results now available in WebPagetest

This is what open source is all about! 

Today we are taking the first step in combining the optimization checks done by Page Speed and WebPagetest by making the Page Speed results available from within WebPagetest (and from an IE browser for the first time).  Huge thanks go out to the Page Speed team and Bryan McQuade in particular who did the bulk of the work getting it integrated into the Pagetest browser plugin (as well as Steve Souders for encouraging us to collaborate).

What You Will See

In your test results you will now be getting your Page Speed score along with the normal optimization checks that are done by WebPagetest:

Clicking on the link will take you to the details from Page Speed about the various checks and what needs to be fixed:

What's next?

As I mentioned, this is just the first step.  The long-term plans are to take the best of both tools, enhance the Page Speed checks and standardize on Page Speed for optimization checking.  You'll probably see the individual rules start to migrate slowly (with things like gzip and caching being no-brainers since the logic is essentially identical between the two tools) so it should be pretty seamless from the end-user perspective.  You will also see the Page Speed checks enhanced to include the DOM-based checks that you're used to seeing in the Firefox plugin.

Friday, August 20, 2010

Passive vs Active performance monitoring

One of the things that has always bothered me about actively monitoring a site's performance (hitting it on a regular interval from an automated browser) is that you are only getting results for the specific page(s) you are monitoring from the locations and browsers you are using for the monitoring.  To get better coverage you need to do more testing which increases the amount of artificial traffic hitting your site (and still ends up not being a very realistic coverage of what your end users are seeing).

Passive monitoring on the other hand involves putting a beacon of some kind on your page that reports the performance of every page that every visitor loads (without artificial traffic).  You get complete coverage of what real users are doing on your pages and what their real experiences are.

There are some real benefits to active testing, particularly the controlled environment which produces consistent results (while passive monitoring requires a lot of traffic otherwise individual user configurations will skew the results on a given day).  Active monitoring also gives you a wealth of information that you can't get from field data (information on every request and details on exactly what is causing a problem).

Active testing is easier - you just find a company that offers the service, subscribe and start receiving reports and alerts.  For passive monitoring you need to instrument your pages and build the infrastructure to collect and analyze the results (or find a company that will do it for you but then you are potentially adding another external Frontend SPOF to your page). Boomerang is a great place to start for passive monitoring but you still need the reporting infrastructure behind it.
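The core of a passive beacon can be as simple as recording a timestamp as early in the page as possible and reporting the elapsed time at onload. A minimal sketch of the idea (the endpoint and parameter names are mine; boomerang does considerably more than this):

```javascript
// Capture a timestamp as early in the page as possible
// (ideally from an inline script in the <head>).
const pageStart = Date.now();

// Build the beacon URL that reports the measured load time.
// The endpoint path and query parameter are hypothetical.
function beaconUrl(endpoint, startMs, doneMs) {
  return endpoint + '?t_load=' + (doneMs - startMs);
}

// In a browser you would fire the beacon from the onload handler, e.g.:
//   window.addEventListener('load', function () {
//     new Image().src = beaconUrl('/beacon', pageStart, Date.now());
//   });
```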

Can we do better?  Would something like a mix of passive and active monitoring work better where active tests are initiated based on information collected from the passive monitoring (like the top pages for that day or pages that are showing slower or faster performance than "normal")?

Several people have asked for WebPagetest to be able to do recurring, automated testing and I'm debating adding the capability (particularly for private instances) but I'm not convinced it is the right way to go (for performance, not availability monitoring). Is the amount of artificial traffic generated (and testing infrastructure) worth it?  Are the results meaningful on a regular basis or will it just end up being another set of reports that people stop paying attention to after a week?

I'd love to hear from other users on how they monitor their sites and what they have found works well, so shoot some comments back and let's get a discussion going.

Wednesday, August 11, 2010

New WebPagetest UI

If you've been over to WebPagetest today you may have noticed that things have changed a bit (after you double-checked to make sure you were really at the correct site).  Thanks to Neustar Webmetrics (and Lenny Rachitsky in particular) for kicking in an actual designer to bring the UI out of the dark ages.  Hopefully performance testing will now be less intimidating to new users while still keeping all of the functionality that the more advanced users like.  All of the existing functionality is still there (with very similar navigation) but there are a few enhancements I managed to get in with the update as well...


Right at the bottom of the site (across all of the pages) is a blogroll (left column) of performance-focused blogs and a feed of recent discussions (right column) that pulls from the WebPagetest forums, the Yahoo Exceptional Performance group and the "Make the Web Faster" Google group.  If you have a blog that you would like included (that is focused on web performance) shoot it to me and I'll get it added to the feed.

Simplified Navigation

There used to be 3 separate "landing" pages.  One with some high-level information, one for testing individual pages and one for running visual comparisons.  All three have been collapsed into a single page.

New Performance Documentation Wiki

There are a lot of discussions in the forums that end up with really valuable information on how to fix something (keep-alives being broken for IE on Apache comes up frequently).  I decided to set up a new destination to serve as a place to document these findings as well as serve as a central repository for performance knowledge.  Web Performance Central is an open wiki for the community to contribute to the knowledge base of performance.  I will be hosting my documentation there and it is open for anyone else to do the same and hopefully we can start getting a reasonable knowledge base built (it's really bare right now - mostly just the site).

I'll commit to running the site without any branding and with no advertising so it can be a completely unbiased source for performance information.

More Prominent Grades

The grades for the key optimizations are now across the top of all of the results pages and clicking on any of them will take you to the list of requests/objects that caused the failure.  Eventually when the documentation is in place I hope to also link the labels to information on how to fix the problem.

Social Sharing

I also bit the bullet and added a 3rd party widget to make it easier to share results.  It saves a couple of steps and makes it a lot easier to tweet things like "Wow, site X is painfully slow", etc.  I was a little torn because the addthis widget messes up the layout of the page a little bit in IE7 and below but let's face it, I don't expect that the target demographic for WebPagetest would be using outdated browsers so it was a tradeoff I was willing to make.

New Logo

I'm not a graphic designer by any stretch of the imagination and the UI designer provided the basis for the new logo but I wanted something that had a transparent background and that I could modify myself so I went and created a new one.  I HIGHLY recommend Inkscape for those that haven't tried it. It is a free (open source) vector drawing program that is used even by a lot of professional designers.  I managed to whip together the logo in a few minutes and create it in various different sizes (as well as a favicon) all from the same source (ahh, the beauty of vector graphics).

Finally, as a bonus for making it this far, there is an Easter egg in the new UI that lets you change the color scheme if you don't like the blue background.  Just pass a hex color code in as a query parameter and you can use whatever color you want (with the logo auto-switching from white to black as needed).  Here are some to get you started:

The original color scheme provided by the designer:

The color will stick until you clear your cookies or manually reset it.  To reset it to the default just pass an invalid color:

As always, feel free to send me any feedback, suggestions or questions.

Friday, June 11, 2010

Avoid the "inline javascript sandwich"

Hopefully by now it's clear that javascript in the head is bad - it blocks parallel requests from loading (for all but the newest browsers).  If you can't move it all to the bottom of your document or make it async then we usually recommend combining it into a single file and putting it after the CSS so that it will load in parallel with the CSS and reduce the pain.  This works because it's actually the EXECUTION of the javascript that causes the browser to block other requests; the browser will go ahead and fetch the javascript file in parallel with whatever it is currently loading (the css) to get it ready to execute.

I've bumped into an issue with this a few times recently on pages I've been looking at so I figured it was worth warning everyone to avoid this:

<link rel="stylesheet" type="text/css" href="my.css" />
<script type="text/javascript">
    var1='some value';
</script>
<script type="text/javascript" src="my.js"></script>
That little bit of inline javascript causes the browser to not load the external js at the same time as the css because it needs to block to execute the code.

Thanks to Steve's handy Cuzillion app I threw together a quick page with exactly that structure and this is how it loads:

Move the inline javascript up above the css

As long as the javascript isn't going to be modifying the css, you'll be a lot better off moving the inline code up above the css so it can execute without having to block anything else.  If you're just setting variables for later javascript to use then this is a no-brainer.
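Using the same snippet as above, the fix is just a reordering:

```html
<script type="text/javascript">
    var1='some value';
</script>
<link rel="stylesheet" type="text/css" href="my.css" />
<script type="text/javascript" src="my.js"></script>
```

The inline code executes immediately, and the external js and css then download in parallel.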

Here is what it looks like with the inline javascript moved up above the css:

The javascript is back where it should be, loading in parallel to the css.

Wednesday, May 26, 2010

Shopping on the web from Amsterdam

Europe is definitely a hotspot for interest in web performance (WebPagetest sees almost as much traffic from there as the US).  A huge "thank you" goes out to Aaron Peters who volunteered to expand our European testing footprint with a location in Amsterdam.

For an inaugural run, he ran some tests of the top online merchants in The Netherlands (according to Twinkle magazine) and from the looks of it there's quite a market need for Web Performance Optimization experts in the area.

(Click on any of the urls to go to the test results for that page.)

  • Wow!  Poster-child material.  Failures across the board with no persistent connections, caching, compression, nothing.  It's actually amazing that it managed to load in 12 seconds at all.
  • None too bad on the standard things, but a crazy number of javascript and css files in the head (and no caching), so a pretty poor user experience.  A couple of tweaks could cut the load time in half and significantly speed up the start render time.
  • Apparently caching is passé - yet another site that doesn't like to use expires headers, but what really surprised me was the 222KB of css being delivered without any compression.  Both the sheer amount of CSS and the fact that it isn't compressed are pretty scary.
  • Pretty much just got the keep-alives right.  No compression, no caching, and a bunch of js/css files that need to be merged.
  • Yay, someone is actually compressing their javascript!  Just a shame they have so much of it (150KB compressed) and in so many different files - and wow, a 209KB png that should easily be an 8-bit (and MUCH smaller) image.
  • And now we're back to the really low bar of failures across the board (including persistent connections) and a couple of 404's for good measure.
  • Dell did a reasonable job (though to be fair, it's probably a global template) and it's not a very rich landing page, but they could still get quite a bit of improvement with image sprites and delaying the javascript.
  • Do I sound like a broken record yet?  Other than persistent connections - epic fail!
  • In DESPERATE need of some SpriteMe love (in addition to the usual suspects).

The sad part is that with just a couple of minutes of work every one of these sites could load in half the time and present a MUCH better user experience.  We've seen time and time again that conversions, sales, etc. all increase substantially with improved page performance, and yet, as I see over and over again, the vast majority of sites aren't even taking the five minutes to handle the absolute basics (most of which can be done just with configuration changes).

Thursday, May 13, 2010

Are pages getting faster?

Last year I did some bulk analysis on the test data from WebPagetest to get a snapshot of what the distribution of results looked like.  It's about time to update the data and compare how the distributions have changed over time.  It will take a while to crunch the data and generate pretty charts but before going there I thought it would be interesting to see how individual pages have changed  over the last year...

How sites have changed over the last year

I looked for any pages that were tested in the last 4 months that also had been tested prior to 4/30/09 and it turns out there were 1279 pages with tests from both time periods.  I'll see about making the raw (anonymized) data available but the aggregate results are pretty interesting. 

Median values were used to eliminate the influence of pages with huge swings:

Load Time: +0.533 s
Time to first byte: +0.117 s
Time to start render: +0.179 s

Hmm, that's unfortunate - in aggregate, sites got slower.

Given that these are sites that were tested on WebPagetest in the first place, you'd think someone was actually working on optimizing them (or they were large, popular sites that people were randomly testing - but I doubt there were 1200 of those).

Let's see if we can dig into some of the page stats and see what's going on...

Page Size: +48 KB
Requests: +4
Connections: +1
DNS Lookups: +1

Looks like in general the back-end got a little bit slower (the first byte times) and the pages got a little heavier with more requests.  Nothing really surprising here but it does seem that optimization is either not keeping up with the increased richness of the pages or (more likely) optimizing the pages has not yet made its way into the dev cycle.
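The aggregates above are medians rather than means, and a quick sketch shows why that matters when a handful of pages have huge swings (the helper function is mine):

```javascript
// Median: sort the values and take the middle one (average the two
// middle values for an even-length list).
function median(values) {
  const s = values.slice().sort((a, b) => a - b);
  const mid = Math.floor(s.length / 2);
  return s.length % 2 ? s[mid] : (s[mid - 1] + s[mid]) / 2;
}

// One page with a huge swing drags the mean but barely moves the median.
const deltas = [0.4, 0.5, 0.6, 0.5, 30.0]; // seconds; the last is an outlier
const mean = deltas.reduce((a, b) => a + b, 0) / deltas.length; // 6.4
median(deltas); // 0.5
```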

On the plus side, there's lots of room for improvement.

Sunday, April 11, 2010

What happens when your site gets mentioned in a Google blog posting?


In case you missed it, Google announced that site performance now affects your search rankings and in the article they included WebPagetest in the list of tools to use.  I appreciate the props but it's been a busy day adding more testing capacity to get ready for Monday :-)

Edit:  As expected, Monday was pretty busy:

Friday, March 5, 2010

Google to offer CDN services?

Sadly no, not that I'm aware of but it would make total sense and maybe if there is enough interest they would consider it.  This is one of those "wouldn't it be nice if..." posts but there is also a fair bit of thought that went into the how's and why's.

Google has been pushing to make the web faster on various fronts from Chrome to SPDY to Page Speed to Google DNS and have been saying that they would like the Internet to be as fast as turning the pages in a magazine.  Once you get past a lot of the optimizations, there really is no way around the problem - to get faster you need to use a CDN for static content because the speed of light isn't getting any faster and it doesn't matter how fast your Internet connection is, the real performance killer is latency.

I've thought a fair bit about how I think they could do it and it should fit really well into their model as well as their suite of offerings.  I'm thinking they could offer a zero-config version that looks a lot like how coral cache works: prepend a Google CDN hostname to your origin URL and the traffic would go through Google's CDN.  That way everything needed to fetch the origin content is already embedded in the request and the site owner doesn't need to do anything.  For custom urls they could make it part of Google apps and let you configure custom CNAME's and origin servers.
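The zero-config scheme amounts to a pure URL rewrite, since the origin host and path ride along inside the CDN URL. A sketch with a made-up CDN hostname (nothing here is a real Google service):

```javascript
// Rewrite an origin URL so it is served through a hypothetical
// pass-through CDN host, coral-cache style: the origin hostname and
// path are embedded in the CDN URL, so no per-site setup is needed.
function cdnRewrite(cdnHost, originUrl) {
  const u = new URL(originUrl);
  return 'http://' + cdnHost + '/' + u.hostname + u.pathname;
}

cdnRewrite('cdn.example.com', 'http://www.example.org/static/logo.png');
// → "http://cdn.example.com/www.example.org/static/logo.png"
```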

From an infrastructure perspective it really doesn't get any easier than that (assuming you're not trying to bill people for bandwidth utilization and storage).  They'd obviously need to put some protections in place to prevent it from turning into a massive download farm (limit mime types and file sizes?).  Most  CDN providers are trying to focus on the more lucrative "bits" anyway (streaming, etc) so taking away the static content portion of the market wouldn't completely obliterate the market and it would probably be the single most impactful thing they could do to speed up the web at large.

There would also be other benefits that may not be as obvious.  They could get significantly more bang by deploying SPDY if they also owned the other end of the connection for a lot of the requests (so anything going through the Google CDN would be significantly faster in chrome).  It also seems like a much more cost-effective strategy than laying fiber in communities  and would be a perfect fit for their current application model (basically just a software deployment and it would work). 

Just like with Google Analytics, I expect there would be a huge number of sites that would switch over and start using it, and by having a standard way to do it I would expect to start seeing things like Wordpress plugins that automatically serve all of your static content through the Google CDN, making it an automatic speed-up.

Other than the obvious elephant in the room around the costs and infrastructure to run it, am I missing something?  It really should be that easy and voila - faster web for everyone.

Saturday, February 20, 2010

Exciting new CDN (MaxCDN)

A common complaint from users of WebPagetest is that I should make using a CDN (Content Distribution Network) an optional check (and it was a common complaint against YSlow as well before they did make it configurable in version 2).  It's usually because their site is too small to justify the expense or complexity of integrating with one of the CDN providers.  Recently I had a few different people ping me to add MaxCDN to the known list of CDN's that I check so while I was at it I thought I'd check them out.

I must admit, I came away totally impressed (so much so that I decided to use them for WebPagetest).  They are doing a lot of things "right" that most of the CDN companies don't bother because it's too difficult and to top it off the barrier to entry is pretty much non-existent (Trial pricing of $10 for 1TB right now with great normal prices as well).

Here are some of the highlights:

Anycast instead of DNS-based localization

Every CDN I have seen uses DNS to send each user to the closest server.  There are some serious problems with this approach:
  • The CDN provider actually only ever sees your user's DNS server's IP address (that of their ISP, company, whatever).  This is reasonable as long as they are using a DNS server that is close to them but the servers are usually regional at best (and if they are using something like OpenDNS it may not be anywhere near the actual user).  This can result in sending the user to a CDN server (POP) that is not anywhere near them.
  • The localization is only as good as their ability to figure out where the user's DNS server is.  They can usually locate the large ISP DNS servers well but a corporation or individual running their own resolver may be hit or miss (depending on how accurate their database is).
  • By relying on DNS they usually have a low TTL (Time To Live) on the DNS records - as low as 1 minute.  This means that all of the caching that goes on for DNS at the various levels (local PC, ISP, etc) gets flushed every minute and the pain of a new lookup can be pretty bad depending on the DNS architecture of the CDN provider.
MaxCDN uses Anycast routing to get users to the closest POP.  This means they can hand out the same address to all of the users and their traffic will automatically get routed to the closest peering point (and POP).  "Automatically" may be a bit simplistic since it is complex to maintain an anycast network, but it is the right way to do it and guarantees that the traffic follows the best network path to the best location for every user.

Simple configuration

It literally just takes a few minutes to set it up and have it working.  You don't have to upload the files, just set up an alias (they call them "zones") and tell it what to map it to, and all of the resources will be fetched automatically when they are referenced.

For example, I set up a zone "webpagetest" that points to "" and gave it a DNS alias of (their UI will tell you what to set the CNAME record to for it to work).  Now everything that can be accessed from can also be referenced from but using their CDN.  The hardest part is changing the relative paths I used to reference the js, css and images on my site to use a full path through the CDN.

They have a few plugins that automate the configuration for some of the common CMS platforms (wordpress for example).

An eye towards performance

Rather than just taking on the market with a well-architected inexpensive CDN they are also pushing the envelope on helping their customers optimize their sites by making it  easy to do through the CDN.  They recently updated their control panel to make it easy to add multiple aliases for the same content so splitting requests across domains just got that much easier and it looks like they are looking to push more and more capabilities into the edge.

In a time when most CDN providers are interested in streaming and other large bandwidth uses (more money since you pay for the bandwidth you use) it's really exciting to see a new player come in and shake things up where it really matters for most sites - making pages faster.  Bonus points for it being so cheap that there's really no excuse to NOT use a CDN if your site is on the Internet.

Exciting Times!

Monday, January 11, 2010

We can do better (as an industry)!

For the most part, web site performance optimization has been something that an experienced developer had to be involved in and if they weren't then odds are your site doesn't perform well.  Why do we have to work so hard to make them faster (and more importantly, why are they not fast automatically)?  I think there are some key areas that could help significantly if they were addressed:

Hosting Providers
If a hosting provider does not have persistent connections and gzipping for html, css and js enabled by default they should be out of business.  Period.  Maybe it is time to keep a public record of which hosting providers are configured well for performance and which aren't but things are in pretty bad shape. 

I'm appalled by the number of sites that get tested with persistent connections disabled and the owners contact me asking how to fix it but they can't because the hosting provider has it disabled.  Enabling gzip for html, js and css mime types by default will also go a long way to helping (and it will likely help their bottom line as they will be serving fewer bytes).
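For Apache (the common case on shared hosting), both fixes amount to a few lines of configuration. A sketch, assuming mod_deflate is loaded; exact directives and defaults vary by version:

```apache
# Persistent connections
KeepAlive On
MaxKeepAliveRequests 100
KeepAliveTimeout 5

# Compress html, css and js responses (requires mod_deflate)
AddOutputFilterByType DEFLATE text/html text/css text/javascript application/javascript
```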

CMS Platforms
This is for the Drupals, Joomlas and Wordpresses of the world.  You control the platform 100%, so make it fast by default for people installing, rather than requiring acceleration plugins or significant tuning and tweaking.  Since they all have custom plugin and theming API's there is a HUGE opportunity here.  Fixing wordpress installs is another topic I see WAY too frequently, particularly when plugins are involved.  Some suggestions:
  1. Force use of an API call to include CSS in a page template and then do the necessary processing to combine the files together, do the versioning, have long expiration times, etc (and bonus points for inlining the background images for appropriate browsers and for fancier tricks like inlining the css for first view, etc).
  2. Provide API's and hooks for javascript on pages along with async loading and combining.  Make it HARD to load js in the head and strongly discourage it.
  3. Provide automatic image compression with reasonable quality levels (that could be overridden if needed but that defaults to re-compressing images and stripping any exif information off).
Code Libraries
This is for the jQuery, MooTools, YUI, etc. libraries.  Provide samples that are already optimized.  Developers are REALLY good at copy and paste; if you give them a sample that is going to perform poorly then that's what they use.  Every example I have ever seen for a js toolkit throws the individual components into the head as separate files.  This is probably the worst thing you can do for page performance but everyone does it because that's how all of the samples tell you to do it.
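A copy-paste-friendly sample could instead show the toolkit being loaded asynchronously rather than as blocking tags in the head. A sketch of the pattern (the filename and wiring are mine, not any toolkit's official sample):

```html
<script type="text/javascript">
    // Inject the toolkit script at onload time so it never blocks the
    // initial page load; "toolkit.min.js" is a hypothetical filename.
    function loadToolkit() {
        var s = document.createElement('script');
        s.src = 'toolkit.min.js';
        s.async = true;
        document.getElementsByTagName('head')[0].appendChild(s);
    }
    if (window.addEventListener) {
        window.addEventListener('load', loadToolkit, false);
    } else {
        window.attachEvent('onload', loadToolkit); // IE8 and below
    }
</script>
```

Ship that as the default example and the copy-pasters get the fast pattern for free.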