Tuesday, June 4, 2013

Progressive JPEGs FTW!

TL;DR: Progressive JPEGs are one of the easiest improvements you can make to the user experience, yet their penetration is a shockingly low 7%. WebPagetest now warns you about any JPEGs that are not progressive and provides some tools to get a lot more visibility into the image bytes you are serving.

I was a bit surprised when Ann Robson measured the penetration of progressive JPEGs at 7% in her 2012 Performance Calendar article. Instead of a 1,000-image sample, I crawled all 7 million JPEG images that were served by the top 300k websites in the May 1st HTTP Archive crawl and came out with... wait for it... still only 7% (I have a lot of other cool stats from that image crawl to share but that will be in a later post).

Is The User Experience Measurably Better?


Before recommending that everyone serve progressive JPEGs, I wanted to get some hard numbers on how much of an impact it would have on the user experience. I put together a pretty simple transparent proxy that could serve arbitrary pages, caching resources locally and transcoding images for various optimizations. Depending on the request headers it would:

  • Serve the unmodified original image (but from cache so the results can be compared).
  • Serve a baseline-optimized version of the original image (jpegtran -optimize -copy none).
  • Serve a progressive optimized version (jpegtran -progressive -optimize -copy none).
  • Serve a truncated version of the progressive image where only the first 1/2 of the scan lines are returned (more on this later).

I then ran a suite of the Alexa top 2,000 e-commerce pages through WebPagetest comparing all of the different modes on a 5Mbps Cable and 1.5Mbps DSL connection.  I first did a warm-up pass to populate the proxy caches and then each permutation was run 5 times to reduce variability.
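
The header-driven modes in the list above could be wired up roughly as follows. This is a minimal sketch, not the proxy's actual code: the `X-Image-Mode` header name and the mode strings are hypothetical (the post does not name them), and only the jpegtran switches come from the list. It passes `-outfile` so the command also works on jpegtran builds that otherwise write to standard output.

```python
# Hypothetical header name and mode strings; the post does not name them.
MODE_HEADER = "X-Image-Mode"

# jpegtran switches per transcoding mode, taken from the list above.
JPEGTRAN_SWITCHES = {
    "baseline": ["-optimize", "-copy", "none"],
    "progressive": ["-progressive", "-optimize", "-copy", "none"],
    # The truncated mode starts from the same progressive file; cutting it
    # down would be a separate step that is not shown here.
    "truncated": ["-progressive", "-optimize", "-copy", "none"],
}


def jpegtran_command(headers, src_path, dst_path):
    """Return the jpegtran argv for a request, or None to serve the cached original."""
    mode = headers.get(MODE_HEADER, "original")
    switches = JPEGTRAN_SWITCHES.get(mode)
    if switches is None:
        # Unknown or "original" mode: serve the unmodified cached bytes.
        return None
    return ["jpegtran", *switches, "-outfile", dst_path, src_path]
```

The returned list can be handed to `subprocess.run` on a machine that has jpegtran installed.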

The full test results are available as Google docs spreadsheets for the DSL and Cable tests.  I encourage you to look through the raw results and if you click on the different tabs you can get links for filmstrip comparisons for all of the URLs tested (like this one).

Since we are serving the same bytes, just changing HOW they are delivered, the full time to load the page won't change (assuming an optimized baseline image as a comparison point).  Looking at the Speed Index, we saw median improvements of 7% on Cable and 15% on DSL.  That's a pretty huge jump for a fairly simple serving optimization (and since the exact same pixels get served there should be no question about quality changes or anything else).
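
To see why the Speed Index can improve while the load time stays the same, recall that it is the area above the visual-progress curve: every interval counts in proportion to how incomplete the page still looked. The sketch below works from filmstrip-style samples; the two sample lists are made-up numbers for illustration, not data from these tests.

```python
def speed_index(frames):
    """Area above the visual-progress curve, in milliseconds.

    frames: (time_ms, percent_visually_complete) pairs sorted by time.
    The page is treated as 0% complete before the first frame.
    """
    area = 0
    prev_time, prev_percent = 0, 0
    for time_ms, percent in frames:
        # Weight each interval by how incomplete the page looked during it.
        area += (100 - prev_percent) * (time_ms - prev_time)
        prev_time, prev_percent = time_ms, percent
    return area / 100


# Made-up filmstrips: both pages finish at 4000 ms, but the second one
# looks mostly complete much earlier.
baseline = [(1000, 20), (2000, 40), (3000, 70), (4000, 100)]
progressive = [(1000, 20), (2000, 80), (3000, 90), (4000, 100)]

print(speed_index(baseline))     # 2700.0
print(speed_index(progressive))  # 2100.0
```

Both hypothetical pages are fully loaded at the same moment, yet the one that gets most of its pixels on screen earlier scores lower (better).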

Here is what it actually looks like:



Some people may be concerned about the extremely fuzzy first pass in the progressive case. This test was done using just the default jpegtran scans. I have a TODO to experiment with different configurations to deliver more bits in the first scan and skip the extremely fuzzy passes. By the time you get to 1/2 of the passes, most images are almost indistinguishable from the final image so there is a lot of room for improving the experience.

What this means in WebPagetest


Starting today, WebPagetest will be checking every JPEG that is loaded to see if it is progressive and it will be exposing an overall grade for progressive JPEGs:

The grade weights the images by their size so larger images will have more of an influence.  Clicking on the grade will bring you to a list of the images that were not served progressively as well as their sizes.
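
A byte-weighted score of this kind can be sketched as below. The exact formula WebPagetest uses is not given here, so treat this as one plausible reading of "weights the images by their size": the share of JPEG bytes that were served progressively. The function name and the input shape are assumptions made for illustration.

```python
def progressive_score(images):
    """Byte-weighted percentage of JPEG bytes that were served progressively.

    images: iterable of (size_bytes, is_progressive) pairs, one per JPEG.
    Returns a float from 0 to 100, or None if there were no JPEG bytes.
    """
    total_bytes = 0
    progressive_bytes = 0
    for size_bytes, is_progressive in images:
        total_bytes += size_bytes
        if is_progressive:
            progressive_bytes += size_bytes
    if total_bytes == 0:
        return None  # nothing to grade
    return 100.0 * progressive_bytes / total_bytes
```

With one 90,000-byte progressive image and one 10,000-byte baseline image, half of the images pass but the score is 90, because the large image dominates.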

Another somewhat hidden feature that will now give you a lot more information about the images is the "View All Images" link right below the waterfall:


It has been beefed up and now displays optimization information for all of the JPEGs, including how much smaller each one would be when optimized and compressed at quality level 85, whether it is progressive and, if so, the number of scans:

The "Analyze JPEG" link takes you to a view where it shows you optimized versions of the image as well as dumps all of the meta-data in the image so you can see what else is included.
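
The two facts reported for each image (is it progressive, and how many scans does it have) can be read straight from the JPEG marker stream. The following is a minimal sketch of that idea, not WebPagetest's own code: it treats SOF2 and SOF10 frame headers as progressive and counts start-of-scan (SOS) markers. The function name is an assumption, and error handling for damaged files is deliberately thin.

```python
import struct

PROGRESSIVE_SOF = (0xC2, 0xCA)  # SOF2 (Huffman) and SOF10 (arithmetic)
SOS = 0xDA                      # start of scan
EOI = 0xD9                      # end of image


def jpeg_scan_info(data):
    """Return (is_progressive, scan_count) for the bytes of a JPEG file."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG (missing SOI marker)")
    pos, progressive, scans = 2, False, 0
    while pos + 1 < len(data):
        if data[pos] != 0xFF:
            pos += 1            # entropy-coded scan data
            continue
        marker = data[pos + 1]
        if marker == 0xFF:
            pos += 1            # fill byte before a marker
            continue
        if marker in (0x00, 0x01) or 0xD0 <= marker <= 0xD7:
            pos += 2            # stuffed zero, TEM or restart: no length field
            continue
        if marker == EOI or pos + 4 > len(data):
            break
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if marker in PROGRESSIVE_SOF:
            progressive = True
        elif marker == SOS:
            scans += 1
        pos += 2 + length       # skip the segment (length covers its own 2 bytes)
    return progressive, scans
```

Segments such as EXIF blocks are skipped by their declared length, so an embedded thumbnail does not inflate the scan count.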

What's next?


With more advanced scheduling capabilities coming in HTTP 2.0 (and already here with SPDY), sites can be even smarter about delivering image bits: once enough data has been sent to render a "good" image, a progressive image can be re-prioritized so that the rest of it is delivered after the other images on the page have had a chance to display as well. That's a pretty advanced optimization, but it will only be possible if the images are progressive to start with (and the 7% number does not look good).

Most image optimization pipelines right now are not generating progressive JPEGs (and aren't stripping out the meta-data because of copyright concerns) so there is still quite a bit we can do there (and that's an area I'll be focusing on).

Progressive JPEGs can be built with almost arbitrary control over the separate scans.  The first scan in the default libjpeg/jpegtran setting is extremely blocky and I think we can find a much better balance.
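
The "almost" matters: the JPEG specification puts a few constraints on how coefficients may be split across progressive scans. A minimal sketch of a checker for the basic ones follows. The function and its argument names are hypothetical, and it ignores the successive-approximation (bit-plane) rules entirely.

```python
def check_progressive_scan(components, ss, se):
    """Return a list of problems with one progressive scan (empty if it looks OK).

    components: indexes of the components in the scan (e.g. [0, 1, 2]).
    ss, se: first and last coefficient in zigzag order (0 is DC, 1-63 are AC).
    """
    problems = []
    if not 0 <= ss <= se <= 63:
        problems.append("spectral range must satisfy 0 <= Ss <= Se <= 63")
    if ss == 0 and se != 0:
        # DC and AC coefficients cannot share a scan in progressive mode.
        problems.append("DC (coefficient 0) must be sent in its own scan")
    if ss > 0 and len(components) != 1:
        # Only DC scans may interleave several components.
        problems.append("AC scans may contain only one component")
    return problems
```

For example, an interleaved DC scan (`[0, 1, 2]`, 0, 0) passes, while a luma scan that spans coefficients 0 through 5 is rejected.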

At the end of the day, I'd love to see CDNs automatically apply lossless image optimizations and progressive encoding for their customers while maintaining copyright information.  A lot of optimization services already do this and more but since the resulting images are identical to what came from the origin site I'm hoping we can do better and make it more automatic (with an opt-out for the few cases where someone NEEDS to serve the exact bits).

20 comments:

  1. What about the effect on CPU overhead for things using batteries? Is the JPEG FAQ entry out of date regarding this?

    http://www.faqs.org/faqs/jpeg-faq/part1/section-11.html

  2. The FAQ is technically correct, but decoders have improved significantly and a lot of battery-powered devices have ASICs that can do the decode. Progressive JPEGs also use more memory while being decoded, but both of those are very much micro-optimizations that website owners should not have to worry about.

    The overall picture is a lot more complicated because they tend to be smaller byte-wise than baseline so they use a tiny bit less radio.

    In the case of mobile, the screen will typically dominate battery use, and the power used to decode JPEGs as part of rendering pages isn't even a rounding error (most studies that look at battery usage exclude the screen because of how heavily it skews the results).

    There was a good discussion on it ~5 months ago here: https://github.com/yeoman/yeoman/issues/810 and Joe Drew from Firefox had a great quote "Yes, progressive JPEGs will result in more uploads to the GPU, and more CPU writes to the same memory areas. My suggestion to you is to leave that to us browser makers."

    Replies
    1. IIRC in libjpeg-turbo the SIMD optimizations only affect baseline, so the decoder is much slower in progressive. Then in hardware the same slowness can happen.

    2. https://bugzilla.mozilla.org/show_bug.cgi?id=715919#c23

  3. The webpagetest report should account for the source image size: I've found that small icons actually become larger when using progressive JPEG so our site conditionally enables progressive JPEG only when the image has more than 16K pixels, which is consistently a net win in my testing.

  4. The overall score is weighted by the image size so small images should not change the overall score much (unless you have nothing but small JPEGs). I'm open to also setting a floor and not dinging for images < 10KB.

  5. Great great post, Pat. Thanks. Since this only applies to JPG it made me wonder how popular JPG is compared to other image formats. According to the HTTP Archive, 46% of images are JPG. So this improvement would affect almost half of the images seen on the Web. A big win.

  6. @steve, I'm cautiously optimistic that the impact would be even more than that.

    Since JPEGs tend to be used for the larger images, they constitute more of the page bytes than the other images.

    I don't have scientific data for it but I expect that navigating content and commerce sites will also tend to skew requests more towards JPEGs. My assumption is that GIF/PNG are used mostly for layout and don't change frequently, while JPEGs tend to be used for story photos, product images, etc. and will be the content (along with the actual HTML/body) that changes the most often.

  7. I would not trust the information reported by WebPagetest's JPEG tool. The lossless and lossy recommendations it offered were larger in size. http://www.webpagetest.org/jpeginfo/jpeginfo.php?url=http://www.ericperrets.info/images/meme.jpg

    Replies
    1. @Eric - do you see something that is inaccurate? If you already optimized the images then there's a good chance that there isn't room to make them smaller. If the original is already compressed at a quality level lower than 85 (and looking at the artifacts it looks like it was probably well under 70) then re-compressing it at 85 is guaranteed to be bigger.

      The JPEG tool just dumps information about the file and checks to see if easy lossless or lossy gains are left on the table.

      If you run a site that has that image on it through webpagetest itself, you shouldn't be getting dinged for image compression. You might get dinged for it not being progressive, since it's over 10KB, but progressive delivery isn't all about file size.

    2. I would also use large progressive JPEGs with caution on memory limited devices (entry level smartphones).
      To reconstruct a progressive JPEG a browser has to either:
      - keep huge matrices in memory (6 bytes per pixel in the case of a color JPEG, since each "data unit" holds 14-bit values per cell).
      - keep all the scans compressed in memory and perform decompression in parallel on those (you lose the benefits of progressive display in this case).

      Sequential JPEGs based on interleaved MCUs holding the 3 components (YCbCr) do not face this memory hog problem, since the memory used by an MCU (or a line of MCUs) can be released right after it has been decompressed back to RGB pixels.

  8. "Some people may be concerned about the extremely fuzzy first pass in the progressive case. This test was done using just the default jpegtran scans. I have a TODO to experiment with different configurations to deliver more bits in the first scan and skip the extremely fuzzy passes."

    This is impossible since the JPEG spec states that in progressive mode the DC values (the top left value of a data unit) have to live in their own scan, apart from the 63 AC values.
    That's why you'll never be able to write something like "0: 0-5" in a scan file for jpegtran; only 0-0 works. Luma and chroma can be interleaved ("0 1 2: 0-0"), but if you are looking for the smallest file, simply let JPEGrescan do its work.

    May I suggest reading the chapters about JPEG in John Miano's book:
    http://www.2shared.com/document/7E31nRlZ/Miano_-_Compressed_Image_File_.html

    On the other hand, you could produce sequential JPEGs with scan parameters like:
    0;
    1 2;
    (this one is usually better than the default "0 1 2;" when chroma sub-sampling is involved, since it will avoid the inclusion of dummy luma data units to fill the MCU).

    0 1;
    2;

    or

    0 2;
    1;
    can also give better results than default sequential.

  9. Agreed, progressive JPGs accelerate the start render and give a more useful idea of the content sooner. Not everyone agrees on this last point, but I'm not one of those people.
    However, I do not get why nobody defends the poor users still on IE 8 and earlier (precisely the ones who need speed boosts, btw): IE8 and earlier do not smoothly display progressive JPGs. The effect is actually worse than with regular JPGs: as long as the image is downloading, you see a blank area, so there is no visual indication that an image is coming.

    Take a look at the screenshots : http://www.webpagetest.org/video/compare.php?tests=130717_N5_a43aa8f9b286a33bb769189a28fed9f5-r:1-c:0
    The main progressive JPG is not displayed until it is fully there, whereas when it was a regular JPG the user knew something (heavy) was downloading:
    http://www.webpagetest.org/video/compare.php?tests=130717_K1_ce80cb6e524af4bcf05c985ce9d455c5-r%3A1-c%3A0&thumbSize=200&ival=1000&end=visual

    So I don't really agree that progressive should be the standard: let website owners decide whether their user base has modern enough browsers to benefit from this technique.

  10. Depending on what data set you look at, IE 8 is under 10% and declining. I'm not saying we shouldn't support them but I wouldn't necessarily optimize the experience for that small a share of the population, particularly since it is declining.

    Statcounter: http://gs.statcounter.com/#browser_version_partially_combined-ww-weekly-201327-201328-bar

    Akamai: http://www.akamai.com/html/io/io_dataset.html#stat=browser_ver&top=5&type=line&start=20130601&end=20130707&net=both

    The distribution for a given site may be different, so obviously you should take that into account, but it's not like we're talking about optimizing for the few that are on cutting-edge browsers; we're talking about optimizing for the 80+% case.

  11. Hi Patrick, thanks for this very informative post on progressive JPEGs. Just the thing I was looking for, since I have the same problem with my website at the moment.

    Webpagetest:
    http://www.webpagetest.org/result/130807_DY_PQ8/1/performance_optimization/#progressive_jpeg

    Are there any plugins for WordPress I can use to do this JPEG compression on my images?

    Thanks!

    Replies
    1. With any luck progressive JPEGs will be the default in WordPress in the not too distant future: http://core.trac.wordpress.org/ticket/21668

      Kraken.io does a good job of creating progressive images and is supposed to be working on a WordPress plugin.

      If you have shell access to the server you can also run Adept across your images which is also a great option.

  12. Actually, my webpage wallpaper is already losslessly optimized and progressive (saved using the PROGRESSIVE option in Photoshop CS5).
    However, webpagetest tells me that the image is NOT progressive and that it could be optimized (furthermore, the optimized version would be a few kB bigger than original version).
    I'd appreciate it if you could take a look:
    http://www.webpagetest.org/jpeginfo/jpeginfo.php?url=http%3A%2F%2Fwww.flapane.com%2Fimages_style%2Fwood.jpg

    Thanks

  13. Hello
    Please tell me how to convert a JPEG image into a progressive JPEG. Which software should I use, or how should I change it? Please give me a link.

    Replies
    1. The easiest is probably the jpegtran command-line utility (available on most OSes through their package managers; here for Windows: http://jpegclub.org/jpegtran/ ).

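      jpegtran -progressive source.jpg out.jpg

      (On builds that write to standard output rather than accepting a second file name, use: jpegtran -progressive -outfile out.jpg source.jpg)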

      Stoyan Stefanov wrote a good article on it back in 2008: http://www.yuiblog.com/blog/2008/12/05/imageopt-4/

      Delete
    2. Photoshop also has an option for that. Just open the image and use Save for Web with the Progressive option checked (CTRL + ALT + SHIFT + S).
