deflate & gzip: I was wrong

UPDATE: I was wrong about being wrong! In the comments section below, Billy Hoffman has pointed out that gzip compression is still incredibly useful, and that I should do better research. ;)

 

Today I had to rethink how I serve data to my users.

According to an answer posted on StackOverflow, gzip or deflate compression should be avoided when serving text-based files.

While compressing data to decrease bandwidth and download times generally seems like a great idea, a number of problems have surfaced during the evolution of on-the-fly compression over the wire. After reading the article Lose the Wait, I thought more about why I had used mod_gzip in the past and started to realize that the benefits aren’t particularly worth the effort or expense.

  • Internet Explorer has a history of issues with receiving compressed payloads — so much history that all gzip/deflate HOWTOs have a line disabling compression for IE, based upon user agent evaluation.
  • Compression uses extra CPU cycles on each request.
  • It requires another module to be loaded into memory (mod_gzip, mod_deflate, etc.).
  • SSL (HTTPS) connections cannot be compressed.

A better approach is to combine technologies like browser caching and local storage mechanisms, server-side expires tags, and code “minification” to mitigate the need for on-the-fly compression by addressing the underlying issue: management of client-to-server requests.
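
As a rough illustration of the expires-tag part of that approach, here is a minimal Python sketch of the headers a server could attach to a static file so the browser stops re-requesting it. The function name `caching_headers` and its parameters are made up for this example, and a real server would normally set these through its own configuration (mod_expires, for instance) rather than by hand.

```python
from email.utils import formatdate


def caching_headers(max_age_seconds: int, now: float) -> dict:
    """Return response headers that let a browser reuse a static file.

    `now` is a Unix timestamp; passing it in keeps the function testable.
    """
    return {
        # Relative lifetime, understood by HTTP/1.1 clients and caches.
        "Cache-Control": f"public, max-age={max_age_seconds}",
        # Older fallback: an absolute expiry date in HTTP date format.
        "Expires": formatdate(now + max_age_seconds, usegmt=True),
    }
```

Calling `caching_headers(86400, time.time())` would give a one-day lifetime. Once a file is cached this way, the browser does not ask for it again until it expires, so there is nothing left to compress for those repeat visits.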

All that said, compressed HTTP payloads still have a place on the web: for example, on Reddit, Digg, Slashdot, or any other predominantly text-based site with a large volume of traffic from modern, non-IE web browsers.

Comments

  1. Mike,

    I’m glad you enjoyed our Lose the Wait: HTTP Compression article, but the conclusion that you should not use HTTP compression at all is not a very good one.

    1- The problem is with DEFLATE. Gzip compression has not had major problems in over 15 years. Use HTTP compression with Gzip and all is well.

    2- The problem with IE and compression today is with misconfigured web servers. IE has no compression problems. Other people might be misconfiguring their servers to not use compression with IE, but that doesn’t affect you or *your* server at all. This does not excuse you from using HTTP compression properly.

    3- I am highly skeptical of the increased CPU stats around compression, or CPU utilization of deflate vs. gzip. The CodingHorror post you quote is nearly 4 years old. It also comes down to how your web server was compiled and what hardware it is running on. Modern CPUs have hardware instructions and optimizations for the type of operations that HTTP compression uses.

    Besides, if you are really concerned about the CPU load of HTTP compression, modern web servers like Apache or IIS allow you to compress static content like JavaScript or CSS or static HTML files ahead of time and serve the pre-compressed responses using Content Negotiation.

    4-Apache’s mod_gzip (for Apache 1.x) and mod_deflate (for Apache 2.x) are part of the standard set of modules, such as mod_expires. These are most likely already loaded into memory by default, and not using them doesn’t remove them from memory. Also, the size in RAM of these modules was negligible even in the late 90s, let alone today on servers with gigs of RAM.

    5-SSL connections can most definitely be compressed. SSL encryption happens after the HTTP response is generated. HTTP content can be compressed before SSL encrypts it. In fact you don’t need to do anything extra to make this happen. If HTTP compression is enabled, and SSL is enabled, it just occurs for applicable requests.

    I absolutely agree with you that there are other optimization techniques, such as reducing the number of HTTP requests to begin with, that will have a greater impact on performance than HTTP compression. That said, HTTP compression has such a low barrier to implement (especially if you use a correct, premade configuration file like HTML5 Boilerplate’s .htaccess file) and such a positive impact on bandwidth that you should implement it.

    Thanks,
    Billy

    • Mike Walker Author

      Billy, thanks for such a great article and I appreciate your reply to my post. I better understand your point now: DEFLATE is the devil, and gzip is awesome when configured properly. Also, thanks for helping me reevaluate my misconceptions.. again. ;)
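
Billy's third point, compressing static files ahead of time and picking the right copy through content negotiation, might look roughly like the Python sketch below. It is only an illustration under simplifying assumptions: the helper names (`precompress`, `accepts_gzip`, `choose_response`) are invented here, the `Accept-Encoding` parsing ignores the `*` wildcard, and Apache or IIS would do all of this through configuration rather than application code.

```python
import gzip


def precompress(body: bytes) -> bytes:
    """Compress once, ahead of time, at the highest gzip level."""
    return gzip.compress(body, compresslevel=9)


def accepts_gzip(accept_encoding: str) -> bool:
    """Return True if the Accept-Encoding header allows gzip (q > 0)."""
    for part in accept_encoding.split(","):
        token, _, params = part.strip().partition(";")
        if token.strip().lower() != "gzip":
            continue
        quality = 1.0  # no q-value means full preference
        params = params.strip()
        if params.lower().startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0  # treat a malformed q-value as a refusal
        return quality > 0
    return False


def choose_response(accept_encoding: str, plain: bytes, gzipped: bytes):
    """Pick the stored copy to send; no compression work per request."""
    if accepts_gzip(accept_encoding):
        headers = {"Content-Encoding": "gzip", "Vary": "Accept-Encoding"}
        return headers, gzipped
    return {"Vary": "Accept-Encoding"}, plain
```

The CPU cost is paid once when the file is stored, which addresses the per-request concern from the post. The `Vary: Accept-Encoding` header tells caches that the compressed and uncompressed copies are different responses for the same URL.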

