GZIP encoding = happier users?

In “Make your site faster and cheaper to operate in one easy step”, Paul Buchheit recommends turning on HTTP gzip compression on your site and makes a cost/benefit analysis. The advantages on the server side are clear, but what about the clients? What are the benefits of gzip encoding for them? I ran a series of tests using curl to see exactly how much time is saved on the client side. I ran the tests against friendfeed.com.

Results

Test from Canada (ping 26ms)
File      Connect time (ms)   Pre-transfer time (ms)   Total time (ms)    Transfer size
          Mean     Median     Mean     Median          Mean     Median    Bytes    KBytes
no gzip   41       41         155      144             292      282       42531    41.53
gzip      40       41         148      144             186      182       9639     9.41
Test from France (ping 155ms)
File      Connect time (ms)   Pre-transfer time (ms)   Total time (ms)    Transfer size
          Mean     Median     Mean     Median          Mean     Median    Bytes    KBytes
no gzip   160      157        393      384             861      853       42531    41.53
gzip      157      158        392      388             548      544       9639     9.41

First, let’s calculate the real download rate. It’s not the number of bytes transferred divided by the total time: the total time includes the connection time and the server’s processing time, which are not strictly download time. The real download rate is calculated this way:

size / (total_time - pre-transfer_time)

This formula is not 100% accurate, since it doesn’t take into account network latency and includes decompression time. But it’s close enough.

On my Canadian connection, the real download rate is about 310 KB/sec for the uncompressed page and 255 KB/sec for the compressed one. From France: 91 KB/sec uncompressed and 61 KB/sec compressed.
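As a rough check, the formula can be applied to the mean values from the tables above. This is only a sketch: it assumes the times are in milliseconds and that 1 KB means 1000 bytes, which is what seems to match the rates quoted here. The dictionary layout and the function name are made up for the illustration.

```python
# Mean timings (ms) and sizes (bytes) copied from the tables above.
# Assumption: times are milliseconds, and 1 KB = 1000 bytes.
RESULTS = {
    ("Canada", "no gzip"): {"pretransfer_ms": 155, "total_ms": 292, "size_bytes": 42531},
    ("Canada", "gzip"): {"pretransfer_ms": 148, "total_ms": 186, "size_bytes": 9639},
    ("France", "no gzip"): {"pretransfer_ms": 393, "total_ms": 861, "size_bytes": 42531},
    ("France", "gzip"): {"pretransfer_ms": 392, "total_ms": 548, "size_bytes": 9639},
}


def download_rate_kb_per_sec(size_bytes, total_ms, pretransfer_ms):
    """size / (total_time - pre-transfer_time), expressed in KB/sec."""
    transfer_seconds = (total_ms - pretransfer_ms) / 1000.0
    return size_bytes / transfer_seconds / 1000.0


for (location, variant), r in RESULTS.items():
    rate = download_rate_kb_per_sec(
        r["size_bytes"], r["total_ms"], r["pretransfer_ms"]
    )
    print(f"{location:7} {variant:8} {rate:6.1f} KB/sec")
```

The results land within a few KB/sec of the figures quoted above; the small differences presumably come from rounding in the tables.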

Bigger downloads tend to have a better rate, so the benefit of compression is partially offset by the lower download rate. Just because gzip divides the size by 4 or 5 doesn’t mean you will get the page 4 or 5 times faster.

The download rate is not the only important factor. Connection and processing take time too: more than half of the total if you have a fast connection.

When you have a rate over 200 KB/sec, the size of the page is not as important as it used to be. Faster broadband access means shorter connection time and even shorter download time. It also means that the server’s response time matters more and more. It takes at least 110ms after the connection is established before the first byte reaches the client. That’s between 36% and 58% of the total time. It might account for even more in a few years.

GZIP encoding helps to reduce the total time significantly: around 35%-36%.

In theory, compressing the page should do wonders with slower download rates. The server’s response time is just 26% - 42% of the total time. With the download rate being lower, one would expect GZIP encoding to reduce the total time by more than 36%. Surprisingly, that’s not the case. The gain is unchanged at 36%.

That was a surprise for me. It turns out that GZIP encoding didn’t reduce the total latency by much more than 36%, because of the long connection time.
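The effect is easier to see when the transfer phase is separated from the total. The sketch below uses the mean values from the tables above, assumes they are in milliseconds, and uses made-up names. It compares how much gzip shortens the transfer alone with how much it shortens the whole request.

```python
# Mean timings in ms, copied from the tables above: (pre-transfer, total).
TIMINGS_MS = {
    "Canada": {"no gzip": (155, 292), "gzip": (148, 186)},
    "France": {"no gzip": (393, 861), "gzip": (392, 548)},
}


def percent_saved(before, after):
    """Relative reduction from `before` to `after`, as a percentage."""
    return 100.0 * (before - after) / before


for location, runs in TIMINGS_MS.items():
    pre_plain, total_plain = runs["no gzip"]
    pre_gzip, total_gzip = runs["gzip"]
    # Only the time after pre-transfer can be shortened by compression.
    transfer_saved = percent_saved(total_plain - pre_plain, total_gzip - pre_gzip)
    total_saved = percent_saved(total_plain, total_gzip)
    print(f"{location}: transfer time -{transfer_saved:.0f}%, "
          f"total time -{total_saved:.0f}%")
```

In both locations the transfer phase shrinks by roughly two thirds or more, but the pre-transfer time barely moves, so the total only drops by about 36%.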

The results might have been better with a bigger transfer, but 40K is already big for an HTML file. My data set was limited, so other tests with a greater set of connection types and different speeds might yield different results and would help determine how efficient GZIP encoding is in different situations.

Can GZIP encoding hurt performance?

Is there any case where you should avoid compressing your pages? Let’s run the test on www.google.ca.

This is probably one of the worst cases:

Test from Canada (www.google.ca, ping 12ms)
File      Connect time (ms)   Pre-transfer time (ms)   Total time (ms)    Transfer size
          Mean     Median     Mean     Median          Mean     Median    Bytes    KBytes
no gzip   29       28         56       52              74       73        6838     6.68
gzip      29       27         58       57              63       62        2897     2.83

The gain is lower than before, but still significant at 15%.

I guess there are very few cases where you shouldn’t gzip your content. If your typical page is less than 100 bytes, then gzipping it could hurt the client’s and the server’s performance. But no website (except maybe a few web services) serves pages with a typical size of 100 bytes or less. So there’s no excuse for serving uncompressed HTML.
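The reason tiny responses don’t benefit is that the gzip format adds a fixed header and trailer (at least 18 bytes) around the compressed data. Here is a small sketch using Python’s standard gzip module; both payloads are invented for the illustration.

```python
import gzip

# Hypothetical payloads: a tiny web-service reply and repetitive HTML-like markup.
tiny = b'{"ok":true}'
page = b"<li class='entry'>hello</li>\n" * 1000

for name, body in (("tiny", tiny), ("page", page)):
    packed = gzip.compress(body)
    # The tiny body grows because of the fixed gzip overhead;
    # the repetitive page shrinks a lot.
    print(f"{name}: {len(body)} bytes -> {len(packed)} bytes gzipped")
```

The tiny reply comes out larger than it went in, while the repetitive page shrinks to a small fraction of its size. Real HTML is less repetitive than this sample, so expect a more modest ratio, like the 4 to 5 times measured above.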

Want to test your own site?

Download the scripts. You’ll need curl to run the test and Python to create the report. The code is in the public domain.
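If you just want an idea of how such a measurement can be done, here is a minimal sketch. It is not the script mentioned above, only an approximation built on curl’s --write-out variables (time_connect, time_pretransfer, time_total, size_download). The function names, the number of runs and the use of /dev/null (Unix only) are my own assumptions.

```python
import statistics
import subprocess

# curl --write-out variables: times are reported in seconds, size in bytes.
WRITE_OUT = "%{time_connect} %{time_pretransfer} %{time_total} %{size_download}"


def curl_command(url, use_gzip):
    """Build the curl call for one measurement; the body is discarded."""
    cmd = ["curl", "--silent", "--output", "/dev/null", "--write-out", WRITE_OUT]
    if use_gzip:
        # Ask for gzip explicitly. Without --compressed curl does not decode
        # the body, so size_download is the compressed size on the wire.
        cmd += ["--header", "Accept-Encoding: gzip"]
    return cmd + [url]


def parse_sample(line):
    """Turn one line of curl output into times in ms and a size in bytes."""
    connect, pretransfer, total, size = line.split()
    return {
        "connect_ms": float(connect) * 1000,
        "pretransfer_ms": float(pretransfer) * 1000,
        "total_ms": float(total) * 1000,
        "size_bytes": int(float(size)),
    }


def measure(url, use_gzip, runs=10):
    """Fetch the URL `runs` times with curl and collect one sample per run."""
    samples = []
    for _ in range(runs):
        result = subprocess.run(
            curl_command(url, use_gzip),
            capture_output=True, text=True, check=True,
        )
        samples.append(parse_sample(result.stdout))
    return samples


def summarize(samples, key):
    """Return (mean, median) of one field across all samples."""
    values = [sample[key] for sample in samples]
    return statistics.mean(values), statistics.median(values)
```

For example, calling measure("http://www.example.com/", use_gzip=True) and then summarize(samples, "total_ms") on the result gives the mean and median total time for the compressed page. Run it again with use_gzip=False to compare.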