Better gzip compression with Zopfli

You can make your gzip files 3 to 8% smaller with Zopfli. Zopfli is a file compressor that uses more exhaustive techniques to shrink gzip files. The archives it produces are 100% compatible with existing decompressors, but smaller than gzip’s output, because Zopfli spends far more CPU time and memory during compression to squeeze out every last byte.

In this post I’ll show how Zopfli reduces file sizes compared to gzip, share some benchmark results, and explain how to incorporate it into your workflow.

How does Zopfli compare to gzip?

gzip runs fast with little overhead: it relies on greedy heuristics to choose LZ77 matches, split blocks, and build the Huffman codes. Zopfli instead searches for a near-optimal LZ77 parse using iterative entropy modeling, combined with better Huffman coding and smarter block splitting. Because the output is still standard DEFLATE, Zopfli achieves better compression while remaining fully backwards compatible.
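Zopfli exposes this time-for-bytes trade-off directly through its iteration count. A few illustrative invocations (the file name is just an example; --i sets the number of compression iterations, 15 by default):

zopfli atom.xml               # writes atom.xml.gz, keeps the original file
zopfli --i100 atom.xml        # more iterations: slower, often slightly smaller
zopfli -c atom.xml > out.gz   # write the compressed data to stdout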

How does Zopfli compare to gzip at its maximum compression level?

Let’s use this site’s Atom feed as an example. The original file weighs 60,507 bytes. gzip -9 compresses it into a 21,749-byte file, and zopfli produces a 20,790-byte file. That’s a 4.4% saving.

Let’s compare with a bigger file: search_index.en.js, the search index generated by Zola. The original file is 1,668,261 bytes. gzip -9 produces a 193,712-byte file, while zopfli compresses it to 180,757 bytes. That’s a 6.7% reduction.
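You can reproduce this kind of comparison on your own files with a couple of commands (the file names are just examples; -c writes to stdout so the original is left untouched):

gzip -9 -c atom.xml > atom.xml.gzip9.gz
zopfli -c atom.xml > atom.xml.zopfli.gz
ls -l atom.xml atom.xml.gzip9.gz atom.xml.zopfli.gz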

Speed-wise, gzip is about 40 times faster than Zopfli, according to my quick benchmark with hyperfine:

$ hyperfine --warmup=3 'zopfli atom.xml' 'gzip -9 -c atom.xml > atom.xml.gz'
Benchmark 1: zopfli atom.xml
  Time (mean ± σ):     113.3 ms ±   0.7 ms    [User: 109.2 ms, System: 3.6 ms]
  Range (min … max):   112.1 ms … 115.1 ms    26 runs

Benchmark 2: gzip -9 -c atom.xml > atom.xml.gz
  Time (mean ± σ):       2.8 ms ±   0.1 ms    [User: 2.2 ms, System: 0.6 ms]
  Range (min … max):     2.5 ms …   3.3 ms    950 runs

...

Summary
  gzip -9 -c atom.xml > atom.xml.gz ran
   40.21 ± 2.04 times faster than zopfli atom.xml

How I use Zopfli

I use Zopfli to compress this website’s text files. I use Nginx as my web server and leverage its ngx_http_gzip_static_module to serve the pre-compressed gzip files via the gzip_static directive. Thus configured, when Nginx is asked for a file named foo.html, it checks whether a file named foo.html.gz exists beside it and serves that instead of compressing the original on the fly.
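Here’s a minimal sketch of the relevant Nginx configuration; the server_name and root values are placeholders for your own setup:

server {
    listen 80;
    server_name example.com;        # placeholder domain
    root /var/www/site/public;      # placeholder: where the generated files live

    location / {
        # Serve foo.html.gz instead of foo.html when it exists
        # and the client accepts gzip.
        gzip_static on;
    }
}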

There are several benefits over letting Nginx compress on the fly:

  1. Reduce download time by serving smaller files
  2. Reduce the processor load on the web server
  3. Eliminate the small delay added by compressing files on the fly

Every request benefits, because the compression work is done once, ahead of time, instead of on each response.

I create my web pages with Zola, a static site generator. Once the website’s files are generated, I run the following script to compress the files in parallel with fd and Zopfli:

fd \
    --no-ignore-vcs \
    -e html -e css -e js -e xml -e atom -e txt -e json \
    -t f \
    -j 32 \
    . public/ \
    --exec sh -c '
if [ ! -f "{}.gz" -o "{}" -nt "{}.gz" ]; then
    zopfli "{}" && touch -r "{}" "{}.gz"
fi
'

fd is a modern find-like utility. Here’s what the command above does:

  1. --no-ignore-vcs makes fd ignore .gitignore rules.
  2. -e html -e css -e js -e xml -e atom -e txt -e json finds all the filenames with the right extensions.
  3. -t f gets only files and ignores directories and links. This is somewhat redundant, but better to be safe than sorry.
  4. -j 32 limits the number of concurrent jobs to 32.
  5. . public/ matches every filename (the pattern .) inside the public directory.
  6. --exec sh -c ... executes a command over all the files.
  7. if [ ! -f "{}.gz" -o "{}" -nt "{}.gz" ]; then ... fi only considers the files that haven’t been compressed or whose modification date is newer than their gzipped equivalent.
  8. zopfli "{}" && touch -r "{}" "{}.gz" compresses the file with Zopfli and copies the modification date from the original file to the new gzipped file.
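One caveat: fd substitutes {} inside the quoted script, so a filename containing shell metacharacters could in theory break the command. A slightly more defensive variant of the same script passes the path as a positional argument instead:

fd --no-ignore-vcs \
    -e html -e css -e js -e xml -e atom -e txt -e json \
    -t f -j 32 . public/ \
    --exec sh -c '
if [ ! -f "$1.gz" ] || [ "$1" -nt "$1.gz" ]; then
    zopfli "$1" && touch -r "$1" "$1.gz"
fi
' sh {}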

Then I upload everything to my web server, and Nginx serves the smaller files, saving a bit of bandwidth.
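To verify that Nginx really serves the pre-compressed files, you can check the response headers (the domain is a placeholder); you should see Content-Encoding: gzip in the output:

curl -sI -H 'Accept-Encoding: gzip' https://example.com/atom.xml | grep -i content-encoding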

Conclusion

Reducing file sizes by around 5% isn’t groundbreaking, but it’s always good to optimize the user experience by shaving a few milliseconds off page loads, and saving CPU power on the web server by not compressing files on the fly is a nice bonus. Zopfli isn’t appropriate for files that change frequently or when compression time matters. But if you distribute gzipped files to a wide audience, using Zopfli instead of gzip is worth it. Give it a try: Zopfli is available for most platforms and easy to use.