Better gzip compression with Zopfli
You can make your gzip files 3 to 8% smaller with Zopfli. Zopfli is a file compressor that uses advanced techniques to shrink gzip files. The archives it produces are 100% compatible with existing decompressors; the files are smaller than those produced by the original gzip compressor because Zopfli spends more CPU time and memory during compression to squeeze out every last byte.
In this post I’ll show how Zopfli can reduce file sizes compared to gzip, demonstrate some benchmark results, and then explain how you can incorporate it into your workflow.
How does Zopfli compare to Gzip?
Gzip runs fast and with little overhead. It uses heuristics to split blocks, parse literals, and build the Huffman coding. Zopfli uses optimal parsing, enhanced Huffman coding, and optimized block splitting. By spending more effort on finding repeated patterns and structuring the data, it achieves better compression while remaining backwards compatible.
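A quick way to convince yourself that the output is still a plain gzip stream (the filename is illustrative):
$ zopfli atom.xml        # writes atom.xml.gz and leaves atom.xml in place
$ gzip -t atom.xml.gz    # the standard gzip integrity test accepts it like any other .gz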
How does Zopfli compare to Gzip with its maximum compression level?
Let’s use this site’s Atom feed as an example. The original file weighs 60,507 bytes. gzip -9 compresses it into a 21,749-byte file, and zopfli produces a 20,790-byte file. That’s a 4.4% saving.
Let’s compare with a bigger file: search_index.en.js, the search index generated by Zola. The original file is 1,668,261 bytes. gzip -9 produces a 193,712-byte file, while zopfli compresses it to 180,757 bytes. That’s a 6.7% reduction.
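If you want to reproduce the comparison, both tools can write to standard output, which makes the sizes easy to line up. A small sketch, assuming the -c flag of the zopfli CLI (check zopfli -h on your build):
$ gzip -9 -c atom.xml > atom-gzip.xml.gz       # best standard gzip
$ zopfli -c atom.xml > atom-zopfli.xml.gz      # Zopfli, writing gzip output to stdout
$ ls -l atom.xml atom-gzip.xml.gz atom-zopfli.xml.gz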
Speedwise, gzip is 40 times faster than Zopfli according to my quick benchmark:
$ hyperfine --warmup=3 'zopfli atom.xml' 'gzip -9 -c atom.xml > atom.xml.gz'
Benchmark 1: zopfli atom.xml
Time (mean ± σ): 113.3 ms ± 0.7 ms [User: 109.2 ms, System: 3.6 ms]
Range (min … max): 112.1 ms … 115.1 ms 26 runs
Benchmark 2: gzip -9 -c atom.xml > atom.xml.gz
Time (mean ± σ): 2.8 ms ± 0.1 ms [User: 2.2 ms, System: 0.6 ms]
Range (min … max): 2.5 ms … 3.3 ms 950 runs
...
Summary
gzip -9 -c atom.xml > atom.xml.gz ran
40.21 ± 2.04 times faster than zopfli atom.xml
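That gap is also tunable: the zopfli CLI exposes an iteration count that trades even more CPU time for slightly smaller output. A hedged example, assuming the --i option listed by zopfli -h on your build:
$ zopfli --i1000 atom.xml    # far more iterations than the default: slower, marginally smaller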
How I use Zopfli
I use Zopfli to compress this website’s text files. I use Nginx as my web server and leverage its ngx_http_gzip_static_module to serve the pre-compressed gzip files via the gzip_static directive. Configured this way, when Nginx is asked to serve a file named foo.html, it checks whether a file named foo.html.gz exists beside it and serves that instead of compressing the original file on the fly with gzip.
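The relevant configuration is short. A minimal sketch, assuming ngx_http_gzip_static_module is compiled in; the root path and MIME types are illustrative:
server {
    root /var/www/site;                    # illustrative document root

    location / {
        gzip_static on;                    # serve foo.html.gz instead of foo.html when the .gz exists
        gzip on;                           # optional fallback: compress on the fly when no .gz is present
        gzip_types text/css application/javascript application/json application/xml;
    }
}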
There are several benefits over letting Nginx compress on the fly:
- Reduce download time by serving smaller files
- Reduce the processor load on the web server
- Eliminate the minuscule delay from compressing the files on the web server
This lowers latency and saves a bit of processing power.
I create my web pages with Zola, a static site generator. Once the website’s files are generated, I run the following script to compress the files in parallel with fd and Zopfli:
fd \
--no-ignore-vcs \
-e html -e css -e js -e xml -e atom -e txt -e json \
-t f \
-j 32 \
. public/ \
--exec sh -c '
if [ ! -f "{}.gz" -o "{}" -nt "{}.gz" ]; then
zopfli "{}" && touch -r "{}" "{}.gz"
fi
'
fd is a modern find-like utility. Here’s what the command above does:
- --no-ignore-vcs doesn’t take .gitignore into account.
- -e html -e css -e js -e xml -e atom -e txt -e json finds all the filenames with the right extensions.
- -t f gets only files and ignores directories and links. This is somewhat redundant, but better to be safe than sorry.
- -j 32 limits the number of concurrent jobs to 32.
- . public/ looks for all the files in the public directory.
- --exec sh -c '...' executes a command over all the files.
- if [ ! -f "{}.gz" -o "{}" -nt "{}.gz" ]; then ... fi only considers the files that haven’t been compressed yet or whose modification date is newer than that of their gzipped equivalent.
- zopfli "{}" && touch -r "{}" "{}.gz" compresses the file with Zopfli and copies the modification date from the original file to the new gzipped file.
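After a run, every compressible file should have a .gz sibling that is smaller and carries the original’s modification time. A quick spot check (paths illustrative):
$ ls -l public/index.html public/index.html.gz    # the .gz is smaller and shares the mtime
$ gzip -t public/index.html.gz                    # and is a valid gzip stream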
Then I upload all the files to my web server and I get Nginx to serve the smaller files and save a bit of bandwidth.
Conclusion
Reducing file sizes by around 5% isn’t groundbreaking, but it’s always good to optimize the user experience by shaving a few milliseconds off page loads. It’s also a nice bonus to save CPU power on the web server by not compressing served files on the fly. Zopfli may not be appropriate for files that change frequently or when compression time matters, but if you distribute gzipped files to a wide audience, using Zopfli instead of gzip is worth it. Give it a try: Zopfli is available for most platforms and easy to use.