Post-processed Code

August 7, 2024

A few years ago, I worked for a large financial institution that peddled meme stocks and crypto. They had an extensive infrastructure managed with Terraform.

One of my tasks was to set up a Kubernetes cluster to process some of their CI workload. This was my first time using Terraform in a production environment; previously, I had used it mostly to manage development and staging infrastructure. The project involved creating numerous AWS objects for different availability zones, along with various permissions and policies. This resulted in a complex web of dependencies between blocks. I used Terraform's meta-programming features to generate these objects: dynamic blocks, for expressions, conditional expressions, and locals blocks. After some debugging to get the logic right, I was fairly pleased with the outcome.

Once the cluster was provisioned, I needed to integrate it with the rest of the infrastructure. I met with one of the local experts, who offered to review the Terraform source. After a few minutes, he informed me that my carefully crafted meta-programming logic would have to be scrapped because, in their world, Terraform meta-programming was frowned upon. He explained that while this approach made it easier to develop a small project with relatively few objects, it made managing and debugging a large infrastructure a nightmare. The Ops team needed a clear picture of each individual object, and with meta-programming, they had to mentally expand the code to figure out what was what and how it all fit together. They preferred to have every object written out in full, with its values populated. Post-processed code can be more useful than its source.

Terraform

This was a blind spot for me: I had never considered that it might be easier to view the data in its raw form. For me, data was something to be summarized and expressed in the tersest way possible. I loved meta-programming because it made my job easier, but it made the Ops team's work in an emergency miserable. Having the full view of their infrastructure was worth it, even if this meant tripling the amount of code. "Post-processed" code made more sense in that context.

Autoconf & Automake

Autoconf and Automake are build tools used to generate configure scripts and makefile templates. These old-school tools have largely fallen out of favor. When shipping a project, one would distribute the post-processed scripts and makefiles, because requiring users to have Autoconf and Automake installed would make building the project more difficult. Having both the source and the generated scripts in the repository was just easier for everyone.

C Pre-processor Versus Copy/Paste

I've done a fair amount of C meta-programming using the pre-processor. It's difficult and error-prone. The C pre-processor has all kinds of quirks and gotchas that often lead to hard-to-debug problems. For example, because macros are expanded as plain text, you have to parenthesize both the macro arguments and the whole macro body, or operator precedence can silently change the meaning of the expanded expression. Over time, I came to realize that in many cases, a bit of copy-pasted code is a lesser evil than relying on the pre-processor.
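Here is a minimal sketch of the parentheses gotcha. The macro and function names (SQUARE_UNSAFE, SQUARE_SAFE, square, and the helpers) are made up for this illustration and do not come from any real project.

```c
#include <assert.h>

/* Hypothetical macros, named for this illustration only. */
#define SQUARE_UNSAFE(x) x * x          /* no parentheses at all */
#define SQUARE_SAFE(x)   ((x) * (x))    /* arguments and body parenthesized */

/* Expands to: 1 + 2 * 1 + 2, which C parses as 1 + (2 * 1) + 2. */
int unsafe_square_of_sum(void)
{
    return SQUARE_UNSAFE(1 + 2);
}

/* Expands to: ((1 + 2) * (1 + 2)). */
int safe_square_of_sum(void)
{
    return SQUARE_SAFE(1 + 2);
}

/* The non-macro alternative: a plain function. The argument is
 * evaluated once, before the call, so there is no textual expansion
 * to reason about. */
int square(int x)
{
    return x * x;
}

/* Records what each variant actually computes. */
void check_squares(void)
{
    assert(unsafe_square_of_sum() == 5);  /* not 9: precedence bites */
    assert(safe_square_of_sum() == 9);
    assert(square(1 + 2) == 9);
}
```

Even the parenthesized macro still evaluates its argument twice, so an argument with side effects would misbehave. That is one more reason a plain function, or a few duplicated lines, can be the safer choice.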

Hydrated code can be better than DRY

Sometimes, post-processed code makes more sense than trying to keep things DRY. Meta-programming can be overused, and sometimes it's better to avoid it. Maybe it's because people have an easier time reading the expanded code, or maybe it's the maintenance cost, or maybe it's just more convenient for the user.