The Tyranny of the Fake Problems Equation
In rocket engineering, the heavier the rocket, the more fuel is needed to lift it off the ground. But fuel itself adds weight, so you need more fuel to lift the fuel you already have.
Here’s a source I found on the internet: Tyranny of the Rocket Equation, by Austin Morris, Director of Engineering at Kall Morris
So to push that mass, you add fuel. But now you’ve added more fuel mass, so you need to add more fuel mass to fuel the mass that you’ve already amassed atop your fuel. And herein lies the tyranny of the rocket equation: the more fuel you have, the more fuel you need.
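For reference, the formula behind the analogy is Tsiolkovsky’s rocket equation (the standard form, not a quote from the article above):

    Δv = v_e · ln(m_0 / m_f)

where Δv is the change in velocity the rocket can achieve, v_e is the exhaust velocity, m_0 is the mass with fuel, and m_f is the mass without it. Rearranged, m_0 / m_f = e^(Δv / v_e): the mass ratio you need grows exponentially with the Δv you want, which is exactly the tyranny the article describes.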
The concept applies to programming, and to all engineering domains.
The same way that “fuel” adds weight, and thus requires more fuel, every “solution” in engineering adds new problems, thus requiring yet more solutions.
This is neither a bad thing nor a good thing. It’s just how things are.
To steal the phrasing from the article linked above:
The more solutions you have, the more solutions you need!
Fake Problems
Sometimes you think you have a problem, but it’s a fake problem. The thing you want to achieve can be achieved in a simpler, more robust way than what you are currently doing.
You want to show a login form, but you find yourself writing piles and piles of boilerplate glue code to connect 10 different microservices together. Session management needn’t be that complicated!
You want your web application backend to be able to handle the infrequent (but not so rare) spikes of 1000 requests/second, so you study cloud services and horizontal scaling. 1k req/sec is not that difficult to handle on one machine.
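To make that concrete, here’s a minimal sketch of the kind of thing I mean: one ordinary process, the standard library’s HTTP server, and a cookie-backed in-memory session store. All the names (loginHandler, sessions, and so on) are made up for illustration; the point is that neither session management nor a four-digit requests-per-second figure requires ten services.

    // A minimal sketch of a single-binary login flow: one process, an
    // in-memory session store, and net/http from the standard library.
    // Names here (loginHandler, sessions, etc.) are illustrative only.
    package main

    import (
        "crypto/rand"
        "encoding/hex"
        "net/http"
        "sync"
    )

    var (
        mu       sync.Mutex
        sessions = map[string]string{} // session ID -> username
    )

    func newSessionID() string {
        b := make([]byte, 16)
        rand.Read(b)
        return hex.EncodeToString(b)
    }

    func loginHandler(w http.ResponseWriter, r *http.Request) {
        // A real system would verify credentials against a database here.
        user := r.FormValue("user")
        id := newSessionID()

        mu.Lock()
        sessions[id] = user
        mu.Unlock()

        http.SetCookie(w, &http.Cookie{Name: "session", Value: id, HttpOnly: true})
        w.Write([]byte("logged in\n"))
    }

    func main() {
        http.HandleFunc("/login", loginHandler)
        // A plain net/http server like this comfortably serves on the order
        // of a few thousand requests per second on one modest machine.
        http.ListenAndServe(":8080", nil)
    }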
It is my contention that a lot of web companies spend most of their resources solving fake problems, but they don’t realize it because their focus is too narrow.
Context
Fake problems are only fake in context, relative to a specific goal you want to achieve, where you have a way to evaluate the effectiveness of the solution. Here’s an attempt at formalization:
X is a fake problem iff you mistakenly think you need X in order to achieve objective A, but there’s another technique Y that is better than X at achieving A when measured by the relevant metric M.
Maybe you have to use microservices for your login workflow because you are just the junior programmer assigned the task, and the system is architected in such a way as to make it impossible to achieve the task without microservices. In this context, your problem is “I need to complete this particular task within the constraint of the system by the end of the current sprint”. The relevant measure here is “number of hours it takes to get this feature done”. Changing the architecture would take a lot more time, and would be considered a bad alternative.
In this narrow context, the problem is not necessarily fake per se.
However, if your measure is “total time required to implement all features now and in the future, and total cost (in terms of resources) for developing and maintaining this system for the next 10 years”, then it matters what the architecture is.
While there’s no way to know ahead of time what the right architecture is (you have to discover it), there are many wrong architectures that you should avoid.
“Microservice Architecture” is just the wrong architecture for any kind of problem. Period. Never even consider it.
Fake problems caused by architectural decisions cannot really be solved after-the-fact (at least not easily). If you make an obviously wrong choice, you will have to pay a very high interest rate over the course of the product’s lifetime.
Your initial architecture should be just the minimal thing that allows you to get the early things done and move forward with minimal friction. The details will reveal themselves to you later. As the project grows, you discover that there are five or six operations that you do a lot. They have a lot in common, yet the code for doing them is dispersed all over the place. You figure out how to streamline the process, eliminate waste, and reduce the combinatoric explosion of codepaths. Now you have a little piece of architecture: every time you want to do one of those five or six things, you go through that codepath.
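As a contrived sketch of what such a “little piece of architecture” might look like (all types and names here are hypothetical), suppose the five or six common operations all reduce to validate, apply, record:

    // A contrived sketch: instead of five or six near-identical codepaths
    // scattered around the project, every such operation goes through Run.
    package main

    import "fmt"

    type Record map[string]string

    // Op is one of the handful of operations the project does a lot.
    type Op struct {
        Name     string
        Validate func(Record) error
        Apply    func(Record) Record
    }

    // Run is the single shared codepath: validate, apply, record.
    func Run(op Op, rec Record) (Record, error) {
        if err := op.Validate(rec); err != nil {
            return nil, fmt.Errorf("%s: %w", op.Name, err)
        }
        out := op.Apply(rec)
        fmt.Printf("audit: %s applied\n", op.Name) // one place to log, audit, measure
        return out, nil
    }

    func main() {
        rename := Op{
            Name:     "rename",
            Validate: func(r Record) error { return nil },
            Apply:    func(r Record) Record { r["name"] = "new name"; return r },
        }
        fmt.Println(Run(rename, Record{"name": "old name"}))
    }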
Interpreted languages
They are not only slow, which is already bad enough; they also don’t have a clean deployment story. This causes all sorts of “fake” problems that the entire industry has spent over a decade trying to fight.
See, when you have a proper compiled language, typically you can take source code as input, and produce a binary executable as output. The Operating System can just load and start executing it. That’s what Operating Systems are designed to do.
With interpreted languages, all of that goes out the window. There’s no “compilation”. There’s just code. You tell the interpreter to start executing a certain script file and it just goes. The script might load other scripts and “packages”. The location of these packages depends on the environment: the version of the interpreter running, where it’s located, environment variables, etc.
There’s no “deliverable” that you can produce. If I write a program in Python on my machine, and I want to then run it on another machine, I have to duplicate the entire environment on the target machine: the source code, the interpreter version and the installed packages, environment variables, etc.
Many “solutions” to this problem have been proposed and adopted over the years. First there was virtualenv, but apparently it was not “meta” enough, so then virtualenvwrapper became more or less the de facto standard. At least that was the status quo about 10 years ago; I’m pretty sure it’s entirely different now.
This is a clear example of a fake problem. Writing a program on one machine and executing it on another machine has been a solved problem since forever. All operating systems are designed to be able to load and execute binary files with machine code. You just take the source code, feed it to the compiler, and it will produce the correct file. There is your deliverable! You might have to include other resources & media with it, but it’s not particularly difficult to bundle the media assets along with the binary executable into a zip file.
Go does this well. It does it even better than C. Compiling C programs is a huge mess because the language spec does not define what constitutes a program and how to build one, so you get many different (and incompatible) build systems. Integrating libraries into your source code is also a mess for this reason.
Go, on the other hand, does this right: the language clearly defines what constitutes a package, what constitutes a program, and how to compile source code into programs. The compiler even knows how to produce executable files for different operating systems and different CPU architectures. So I can write code on my M2 MacBook Air laptop (aarch64, darwin) and produce a binary that runs on a typical 64-bit Ubuntu server running on Intel hardware (x86_64, linux), and it just works. I believe the Odin compiler is able to do the same.
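Here’s a sketch of what that single deliverable can look like. The “static” directory and the file layout are just an example; the build commands are shown as comments. Assets get compiled into the executable with //go:embed, and GOOS/GOARCH select the target platform at build time.

    // A sketch of the single-deliverable approach. Static assets are compiled
    // into the executable with //go:embed; cross-compiling is a matter of
    // setting GOOS/GOARCH when building, e.g. on an arm64 Mac:
    //
    //     GOOS=linux GOARCH=amd64 go build -o server .
    //
    // which produces a binary that runs on an x86_64 Linux server.
    // The "static" directory name is just an example.
    package main

    import (
        "embed"
        "net/http"
    )

    //go:embed static
    var assets embed.FS

    func main() {
        // One file to copy to the server: the executable, assets included.
        http.Handle("/", http.FileServer(http.FS(assets)))
        http.ListenAndServe(":8080", nil)
    }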
This is the correct way to “write once, run anywhere”.
Poisoning the well
Now, the problems with interpreted languages don’t stop there. They keep spawning new problems, some of them “emergent” in the sense that they are not intrinsic to the technology but emerge out of the interaction between the technology and its users.
When programmers spend years programming in this model of virtual environments and scripting languages, they get used to the mess. They think it’s a natural part of the process. It imprints on them and affects how they approach all engineering problems.
So we get companies with entire projects (serious projects, with important clients) that are a huge mess of programs and packages and dependencies and build scripts and configuration scripts that all need to be executed in a very specific order and glued together in an intricate manner to make the thing run, and hardly anyone on the team knows how to make it happen. There are maybe two people who understand how it works, in a team of ten or so programmers.
This is all a fake problem. It’s a situation you put yourself in because you don’t understand how to create a deliverable; because you spent your entire programming education without being taught the concept.
Now, it’s a fake problem, but with a narrow enough focus, it’s a real problem: we have a company and a project and paying customers, and we need to keep the lights on. So it’s a real problem. It’s a huge problem, in fact!
And it needs a solution!
Containers
And this is how we got Docker. Supposedly it lets you encapsulate the entire environment (for a system running on a Linux machine) in what amounts to a special configuration format. Anyone should be able to use the Dockerfile to rebuild the environment. This will often involve downloading data from the internet, but who cares! The internet is fast! Let’s normalize downloading 500MB of data from 10 different providers every time you rebuild your system (which can happen several times per day, mind you).
But the mentality that created the problem is still there. So when you give it a way to encapsulate the environment into one deliverable item, what do you think it will do? It will construct a software system that requires multiple Docker containers running in parallel, all of which somehow need to coordinate with each other.
Just when you thought we finally solved the fake problem (of not being able to create a deliverable) with a so-so solution, they make the situation worse.
So the solution to creating a system out of multiple containers is something like docker-compose. It lets you create a config file that tells Docker how to launch the relevant containers and which environment variables they need to share in order to make the system work.
Next thing you know, someone who just loves fiddling with configuration files figures out how to create an advanced configuration system where you can change the values of 50 or so config options before you launch the docker-compose set, and it will change how the system behaves. No one on earth knows how the system works except for him.
Microservices is only the logical next step.
The slippery slope is real; don’t let anyone tell you it’s a fallacy.
What do you think the majority of junior Go programmers do? They write microservices! They write a microservice on the order of 30k lines of Go code, half of which is boilerplate. They also create a Docker image for their deliverable, because how can you run a program without Docker, right? We’ve never heard of anything that just runs on its own without Docker; have you?
This is the fakest of the fake problems. You took a language that is so good at creating a simple, self-contained deliverable, and instead of using it to its fullest potential to make the development and deployment of your system as simple as possible, you just couldn’t resist the urge to screw it all up.
Congratulations!
Now you need to hire 3,000 engineers to maintain a website that could’ve been maintained by 50 engineers.