Every Principle Has a Scar

Most engineering advice arrives stripped of its scar. Someone hands you "keep it simple" or "measure before you optimize" as if it were a fact about the world, and it slides right off, because a principle you did not earn is indistinguishable from a slogan. The rules that actually run my hands on a keyboard are different. Each one has a specific failure attached to it, a day I remember, a thing that broke and cost something. This post is that: not a tutorial, but the changelog of how I came to think about building software, told through the scars.

The timeline on the right is the real structure. Read it as a spine. Every node is a moment something taught me a lesson the hard way, and every lesson became a line in a doctrine I now keep as a living document. The prose just walks the spine from top to bottom.

Why

Why I keep a changelog of my own beliefs

Here is the thing worth your time before any of the war stories: your principles are only trustworthy if you can say when and why each one changed.

For years I would have told you I did not have a process. That turned out to be false. I had a heuristic-shaped one I had compiled so deep I stopped noticing it, the way you stop noticing the grammar of a language you speak fluently. When I finally sat down to write it out, I realized I had a set of hard beliefs about building software, and every single one traced back to a specific failure. The belief was the projection. The scar was the event that produced it.

So I started treating my own thinking the way I treat a system I respect: event-sourced. The git log is the immutable record of every change. The current doctrine is the projection you get by replaying it. And a short human-readable timeline sits on top explaining why each belief shifted, because git tells you what changed and the why is the part worth keeping.

That reframing matters for a practical reason. It means a principle is falsifiable. It has a birth date and it can have a death date. Deleting a belief I have stopped believing is as valuable as adding a new one, and I can only do that honestly if I remember what it cost me to hold it in the first place. The rest of this post is me replaying the log out loud.

First build

The first product nobody needed

"There is nothing so useless as doing efficiently that which should not be done at all." Peter Drucker

The first real thing I built, early on, I built for the architecture. I remember being genuinely proud of it. It was clean, it was extensible, it was the kind of thing you would put in a portfolio to show you understood layering and abstractions. It also solved a problem the business did not have.

Nobody needed it. Not "nobody appreciated the craft", but literally: there was no outcome on the other side of it. I had spent my effort making the engineering good and never once checked whether the engineering was pointed at anything. The product died, quietly, the way most first products do, and it took me an embarrassingly long time to understand that it had not failed because of a bug. It failed because I had answered a question nobody asked.

That is where my oldest principle comes from: engineering serves outcomes, not orders. Engineering is a support function to the business. Half of the "hard" problems I now get handed disappear the moment I check whether they actually need solving. Start from the goal, not from the tech.

But I want to be careful with the word "support", because there is a lazy version of this that is also wrong. "Support function" does not mean "order-taker". The cheap shortcut whose second-order cost is ten times worse in six months is not serving the business, no matter who asked for it. Serving the outcome sometimes means pushing back on the expedient thing. The scar taught me to aim; it did not teach me to stop thinking.

Start from the goal, not the tech. Half of the hard problems disappear when you check whether they need solving at all.

III

Pre-GenAI

The clever model only I could debug

"The purpose of abstraction is not to be vague, but to create a new semantic level in which one can be absolutely precise." Edsger W. Dijkstra

The pre-GenAI years were model years. I built a custom dual-encoder transformer, a whole synthetic data pipeline that could spit out a hundred thousand training examples a run, retraining systems, the works. It was some of the most technically satisfying work I have done. And then one of those systems broke in a way I could not see into, and the person on the hook for understanding it at an unreasonable hour was me, because I was the only one who could.

The lesson split into two halves that I now hold together.

The first half: genuine complexity should become a deep black box. When complexity is essential, and in ML it often genuinely is, you encapsulate it behind a small, simple interface so it does not leak into everything it touches. A deep module: tiny surface, powerful inside. Other systems should depend on the interface, not the internals. This is just information hiding, the oldest good idea in the field, and I had half-done it.

The second half is the one I only learned from the outage: keep the seams observable. Hiding complexity is not the same as making it invisible. Every non-trivial abstraction leaks eventually, and when it does you need to be able to drill in fast. A black box you cannot see into is not an asset, it is a liability with good PR. The recovery at 3am depends entirely on the seams being instrumented. I had built a beautiful box with no windows.

GenAI pivot

The orchestration that ate itself

"The cheapest, fastest, and most reliable components of a computer system are those that aren't there." Gordon Bell

Then the ground shifted and everything became orchestration. Multiple LLMs with fallbacks, safety guardrails, toxicity and manipulation checkers, PII handling, a hybrid retrieval pipeline with a custom reranker sitting over knowledge-graph, full-text, and semantic search. Every one of those pieces was individually justified. Every one of them solved a real problem. And the system as a whole slowly became something I dreaded changing.

That is the scar that produced the principle I now apply most often: every new moving part must pay its rent. Not once, at install time. Every day, forever. Do not add a component, a service, or a dependency unless its value clearly and continuously exceeds its ongoing cost. Most complexity is self-inflicted, and it is inflicted one reasonable-sounding addition at a time.

The mechanism underneath it is second-order effects. A component's real cost is almost never its first-order function, the thing you added it to do. The cost is the integration, the coupling, the new failure modes, and the set of things that now have to stay consistent with each other. Systems fail at the seams, and every new part adds seams. So now, before I commit to a part, I try to trace what it will cost me in the places I am not adding it. And the default, always, is the simplest thing that works. Simplicity is not a style preference. The simplest thing that works is almost always more correct, more debuggable, and cheaper to change than the clever thing. Complexity has to be earned, not assumed.

Real-time

The argument I lost to a number

"Measure. Don't tune for speed until you've measured." Rob Pike, Notes on Programming in C

Building real-time systems is where I learned to shut up and measure. Somewhere in the middle of designing how state should live and fan out to many live clients, I was in one of those architecture arguments that go nowhere, two smart people with two confident intuitions, neither of which could be true at the same time. We could have kept debating for a week.

Instead we built a tiny benchmark. A throwaway harness, a couple of hours of work, pushing a stupid number of events through the candidate designs. The number came back and it did not agree with either of us. It was surprising, which is exactly the point. That is when it clicked: measure before you decide.

Most architecture debates are two intuitions colliding. A measurement ends them, and it is usually surprising, which means the debate would never have converged on the truth on its own. Now, when I hit a scary assumption, the one the whole design hangs on, I do not design around it. I write the smallest POC or benchmark that would tell me if I am wrong, and I let the number settle the argument. It is faster than being right by authority, and it is the only version of "being right" that survives contact with production.

Scaling up

The system I could no longer hold in my head

"The competent programmer is fully aware of the strictly limited size of his own skull." Edsger W. Dijkstra, The Humble Programmer

Systems grow. A multi-tenant SaaS platform, fifty-plus endpoints, hierarchical role-based access control, a dozen moving subsystems. At some point I noticed I was spending real energy just trying to keep the whole thing loaded in my head at once, and failing, and feeling vaguely ashamed of failing, as if a good engineer should be able to hold it all.

That shame was the bug. The scar taught me: design in zoom levels, and page in detail on demand. Working memory is the real constraint, not intelligence. So the top level of a system should stay genuinely simple and comprehensible, and the complexity should be available on the way down, when and only when you need it. One simple top diagram. Drill in for the rest. Do not hold it all in your head, reload it from the docs, the same way you page memory in from disk.

The uncomfortable part of this one is that ego is what makes people resist it. Wanting to be the person who holds the whole system in their head is a status impulse, not an engineering one. Navigable systems beat memorized ones every time, because the memorized ones have a single point of failure and it is a human being who occasionally sleeps. Do not miss the forest for the trees, and do not mistake carrying the whole forest in your arms for strength.

VII

Leading

The 3am only I could answer

"Programs must be written for people to read, and only incidentally for machines to execute." Abelson and Sussman, SICP

The last scar is the one that turned me from an engineer into something closer to a lead, and it is the same shape as scar two, just aimed at people instead of code.

I did a heroic recovery once. The kind where you are the only one who knows the incantation, you save the day, and for about an hour it feels great. Then it curdles, because you realize what you have actually built is a system that requires you, specifically, awake, at 3am. The heroics were the symptom. The disease was that I had become a black box with no seams, in human form.

So the principle is: after a heroic recovery, kill the bus factor. Write down what only you knew, so no one ever has to be you at 3am again. And the larger turn behind it, the one I am still learning: give away the work you are best at and coach instead. My output is no longer what I personally ship. My output is now the team's output. That means teaching through heuristics and apprenticeship rather than runbooks, saying the quiet strategic part out loud, narrating the debugging while someone watches instead of just fixing it. The tacit strategy is the thing worth transferring. The fix is disposable.

This is the hardest one to actually do, because everything that made me good at the individual work pulls against it.

VIII

Now

What the timeline is actually for

"When the facts change, I change my mind." John Maynard Keynes (attributed)

If you read the timeline on the right as a list of accomplishments, you have read it wrong. It is a list of corrections. Each node is a place I was going in the wrong direction with total confidence until something pushed back.

That is why I keep the doctrine event-sourced. The current beliefs, pay-its-rent, don't-overcomplicate, second-order effects, deep black boxes with observable seams, zoom levels, measure-first, serve-the-outcome, are just the projection you get by replaying these scars. They are not eternal. Every one of them has a birth date and every one of them is allowed to have a death date, and the day I stop being able to tell you which incident taught me a given rule is the day that rule has quietly become a slogan again.

So the prompt I leave for the future version of me, the one who will read this back years from now, is simple. What did a recent failure teach me that none of these principles capture yet? And which one have I stopped believing? Because the timeline is not done. It is just paused at today. Every principle here has a scar, and I fully expect to earn a few more.