Opinion, July 4, 2026

The Conservation of Complexity: The Ends Still Need a Middle

Cheap code does not make software simpler.

It makes software cheaper to produce and more expensive to trust.

Those are not the same thing. And the gap between them is about to become the most expensive mistake in the industry.

This is the third piece in a line of thought. The Rehydration Flip argued that when code is cheaper to generate than to understand, the source of truth moves up, from implementation to intent. The Fork Winter followed the bill as it came due at maintenance time, with interest.

Both were arguments about cost. This one is about shape.

If code is nearly free, what does the stack actually look like? There is a clean answer going around, and I think it is wrong. The answer is: flat. The layers collapse. Protocols, schemas, interfaces, all of it accidental, all of it dissolved by a capable enough agent. One model. One interface. One English-speaking thing where a stack used to be.

I want to show where that answer breaks. And I want to start from a principle, not a rebuttal.

Complexity is conserved

Layers are not the disease. Layers are where we localize complexity, so a person can hold one slice of it without holding all of it. A schema localizes what counts as a valid record. A protocol localizes how two parties agree. A transaction localizes what must be true before you commit.

Delete the layer and the complexity does not vanish. It moves.

The Conservation of Complexity. The essential complexity of a system belongs to the problem and the guarantees you demand of it, not to the code. Remove a layer and you do not destroy the complexity it held. You relocate it, usually somewhere you can no longer see.

Press one side. It bulges where you are not looking.

I did not invent this, and I am not claiming a new law. Fred Brooks separated essential from accidental complexity in No Silver Bullet. Larry Tesler gave interface design its conservation law: a minimum complexity that cannot be removed, only shifted between the system and the people around it. Larry Wall is credited with the waterbed image for language design: push down here and it bulges there. I am borrowing the image, not turning it into a theorem. Joel Spolsky's leaky abstractions are what the bulge feels like from above.

The old versions were about where complexity goes. Into the problem. Into the user. Into the leak. The AI version has a sharper edge. When code is free, complexity moves between the layers. It leaves the places where we used to check it and lands where trust is most expensive. You never get to move it to zero.

Delete the schema, and "valid record" moves inside the agent. Implicit. Unverifiable. Derived again, silently, on every model update. Delete the protocol, and the agreement becomes a negotiation you pay for on every call. Delete the transaction, and "true before we commit" becomes a thing you hope the model remembered.

There is a second cost, quieter and worse.

Every guarantee-bearing layer you delete is also a seam. A place you could watch, test, interpose. Flatten the stack and you have nowhere left to put a probe. You bought a simpler surface with your own blindness.

I just watched that bet being made out loud, in its strongest form.

The idea, and the people who hold it

I was giving an invited talk at the DP2E-AI workshop in Paris a few days ago. One of the keynotes, by Ludovic Denoyer of IMEC's AI lab, was listed under the title "Building Agent Societies: A Revolution in Software Engineering," and it made the flat-stack case about as well as I have heard it made. The slides called the new methodology "not just possible, but inevitable."

The talk is the cleanest statement I have heard of a view that is suddenly everywhere. The view, not the talk, is what this post is about.

The argument goes like this. Programming is complexity we imposed on ourselves. The slides put it in writing: programming is "human-imposed complexity," its consequence "paradigms, databases, protocols," its data systems "schemas held together by tricks." Three forces have arrived at once. Machine learning that implements behavior from data. Foundation models that make that behavior reusable. Natural language as a universal interface. Put them together and you do not improve software engineering. You end a version of it.

The example was a booking system. The old stack is user, operator, UI, database. The naive agent just replaces the operator and keeps driving the UI. Real agentic AI, the slide reads, is "operator + UI + database collapse into one agent." One step further, and the agents talk to each other too, exchanging information "without complex intermediate data structures," English text blocks flowing through a unified connection cable. The destination is "the agent society: English-speaking agents" on heterogeneous hardware. Schemas go. Protocols go. The stack pancakes.

This is not the hype crowd. These are serious researchers describing something real, and they are far from alone.

They are standing on the strongest current in modern AI. Rich Sutton called it the bitter lesson. Across seventy years of the field, the general methods that scale with computation beat the clever structure humans hand-build, and they win by a wide margin. Andrej Karpathy gave the software version its cleanest slogan. Software 1.0 is code. Software 2.0 is learned weights. Software 3.0 is a model you program in English. The hottest new programming language, he likes to say, is English. Matt Welsh, a systems professor before he was a founder, wrote a piece for the Communications of the ACM titled The End of Programming, and he meant it literally. "For all but very specialized applications," he wrote, most software "will be replaced by AI systems that are trained rather than programmed."

This is a broad, senior view, and the direction it points is real. That is why it is worth saying exactly where it goes wrong.

Where I agree

Most software complexity really is accidental. The brittleness in a normal system does not live in its logic. It lives in the glue. The adapters, the serializers, the bespoke mapping between one team's nouns and another's, the UI that exists only so a human can do what an API already could. The flat-stack camp is right that most of what we call engineering is manual translation between representations that never needed to differ. That layer can go. Good riddance.

Machine learning implements behavior we cannot specify. This is the deep one, and it is true. The shift is not "learn from data." The shift is that the specification leaves the code and moves into examples. We can build functions nobody could write down. Sutton is right about the direction. That is a change in kind, not a change in speed. But notice what the bitter lesson is actually about. How to build intelligence, not where a deployed system keeps its guarantees. The difference does quiet work in everything below.

Natural language collapses the cost of the long tail. Most integrations were never built, because agreeing on a contract cost more than the integration was worth. That cost is collapsing. This opens an enormous space of software that was never economical before. It is the real engine, and it is bigger than people think.

If the argument stopped there, I would have nothing to add.

It does not stop there. It takes one more step, and that step is where it breaks.

Where I disagree

The accidental complexity collapses. The essential complexity does not. And the flat-stack story spends the substrate as if it were glue.

A booking system still has to refuse the double booking. Durably. Under concurrency. With an audit trail. Whether or not anyone is watching. That requirement belongs to the world, not the code. No amount of model quality makes a double booking acceptable.

A guarantee like that gets enforced by machinery with explicit assumptions, known failure modes, and an audit trail, or it does not exist. This is not taste. It is how you build things that hold together. You put contracts between the layers, and each layer keeps its promise so the layer above can stop checking. That is what modularity is. And unless the flat stack can name where that promise lives, it is asking you to point at the hardest promise in the system and say "fuck it, the model will remember."

So the promise has to live somewhere.

Either the agent holds it, and the agent becomes an opaque, drifting thing that quietly carries all the logic the schema used to hold in the open.

Or a deterministic substrate keeps enforcing it. Maybe that substrate is a full database. Maybe it shrinks to a single authoritative commit point, one conditional write on one calendar. Shrinking is not collapsing. The layer lost its UI and its glue and kept its job.
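A minimal sketch of what "one conditional write on one calendar" could look like, assuming a SQLite table with a uniqueness constraint on room and slot. The table, the `open_calendar` and `try_book` helpers, and the sample values are hypothetical. The agent is reduced to whoever calls `try_book`; the point is only that the refusal lives in the storage layer, not in the caller's memory.

```python
import sqlite3

def open_calendar() -> sqlite3.Connection:
    # In-memory database for illustration; a real system of record would be durable.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE bookings ("
        " room TEXT NOT NULL,"
        " slot TEXT NOT NULL,"
        " guest TEXT NOT NULL,"
        " PRIMARY KEY (room, slot))"  # the invariant: one booking per room and slot
    )
    return conn

def try_book(conn: sqlite3.Connection, room: str, slot: str, guest: str) -> bool:
    """Attempt the commit; return False if the slot is already taken."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO bookings (room, slot, guest) VALUES (?, ?, ?)",
                (room, slot, guest),
            )
        return True
    except sqlite3.IntegrityError:
        # The database refused the double booking, whoever proposed it.
        return False

conn = open_calendar()
first = try_book(conn, "room-1", "2026-07-04T10:00", "alice")
second = try_book(conn, "room-1", "2026-07-04T10:00", "bob")
print(first, second)
```

The caller can be a person, a script, or a model. The second write fails either way, because the constraint is enforced at the commit point rather than remembered by the proposer.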

Those are the options. The guarantee does not disappear. It changes address.

The strongest flat-stack story is not stupid. It does not say guarantees vanish. It says the durable layers vanish. The schema is synthesized when needed. The protocol is negotiated for this interaction. The checker is generated at commit time. The whole middle is used, trusted for a moment, and thrown away. Only one thing persists: an authority over scarce resources.

Fine. Grant almost all of it. The concession is still fatal. An authority over scarce resources is the substrate, under a new name. A database an agent wrote is still a database. A generated commit point is still a commit point. Its author does not matter. Its lifetime does not matter. Its job matters. This is the old proof-carrying code bargain in architectural form. The producer can be untrusted. The artifact can be generated. The receiver still needs a small checker and a policy it is willing to enforce.

And my claim is narrower than it sounds. Not that any latch survives. That high-stakes systems keep commit semantics that are named, typed, and specified separately from the agent proposing the effect. A generic write gate is not enough to audit, not enough to reverse, not enough to sue over. The pancake needs the substrate gone. It is not gone. It is smaller. It is generated. It is still there.

And look at what already happened while we argued about it. The early agent protocols did not build a pure English cable. MCP puts natural language where it helps a model choose, and JSON Schema on the callable surface: inputs and, increasingly, outputs. That is not a proof of correctness. It is a typed boundary. A2A is looser, agent-to-agent rather than agent-to-tool, and even there the protocol wraps the conversation in agent cards, task lifecycles, artifacts, and authentication. Even the loose one wanted envelopes around the language.
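To make "typed boundary" concrete, here is a minimal sketch of a callable surface that checks arguments before anything runs. It is a hand-rolled stand-in for schema validation, not MCP's actual API. The tool name `book_room`, its fields, and the `validate_call` helper are all hypothetical.

```python
from typing import Any

# Hypothetical tool surface: a prose description for the model, plus typed fields.
BOOK_ROOM = {
    "name": "book_room",
    "description": "Reserve a room for a guest at a given time slot.",
    "required": {"room": str, "slot": str, "guest": str},
}

def validate_call(tool: dict[str, Any], args: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the call is well-typed."""
    problems = []
    for name, expected in tool["required"].items():
        if name not in args:
            problems.append(f"missing field: {name}")
        elif not isinstance(args[name], expected):
            problems.append(f"{name}: expected {expected.__name__}")
    for name in args:
        if name not in tool["required"]:
            problems.append(f"unexpected field: {name}")
    return problems

good = validate_call(
    BOOK_ROOM, {"room": "room-1", "slot": "2026-07-04T10:00", "guest": "alice"}
)
bad = validate_call(BOOK_ROOM, {"room": "room-1", "when": "tomorrow morning-ish"})
print(good)
print(bad)
```

The description is the part written for the model to read. The field table is the part written for the receiver to enforce. The check says nothing about whether the booking is a good idea, only whether the call has the agreed shape.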

The market has cast an early, weak vote. Not for the old stack. Not for the pure English cable either. It voted for speech surrounded by typed surfaces, lifecycle state, and security machinery. The engineers building for agents had every incentive to stop at speech. They did not stop at speech.

Hyrum's Law says the cable would not have stayed English anyway. Let agents interoperate through natural language long enough and someone will depend on every accidental phrasing, every timing quirk, every refusal habit. The protocol does not disappear. It goes unofficial.

English for the handshake. Typed payloads for the transaction.

That is not a stack pancaking. That is a stack sorting itself.

The end-to-end principle comes back to bite you

Reduce the middle to the minimum. Not to nothing.

The movement bills itself as a revolution in software engineering. I do not think the flattening question is a software engineering question at all. It is a systems question. I am not saying that to stay relevant. I am saying it because the systems answer is the only one that works, and it was written down forty years ago.

In 1984, Saltzer, Reed and Clark published the end-to-end argument. It is one of the few genuinely load-bearing ideas in system design. Read it quickly and it almost sounds like the flat stack. A function belongs at the endpoints, where the knowledge to implement it correctly actually lives, not buried in the layers below. Push the intelligence to the end. Keep the middle thin.

Read it properly and it denies the last step.

The end-to-end argument is a rule for placing functions across layers. It is not a license to delete the layers. The endpoints in that paper still sit on a stack. The stack is thin, it is well defined, and it keeps its contracts. The argument tells you to stop putting a function in a lower layer when that layer cannot implement it completely and correctly. It never tells you the lower layer disappears. Addressing, ordering, the deterministic floor, all of it stays. It just gets minimal.

That is the move the flat-stack story misses. You can push logic up when it does not belong where it sits. You can shrink a layer to the least it can be. What you cannot do is delete it and expect its guarantee to survive in the agent's memory. In other words, reduce the middle to the minimum, not to nothing.

So the honest version of the flat-stack instinct is not "collapse everything into the agent." It is "move each function to the layer that can own it, and make every other layer as thin as it can be while it still keeps its contract." That is a better vision, and it is not a pancake. It is a well-factored stack with a very smart endpoint on top of a very deterministic floor.

When Saltzer, Reed and Clark wrote this down in 1984, they were right. The rule has held through every architecture we have thrown at it, because it was never a rule about a technology. It was a rule about where guarantees can live. A model at the endpoint is still an endpoint. It does not get to repeal the argument that put it there.

The 2001 follow-up sharpens it. Blumenthal and Clark revisited the end-to-end arguments for an Internet whose endpoints could no longer be trusted. They did not say put everything in the middle. They said the argument must be re-applied under new pressures, because untrusted endpoints demand mechanism elsewhere, and too much mechanism in the core destroys what the argument protected. That tension cuts in my direction. An agent is a very smart endpoint. It is also an untrusted one.

Then it bills you for performance

The end-to-end argument has a second half that people forget.

Saltzer, Reed and Clark say a function may still live in a lower layer, even when the endpoints could do it, purely as a performance enhancement. Systems people have lived in that clause for forty years. Hardware offload. Kernel fast paths. Layer bypass. Correctness said the function could live at the top. Performance kept a copy at the bottom.

Tokens are the modern version of that clause.

Running a guarantee through a model is not free. It costs tokens, it costs latency, it costs energy, and it costs a probability of being wrong. The pancake assumes all of that trends to zero. It does not. It trends down. Down is not zero.

Say inference gets three or four orders of magnitude cheaper. It probably will. You still do not get to run a hot transactional path, a million times a second, through a language model. Going local just moves the bill from a cloud invoice to your own power budget. Cheaper never meant free. It meant we raised our ambitions to match.
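A back-of-the-envelope version of that claim, with loudly hypothetical numbers. Assume a model call costs $0.001 and takes 100 ms today, and assume cost drops by four orders of magnitude while latency drops by one. None of these figures come from a measurement; they only show the shape of the arithmetic.

```python
# All inputs are illustrative assumptions, not measurements.
cost_per_call_today = 1e-3      # dollars per model call (hypothetical)
latency_today_s = 0.100         # seconds per model call (hypothetical)
cost_improvement = 1e4          # "four orders of magnitude cheaper"
latency_improvement = 10        # assumed improvement in latency
calls_per_second = 1_000_000    # the hot transactional path from the text

cost_per_call = cost_per_call_today / cost_improvement
latency_s = latency_today_s / latency_improvement

dollars_per_day = cost_per_call * calls_per_second * 86_400
# Little's law: requests in flight = arrival rate * time each request takes.
calls_in_flight = calls_per_second * latency_s

print(f"cost per day: ${dollars_per_day:,.0f}")
print(f"concurrent model calls needed: {calls_in_flight:,.0f}")
```

Under these assumptions the path still costs about $8,640 a day and needs roughly 10,000 model calls in flight at once, before counting the probability of a wrong answer. Change the inputs and the totals move, but they do not reach zero.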

So even where a guarantee could technically live in the agent, most of the time it will not, for the same boring reason it lives in the fast path today. Performance is a constraint. It does not care how impressive the model is.

The floor under the floor

The Verification Floor. You can flatten a layer only as fast as you can verify what replaces it. The cost of removing a guarantee is bounded below by the cost of rebuilding that guarantee somewhere else.

It is the Rehydration Law again, generation cheap while understanding stays expensive, pointed at architecture instead of maintenance.

And it has a sting the flat-stack story never faces. The layers that resist flattening hardest are exactly the ones holding the promises. Glue flattens easily. It never promised anything. The substrate resists hardest, because it promises the most.

I owe my own side the same honesty. The floor is highest where the stakes are highest. Regulated cores, systems of record, anything with adversarial inputs and proof obligations to meet. So regeneration is cleanest in the low-stakes regime where you barely needed it, and it stalls in the high-stakes regime where it would pay the most. The inequality bites least where it matters most. That is not a win for the flat stack. It is the reason the substrate is the last thing standing.

So as code goes to zero, the stack does not flatten toward the agent. It sorts itself around the verification floor. What you can verify floats up. What you cannot sinks down and stays deterministic.
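One way to read the sort as a decision rule: a layer is a candidate for regeneration when generating plus verifying its replacement costs less than understanding what is already there. The sketch below assumes each layer can be given three rough cost estimates. The `Layer` type, the layer names, and the figures are hypothetical, in arbitrary units; only the comparison matters.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Layer:
    name: str
    generate: float    # cost to regenerate the layer
    verify: float      # cost to verify the regenerated layer keeps its promises
    understand: float  # cost to understand and maintain the existing layer

def sort_stack(layers: list[Layer]) -> dict[str, list[str]]:
    """Split layers into those that can be regenerated and those that stay deterministic."""
    result: dict[str, list[str]] = {"regenerate": [], "keep_deterministic": []}
    for layer in layers:
        if layer.generate + layer.verify < layer.understand:
            result["regenerate"].append(layer.name)
        else:
            result["keep_deterministic"].append(layer.name)
    return result

# Hypothetical costs in arbitrary units.
stack = [
    Layer("ui glue", generate=1, verify=2, understand=20),
    Layer("serializers", generate=1, verify=3, understand=10),
    Layer("booking commit point", generate=2, verify=500, understand=50),
]
sorted_stack = sort_stack(stack)
print(sorted_stack)
```

Generation is cheap for all three layers. What separates them is the verify column: the glue promised little, so checking it is cheap, while the commit point promised a lot, so checking a replacement costs more than keeping the original.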

The objection from tomorrow

The strongest objection left is tomorrow, and I would make it myself. This whole argument sounds like a report on today's models. Transformers hallucinate, so the substrate survives. Tomorrow's architecture might not. Neurosymbolic cores. Built-in world models. Inference that arrives with a proof attached. Change the machine, the objection goes, and everything changes again.

Look closely at what the objection is asking for: a system that holds a guarantee deterministically. Whose behavior can be verified from the outside. Cheap enough to live in the hot path. That is not a refutation of the substrate, but a job description for it.

The substrate is a role, not a technology. The moment some future architecture can refuse the double booking the way a transaction refuses it, durably, auditably, at fast-path cost, it has not deleted the bottom of the stack. It has been promoted into it. Security people have had a name for this role since the 1972 Anderson report. The reference monitor: always invoked, tamperproof, small enough to check. The agent era changes the workload, not the shape.
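A minimal sketch of that role, assuming an in-process monitor: every proposed effect passes through one small `decide` method, and each accept-or-reject decision is appended to a log that can be replayed later. The `BookSlot` effect, the `Monitor` class, and its policy are hypothetical. Tamper resistance, the hard part of a real reference monitor, is out of scope here.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class BookSlot:
    """A named, typed effect that an agent may propose."""
    room: str
    slot: str
    guest: str

@dataclass
class Monitor:
    taken: set[tuple[str, str]] = field(default_factory=set)
    log: list[tuple[BookSlot, bool]] = field(default_factory=list)

    def decide(self, effect: BookSlot) -> bool:
        """Policy: accept only if the room and slot are still free."""
        key = (effect.room, effect.slot)
        accepted = key not in self.taken
        if accepted:
            self.taken.add(key)
        self.log.append((effect, accepted))  # every decision is recorded
        return accepted

def replay(log: list[tuple[BookSlot, bool]]) -> bool:
    """Re-run the logged effects through a fresh monitor and compare decisions."""
    fresh = Monitor()
    return all(fresh.decide(effect) == accepted for effect, accepted in log)

monitor = Monitor()
a = monitor.decide(BookSlot("room-1", "2026-07-04T10:00", "alice"))
b = monitor.decide(BookSlot("room-1", "2026-07-04T10:00", "bob"))
print(a, b, replay(monitor.log))
```

The policy is a few lines that can be read and checked on their own, and the log lets an auditor reproduce each decision without asking the proposer what it was thinking.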

None of these rules mention transformers. Conservation is about problems and guarantees. The end-to-end argument has outlived every architecture since 1984. The Verification Floor touches the models only as a dial, not a hinge. Better models make verification cheaper, the floor drops, and more of the middle flips from maintained to regenerated. The sort re-runs. The shape survives.

So yes. Tomorrow's architecture will change everything again. It will move the line. It will not remove it.

This can fail, and here is how. I am wrong if high-stakes systems of record start accepting irreversible external effects where the only enforcement of the invariant is model behavior. No separately specified policy. No independent checker. No transactional authority. No replayable accept-or-reject decision. If that future arrives, sorting by trust was a mirage.

What software actually looks like

Here is my answer to the question the talk and I were both trying to answer: what software looks like when code is free.

Not flat. Sorted by trust.

Sorted by what you can afford to trust.

Intent on top. Formal, versioned, expensive to change, the most political artifact in the building. This is where the specification went. Not into English. Into something you can sign.

Code in the middle. Cheap, disposable, regenerated the moment generating plus verifying costs less than understanding. This is the only layer the flat-stack story called correctly, and it called it by accident. Code does not disappear. Code stops being sacred.

The substrate on the bottom. Systems of record. Transactional cores. The deterministic floor that holds the guarantees no agent has yet earned the right to hold. The agent sits on top of it and asks nicely.

Governance around all of it. Verification, provenance, attestation, audit. The mechanisms exist today in fragments. As a tier, it barely exists, and it becomes the product, because when code is free the only scarce thing left is a reason to believe the output. The Fork Winter was the story of what happens when this tier shows up late.

Count the trust boundaries, not the modules. There may be fewer durable layers and far less hand-written code. The boundaries that remain get harder, more explicit, and more expensive, because they are where generated behavior becomes trusted effect.

Flattening is real, and in the low-stakes glue it will win big. It stops at the first thing someone must verify, audit, reverse, or sue over. The mistake was never predicting flattening. The mistake is predicting flattening all the way down. That future is the only arrangement where the guarantees have nowhere to live, and I do not think it survives contact with accountability.

The movement is right that a version of software engineering is ending. It is wrong about the shape of what comes next. Not one English-speaking agent where a stack used to be. Intent hardens above. Code churns in the middle. The floor holds below. Governance wraps it all, because users and developers still need something to believe in.

The revolution is real. It just is not a flattening. It is a sorting.

Code is becoming free.

Structure is becoming expensive.

The stack will not flatten all the way down. It will sort itself by what you can afford to trust.
