Something shifted in interface work over the past year, and I think it is bigger than the demos suggest. Interfaces are starting to be decided at runtime. An agent mid-task can now return not just text but a working interface: a form when it needs structured input, a preview when it needs confirmation, controls when a task forks. The protocol work that landed this winter made it official: tools can hand back interactive components, and a chat window quietly becomes a place where applications assemble themselves around a conversation.

The pattern has a name, generative UI, and this year it crossed from experiment to production. I have now used products where no designer composed the screen I was looking at. A system decided, in the moment, what I needed to see. Sometimes it was right, and the experience was startlingly good, the interface equivalent of a sentence finishing exactly the way you needed it to.

I build the opposite. My learning engine generates content offline and serves it through a deterministic client, every screen composed from a fixed set of renderers I built by hand. So this post is me taking the other side seriously, because the strongest version of the generated-interface argument deserves better than the strawman it usually gets, and because I want to locate where my own choice would be wrong.

The strong case for deciding at runtime

The honest argument for generative UI is not novelty but fit. Predefined screens are a bet placed months in advance about what a person will need. Every designer knows how often that bet misses; we just bury the misses in settings pages, empty states, and support docs. A runtime-decided interface places the bet at the moment of need, with the context in hand. The user who needs three fields gets three fields, the user untangling a mess gets the untangling view, and nobody pays for screens designed for someone else's situation.

There is also a quieter argument I find harder to dismiss. Hand-composed interfaces scale by headcount. Every state, every variant, every edge needs a person to have anticipated it. Runtime composition scales by capability, and capability is compounding right now while headcount is not. A year ago generated screens were ugly and wrong. This spring they are mostly fine, and mostly fine on a curve that steepens is how most disruptions I have lived through introduced themselves.

What determinism is still for

Against that, here is what a fixed interface buys, stated as concretely as I can.

It buys sameness, and sameness is not a small thing. A person's competence with an interface is muscle memory accumulated across visits. When the screen reassembles itself each time, every visit is a first visit. For a tool someone uses twice a year, that costs little. For practice, for work, for anything where fluency is the point, it taxes exactly the thing the product exists to build.

It buys testability. I can put a fixed renderer in front of a hundred children and know the thing I tested is the thing that ships. A generated interface is a distribution, and you do not usability-test a distribution, you sample it and hope. The same goes for accessibility, which is hard-won work on my renderers, and I do not currently believe a runtime process holds that bar reliably. I hold that belief loosely and expect it to age.

And it buys accountability. When something goes wrong on a fixed screen, there is an artifact to point at: version, diff, author. When a runtime-composed screen misleads someone, what exactly failed, and who reviews it? For my product the answer has to be crisp, because the user is seven years old.

Where each belongs, my current map

The split I keep arriving at is consequence and repetition. High-consequence interfaces (money moving, benefits applications, a child's learning) want determinism near the user even when generation runs underneath. High-repetition interfaces want stability because fluency compounds. Low-consequence, low-repetition, high-variance situations (exploration, triage, one-shot tasks) are where runtime composition already wins today, and that territory is large and growing.
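
Written down as a rule of thumb, the map might look like the sketch below. The names (`Composition`, `suggest_composition`) and the reduction of each axis to a boolean are assumptions made for illustration; real products sit on a spectrum, and the case the map leaves open is returned as undecided rather than guessed at.

```python
from enum import Enum


class Composition(Enum):
    DETERMINISTIC = "deterministic"  # fixed, hand-composed screens near the user
    RUNTIME = "runtime"              # interface composed at the moment of need
    UNDECIDED = "undecided"          # the map above does not cover this case


def suggest_composition(
    high_consequence: bool,
    high_repetition: bool,
    high_variance: bool,
) -> Composition:
    """Hypothetical helper that encodes the consequence/repetition split."""
    # Either high consequence or high repetition argues for a stable interface.
    if high_consequence or high_repetition:
        return Composition.DETERMINISTIC
    # Low consequence, low repetition, high variance: runtime composition wins.
    if high_variance:
        return Composition.RUNTIME
    # Low on all three axes: the map takes no position here.
    return Composition.UNDECIDED


# A children's learning product: high consequence, high repetition.
print(suggest_composition(True, True, False).value)
# A one-shot triage task: low consequence, low repetition, high variance.
print(suggest_composition(False, False, True).value)
```

Generation can still run underneath a deterministic result; the rule only describes what sits directly in front of the user.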

My engine sits at the intersection of high consequence and high repetition, which is why generation happens in my factory and never in the child's client. The interface variety a learner experiences comes from config: many renderers, many scenarios, one stable grammar. I think that is right for this product. I would not generalize it into a law.
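
A minimal sketch of what "variety from config, one stable grammar" can look like in code. Everything here is hypothetical and illustrative rather than the real engine: the renderer names, the scenario fields, and the string output standing in for an actual screen.

```python
from typing import Callable, Dict


# Hypothetical hand-built renderers. Each one turns a config into a screen,
# represented here as a plain string so the example stays self-contained.
def render_number_line(cfg: dict) -> str:
    return f"[number-line {cfg['start']}..{cfg['end']}] {cfg['prompt']}"


def render_multiple_choice(cfg: dict) -> str:
    options = " | ".join(cfg["options"])
    return f"[choice] {cfg['prompt']} ({options})"


# The fixed grammar: the client can only draw what is registered here.
RENDERERS: Dict[str, Callable[[dict], str]] = {
    "number_line": render_number_line,
    "multiple_choice": render_multiple_choice,
}


def render_screen(scenario: dict) -> str:
    """Compose a screen from an offline-generated scenario config."""
    kind = scenario["renderer"]
    if kind not in RENDERERS:
        # Generated content cannot introduce a new interface shape at runtime.
        raise ValueError(f"unknown renderer: {kind!r}")
    return RENDERERS[kind](scenario["config"])


# A scenario as it might arrive from the offline generation step.
scenario = {
    "renderer": "number_line",
    "config": {"start": 0, "end": 1, "prompt": "Place 1/2"},
}
print(render_screen(scenario))  # prints: [number-line 0..1] Place 1/2
```

The same scenario always produces the same screen, and a scenario that names an unregistered renderer fails loudly instead of improvising. That is the property the tested-thing-is-the-shipped-thing argument depends on.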

Where I would be wrong

If runtime composition gets reliably testable, my second objection, the one about testability, dissolves, and there are serious people working on exactly that. And if it turns out that adaptivity at the representation layer matters pedagogically, that a specific child learns fractions best through an interface shape no fixed renderer set anticipated, then my stable grammar becomes the rigidity I built this product to escape, and the generated approach holds the advantage where I least want it to. I do not see evidence of that yet. I am watching for it, and saying so here feels like the appropriate insurance.

What seems hardest to argue with is the direction. The interface is joining content, copy, and code as something a system can produce on demand, and the designer's artifact is drifting from the screen itself toward the rules, contracts, and constraints the screens get produced within. That is not a demotion. Constraint design is harder than screen design, the consequences of getting it wrong are bigger, and much of my profession, me included, is still early in practicing for it.