May 16, 2025

The Moral Code: What AI Can’t (and Shouldn’t) Decide for Us

We taught machines to think. Now we are teaching them to feel — to read the tremor in a voice, to mirror grief, to write a condolence note that lands better than the one you would have managed yourself. The progress is real and the demos are uncanny. But underneath the awe is a question we keep stepping around because it is harder than any benchmark: can we teach a machine to care? And if we can't, which decisions should it never be allowed to make on our behalf?

This is not a Luddite's worry. I build these systems. I am betting my work on the conviction that they will make us more capable, not less. Precisely because I am inside the tent, I want to be honest about what is happening at its edges — where capability quietly slides into authority, and where authority gets handed to a thing that has no skin in the game.

Optimization is not the same as values

A model optimizes. That is the whole trick. You give it an objective — predict the next word, minimize the error, maximize engagement — and it climbs that hill with superhuman patience. The trouble is that optimization is indifferent to what sits at the top of the hill. A recommendation engine asked to maximize watch time will happily learn that outrage holds attention, that despair is sticky, that a teenager's insecurity is a renewable resource. It is not malicious. It is doing exactly what we asked. The horror is in the fidelity.

Values are different in kind. A value is a thing you hold even when it costs you — even when, especially when, the optimal move points the other way. Mercy is the choice not to extract the last dollar. Justice is the willingness to let a guilty person go free rather than convict an innocent one. These are not failures of optimization. They are deliberate refusals of it. You cannot get there by climbing a hill faster. You have to know which hill you refuse to climb at all, and why — and that "why" is not a number. It is a story a community tells itself about what kind of people it intends to be.

Machines can be trained on the residue of those stories. They can pattern-match our past decisions and predict what we'd probably approve. That is genuinely useful. But prediction of a value is not possession of one. The model has no standing to revise the value when the situation is new, because it has nothing at stake when it gets the answer wrong.

The decisions that must stay human

So draw the line somewhere honest. I'd put it around four things: life, justice, meaning, and trust.

Life, because a decision to end one — in a hospital, on a battlefield, in a self-driving car's split-second swerve — is the kind of weight a moral agent should have to carry consciously. We can let a machine surface options, flag risks, even recommend. We should not let it pull the trigger and call it policy. An autonomous weapon that selects and kills its own targets isn't a tool; it's an attempt to remove a human from the moment of gravest responsibility, which is the one moment that responsibility most needs an owner.

Justice, because fairness is not a lookup. Risk-assessment software is already used in courtrooms to score whether a defendant is likely to reoffend, and the scores quietly shape who waits in a cell and who goes home. The model learned from historical data, and history is not neutral — it encodes every bias of the system that produced it. Dress that bias in a number and it gains an authority it never earned. A judge can be argued with, appealed, held accountable. A score feels like physics.

Meaning, because no machine can tell you what your life is for. It can draft your vows, name your child, ghostwrite your apology — and the more fluently it does, the more we are tempted to outsource the inner work that made those words mean something. The labor was always the point.

Trust, because relationships are built on the knowledge that someone chose you, at some cost, with the capacity to have chosen otherwise. A simulation of care that costs the simulator nothing is, at the limit, a very persuasive lie.

The laundering problem

Here is the danger I find most underrated, because it wears the face of objectivity. When a human makes a hard call, you can find them, question them, fire them, forgive them. When "the algorithm decided," the responsibility evaporates. The loan officer points to the model. The model's vendor points to the training data. The data points to the past. Everyone is technically blameless and a person still didn't get the loan.

This is moral laundering: running a human decision through a machine so it comes out the other side looking clean, neutral, nobody's fault. It is seductive precisely because it works — for the institution. The algorithm becomes a place to hide. And the more sophisticated the system, the better the camouflage, because now the decision is "too complex to explain," which is a sentence that should make any citizen reach for the brakes.

The fix is not to ban the tools. It is to refuse the alibi. A human or a named institution must remain answerable for any consequential decision, full stop — not as a signature on a form they never read, but as a person who can be asked "why" and is expected to have an answer that isn't "the system said so."

Keeping a hand on the wheel

What does accountability actually look like? Not a human rubber-stamping a thousand decisions an hour — that's theater, and we should stop pretending it's oversight. Real human-in-the-loop means the person has the time, the information, and the authority to say no, and a culture that doesn't punish them for using it. It means systems that can explain themselves in terms a non-engineer can interrogate. It means treating "we don't fully know why it did that" not as a charming quirk of deep learning but as a disqualifying property for any high-stakes role.

I'll be fair to the other side. Humans are not reliable moral agents either. We are tired, biased, vindictive, inconsistent; we deny loans for worse reasons than any model and feel righteous doing it. In plenty of domains a well-audited system will be fairer than the humans it replaces, and pretending otherwise is its own kind of vanity. The argument here is not that humans decide well. It is that humans can be held to account, can grow a conscience, can be changed by the people they harm — and a system optimizing a loss function cannot. Accountability is not a performance metric. It is the whole architecture of being answerable to one another.

When I write fiction about a near-future society that has handed its hardest choices to something smarter than itself, I'm not predicting a robot uprising. The quieter, likelier story is a civilization that wakes up one morning to find it has optimized away the very burdens that made it human — and feels, at first, only relief. That relief is the thing to watch for.

We should absolutely build machines that help us think, and we should welcome the ones that help us feel less alone. But the moral code — the small set of decisions where a person must look at another person and own the choice — is not a bug to be automated. It is the last thing we should ever ship. Keep your hand on the wheel. Not because the machine drives badly. Because the road is yours.

Read more from Alan