The Wrong Executor Is a Cost Error, Not a Tech Choice

You do not send every patient to a surgeon. Not because a surgeon is less capable than a general practitioner; for the right problem, that extra training is exactly what saves you. It is because the condition in the room does not require that cost. You would be paying for twelve years of specialization that the case in front of you never asks for.

Automation works the same way. We have three kinds of executor now: software robots, AI agents, and people. Each one has a price. And the question almost everyone asks is "which one is smarter?" That is the wrong question. The right one is narrower and less comfortable: which executor matches the cost of this specific step?

The industry pushes you toward the wrong question on purpose. One major RPA platform describes its history as a ladder: "first there was robotic process automation… then came AI-powered automation… and now, there's agentic automation," the "latest step in automation's evolution." Another frames its previous generation as "limited to relatively simple tasks" where "complexity, decision-making, interpretation… were beyond its capabilities." Read enough of this and you start to believe newer means better, and that putting an agent on a step is a kind of promotion.

It is not a promotion. It is a purchase. And the task, not the executor, sets the price you are allowed to pay.

I wrote earlier about how robots and agents work together, not one instead of the other: "RPA and AI Agents: Not Instead, But Together." That piece left a question open: if they work side by side, how do you decide who does what? This is the answer. You decide step by step, on cost. And you can get it wrong in both directions.

A judgment step dressed as a workflow

Here is the first direction, from a real project of ours.

A company gets a request for customer account information. To answer it, you have to investigate: search one system, and depending on what you find there, move to another system, and sometimes go back and re-query the first. The path is not fixed. It branches on what the data says at each step.

We built a robot for it. A lot of analytical work went into the branching scenario — mapping the cases, the conditions, the loops. On paper it looked like a workflow. In production it was brittle. The data came in with small variations the scenario did not expect. The character of the queries shifted. UI elements were not always where the bot looked for them. So it failed, regularly.

The root error was not bad engineering. It was that the task needed interpretation, and we had given it determinism. Forrester has a rough rule for this, Craig Le Clair's "Rule of Five": a process fits RPA when it has fewer than five decision points, touches fewer than five applications, and runs in under 500 clicks. Our investigative flow failed all three.
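The rule is simple enough to write down as a check. This is a minimal sketch, not Forrester's tooling: the class, the field names, and the sample numbers are all hypothetical, and the third threshold is simply the 500 figure from the rule as stated above.

```python
from dataclasses import dataclass


@dataclass
class ProcessProfile:
    """Rough shape of a process. Field names are illustrative only."""
    decision_points: int
    applications: int
    ui_steps: int  # the third Rule of Five count (limit: 500)


def fits_rule_of_five(p: ProcessProfile) -> bool:
    """True only when all three counts are under their thresholds."""
    return p.decision_points < 5 and p.applications < 5 and p.ui_steps < 500


# Hypothetical numbers for a branching investigative flow like ours.
investigation = ProcessProfile(decision_points=9, applications=6, ui_steps=800)
# Hypothetical numbers for a simple copy-and-file task.
filing = ProcessProfile(decision_points=2, applications=2, ui_steps=120)

print(fits_rule_of_five(investigation))
print(fits_rule_of_five(filing))
```

A process that fails even one of the three counts is a warning sign. One that fails all three, as ours did, was never a robot task to begin with.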

What made it expensive is the part you do not see on any dashboard. A robot on a judgment step does not fail loudly and stop. It exports the hard cases to humans, quietly. The straight-through rate that was promised at 75 to 85 percent lands at 30 to 50 percent in practice. The rest, half the cases or more, walks back to a person's desk, and those people show up as general operations time, not as an automation error line. You automated the easy part and kept paying for the hard part, just under a different name.

A deterministic step handed to judgment

Now the other direction.

Ask anyone in operations a simple question: would you rather have your salary calculated by an AI agent that can make a decision error and whose logic is not fixed — or by a robot that runs deterministic, auditable rules every time? I have asked this in the field. Nobody picks the agent.

This is not about trust in technology. Payroll has a right answer. The rules are written down. What you want from that step is the same output every run, and a clean audit trail when someone asks why a number is what it is. An agent introduces non-fixed logic where you needed fixed logic, and now you also have to check its decision, not just check that it ran. You are paying to add a question that the step never had.

This is the sharpest way I can put it: an agent on a deterministic step is a risk of losing where you could be earning.

And it is more expensive in raw cost too, in a way most people miss because they price agents like a cloud bill. In a closed corporate setup — your own robots, your own agents, no public AI touching your data — there is no per-token price. The agent's cost is infrastructure: GPU and CPU hardware you bought and amortize, power, cooling, and the engineers who keep it running. An 8-GPU A100 server costs $75,000–$150,000 to buy. Spread that over three years and add power, cooling, and a half-FTE AI/ML engineer to keep it running, and the all-in annual cost reaches $90,000–$160,000. H100-class hardware roughly doubles that figure. (Smaller quantized models on CPU bring the floor down further — but the structure I am about to describe holds at every tier.)

The structure is the trap. That cost is fixed. It does not drop when the server sits idle. And idle is the normal state: real production GPU utilization sits at 20 to 65 percent, while you need 60 to 80 percent just to break even against renting from the cloud. So the cost per execution is not a flat rate; it is a fixed pile divided by however much you actually run. Put a deterministic, low-volume step on that infrastructure and you are paying agent prices for robot work, plus the review labor, plus the decision risk you introduced for free.
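To see what idle capacity does to the price, here is a small sketch. It assumes "utilization" means the fraction of available GPU-hours that do real work, and it uses the low end of the annual range above ($90,000) on an 8-GPU server. The function name and that reading of utilization are my assumptions, not a standard metric.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours


def cost_per_busy_gpu_hour(annual_fixed_cost: float, gpus: int, utilization: float) -> float:
    """Fixed annual cost spread over only the GPU-hours that do real work."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    busy_hours = gpus * HOURS_PER_YEAR * utilization
    return annual_fixed_cost / busy_hours


# Same server, same invoice, different utilization.
for utilization in (0.20, 0.65, 0.80):
    rate = cost_per_busy_gpu_hour(90_000, gpus=8, utilization=utilization)
    print(f"{utilization:.0%} busy -> ${rate:.2f} per useful GPU-hour")
```

Nothing on the invoice changes between those three lines. Only the denominator does.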

The three cuts that actually decide it

You do not need a full cost model at the whiteboard. Three cuts settle most steps.

Utilization — how busy is this step? A robot license runs about $5,000 to $15,000 a year per bot, plus 18 to 25 percent for maintenance. That price is the same whether the bot runs eight hours a day or twenty minutes. A bot that fills half its license capacity has effectively doubled its cost per run. Agent infrastructure has the same idle trap through a different door — fixed amortization and a fixed half-engineer, divided by throughput that is rarely near capacity. So the first question is honest volume. A step that fires monthly cannot carry either fixed cost well, and that pushes it toward a human or toward shared capacity, not a dedicated executor.

Oversight — who has to check, and check what? This is the cut people forget, and it is the most expensive one. A deterministic robot needs run monitoring: did it complete, did it break an SLA. The output was decided in advance, so nobody grades the decision. An agent needs decision review: was this output right in this ambiguous case, and that takes a domain expert, not an operations monitor. Even well-tuned agent deployments escalate cases to a human — from around 2% in a mature customer-service deployment to 12–18% in general enterprise practice — and you staff roughly one reviewer per 500 to 1,000 interactions a day as a standing cost. EY now lists "governance burden — guardrails, compliance, human-in-the-loop reviews" as its own line in agent total cost. That line simply does not exist for a deterministic robot. If your human review rate is not falling over time, you did not buy automation; you bought a more expensive way to do the same reviewing.
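The oversight line can be sized on a napkin. The sketch below uses only the ratios quoted above (one reviewer per 500 to 1,000 interactions a day, escalation rates of 2% to 18%). The daily volume of 4,000 interactions is a hypothetical example, and the function names are mine.

```python
import math


def reviewers_needed(interactions_per_day: int, interactions_per_reviewer: int) -> int:
    """Standing review headcount, rounded up to whole people."""
    return math.ceil(interactions_per_day / interactions_per_reviewer)


def escalations_per_day(interactions_per_day: int, escalation_rate: float) -> float:
    """Cases that land on a human's desk each day."""
    return interactions_per_day * escalation_rate


volume = 4_000  # hypothetical daily interactions for one agent-run step

print("reviewers, best case: ", reviewers_needed(volume, 1_000))
print("reviewers, worst case:", reviewers_needed(volume, 500))
for rate in (0.02, 0.12, 0.18):
    print(f"escalations at {rate:.0%}: {escalations_per_day(volume, rate):.0f} per day")
```

Multiply the headcount by a loaded salary for a domain expert and you have the line that a deterministic robot never carries.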

Unit economics — what does one execution cost at this volume? On-prem, one execution costs the whole fixed base — hardware, power, oversight — divided by how many times the step runs. Double the volume and the per-execution cost roughly halves, with no change to any invoice. Halve the volume and it doubles. This is why the same agent can be cheap on a high-volume interpretive step and absurd on a low-volume one. Run the division before you assign the executor, not after.
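The division itself is one line. In this sketch the $120,000 fixed base is a hypothetical figure inside the annual range quoted earlier, and the two volumes are made-up examples of a busy interpretive step and a monthly one.

```python
def cost_per_execution(annual_fixed_cost: float, executions_per_year: int) -> float:
    """Whole fixed base divided by how often the step actually runs."""
    if executions_per_year <= 0:
        raise ValueError("executions_per_year must be positive")
    return annual_fixed_cost / executions_per_year


fixed_base = 120_000  # hypothetical: hardware, power, oversight for one year

busy_step = cost_per_execution(fixed_base, 1_000_000)  # high-volume step
monthly_step = cost_per_execution(fixed_base, 12)      # fires once a month

print(f"high-volume step: ${busy_step:.2f} per execution")
print(f"monthly step:     ${monthly_step:,.2f} per execution")
```

Same agent, same hardware, same team. The only input that moved is how often the step runs.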

The full cost model has six dimensions: capex, software, launch labor, operations, decision oversight, and error cost. The three cuts above are the highest-signal subset for a whiteboard decision. The rest belong in your spreadsheet; these three belong on the wall.

Decompose your own process

So take one real process and tag every step against two screens.

First screen, determinism. Does this step have fixed, auditable rules, with one correct output that you could explain to an auditor? If yes, it is robot territory, and you should not dress it up as anything more. If the step branches on interpretation, carries real decision content, or reads unstructured input that varies, it is not deterministic, and a robot here will export exceptions to your people. That was the account-investigation case exactly: it looked like a workflow, it scored as judgment, and we paid for the gap.

Second screen, the three cuts, but only for the steps that failed the first screen. A non-deterministic step is not automatically an agent. Run utilization: is the volume high enough to carry fixed infrastructure? Run oversight: will every output need a human to review the decision, and can you afford that review at this rate? Run unit economics: what is one execution worth here? If the volume is thin or the oversight cost swallows the saving, the honest answer is a human, not an agent. Payroll never reaches this screen, and that is by design: it is deterministic, so it is settled at the first screen, and the agent never gets near it.

Walk a process through both screens and something useful happens. Each step lands on the cheapest executor that still clears the step's required bar: robot for the fixed-rule steps, agent for the high-volume interpretive ones where review cost is justified, human for the judgment that is too rare or too consequential to hand off. Gartner expects more than 40 percent of agentic AI projects to be canceled by 2027, mostly on cost and weak controls. My read is that a lot of those are steps that failed the second screen and got an agent anyway.
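The two screens can be sketched as a routing function. Treat this as an illustration of the logic, not a cost model: every field name and number is hypothetical, the three cuts are collapsed into one per-execution comparison, and it ignores error cost and the "too consequential" case entirely.

```python
from dataclasses import dataclass
from enum import Enum


class Executor(Enum):
    ROBOT = "robot"
    AGENT = "agent"
    HUMAN = "human"


@dataclass
class Step:
    name: str
    deterministic: bool              # screen 1: fixed, auditable rules?
    executions_per_year: int         # utilization cut
    agent_fixed_cost_per_year: float # infrastructure share for this step
    review_rate: float               # fraction of outputs a person must review
    review_cost: float               # cost of one human decision review
    human_cost: float                # cost of a person doing the step by hand


def assign_executor(step: Step) -> Executor:
    # Screen 1: deterministic steps stop here.
    if step.deterministic:
        return Executor.ROBOT
    # Screen 2: a step that never runs cannot carry fixed infrastructure.
    if step.executions_per_year <= 0:
        return Executor.HUMAN
    agent_cost = (
        step.agent_fixed_cost_per_year / step.executions_per_year
        + step.review_rate * step.review_cost
    )
    return Executor.AGENT if agent_cost < step.human_cost else Executor.HUMAN


payroll = Step("payroll run", True, 12, 0.0, 0.0, 0.0, 0.0)
investigation = Step("account investigation", False, 200_000, 120_000, 0.15, 4.0, 6.0)
rare_dispute = Step("rare dispute", False, 12, 120_000, 0.15, 4.0, 6.0)

for step in (payroll, investigation, rare_dispute):
    print(step.name, "->", assign_executor(step).value)
```

The review_cost input is the number the sketch cannot supply for you; everything else is a count or an invoice.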

I will be honest about one thing I have not solved. The second screen needs a number — the cost of one human decision-review on a given step — and that number is the hardest to pin down. It hides inside operations headcount and quality teams, attributed to no single system.
