RPA and AI Agents: Not One or the Other, but Together

There's a question I keep hearing: "Should we go with RPA or AI agents?" It assumes you have to pick one. In practice, they work in different parts of the same process, and the more useful question is where exactly each one belongs.

Here's an analogy that helped me think about this clearly.

In an insurance company, the underwriter and the claims processor do completely different work. The underwriter looks at client history, weighs factors, makes a call that isn't in any manual. The processor follows a checklist: fill these fields, generate the document, send it to the client. One is judgment. The other is execution. Nobody asks which of them is "more efficient" — because they're not doing the same thing.

Automation tools work the same way. For any step in a process, one question is enough: is there a known correct answer here, before anyone looks at it?

Where a robot is simply better

Loading incoming documents into an ERP via a standard routing rule. Validating policy fields — amounts, dates, identifiers. Reconciling payment registers against ledger entries.

For all of this: the rule is clear, the sequence is fixed, the result is predictable. RPA runs it fast, leaves a full audit trail, and doesn't need to think. Bringing an agent into these steps means adding cost and unpredictability where neither is needed.

People sometimes underestimate how much of business is actually like this. Deterministic operations are not going away.

What an agent does — and how it actually fails

A client sends an email. From the text alone, it's not obvious whether this is a complaint about a denied claim, a status check, or a question about policy terms. A human reads it, understands the context, routes it to the right team.

An agent does the same: reads the text, classifies it based on instructions you've written, routes it — with a log of every decision.

This is not a script with if-else branches. It's a language model that acts on the meaning of the text, not exact word matches. A business analyst configures it through plain-language instructions — no code required. But it does require iteration. You check edge cases, refine the instruction, check again. It takes time.
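To make that concrete, here is a minimal sketch of the plumbing around such an agent. Everything in it is an assumption for illustration: the instruction text, the category and team names, and the `model` callable, which stands in for whatever language model you actually use. The point is only that the analyst edits the instruction, while the code around it stays small and logs every decision.

```python
import json
from typing import Callable, List

# Hypothetical plain-language instruction a business analyst would iterate on.
INSTRUCTION = (
    "Classify the client email as one of: complaint, status_check, "
    "policy_question. If you are unsure, answer: unclear."
)

# Hypothetical mapping from category to the team that handles it.
ROUTES = {
    "complaint": "claims_disputes",
    "status_check": "claims_status",
    "policy_question": "policy_support",
}

def route_email(text: str, model: Callable[[str, str], str], log: List[str]) -> str:
    """Ask the model for a category, route it, and log the decision."""
    label = model(INSTRUCTION, text).strip().lower()
    # Anything outside the known categories goes to a person, not to a guess.
    team = ROUTES.get(label, "human_triage")
    log.append(json.dumps({"label": label, "team": team, "chars": len(text)}))
    return team
```

Refining the agent then mostly means editing `INSTRUCTION` and re-checking edge cases, not touching the routing code.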

How agents fail is worth understanding before you deploy one. A robot fails loudly — it throws an error and stops, you notice immediately. An agent can make a wrong decision that looks completely reasonable. That's a different kind of risk. At the start, manually review a sample of decisions, define the cases where the agent should escalate to a human, and set that boundary explicitly. Don't skip this.
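One way to write that boundary down, as a rough sketch rather than a recipe: the category name, the 0.8 floor, and the 10% sample rate are all made-up values, and the `confidence` field assumes your agent reports some score, which is not always well calibrated.

```python
import random
from dataclasses import dataclass

@dataclass
class Decision:
    case_id: str
    label: str         # category chosen by the agent
    confidence: float  # hypothetical self-reported score in [0, 1]

# Assumed boundary: categories a human always sees, plus a score floor.
ALWAYS_ESCALATE = {"complaint_denied_claim"}
MIN_CONFIDENCE = 0.8

def needs_human(d: Decision) -> bool:
    """True when the decision falls outside what the agent may settle alone."""
    return d.label in ALWAYS_ESCALATE or d.confidence < MIN_CONFIDENCE

def review_sample(decisions, rate=0.1, seed=0):
    """Pick a reproducible sample of auto-handled decisions for manual review."""
    auto = [d for d in decisions if not needs_human(d)]
    k = max(1, round(len(auto) * rate)) if auto else 0
    return random.Random(seed).sample(auto, k)
```

The sample is drawn only from decisions the agent handled alone, because those are the ones nobody else will look at.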

Why an agent doesn't replace RPA

An agent can make decisions. On its own, it cannot execute them.

It has no built-in way to log into an ERP, update a record, or click a confirmation button in a core banking system. The agent decides; something else has to act.

Each enterprise system is its own interface — its own logic, its own versioning, its own edge cases. Integration with SAP, Oracle, or a legacy core platform doesn't get simpler because language models are improving. That's still an engineering problem, and it's exactly the problem RPA was built to solve.

So "replacement" is the wrong frame. They operate at different layers.

The layer between them: document processing

Between the agent and the robot, there's often a third piece.

IDP — Intelligent Document Processing — converts unstructured content into structured data. It extracts fields from scanned forms, PDFs, image attachments, and passes them forward in a usable format.

An agent can technically read a document itself. For one-off tasks, that's fine. For volume processing, it's slower and more expensive, and its accuracy is harder to measure. IDP is purpose-built for this: it is trained on specific document types, reports a confidence score per field, and flags the fields it isn't sure about. The agent then works with data that's already clean.

Think of it as a preparation layer. It makes everything downstream easier.
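A small sketch of what that preparation step might hand downstream. The field names, the 0.9 threshold, and the shape of the output are assumptions for illustration; real IDP products expose this in their own formats.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

@dataclass
class Field:
    name: str
    value: str
    confidence: float  # per-field score reported by the extraction step

THRESHOLD = 0.9  # assumed cut-off; tune it per document type

def split_fields(fields: List[Field], threshold: float = THRESHOLD) -> Tuple[Dict[str, str], List[str]]:
    """Separate fields that can be trusted from fields a person should check."""
    clean = {f.name: f.value for f in fields if f.confidence >= threshold}
    flagged = [f.name for f in fields if f.confidence < threshold]
    return clean, flagged
```

The agent only ever sees `clean`; anything in `flagged` goes to manual review first.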

What this looks like in practice

Claims settlement. Documents arrive in pieces — initial filing today, assessor's report in three days, additional evidence a week later.

The robot collects each document as it arrives. IDP extracts the data, but one scan is low quality: confidence drops below the threshold, and that document goes to manual review. The agent classifies the case as standard or non-standard. A standard case proceeds automatically, and the robot enters the decision into the core system. In a non-standard case, say the client is disputing the damage estimate and their account conflicts with the assessor's report, the agent generates a request for additional documentation and passes it to a specialist. The specialist decides. The robot executes.
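The handoffs in this walkthrough can be sketched as a single routing function. The step names and the 0.85 threshold are hypothetical; the only thing the sketch is meant to show is the order of the checks.

```python
def next_step(doc_confidence: float, case_type: str, threshold: float = 0.85) -> str:
    """Decide who handles a claim next, in the order described above."""
    # 1. A low-quality extraction never reaches the agent.
    if doc_confidence < threshold:
        return "manual_review"
    # 2. Standard cases go straight to execution by the robot.
    if case_type == "standard":
        return "robot_enters_decision"
    # 3. Everything else is prepared by the agent and decided by a specialist.
    return "specialist_decides"
```

Whatever the specialist decides still goes back to the robot for entry into the core system.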

Pure RPA automation typically reaches 60–70% of a process before hitting the exceptions. Add the agent layer, and in mature implementations you're looking at 90–97% straight-through. The agent handles the judgment points; the robot still does the execution.

The architectural inversion: when the agent is in charge

Everything above assumes RPA is the primary orchestration layer — the robot calls the agent when it hits a judgment point. That works well for existing automation.

But there's a different pattern emerging. Agent-first workflows, where the agent orchestrates and the robot is a tool it calls.

We built an MCP Server for Primo RPA for exactly this. MCP (Model Context Protocol) is an open standard that lets any MCP-compatible client (Claude, Cursor, or your own agent) connect to the RPA platform and use robots as callable tools. The agent decides when to delegate structured execution; the robot runs it and returns the result.

This inverts the usual direction: not "robot calls agent" but "agent calls robot."
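To show the direction of the call, here is a toy sketch in plain Python. It is not the MCP wire protocol, not an MCP SDK, and not the Primo RPA API: the registry, the `update_claim_status` robot, and its arguments are all invented for illustration. A real server would start an actual robot run instead of returning a canned result.

```python
from typing import Callable, Dict

# Hypothetical registry of robots exposed to the agent as named tools.
ROBOTS: Dict[str, Callable[..., dict]] = {}

def robot(name: str):
    """Register a function as a callable tool under the given name."""
    def register(fn):
        ROBOTS[name] = fn
        return fn
    return register

@robot("update_claim_status")
def update_claim_status(claim_id: str, status: str) -> dict:
    # Stand-in for a real robot run against the core system.
    return {"ok": True, "claim_id": claim_id, "status": status}

def call_tool(name: str, **args) -> dict:
    """What the agent side does: pick a tool by name and pass structured arguments."""
    if name not in ROBOTS:
        return {"ok": False, "error": f"unknown tool: {name}"}
    return ROBOTS[name](**args)
```

The agent never touches the core system itself; it only names a tool and supplies arguments, and the execution stays deterministic.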

Why does it matter? Because if you're building agent-based workflows now, your RPA infrastructure doesn't have to become a legacy layer you work around. It becomes a service — callable from whatever orchestration layer you're using. We're still learning where this pattern works best, honestly. But the integrations it enables are genuinely different from what was possible before.

Measuring what's actually working

Three layers, three different things to measure.

Robots: cycle time, exception rate, FTE hours released. Straightforward.

Document processing: extraction accuracy against a labeled reference, the percentage of documents flagged as low-confidence, and the straight-through rate, meaning the share of documents processed without manual review. The straight-through rate is the key number.

Agents: decision accuracy compared to what an expert would decide, escalation rate, time from input to outcome on non-standard cases.

When all three run together, one end-to-end metric becomes meaningful: what percentage of the process completes without human involvement — from incoming document to system record. That's where you can actually see what each layer contributes.
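As a sketch of how that end-to-end number can be computed next to the per-layer rates: the `Case` record and its three flags are assumptions about what your logs capture, not a standard schema.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Case:
    idp_manual_review: bool  # document went to manual review
    agent_escalated: bool    # agent handed the case to a human
    robot_exception: bool    # robot stopped with an error

def share(cases: List[Case], predicate: Callable[[Case], bool]) -> float:
    """Fraction of cases for which the predicate holds (0.0 for an empty list)."""
    if not cases:
        return 0.0
    return sum(1 for c in cases if predicate(c)) / len(cases)

def end_to_end_rate(cases: List[Case]) -> float:
    """Share of cases that completed with no human involvement at any layer."""
    return share(
        cases,
        lambda c: not (c.idp_manual_review or c.agent_escalated or c.robot_exception),
    )
```

Comparing `end_to_end_rate` with each per-layer `share` shows which layer is currently sending the most work back to people.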

Where to start

Take one of your automated processes, ideally the one with the most exceptions going to manual handling, and walk through each step with that one question: is there a known correct answer here, before anyone looks at it?

Steps where the answer is "no" are where a person is currently standing in. Those are the candidate points for the agent layer.
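If it helps to write the walkthrough down, it can be as simple as the sketch below. The step names and the yes/no answers are invented examples; the answers come from you and the process owner, not from code.

```python
from typing import List, Tuple

# (step name, is there a known correct answer before anyone looks at it?)
Step = Tuple[str, bool]

def agent_candidates(steps: List[Step]) -> List[str]:
    """Return the steps where the answer is "no", in process order."""
    return [name for name, known_answer in steps if not known_answer]

# Hypothetical walkthrough of a claims intake process.
claims_intake: List[Step] = [
    ("Load incoming document into ERP", True),
    ("Validate policy fields", True),
    ("Decide what the client email is about", False),
    ("Reconcile payment register", True),
    ("Judge whether the case is non-standard", False),
]
```

Here `agent_candidates(claims_intake)` returns the two judgment steps; everything else stays with the robot.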

Then: pick one, define what good looks like before you start, run the pilot small. The architecture is not complicated. The work is always in the specifics.