the frontier

The day software learns to do the paperwork

June 1, 2026 agents · LLMs · compliance · capital markets

There’s a moment, the first time you watch an agent decide to do something you didn’t explicitly tell it to do, where the floor shifts a little under you.

Not because it's smart, exactly. Because it's an agent: it held a goal, looked at the world, reached for a tool, and acted. That's a different category of thing from autocomplete, and it lands on finance in a very specific way.

From oracle to participant

For two years we mostly used language models as oracles. You ask, it answers. A very good intern who never leaves the chair. Useful, impressive, but fundamentally passive: a thing that produces text when prompted and then waits.

The shift isn’t that the models got smarter, though they did. It’s that they got hands. Tool use turned the oracle into a participant, something that can not only reason about the world but reach into it. Pull the filing. Reconcile the position. Draft the memo. Query the database, notice the discrepancy, flag it, and propose the fix. The qualitative leap is from a thing that describes to a thing that does, and once something can do, it stops being a tool and starts being an actor.

I don’t think most people have felt this distinction yet, because the consumer-facing version of it is still mostly chatbots that talk. But anyone who’s actually built with agents that have real tool access knows the floor-shifting moment I mean. You give it a goal, you step away, and you come back to find it has taken seven steps you didn’t specify, made judgment calls along the way, recovered from two errors on its own, and produced a result. That’s not autocomplete. That’s a junior colleague.

Where this actually bites: the back office

Everyone’s excited about agents that trade, that read the market and place orders and try to make money. I understand the appeal, and I’m more skeptical of it than most, partly because I know exactly what markets do to participants who need them for the wrong reasons, even silicon ones.

I’m far more interested in agents that do the part nobody wants.

I’ve spent years in the machinery of capital formation, and I’ll tell you the bottleneck is never the clever part. It’s the grunt work: the KYC, the reconciliation, the transfer-agent back-and-forth, the compliance review, the exception that needs a human at 2 a.m. because the rules don’t quite fit the case and someone has to decide. That’s the work that has resisted automation for decades, and the reason is subtle and important. It was never that the work was intellectually hard. It’s that it’s fuzzy, judgment-laden, and high-stakes. Rules-based automation chokes on it, because the whole problem is the cases where the rules don’t cleanly apply. You needed a human in the loop not for raw intelligence but for judgment under ambiguity, with consequences attached.

That is, almost exactly, the new capability. A capable agent is precisely a thing that can exercise judgment under ambiguity. Which means the wall that’s stood for decades, the reason finance back-offices are still full of people doing careful, fuzzy, high-stakes paperwork, has just developed a door.

The compliance-and-paperwork layer has blocked automation in finance for decades. It’s about to get a participant that doesn’t sleep, doesn’t get bored, and leaves a perfect audit trail.

The unglamorous frontier (again)

Of course, the frontier isn’t the demo where an agent does something cute. It’s the boring, hard infrastructure that lets an agent act safely in a regulated context, and this is the part almost nobody is building, because it isn’t fun.

What does safe even mean here? It means bounded authority: this agent can do exactly these things and no more. It means a complete, auditable record of every decision and why it was made, because in regulated finance “the AI did it” is not an acceptable answer to a regulator. It means a sane escalation path: the agent knows the edge of its own competence and hands off to a human at exactly the right moment, not so early that it’s useless, not so late that it’s dangerous. It means identity and permissions and receipts. “This software is allowed to do exactly this, on exactly these things, and here’s the proof of everything it did.”
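To make those three properties a little less abstract, here is a minimal Python sketch of bounded authority, an escalation threshold, and a tamper-evident record. Everything in it is hypothetical and chosen only for illustration: the names `Policy`, `AuditLog`, and `decide`, the action strings, and the idea of a single numeric confidence score are all assumptions, not a description of any real system or regulatory standard.

```python
import hashlib
import json
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class Policy:
    """Bounded authority: anything not listed here is denied. (Hypothetical.)"""
    agent_id: str
    allowed_actions: frozenset
    min_confidence: float  # below this, the agent hands off to a human


def _digest(body: dict) -> str:
    # Hash a canonical JSON form so the same body always yields the same hash.
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


@dataclass
class AuditLog:
    """Append-only log; each entry stores the hash of the entry before it."""
    entries: list = field(default_factory=list)

    def append(self, record: dict) -> dict:
        prev_hash = self.entries[-1]["hash"] if self.entries else "0" * 64
        body = {**record, "prev_hash": prev_hash,
                "at": datetime.now(timezone.utc).isoformat()}
        entry = {**body, "hash": _digest(body)}
        self.entries.append(entry)
        return entry

    def verify(self) -> bool:
        # Recompute every hash and check the chain links from the start.
        prev_hash = "0" * 64
        for entry in self.entries:
            body = {k: v for k, v in entry.items() if k != "hash"}
            if entry["prev_hash"] != prev_hash or entry["hash"] != _digest(body):
                return False
            prev_hash = entry["hash"]
        return True


def decide(policy, log, action, target, confidence, reason) -> str:
    """Return 'denied', 'escalated', or 'executed', and record why."""
    if action not in policy.allowed_actions:
        outcome = "denied"       # outside the agent's mandate
    elif confidence < policy.min_confidence:
        outcome = "escalated"    # permitted, but a human should decide
    else:
        outcome = "executed"
    log.append({"agent": policy.agent_id, "action": action, "target": target,
                "confidence": confidence, "reason": reason, "outcome": outcome})
    return outcome


policy = Policy("recon-agent-01",
                frozenset({"flag_discrepancy", "draft_memo"}), 0.8)
log = AuditLog()
outcomes = [
    decide(policy, log, "flag_discrepancy", "position:ABC", 0.95,
           "ledger and custodian totals differ"),
    decide(policy, log, "draft_memo", "position:ABC", 0.55,
           "rule does not clearly apply to this case"),
    decide(policy, log, "place_order", "ticker:XYZ", 0.99,
           "not part of this agent's mandate"),
]
```

In this sketch the denied and escalated requests are logged just like the executed one, so the record answers "what did it refuse, and why" as well as "what did it do". Chaining the hashes means that editing an old entry after the fact makes `verify()` fail, which is one simple way to back the "here’s the proof" part. A real deployment would also need signed identities, durable storage, and a far better competence signal than a single number.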

That’s deeply unglamorous work. It’s also the difference between a parlor trick and a new economic actor. An agent that can think but can’t act safely in a regulated environment is a demo. An agent that can act, bounded and accountable and audited, is a new kind of worker for the exact jobs that were too regulated to automate.

Why this is my lane

That’s the seam I sit on, and it’s not an accident. It’s where my whole career has been quietly pointing. Finance is learning to run on rails. Agents are learning to use them. The overlap is a back office that finally runs at the speed of software, with the compliance still intact and the audit trail still perfect.

I’ve crossed paths with this problem from both sides: the finance side, where I know how the back office actually works and why it’s so stubbornly manual, and the agent side, where I now build by directing software every single day and have a visceral sense of what these things can and can’t yet do. Most people see one side. The bridge between “agents that can act” and “industries that couldn’t automate” runs right through the work I already do.

It’s the least glamorous frontier in AI. I think it’s the most valuable one. I’m not certain how it lands: the timing, the form, which pieces come first. I’m certain it lands. And I’d rather be early and useful than late and right.