Someone recently sent us a podcast episode by Toni Cowan-Brown and Benedict Evans — their conversation on how to think about which jobs AI will affect. The central argument was direct: giving numeric percentage scores to jobs is “ludicrous” and “self-deception.” Anthropic and OpenAI do versions of this. So do we.
The critique is worth sitting with. The example Evans keeps returning to is accountants. Every wave of automation — adding machines, mainframes, PCs, Excel, ERPs — raised the “exposure score” for accounting. And yet the number of accountants in the US went up every single decade of the 20th century. The score never predicted this, because the score was measuring the wrong thing. The job wasn’t adding up columns of numbers. That was just how the job was delivered. The actual work — judgment, interpretation, accountability, relationship — wasn’t touched.
The honest question isn’t “what percentage of this job’s tasks involve a computer?” It’s closer to: is AI automating the actual thing people are paying for, or just the mechanism of delivery? Elevator attendants vanished because pressing the button really was the job. McKinsey partners didn’t, because making slides was never what clients were buying.
Our model has always tried to capture this distinction. The gap between naive exposure (task type alone) and effective exposure (adjusted for knowledge depth and social function) exists precisely because we think the raw task-level number is incomplete. But we’ve been displaying the output as a large bold percentage. That looked like a measurement. It was always an estimate.
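To make the distinction concrete, here is a minimal sketch of how a naive task-level score might be dampened by the two adjustments the model uses. The function name, inputs, and weights are illustrative assumptions — the actual model's parameters aren't described in this post.

```python
# Hypothetical sketch of naive vs. effective exposure.
# Weights and input scales are assumptions, not the real model.

def effective_exposure(naive_exposure: float,
                       knowledge_depth: float,
                       social_function: float) -> float:
    """Dampen a raw task-level exposure score.

    naive_exposure  -- share of tasks in the automation zone, 0..1
    knowledge_depth -- 0 (routine) to 1 (deep judgment, accountability)
    social_function -- 0 (no relationship component) to 1 (central to the job)
    """
    # The more the job rests on judgment and relationships, the less
    # the raw task score translates into real exposure.
    damping = 1.0 - 0.5 * knowledge_depth - 0.3 * social_function
    return max(0.0, naive_exposure * damping)

# An "accountant-shaped" profile: most tasks look automatable,
# but judgment and the client relationship dominate the role.
print(effective_exposure(0.8, 0.9, 0.8))
```

The point of the shape, not the numbers: two jobs with identical task lists can land in very different places once the adjustments apply, which is exactly what the raw percentage hides.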
So we’ve made a set of changes to the results page. The band label — Low, Moderate, High, Very High — now leads visually. The percentage is prefixed with a tilde and drops to secondary context. The score sentence was rewritten: “A rough estimate: around X% of this job’s tasks fall in the automation zone. A directional signal — use it as a starting point, not a verdict.”
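The band-first presentation can be sketched in a few lines. The cut-offs below are assumptions for illustration; the post doesn't state the product's actual thresholds.

```python
# Illustrative band mapping; thresholds are assumed, not the product's.

def band_label(score_pct: float) -> str:
    """Map a percentage score to its display band."""
    if score_pct < 25:
        return "Low"
    elif score_pct < 50:
        return "Moderate"
    elif score_pct < 75:
        return "High"
    return "Very High"

def score_sentence(score_pct: float) -> str:
    # Band leads; the percentage is tilde-prefixed, secondary context.
    return (f"{band_label(score_pct)}. A rough estimate: around "
            f"~{round(score_pct)}% of this job's tasks fall in the "
            f"automation zone.")

print(score_sentence(60))
```

Keeping the label as the primary signal and the number as context is the whole design choice: a band is honest about precision in a way a bold two-digit percentage never was.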
We also added a section called “What this score doesn’t capture.” It makes two points explicitly: first, that this estimate maps your current role as described — it won’t show if the role itself gets redefined, which is exactly the accountant problem. Second, that any task-based framework will miss transformative disruption. The compass-bearing framing is now in the product, not just in our thinking about it.
The podcast’s honest conclusion was that any framework will be directionally right most of the time, and wrong in a few cases nobody predicted. That’s true of ours. The right response isn’t to stop giving people a useful signal — it’s to be clearer about what kind of signal it is.