The Six Value Leaks of Enterprise AI
The first wave of enterprise AI was about access. The second is about absorption—and most of the value leaks out before it ever reaches the firm.
For three years now, companies have bought model access, rolled out copilots, funded proofs of concept, and let a sprawl of employee-built tools grow up in the gaps. Count the activity and adoption looks close to universal; count the results and they mostly aren’t there.
MIT Project NANDA’s widely cited 2025 research put a number on it. After an estimated $30–40 billion in enterprise GenAI spending, roughly 95% of pilots showed no measurable impact on profit and loss. The figure is directional, built from interviews, surveys, and a review of public deployments, so don’t treat it as gospel. But the shape is hard to argue with: experimentation is everywhere and production value is scarce. Enterprises have bought intelligence, but they haven’t learned to use it or scale it.
In my view, the hard part of enterprise AI is no longer getting access to intelligence; it’s operationalising it. A pilot proves that something can work, but that is a long way from generating revenue, reducing cost, or having some other commercial impact. Production systems are what actually deliver those outcomes.
The gap is real, and it is expensive
Follow the funnel and you can watch the value leak out. In MIT’s data, about 60% of organisations evaluated GenAI tools, roughly 20% ran a pilot, and only 5% reached production. That drop, from interest to experiment to something the business can actually lean on, is where the value disappears.
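To make the drop-off concrete, here is a minimal sketch of the stage-to-stage arithmetic. It assumes the three shares quoted above can be read as nested stages of the same population of organisations, which is a simplification of survey data; the names `funnel` and `stage_conversions` are illustrative only.

```python
# Shares of organisations reaching each stage, as quoted in the text.
# Assumption: the stages are nested, so each share is a subset of the one before.
funnel = [
    ("evaluated", 0.60),
    ("piloted", 0.20),
    ("in production", 0.05),
]


def stage_conversions(stages):
    """Return (from_stage, to_stage, conversion_rate) for each adjacent pair."""
    return [
        (from_name, to_name, to_share / from_share)
        for (from_name, from_share), (to_name, to_share) in zip(stages, stages[1:])
    ]


for from_name, to_name, rate in stage_conversions(funnel):
    print(f"{from_name} -> {to_name}: {rate:.0%} carried forward")

# End-to-end survival from first evaluation to production.
overall = funnel[-1][1] / funnel[0][1]
print(f"evaluated -> in production: {overall:.0%}")
```

Read this way, about a third of evaluators go on to pilot, a quarter of those pilots reach production, and fewer than one in ten evaluators end up with something in production.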
The pattern will be familiar to anyone who’s sat through it. A team finds a use case. A few people build a promising proof of concept. The demo works, the deck lands, everyone nods, and then the thing just... stalls.
The model, it turns out, was only ever part of the problem. What most pilots lack isn’t intelligence. It’s the operating layer around it, all the plumbing that turns a good demo into something the business can rely on. Most pilots have the model and almost none of the rest.
MIT’s interviews are full of the specific ways this breaks. At one Fortune 500 insurer, the sanctioned pilot looked great in the boardroom and fell apart in the field, because it couldn’t hold context from one interaction to the next. The model was fine. Everything you’d have to build around it to make it usable simply wasn’t there.
Where the value hides
For me, the really interesting question isn’t whether the value exists; it’s where it goes.
Economists have watched task-level value fail to reach the bottom line before. Robert Solow caught it in 1987, when he pointed out that the computer age was visible everywhere except in the productivity statistics. The standard explanations were mismeasurement, adjustment lags, dissipated profits, and mismanagement. AI has its own, sharper version of each. When a pilot fails to move the P&L, the value has usually been trapped somewhere between the model and the firm.
Thinking about where AI stands today, I see six core archetypes of value leakage: the value is deferred, starved, eaten, leaked, competed away, or left unmeasured.
1. Deferred — the value is forming somewhere the accounts can’t see yet. Electricity is the cleanest analogy, not because AI is electricity, but because the lag has the same pattern. Early factories swapped the steam engine for one big electric motor, left the old shaft-and-belt guts in place, and captured almost none of the gain. The real payoff came later, once small motors on individual machines let factories be rebuilt around how the work actually flowed. Warren Devine traced that shift from shafts to wires; Brynjolfsson, Rock and Syverson later formalised it as the productivity J-curve: general-purpose technologies demand complementary investment (redesign, retraining, restructuring) before they show up in measured output, so productivity dips before it climbs. And even the gains that have landed tend to be undercounted, because standard metrics miss the free, better-quality digital goods AI produces and expense the redesign rather than capitalising it. The task-level gains, at least, are already visible: Brynjolfsson’s study of more than 5,000 support agents found AI lifting output 14% on average and up to 35% for the least experienced, who soaked up the know-how of the best. It’s the firm-level gains that wait on redesign. This isn’t a fringe read, either. The BIS pins the gulf between task-level gains of 20–50% and sub-1% aggregate productivity on exactly this integration problem, and the Fed’s Lisa Cook has made the same point, that firms are adopting AI without reorganising work around it. Some slice of that damning 95% is simply firms sitting at the bottom of the J. We can call this the redesign gap.
2. Starved — the model never gets what it needs, because the data isn’t ready. A model is only as good as the context it can pull, and most enterprise context is a quagmire: siloed across systems, largely unstructured, riddled with stale and conflicting copies. MIT watched pilots die due to exactly this, a model wired into a repository holding ten versions of the same document, grabbing one at random and answering with complete confidence. The problem is foundational, not exotic, which is why Gartner expects enterprises to abandon 60% of AI projects through 2026 for lack of an AI-ready data foundation. The teams that get furthest are almost always the ones that fixed their data before they scaled a model. Least glamorous item on the list, and probably the most decisive. We can characterise this as the data gap.
3. Eaten — it gets consumed downstream as rework. Stanford and BetterUp, writing in Harvard Business Review, gave this one a name: workslop. Output that looks polished but has no substance behind it, so it doesn’t move the task forward, it just pushes the real work onto whoever opens the document next. Around 40% of workers said they’d received some within a single month, and each hit cost nearly two hours to clean up. The researchers put the tax at roughly $186 per employee per month, and that’s before you count the trust it quietly burns. The irony more or less writes itself: cleaning it up takes the same kind of human effort that AI was supposed to save you. We can call this the quality-and-trust gap.
4. Leaked — it dissipates somewhere between the person and the firm. Even clean, real time savings don’t reach the P&L by themselves. People write and search and code faster, and then those reclaimed hours just... evaporate into unallocated time, unless somebody deliberately points the freed capacity at more output or at cost actually removed. It almost never happens. In Workday’s study, 89% of organisations had updated fewer than half their roles to reflect what AI now does, so people are still managed as if the job never changed. Saved hours don’t show up on the P&L; only outcomes do. This is the reallocation gap, and what makes it a gap is that almost no one decides, before deployment, where the saved time is supposed to go.
5. Competed away — the value is real, you just don’t get to keep it. This is the sneakiest one, because nothing is lost; it simply ends up somewhere other than your margins. AI reaches your competitors on the same terms it reaches you. When everyone can buy the same capability for the same marginal cost, whatever edge it hands you gets matched into lower prices instead of fatter margins. Picture every call centre in an industry shaving a third off handling time. Nobody pulls ahead, and the saving flows straight through to customers as surplus that lands in no one’s P&L. McKinsey’s description of this phenomenon is blunt: AI is less a productivity revolution than a competitive reset, where early movers reset the cost base and take the profit pool, and a firm that merely keeps up has effectively gifted its gains away. But notice where that leaves the advantage. The model is the part that gets competed away; the operating layer you wrap around it is the part rivals can’t order off a price list, and that’s where any durable edge actually lives. Of the six, this is the one I’d worry about first. The strategy gap: you can win the efficiency and still lose the surplus, because the real fight is over who keeps it.
6. Unmeasured — you can’t prove it, so it dies in review. Value can arrive, convert, and stick around, and still get killed in a budget meeting, because hardly anyone runs a baseline or a holdout. With no control, you can’t tell the gains apart from ordinary business growth, and unproven value gets cut on the same logic as unreal value. This is the measurement gap.
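The measurement gap is the one leak with a fairly mechanical remedy. As a minimal sketch, the comparison below uses a difference-in-differences estimate, which is one common way to use a holdout group rather than anything the research cited here prescribes. Every number is invented for illustration, the function name `diff_in_diff` is hypothetical, and the estimate only holds if the two groups would have trended alike without the tool.

```python
from statistics import mean


def diff_in_diff(treated_before, treated_after, holdout_before, holdout_after):
    """Estimate uplift as the treated group's change minus the holdout's change."""
    treated_change = mean(treated_after) - mean(treated_before)
    holdout_change = mean(holdout_after) - mean(holdout_before)
    return treated_change - holdout_change


# Hypothetical weekly tickets resolved per agent (invented data).
treated_before = [40, 42, 38, 41]   # team given the AI tool, before rollout
treated_after = [50, 52, 47, 51]    # same team, after rollout
holdout_before = [39, 41, 40, 40]   # comparable team without the tool
holdout_after = [43, 45, 44, 44]    # same team, same later period

# A naive before/after comparison credits the tool with all of the change.
naive_gain = mean(treated_after) - mean(treated_before)

# The holdout's change approximates ordinary business growth over the period.
uplift = diff_in_diff(treated_before, treated_after, holdout_before, holdout_after)

print(f"naive before/after gain: {naive_gain:.2f} tickets per agent per week")
print(f"gain net of holdout:     {uplift:.2f} tickets per agent per week")
```

In this made-up case the holdout team also improved, so part of the naive gain is growth that would have happened anyway. The smaller figure is the one that can survive a budget review.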
Line the six up against Solow’s old paradox and the rhyme is obvious. Adjustment lags are the redesign gap; mismeasurement is the measurement gap; dissipated profits are the strategy gap; and mismanagement fans out into the data, quality, and reallocation gaps. And here’s the thing they share: almost none of them is really a model problem. The model can still trip on reliability or context limits, but in most failed pilots it stopped being the binding constraint a while ago. AI has collapsed the cost of building software. What it hasn’t touched is the cost of absorbing it, which is why so many organisations can now build far faster than they can turn what they’ve built into value.
The clearest tell that all of this is real is shadow AI. Employees aren’t waiting for permission. MIT found staff at over 90% of surveyed companies already using personal AI tools for work, against only about 40% of firms with an official subscription. Read that as revealed demand, not recklessness. The value is close enough to touch; what’s missing is the operating model to capture it. If shadow AI is a symptom, it’s a useful one, because it points straight at where the value is trying to get out.
And to be clear, the value is genuinely there. The firms that reach production do see real returns, clustered in the back office: finance, procurement, and customer operations, which is where MIT found them too. The value is real. It’s just held back by everything above.
The market is moving the same way
The money is moving in the same direction, and fast, which is its own kind of validation. In something like sixty days, deployment went from an afterthought to a category. Anthropic teamed up with Blackstone and others to build an enterprise AI services firm. OpenAI launched a $4 billion Deployment Company and bought a consultancy so it could park forward-deployed engineers inside customer operations. AWS put $1 billion behind the same bet. Microsoft stood up Frontier Company with $2.5 billion and thousands of engineers.
Each is effectively making the same bet: capable models are now abundant. The scarce resource is integrating them into messy, regulated, legacy organisations where they can reliably create value.
Where this leaves us
The model matters; it’s just not enough on its own. Enterprise AI creates value only when it becomes part of the operating fabric of the firm: the workflows, the controls, the incentives, and the decisions people actually make. That’s a lot harder than shipping a pilot, and it’s precisely where the durable advantage sits. The firms that pull ahead will be the ones that can name which of the six leaks they have and close it.
The first wave of enterprise AI was about buying intelligence. The second will be about absorbing it. Access to capable models is becoming ubiquitous; the advantage now lies in the operating layers organisations build around them. The firms that win won’t be the ones with the smartest models. They’ll be the ones that learn how to convert intelligence into economic value.