Is AI Progress Slowing Down? The HN Brawl Is Arguing the Wrong Variable
Zitron's broadside and the 'xAI is a datacentre REIT now' thread relit the slowdown debate. Both camps cite real numbers — but they're measuring two different curves. The narrative is cooling; the engineering curve isn't.
Summary
On June 8, Ed Zitron’s long essay “AI Is Slowing Down” hit the Hacker News front page (561 points, 591 comments). The same day, Martin Alderson’s piece on xAI renting its compute to direct competitors — looking “more like a datacentre REIT than a frontier lab” — pulled 569 points. Together they relit an old fight: is AI progress actually slowing down?
My read: both camps are armed with real data, but they are arguing two different questions. Zitron measures whether commercial returns and the capital structure can support trillion-dollar compute commitments, and that curve is genuinely decelerating. The other camp measures model capability on verifiable tasks, and by METR’s own late-January data that curve hasn’t stalled; if anything it sped up. Collapsing both into the single phrase “AI is slowing down” is the biggest analytical error in the whole debate. For builders, the variable worth watching is neither general IQ nor funding noise. It’s the reliability and unit-cost curve on specific, verifiable tasks: reliability is still climbing and cost per task is still falling fast.
The debate
The slowdown camp’s strongest case is entirely about money, and Zitron does the arithmetic without flinching. Using Sightline Climate’s figure of roughly 190GW of planned data centers and Jensen Huang’s own $50–80 billion per gigawatt, the buildout runs $9.5 to $15 trillion (190 × $50 billion at the low end, 190 × $80 billion at the high end). For that compute not to become stranded metal, AI services need to generate over $2 trillion in annual revenue by 2030. Reality: OpenAI and Anthropic together make up 89% of all AI startup revenue, and their combined 2026 revenue is projected around $60 billion, a 496% growth gap to close. The demand side is actively tightening. Uber burned its entire annual token budget in one quarter, then capped employees at $1,500/month; Brex limits engineers to $500 a week and non-engineers to $5; Microsoft’s AI chief Mustafa Suleyman has said he wants Anthropic usage cut to zero. In an unreleased KPMG survey, only 26% of companies say they have a comprehensive view of their AI costs. Zitron’s conclusion: once enterprises pay true cost and can’t see a return, revenue growth has to slow, and the industry is decelerating precisely when it needs to accelerate.
Alderson lands the same blow from a different angle. Starting in May, xAI leased its older Colossus 1 datacentre to Anthropic ($1.25bn/month, ~300MW, roughly 220k GPUs), then leased more to Google last week ($920mn/month, ~110k GPUs). A lab that’s supposed to be burning capital training frontier models is instead collecting rent from its direct competitors. Alderson’s verdict is cold: xAI now looks “like a datacentre REIT with a frontier lab attached, rather than the other way around.” Grok’s retreat from the frontier race reads, to this camp, as a signal that even Musk no longer wants to bet on the next model generation when there’s landlord money on the table.
The no-slowdown camp argues in an entirely different coordinate system. Their point: benchmark saturation is not a capability ceiling. Outsiders watch MMLU and GSM8K get maxed out and assume the models stopped improving — but what’s still moving is reliability on agentic, coding, and long-horizon reasoning tasks. METR’s Time Horizon 1.1, published January 29, supplies the hard number. The length of task a frontier agent can complete autonomously at 50% reliability has, since 2023, been doubling every 130.8 days — about 20% faster than the prior estimate of 165.3 days. By that official measure, the capability curve isn’t flattening; it accelerated.
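To make the doubling-time claim concrete, here is a minimal Python sketch that converts a doubling time into a yearly growth multiple. The 130.8-day and 165.3-day figures come from the paragraph above; the function names and the 365-day year are my own assumptions for illustration, not anything METR publishes.

```python
import math


def annual_multiplier(doubling_days: float, days_per_year: float = 365.0) -> float:
    """Growth factor over one year for a quantity that doubles every `doubling_days`."""
    return 2.0 ** (days_per_year / doubling_days)


def days_to_reach(factor: float, doubling_days: float) -> float:
    """Days needed for the quantity to grow by `factor` at a fixed doubling time."""
    return doubling_days * math.log2(factor)


def percent_shorter(new_days: float, old_days: float) -> float:
    """How much shorter the new doubling time is, as a percentage of the old one."""
    return 100.0 * (old_days - new_days) / old_days


if __name__ == "__main__":
    new_estimate, old_estimate = 130.8, 165.3  # doubling times in days, from the text
    print(f"Yearly multiple at {new_estimate} days: {annual_multiplier(new_estimate):.2f}x")
    print(f"Yearly multiple at {old_estimate} days: {annual_multiplier(old_estimate):.2f}x")
    print(f"Doubling time shortened by {percent_shorter(new_estimate, old_estimate):.1f}%")
    print(f"Days to an 8x longer task horizon: {days_to_reach(8, new_estimate):.1f}")
```

Under these assumptions the newer estimate works out to roughly a 7x longer task horizon per year, against roughly 4.6x under the older one. That is only an extrapolation of a fitted trend, so treat it as a way to read the number, not as a forecast.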
Who’s right
Neither side is lying. Who’s right depends on which question you’re actually asking.
If the question is “is the business model and capital structure sustainable,” Zitron’s side has the harder evidence. He isn’t citing vibes; he’s citing commitment figures and corporate behavior — hundreds of billions in compute bets, customers imposing their own spend caps, CFOs who can’t read their own bills. These are cold deceleration signals, and they’re the side of the story the hype machine least wants to admit. Alderson’s observation is equally grounded: the very fact that xAI is renting out compute means at least one top-tier player has decided that collecting rent beats betting on the next model. On the money axis, the slowdown camp wins cleanly.
If the question is “are the models still getting more capable,” the evidence tips the other way. METR’s data is the closest thing we have to a first-hand measurement of the capability trend, and it directly contradicts the intuition that generational gains are flattening. There’s a crucial distinction here. The “slowdown” Zitron describes is, in his own framing, slowing revenue growth — not slowing capability. He says so explicitly: set judgment aside, this isn’t about whether the models are good, it’s about the promises that have been made. Treating his financial argument as proof that “the models stopped improving” is a second-order misread by the reader, not his claim. Capability rising and the money not adding up can both be true at once.
So my verdict: this isn’t one side right and one side wrong. It’s two curves jammed into a single sentence. The actionable conclusion for builders is that whether general IQ is slowing is mostly unobservable and unactionable, while reliability and unit cost on specific verifiable tasks, which METR measures and which you can measure in your own workflow, keep improving fast. The real value of the slowdown narrative isn’t “AI is finished.” It’s “the financial structure paying for this compute round is fragile.” The capability gains and the financial fragility are both real; just don’t use the fragility to deny the gains.
Why it matters
This fight is worth a builder’s attention not because it tells you whether AI “works,” but because the two camps’ conclusions push you toward opposite resource decisions.
Believe the slowdown camp and you freeze budget, delay wiring AI into core workflows, and wait to enter until “the bubble pops.” Believe the no-slowdown camp and you double down on compute and bet that the next model generation lifts capability another tier the moment it ships. Both paths stake the whole pot on a macro variable you can’t actually observe. And the fragility Zitron exposes is real: when only 26% of companies can even see what they spend, “AI productivity gains” are, financially, an article of faith rather than a measurement — which means whichever side you bet on, you’re holding no ledger that can validate the bet.
The actionable move is to pull the wager from the macro level back to the local one. METR’s methodology is worth copying internally: stop asking “did AI get better overall,” and ask “on my specific, verifiable task, what is the model’s success rate this quarter, its unit cost, and the cost of recovering when it fails.” That curve you can measure, and you can plan a roadmap and a budget against it. Once a team can show how much a model saved on a given task each quarter and how many points its reliability gained, it no longer needs the macro argument to resolve — the basis for deciding shifts from “who won” to “where my own line is heading.”
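One way to keep that local ledger is sketched below in Python. It is a rough illustration, not a prescribed method: `TaskRun`, `QuarterSummary`, `summarize`, and every field name are hypothetical, and it assumes you can label each run as a success or failure and attach a dollar cost to both the model call and any human cleanup.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class TaskRun:
    """One attempt at a specific, verifiable task (hypothetical record shape)."""
    succeeded: bool
    model_cost_usd: float            # what the model call cost
    recovery_cost_usd: float = 0.0   # cleanup cost when the attempt failed


@dataclass(frozen=True)
class QuarterSummary:
    runs: int
    success_rate: float
    unit_cost_usd: float             # mean model cost per attempt
    mean_recovery_cost_usd: float    # mean cleanup cost per failed attempt
    cost_per_success_usd: float      # all spend divided by successful attempts


def summarize(runs: Iterable[TaskRun]) -> QuarterSummary:
    """Collapse one quarter of runs into the three numbers worth tracking."""
    runs = list(runs)
    if not runs:
        raise ValueError("need at least one run to summarize")
    successes = sum(1 for r in runs if r.succeeded)
    failures = len(runs) - successes
    model_total = sum(r.model_cost_usd for r in runs)
    recovery_total = sum(r.recovery_cost_usd for r in runs if not r.succeeded)
    return QuarterSummary(
        runs=len(runs),
        success_rate=successes / len(runs),
        unit_cost_usd=model_total / len(runs),
        mean_recovery_cost_usd=recovery_total / failures if failures else 0.0,
        cost_per_success_usd=(
            (model_total + recovery_total) / successes if successes else float("inf")
        ),
    )


if __name__ == "__main__":
    # Made-up sample data, purely to show the shape of the comparison.
    last_quarter = [TaskRun(True, 0.5), TaskRun(False, 0.5, 4.0)] * 2
    this_quarter = [TaskRun(True, 0.25)] * 3 + [TaskRun(False, 0.25, 4.0)]
    for label, quarter in (("last", last_quarter), ("this", this_quarter)):
        print(label, summarize(quarter))
```

Comparing `cost_per_success_usd` quarter over quarter folds the success rate, the unit cost, and the cost of failure into a single line you can budget against, which is the point of moving the wager from the macro level to the local one.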
What to ignore
The first noise to drop is mistaking the intensity of a single broadside for the strength of its evidence. Zitron writes with a blade — the essay is wall-to-wall “paypig” and “midwit” — and that register makes it easy to read “the author is certain” as “the matter is certain.” What carries real weight are the commitment numbers and the corporate behavior; the profanity is packaging, not proof. Strip out the swearing and the zingers, and the ledger underneath is the part worth auditing.
The second thing to ignore is treating benchmark saturation as a capability ceiling. Old benchmarks maxing out means the rulers are too short, not that the thing being measured stopped growing. METR itself concedes that in Time Horizon 1.1 the latest generation of models can do almost every task in the suite — the ruler has simply been maxed out and needs to be replaced with a longer one. Reading “we can’t measure it anymore” as “it stopped improving” is the most common outsider error in this whole debate.
The third filter is for funding and valuation noise dressed up as a technical signal. xAI renting out compute, SpaceX heading toward an IPO, Google being a SpaceX shareholder — yes, financial engineering is plausibly in the mix. But Alderson himself stresses that the compute shortage is real and that SpaceX’s execution speed (Colossus 1 built in 122 days) is a genuine competitive edge. A single deal can carry both an accounting motive and real commercial logic. Concluding “frontier players are retreating” from one rental deal, and concluding “AGI lands tomorrow” from a 130-day doubling time, are the same lazy move: using a noisy signal to answer a question it can’t.