AI Forecasting

The Trajectory

Where will LLMs be in one year? Three years? The research community is placing bets—and the predictions range from "AGI by 2027" to "we're already hitting walls." Here's what the data actually says.

01

Nine Experts, 17 Predictions for 2026: Half a Work Week of AI Tasks, No Economic Miracle

Understanding AI assembled nine expert contributors to make 17 concrete predictions for 2026. The consensus: continued capability gains, but nothing that transforms the economy overnight.

Timothy B. Lee predicts Big Tech capex will exceed $500 billion (75% confidence), with OpenAI hitting $30 billion revenue and Anthropic reaching $15 billion. Kai Williams forecasts AI completing 20-hour software tasks by year-end—half a human work week—at 55% confidence. But Lee directly challenges "fast takeoff" theories: he gives 90% odds that US GDP growth stays below 3.5%, far from the explosive growth some predict.

Other notable bets: context windows plateau around one million tokens (80%), a Chinese company surpasses Waymo in fleet size (55%), and Tesla launches truly driverless taxis (70%). James Grimmelmann predicts the legal free-for-all ends, with courts imposing serious financial consequences on AI companies.

The meta-lesson from 2025: Any advantage for an AI lab was temporary. Once one lab proved a capability, others quickly followed. The contributors expect this "fast-follower" dynamic to persist—meaning no one stays ahead for long.

02

Raschka's 2025 Retrospective: Inference Scaling Is the New Frontier

Sebastian Raschka, author of "Build A Large Language Model (From Scratch)" and a researcher whose newsletter reaches 150,000+ subscribers, published his annual state-of-the-field analysis. The headline: the industry pivoted from training-centric to inference-centric optimization.

The year's watershed moment was DeepSeek R1's January release. Their transparency revealed striking economics: training their reasoning model cost just $294,000—dramatically lower than the hundreds of millions industry assumed. Reinforcement Learning with Verifiable Rewards (RLVR) and the GRPO algorithm fundamentally changed post-training, enabling reasoning capabilities without massive compute.
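For readers who haven't met GRPO, its core trick is to sample a group of answers per prompt, score each with a verifiable reward (for example, 1 if a math answer passes a checker, 0 otherwise), and baseline every answer against its own group rather than a learned value model. A minimal Python sketch of that group-relative advantage step (the reward check and the sampled answers are illustrative assumptions, not DeepSeek's code):

```python
import statistics

def group_relative_advantages(rewards):
    """GRPO-style advantages: each sampled completion is scored relative to
    the mean (and std) of its own group, so no separate value network is needed."""
    mean = statistics.mean(rewards)
    std = statistics.pstdev(rewards) or 1.0   # guard against a zero-variance group
    return [(r - mean) / std for r in rewards]

# Illustrative verifiable reward: 1.0 if the sampled answer passes a checker, else 0.0.
sampled_answers = ["42", "41", "42", "43"]    # hypothetical completions for one prompt
rewards = [1.0 if answer == "42" else 0.0 for answer in sampled_answers]

print(group_relative_advantages(rewards))     # completions above the group mean get
                                              # positive advantage and are reinforced
```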

Raschka identifies "benchmaxxing" as a growing crisis: public test contamination means leaderboard scores no longer reliably predict real-world performance. His 2026 predictions center on inference-time gains, diffusion models for consumer applications, and RLVR expanding beyond math and coding into chemistry and biology. Classical RAG, he argues, will fade as long-context windows improve.

The quiet shift: Qwen displaced Meta's Llama as the open-source community standard in 2025. Private data emerged as the key competitive moat—companies increasingly refuse to license proprietary datasets to AI labs.

03

The AI 2027 Scenario: AGI in Two Years, Superintelligence in Three

In April 2025, Daniel Kokotajlo—who correctly predicted the trends leading to ChatGPT, GPT-4o, and o1 nearly four years ago—published AI 2027, a detailed scenario for rapid AI progress. The prediction: AGI by 2027, superintelligence by 2028.

The scenario draws on Leopold Aschenbrenner's 165-page "Situational Awareness" treatise from 2024, which projected AGI through continued exponential scaling. AI 2027 maps it concretely: a fictional "Agent-1" evolves through increasingly capable versions until, by September 2027, Agent-4 achieves superhuman AI research capabilities—making a year's worth of breakthroughs every week. Humans become spectators.

The authors added a November 2025 note: "2027 was our modal (most likely) year at publication, our medians were somewhat longer." A predictions tracker shows 91% accuracy on evaluated forecasts so far—though only 18% have been assessed. Critics like Gary Marcus remain skeptical: "even though I highly doubt AGI will arrive in three years, I can't absolutely promise you it won't happen in 10."

The geopolitical dimension: Aschenbrenner frames AGI as an intensifying US-China race requiring "Manhattan Project"-style initiatives. By late 2026, the scenario predicts an AI arms race in full swing, with both nations making dramatic moves to outpace each other.

04

METR's Doubling Law: AI Task Duration Doubles Every 7 Months

METR (Model Evaluation and Threat Research) proposed a new way to measure AI progress: the length of tasks AI can complete autonomously. Their finding is striking: this metric has doubled approximately every 7 months for the past six years, possibly accelerating to every 4 months in 2024.

AI-completable task duration has grown exponentially, from ~1 minute in 2019 to ~1 hour in 2025. Source: METR (Mar 2025)

The methodology: 170 tasks across software engineering, cybersecurity, and reasoning, with human completion times ranging from seconds to tens of hours. Over 800 human baselines from skilled professionals established typical durations. The best current models—like Claude 3.7 Sonnet—can handle some tasks that take expert humans hours, but reliably complete only tasks of a few minutes.
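How do you turn 170 pass/fail results into a single "time horizon"? One common approach, roughly the framing METR describes, is to model success probability against log task length and read off the duration where it crosses 50%. A sketch with made-up data (the numbers and the scikit-learn fit are illustrative assumptions, not METR's pipeline):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Made-up per-task results for one model: (human completion time in minutes, success 0/1).
tasks = [(1, 1), (2, 1), (4, 1), (8, 1), (15, 1), (30, 1), (30, 0),
         (60, 0), (120, 0), (240, 0), (480, 0)]

X = np.log2([[minutes] for minutes, _ in tasks])   # model success against log task length
y = np.array([success for _, success in tasks])

fit = LogisticRegression().fit(X, y)
w, b = fit.coef_[0][0], fit.intercept_[0]

# P(success) = 0.5 where w * log2(minutes) + b = 0
horizon_minutes = 2 ** (-b / w)
print(f"50%-reliability horizon: ~{horizon_minutes:.0f} minutes")
```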

Extrapolating the trend: in under a decade, AI agents could independently complete tasks that currently take humans days or weeks. By late 2026, the forecast suggests 50% reliability on 20-hour software tasks—roughly half a work week.
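The extrapolation itself is just compounding. A quick sketch under the two doubling times quoted above, anchored at a ~1-hour horizon in early 2025 (the anchor value is an approximation read off the chart):

```python
# Extrapolate the task-horizon trend under the two doubling times quoted above.
START_HORIZON_HOURS = 1.0        # ~50%-reliability horizon in early 2025 (approximation)

def horizon(months: float, doubling_months: float) -> float:
    """Task horizon in hours after `months` of steady exponential growth."""
    return START_HORIZON_HOURS * 2 ** (months / doubling_months)

for months in (0, 21, 48, 96):   # now, ~late 2026, ~4 years out, ~8 years out
    slow = horizon(months, 7)    # historical 7-month doubling
    fast = horizon(months, 4)    # possible 4-month doubling seen in 2024
    print(f"+{months:>2} mo: {slow:>9,.0f} h (7-mo doubling) | {fast:>12,.0f} h (4-mo doubling)")
```

The 20-hour late-2026 figure sits between the two curves (the 7-month pace gives about 8 hours by then, the 4-month pace closer to 40), and within a decade even the slower pace reaches horizons measured in weeks of human work.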

The productivity paradox: METR also ran a randomized controlled trial on real-world productivity. The surprising result: when experienced open-source developers used AI tools on their own repositories, they took 19% longer than without AI. Benchmark capability and practical utility don't always align.

05

Epoch AI's Data Wall: Training Data Exhaustion Between 2026 and 2032

Epoch AI researchers Pablo Villalobos, Jaime Sevilla, and collaborators tackled a fundamental constraint: how much training data actually exists? Their estimate: approximately 300 trillion tokens of quality, repetition-adjusted human-generated public text.

Training data consumption is accelerating toward available supply. The "data wall" arrives between 2026-2028 depending on overtraining intensity. Source: Epoch AI (2024-2025)

The exhaustion timeline depends on training intensity. With 5x overtraining (training on roughly five times more data than is compute-optimal for the model size), models hit the wall around 2027. Without overtraining, around 2028. With aggressive 100x overtraining, potentially 2025. The breakdown: CommonCrawl offers ~130 trillion tokens, the indexed web ~510 trillion, and the whole web including private content ~3,100 trillion.
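To see why the overtraining multiplier shifts the date by a year or two, it helps to run the compounding in reverse. A back-of-envelope sketch: the 300-trillion-token stock is from the article, but the current per-run token count and the annual growth factor are assumptions chosen only to illustrate the shape of the calculation:

```python
import math

# Back-of-envelope for the data wall. The 300-trillion-token stock is from the article;
# the per-run token count and annual growth factor below are illustrative assumptions.
STOCK_TOKENS = 300e12        # effective human-generated public text
CURRENT_RUN_TOKENS = 15e12   # tokens consumed by a large 2024-era training run (assumption)
ANNUAL_GROWTH = 2.5          # assumed yearly growth in tokens used per frontier run

def years_until_wall(overtraining: float = 1.0) -> float:
    """Years until a frontier run's effective data demand reaches the stock.
    `overtraining` scales demand (5x overtraining burns through data ~5x faster).
    A negative result means the wall is already reached at that intensity."""
    demand = CURRENT_RUN_TOKENS * overtraining
    return math.log(STOCK_TOKENS / demand, ANNUAL_GROWTH)

for ot in (1, 5, 100):
    print(f"{ot:>3}x overtraining: wall in ~{years_until_wall(ot):.1f} years")
```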

The researchers acknowledge progress need not halt at data exhaustion. Synthetic data generation, multi-modal learning, and efficiency improvements offer paths forward. But the core constraint remains: human-generated text is finite, and current consumption patterns are depleting it rapidly.

The semiconductor bottleneck: Beyond data, TSMC is fully booked until 2026. Building new wafer fabs involves long lead times and equipment shortages. The exponential increase in chip deployment is pushing manufacturing toward its production ceiling—another potential wall.

The Three-Year Window

One year out: AI likely completes half-workweek tasks, inference optimization delivers most gains, and the legal reckoning begins. Three years out: the forecasts diverge wildly. The optimists see AGI and an intelligence explosion. The skeptics see data exhaustion, semiconductor bottlenecks, and diminishing returns forcing a rethink. Both camps agree on one thing: the trajectory we're on—exponential capability gains through pure scaling—is approaching its limits. What comes next depends on whether we find new curves to climb.