METR chart shows top AI models doubling task length about every 7 months

In short: A widely watched METR chart suggests that leading AI models can finish increasingly long work tasks, with the length they can handle roughly doubling every seven months.

What's going on

METR, a nonprofit that studies AI, made a chart that has become a common reference point across the AI industry. Instead of reporting scores on puzzle-like tests, it tracks something simpler: how long a real task takes, and whether an AI system can finish it.

METR starts by timing how long skilled human experts take to complete complex projects, such as building a web app or solving a hard programming problem. Then it gives the same tasks to AI “agents” (AI tools that can take steps on their own, like a junior assistant following instructions) and measures how well they do.

The chart includes “50%” and “80%” success time horizons. These are the task lengths at which the AI is expected to succeed about half the time or four out of five times, respectively. METR’s data shows an exponential pattern, meaning progress keeps compounding, with the time horizon roughly doubling every seven months over the last six years.
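To make the compounding concrete, here is a minimal Python sketch of what a fixed doubling time implies. It is an illustration of the arithmetic only, not METR's method: the function names are hypothetical, and the seven-month default is simply the figure quoted above.

```python
def growth_factor(months_elapsed: float, doubling_months: float = 7.0) -> float:
    """How many times longer the time horizon becomes after `months_elapsed`
    months, assuming it doubles every `doubling_months` months."""
    return 2 ** (months_elapsed / doubling_months)


def projected_horizon(start_minutes: float, months_elapsed: float,
                      doubling_months: float = 7.0) -> float:
    """Projected task length (in minutes) under the same doubling assumption."""
    return start_minutes * growth_factor(months_elapsed, doubling_months)


if __name__ == "__main__":
    # One doubling period turns a 1-minute horizon into a 2-minute horizon.
    print(projected_horizon(1.0, 7.0))
    # Growth over six years (72 months) at a seven-month doubling time.
    print(growth_factor(72.0))
```

Six years is a little over ten doubling periods, so a steady seven-month doubling compounds to roughly a thousandfold increase in task length.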

METR’s examples show how fast this has moved. Around the launch of ChatGPT in late 2022, top models handled tasks that took humans about 30 seconds. By early 2026, some frontier models were reported to complete certain computing tasks that would take a human expert more than 14 hours. METR also reports that current models succeed about 50% of the time on tasks that take an expert human about one hour.

What to watch

If this doubling trend continues, METR suggests AI could handle much longer projects by the end of the decade, possibly even month-long efforts. Still, the chart covers software and research tasks; it does not measure whether AI can replace most jobs or match humans at everything.
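The extrapolation works by counting doublings. As a rough sketch, assuming the trend holds unchanged, the time needed to get from one horizon to another is the doubling time multiplied by the base-2 logarithm of their ratio. The function name and the sample inputs below are hypothetical, and the 160-hour figure for a working month (four 40-hour weeks) is an assumption, not a number from METR.

```python
import math


def months_to_reach(current_hours: float, target_hours: float,
                    doubling_months: float = 7.0) -> float:
    """Months until the horizon grows from `current_hours` to `target_hours`,
    assuming it keeps doubling every `doubling_months` months."""
    if current_hours <= 0 or target_hours <= 0:
        raise ValueError("horizons must be positive")
    # Number of doublings needed is log2(target / current).
    return doubling_months * math.log2(target_hours / current_hours)


if __name__ == "__main__":
    # Hypothetical inputs: a 1-hour horizon today, a 160-hour "work month" target.
    print(months_to_reach(1.0, 160.0))
```

The result is sensitive to the starting horizon and to the doubling time, so any projected date should be read as a trend line rather than a forecast.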

Source: NYTimes

Jack Harrison
