← Home

AI Coding Autonomy Tracker

METR-style horizon chart showing how long coding tasks frontier models can complete autonomously. Y-axis values are shown in hours (converted from METR source-minute estimates); toggle between 50% and 80% reliability views and compare linear vs log scale.

Autonomous coding capability matters because it changes how quickly software can be built, how teams are staffed, and how costs move over time. As these systems improve, they also reshape which technical skills become most valuable and how engineering work is organized. At the national level, capability trajectories increasingly map to competitiveness in AI development, and at the personal level they influence the speed, quality, and safety of the digital tools people rely on every day.

What does “autonomous coding for an hour+” mean in practice?

METR-style tasks are real multi-step software assignments. The model is expected to plan, write code, run tests, debug failures, and deliver a working result with limited human intervention.

In plain terms: this chart estimates how long a model can keep doing realistic software work correctly before it fails.

Source & update status