On paper, everything looked promising. The latest AI tools — Cursor Pro, Claude 3.5 and Claude 3.7 Sonnet — were set to revolutionize the work of experienced programmers. The METR team (Model Evaluation & Threat Research) decided to check what this revolution looks like in practice.
Too beautiful to be fast
In July 2025, the results of a meticulously designed experiment were published: a randomized controlled trial with 16 senior developers, who had an average of 5 years of experience in the specific repositories involved, and 246 realistic tasks. Some tasks were performed with the assistance of AI, some without. Before starting, participants predicted that AI would speed up their work by 24%. After finishing, they estimated that it had sped them up by 20%.