{"id":13012,"date":"2025-08-01T09:00:00","date_gmt":"2025-08-01T07:00:00","guid":{"rendered":"https:\/\/haimagazine.com\/uncategorized\/uncertainty-code\/"},"modified":"2025-08-01T10:17:59","modified_gmt":"2025-08-01T08:17:59","slug":"uncertainty-code","status":"publish","type":"post","link":"https:\/\/haimagazine.com\/en\/ai-news-2\/uncertainty-code\/","title":{"rendered":"\ud83d\udd12 Uncertainty code"},"content":{"rendered":"<p>On paper, everything looked promising. The latest AI tools (Cursor Pro, Claude 3.5 and Claude 3.7 Sonnet) were set to revolutionize the work of experienced programmers. The METR team (Model Evaluation &amp; Threat Research) decided to check what this revolution looks like in practice.<\/p><h4 class=\"wp-block-heading\"><strong>Too beautiful to be fast<\/strong><\/h4><p>In July 2025, the results of a meticulously designed experiment were published: 16 senior developers with an average of 5 years of experience in specific repositories worked through 246 realistic tasks in a randomized controlled trial. Some tasks were performed with the assistance of AI, some without. Before starting, participants predicted that AI would speed up their work by 24%. After finishing, they estimated the actual gain at 20%.<\/p><p>The data told a different story. The time to complete a task with AI was&#8230; longer, by 19% on average.<\/p><h4 class=\"wp-block-heading\"><strong>Code doesn&#8217;t write itself<\/strong><\/h4><p>143 hours of screen recordings, version control system logs, interviews and surveys all revealed a surprising work dynamic. Instead of coding, developers spent more time formulating prompts, waiting for responses and correcting errors generated by AI. Only 44% of code segments were accepted without major modifications. 
The rest had to be rewritten or, in some cases, abandoned.<\/p><p>Developers hit a wall: AI struggled with large, complex repositories (averaging over a million lines of code), failed to consider the project&#8217;s context, was unaware of code history and didn\u2019t \u201csense\u201d the architecture. Where people used tacit knowledge, AI floundered.<\/p><h4 class=\"wp-block-heading\"><strong>Hope for a second chance<\/strong><\/h4><p>The results were not entirely pessimistic. For participants who had over 50 hours of experience with Cursor Pro, there were signs of acceleration. This suggests that both experience with a specific tool and the tool&#8217;s further development can be crucial. AI will not build a complex system without context, but it can be helpful when a person knows how to harness it.<\/p><p>This is also a strong signal for AI benchmark creators: testing labs need to move beyond synthetic data. The real work of a programmer is more than just \u201csolve task X\u201d: it involves understanding the code, collaborating with others, knowing the history of the repository, and making thousands of small decisions that can&#8217;t be simulated.<\/p><h4 class=\"wp-block-heading\"><strong>Conclusions: less magic, more realism<\/strong><\/h4><p>The METR study shows that enthusiasm for AI should be tempered with empirical data. In a mature open source environment, where code is not just a sequence of instructions but also history and culture, AI can be a burden. It can also be an asset, but only if we treat it as a tool rather than a wizard, and learn to use it with as much precision as we apply when debugging code.<\/p>","protected":false},"excerpt":{"rendered":"<p>Experienced open source developers expected AI to accelerate their work. 
However, the results of the latest study present a slightly different perspective \u2014 AI-based tools may actually slow things down.<\/p>\n","protected":false},"author":354,"featured_media":12928,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"rank_math_lock_modified_date":false,"footnotes":""},"categories":[813],"tags":[],"popular":[],"difficulty-level":[36],"ppma_author":[776],"class_list":["post-13012","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai-news-2","difficulty-level-easy"],"acf":[],"authors":[{"term_id":776,"user_id":354,"is_guest":0,"slug":"redakcja","display_name":"Redakcja","avatar_url":{"url":"https:\/\/haimagazine.com\/wp-content\/uploads\/2025\/07\/Zrzut-ekranu-2025-07-10-o-16.00.36.png","url2x":"https:\/\/haimagazine.com\/wp-content\/uploads\/2025\/07\/Zrzut-ekranu-2025-07-10-o-16.00.36.png"},"first_name":"","last_name":"","user_url":"","job_title":"","description":""}],"_links":{"self":[{"href":"https:\/\/haimagazine.com\/en\/wp-json\/wp\/v2\/posts\/13012","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/haimagazine.com\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/haimagazine.com\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/haimagazine.com\/en\/wp-json\/wp\/v2\/users\/354"}],"replies":[{"embeddable":true,"href":"https:\/\/haimagazine.com\/en\/wp-json\/wp\/v2\/comments?post=13012"}],"version-history":[{"count":1,"href":"https:\/\/haimagazine.com\/en\/wp-json\/wp\/v2\/posts\/13012\/revisions"}],"predecessor-version":[{"id":13013,"href":"https:\/\/haimagazine.com\/en\/wp-json\/wp\/v2\/posts\/13012\/revisions\/13013"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/haimagazine.com\/en\/wp-json\/wp\/v2\/media\/12928"}],"wp:attachment":[{"href":"https:\/\/haimagazine.com\/en\/wp-json\/wp\/v2\/media?parent=13012"}],"wp:term":[{"taxonomy":"category
","embeddable":true,"href":"https:\/\/haimagazine.com\/en\/wp-json\/wp\/v2\/categories?post=13012"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/haimagazine.com\/en\/wp-json\/wp\/v2\/tags?post=13012"},{"taxonomy":"popular","embeddable":true,"href":"https:\/\/haimagazine.com\/en\/wp-json\/wp\/v2\/popular?post=13012"},{"taxonomy":"difficulty-level","embeddable":true,"href":"https:\/\/haimagazine.com\/en\/wp-json\/wp\/v2\/difficulty-level?post=13012"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/haimagazine.com\/en\/wp-json\/wp\/v2\/ppma_author?post=13012"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}