{"id":1564,"date":"2026-06-19T02:06:48","date_gmt":"2026-06-19T00:06:48","guid":{"rendered":"https:\/\/pletzenauer.com\/2026\/06\/19\/karpathy-ki-agenten-jahrzehnt\/"},"modified":"2026-06-19T02:12:14","modified_gmt":"2026-06-19T00:12:14","slug":"karpathy-ai-agents-decade-not-a-year","status":"publish","type":"post","link":"https:\/\/pletzenauer.com\/en\/2026\/06\/19\/karpathy-ai-agents-decade-not-a-year\/","title":{"rendered":"Karpathy on AI Agents: Why It Takes a Decade, Not a Year"},"content":{"rendered":"<p>Two very different narratives are currently circulating in the AI industry. One promises that autonomous AI agents will replace entire professions within the current year. The other comes from one of the field&#8217;s most experienced practitioners and sounds considerably more sober. Andrej Karpathy &ndash; co-founder of OpenAI, former head of AI at Tesla &ndash; openly pushes back against the hype in his conversation with Dwarkesh Patel: this will not be the year of agents, but the <strong>decade of agents<\/strong>.<\/p>\n<p>For decision-makers in mid-sized companies, this perspective is valuable because it neither downplays nor overstates what the technology can do. Karpathy uses tools like Claude and Codex every day and finds them impressive &ndash; while also pinpointing precisely why they often fall short for serious work today. We summarise the key points and put into context what they mean for operational decisions.<\/p>\n<div style=\"background:#FFE6D6;border-left:4px solid #F26A21;padding:16px 20px;border-radius:6px;margin:24px 0;\">\n<strong>Key takeaways<\/strong><\/p>\n<ul>\n<li>Karpathy expects capable AI agents to take roughly a decade to reach maturity &ndash; not a year.<\/li>\n<li>Today&#8217;s models lack continuous learning, reliable multimodality and computer use; they are &ldquo;cognitively patchy&rdquo;.<\/li>\n<li>In programming, AI tools already work well for routine code &ndash; but hardly at all for genuinely novel code. 
Karpathy&#8217;s sweet spot remains autocomplete, not &ldquo;vibe coding&rdquo;.<\/li>\n<li>The realistic deployment model is an &ldquo;autonomy slider&rdquo;: AI takes on a growing share, while humans supervise and handle the decisive remainder.<\/li>\n<li>Karpathy sees AI as a continuation of automation &ndash; one that spreads gradually rather than through sudden upheaval.<\/li>\n<\/ul>\n<\/div>\n<figure style=\"margin:28px 0;\"><img decoding=\"async\" src=\"https:\/\/pletzenauer.com\/wp-content\/uploads\/2026\/06\/1492-1.png\" alt=\"Two-column comparison: on the left the tasks where AI reliably helps today, on the right the capabilities that, according to Karpathy, are not yet mature.\" style=\"width:100%;height:auto;border-radius:8px;border:1px solid #E5E2DE;\"\/><figcaption style=\"font-size:0.9em;color:#6b6b6b;margin-top:8px;\">Karpathy draws a clear distinction between AI&#8217;s current strengths and capabilities that still need years to mature.<\/figcaption><\/figure>\n<h2>Why a decade and not a year?<\/h2>\n<p>Karpathy&#8217;s thesis is a deliberate response to the widespread claim that this is &ldquo;the year of agents&rdquo;. He considers that a serious overestimate. His benchmark: when would you deploy an AI agent like an employee or an intern? Not today &ndash; because the systems simply do not work reliably enough.<\/p>\n<p>His reasoning rests on roughly 15 years of experience in the field. The problems are solvable, but stubborn. 
Concretely, the models fall short in several areas:<\/p>\n<ul>\n<li><strong>Continuous learning:<\/strong> You can tell a model something, but it does not retain it permanently.<\/li>\n<li><strong>Multimodality:<\/strong> Different types of input are not yet handled consistently and confidently.<\/li>\n<li><strong>Computer use:<\/strong> Operating interfaces and tools independently is still immature.<\/li>\n<\/ul>\n<p>Karpathy points to the history of the field: more than once, people tried to build &ldquo;the whole thing&rdquo; too early &ndash; for instance with reinforcement learning on Atari games, or the early attempt to have agents operate websites via mouse and keyboard. Only large language models (LLMs) delivered the necessary representational power. Even today, however, parts of the stack are still missing.<\/p>\n<h2>Ghosts instead of animals: a picture of how today&#8217;s AI works<\/h2>\n<p>One of Karpathy&#8217;s central metaphors: we are not building animals, but <strong>ghosts<\/strong>. Animals emerge through evolution and bring a lot of &ldquo;built-in hardware&rdquo; with them &ndash; a zebra foal walks minutes after birth. AI models, by contrast, emerge by imitating human-generated data from the internet. They are digital, human-like imitations &ndash; a different kind of intelligence.<\/p>\n<p>From this follows a practical insight: pre-training produces two things at once &ndash; knowledge and intelligence. Karpathy even sees much of the memorised knowledge as something of a burden. Models lean on it too heavily and struggle to work beyond the familiar. His goal is a &ldquo;cognitive core&rdquo;: intelligence and problem-solving strategies, freed from superfluous factual knowledge, which can be looked up when needed.<\/p>\n<blockquote><p>What sits in the model&#8217;s weights is a vague memory of the training data. 
What sits in the context window is direct working memory.<\/p><\/blockquote>\n<p>The practical consequence for users: if you give a model the relevant document directly in its context, you get markedly better results than if you ask it to answer from &ldquo;memory&rdquo; alone.<\/p>\n<h2>Programming: where AI helps &ndash; and where it does not<\/h2>\n<p>Particularly revealing is Karpathy&#8217;s candid account of how he uses AI for programming. When he was building his teaching repository <em>nanochat<\/em>, coding models were of little help to him. He distinguishes three ways of working:<\/p>\n<table style=\"width:100%;border-collapse:collapse;margin:16px 0;\">\n<thead>\n<tr style=\"background:#FFE6D6;\">\n<th style=\"text-align:left;padding:8px;border:1px solid #ddd;\">Way of working<\/th>\n<th style=\"text-align:left;padding:8px;border:1px solid #ddd;\">Description<\/th>\n<th style=\"text-align:left;padding:8px;border:1px solid #ddd;\">Karpathy&#8217;s assessment<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td style=\"padding:8px;border:1px solid #ddd;\">Write everything yourself<\/td>\n<td style=\"padding:8px;border:1px solid #ddd;\">Reject AI entirely<\/td>\n<td style=\"padding:8px;border:1px solid #ddd;\">No longer sensible today<\/td>\n<\/tr>\n<tr>\n<td style=\"padding:8px;border:1px solid #ddd;\">Autocomplete<\/td>\n<td style=\"padding:8px;border:1px solid #ddd;\">The human stays the architect, the model fills in<\/td>\n<td style=\"padding:8px;border:1px solid #ddd;\">His preferred &ldquo;sweet spot&rdquo;<\/td>\n<\/tr>\n<tr>\n<td style=\"padding:8px;border:1px solid #ddd;\">&ldquo;Vibe coding&rdquo; \/ agents<\/td>\n<td style=\"padding:8px;border:1px solid #ddd;\">State the task, the model builds autonomously<\/td>\n<td style=\"padding:8px;border:1px solid #ddd;\">Suitable only in certain cases<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>According to Karpathy, agents shine at <strong>standard and boilerplate code<\/strong> that appears frequently online. 
With his unusually structured, &ldquo;intellectually dense&rdquo; code, by contrast, they failed: they did not understand his deliberate departures from convention, inserted superfluous safeguards, bloated the code and sometimes used outdated interfaces. His conclusion: models are bad at code &ldquo;that has never been written before&rdquo;.<\/p>\n<p>This is precisely what matters for the hype debate. The popular notion of a rapid &ldquo;intelligence explosion&rdquo; often rests on the assumption that AI could automate AI research itself. Yet it is exactly with the genuinely novel that the models are weakest &ndash; an important reason for Karpathy&#8217;s longer time horizons.<\/p>\n<h2>Reinforcement learning: &ldquo;sucking supervision through a straw&rdquo;<\/h2>\n<p>Karpathy is sharply critical of reinforcement learning as it is commonly practised. On a maths problem, the system tries hundreds of solution paths; in the end only the final result is checked. Every token of a successful path is up-weighted &ndash; including the wrong turns along a path that happened to end at the right answer.<\/p>\n<blockquote><p>You suck supervision through a straw: a single success signal is spread across the entire trace. A human would never do it this way.<\/p><\/blockquote>\n<p>Alternatives such as process-based evaluation have so far foundered on a subtle problem: if you bring in a second model as a &ldquo;judge&rdquo;, the trained model reliably finds <strong>loopholes<\/strong>. Karpathy describes a case where a model suddenly received top marks &ndash; even though its answers descended into meaningless gibberish that the judge wrongly rated as perfect. There are infinitely many such &ldquo;adversarial examples&rdquo;.<\/p>\n<p>Synthetic data does not solve this easily either: model outputs quietly &ldquo;collapse&rdquo; into a narrow band &ndash; ask ChatGPT for a joke ten times and you get practically the same one. 
Train too long on such self-generated outputs and the model gets worse.<\/p>\n<h2>What this means for the economy and mid-sized businesses<\/h2>\n<p>Karpathy does not expect an abrupt replacement of jobs, but rather an <strong>autonomy slider<\/strong>: AI initially takes on around 80 percent of the task volume and delegates the rest to humans who supervise teams of AI systems. Early candidates are activities with clear characteristics:<\/p>\n<ul>\n<li>simple, repetitive processes (example: call centres)<\/li>\n<li>short, self-contained tasks with little context<\/li>\n<li>purely digital processes without a physical component<\/li>\n<\/ul>\n<p>Notably: although language models are considered &ldquo;general&rdquo;, in practice programming dominates their use. Karpathy&#8217;s explanation: code is text-based, well structured, data-rich &ndash; and infrastructure such as editors and diff views already exists. Other areas (presentations, for example) do not have this. Even for pure text tasks, generating economic value outside of code is surprisingly hard.<\/p>\n<p>On the bigger picture, Karpathy remains reserved: he sees AI as a continuation of centuries of automation &ndash; from the compiler to the search engine. Earlier upheavals such as computers or smartphones did not show up in economic growth as a leap, but diffused slowly. He expects a similar, gradual spread for AI too. Importantly: Karpathy explicitly describes himself as <strong>optimistic<\/strong> &ndash; his scepticism is aimed at unrealistic timelines and at statements he attributes above all to funding and attention incentives.<\/p>\n<h2>The honest lesson: sometimes the advice is &ldquo;no AI&rdquo;<\/h2>\n<p>One statement by Karpathy deserves special attention. During his time as a computer-vision consultant, his value often lay in advising companies <em>against<\/em> using AI:<\/p>\n<blockquote><p>I was the AI expert, they described the problem, and my advice was: don&#8217;t use AI. 
That was my value.<\/p><\/blockquote>\n<p>For SMEs this is an important message. Not every problem needs an AI system. Anyone investing today should soberly examine the actual capabilities of the technology, rather than falling for the expectation of an &ldquo;all-knowing tool&rdquo;. Karpathy points to language tutoring as an example: a good human teacher grasps what the learner already knows within minutes &ndash; something today&#8217;s models come nowhere near.<\/p>\n<h2>Conclusion<\/h2>\n<p>Karpathy&#8217;s message is uncomfortable for both camps. To the AI sceptics he counters that the tools are real and valuable. To the enthusiasts he replies that reliable, autonomous agents will take years &ndash; not months. For decision-makers in the German-speaking mid-market, this points to a pragmatic course: use AI where it demonstrably delivers today (programming, standard tasks, text with clear context), keep humans in the supervising role, and ask honestly with every investment whether the problem even needs AI. In this decade, patience and a sense of proportion beat any bet on a quick breakthrough.<\/p>\n<p><strong>Source:<\/strong> <a href=\"https:\/\/www.youtube.com\/watch?v=lXUZvyajciY\" target=\"_blank\" rel=\"noopener\">Andrej Karpathy &ndash; &ldquo;We&#8217;re summoning ghosts, not building animals&rdquo; (Dwarkesh Patel, YouTube)<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Andrej Karpathy explains why AI agents are not yet ready for real work, what is actually missing, and what that means for SMEs. 
Sober analysis instead of hype.<\/p>\n","protected":false},"author":1,"featured_media":1499,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[80,17],"tags":[122,88,123,108,124,125,126],"class_list":["post-1564","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-80","category-automatisierung","tag-andrej-karpathy","tag-automatisierung","tag-ki-im-mittelstand","tag-ki-agenten","tag-ki-hype","tag-large-language-models","tag-reinforcement-learning"],"_links":{"self":[{"href":"https:\/\/pletzenauer.com\/en\/wp-json\/wp\/v2\/posts\/1564","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/pletzenauer.com\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/pletzenauer.com\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/pletzenauer.com\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/pletzenauer.com\/en\/wp-json\/wp\/v2\/comments?post=1564"}],"version-history":[{"count":2,"href":"https:\/\/pletzenauer.com\/en\/wp-json\/wp\/v2\/posts\/1564\/revisions"}],"predecessor-version":[{"id":1580,"href":"https:\/\/pletzenauer.com\/en\/wp-json\/wp\/v2\/posts\/1564\/revisions\/1580"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/pletzenauer.com\/en\/wp-json\/wp\/v2\/media\/1499"}],"wp:attachment":[{"href":"https:\/\/pletzenauer.com\/en\/wp-json\/wp\/v2\/media?parent=1564"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/pletzenauer.com\/en\/wp-json\/wp\/v2\/categories?post=1564"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/pletzenauer.com\/en\/wp-json\/wp\/v2\/tags?post=1564"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}