Everyone in artificial intelligence is looking at the same thing, and most of us are lying to ourselves about it.
It’s not a malicious deception, or even a conscious one. It’s the kind of comforting, collective delusion that arises when billions of dollars, countless academic careers, and the entire public imagination depend on believing something that isn’t quite true.
The lie is that we’re building intelligence. That by scaling our models, feeding them the entirety of the internet, and piling on more parameters than there are stars in the galaxy, we will eventually cross some magical threshold into genuine machine cognition.
We won’t. Not on this path.
What we’re creating are not nascent minds. They are sophisticated parrots. We’ve poured the world’s knowledge into systems that can only ever mimic the patterns within it. We’ve engineered a miracle of mimicry, a marvel of statistical reflection, and become so mesmerised by its fidelity that we’ve mistaken it for the real thing. We’ve built the most sophisticated parrot in history and are trying to convince ourselves it’s a hawk.
The parrot can repeat anything you say, often more eloquently than you did. The hawk, however, understands the world. It adapts, it learns from every gust of wind, it changes its strategy based on the rustle of leaves. One is a static marvel of repetition. The other is a dynamic engine of learning.
This isn’t pedantic. It’s the most important problem in AI today. And our collective failure to grasp it is leading us toward a strange, powerful new form of stupidity: a dumb superintelligence.
## The Gospel of Scale: A Fair Hearing for the Orthodoxy
Before I dismantle the prevailing view, I must give it a fair hearing. The argument for scale as the primary driver of intelligence — the orthodoxy ruling from Silicon Valley to Shenzhen — is powerful and seductive.
First, emergent abilities are undeniably real. A model like GPT-2 could barely string a coherent sentence together. GPT-4 can write code, pass the bar exam, and explain complex scientific concepts. Somewhere between those two points, new capabilities appeared that weren’t explicitly programmed. They “emerged” from the sheer scale of the model and its training data. Proponents of scale argue this process isn’t finished. What other profound abilities lie dormant, waiting for us to build a model large enough to awaken them?
Second, history seems to be on their side. Computer scientist Rich Sutton articulated this in his essay, “The Bitter Lesson.” He observed that for 70 years, AI researchers who tried to build explicit knowledge and clever reasoning into their systems were consistently outperformed by those who simply leveraged more computational power. General methods that scale with compute, like search and learning, have always won. The lesson? Stop trying to be clever and just build bigger computers and bigger models.
Finally, the practical results are staggering. These “parrots” are already transforming industries. They are co-pilots for programmers, creative partners for artists, and powerful analytical engines for scientists. When a tool is this useful, it’s easy to forgive its fundamental limitations. It’s tempting to believe that whatever powers this utility must be a form of genuine intelligence.
This is a strong position, backed by billions in investment and tangible, world-changing results. It is also, I believe, fundamentally wrong. It mistakes performance for competence, and mimicry for understanding.
For the full technical analysis with code and data, see my deep dive on tyingshoelaces.
## The Cracks Appear: The Parrot in the Machine
The illusion of learning persists because, for most tasks, it looks identical to the real thing. But under pressure, at the edges, the cracks in the facade begin to show.
The most damning piece of evidence is a condition I call architectural anterograde amnesia. In neuroscience, this is a devastating condition where a person can no longer form new long-term memories. Their past is intact, but their future is a perpetually resetting present. This is the precise condition of every Large Language Model in existence.
After its gargantuan training run is complete, the model is fundamentally frozen. Its weights — the very substrate of its “knowledge” — are static. You can show it new information in a prompt, and it can use that information masterfully within that single conversation. This is “in-context learning.” But the moment that context window slides, that new information is gone forever. It has not been integrated. No learning has occurred. The model has not updated its worldview. It is a brilliant conversationalist with the memory of a goldfish.
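To make that concrete, here is a minimal sketch using the Hugging Face transformers library with GPT-2 as a stand-in model; the "Widget-9" fact is invented purely for illustration. The point is that inference never touches the weights: what looks like learning is just a longer prompt.

```python
# A minimal sketch: the weights are frozen, and "in-context learning" is
# nothing more than prepending new facts to the prompt string.
# "Widget-9" is an invented fact used only for illustration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # stand-in; any causal LM behaves the same way
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).eval()

for p in model.parameters():
    p.requires_grad_(False)          # weights are frozen after training

new_fact = "As of today, the Widget-9 chip uses a 3nm process. "
prompt = new_fact + "What process does the Widget-9 chip use?"

with torch.no_grad():                # no gradients, no updates, no learning
    ids = tok(prompt, return_tensors="pt").input_ids
    out = model.generate(ids, max_new_tokens=20, pad_token_id=tok.eos_token_id)

print(tok.decode(out[0], skip_special_tokens=True))
# The "knowledge" about Widget-9 lives only in the prompt string above.
# Start a fresh conversation without it and the model has never heard of it.
```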
This isn’t a bug we can fix with a clever patch. It’s a fundamental feature of the architecture. Transformer-based models are interpolation engines. They are designed to find the most statistically probable path through the vast, high-dimensional space of their training data. They take a prompt and find a plausible-sounding continuation based on the trillions of patterns they have already seen. They are not designed to encounter a genuinely novel piece of information and say, “Aha, this changes everything.” The very concept of “changing everything” is alien to their structure.
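A toy bigram model makes the point in a few lines: it can only continue along transitions it has already counted, and a genuinely novel word leaves it with no path at all. This is an illustrative sketch, not how a transformer is implemented, but the underlying principle of statistical continuation is the same.

```python
# A toy illustration of pure pattern interpolation: a bigram "language model"
# that can only continue along word transitions it saw during training.
from collections import defaultdict, Counter

corpus = ("the hawk hunts the field mouse . "
          "the parrot repeats the phrase .").split()

# Count which word follows which: this table is the model's entire "worldview".
transitions = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    transitions[prev][nxt] += 1

def continue_text(prompt_word, steps=5):
    out, word = [prompt_word], prompt_word
    for _ in range(steps):
        if word not in transitions:   # a genuinely novel word: no path exists
            break
        # Follow the statistically most probable continuation seen in training.
        word = transitions[word].most_common(1)[0][0]
        out.append(word)
    return " ".join(out)

print(continue_text("the"))         # fluent recombination of familiar patterns
print(continue_text("telescope"))   # an unseen word: the model has nothing to say
```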
This leads us to the uncomfortable truth. We haven’t built a machine that learns. We have built a machine that has *already learned everything it ever will*. It is a crystallised intelligence, a snapshot of the internet frozen in time.
And this is why I call it a “dumb superintelligence.” It possesses more factual knowledge than any human in history, yet it lacks the most basic component of true cognition: the ability to grow. It is a library that contains every book ever written and can remix their sentences into new paragraphs, but it can never, ever read a new book and add it to the shelves.
## The Deeper Truth: From Static Patterns to Dynamic Growth
If scale isn’t the answer, what is? If larger static models only create more sophisticated parrots, how do we build the hawk?
The answer requires a paradigm shift. We must stop thinking about intelligence as something to be *built* and start thinking of it as a process to be *cultivated*. The goal is not to create a finished artefact of intelligence, but to design a system capable of growth. We need to move from static architectures to dynamic ones. We need to build what I call neuro-symbiotic frameworks.
These are systems designed not just to process information, but to change themselves in response to it. They are built around growth loops — feedback mechanisms that allow the model to learn from its experiences and update its internal representations.
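To give a flavour of what a growth loop could look like in code, here is a minimal sketch of an online-update cycle in PyTorch, where every experience permanently nudges the weights rather than vanishing with the context window. The names here (GrowthLoop, the experience stream) are illustrative stand-ins, not a finished framework.

```python
# A minimal, illustrative growth loop: every experience updates the weights,
# so what the model "knows" is no longer frozen at training time.
# Names (GrowthLoop, experience) are illustrative, not a real framework.
import torch
import torch.nn as nn

class GrowthLoop:
    def __init__(self, model: nn.Module, lr: float = 1e-3):
        self.model = model
        self.opt = torch.optim.SGD(model.parameters(), lr=lr)
        self.loss_fn = nn.MSELoss()

    def experience(self, x: torch.Tensor, y: torch.Tensor) -> float:
        """One pass of the loop: act, compare with reality, update self."""
        pred = self.model(x)              # act on the world
        loss = self.loss_fn(pred, y)      # measure the surprise
        self.opt.zero_grad()
        loss.backward()                   # turn surprise into a weight update
        self.opt.step()                   # the model is now a different model
        return loss.item()

# Usage: a stream of experiences, each one leaving a permanent trace.
model = nn.Linear(4, 1)
loop = GrowthLoop(model)
for _ in range(100):
    x = torch.randn(1, 4)
    y = x.sum(dim=1, keepdim=True)        # the "world" the model must track
    loop.experience(x, y)
```

The mechanics here are ordinary online learning; the harder, open question is how to do this at the scale of a frontier model without catastrophically forgetting what it already knows.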