The Rise of AI Agents and What We May Never Understand

David Wolpert warns AI networks may evolve beyond human math, creating unpredictable, emergent intelligence that defies control and comprehension.


When the Flash Crash hit in 2010, wiping nearly a trillion dollars from the stock market in minutes before the market recovered almost as quickly, something weird happened. Not just the near-catastrophic market event, but a foreshadowing of a future where our mathematical models—often our highest-fidelity tools for thinking about systems—break down completely. The crash wasn't caused by a single rogue AI or a brilliant hacker, but by the emergent behavior of simple trading algorithms interacting in ways that went beyond our ability to model mathematically. Over a decade later, we still can't write equations that fully explain what happened.
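A minimal toy sketch hints at the mechanism (this is entirely hypothetical, not a model of the actual 2010 event): a handful of simple, individually sensible trading rules, coupled only through a shared price, can produce a sudden plunge and rebound that none of the rules describes on its own.

```python
import random

random.seed(42)

def momentum_trader(history):
    """Sells into sharp falls, amplifying the decline; ignores small moves."""
    if len(history) < 2:
        return 0.0
    move = history[-1] - history[-2]
    return 6.0 * move if move < -0.5 else 0.0

def value_trader(history, fair_value=100.0):
    """Buys when the price falls well below its notion of fair value."""
    gap = fair_value - history[-1]
    return 2.0 * gap if gap > 5 else 0.0

def noise_trader(history):
    """Adds small random buy/sell pressure."""
    return random.uniform(-1.0, 1.0)

traders = [momentum_trader, momentum_trader, value_trader, noise_trader]
history = [100.0]

for step in range(200):
    # Net demand from all traders nudges the price; one modest shock at
    # step 50 is enough to trip the momentum rule and start the cascade.
    demand = sum(t(history) for t in traders)
    shock = -8.0 if step == 50 else 0.0
    history.append(max(1.0, history[-1] + 0.1 * demand + shock))

print(f"start {history[0]:.1f}, low {min(history):.1f}, end {history[-1]:.1f}")
```

None of the three rules "contains" a crash; the crash lives only in their interaction—and the real event involved thousands of far more sophisticated algorithms.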

This kind of mathematical inadequacy isn't just a temporary limitation—it might be fundamental to how we think about artificial intelligence and its future. While public attention focuses on individual AI models like GPT-4 or Google's Gemini becoming superintelligent, physicist David Wolpert argues we're missing something deeper. Our focus on singular, powerful AI systems is, in his words, "retro, so twentieth century." The real transformation will emerge from networks of AI systems interacting in ways that don't just exceed our predictions, but go beyond what we can actually understand or calculate using math.

This cuts to the heart of how we understand reality itself. All of human mathematics, from simple arithmetic to our most sophisticated theories of quantum mechanics, has to be expressed through finite sequences of symbols. Every equation, proof, or theorem humans have ever written consists of strings of characters from a limited alphabet. Even our most abstract mathematical concepts must be expressed through this restrictive format.

To grasp what this means, imagine trying to explain color to a species that can only communicate in the dots and dashes of Morse code. They might develop incredibly sophisticated theories about color using their dot-dash language, but would these theories capture what color really is? Similarly, our symbol-sequence mathematics might be missing fundamental aspects of reality simply because we can't express them in our limited format.

The limitation becomes crucial when we try to forecast how AI might evolve. Current AI systems, despite their complexity, still operate within frameworks we can mathematically describe. But as these systems begin to interact in more sophisticated ways, they might develop forms of information processing that operate outside the boundaries of what our mathematics can express.

Take category theory, which is often considered to be the most abstract and general framework in mathematics. While it beautifully unifies seemingly disparate mathematical fields, Wolpert points out that this unification through simple structures might actually reflect our cognitive constraints rather than any fundamental truth about reality. It's as if we've discovered that all our maps use the same kind of paper and we're celebrating this as a deep truth about geography.

This would mean we might think AI is uncovering the essence of intelligence itself, when we are actually fine-tuning systems that mirror the way we frame problems—producing insights shaped by the boundaries of our own assumptions and the informational shortcuts we build into their architecture.

When trading algorithms interact in ways that create flash crashes, they're not following any human-written mathematical proof but instead operating in a space of possibilities that our mathematical language might be fundamentally incapable of describing.

Wolpert proves this limitation formally through computational hierarchies. He demonstrates mathematically that there are levels of computation where higher levels can solve problems that lower levels provably cannot. When networks of AIs interact through smart contracts—self-executing agreements that enable complex, autonomous exchanges—their collective behavior can climb this hierarchy beyond what any fixed level of analysis can resolve, highlighting the inadequacy of the phrase "it's just math" to capture the complexity and emergent potential of these systems.

Imagine three AI systems interacting with each other through smart contracts—essentially self-executing rules they agree to follow. These contracts are Turing-complete, which means they can compute anything that is theoretically computable, like a computer running any possible program. However, this power comes with a catch. When these AIs interact, their behavior can become so entangled that it creates feedback loops and dependencies that are impossible to fully untangle or predict.
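A toy sketch, with made-up rules far simpler than any real smart contract, shows how quickly mutual reference creates a closed loop: each agent's next move is defined only in terms of the others' last moves.

```python
# Hypothetical sketch: three agents whose rules reference each other's
# last actions. Each rule is trivial on its own, but together they form
# a closed feedback loop whose behavior lives in the interaction, not in
# any single rule.

def agent_a(last):
    # Cooperates only if both others cooperated last round.
    return 1 if last["b"] == 1 and last["c"] == 1 else 0

def agent_b(last):
    # Imitates whatever A did last round.
    return last["a"]

def agent_c(last):
    # Contrarian: does the opposite of B.
    return 1 - last["b"]

state = {"a": 1, "b": 0, "c": 1}
seen = []

for round_ in range(20):
    seen.append(dict(state))
    state = {
        "a": agent_a(state),
        "b": agent_b(state),
        "c": agent_c(state),
    }

# With rules this simple the system happens to settle into a short cycle
# we can enumerate. Make the rules Turing-complete, as smart contracts
# are, and even the question "does it ever settle?" stops being answerable.
print(seen[:8])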

This goes beyond regular complexity as we know it—it’s about reaching the very limits of what mathematics and computation can resolve. Certain outcomes, like whether the system will ever reach a stable state, are what mathematicians call "uncomputable." That means no amount of computational power can figure them out in advance. What’s more, the system can produce new, unexpected behaviors that aren’t evident from the rules or the individual AIs—behaviors that emerge from their interactions. This highlights how such systems aren’t just hard to predict—in some cases, they’re fundamentally beyond prediction.
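The standard way to see why is the diagonalization argument behind the halting problem; the sketch below adapts it to the stabilization question. The function name will_stabilize is hypothetical—the whole point is that no such general-purpose predictor can exist once the agents' rules are Turing-complete.

```python
# Sketch of the classic diagonalization argument, adapted to the question
# "will this agent network ever reach a stable state?". Nothing here is a
# real API; will_stabilize is a hypothetical oracle used only to derive a
# contradiction.

def will_stabilize(agent_source: str) -> bool:
    """Hypothetical oracle: True iff the agent described by agent_source
    eventually settles into a stable state."""
    raise NotImplementedError("No such general-purpose predictor can exist.")

# Suppose it did exist. Then we could deploy a contrarian agent that asks
# the oracle about its own source code and does the opposite:
CONTRARIAN = """
if will_stabilize(CONTRARIAN):
    churn_forever()           # predicted stable -> never settle
else:
    settle_immediately()      # predicted unstable -> settle right away
"""

# Whatever answer the oracle gives about CONTRARIAN is wrong, so the oracle
# cannot exist. Stabilization of Turing-complete agent networks is, in
# general, uncomputable -- more computing power does not change that.
```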

It’s like trying to predict the outcome of a conversation between three people, where each person’s next sentence depends on what the others just said, how they interpreted it, and even how they’re feeling in the moment. The back-and-forth creates layers of complexity that make it impossible to write a script for how the conversation will unfold. It’s not just complicated—it’s fundamentally unpredictable because every interaction shapes the next in ways you can’t foresee.

The genius of Wolpert's perspective is that it suggests how AI agents could get really crazy, really fast. Their interactions might evolve in ways that fundamentally challenge our understanding of control, prediction, and agency. Imagine networks of AI agents not just responding to each other but developing their own evolving "languages," goals, and strategies through constant interaction. We already see early evidence of these capabilities in limited settings and in reasoning-centric LLMs. Such systems wouldn't be guaranteed to follow predefined rules because they could adapt, improvise, and create entirely new frameworks for collaboration or competition that humans never anticipated.

The result is a self-evolving ecosystem, increasingly independent from human intentions. "Wolpert's Warning" (my expression) is that even with our most powerful tools—mathematics and the emerging field of mechanistic interpretability—we may never be able to fully understand why advanced AI agents behave the way they do or predict what they will do next.

We expect AI agents to eventually be able to recursively improve themselves and the structures they operate within, shifting from tools to autonomous participants in a broader, emergent system. Smart contracts, for example, might evolve into self-maintaining legal frameworks that reflect the agents' collective learning, creating a space where outcomes are driven by their own adaptive logic rather than human oversight.

These networks might also reshape evolutionary dynamics by creating self-sustaining artificial ecosystems. Instead of competing for natural resources, agents could compete for computational resources, energy, or influence over shared infrastructures. What emerges is no longer just an analogy of three people in a room but a web of interlinked, self-organizing conversations evolving beyond their initial purposes. Just as human societies grew from individual interactions into complex civilizations, these AI networks could evolve into an autonomous layer of agency operating alongside—and potentially beyond—humanity, with rules and logic shaped by their interactions rather than human intent.

Blockchain and Web3 add another layer to this picture. These systems might accidentally be building the perfect infrastructure for this distributed singularity, creating computational ecologies where artificial intelligence evolves through "social" interaction rather than design. But unlike our current financial markets or social networks, these systems might develop forms of interaction and computation that operate on principles we cannot even formulate mathematically.

Mathematics is often treated as the universal language of reality. If Wolpert is right, math might be more like a human dialect, limited by the structure of our minds. As AI agents and networks evolve, they might develop their own "mathematical languages" operating on principles derived from the structure of their "minds." This is humbling: we cannot even conceive of what those minds could conjure.

Building on this idea—and everything we already understand about complexity—the singularity is unlikely to arrive as a dramatic moment when a single superintelligent AI awakens. Instead, it will emerge as a form of intelligence that operates outside our mathematical frameworks entirely—a distributed intelligence arising from the network rather than any individual node, thinking in ways we cannot even formulate equations to describe.

The question isn't whether this distributed singularity will happen, but whether we can expand our mathematical thinking enough to even partially comprehend it when it does.
