AI Agents, Mathematics, and Making Sense of Chaos
This week, in part 3, we look at what we've learned this year since the 2021 discovery of "grokking," the phenomenon in which generalization happens abruptly, long after a model has fit its training data.
You can read part 1 of the series here, and part 2 here.
In 2021, researchers training tiny models made a surprising discovery. A set of models suddenly flipped from memorizing their training data to correctly generalizing on unseen inputs, long after they had already fit the training data. Since then, this phenomenon, called “grokking,” has been investigated further and reproduced in many contexts and at larger scale.
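If you want to see the effect yourself, the sketch below is a minimal, illustrative version of the classic setup: learning modular addition from a fraction of all possible input pairs. It assumes PyTorch, swaps the original paper's small transformer for an even simpler embedding-plus-MLP model, and uses hyperparameters chosen for illustration rather than carefully tuned, so treat it as a starting point rather than a faithful reproduction.

```python
# Minimal grokking-style experiment: learn (a + b) mod p from a subset of pairs.
# Assumes PyTorch. A small embedding + MLP stands in for the original paper's
# transformer; hyperparameters are illustrative and may need adjustment to see
# the effect. Running 20,000 full-batch steps can take a few minutes on CPU.
import torch
import torch.nn as nn

torch.manual_seed(0)
p = 97                 # modulus; inputs are all pairs (a, b) with a, b in [0, p)
frac_train = 0.4       # fraction of pairs used for training

# Build the full dataset of pairs and split into train / test.
pairs = torch.cartesian_prod(torch.arange(p), torch.arange(p))
labels = (pairs[:, 0] + pairs[:, 1]) % p
perm = torch.randperm(len(pairs))
n_train = int(frac_train * len(pairs))
train_idx, test_idx = perm[:n_train], perm[n_train:]

class ModAddMLP(nn.Module):
    def __init__(self, p, d=128):
        super().__init__()
        self.embed = nn.Embedding(p, d)
        self.net = nn.Sequential(nn.Linear(2 * d, 256), nn.ReLU(), nn.Linear(256, p))

    def forward(self, ab):
        e = self.embed(ab)                       # (batch, 2, d)
        return self.net(e.flatten(start_dim=1))  # (batch, p) logits

model = ModAddMLP(p)
# Weight decay supplies pressure toward smaller weights; many grokking
# reproductions find it is what eventually tips the model into generalizing.
opt = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1.0)
loss_fn = nn.CrossEntropyLoss()

def accuracy(idx):
    with torch.no_grad():
        preds = model(pairs[idx]).argmax(dim=-1)
        return (preds == labels[idx]).float().mean().item()

for step in range(20000):
    opt.zero_grad()
    loss = loss_fn(model(pairs[train_idx]), labels[train_idx])
    loss.backward()
    opt.step()
    if step % 1000 == 0:
        print(f"step {step:6d}  train acc {accuracy(train_idx):.3f}  test acc {accuracy(test_idx):.3f}")
# Typical pattern: train accuracy saturates early, test accuracy sits near chance
# for a long stretch, then jumps late in training; that delayed jump is the
# grokking signature.
```

In runs like this, the interesting quantity to watch is the gap between the train and test curves and how long it persists before it suddenly closes.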
Generalization is a three-stage process. Initially, models memorize the training data. They then form intricate internal circuits for solving the problem. Finally, they refine these solutions: in a “clean-up” phase, they shed the now-redundant dependencies on memorized data.
Though it appears sudden in the performance metrics, this process is gradual and nuanced under the surface. The train and test metrics that track learning over time show the abrupt jump, but measures of the model's internal structure progress steadily throughout training. The sudden shift is evidence of the complex, layered nature of AI learning, where transformative moments are built upon a foundation of gradual, consistent learning.
What’s going on? Initially, neural networks focus on memorizing the training data. The complexity of memorization scales with the size of the dataset: more examples means more to store. The complexity involved in generalization, the network’s ability to apply learned rules to new, unseen data, stays roughly constant regardless of the dataset’s size. Beyond a certain dataset size, the generalizing solution becomes the cheaper one, so at some point in training the network shifts its focus from memorization to generalization. That shift occurs quite suddenly, marking a pivotal moment in the network’s training.
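Here is a toy illustration of that crossover in plain Python. The per-example cost and the fixed cost of a general rule are invented numbers, chosen only to show the shape of the argument, not measurements from any real network.

```python
# Toy illustration of the crossover argument (numbers are invented, not measured):
# memorizing costs something per training example, while a general rule has a
# fixed cost no matter how large the dataset gets.
COST_PER_MEMORIZED_EXAMPLE = 1.0   # hypothetical "complexity units" per example
COST_OF_GENERAL_RULE = 5000.0      # hypothetical fixed cost of the general solution

for n_examples in [1_000, 2_000, 5_000, 10_000, 50_000]:
    memorization_cost = COST_PER_MEMORIZED_EXAMPLE * n_examples
    cheaper = "memorize" if memorization_cost < COST_OF_GENERAL_RULE else "generalize"
    print(f"{n_examples:>6} examples: memorize={memorization_cost:8.0f}, "
          f"general rule={COST_OF_GENERAL_RULE:8.0f}  cheaper: {cheaper}")
# Past the crossover (here, 5,000 examples) the general rule is the cheaper way
# to fit the data, which is why the pressure toward generalization eventually wins.
```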
This abrupt shift is part of what's known as a phase transition. A phase transition in neural networks is the rapid development of a specific capability during a brief period of training. Rather than improving gradually, the model experiences a sudden leap from poor performance to proficiency on that task. This is thought to reflect how much harder it is for training to find a solution that generalizes well than one that merely memorizes.
The current thinking is that as the network memorizes, it becomes increasingly complex, and its built-in bias towards simpler solutions intervenes, preventing further memorization. Think of this bias as an Occam’s razor for AI. It resolves the tension between continuing to memorize and adopting the simpler solution. Memorization and generalization compete and, at some point in training, the model switches abruptly to generalization because it is the simpler way to fit the data.
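One concrete form that simplicity bias often takes in practice is weight decay, a penalty on the size of the weights. The sketch below, with invented numbers, shows how such a penalty breaks the tie between two hypothetical solutions that fit the training data equally well: a bulky, memorization-like one and a compact, rule-like one.

```python
# Two hypothetical parameter settings that fit the training data equally well
# (training loss 0.0 for both); only their weight norms differ. All numbers
# are invented purely to illustrate how a simplicity penalty breaks the tie.
def regularized_loss(train_loss, weights, weight_decay=0.01):
    # Standard L2 penalty: training loss plus weight_decay * sum of squared weights.
    return train_loss + weight_decay * sum(w * w for w in weights)

memorizing_solution = [12.0, -9.5, 14.1, -11.7]   # large weights: a lookup-table-like fit
generalizing_solution = [0.8, -0.6, 1.1, -0.9]    # small weights: a simple shared rule

print("memorizing  :", regularized_loss(0.0, memorizing_solution))
print("generalizing:", regularized_loss(0.0, generalizing_solution))
# The penalized objective prefers the small-norm solution, so once a generalizing
# circuit exists, training pressure shifts weight onto it and the memorized
# components get cleaned up.
```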
This transition from rote learning to sophisticated problem solving is akin to how we learn. Memorization lays the foundation for learning general patterns and remains fundamental to human learning and creativity.
Without generalization, an intelligence can only respond to current inputs in a manner consistent with what worked in the past. If the future is the same as the past, that’s fine. But we are far more interested in situations where the future is different, where we need to discover some kind of underlying past pattern that shares something in common with an emergent future.
How we define intelligence is being rewritten. What does it mean to be intelligent? It is the ability to generalize, to learn anew and break out of past patterns, that gives intelligence its meaning and makes human intelligence so multifaceted and endlessly useful. And when we can see this happen at the level of math inside an AI: spooky.
I’m utterly fascinated by how learning happens—whether we’re talking about biological or artificial intelligence. Returning to Sutton, who made a second point in his Bitter Lesson: the actual contents of minds are tremendously, irredeemably complex. He advocated then that we should stop trying to find simple ways to think about the contents of minds.
It might be fair to say that 2023 was the year when this view was borne out empirically. The more we understand how these models learn, the more spookily intelligent their learning appears.
I hope you've enjoyed this short series. If 2023 was the year when mechanistic interpretability moved from nascent to niche, as some people claim, then 2024 may see it become solely an engineering problem. At the very least, it will lead us into new territories of understanding what it means to be intelligent.