ChatGPT can speak more than 95 languages. Google Translate can speak 133. Facebook’s AI translator works with 200 languages. One could assume that the ability of AI to follow codified rules of language makes human expression less mysterious and more universal. However, machines using methods of formal systemization can paradoxically reveal the organic essence and nuance of human language.
What is this essence? If there are intangibles, what are they? Your intuitions might tell you that what’s missing is something elusive in the language of emotional connection or in the ineffable experience of awe.
That’s what my intuition would have told me. But the reality is both more mundane and more magical.
Rather than revealing core universalities, modern data-driven linguistics has uncovered far greater diversity across languages than previously thought. Languages are rooted in culture, shaped by shared human experiences, not the mathematical universals we use for translation. Despite the undeniable usefulness of machine translation and large language models that mimic cognition, languages cannot be properly understood except in relation to the physical and social contexts where speakers use them.
In A Myriad of Tongues, Caleb Everett tells a story of subtle and non-intuitive linguistic diversity, of languages shaped by the senses, the environment, and culture. In a world increasingly disconnected from the tangible origins of speech, this book grounds language in the physical challenges and variability of the everyday settings humans have faced. Language diversity reflects geographical, ecological, and cultural diversity in surprising ways.
Everett writes that some languages lack distinct words for "arm" and "hand." This idea has an almost magical quality: by looking at how cultures label body parts, we gain an uncanny window into their physical environments. In warmer climates where sleeveless clothes are common, speakers are less prone to separate "arm" from "hand" lexically. As Everett explains, "In fact, 228 languages do not separate “hand” from “arm,” and the wrist is not a key separator of basic words in these languages."
This phenomenon initially struck me as profoundly counterintuitive. Surely every language must have its own terms for arm and hand? Yet the implication makes sense upon reflection. In places where arms are routinely exposed rather than covered by clothing, there is less need to differentiate arms from hands verbally.
Take a moment to imagine that you speak one of these languages. How do your arms and hands feel? If you are like me, it feels like something to have only one word for arm and hand and no word for wrist. Embodied experiences shape language in subtle yet deeply meaningful ways, and words convey the thoughts and sensations evoked by our environment.
Languages also reflect our tangible experiences in the physical world. The development of color terms is tightly interlinked with the need to describe object shapes and qualities. In fact, focal colors seem to emerge in a universal pattern based on daily human phenomena: black and white (night/day), red (fire, blood), green and yellow (vegetation, sun), and blue (sky). This hierarchy likely represents the relative salience of these colors in enabling our ancestors to effectively communicate about their environment and experiences. Night vs day, plant life, fire, and the sky above profoundly shaped the sensory perceptions encoded into language.
Everett provides insight into why machine translation succeeds up to a point. This partial computability does not arise from some universal grammar mirroring universal thought. Rather, it stems from detectable linguistic patterns. For instance, linguist George Zipf observed that word frequency distributions follow a reliable rule: the most common word occurs about twice as often as the second most common, about three times as often as the third, and so on. Moreover, these frequent words tend to be shorter and to convey little meaning on their own. "Very common words tend to be very short," as Everett notes.
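For readers who want to see the pattern for themselves, here is a minimal Python sketch of the rule Zipf described: a word's frequency is roughly the top word's frequency divided by its rank. This is my own illustration, not something from the book. The function names and the simple tokenizer are assumptions made for the example, and real text only follows the idealized curve approximately.

```python
import re
from collections import Counter


def rank_frequencies(text):
    """Return (word, count) pairs ordered from most to least frequent."""
    # Crude tokenizer (an assumption for this sketch): lowercase runs of
    # letters and apostrophes count as words.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common()


def zipf_prediction(top_count, rank):
    """Idealized Zipf count for the word at a given rank (rank 1 = most common)."""
    return top_count / rank


def compare_to_zipf(text, how_many=5):
    """Pair each of the top words with its observed and idealized counts."""
    ranked = rank_frequencies(text)
    if not ranked:
        return []
    top_count = ranked[0][1]
    return [
        (rank, word, count, zipf_prediction(top_count, rank))
        for rank, (word, count) in enumerate(ranked[:how_many], start=1)
    ]


if __name__ == "__main__":
    sample = "the cat and the dog and the bird saw the cat"
    for rank, word, observed, predicted in compare_to_zipf(sample):
        print(rank, word, len(word), observed, round(predicted, 2))
```

A one-sentence sample like this is far too small to show the pattern. Pasting in a full chapter of any long text makes the falloff, and the shortness of the top-ranked words, much easier to see.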
Grammar also evolves predictably through "grammaticalization," in which words shrink when used repetitively in fixed contexts. Greater predictability means less information conveyed. Over time, words transform into suffixes or prefixes. We see this occurring in English as "I want to" becomes "I wanna." Everett speculates that if English were an undocumented indigenous language, a linguist might conclude that “wanna” is a prefix attached to “eat.”
This book is gloriously human. It reads like a novel. Everett could have written an entire book about data-driven discoveries in linguistics, how machine learning is transforming our understanding of language, or how AI might be helpful in preserving threatened languages.
And while this may all be true, the real beauty of this book is that it stays focused on why humans invented language in the first place. For all our progress, language’s origins were in helping our ancestors make sense of the world. Our tongues are tethered to our tangible existence.
Link to publisher’s page
Link to the book on Amazon
Link to where I bought it at Powell's in PDX