The key metric in AI

Insights from a four-year journey into AI design and use.

Everyone has learned a lot since the peak of the AI hype back in 2016. The technological progress being made every day, and the projections for where it was headed, made AI seem both inevitable and invincible. Exponential growth would account for everything. Humans would be redundant (or in utopia). Bring on UBI.

Most people probably knew in their gut that the story being told by the AI evangelists was too simple at best and plain wrong at worst. What was actually happening at the frontier revealed a more complex story of humans and machines working together, not robots doing everything, all the time.

One start-up whose technology showed the power of AI was Nutonian, since acquired by DataRobot. Its product, Eureqa, could evaluate literally millions of candidate models every second, far beyond what engineer-driven statistical methods could do. But what was most interesting was the company’s philosophy, which wasn’t about removing or downgrading the value of human experience and expertise, but about amplifying it.

One successful application of Eureqa was in deriving physical laws from data generated by complex physical systems. This functionality was used by Rio Tinto, a mining giant, to diagnose a production problem that had the process engineers stumped. They had been using statistical methods to analyze process data—hundreds of sensors providing details of processing conditions and the physical and chemical properties of every input and output—but those tools could not handle more than a handful of variables, and the analysis was taking months. The engineers hooked their data up to the AI and, 45 minutes later, had a smoking gun in the form of an equation linking the specification problem to a variable that was a complete surprise.
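
Eureqa’s actual engine is proprietary symbolic regression, so the sketch below is not its method. It is only a minimal illustration, in Python with invented data and variable names, of the general idea of machine-scale hypothesis screening: generate huge numbers of candidate relationships and keep the one that best explains the out-of-spec output.

```python
# A toy sketch (NOT Eureqa's engine): brute-force screening of candidate
# relationships between hypothetical sensor channels and a quality metric,
# keeping the best-correlated one. All data and names are invented.
import itertools
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical plant data: 200 sensor channels, 5,000 time steps.
sensors = rng.normal(size=(5_000, 200))
# The "surprise": quality secretly depends on the product of two obscure channels.
quality = 0.8 * sensors[:, 57] * sensors[:, 143] + rng.normal(scale=0.1, size=5_000)

def score(candidate: np.ndarray) -> float:
    """Absolute Pearson correlation between a candidate feature and quality."""
    return abs(np.corrcoef(candidate, quality)[0, 1])

# Evaluate every pairwise-interaction candidate (~20,000 of them) in seconds.
best = max(
    ((i, j, score(sensors[:, i] * sensors[:, j]))
     for i, j in itertools.combinations(range(200), 2)),
    key=lambda t: t[2],
)
print(f"strongest candidate: sensor_{best[0]} * sensor_{best[1]}, |r| = {best[2]:.2f}")
```

A real symbolic-regression engine searches over whole equation structures rather than just pairwise products, but the workflow is the same: the machine proposes a relationship, and humans decide what it means.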

While this correlation was interesting, what mattered most was what happened next—mobilizing human expertise to find a causal relationship. The plant’s R&D engineers searched the metallurgical literature and found prior work that explained the phenomenon—work no one would have thought to revisit without the pointer from the machine. This was a story of humans and machines both playing to their strengths.

Then came the Princeton/Bath work on bias in AI. This was a critical turning point for many people working in AI because it framed algorithmic bias as human bias. Up until this point, people hadn’t thought much about human experience being inherently biased, or about how that bias would be learned by an AI. This work held up a mirror—yes, flowers are more pleasant than bees.

So what did this mean for how AI understands the world when AI has no common sense? Common sense couldn’t be coded into an AI, and our algorithms aren’t general enough to learn it for themselves, so how could we tell an AI what we want it to know about human experience? If we don’t ask an AI to be unbiased and fair, it won’t be. It will simply be what it learns from the data and what we tell it to optimize.

Anytime you train an algorithm based on human culture, you wind up with results that mimic it. — Joanna Bryson

There’s no doubt that the AI community has made significant progress on bias and fairness since 2016. But we haven’t got anywhere near where we should be. Back then, a Google image search would return CEO Barbie as the first female CEO. Thankfully, CEO Barbie is now gone. Female CEOs are more “fairly” returned, but we still have no idea how the algorithm works. Is it some Googler’s judgment of accurate statistical representation, combined with a dose of aspiration? What does the AI do in real time, and how do humans contribute? We have no insight into, or oversight of, the way our worldview is shaped by human decisions about AI in search, other than pointing out absurdities from the sidelines.

Tech in general, and AI in particular, desperately needs more diversity in the design process. It matters because those other voices can actually alter how the intricate dials of algorithmic tuning get set—but only if they have practical, measurable, inclusive and collaborative ways of being involved early. Tweaking things on the back end leads to all sorts of problems, including unethical data-gathering practices—such as when Google offered gift cards to homeless Black people in exchange for images of their faces, in a misguided attempt to improve the company’s facial recognition algorithms.

In 2017, I was lucky enough to visit DeepMind in London. DeepMind is iconic in AI and it was an exciting moment. But it was a bit of a letdown, because I wasn’t treated as someone who had anything to contribute, despite having deep domain knowledge in one of their projects at the time (electricity grid operations), as well as good press credentials—Quartz—and a decent, albeit boutique, pedigree in AI market research—Intelligentsia. I left having had a delicious lunch in their in-house cafeteria and with some nice photos of the view of the London skyline. It still, to this day, feels like an opportunity lost.

AI’s culture was observably one of technical arrogance. Many AI experts simply didn’t see non-technical input as valid. Worse was when those developing AI pooh-poohed non-technical people’s concerns. The engineers who make the design decisions can position themselves as the ones entitled to exercise personal judgment; and if they are also seen as best placed to evaluate a hypothetical harm, they have the power to dismiss the concern as not realistic, not relevant, or not worth bothering about given the probabilities. As Sam Harris said in his 2016 TED talk:

One researcher has said, “Worrying about AI safety is like worrying about overpopulation on Mars.” This is the Silicon Valley version of “don’t worry your pretty little head about it.” — Sam Harris

We still have to work on this arrogance. One of the ways to do it is to develop more empathy and compassion towards those harmed—right now, today—by AI. This means that every story we read about AI harm is relevant. It might be discrimination in some far-flung corner of the world that we would otherwise think has nothing to do with our daily lives as white-collar professionals. But every story of harm is relevant to how AI is designed. This is why even designing very narrow AI for a specific commercial use case needs tools informed by research from socially focused non-profit think tanks, even if their work feels remote.

AI in 2017 was a platform story. When Facebook redefined “meaningful connections” for billions of people, Dave wrote a terrifically popular piece about what Facebook could have learned from Kierkegaard, a dead existentialist philosopher.

The core philosophical issue with Facebook’s algorithmic change is the conundrum that the very act of choosing meaningful content for us means that the consumption of that content cannot be meaningful. By filtering our experiences, Facebook removes our agency to choose. And by removing our choice, it eliminates our ability to live authentically. An inauthentic life has no meaning. — Dave Edwards

We’ve become somewhat obsessed with how AI affects human agency. It plays a central role in how we think about AI design. There are a great many upsides—places where AI can contribute to human agency through well-designed nudges, where users stay in control and get feedback about their own choices. “Pre-commitment” signals are discoverable by AI. Imagine you want to eat fewer unhealthy snacks but find it hard to resist. Order smaller serving sizes and an AI can detect that you intend to eat less of them, then nudge you to keep that commitment to yourself. With the right design, this is agency-increasing AI.
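
As a concrete illustration only—the serving-size scenario and every name below are hypothetical, not any real product’s API—a minimal sketch of such a nudge might look like this in Python. The signal is the user’s own recent choices, and the system’s only action is to remind them of their commitment.

```python
# A minimal sketch of an agency-increasing nudge, under assumed data: the user's
# own order history is the "pre-commitment" signal, and the nudge simply reflects
# their stated intention back to them. Names and thresholds are hypothetical.
from dataclasses import dataclass
from datetime import date

@dataclass
class SnackOrder:
    day: date
    serving_grams: int

def detect_precommitment(orders: list[SnackOrder], window: int = 5) -> bool:
    """True if the user's recent serving sizes are trending clearly smaller."""
    if len(orders) < window + 1:
        return False
    recent = [o.serving_grams for o in orders[-window:]]
    earlier = [o.serving_grams for o in orders[:-window]]
    return sum(recent) / len(recent) < 0.8 * (sum(earlier) / len(earlier))

def nudge(orders: list[SnackOrder], proposed_grams: int) -> str | None:
    """Remind the user of their own commitment; never block or decide for them."""
    if detect_precommitment(orders) and proposed_grams > orders[-1].serving_grams:
        return ("You've been choosing smaller servings lately -- "
                "stick with the smaller size today?")
    return None  # no nudge: the choice stays entirely with the user

# Example: a history that trends smaller, then a proposed return to a big serving.
history = [SnackOrder(date(2020, 4, d), g)
           for d, g in zip(range(1, 11), [80, 80, 75, 80, 75, 50, 45, 50, 45, 40])]
print(nudge(history, proposed_grams=80))
```

The design choice that keeps this agency-increasing is that the system surfaces the user’s own pre-commitment and gives feedback, while the final decision stays entirely with the user.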

But AI that’s designed without conscious consideration of human agency is far more likely to trend towards decreasing it. Why? Because AI is incredibly good at creating preferences that favor the AI’s own goal. The paradox of personalization is that the best way to personalize a person’s future is to make that future less personal: to increase the efficiency of personalization, designers have to put individual consumers in a box that fits the prediction of who they will be tomorrow.

Perhaps the most striking impact of AI on human agency is through the things we don’t yet know about ourselves. Unsupervised learning techniques, and AI’s ability to find the unintuitive or even the undiscoverable (for a human), haven’t really been tested by society yet.

The “gold standard” for unsupervised learning is a discovery made by DeepMind in 2018. Human ophthalmologists have a 50/50 chance of determining someone’s gender by looking at their retina. It’s a guess; there’s no way to tell. DeepMind’s AI got it right 97% of the time. But one of the researchers said that he thinks the AI’s prediction accuracy is closer to 100%. Who is lying? Who doesn’t know? What don’t we know yet about gender in humans? What should an observer, using the AI, do if it reveals that someone is male yet the subject says they are female? We are not well-prepared to deal with the ethics of such new knowledge.

Everyone is an AI designer now—we all have a stake in how these systems are built, how human values are instantiated in machines, and how we hold people accountable. The minimum qualification is lived experience. AI design should be measured by how well it enables participation, because designing AI, and experiencing it in use, so often involves redefining boundaries.

That’s what our product is about: human-centered AI design in one place, so that everyone can understand, participate, be valued, and design AI that humans want. For more information on Sonder Scheme Studio, our human-centered system for AI design, go here. You can schedule a demo with us here.


This week, a selection of thoughts on why videoconferencing sucks.

  • From the Chronicle of Higher Education: “I think the exhaustion is not technological fatigue,” Petriglieri says. “It’s compassion fatigue.”
  • From The Convivial Society: “What all of this amounts to, then, is a physically, cognitively, and emotionally taxing experience for many users as our minds undertake the work of making sense of things under such circumstances. We might think of it as a case of ordinarily unconscious processes operating at max capacity to help us make sense of what we’re experiencing.”
  • From The Conversation: “There are no longer two consciousnesses” in a moment of locked eye contact, “but two mutually enfolding glances.”

What Facebook is doing in response with Messenger Rooms, via The Verge. Plus what Zuckerberg thinks about videoconference fatigue, courtesy of Casey Newton’s newsletter:

“I think some of this also is just about the social dynamics. I get a headache when I sit in the office—or when I used to sit in the office, I guess, before all of this—scheduled minute to minute throughout the day, because I didn’t have time to take a break or think. I think that some people are having that reaction now, where you’re just on videoconferences all day long. But that’s not because you’re on a videoconference all day long, it’s because you’re in meetings all day long, back to back. So I think a lot of this is more about the social dynamics than it is just about the technology.” — Mark Zuckerberg

Well, yeah.

Have a great week, and to my subscribers in NZ—enjoy level 3. We here in Oregon are very envious, in so many ways.
