AI Agents, Mathematics, and Making Sense of Chaos
From Artificiality This Week * Our Gathering: Our Artificiality Summit 2025 will be held on October 23-25 in Bend, Oregon.
A conversation with Professor James Evans from the University of Chicago about scientific progress.
Why is scientific progress slowing down? That's a question that's been on the minds of many. But before we dive into that, let's ponder this—how do we even know that scientific progress is decelerating? And in an era where machines are capable of understanding complexities that sometimes surpass human cognition, how should we approach the future of knowledge?
Listen on Apple, Spotify, or YouTube.
Joining us in this exploration is Professor James Evans from the University of Chicago. As the director of the Knowledge Lab at UChicago, Professor Evans is at the forefront of integrating machine learning into the scientific process. His work is revolutionizing how new ideas are conceived, shared, and developed, not just in science but in all knowledge and creative processes.
Our conversation today is a journey through various intriguing landscapes, delving into questions about the slowing pace of scientific discovery, how teams should be structured to foster disruptive innovation, the risks of homogeneity of thinking and AI model concentration, and what the evolving nature of knowledge means for the next generation of scientists.
We're also thrilled to discuss Professor Evans' upcoming book, "Knowing," which promises to be a groundbreaking addition to our understanding of these topics.
So, whether you're a scientist, a creative, a business leader, a data scientist or just someone fascinated by the interplay of human intelligence and artificial technology, this episode is sure to offer fresh perspectives and insights.
Welcome to Artificiality, Where Minds Meet Machines.
We founded Artificiality to help people make sense of artificial intelligence.
Every week, we publish essays, podcasts and research to help you be smarter about AI.
Please check out all of Artificiality at www.artificiality.world.
Why is scientific progress slowing down?
That's a question that's been on the minds of many.
But before we dive into that, let's ponder this.
How do we even know that scientific progress is decelerating?
And in an era where machines are capable of understanding complexities that sometimes surpass human cognition, how should we approach the future of knowledge?
Joining us in this exploration is Professor James Evans from the University of Chicago.
As the Director of the Knowledge Lab at UChicago, Professor Evans is at the forefront of integrating machine learning into the scientific process.
His work is revolutionizing how new ideas are conceived, shared and developed, not just in science, but in all knowledge and creative processes.
Our conversation today is a journey through various intriguing landscapes.
We delve into questions like, why has the pace of scientific discovery slowed and can AI be the catalyst to reaccelerate it?
How does the deluge of information and the battle for attention affect the evolution of ideas?
What are the predictive factors for disruption and novelty in the idea landscape?
How should teams be structured to foster disruptive innovation?
What risks do homogeneity of thinking and AI model concentration pose?
How can we redefine diversity in the context of both human and AI collaboration?
What does the evolving nature of knowledge mean for the next generation of scientists?
Can tools like ChatGPT enhance diversity and innovative capabilities in research?
Is it time to debunk the myth of the lone genius and focus instead on the collective intelligence of humans and AI?
We're also thrilled to discuss Professor Evans' upcoming book, Knowing, which promises to be a groundbreaking addition to our understanding of these topics.
So whether you're a scientist, a creative, a business leader, a data scientist, or just someone fascinated by the interplay of human intelligence and artificial technology, this episode is sure to offer fresh perspectives and insights.
James, thanks so much for joining us.
We're excited to talk to you today.
Perhaps start off by telling us what inspired you to start the Knowledge Lab?
When I initially began graduate school as a PhD student in sociology at Stanford, I had previously been at Harvard University studying social network analysis.
I was interested in culture and the nature of knowledge and thought that some of these new formalisms might be a way to represent knowledge at scale and analyze it at scale.
And it turns out, you know, I realized there was an entire industry associated with the construction of knowledge representations, knowledge graphs and ontologies that was devoted to this, but it was really emergent from Aristotelian kind of realism.
It didn't reflect the way in which people were talking within the literature, the frequency, the overlaps, really the empirical representations of knowledge in practice.
It was kind of like a representation of these cartoon models that particular theorists had.
They didn't include probability or contradiction or any of these things.
And so I was really excited across my PhD and beyond to turn and tune these knowledge representations into things that actually reflected the kinds of conflicts and the way in which advance actually took place at scale within the field.
And so a couple of years into my job as a professor at the University of Chicago, I really became excited about the possibility of creating a lab on the one hand that studied knowledge production.
And on the other hand, by studying knowledge production at scale with representations that kind of crossed areas and time scales, we could actually rethink and produce new opportunities to reformulate knowledge that served our particular goals and purposes.
And so that was really the twin idea.
On the one hand, we're studying knowledge in such a way with generalized representations that will allow us to rewire and reorganize, for example, scientific or the civic knowledge scape in ways that are definable and really clear, that facilitate experimentation and change.
So that was the underlying idea.
So how have you used machine learning to almost provide another brain on this?
Well, I think there are a number of different ways in which we've approached this.
I mean, one is we've tried to represent different forms of knowledge in a kind of a formal syntax, graph or geometric representation that allows us to see things like research gaps, right?
So these are, that's kind of a topological metaphor, but we don't actually typically measure it topologically or geometrically, like there's a gap or there's a hole, there's a lacuna in the research.
And so we're basically using artificial intelligence to really, on the one hand, like align, extract and align meanings, right?
Across both the syntax and the semantics of phrases inside scientific context.
But that assumes that that fixedness is kind of digital, that something is part of the concept or is not part of the concept.
And I think that's the power of natural language, is it facilitates continuous gradation.
And that's, I would say, one of the powers of new large-scale neural representations, especially of languages.
They create a kind of a quasi-continuous meaning space.
And in identifying such a meaning space, then we can all of a sudden begin to operationalize all these things that we've been talking about for centuries.
Gaps in the research, the possibility of arbitrage, bridging communities in a really formal way to identify how frequently they occur, where they occur, where they don't occur, what the mechanisms are that drive them, or that make it impossible in some cases for them to be achieved by particular human institutions and groups.
And a lot of your research has landed in a super practical way: you've been able to quantify and demonstrate that progress in science is, in many ways, slowing down, getting more constrained, and becoming less breakthrough-oriented.
Can you talk a little bit about how you were able to discover that and what you really sort of found out at a practical level there?
Sure, and I think a lot of this research is ongoing, but yeah, so I think once you have, let's say, a measure of disruption, which in this particular case is just the likelihood that your work eclipses the work on which it builds.
It's a purely structural measure, but its structural nature actually makes it quite general across contexts, I mean, from software development, game design to patents, to papers, anything that has an implicit or explicit reference structure.
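The measure described here is in the spirit of a CD-style disruption index over a citation graph. A minimal sketch in Python, with an illustrative toy citation mapping (not the paper's actual code or data):

```python
# Minimal sketch of a CD-style disruption index (illustrative only).
# For a focal work, look at later works that cite the focal work and/or its references:
#   n_f = cite the focal work but none of its references (the focal work "eclipses" its sources)
#   n_b = cite both the focal work and at least one of its references
#   n_r = cite the references but not the focal work
# disruption = (n_f - n_b) / (n_f + n_b + n_r), ranging from -1 (consolidating) to +1 (disruptive).

def disruption_index(focal, references, citations):
    """citations: dict mapping each later work -> set of works it cites."""
    refs = set(references)
    n_f = n_b = n_r = 0
    for later_work, cited in citations.items():
        cites_focal = focal in cited
        cites_refs = bool(refs & cited)
        if cites_focal and not cites_refs:
            n_f += 1
        elif cites_focal and cites_refs:
            n_b += 1
        elif cites_refs:
            n_r += 1
    total = n_f + n_b + n_r
    return (n_f - n_b) / total if total else 0.0

# Toy example: two later papers cite only the focal work, one cites both, one cites only a reference.
citations = {
    "later1": {"focal"},
    "later2": {"focal"},
    "later3": {"focal", "ref1"},
    "later4": {"ref2"},
}
print(disruption_index("focal", ["ref1", "ref2"], citations))  # (2 - 1) / 4 = 0.25
```

Because the index only needs a reference structure, the same sketch applies to patents, software dependencies, or game design documents, as he notes.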
So then, once you have and validate a tool or a model or a measurement, you can take it on the road and ask questions about the ways in which different kinds of institutions end up associating with or driving this outcome in question.
So for example, the paper that you're referencing, the Nature paper, looked at how team size relates to disruption.
So there were a lot of researchers in the past who have identified the rise of larger teams in all fields of science and scholarship.
So Brian Uzzi and colleagues had a really nice paper, you know, a decade and a half ago, that just identified and traced these patterns.
But the assumption inside that work and really largely, the commentary that followed from that work was that, well, of course, this is the most efficient possible thing, you know, that these large teams are increasing in prevalence because the problems that we face require large teams.
So it was a kind of a, this is the best of all possible worlds, efficient market hypothesis of teams.
I don't think that paper itself explicitly claimed that, but many works subsequent to it did.
And so we were just asking the question, okay, well, what if it's not just the most efficient of all possible things?
What if there are interests inside the space that could drive the structure of teams and institutions away from their ideal purpose, you know, which in this particular case would be discovery, scientific advance, and its practical application?
And so that's what we were trying to look at, okay?
How does disruption, disruptive advance, end up scaling with the size of teams in recent years?
And what we find is kind of in some ways exactly the opposite of what many had assumed before, which is that, you know, as teams get larger, then there's an exponential decline in the likelihood of disruption of that work.
And it occurs across all timescales.
About 70% occurs within person.
So as people move across projects with these different structures, the effect follows them. Subsequently, we've done other related work.
So for example, we had a PNAS paper in the spring of 2022, which looked at the structure of teams.
So more hierarchical teams, even more than larger teams, ended up showing exactly the same pattern, which is to say, hierarchy tended to be associated with much less disruption, much less novelty, et cetera.
How do we think about what feels like a potential tension: diversity of ideas on a team can drive new innovative ideas, but the size of the team, as you're saying, decreases the potential for breakthrough.
Am I understanding that correctly?
And how do we think about those two sort of countering forces?
Yeah, so I think there's a couple ways in which they relate.
So the structure of those teams is basically a larger effect than their size.
In hierarchical teams, it's much less likely that they're going to activate the various brains on the team.
So at the extreme, you've got one person who's designing the research and everybody else is a muscle inside this organism, just kind of pumping out a Southwestern blot or a blot test, or whatever it is that needs to be produced.
So those teams are not very disruptive.
And they're not very novel.
And they maximize the productivity of the people at the top and they minimize the productivity of the people at the bottom of those pyramids.
So there are very strong reasons to believe that there are individual incentives driving both the size of teams and the structure of teams, independent of their utility for science.
They're very useful for principal investigators.
They allow them to get more of their old work done, push out more of the same faster.
They're not as advantageous for discovery, for discovering the frontiers.
So basically, large, highly hierarchical teams don't activate the diversity that underlies their numbers.
And one of the things that really nails this is a piece published in Nature Communications in the spring of last year, 2023, which shows that the most radically novel discoveries, in the biological sciences, the physical sciences, and patent invention, come systematically from expeditions: from people in one area of science who travel over and solve problems in another area of the space, right?
So the diversity event is the convening of a novel pattern, method, or approach that ends up solving a problem inside an alien domain that has never seen anyone like them before.
No one from their community has ever seen a problem like this before.
So it's really about the activation of these things across spaces. And teams, even unusual teams with unusual individuals, end up being associated with lower levels of novelty, because they require a kind of social contract that there's a "there there" that's worth investigating together.
Does that make sense?
Sure does.
Have you looked at or speculated about the application of this over into sort of corporate innovation?
I mean, thinking a lot about tech and AI worlds, and there's sort of this myth of the singular founder who is the driving force of all innovation, that sort of mindset.
Do you feel like this applies to that world?
That's interesting.
I mean, I think certainly one person, I mean, a smaller team as it were, can drive an initial innovation, right?
But when we think about the perpetuation of that team and that singular founder over time, they can also fix a corporation on a trajectory.
And so I think if we're talking about corporations as basically engines of continuous innovation, then I think there is a challenge associated with a singular intelligence that's sitting at the head, which is great at protecting work, protecting early original work and conserving early original work.
That's precisely what they do.
And it's like the lottery ticket hypothesis.
If that initial bet was a great bet, then conserving it is not a bad thing in the context of the economy, until the economy changes.
Under those circumstances, we want systems and corporations that are more able to evolve, that are flatter, and where the innovation really exists inside the distribution of companies.
I think the thing that concerns me in this space, and Ufuk Akcigit has written a fair bit about this, but I think we can see strong manifestations of it, is just the decrease in the number of corporations.
It's like 50% of the corporations that there were in the 1980s, which dramatically decreases the pool of interacting companies inside the space.
And so everything is effectively a merger and acquisition story, which is to say the large established early founder companies that are conserving their great ideas are buying up and killing a lot of the competing innovation in the space.
I feel like there's an entire field of ways to apply and analyze this.
I'm thinking about some of the larger tech companies and when they were most innovative and not and how they've organized their teams.
And as you say, there's sort of this tension: if you can keep smaller teams with the right sort of structure and diversity, you have the most breakthrough innovation, and then it can kind of dwindle out.
It's fascinating.
Yeah.
I have so many questions just popping up all over the landscape.
Can we go back to this idea of an expedition?
And I've got two questions here.
One is, how do you know that the expedition is happening?
What are the markers that that's happening?
Is it straight cross-disciplinary movement, in some ways understandable at a human level?
It's someone moving from earth sciences to biology to social sciences or something like that.
Or is it more something that's being discovered at a more abstract level by the AI itself, that there's another pattern in that expedition?
And then following on from that, what are the things that mark this expedition out as being more disruptive, more successful, more likely to create breakthrough?
Because there must be a chunk of survivor bias here that you're only seeing the ones that they're...
Absolutely.
Absolutely.
I'm not saying this is a perpetual motion machine.
So these are low probability events, and we know that they've happened after they've happened because they yielded an insight that was disruptive.
Yes, it's absolutely about survivor bias.
And yet, when you look across the distribution of truly radical innovations, the rate at which they come from these kind of expeditions, kind of like long range, low probability bets, is extremely high.
It's hundreds of times higher than it would be in expectation.
So what you're suggesting is, hey, there's some...
These are more likely to fail.
And I think the answer is absolutely yes.
Absolutely.
We're not capturing the budget that's allocated to these.
Like how many times do people engage in a low probability expedition that just fails, either because it really doesn't work or because they're not even let in the portcullis of the castle where they're attempting to scale.
It might be that it would work, but it doesn't get through.
But we know that when they scale it, those innovations end up being larger innovations.
They attract more attention.
They rewire the entire semantic space much more radically than the ones that are produced by combinations of researchers inside and outside, or by teams like Jason and the Argonauts that comprise all kinds of special expertise.
So I think the one thing that's confirmed with these kinds of patterns is that, effectively, different sciences end up being the reservoir of difference from which any particular science increasingly solves its problems.
A science that is facing a problem has lots of places it could draw on to solve that problem.
It could develop its own novel tools, it could do this or that.
Increasingly, the way in which those problems get solved are by drawing on this reservoir of difference that comes from conserved field level differences across the scientific space.
And we're calculating the character of that expedition just as a function of the surprise, the surprisingness of a person's background to the readers of the outlet that they produce their discovery in.
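One way to read "the surprisingness of a person's background to the readers of the outlet" is as an information-theoretic surprisal over field backgrounds at a venue. A rough sketch under that assumption; the field labels and shares below are made up:

```python
import math

# Hypothetical illustration: surprisal of an author's field, given how often that
# field appears among the authors/readers of the publishing venue.
# surprisal = -log2 P(field | venue); rarer backgrounds at a venue are more "expedition-like".

venue_field_shares = {          # assumed distribution for a hypothetical ecology journal
    "ecology": 0.70,
    "statistics": 0.20,
    "physics": 0.02,
    "sociology": 0.01,
}

def background_surprisal(author_field, field_shares, floor=1e-4):
    p = field_shares.get(author_field, floor)  # unseen fields get a small floor probability
    return -math.log2(p)

print(background_surprisal("ecology", venue_field_shares))   # ~0.5 bits: unsurprising
print(background_surprisal("physics", venue_field_shares))   # ~5.6 bits: an "expedition"
```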
It feels like a way of sort of forcing diversity in that respect.
Some of your models could provide predictive power to take one particular problem in one particular field and say, well, let's add people from these fields, and that may enhance the likelihood of solving that problem.
But it also feels like there's something else happening at a deeper level that maybe we're not able to detect yet, which is what is it about these new individuals and their cognitive processes and their reasoning style and the way they use the tools available to them that make them particularly valuable in what is essentially a project team.
And it feels like there's so much more to discover about almost not just team diversity but individual cognitive diversity, how you might use a particular tool versus how someone else might use a particular tool.
That's a great question.
I think that how it is that people use a tool, versus just the cognitive availability of the tool to them, it's not obvious to me which matters more, but I know that the latter accounts for a lot of the variation.
I think once we nail that down, then there's going to be some residuum, and we're going to ask the question about the degree to which they're just thinking about the entire problem in a different way.
If they had any tool in their arsenal, they would apply it in a way that could have the potential to disrupt a particular space.
But I will say two things about this budget, about this kind of improbability of connection, which is related to one of the papers that I think motivated my invitation to this conversation, which is when we build an AI-driven, I would call it, digital double of the discovery system.
So we basically are predicting how it is that individual scientists are discovering that, let's say, materials have particular properties.
And we're really building a tight system that accounts for where people are, what they see, how they discover it.
And we know that modeling people into that system improves the system, because we can predict one in two of the people that are going to make any particular discovery in this space in the next year.
So this is a very tight model of discovery.
And when we turn that in the opposite direction, when we try to avoid those people, rather than making predictions at the ridges of this human cognitive availability landscape, we're predicting in the valleys, where there are no people available to make those discoveries, and those discoveries tend to do better on average.
So this is accounting for the budget.
This is on average, those discoveries have better, have higher scores in first principles and data-driven simulations of the properties in question.
And that's a conservative estimate.
Why is it a conservative estimate?
Because the scientists in the field have access to those first principles and data-driven simulations of the properties.
They could use that to sample the space.
Even with that, we see that they tend to follow the crowd and that on average, these things that avoid the distribution of human capacity are better.
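One loose way to read this "avoid the crowd" result is as a two-model selection rule: score candidates with a property model, and with a model of where human attention sits, then prefer candidates the property model likes but that fall in the valleys of human availability. A toy sketch, with random numbers standing in for both learned models:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for the two learned components of a "digital double":
#  - predicted_merit: how promising a candidate (e.g., a material-property pair) looks
#    under a first-principles or data-driven simulation.
#  - human_availability: how likely existing scientists are to try it, given who works
#    near it in the literature (the "cognitive availability landscape").
n_candidates = 1000
predicted_merit = rng.random(n_candidates)
human_availability = rng.random(n_candidates)

# Crowd-following strategy: rank purely by where people are likely to look.
crowd_picks = np.argsort(-human_availability)[:10]

# "Alien" strategy: require decent predicted merit, then prefer the valleys
# where no one is positioned to make the discovery.
promising = predicted_merit > 0.5
alien_scores = np.where(promising, predicted_merit * (1 - human_availability), -np.inf)
alien_picks = np.argsort(-alien_scores)[:10]

print("crowd picks, mean predicted merit:", predicted_merit[crowd_picks].mean())
print("alien picks, mean predicted merit:", predicted_merit[alien_picks].mean())
```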
And the reason I believe this is the case, which suggests that even though they're low probability, they're higher probability than the things that people are studying and publishing on average in the fields, is because of the diminishing marginal returns to theory.
You set up a theory strategically, and then you develop hypotheses and test it in the best case that you can.
That's your individual motive as a scientist: you shrink-wrap a theory to a context, and you test it in that context, and when you shift its context, then the likelihood that that theory is going to hold goes down.
And if like a thousand people have studied it in this first context, the most profitable context that you expected that it could be studied in, it decreases the relative value of that theory for new contexts.
And so I think this is just, you know, all this means is that people are motivated to squeeze it, partly because we're pack animals, because it's an existential risk if people don't read and cite your stuff inside the scientific domain. We're much more likely to squeeze out the last drop from an existing theory than we are to start tapping new trees in the forest.
Because it's an existential risk to die in the forest.
If you discover that an improbable theory was false, you die in the forest alone, you know.
So from a progress of science perspective or a progress of knowledge perspective, is there a shift underway, or could there be a shift underway, or has it already happened, that theories are a more low-dimensional representation of what's going on in the world, and those are often personally driven.
They're things that a given scientist comes up with and, as you say, wants to wring the last drops out of.
With artificial intelligence creating much more ability to just stay in a high-dimensional space, are we witnessing a different kind of tension between high-dimensional complexity and sort of low-dimensional human understanding and explainability?
Absolutely.
Absolutely.
I think it's like growing up in the 1980s, there was one form of movie art.
It's between an hour and a half and two and a half hours.
And every story was told in that medium.
There weren't ongoing mini-series.
That was one story.
And I would say we have, for a long time, for two and a half centuries, a kind of a certain length of equation, which is acceptable, and a length of a research paper.
I mean, they vary across fields, but within field, they don't vary.
You know, the social sciences force you to have an extended disco dance remix of a literature section.
You know, in certain areas of physics and engineering, they force you not to have such an extended context-driven space.
And so we have these templates, these knowledge templates, which are highly constraining.
Because what if, you know, a database theory, that's kind of like a four-letter word for a theory, you know, it's a terrible theory, it's completely inefficient, what if that's the most efficient theory?
You know, like even accounting for, you know, it's a very complex theory, but it explains, you know, efficiently, you know, relative to its complexity, lots of things in the universe, which is to say that that complexity at some level becomes irreducible.
But we can't see it because we have a special human preference for an equation of a certain length, a paper of a certain length, etc.
So I think we're definitely in a space where there are what I'll call computational theories, you know, which are, I mean, typically we think of them and talk about them as surrogate models.
So, for example, in computational chemistry, and not just computational but materials chemistry, increasingly in materials science, you'll have this kind of cycle where you have an experimentation stage which is computationally driven, like a robot-driven lab, that feeds a surrogate model process, which is either a deep neural network or maybe a Gaussian process if you're thinking of this as Bayesian optimization, that then feeds what I'll call a computational phenomenologist, right?
So someone who basically is identifying the optimal next experiments to perturb that theory that feeds then the experiment that, you know, continues in the cycle.
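The cycle he describes, experiments feeding a surrogate model that proposes the next experiments, is essentially a Bayesian optimization loop. A minimal sketch with a Gaussian-process surrogate; the `run_experiment` function here is just a stand-in for a real lab measurement:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def run_experiment(x):
    # Stand-in for a lab measurement of some material property at design point x.
    return float(np.sin(3 * x) + 0.5 * x)

# Candidate design space and a couple of seed experiments.
candidates = np.linspace(0, 2, 200).reshape(-1, 1)
X = np.array([[0.2], [1.5]])
y = np.array([run_experiment(x[0]) for x in X])

for step in range(10):
    # Surrogate model: fit a GP to everything measured so far.
    gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.3), normalize_y=True)
    gp.fit(X, y)

    # "Computational phenomenologist": pick the next experiment by an
    # upper-confidence-bound acquisition (exploit predicted mean, explore uncertainty).
    mean, std = gp.predict(candidates, return_std=True)
    ucb = mean + 1.5 * std
    x_next = candidates[np.argmax(ucb)]

    # Run the experiment and feed the result back into the loop.
    X = np.vstack([X, x_next])
    y = np.append(y, run_experiment(x_next[0]))

print("best design found:", X[np.argmax(y)][0], "value:", y.max())
```

The surrogate itself is the "arbitrarily sized and shaped theory": it need not be a short equation, only predictive enough to guide the next perturbation.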
And I've been working with colleagues to try to extend this to a number of other fields where this allows us to basically build arbitrarily sized and shaped theories that can nevertheless be enormously efficient and predictive in cases where we don't have, especially valuable in cases where we don't have strong scientific intuitions, which don't just include the social sciences, but also include like, you know, anything of complexity like polymers in the context of chemistry, material science.
Like we don't have first principles models for those things.
We can build data-driven digital doubles, which are, you know, computational theories of very different size, structure and complexity than traditional human theories.
Yeah, so I think it's a really exciting time insofar as we allow and, you know, take into account these changes that are taking place.
I would say one of the challenges is that many times, you know, scientists hide, you know, what it is that the AI has done or the representational space that it's in so that they can kind of couch and represent it in same historic representations, even though those were not the ones used to discover the things that they discover.
So, yeah, but I do, I agree, I think it's an exciting moment.
It really speaks to, you know, when we talk to young graduate students, there seems to be quite a schism in the psychology. Some say: if I could do field work and never touch a computer, that would be perfect.
That's what I'd like to do.
But that's unrealistic.
Through to, I don't care how the knowledge comes, I just want the knowledge.
And there's almost this exposition of human preference at the level of purpose and theory and causation and explanation. How do you characterize the way that human preference is impacting progress in the scientific system?
And also, how do you characterize the different kinds of incentives: the individual agency of young scientists, with their preferences, set against a system that has locked-in incentives for funding and discovery and citation?
It feels like this is changing so quickly, but there's so much inertia in the sort of top-down how things get done mode.
Yeah, I mean, there is enormous inertia inside this space.
I mean, what I hear, the first part that you were describing sounds like epistemic standards, basically.
Within subfields, there are standards for what's knowledge.
In certain areas of cognitive psychology, if you don't describe a mechanism, it's not knowledge.
It has to have a mechanism.
It can't just be a pattern.
And I would say that's also emerging in the context of certain areas of the social sciences.
There has to be a fixed causal identification, for example, of an effect for us to talk about it as knowledge.
If it doesn't have a causal identification, it's not knowledge at all.
And there are strong disciplinary traditions that conserve these things and actually make it very difficult for other really powerful kinds of knowledge to count.
You know, like Jane Goodall's discovery that apes are using tools was not, you know, there's no causal effect.
It was just a description that, you know, radically rewired our understanding about, you know, like what it is the different kinds of creatures could do, you know, on this earth and what our relationship is to them.
And so I think disciplines, you know, strongly conserve these patterns.
I think, again, there are benefits and costs to that conservation.
The, you know, the costs are exactly as you described.
New computational approaches and opportunities occur, arise, and it's very difficult for them to be taken advantage of.
It takes decades for them to diffuse.
On the other hand, because they're sufficiently conserved, they actually maintain their difference and their independence, which allows them to be drawn upon by other fields in times when other fields are facing a crisis because they have a problem they can't solve.
And this becomes a repository of alien logics that have been conserved by this community.
So I think it's a feature and a bug.
You know, the feature is if everything was radically interdisciplinary, one of the things that we've studied in the past is when basically two fields engage in conversation with one another, the diameter of their interest collapses.
It drops to half, right?
So as two separate fields, they had twice as many ideas as they have when they become one field.
So even if there are some meaningful combinations that you can yield by putting those things together, at the same time you drop a whole bunch of things which were credited as knowledge in one field or the other that are now no longer credited at all; there's no longer space in the broader field.
So we see, as these fields increase, an exponential decline in the likelihood that new ideas enter the canon of those fields.
So I think this is one of the, I mean, if anything, what you want are new places on the map, not necessarily that fields will radically move, because their fixity, their resistance to movement, is the very thing that makes them available as resources of difference to other fields.
So I think you want new journals, new outlets, new places, and this is what you see historically in the advancement of science.
So for example, organic chemistry hated early work in biochemistry, because it assumed all kinds of things were the same, which organic chemists knew were not the same, as they were looking at this higher level enzymatic event, which involved many, many different lower level organic reactions.
But that higher level view ended up being really profitable.
And the same happened, so there was a speciation event.
They couldn't get their work published in organic chemistry.
They started new journals.
Those new journals ended up being really successful.
The same when genetic molecular engineering emerged, later called molecular biology.
But initially it was molecular engineering.
It was hated by biochemists because they did exactly the same thing.
They assumed that there were the same conservation of biochemical events underneath a singular massive genetic event.
You know, the biochemists were right.
That was a false assumption, but it was a really profitable assumption.
There was a higher level pattern of genetic events that yielded a totally new set of insights.
And I think what we're seeing now is we're going to likely see a number of these kinds of speciations occur.
It's not that the old fields will embrace them immediately.
It's that these are new questions, and we're talking about new kinds of things that are credited as knowledge, new epistemic standards.
And that's going to require new journals, new conversations.
And if they're useful and profitable, if they yield patterns that are descriptive and predictive and engineerable, that can really yield new artifacts, then they are going to rise.
They're going to rise.
They're going to find their way, as they are, in multidisciplinary journals and the broader popular imagination.
Yeah, the Journal of Collective Intelligence is attracting such a diverse field.
For example, we've got synthetic biologists talking with scholars of political economy.
We talked with Stephen Fleming recently, did a podcast with him.
He was one of the authors on the paper that I think was published in August last year on how to define, or how to detect, consciousness in artificial intelligence: a multi-disciplinary paper with many different authors, from neuroscience and cognitive science through to artificial intelligence.
And what struck me when we talked with him about that is how long that paper was in gestation.
A couple of years of people talking about, huh, maybe this would be something that we should get a common language over.
That seemed to be the first step.
Let's get a common language for what we mean by consciousness.
And the paper itself outlines in significant detail the way that they came to think about the constraints and the boundaries, describes exactly in a really tangible way what you said around it being kind of half the distance.
People come, and they take time to get to what the boundaries are around this new thing that we haven't named yet, that is somewhere in the valley between our two ridges.
And that to me is so fascinating to think about watching for that.
That there's a premise here that what it takes to go from two areas that have open questions that they're having difficulty solving or haven't got to yet, that defining the valley first starts with this process of contraction, if you like, of what it is you're interested in.
Like, this is the crossover.
And looking for that in lots of different areas, like even in a corporate space, how do you think about taking a marketing and a technology group and what's the common language that's just smaller and tighter and doesn't have all of the other pieces around it that each group is individually concerned with, and then saying, this is the knowledge, this is what we will call knowledge when we have, it might be a framework, it might be a set of definitions, it might be just a list of what it isn't.
And that's almost how the consciousness paper went. They said: we are theory-driven, and the reason was that if you want to ask an AI whether it's conscious or not, you don't know whether it's trying to deceive you.
So you can't be very empirically driven, you actually have to be really theory driven.
So a lot of that paper was laying out that rationale.
And it seems like the way that Stephen described it was the process of getting to that joint understanding that that's the investment that pays off in the long term.
That takes quite a long time.
And how do we think about maybe different ways of speeding that up?
How do we think about breaking path dependency and keeping the space open enough for those conversations to be really productive, but not so open that they never actually converge on any set of definitions that people can move forward with.
Yeah, no, I think that's...
I mean, we have, I would say, a number of kind of facts that corroborate the generality of that story.
I mean, these flatter teams take more time to produce any particular paper.
And in a new work, we basically show, this was the opposite of our expectation, but it confirms your finding here, is that the bigger the hole between people's experience in these teams, the greater likelihood that the work that gets produced, if it gets produced, will be disruptive.
So it's like if everyone is like tessellated across the space, everyone can speak to one another in conventional ways, even if they have a high diameter, you know, if they cross a wide space, they're very unlikely to do disruptive work.
But if there's a big gap somewhere in the middle, it invokes a search process.
Again, it takes more time, it's more likely to fail.
If it succeeds, however, it's much more likely to yield something that changes the way in which the rails, you know, in which ideas move through space.
So I think these...
So are there ways to accelerate these things?
I think absolutely.
You know, we have two recent works, for example, that show, well, how does this normally happen?
Well, when we surveyed tens of thousands of people, we took them from a variety of fields.
We asked them for...
We randomly grabbed a couple of citations from their papers.
How did you find these citations?
How influential were they, the intellectual ideas of the field?
Systematically, the thing that drove people to become familiar with ideas that were really unfamiliar to them, and that were the most likely to influence their project, was that they were at the same institution but not the same department, which is to say, by accident.
Like you've got kids in the same soccer team, you go to the Whole Foods together, like you were put together by physical space.
It's not the conference that you went to expecting to find this thing.
It's the circumstances in which you don't expect to find these things systematically.
In another case, we looked at citation and attentional biases from every field to every other field.
And what we find, systematically, is that when a higher status field, which is to say a field that doesn't look at a lower status field as much as that lower field looks at them, does cite the kind of underappreciated lower field, the work that gets produced in the higher status field has a much higher average and variance of citation.
So the idea actually was a useful idea, and it's much more likely for the person who cited that lower status idea to be in the same institution as the person whom they're citing.
So again, this is an accident.
They didn't go to that institution to find that lower status field person.
Like they bumped into that person.
So we're explicitly working on, hey, how can we engineer attention in this space to see where are blind spots?
Where are we building institutions and regularities that are systematically thwarting our ability to move the best ideas?
And can we identify those much more quickly?
I think moreover, one of the things that really good AI models of this space can do is not predict low probability events.
They're not going to predict black swan events, but they can predict, you know, they can identify the moment that something becomes recognized as important, right?
The moment something becomes published, even as a preprint, it can show how surprising this should be to science and exactly who and where it will be surprising.
So you can make science as a whole and science policy as a whole operate at the speed with which Alexander Fleming, when he sees, you know, mold basically repelling staph inside a Petri dish by accident, immediately realizes the opportunity, you know, and starts working towards penicillin, you know, completely unprecedented, inconceivable.
Previous to this, he sees this accident, this abductive moment, and, you know, and he kind of moves in.
And so increasingly, these models allow us to have a kind of a collective cognition that identifies these things at the same speed that individual human brains, scientific brains identified them in the past, rather than requiring a 15-year search gestation, which is the way in which we typically spread good ideas now.
That makes me really think that it's so important to have incentives for people to celebrate what the AI tells them, as opposed to bury it beneath what they think.
So that it's out there.
It has to be so transparent.
When you think about AI as an agent in this knowledge system, whether that's a more, as you're saying, predictive model that's helping you identify or predict when a discovery is going to have a great impact, or if it's a more of an exploratory experience of using, say, a large language model for, whether it's scientific discovery or creative innovation, idea discovery that's not purely in a scientific realm.
I have two different questions here.
One is, do you consider those models as a single entity in the system, or do you think of them as potentially many entities in the system because they've been trained on so much data?
So when you think about the number of people that have collaborated, if you bring ChatGPT, just because it's the one that everybody talks about, into the system, is that one or is that many?
And then also, how do you think about the tension between the model keeping you in sort of an idea echo chamber, because it's answering the question that you've asked, versus exploring the potential of going out and coming up with novel ideas, being that person who came from another field of study to help you out and bring a new idea to the project?
And I realize that I'm probably asking you to speculate a heck of a lot.
I know, those are... well, I would say we have been working on a number of projects that make my speculations less speculative.
We have a number of evidentiary bases for some of these things.
So one, I would say, if one (a) thinks of these large-scale models as a singular agent, and (b) treats and uses them as a singular agent, then you have all of the same problems that we described with respect to a singular leader on the team.
Basically, you kind of conserve the logic and you discover the same things over and over again.
And it's kind of like there's a trade off between short and long term discovery here.
So let's say Lavoisier, the famous chemist, beheaded in the French Revolution for tax collecting and using those proceeds to fund his science.
But one of the things he does is creates a new language for chemicals, the kind of the elementary language that we use today.
And that language, in some ways, you can think of as being like a new AI model, like specifies a bunch of new crystals or like characterizes a bunch of proteins.
Okay, now we've got this like unambiguous set of things and we can think about their combination.
You know, so in the short term, that is, you know, hugely fruitful, you know, because there's all kinds of questions that can be asked in the short term.
In the longer term, you know, it's created a framework within which combinations are occurring.
But the space of elements that are being cultivated outside that space that could be combined with that thing in the future are going down.
Right?
And I think this is one of the concerns, for example, of people in the AI space.
Like if very particular architectures dominate the entire space, so that nothing else is being explored, there's nothing else to breed with those architectures to yield additional information.
And so, it's like, you know, short-term we're going to basically reduce the vocabulary, we're going to get rid of all ambiguity, and we're going to just mine the heck out of that combination space, that combinatorial space of existing elements.
And I think, again, it's a short-term, long-term thing.
That's going to yield a big bolus of results.
I think in many ways we should do them, but we shouldn't kill all the other approaches to AI, or even situating, if we're thinking about GPT, situating that GPT agent as multiple different kinds of agents, to really think about developing this further.
One analogy that I'll give is related to human languages.
So I have a paper that's about to come out in Nature Human Behaviour, but it's on the preprint archive; it's with Pete Aceves, Pedro Aceves.
And the paper basically shows, you know, in human languages, if languages are ambiguous, which is to say they have a high level of polysemy, so we look at a thousand human languages, then they're much more likely to do a couple of things.
One is conversation occurs more quickly when you have more ambiguous things.
And conversations in those languages or expositions take a lot longer because you're cycling around the same ideas over and over again, talking about them in new ways, getting more and more refined senses of opportunity.
So Wikipedia articles of equal length talk about fewer things, because they talk about those things in more ways, which means they kind of conserve that ambiguity.
If you basically just fix all the elements, one concept to one word, one item to one thing inside the space, you can explore all those combinations of the reduced set, but it runs out of steam, those diminishing marginal returns to that space.
And that's especially true if that space has killed all the other related spaces.
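As a rough, English-only proxy for the polysemy he's describing, one can count dictionary senses per word with WordNet; the paper's actual cross-linguistic measure is more involved, so treat this purely as illustration:

```python
# Rough English-only proxy for polysemy: average number of WordNet senses per word.
# Requires: pip install nltk, then nltk.download("wordnet") once.
from nltk.corpus import wordnet as wn

def avg_polysemy(text):
    words = [w.strip(".,;:!?").lower() for w in text.split()]
    senses = [len(wn.synsets(w)) for w in words if wn.synsets(w)]
    return sum(senses) / len(senses) if senses else 0.0

print(avg_polysemy("The bank raised the interest rate"))   # everyday words carry many senses each
print(avg_polysemy("The electron emitted a photon"))       # technical words carry few senses each
```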
So I strongly encourage people to think about these AIs, as multiple agents, and facilitate exactly the same process of abduction, which is the model of human scientific, technological and business advance over the last couple of centuries, which is to say, you basically build a model, and when the model is surprised, right, then you discover typically an alien pattern, right, that's outside the system that has the potential to provide insight.
And so what does that mean for AIs?
Well, that means, you know, even in the GPT context, you know, basically designing very different objective subjectivities of these models.
And there's been some work already, and some work in my lab, some work of other folks, that basically shows that if you basically create a diversity of subjective positions of these AIs generating hypotheses, then the resulting hypotheses are much more robust and more likely to be important and disruptive to the sciences that surround those relevant areas.
And I think this is going to be true even kind of outside just the GPT context.
Like we need to build a diverse ecology of AIs so that when one part of the space starts to poop out, when it has diminishing marginal returns, then we have another set of AIs that have been conditioned to look at another level of analysis that can be drawn upon, basically.
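A minimal sketch of what conditioning a model on a diversity of subjective positions might look like in practice. Everything here is illustrative: `ask_llm` is a hypothetical stand-in for whatever completion API is in use, and the personas are made up; hypotheses proposed by several unlike personas, or by none of a field's "home" personas, could then be flagged as robust or as candidate expeditions.

```python
# Illustrative sketch: "ask_llm" is a placeholder for whatever chat/completion API
# is available; the point is the ensemble of viewpoints, not any particular model.

PERSONAS = [
    "a polymer chemist focused on synthesis constraints",
    "a condensed-matter physicist thinking in terms of phase transitions",
    "an ecologist who reasons about networks and robustness",
    "an economist attentive to incentives and diminishing returns",
]

def ask_llm(prompt: str) -> str:
    # Stub so the sketch runs offline; swap in a real model call here.
    return "\n".join(f"[hypothesis {i + 1} generated for this persona]" for i in range(3))

def diverse_hypotheses(problem: str) -> dict[str, list[str]]:
    """Ask the same question from several 'subject positions' and pool the answers."""
    results = {}
    for persona in PERSONAS:
        prompt = (
            f"You are {persona}. Propose three testable hypotheses for the following "
            f"problem, drawing on your field's concepts:\n{problem}"
        )
        results[persona] = ask_llm(prompt).splitlines()
    return results

if __name__ == "__main__":
    pooled = diverse_hypotheses("Why do larger teams produce less disruptive work?")
    for persona, hypotheses in pooled.items():
        print(persona, "->", hypotheses[0])
```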
So this is how human scientific advance is characterized over the last two centuries.
Every time we look at it, this is how it looks.
And if we basically create a singular AI architecture that's just going to mine the heck out of one frame of a problem, that's like an accelerated discipline, a discipline on steroids that's going to yield great results early on because it was a great bet, it's a great hypothesis, but it's going to tail off and it's going to have diminishing margin returns.
And so I think really thinking, and I think this is also true with when we think about the existential risk issues and threats about AI, how have we dealt with unpredictable, powerful actors in the past?
Well, we make them sign the Magna Carta.
We build an independent parliament.
We build an independent court system.
We need to build other AIs that are actively regulating, auditing, turning off, disciplining the AIs that we have in question.
And if we don't do that, if we just assume, oh, we're going to set up regulations, we're going to have the SEC watch trading AIs.
I mean, these AIs are too complicated.
We need an ecology of AIs like a GAN, where the adversarial elements, you know, the generator, you know, which is kind of like the AI itself and the discriminator, like the policymaker, are growing, you know, together.
And if they're not balanced, then the system is going to break.
And so I think it's critical to think about diversity in the construction of these AIs from the very beginning, both for governance and for discovery.
Final question, if you like, on how you think about knowledge and meaning and what it means to know something now.
Like how do you think about a new way of thinking about that?
Well, I think, so I'm finishing up a book called Knowing with a colleague, Jacob Foster, in Los Angeles.
And, you know, we have a complex systems view of knowledge where a singular knower has very rarely in history been the one that did all the knowing functions, that sensed the world, you know, that described it, that developed a theory, you know, of the world that made predictions, that engaged in action.
I mean, you can imagine cases in which that's the case, but for scientific knowledge, it's always been, I'm borrowing, you know, an understanding, you know, which is to say a theory from you, like you're selling me an understanding.
I'm taking it into my, you know, as a cassette, effectively, in my kind of knowledge apparatus.
And increasingly, there are going to be more and more non-human agents in these complex knowledge exchanges where we're borrowing data, you know, from this agent, we're borrowing a description from that agent, we're borrowing, you know, an understanding or a theory from this agent, prediction from this agent.
And so increasingly, the scope of the complex knower that knows any particular thing is going to be higher than the level of the person.
It's going to be at the level of, you know, this broader complex of machines and people that are relying on each other.
You know, they're relying on each other's cassettes, you know, that we're exchanging these different knowledge units effectively to kind of create a complex knowing architecture.
And so I think that's going to require different kinds of trust, and those trusts are going to have to be associated with assurances that we make to each other, that we can audit, for fields to feel like they can cede certain kinds of insight, so that they can jump on top of the shoulders of giants and get new insights.
So I think that, I think it's, you know, basically it's definitely pushing knowledge outside, further outside of the heads of individual scientists.
I think that we have this illusion that it was inside the heads of scientists, and that makes this feel like a big jump.
And I think our book in some ways is an effort to suggest: we've already taken that leap.
We just need to understand it.
We need to recognize it.
And we need to kind of, you know, tread carefully as we move forward to knowing bigger things, you know, bigger collective cognitions.
I can't wait to read it.
Yes.
I'm excited to have you back on the podcast to talk about it once it's published.
Thank you so much for joining us.
This has been really interesting, and I feel like we could have about three or four more hours of conversation here.
Yeah, what a way to set up 2024 for work.
Exactly.
This is great.
But thank you so much.
It's been great, and we definitely would love to have you back to talk about your book when it comes out.
Okay, great.
Yeah, we've drafted it.
We're just tweaking.
So I'll let you know when that happens.
Excellent.
Okay, thank you both.
Thank you.