AI Agents, Mathematics, and Making Sense of Chaos
Chain of thought, tree of thoughts, and now graph of thoughts—a progression that may lead to agentic AI.
A recent paper on Graph of Thoughts reasoning highlights the progression towards Agentic AI, one of our Artificiality Pro Obsessions. As AI tools become increasingly autonomous and capable of handling complex tasks with minimal supervision, their reasoning abilities, task design, and capacity to recover from failures are crucial.
To encourage Large Language Models (LLMs) to reason in more human-like ways, researchers have been exploring various methods, with a recent focus on making models more flexible in how they combine ideas. We see three big steps in prompt engineering techniques, each of which demonstrates a significant step up in prompt engineers' ability to tap into the knowledge in a large language model: chain of thought, tree of thoughts, and, most recently, graph of thoughts reasoning.
Initially, Chain of Thought reasoning was found to enhance the effectiveness of LLMs. By breaking down complex problems into simpler components, models could reason more effectively. Humans do this too: one of the best predictors of someone's ability to solve a problem is how early in the process they break the problem up. Multiple chains of thought advanced this technique by generating several independent reasoning paths, but even with this advance, reasoning remains limited because there is no way to perform any "local exploration" such as backtracking.
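To make this concrete, here is a minimal sketch of a chain-of-thought prompt in Python. The llm() function is a hypothetical stand-in for any completion API; it returns a canned reply here so the sketch runs on its own.

```python
# A minimal sketch of chain-of-thought prompting; llm() is a hypothetical
# stand-in for a real completion API and returns a canned reply here.
def llm(prompt: str) -> str:
    return ("Step 1: total distance = 120 + 60 = 180 miles.\n"
            "Step 2: total time = 2 + 1 = 3 hours.\n"
            "Answer: 180 / 3 = 60 mph.")

question = ("A train travels 120 miles in 2 hours, then 60 miles in 1 hour. "
            "What is its average speed?")

# The key move: ask the model to decompose the problem before answering.
cot_prompt = (f"Question: {question}\n"
              "Let's think step by step, breaking the problem into simpler "
              "parts, then give the final answer on its own line.")

print(llm(cot_prompt))  # one linear chain: step 1 -> step 2 -> answer
```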
Building on this, the Tree of Thoughts approach was introduced, which allowed AI models to follow multiple paths, similar to human decision-making processes, while still breaking down problems. Think of it this way: the LLM generates many thoughts (outputs), all of which sit in a structure similar to a decision tree, and the final reasoned output is the best path through this tree of thoughts. Though an improvement, this method remained fundamentally linear in its approach.
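Tree of thoughts can be pictured as a simple beam search over generated thoughts. In this illustrative sketch, propose() and score() are hypothetical stand-ins for the LLM calls that would generate and evaluate thoughts; both are stubbed so the code runs.

```python
# A minimal sketch of tree-of-thoughts as beam search; propose() and score()
# are hypothetical stand-ins for LLM calls, stubbed here for illustration.
import random

def propose(thought: str, k: int = 3) -> list[str]:
    """Ask the model for k alternative continuations (stubbed)."""
    return [f"{thought} -> option {i}" for i in range(k)]

def score(thought: str) -> float:
    """Evaluate a partial line of reasoning (stubbed with randomness)."""
    return random.random()

def tree_of_thoughts(root: str, depth: int = 3, beam: int = 2) -> str:
    frontier = [root]
    for _ in range(depth):
        # Expand each kept node into k children, then prune to the best `beam`.
        candidates = [c for node in frontier for c in propose(node)]
        frontier = sorted(candidates, key=score, reverse=True)[:beam]
    return max(frontier, key=score)  # the best leaf becomes the final output

print(tree_of_thoughts("Initial framing of the problem"))
```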
How can AI break out of its linear thinking? It has to be able to think a bit more like a human: recursively and in a graph (network) kind of way.
Human reasoning, as highlighted by thinkers like Andy Clark, is often "loopy." We pull ideas from various sources, combine them, revisit previous thoughts, and continuously integrate new insights or preferences. Our thinking operates in parallel, is dynamic, and exhibits complex-systems behavior such as phase transitions and criticality. For example, when working on a novel problem you might start down one path, backtrack, pause, pull in an idea from an earlier line of thought, merge ideas, and then weigh the strengths and weaknesses of the candidates.
This loopy human reasoning is closer to a graph, a structure prevalent in information systems, in which thoughts interconnect in a network. A social network, for example, is a graph where people are nodes and their relationships are links.
A Graph of Thoughts (GoT) process allows LLMs to operate in a graph-like structure, more closely resembling human thought processes. This method represents the reasoning of LLMs as a graph, with thoughts as nodes and their dependencies and relationships as links or edges. This structure allows the combination of various thoughts in new and recursive ways. This in turn facilitates more complex and interconnected reasoning patterns, including novel transformation and aggregation of thoughts.
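A minimal sketch of that idea, using illustrative names rather than the paper's actual API: each thought is a node, and its dependencies on earlier thoughts are its edges.

```python
# A minimal sketch of reasoning as a graph: thoughts are nodes, dependencies
# are edges. Names are illustrative, not the GoT paper's actual API.
from dataclasses import dataclass, field

@dataclass
class Thought:
    content: str
    parents: list["Thought"] = field(default_factory=list)  # incoming edges

# Unlike a chain or a tree, a node can depend on several earlier thoughts
# (aggregation), and a new node can point back to any earlier one (a loop).
plan = Thought("Outline the argument")
section1 = Thought("Draft of section 1", parents=[plan])
section2 = Thought("Draft of section 2", parents=[plan])
merged = Thought("Sections 1 and 2 combined", parents=[section1, section2])
revised = Thought("Improved combination", parents=[merged, plan])  # revisits the plan
```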
For example, some nodes could model the plans for writing a paragraph while others model the actual paragraphs of text. GoT offers a structure for transforming these "thoughts" or for looping over a thought in order to enhance it. These graph-enabled transformations could enable better document merging, for example. The researchers used GoT to generate a new NDA based on several input documents that partially overlapped in their contents, with the goal of minimizing duplication while maximizing information retention.
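Here is a hedged sketch of those two transformations, aggregation (merging several thoughts) and refinement (looping over one thought), applied to document merging. The llm() function is a placeholder for any completion API, and the function names are our assumptions for illustration.

```python
# Illustrative graph transformations for document merging; llm() is a
# placeholder for a real completion API.
def llm(prompt: str) -> str:
    return f"[model output for: {prompt[:40]}...]"  # canned placeholder reply

def aggregate(docs: list[str]) -> str:
    """Merge several overlapping documents into one new thought."""
    return llm("Merge these NDAs into one, keeping each unique clause once:\n\n"
               + "\n---\n".join(docs))

def refine(draft: str, rounds: int = 2) -> str:
    """Loop over a thought: each pass feeds the draft back for improvement."""
    for _ in range(rounds):
        draft = llm(f"Remove remaining duplication and tighten this draft:\n{draft}")
    return draft

merged_nda = refine(aggregate(["NDA v1 ...", "NDA v2 ...", "NDA v3 ..."]))
print(merged_nda)
```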
The architecture of the system is quite complicated: a set of modules that themselves interact. The prompter prepares messages as inputs to the LLM, encoding the graph structure within the prompt. The parser extracts information from the LLM's replies and constructs a thought state. The scoring module verifies and scores the replies, either by referring back to the LLM or to a human. The controller coordinates the entire process and decides how to progress it.
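A rough sketch of how these four modules might fit together is below; the class and method names are our assumptions for illustration, not the paper's implementation.

```python
# An illustrative sketch of the prompter/parser/scorer/controller loop.
class Prompter:
    def build(self, thought: str) -> str:
        # Encode the current position in the graph into the prompt text.
        return f"Current thought: {thought}\nPropose an improved next thought."

class Parser:
    def parse(self, reply: str) -> str:
        # Extract a clean thought state from the raw model reply.
        return reply.strip()

class Scorer:
    def score(self, thought: str) -> float:
        # Verify and score a thought, via the LLM or a human (stubbed here).
        return float(len(thought))

class Controller:
    """Coordinates the other modules and decides how to progress the graph."""
    def __init__(self, llm):
        self.llm = llm
        self.prompter, self.parser, self.scorer = Prompter(), Parser(), Scorer()

    def run(self, thought: str, steps: int = 3) -> str:
        for _ in range(steps):
            reply = self.llm(self.prompter.build(thought))
            candidate = self.parser.parse(reply)
            if self.scorer.score(candidate) >= self.scorer.score(thought):
                thought = candidate  # keep the better thought and continue
        return thought

final = Controller(lambda p: f"refined({p.splitlines()[0]})").run("seed idea")
print(final)
```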
In a task that is simple for a human, sorting, the researchers found that GoT outperformed other prompting techniques: GoT prompting increased the quality of the sorting results by 62% over ToT while simultaneously reducing costs by more than 30%. GoT seems to do this by improving the tradeoff between latency (the number of hops in the graph of thoughts needed to reach a given final thought) and volume (the number of preceding LLM thoughts that impact the final thought). In other words, it demonstrates the efficiency of graph-based structures for what is essentially a search process.
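The sorting approach can be pictured as a small graph of split, sort, and merge thoughts. In this illustrative sketch, plain Python stands in for the LLM calls that would perform each step in the paper; llm_sort and llm_merge are hypothetical names.

```python
# A sketch of sorting as a graph of thoughts: split the input, sort sublists
# in parallel thoughts, then merge. The merge node depends on several parents,
# which is what makes this a graph rather than a chain or a tree.
def llm_sort(chunk: list[int]) -> list[int]:
    return sorted(chunk)  # in the paper, an LLM thought performs this step

def llm_merge(left: list[int], right: list[int]) -> list[int]:
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    return merged + left[i:] + right[j:]

numbers = [7, 2, 9, 4, 1, 8, 3, 6]
mid = len(numbers) // 2
# Two independent "sort" thoughts feed one "merge" thought (aggregation).
result = llm_merge(llm_sort(numbers[:mid]), llm_sort(numbers[mid:]))
assert result == sorted(numbers)
```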
The baseline performance improvement is quite impressive, but even more importantly, GoT seems to get better as problems get bigger: quality increases as the problem size (and complexity) grows. What's going on here? Unlike input/output, CoT, or ToT prompting schemes, Graph of Thoughts demonstrates that a graph structure can bring more thoughts to bear on a problem and allows those thoughts to change as the problem becomes more elaborate.
The approach is perhaps best conceptualized as a generic framework for enhancing an LLM architecture without having to update the model itself. The researchers have creatively adapted graph abstraction, a general computing approach exemplified by developments like AlphaFold, and applied it to prompting. This application marks a significant advance in the field.
What interests us more is how it may accelerate the use of LLMs in more complex reasoning tasks where humans have to make many interrelated decisions, such as planning a complex itinerary. Here the model could consider various travel options, user preferences, and constraints simultaneously, navigating these choices in a non-linear, interconnected, and recursive manner.
The graph of different thoughts doesn't emerge on its own: there isn't a magic prompt for having ChatGPT unbundle an elaborate problem for you. Unlike chain of thought reasoning, it's not a simple prompting scheme: it's an orchestration framework built around the model. To put it into action, designers will have to take on the challenge of creating an interface and affordances that allow a user to guide the generation of thoughts and their relationships in partnership with the AI.