28.2 ReAct: Reasoning + Acting in Interleaved Steps

Right, let’s talk about ReAct. You’ve probably hit the wall with standard LLM prompting. You ask a question, it gives you an answer that sounds plausible but is, in fact, a beautiful and confident hallucination. It’s like asking for directions from a poet. ReAct is our first solid attempt to fix that by giving the model a way to do things to find the answer, not just make one up.

The name says it all: Reasoning + Acting. It’s a framework that forces the LLM to interleave its internal reasoning with external actions. Instead of just spitting out a final answer, the model is prompted to create a “Thought, Act, Observation” loop. The “Thought” is its internal monologue where it plans its next move. The “Act” is where it uses a tool, like a search API or a calculator. The “Observation” is the result it gets back from that tool. This loop continues until it has all the information it needs to give you a final, grounded answer.

The real power of this is that it grounds the model’s responses in real data. Instead of making up the population of Peru, the model is prompted to execute a search action, get the real number back, and then reason about it. This reduces hallucinations (it doesn’t eliminate them, since the model can still misread an observation) and leaves a step-by-step trace you can actually audit.

The Core Loop: Thought, Act, Observe

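Let’s break down the cycle. This isn’t just a cute pattern; it’s the entire engine. The prompt constrains the model to emit each step in a strict, predictable format. The example below uses a JSON-like structure; other implementations, including the LangChain agent later in this section, use labeled plain-text lines instead. Either way, the strictness is crucial because it makes the output parseable by our code.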

# A simplified look at what the model's output looks like in each step
react_step = {
    "thought": "I need to find the current population of Peru to answer the user's question.",
    "act": "search",
    "act_input": "current population of Peru 2023"
}
# Our system would then run a search tool with that query and feed the result back as the 'observation'

The magic is in the interleaving. Each “Observation” becomes context for the next “Thought.” It’s a way for the model to build up a knowledge base during the conversation, much like you or I would if we were solving a problem by looking things up.
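
To make the interleaving concrete, here is a minimal sketch of the loop with no LLM and no network involved. Everything in it is a stand-in: scripted_model plays the role of the model, fake_search is a stubbed tool, and the "finish" action is an assumed convention for ending the loop rather than part of any particular library.

```python
import json
from typing import Callable, Dict, List

def fake_search(query: str) -> str:
    # Hypothetical stub: a real tool would call a search API here.
    return f"stub result for {query!r}"

TOOLS: Dict[str, Callable[[str], str]] = {"search": fake_search}

def scripted_model(transcript: List[str]) -> str:
    # Stand-in for an LLM call: picks a canned step based on the transcript so far.
    observations = [t for t in transcript if t.startswith("Observation: ")]
    if not observations:
        return json.dumps({
            "thought": "I need to look up the population of Peru.",
            "act": "search",
            "act_input": "current population of Peru",
        })
    latest = observations[-1][len("Observation: "):]
    return json.dumps({
        "thought": "The observation contains what I need.",
        "act": "finish",  # assumed convention for ending the loop
        "act_input": f"Answer based on: {latest}",
    })

def run_react(question: str,
              model: Callable[[List[str]], str] = scripted_model,
              max_steps: int = 5) -> str:
    transcript = [f"Question: {question}"]
    for _ in range(max_steps):
        step = json.loads(model(transcript))                  # Thought + Act
        transcript.append(f"Thought: {step['thought']}")
        if step["act"] == "finish":
            return step["act_input"]
        observation = TOOLS[step["act"]](step["act_input"])   # run the chosen tool
        transcript.append(f"Act: {step['act']}: {step['act_input']}")
        transcript.append(f"Observation: {observation}")      # feeds the next Thought
    return "Stopped: step limit reached."

print(run_react("What is the population of Peru?"))
```

The transcript list is the whole trick: every observation is appended to it, and the model sees the full list on the next call. Swapping scripted_model for a real LLM call changes nothing about the loop itself.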

Building a Simple ReAct Agent

Let’s move from theory to practice. Here’s a bare-bones implementation in Python. We’ll use LangChain here because it saves us from writing a ton of boilerplate prompt engineering, but it’s important to see what’s happening under the hood.

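# Legacy LangChain API (initialize_agent / AgentType); newer releases deprecate or relocate these entry points.
# Requires OPENAI_API_KEY and SERPAPI_API_KEY in the environment.
from langchain.agents import AgentType, initialize_agent, load_tools
from langchain.chat_models import ChatOpenAI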

# First, we need the brain and the tools.
llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0) # temperature=0 for more deterministic, easier-to-parse output
tools = load_tools(["serpapi", "llm-math"], llm=llm) # A search tool and a calculator

# Now, we stitch them together into an agent.
# The AgentType.ZERO_SHOT_REACT_DESCRIPTION tells LangChain to use the ReAct framework.
agent = initialize_agent(
    tools,
    llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True  # So we can see the Thought/Act/Observe loop in action!
)

# Let's ask it something that requires multiple steps.
result = agent.run("Who is the current CEO of Apple? What is their age raised to the power of 0.43?")

When you run this with verbose=True, you’ll see the loop unfold in your terminal. Simplified, it looks like this (LangChain’s actual output labels the steps “Action” and “Action Input”):

Thought: “I need to find the current CEO of Apple.”
Act: Search: "current CEO of Apple"
Observation: “Tim Cook”
Thought: “Now I need to find Tim Cook’s age.”
Act: Search: "Tim Cook age"
Observation: “62 years old” (or whatever it is at the time)
Thought: “Now I need to calculate 62 raised to the power of 0.43. I should use the calculator for this.”
Act: Calculator: 62^0.43
Observation: “5.898…”
Final Answer: “The current CEO of Apple is Tim Cook, and his age raised to the power of 0.43 is approximately 5.90.”

It’s a thing of beauty. The model didn’t need to know Tim Cook’s age; it just needed to know how to find it and what to do with it.
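
The calculator step is the one part of the trace you can reproduce without an agent. A minimal sketch, assuming the age of 62 from the trace (the variable names are illustrative):

```python
# Reproduce the calculator step by hand.
age = 62          # assumed from the trace; the real value depends on when you run the agent
exponent = 0.43
result = age ** exponent
print(f"{age}^{exponent} = {result:.3f}")
```

Checking a tool result against an independent calculation like this is a cheap way to catch a trace that looks plausible but isn’t.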

Where This Whole Thing Falls Apart

Don’t get me wrong, ReAct isn’t magic fairy dust. It has very real, very frustrating failure modes.

The Loop of Doom: One of the most common pitfalls is the repetitive loop. The model gets stuck. It performs a search, doesn’t find a clear answer in the snippet, so it… performs another almost identical search. And another. Left unchecked, this never ends, so you need to build in hard step limits to avoid burning through your API credits.
Action Parsing Failures: The model is supposed to output a strict format, but sometimes it gets creative. It might just output a thought and then an answer, forgetting to act. Or it might format the action wrong, so your code can’t parse it. Your code needs to be a paranoid fortress against malformed responses.
Bad Tool Selection: You give a model a calculator and a search engine. The user asks “what’s the meaning of life?” and the model’s thought process is: “This is a numerical question. I will use the calculator: meaning of life * 1.” This happens more than you’d think. The model isn’t reasoning about the tools; it’s just pattern-matching which one to use.
The “I’m Done Too Soon” Bug: Sometimes the model just… gives up. It does one search, gets a result that’s close, and then confidently delivers a final answer that is completely wrong. It’s lazy, and you have to prompt it rigorously to ensure it verifies its work.
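
For the parsing failures in particular, a defensive parser goes a long way. The sketch below assumes the simplified “Act: Tool: input” layout from the trace earlier in this section, not LangChain’s real output format; ParsedStep and parse_step are hypothetical names. Anything that doesn’t match is reported as malformed instead of raising, so the caller can decide whether to retry or bail.

```python
import re
from dataclasses import dataclass
from typing import Optional, Set

@dataclass
class ParsedStep:
    kind: str                          # "action", "final", or "malformed"
    tool: Optional[str] = None
    tool_input: Optional[str] = None
    answer: Optional[str] = None

# Assumed line formats: 'Act: <Tool>: <input>' and 'Final Answer: <text>'.
ACTION_RE = re.compile(r"^Act:\s*(\w+)\s*:\s*(.+)$", re.MULTILINE)
FINAL_RE = re.compile(r"^Final Answer:\s*(.+)$", re.MULTILINE | re.DOTALL)

def parse_step(text: str, known_tools: Set[str]) -> ParsedStep:
    # Check for an action first: if the model emits both an action and an
    # answer in one response, we prefer to run the tool before trusting it.
    action = ACTION_RE.search(text)
    if action:
        tool = action.group(1).lower()
        if tool not in known_tools:
            return ParsedStep("malformed")   # the model named a tool we don't have
        tool_input = action.group(2).strip().strip('"')
        return ParsedStep("action", tool=tool, tool_input=tool_input)
    final = FINAL_RE.search(text)
    if final:
        return ParsedStep("final", answer=final.group(1).strip())
    return ParsedStep("malformed")           # a thought with no action or answer

tools = {"search", "calculator"}
print(parse_step('Thought: find the CEO\nAct: Search: "current CEO of Apple"', tools))
print(parse_step("Thought: I'll just guess.", tools))
```

Returning a “malformed” value rather than throwing keeps the retry policy in one place: the calling loop can re-prompt once or twice and then fall back.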

The best practice here is to assume the agent will fail and build accordingly. Max steps. Timeouts. Clear, constrained prompts. Fallback routines. You’re not just building an agent; you’re building a supervisor for a brilliant but occasionally derpy intern.
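
Here is one way that supervisor might look: a minimal sketch, assuming the agent is exposed as a hypothetical next_step callable that returns a (tool, tool_input, final_answer) tuple. The function name supervise, the parameter names, and the default limits are all illustrative choices. It enforces a step cap, a wall-clock budget, and a limit on repeated identical actions, and returns a fallback message when any of them trips.

```python
import time
from typing import Callable, Dict, Optional, Tuple

# Hypothetical step shape: (tool, tool_input, final_answer).
# final_answer stays None until the agent decides it is done.
Step = Tuple[Optional[str], Optional[str], Optional[str]]

def supervise(next_step: Callable[[], Step],
              max_steps: int = 8,
              max_seconds: float = 30.0,
              max_repeats: int = 2,
              fallback: str = "I could not find a reliable answer.") -> str:
    deadline = time.monotonic() + max_seconds
    seen: Dict[Tuple[Optional[str], str], int] = {}
    for _ in range(max_steps):
        # Checked between steps only; it does not interrupt a call in flight.
        if time.monotonic() > deadline:
            return fallback
        tool, tool_input, final = next_step()
        if final is not None:
            return final
        # Count normalized (tool, input) pairs to spot the Loop of Doom.
        key = (tool, (tool_input or "").strip().lower())
        seen[key] = seen.get(key, 0) + 1
        if seen[key] > max_repeats:
            return fallback
    return fallback  # step limit reached without a final answer

# A stuck agent that keeps issuing the same search.
calls = []
def stuck_agent() -> Step:
    calls.append(1)
    return ("search", "Tim Cook age", None)

print(supervise(stuck_agent), "after", len(calls), "steps")
```

The repeat check only catches exact duplicates after normalization; near-identical queries would still need the step cap to stop them.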

ReAct is the foundational blueprint. It’s the proof that this can work. The real-world systems you’ll build on top of it will be messier, more robust, and involve a lot more error handling. But it all starts with this simple, powerful idea: think, then do, then look.