Right, so you’ve got your LLM, and it’s brilliant at spitting out text. But you want it to do things. You want it to look up the weather, query a database, run some code, or maybe book you a flight to a tropical island (we can dream, right?). This is where LangChain agents come in. Think of an agent as a slightly overwhelmed but brilliant intern inside your computer. The LLM is the intern’s brain, capable of complex reasoning, and the tools you give it are, well, the tools it’s allowed to use. The agent’s job is to figure out which tool to use, when, and with what input, based on your instructions.

The magic—and the occasional frustration—lies in how the agent makes those decisions. We’re going to look at the two heavyweight champs in this arena: the ReAct framework and OpenAI’s Function Calling. They approach the same problem from slightly different angles, and understanding both will save you a world of headaches.

The ReAct Framework: Think, Act, Repeat

ReAct stands for Reason + Act. It’s a beautifully simple yet powerful pattern. The LLM doesn’t just blurt out an answer; it verbalizes its chain of thought (the Reason) and then decides on an action (the Act). This creates a transparent, step-by-step process that you can actually follow in the agent’s output.

Here’s the basic loop:

  1. Reason: The LLM looks at your prompt, the available tools, and any previous steps. It thinks, “Hmm, what’s the next logical step here?”
  2. Act: It decides to use a tool. The output is a structured command like Action: search_api followed by Action Input: "current weather in London".
  3. Observe: The tool runs, and its result is fed back to the LLM.
  4. Repeat: The LLM reasons over this new observation, deciding if it has the final answer or needs to take another action.

Let’s see it in action. First, you’ll need to install the usual suspects: langchain, langchain-openai, and langchain-community. We’ll use a simple Python REPL tool for this example.

from langchain.agents import initialize_agent, AgentType
from langchain.tools import Tool
from langchain_community.utilities import PythonREPL
from langchain_openai import ChatOpenAI

# Let's give our agent a calculator. Because who doesn't need help with math?
python_repl = PythonREPL()

# Wrap it in a LangChain Tool
calc_tool = Tool(
    name="python_repl",
    description="A Python shell. Use this to execute simple math calculations. Input should be a valid Python expression.",
    func=python_repl.run,
)

# Initialize the LLM that will power the agent's brain
llm = ChatOpenAI(model="gpt-4-turbo-preview", temperature=0)

# Create the agent. We're using the ZERO_SHOT_REACT_DESCRIPTION type, which is pure ReAct.
agent = initialize_agent(
    tools=[calc_tool],
    llm=llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True,  # This is crucial to see the agent's thought process!
)

# Let's ask it something that requires multiple steps.
result = agent.run("What is the square root of the factorial of 7? Show me the intermediate value for the factorial.")

Run this, and watch the verbose output. You’ll see the LLM think: “I need to calculate the factorial of 7 first.” Then it will act by calling the Python tool with 7*6*5*4*3*2*1. It will observe the result (5040), then think again: “Now I need the square root of that.” Then it will act again with math.sqrt(5040). The beauty is in its transparency. You can debug exactly where and why it went wrong, which is a lifesaver when building complex workflows.

OpenAI Function Calling: The Structured Shortcut

Now, here’s where the designers at OpenAI decided to be clever. Instead of making the LLM output a string that we have to painfully parse for Action: and Action Input:, they built a way for the model to request a function call in a perfectly structured JSON format. This isn’t a separate framework; it’s a capability baked directly into their API models like gpt-3.5-turbo and gpt-4.

The flow is different:

  1. You define your tools, but you describe them as JSON schemas (using Pydantic is the easiest way).
  2. You send the user’s prompt along with these function schemas to the OpenAI API.
  3. The model doesn’t output text; it outputs a JSON object saying, “Hey, I want to call function X with these parameters.”
  4. You run the function yourself in your code, get the result, and send that result back to the model for the next step.

This is vastly more reliable than parsing text. Let’s build the same calculator agent using this method.

from langchain.agents import AgentExecutor, create_openai_functions_agent
from langchain import hub
from langchain.tools import StructuredTool
from langchain_core.pydantic_v1 import BaseModel, Field
from langchain_openai import ChatOpenAI

# Define the input schema for our tool using Pydantic
class CalculateInput(BaseModel):
    expression: str = Field(description="A valid mathematical expression to evaluate, like '5 * 8' or 'math.sqrt(64)'")

# The function itself
def calculate(expression: str):
    return python_repl.run(expression)

# Wrap it as a StructuredTool, which knows about its JSON schema
calc_tool_structured = StructuredTool.from_function(
    func=calculate,
    name="calculate",
    description="Executes a Python calculation.",
    args_schema=CalculateInput
)

# Pull a default prompt for OpenAI functions
prompt = hub.pull("hwchase17/openai-functions-agent")
llm = ChatOpenAI(model="gpt-4-turbo-preview", temperature=0)

# Create the agent and executor
agent = create_openai_functions_agent(llm, [calc_tool_structured], prompt)
agent_executor = AgentExecutor(agent=agent, tools=[calc_tool_structured], verbose=True)

result = agent_executor.invoke({"input": "What is the square root of the factorial of 7?"})

The key difference? Under the hood, LangChain is handling the conversation with the OpenAI API. The model responds not with text but with a function call request like {"name": "calculate", "arguments": {"expression": "math.factorial(7)"}}. This is parsed automatically, the function is executed, and the result is fed back. It’s cleaner, more reliable, and the recommended approach for most new projects.

Pitfalls, Edge Cases, and How Not to Lose Your Mind

Agents are powerful but notoriously fiddly. Here’s the real-world advice you need:

  • The Verbose Flag is Your Best Friend: Always run with verbose=True while developing. If your agent gets stuck or does something stupid, the logs are your only clue as to why. Without it, you’re debugging in the dark.

  • Tool Descriptions are Everything: The LLM chooses tools based solely on their name and description. If your description is vague or wrong, the agent will make bad choices. Be specific. “Use this to get the current weather for a city” is good. “This is a tool” is useless.

  • They Can Get Stuck in Loops: You’ll see it eventually. The agent will call a tool, get a result, then call the exact same tool with the exact same input again. And again. You need to build in safeguards, like a maximum number of iterations (max_iterations in AgentExecutor), to prevent infinite loops and a massive API bill.

  • Cost and Latency: Every “step” in the ReAct loop is a new API call. A complex task can take 5-10 steps. That adds up in both cost and time. OpenAI Functions can be more efficient, but it’s still multiple calls. Don’t use an agent for a simple, one-step prompt. That’s like using a rocket launcher to open a tin can.

  • Hallucinating Tools: This is a classic. If you give the agent a search_web tool, it might decide it needs to use a send_email tool that doesn’t exist. It will just hallucinate the name and try to call it, causing your program to crash. Your code needs to be robust to handle unknown action requests gracefully.

So, which should you use? Start with OpenAI Function Calling. It’s more modern, more reliable, and better supported. Use the raw ReAct framework when you need maximum transparency for complex reasoning or if you’re using a model that doesn’t support function calling. Either way, you’re now giving your LLM the ability to roll up its sleeves and get to work. Just remember to keep an eye on your new intern. They’re clever, but they can be a bit literal.