25.1 LangChain Architecture: Models, Prompts, Chains, Memory, Agents

Right, let’s pull back the curtain on LangChain. You’ve probably seen the buzzwords: “Chains,” “Agents,” “Memory.” They sound intimidatingly abstract, like something a team of over-caffeinated architects would whiteboard for weeks. In reality, they’re just sensible, pragmatic ways to organize the chaos of talking to LLMs. Think of it less as a rigid framework and more as a set of well-labeled boxes to keep your prompts from becoming a tangled mess on the floor.

The core idea is brutal simplicity: you break your application into logical components, wire them together, and let LangChain handle the tedious glue code. This is the opposite of writing a single, monstrous, 1000-line prompt and praying to the AI gods. We’re building something maintainable.

The Core Components: Your New Toolbox

At its heart, LangChain is built on a few key concepts. You’ll use these every single day.

Models: These are the actual LLMs. LangChain provides a standard interface so you can swap between different providers without rewriting your entire app. It’s the difference between being locked into one vendor and having actual choices.

from langchain_openai import ChatOpenAI
from langchain_anthropic import ChatAnthropic

# The standard way. 'gpt-4-turbo' today, maybe 'claude-3-opus' tomorrow?
llm = ChatOpenAI(model="gpt-4-turbo")

# Swapping it out is trivial. Your prompts and chains might just work.
llm = ChatAnthropic(model="claude-3-sonnet-20240229")
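To see why a shared interface makes swapping cheap, here is a plain-Python sketch of the idea. It does not use LangChain at all: ChatModel, FakeOpenAI, and FakeAnthropic are hypothetical stand-ins, and the only assumption is that every model exposes the same invoke method.

```python
from typing import Protocol


class ChatModel(Protocol):
    """Hypothetical shared interface: every provider exposes invoke()."""

    def invoke(self, prompt: str) -> str: ...


class FakeOpenAI:
    """Stand-in for one vendor; returns a canned, labeled reply."""

    def invoke(self, prompt: str) -> str:
        return f"[openai] {prompt}"


class FakeAnthropic:
    """Stand-in for another vendor with the same method signature."""

    def invoke(self, prompt: str) -> str:
        return f"[anthropic] {prompt}"


def explain(model: ChatModel, topic: str) -> str:
    # Application code depends only on the interface, never on a vendor class.
    return model.invoke(f"Explain {topic}.")


answer_a = explain(FakeOpenAI(), "caching")
answer_b = explain(FakeAnthropic(), "caching")
```

The explain function never changes when the provider does, which is the property the standard interface is meant to buy you.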

Prompts: This is where a lot of people sink most of their time. LangChain encourages you to treat prompts as templates, not string concatenation nightmares. You define a structure with named placeholders once and fill it in at call time, which keeps your code clean and your prompts reusable.

from langchain_core.prompts import ChatPromptTemplate

# This is a template, not a finished prompt. Notice the `{topic}`.
prompt_template = ChatPromptTemplate.from_messages([
    ("system", "You are a sarcastic technical expert. You love metaphors about 90s tech."),
    ("human", "Explain the concept of {topic} to me.")
])

# Now, we create the actual prompt by injecting the input.
final_prompt = prompt_template.invoke({"topic": "recursive functions"})
# The LLM will now get the full, structured message with the topic inserted.
response = llm.invoke(final_prompt)

The beauty here is that the system message stays separate from the human message. The underlying API gets them as distinct entities, which is how you get consistent, role-aware behavior from the model. This is a huge win over f-strings.
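If the role separation feels abstract, this stdlib-only sketch shows roughly what a chat template does under the hood. It is an illustration, not LangChain's implementation: render_messages is a hypothetical helper, and the dict layout is an assumption modeled on typical chat APIs.

```python
def render_messages(template, **variables):
    """Fill each (role, text) pair and keep the role attached to its text."""
    return [
        {"role": role, "content": text.format(**variables)}
        for role, text in template
    ]


template = [
    ("system", "You are a sarcastic technical expert."),
    ("human", "Explain the concept of {topic} to me."),
]

# The result is a list of two role-tagged messages, not one flat string.
messages = render_messages(template, topic="recursive functions")
```

With an f-string you would have flattened both roles into one blob of text before the API ever saw it.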

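Chains: This is where the “Chain” in LangChain comes from. A chain is just a sequence of steps, where the output of one step becomes the input to the next. The simplest chain is an LLMChain, which is literally just “take a prompt template, fill it with inputs, and send it to the LLM.” Recent LangChain releases mark LLMChain as deprecated in favor of piping components together (prompt_template | llm), but the idea is the same, and you can build wildly complex chains from it.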

from langchain.chains import LLMChain

# Create a chain by combining our model and our prompt template.
chain = LLMChain(llm=llm, prompt=prompt_template)

# Execute the whole thing in one go. LangChain handles the templating and the model call.
result = chain.invoke({"topic": "database indexing"})
print(result['text'])

Why is this powerful? Because now chain is a reusable object. You can pass it around, combine it with other chains, and it’s a single, testable unit. This is the foundation of everything else.
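The “output of one step feeds the next” idea fits in a few lines of plain Python. This is a sketch under stated assumptions: Step is a hypothetical class, the “LLM” is a fake function, and the pipe operator is borrowed here purely for illustration.

```python
from typing import Any, Callable


class Step:
    """Hypothetical wrapper that lets plain functions be piped together."""

    def __init__(self, func: Callable[[Any], Any]):
        self.func = func

    def invoke(self, value: Any) -> Any:
        return self.func(value)

    def __or__(self, other: "Step") -> "Step":
        # a | b builds a new step that runs a, then feeds its output to b.
        return Step(lambda value: other.invoke(self.invoke(value)))


fill_prompt = Step(lambda inputs: f"Explain {inputs['topic']}.")
fake_llm = Step(lambda prompt: f"ANSWER({prompt})")
shout = Step(lambda text: text.upper())

# Three steps composed into one reusable, testable object.
chain = fill_prompt | fake_llm | shout
result = chain.invoke({"topic": "database indexing"})
```

Because the composed chain is itself a Step, it can be piped into yet another step, which is what makes chains stackable.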

Memory: Because Amnesia is a Bad Feature

By default, LLMs are stateless. Each API call is a brand-new conversation. This is, to put it technically, dumb. For a chatbot or any ongoing interaction, you need memory. LangChain’s memory systems are essentially clever managers for your chat history.

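from langchain.memory import ConversationBufferWindowMemory
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

# Let's keep the last 3 turns of the conversation in memory.
memory = ConversationBufferWindowMemory(k=3, return_messages=True)

# The prompt needs a slot for the history. Its name must match the memory's
# memory_key, which defaults to "history".
chat_prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a sarcastic technical expert. You love metaphors about 90s tech."),
    MessagesPlaceholder(variable_name="history"),
    ("human", "{input}")
])

# Create a chain, but now with memory!
chain = LLMChain(llm=llm, prompt=chat_prompt, memory=memory)

# First call
result1 = chain.invoke({"input": "Explain Python decorators to me."})

# Second call. The chain loads the previous Q&A from memory and fills the history slot.
result2 = chain.invoke({"input": "Now explain how I'd use one in a real project."})

The magic is that LangChain loads the stored turns (here, the last three) and slots them into the placeholder you reserved in the prompt template. Without that placeholder the history has nowhere to go and is silently dropped, so the model never sees it. You can use simpler memory (like just storing the last message) or far more complex memory that summarizes past interactions. The key is that you, the developer, aren’t manually pasting history into every new prompt. You’d get that wrong. Let the framework handle it.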
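A windowed memory is less mysterious than it sounds. The sketch below is a hypothetical WindowMemory class built on the standard library, not LangChain's own code; it only assumes that a “turn” is one question plus one answer.

```python
from collections import deque


class WindowMemory:
    """Hypothetical memory that keeps only the last k question/answer turns."""

    def __init__(self, k: int):
        # A bounded deque drops the oldest turn automatically when full.
        self.turns = deque(maxlen=k)

    def save(self, question: str, answer: str) -> None:
        self.turns.append((question, answer))

    def as_messages(self) -> list:
        # Flatten the stored turns into role-tagged messages for the prompt.
        messages = []
        for question, answer in self.turns:
            messages.append({"role": "human", "content": question})
            messages.append({"role": "ai", "content": answer})
        return messages


memory = WindowMemory(k=2)
memory.save("q1", "a1")
memory.save("q2", "a2")
memory.save("q3", "a3")  # pushes the first turn out of the window

history = memory.as_messages()
```

The trade-off is visible right away: a small window keeps prompts short and cheap, but anything older than k turns is gone for good.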

Agents: Giving Your LLM a Swiss Army Knife

Here’s where things get genuinely wild. An agent is an LLM that has been empowered to use tools. The LLM becomes a reasoning engine that decides what to do next. It thinks, “Hmm, the user asked for the weather. I need to use the ‘get_weather’ tool,” and then it uses the result to form its answer.

from langchain.agents import AgentType, initialize_agent
from langchain_core.tools import Tool

# A dummy function to simulate a tool
def get_company_stock_price(company_name: str) -> str:
    """A fake tool that returns a mock stock price. Imagine this calls a real API."""
    return f"${len(company_name) * 47.83:.2f}"

# Wrap the function in a LangChain Tool
stock_tool = Tool(
    name="get_stock_price",
    func=get_company_stock_price,
    description="Useful for when you need to find the stock price of a company."
)

# Give the agent access to the tool
tools = [stock_tool]
agent = initialize_agent(tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True)

# Watch the agent reason, choose a tool, and execute it.
result = agent.invoke({"input": "What is the stock price of NVIDIA?"})
print(result["output"])

When you run this with verbose=True, you’ll see the agent’s internal monologue: its “Thought,” the “Action” it decides to take, and the “Observation” from the tool. It’s like watching a tiny, hyper-efficient intern reason through a problem. The catch? You absolutely must give your tools clear, precise descriptions. The LLM only knows what you tell it about the tool. A bad description leads to the agent trying to use a hammer to screw in a lightbulb.
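To make the point about descriptions concrete, here is a stdlib-only sketch of the two jobs a tool registry does: telling the model what exists, and dispatching the action it picks. SimpleTool, describe_tools, and run_action are hypothetical names, not LangChain APIs.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SimpleTool:
    """Hypothetical stand-in for a tool: a name, a callable, a description."""

    name: str
    func: Callable[[str], str]
    description: str


def describe_tools(tools) -> str:
    # This text is everything the model gets to know about each tool.
    return "\n".join(f"{tool.name}: {tool.description}" for tool in tools)


def run_action(tools, action: str, action_input: str) -> str:
    # Dispatch the model's chosen action to the matching tool by name.
    by_name = {tool.name: tool for tool in tools}
    if action not in by_name:
        return f"Unknown tool: {action}"
    return by_name[action].func(action_input)


tools = [
    SimpleTool(
        name="get_stock_price",
        func=lambda name: f"${len(name) * 47.83:.2f}",  # mock price, no real API
        description="Returns the current stock price for a company name.",
    ),
]

tool_prompt = describe_tools(tools)
observation = run_action(tools, "get_stock_price", "NVIDIA")
```

Notice that the function body never reaches the model. Only the name and description do, so that one line of text is the whole contract.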

The most common pitfall with agents is letting them spiral. Without careful tuning, an agent can get stuck in a loop of useless tool calls. Cap the damage with hard limits: the agent executor accepts max_iterations and max_execution_time, and initialize_agent passes those keyword arguments through to it. It’s a powerful paradigm, but it trades predictability for flexibility. Use it where you need that flexibility, not for every simple task.
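An iteration cap is simple enough to sketch by hand. The loop below is a hypothetical, stripped-down agent loop, not LangChain's executor; both “models” are scripted fakes, and the step dictionaries are an invented format for illustration.

```python
def run_agent(decide, tools, question, max_iterations=5):
    """Loop until the model gives a final answer or the step budget runs out."""
    scratchpad = []
    for _ in range(max_iterations):
        step = decide(question, scratchpad)
        if step["type"] == "final":
            return step["answer"]
        observation = tools[step["tool"]](step["input"])
        scratchpad.append((step["tool"], step["input"], observation))
    return "Stopped: iteration limit reached."


def looping_model(question, scratchpad):
    # A deliberately badly behaved fake model that never finishes.
    return {"type": "tool", "tool": "echo", "input": question}


def well_behaved_model(question, scratchpad):
    # Calls the tool once, then answers using the observation.
    if not scratchpad:
        return {"type": "tool", "tool": "echo", "input": question}
    return {"type": "final", "answer": scratchpad[-1][2]}


calls = []  # records every tool call so the cost of a loop is visible
tools = {"echo": lambda text: calls.append(text) or text}

stuck = run_agent(looping_model, tools, "ping", max_iterations=3)
done = run_agent(well_behaved_model, tools, "ping")
```

The looping model burns exactly three tool calls and then gets cut off, instead of running up your bill until someone notices.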