May 5, 2026 · 10 min read · Tools

LangChain in Python: From First Chain to Production Agent

LangChain is the leading open-source Python framework for building LLM-powered applications. This guide covers LCEL, RAG pipelines, agents, LangGraph, LangSmith, and every core component with working code.

LangChain is an open-source Python framework that gives you composable building blocks for LLM-powered applications: prompt templates, output parsers, chains, retrievers, memory, and agents. Raw API calls get you a single model response. LangChain gets you 135,000-star infrastructure for multi-step workflows, RAG pipelines, and autonomous agents.

It provides a unified interface across 80+ model providers, and 35% of Fortune 500 companies use it in production. The framework hit its 1.0 release in October 2025, consolidating its architecture around LangChain Expression Language (LCEL) and LangGraph.

This guide covers everything you need to know about LangChain: core components, LCEL syntax, RAG pipelines, agents, the broader ecosystem (LangGraph, LangSmith, LangServe), and how it compares to LlamaIndex, Haystack, and CrewAI.

Key Takeaways

  • LangChain standardizes LLM calls across 80+ providers. Swap from OpenAI to Anthropic by changing one import.
  • LCEL (the | pipe operator) is the modern way to compose chains. LLMChain and SequentialChain are deprecated.
  • RAG (Retrieval-Augmented Generation) is the most common use case. Chroma and FAISS are the top vector stores.
  • LangGraph handles complex, stateful agents. Use it whenever your workflow needs loops or conditional branching.
  • Add LangSmith from day one. Without it, your LangChain app is a black box.

What Is LangChain?

LangChain is an open-source Python framework designed to solve the limitations of raw LLM API calls. On their own, LLMs cannot retain context between calls, access live data, or invoke external tools. LangChain abstracts the glue code that connects these capabilities into reusable components.

Harrison Chase started LangChain as a side project in fall 2022. By October 2025, the repository had 135,000 stars, 22,300 forks, and 235 million monthly downloads.

The company behind it, LangChain Inc., raised a Series B of $125M in October 2025 at a $1.25B valuation, totaling roughly $260M in funding.

Why LangChain Matters in 2026

LangChain's 1.0 release in October 2025 was a consolidation point. The framework unified its chain and agent APIs around a single abstraction built on LangGraph, its graph-based orchestration layer. The result is a cleaner architecture that handles everything from a simple one-step summarization chain to a multi-agent system with persistent memory.

The State of AI 2024 report shows what developers are actually building: 43% of LangSmith organizations now send LangGraph traces, signaling a shift from simple retrieval workflows to agentic, multi-step applications. Open-source model adoption doubled, with Ollama and Groq breaking into the top 5 providers. 84.7% of usage comes from the Python SDK.

How LangChain Works: The Seven Core Components

LangChain's architecture centers on seven composable building blocks. You rarely use all seven at once. Every app needs at minimum Chat Models and Prompt Templates.

RAG adds Retrievers. Chatbots add Memory. Autonomous workflows add Agents and Tools.

| Component | What it does | When to use |
| --- | --- | --- |
| LCEL (Chain) | Composes LLM calls with data transformations | Every LLM application |
| Chat Model | Unified interface to any LLM provider | Every LLM application |
| Prompt Template | Structures and parameterizes input to the LLM | When you need consistent prompt formatting |
| Output Parser | Structures LLM output into typed objects | When you need structured data from LLMs |
| Retriever | Fetches relevant documents from a data source | RAG applications |
| Memory | Stores conversation history | Chatbots and multi-turn interactions |
| Agent + Tool | LLM decides which functions to call and in what order | Dynamic, multi-step workflows |

Installation

LangChain uses a modular package structure. Install only what your project needs:

Shell
pip install langchain langchain-openai
pip install langchain-anthropic          # for Claude
pip install langchain-community          # community integrations
pip install langchain-text-splitters     # document chunking
pip install python-dotenv                # env variable management

Requires Python 3.10+. Set API keys via environment variables, never in source code:

Python
from dotenv import load_dotenv
load_dotenv()  # reads from .env file automatically

Component 1: Chat Models

LangChain wraps every LLM provider into a consistent interface. You switch providers by changing one import, nothing else:

Python
from langchain_openai import ChatOpenAI
from langchain_anthropic import ChatAnthropic

# OpenAI
llm = ChatOpenAI(model="gpt-4o", temperature=0)

# Anthropic — identical interface
llm = ChatAnthropic(model="claude-opus-4-6", temperature=0)

response = llm.invoke("What is prompt engineering in one sentence?")
print(response.content)

For streaming (tokens appear as they generate):

Python
for chunk in llm.stream("Write a haiku about Python."):
    print(chunk.content, end="", flush=True)

Component 2: Prompt Templates

Hard-coded prompts in strings break as soon as you need to vary any input. Prompt templates parameterize prompts and enforce consistent structure:

Python
from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are an expert in {domain}. Answer concisely."),
    ("user", "{question}")
])

messages = prompt.format_messages(
    domain="Python automation",
    question="What is the best way to handle retries?"
)

response = llm.invoke(messages)

For few-shot learning, use FewShotChatMessagePromptTemplate to inject labeled examples before the user's message.
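
A minimal sketch of that pattern (the example pairs and the system message here are placeholders):

Python
from langchain_core.prompts import ChatPromptTemplate, FewShotChatMessagePromptTemplate

# Template for rendering each labeled example as a human/AI message pair
example_prompt = ChatPromptTemplate.from_messages([
    ("human", "{input}"),
    ("ai", "{output}"),
])

few_shot = FewShotChatMessagePromptTemplate(
    example_prompt=example_prompt,
    examples=[
        {"input": "How do I read a file?", "output": "Use pathlib.Path.read_text()."},
        {"input": "How do I parse JSON?", "output": "Use json.loads()."},
    ],
)

# The examples are injected between the system message and the real question
final_prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a terse Python assistant."),
    few_shot,
    ("human", "{input}"),
])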

Component 3: LCEL and the Pipe Operator

LCEL (LangChain Expression Language) is the modern way to compose chains. It uses | like Unix pipes. LLMChain and SequentialChain from older tutorials are deprecated. Use LCEL exclusively:

Python
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

chain = (
    ChatPromptTemplate.from_template("Summarise this in one sentence: {text}")
    | llm
    | StrOutputParser()
)

result = chain.invoke({"text": "LangChain is a framework for building LLM applications."})
print(result)  # plain string, no AIMessage wrapper

You can chain multiple steps. Each step's output feeds the next:

Python
summarise_prompt = ChatPromptTemplate.from_template(
    "Summarise this text in 2 sentences: {text}"
)
translate_prompt = ChatPromptTemplate.from_template(
    "Translate this to Spanish: {summary}"
)
parser = StrOutputParser()

chain = (
    {"summary": summarise_prompt | llm | parser}
    | translate_prompt
    | llm
    | parser
)

result = chain.invoke({"text": "Your long English document here..."})

For concurrent steps, RunnableParallel runs multiple chains at the same time:

Python
from langchain_core.runnables import RunnableParallel

# All three prompts share the same {topic} input
pros_prompt = ChatPromptTemplate.from_template("List three pros of {topic}.")
cons_prompt = ChatPromptTemplate.from_template("List three cons of {topic}.")
summary_prompt = ChatPromptTemplate.from_template("Summarise {topic} in one sentence.")

chain = RunnableParallel(
    pros=pros_prompt | llm | StrOutputParser(),
    cons=cons_prompt | llm | StrOutputParser(),
    summary=summary_prompt | llm | StrOutputParser(),
)

results = chain.invoke({"topic": "LangChain"})
# results is a dict with keys "pros", "cons", "summary";
# all three LLM calls ran concurrently

Component 4: Output Parsers

Output parsers extract structured data from LLM responses. Three main types:

StrOutputParser returns a plain string. Use it whenever you just need the text.

JsonOutputParser extracts a JSON object. Useful for light structure.
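
For example, a quick sketch that asks the model for a small JSON object and gets back a Python dict (the keys are illustrative, and it reuses the llm defined earlier):

Python
from langchain_core.output_parsers import JsonOutputParser
from langchain_core.prompts import ChatPromptTemplate

json_chain = (
    ChatPromptTemplate.from_template(
        "Return a JSON object with keys 'library' and 'one_line_pitch' for: {topic}"
    )
    | llm
    | JsonOutputParser()
)

result = json_chain.invoke({"topic": "httpx"})
print(result["library"])  # parsed into a plain Python dict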

PydanticOutputParser gives you type-safe output with validation:

Python
from pydantic import BaseModel, Field
from langchain_core.output_parsers import PydanticOutputParser
from langchain_core.prompts import ChatPromptTemplate

class APIEndpoint(BaseModel):
    path: str = Field(description="The API endpoint path")
    method: str = Field(description="HTTP method (GET, POST, etc.)")
    description: str = Field(description="What this endpoint does")
    params: list[str] = Field(description="Required parameters")

parser = PydanticOutputParser(pydantic_object=APIEndpoint)

prompt = ChatPromptTemplate.from_template(
    "Design an API endpoint for: {use_case}\n\n{format_instructions}"
)

chain = prompt | llm | parser

result = chain.invoke({
    "use_case": "user authentication",
    "format_instructions": parser.get_format_instructions(),
})

print(result.path)    # "/api/auth/login"
print(result.method)  # "POST"

Building a RAG Pipeline with LangChain

Retrieval-Augmented Generation is LangChain's most common use case. The pattern: split documents into chunks, embed them into a vector store, retrieve the relevant chunks at query time, and pass them as context to the LLM. Chroma and FAISS remain the top two vector stores in the LangChain community, with Milvus, MongoDB, and Elasticsearch entering the top 10 in 2024.

A minimal RAG pipeline:

Python
from langchain_community.document_loaders import TextLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain_community.vectorstores import FAISS
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser

# 1. Load your document
loader = TextLoader("my_docs.txt")
docs = loader.load()

# 2. Split into chunks
splitter = RecursiveCharacterTextSplitter(
    chunk_size=500,
    chunk_overlap=50
)
chunks = splitter.split_documents(docs)

# 3. Embed and store
embeddings = OpenAIEmbeddings()
vectorstore = FAISS.from_documents(chunks, embeddings)
retriever = vectorstore.as_retriever(search_kwargs={"k": 3})

# 4. Build the RAG chain
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

prompt = ChatPromptTemplate.from_template(
    """Answer using only the context below.
If the answer is not in the context, say "I don't know."

Context: {context}

Question: {question}"""
)

def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

answer = rag_chain.invoke("What is the refund policy?")
print(answer)

Adding Memory to a Chatbot

For multi-turn conversations, wrap your chain with RunnableWithMessageHistory. Use InMemoryChatMessageHistory for development, Redis or a database for production:

Python
from langchain_core.chat_history import InMemoryChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a Python automation expert."),
    MessagesPlaceholder(variable_name="history"),
    ("human", "{input}"),
])

chain = prompt | llm | StrOutputParser()

store = {}

def get_session_history(session_id: str) -> InMemoryChatMessageHistory:
    if session_id not in store:
        store[session_id] = InMemoryChatMessageHistory()
    return store[session_id]

chain_with_memory = RunnableWithMessageHistory(
    chain,
    get_session_history,
    input_messages_key="input",
    history_messages_key="history",
)

config = {"configurable": {"session_id": "user_42"}}

r1 = chain_with_memory.invoke({"input": "I'm building a web scraper."}, config=config)
r2 = chain_with_memory.invoke({"input": "What async library should I use?"}, config=config)
# r2 knows the context from r1

What You Can Build with LangChain

LangChain covers six primary use cases. Most production applications combine two or more:

RAG applications: Query private documents, internal wikis, codebases, or databases. This is where most enterprise LangChain deployments start. Legal document search, customer support over internal docs, and engineering knowledge bases are the most common implementations.

Conversational chatbots: Stateful assistants that remember conversation history across turns. The RunnableWithMessageHistory wrapper handles the threading. Swap InMemoryChatMessageHistory for Redis when you need session persistence across restarts.

Autonomous agents: LLMs that decide which tools to invoke based on the task. An agent might call a web search tool, a calculator, and a database query tool in sequence, choosing the order based on intermediate results. Since LangChain 1.0, create_agent() is the standard entry point:

Python
from langchain.agents import create_agent

def search_docs(query: str) -> str:
    """Search internal documentation for the given query."""
    return retriever.invoke(query)[0].page_content

agent = create_agent(
    model="openai:gpt-4o",
    tools=[search_docs],
    system_prompt="You are a helpful assistant for Python developers.",
)

result = agent.invoke({
    "messages": [{"role": "user", "content": "How do I handle rate limiting in the API?"}]
})

Structured data extraction: Convert unstructured text (emails, PDFs, contracts) into typed Python objects using PydanticOutputParser.

Multi-step pipelines: Summarization followed by translation followed by classification. LCEL's pipe operator makes these readable. RunnableParallel handles steps that can run concurrently.

Code analysis and generation: Agents that read a codebase, identify issues, generate fixes, and run tests. Common in developer tooling.

The LangChain Ecosystem

LangChain is not just a library. The company ships four integrated products:

LangGraph

Released in March 2024, LangGraph is the low-level orchestration framework for stateful agents. It models workflows as directed graphs where nodes are Python functions and edges are transitions. Use it when your agent needs loops, conditional branching, or persistent state that survives between invocations.

Klarna, Replit, and Elastic run LangGraph in production. 43% of LangSmith organizations now send LangGraph traces, a number that has grown steadily since its launch.

The key capabilities LangGraph adds over a standard agent:

  • Durable execution: Agents persist through failures and can run for extended periods.
  • Short and long-term memory: Working memory for ongoing reasoning, plus persistent memory across sessions.
  • Human-in-the-loop: Pause execution to request human approval before continuing.
  • Multi-agent coordination: Multiple specialized agents working on different sub-tasks.

If your workflow fits a linear chain, use create_agent(). If it needs loops or branches, move to LangGraph.
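
A minimal sketch of that graph model, assuming langgraph is installed (the node logic is hardcoded for illustration; in a real graph the nodes would call your LLM and tools):

Python
from typing import TypedDict
from langgraph.graph import StateGraph, START, END

class State(TypedDict):
    draft: str
    approved: bool

def generate(state: State) -> dict:
    # Placeholder for an LLM call that produces or revises a draft
    return {"draft": "proposed answer"}

def review(state: State) -> dict:
    # Placeholder validation step
    return {"approved": bool(state["draft"])}

def route(state: State) -> str:
    # Loop back to "generate" until the draft passes review
    return END if state["approved"] else "generate"

graph = StateGraph(State)
graph.add_node("generate", generate)
graph.add_node("review", review)
graph.add_edge(START, "generate")
graph.add_edge("generate", "review")
graph.add_conditional_edges("review", route)

app = graph.compile()
result = app.invoke({"draft": "", "approved": False})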

LangSmith

LangSmith is the observability, debugging, and evaluation platform. It captures traces of every LLM call, lets you inspect inputs and outputs, and provides tools for evaluating retrieval quality and prompt performance.

Almost 30,000 new users sign up for LangSmith every month. It's framework-agnostic: 15.7% of its traces come from applications built outside of LangChain entirely. Add it during development, before you need it in production.
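
Enabling tracing requires no code changes for a LangChain app; you set the LangSmith environment variables before starting it. A minimal sketch (variable names as documented for LangSmith; verify against your SDK version):

Shell
export LANGCHAIN_TRACING_V2=true
export LANGCHAIN_API_KEY="<your-langsmith-api-key>"
export LANGCHAIN_PROJECT="my-app"        # optional: groups traces by project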

LangServe

LangServe turns any LCEL chain into a REST API. It builds a FastAPI application with schema validation, a playground interface, and streaming support. Use it when you need to expose a chain as an HTTP endpoint.
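
A minimal sketch, assuming langserve (with its server extras) and fastapi are installed and reusing the rag_chain built earlier:

Python
from fastapi import FastAPI
from langserve import add_routes

app = FastAPI(title="RAG API")

# Exposes POST /rag/invoke, /rag/stream, and /rag/batch, plus a playground UI
add_routes(app, rag_chain, path="/rag")

# Run with: uvicorn main:app --port 8000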

Deep Agents

Introduced in 2025, Deep Agents are "batteries-included" agents with automatic context compression, a virtual filesystem, and sub-agent spawning. They sit on top of LangChain agents and are intended for long-running, complex tasks.

LangChain vs LlamaIndex vs Haystack vs CrewAI

| Framework | Best for | Notable users | Weakness |
| --- | --- | --- | --- |
| LangChain | Broad agent-first platform, 80+ providers, RAG + chatbots + agents | Klarna, Replit, Elastic | Abstraction layers can obscure behavior |
| LlamaIndex | Document-centric ingestion and advanced indexing for complex RAG | N/A | Less mature agent tooling |
| Haystack | Enterprise retrieval pipelines, production-grade deployments | Airbus, Netflix | Steeper learning curve |
| CrewAI | Multi-agent orchestration with role-based agent definitions | N/A | Fewer integrations than LangChain |

Pick LangChain when you need a unified interface across multiple LLM providers, your app combines RAG and agents, or you want LangSmith for observability out of the box. 84.7% of the LangChain community works in Python.

Pick LlamaIndex when your primary challenge is document ingestion and complex indexing strategies, not agent behavior.

Pick Haystack when you need enterprise-grade retrieval pipelines with strict production SLAs, the kind of requirements you see at companies like Airbus and Netflix.

Pick CrewAI when your architecture requires multiple agents with distinct roles collaborating on a shared task, and you want an opinionated multi-agent framework rather than building it yourself on LangGraph.

When not to use LangChain at all: simple, single-model API calls where you only need one response and no chaining. In that case, call the provider SDK directly. LangChain's abstraction layers add complexity that isn't worth it for trivial tasks.
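
For comparison, a single-response call through the OpenAI SDK directly looks like this (a minimal sketch with no LangChain involved):

Python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Summarise LangChain in one sentence."}],
)
print(response.choices[0].message.content)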

Common LangChain Mistakes to Avoid

Using LLMChain and SequentialChain

These classes have been deprecated since LangChain 0.3. Dozens of 2023 tutorials still use them, and the code will break or produce deprecation warnings on current versions. Replace every LLMChain(prompt=..., llm=...) call with LCEL:

Python
# Deprecated (don't use)
from langchain.chains import LLMChain
chain = LLMChain(prompt=prompt, llm=llm)

# Current (use this)
chain = prompt | llm | StrOutputParser()

Hardcoding API Keys

Environment variables and .env files are the correct approach. Never write OPENAI_API_KEY = "sk-..." in your source code. Use python-dotenv and a .gitignore entry for .env.

Skipping LangSmith

LangChain applications without tracing are opaque. You can't tell which step failed, what the LLM actually received, or why retrieval returned the wrong documents.

Add LangSmith from the first day of development. Retroactively adding it after a production failure is painful.

Using InMemoryChatMessageHistory in Production

In-memory session storage disappears on every restart. Any user who was mid-conversation loses their history. Use Redis, PostgreSQL, or a managed backend for production chatbots.

The API is identical: swap the storage backend in get_session_history, not in the chain itself.
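
A sketch of that swap, assuming a Redis instance on localhost and the langchain-community Redis integration (pip install redis as well):

Python
from langchain_community.chat_message_histories import RedisChatMessageHistory

def get_session_history(session_id: str) -> RedisChatMessageHistory:
    # History now survives restarts; the chain itself is unchanged
    return RedisChatMessageHistory(
        session_id=session_id,
        url="redis://localhost:6379/0",
    )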

Ignoring Chunk Size in RAG

The default RecursiveCharacterTextSplitter settings (chunk_size=500, chunk_overlap=50) work for many documents but not all. Legal contracts, code files, and highly technical documentation often need different chunk sizes.

Test retrieval quality with multiple configurations before shipping. A wrong chunk size is the most common reason RAG returns irrelevant context.
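
One rough way to sanity-check chunking before shipping is to rebuild the index with a few candidate sizes and inspect what the retriever returns for questions you know the answer to. A sketch reusing the docs, splitter, and vector store from the RAG example above:

Python
for chunk_size in (300, 500, 1000):
    splitter = RecursiveCharacterTextSplitter(
        chunk_size=chunk_size,
        chunk_overlap=chunk_size // 10,
    )
    chunks = splitter.split_documents(docs)
    store = FAISS.from_documents(chunks, OpenAIEmbeddings())
    hits = store.as_retriever(search_kwargs={"k": 3}).invoke("What is the refund policy?")

    print(f"chunk_size={chunk_size}")
    for doc in hits:
        # Print the first 80 characters of each retrieved chunk for inspection
        print("  ", doc.page_content[:80].replace("\n", " "))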

Building Complex Agents Without LangGraph

create_agent() is sufficient for agents that follow a single linear path. As soon as your workflow needs a loop ("retry until the output passes validation"), a conditional branch ("if error, escalate to human; otherwise continue"), or parallel sub-tasks, you need LangGraph. Building these patterns on top of basic agents leads to fragile code.

Conclusion

LangChain gives Python developers a production-tested path from a single LLM call to a full agent system. LCEL handles chain composition, LangGraph handles stateful orchestration, and LangSmith handles observability. The three work together as an integrated stack.

The best next step: install langchain and langchain-openai, write your first LCEL chain (prompt template, model, output parser), then read the RAG pipeline tutorial to query your own documents. Once you're ready to move to agents, the LangChain agents tutorial covers tool definition, create_agent(), and when to upgrade to LangGraph.

Frequently Asked Questions