LangChain's Deep Agents Are Actually Worth Your Time
How Deep Agents Cut Boilerplate and Keep Your Agent Context Lean
The LangChain ecosystem has been evolving fast, making it easier for users to adopt. Their docs show the breadth of integrations, abstractions, wrappers, and SDK support they provide. Recently they introduced what they call an “agent harness” in the form of “deep-agents”.
The boilerplate code you’ve been rewriting for every project is no longer needed. All you bring is your tools and your domain logic; deep-agents takes care of planning, file storage, context overflow, subagent delegation, and middleware. One line to create, one line to invoke.
It’s like a manager who knows exactly whom to hire, what rules to follow, and how to coach the existing hires so that complex projects get delivered.
In this article, we focus on the three customization features I think are most immediately useful for working Python/LangChain developers: custom tools, middleware, and subagents.
Custom tools:
Let’s say I want a custom tool that runs an “internet search”.
Without deep-agents, achieving this took a pile of boilerplate:
Define the tool using the @tool decorator
Pick a model
Build a prompt template
Create a tool-calling agent
Wrap it all in an AgentExecutor
Invoke the agent
On top of that, we handled chat history formatting, context overflow, summarization, subagents, and the rest all by ourselves.
Here’s what changes with deep-agents. Once you define your tool, all you write is this:
import os
from langchain.tools import tool
from deepagents import create_deep_agent
from tavily import TavilyClient

tavily = TavilyClient(api_key=os.environ["TAVILY_API_KEY"])

@tool
def internet_search(query: str) -> str:
    """Search the internet."""
    return tavily.search(query)

agent = create_deep_agent(
    model="claude-sonnet-4-6",
    tools=[internet_search],
    system_prompt="You are a helpful assistant",
)

result = agent.invoke({
    "messages": [{
        "role": "user",
        "content": "Latest AI news?"
    }]
})
Planning, summarization, the file system, and middleware are all included by default.
Pay attention to the tool’s docstring: the model reads it the way it reads a prompt. Apply your prompt engineering magic wand there, with clear, action-oriented, no-fluff docstrings.
Use type hints liberally; the SDK uses them to build the tool schema.
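To see why those hints matter, here is a rough, hypothetical sketch of how an SDK can turn a signature into a tool schema. The real deep-agents schema builder is more thorough; `tool_schema` and `PY_TO_JSON` are names I made up for illustration.

```python
import inspect

# Minimal mapping from Python annotations to JSON Schema types (illustrative only).
PY_TO_JSON = {str: "string", int: "integer", float: "number", bool: "boolean", dict: "object"}

def tool_schema(fn):
    """Build a tool schema from a function's signature and docstring."""
    sig = inspect.signature(fn)
    props, required = {}, []
    for name, param in sig.parameters.items():
        props[name] = {"type": PY_TO_JSON.get(param.annotation, "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default => the model must supply it
    return {
        "name": fn.__name__,
        "description": (fn.__doc__ or "").strip(),  # the docstring becomes the tool's prompt
        "parameters": {"type": "object", "properties": props, "required": required},
    }

def internet_search(query: str, max_results: int = 5) -> dict:
    """Search the web for current information on a topic."""
    ...

schema = tool_schema(internet_search)
```

Note how a missing type hint would degrade the schema, and a missing default silently makes a parameter required: this is why sloppy signatures produce confused agents.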
Middleware:
Here’s when things get interesting!
The deep-agents SDK ships with solid middleware built in and enabled by default:
Not long ago, we tracked token counts manually and, when they got too high, wrote a summarization step: an extra LLM call that compressed the history and spliced it back into the messages list. That is now handled by SummarizationMiddleware, which automatically compresses old conversation history as the context window fills up. No more silent failures when a long session overflows the context window.
If a tool call got interrupted mid-flight, the message history would break, the next invocation would return nothing, and we wrote defensive cleanup code to detect and repair the history ourselves. That’s now handled by PatchToolCallsMiddleware.
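For contrast, here is roughly what the hand-rolled version looked like. This is a hypothetical sketch: `summarize()` stands in for the extra LLM call, and a character budget stands in for token counting.

```python
# Hand-rolled history compaction (what SummarizationMiddleware now does for you).
def summarize(messages):
    """Stand-in for an LLM call that compresses old turns into one message."""
    return {"role": "system",
            "content": f"[summary of {len(messages)} earlier messages]"}

def compact_history(messages, max_chars=200, keep_last=2):
    """If the history exceeds the budget, replace old turns with a summary."""
    total = sum(len(m["content"]) for m in messages)
    if total <= max_chars:
        return messages  # still under budget, nothing to do
    old, recent = messages[:-keep_last], messages[-keep_last:]
    return [summarize(old)] + recent  # summary first, recent turns intact
```

Every project re-implemented some variant of this, usually with its own off-by-one bugs around which turns to keep.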
Prompt caching was rarely done right. It was either manually structured to take advantage of it or missed entirely, a question with no right answer. With deep-agents, AnthropicPromptCachingMiddleware reduces redundant token processing when using Anthropic models.
TodoListMiddleware: tracks the agent’s internal task list
FilesystemMiddleware: handles all file read/write operations
SubAgentMiddleware: manages spawning and coordinating subagents
Custom middleware also covers logging, which used to mean wrapping every tool call in try/except with loggers.
Here’s an example for a custom middleware used for logging tool calls:
from langchain.agents.middleware import wrap_tool_call
from deepagents import create_deep_agent

call_count = [0]  # a list works for a simple case like this

@wrap_tool_call
def log_tool_calls(request, handler):
    """Log every tool call with its arguments."""
    call_count[0] += 1
    print(f"[Tool Call #{call_count[0]}] {request.name} — args: {request.args}")
    result = handler(request)
    print(f"[Tool Call #{call_count[0]}] Completed")
    return result

agent = create_deep_agent(middleware=[log_tool_calls])
One thing to note: don’t mutate instance attributes in middleware (e.g. self.counter += 1). The agent can run subagents and parallel tool calls concurrently, which means shared mutable state will race. The docs recommend storing counter/accumulator state in LangGraph’s graph state instead. The list trick above works for simple cases, but for anything production-grade, use graph state.
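If you aren’t ready to move counters into graph state, a minimal in-process workaround is to guard the shared state with a lock. This is my own sketch, not the docs’ recommendation; unlike graph state, it won’t survive checkpointing or multi-process runs.

```python
import threading

# Shared counter guarded by a lock, so concurrent tool calls can't race on +=.
call_count = {"n": 0}
_lock = threading.Lock()

def record_tool_call():
    with _lock:
        call_count["n"] += 1

# Simulate parallel tool calls hammering the counter.
threads = [threading.Thread(target=record_tool_call) for _ in range(100)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```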
Subagents:
Things get even more interesting with complex workflows. Having the main agent hold the context for every subtask is a bad idea: that’s how the context window gets blown up and the agent forgets what it was doing. Instead, spawn a subagent for isolated deep work, let it carry its own context, and have it report back.
For example:
from deepagents import create_deep_agent

research_subagent = {
    "name": "research-agent",
    "description": "Used to research more in-depth questions",
    "system_prompt": "You are a thorough researcher who cites sources.",
    "tools": [internet_search],
    "model": "openai:gpt-5.2",  # can use a different model than the main agent!
}

agent = create_deep_agent(
    model="claude-sonnet-4-6",
    subagents=[research_subagent]
)
Note:
Keep the description crisp and specific; it’s what the main agent uses to decide when to delegate.
Match each model to the complexity of its task. Subagents can run different models than the main agent, which is also a way to reduce costs while keeping orchestration logic in Python rather than in prompts.
The subagent and main agent contexts are independent. The subagent does its deep work and returns a result; the main agent never sees the intermediate steps. This keeps the main agent’s context lean and focused.
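The effect is easy to simulate outside any framework. In this toy sketch (my own names, not deep-agents internals), the subagent burns through ten intermediate steps, but only its final answer lands in the main agent’s message list:

```python
def run_subagent(task: str) -> str:
    # Intermediate scratch work stays inside the subagent and is discarded.
    scratch = [f"step {i}: search, read, take notes" for i in range(10)]
    return f"findings for {task!r} (condensed from {len(scratch)} steps)"

# The main agent delegates and only stores the subagent's final report.
main_messages = [{"role": "user", "content": "Summarize open-source LLM news"}]
main_messages.append({"role": "tool", "content": run_subagent("research LLM news")})
```

Ten steps of research cost the main context exactly one message; without isolation, all ten would have landed in it.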
Let’s bring it all together:
import os
from langchain.tools import tool
from deepagents import create_deep_agent
from langchain.agents.middleware import wrap_tool_call
from tavily import TavilyClient

tavily = TavilyClient(api_key=os.environ["TAVILY_API_KEY"])

@tool
def internet_search(query: str, max_results: int = 5) -> dict:
    """Search the web for current information on a topic."""
    return tavily.search(query, max_results=max_results)

@wrap_tool_call
def audit_middleware(request, handler):
    """Log all tool calls to an audit trail."""
    print(f"AUDIT: {request.name}({request.args})")
    return handler(request)

research_subagent = {
    "name": "deep-researcher",
    "description": "Handles any task requiring thorough web research or fact-checking",
    "system_prompt": "You are a meticulous researcher. Always verify facts from multiple sources.",
    "tools": [internet_search],
}

agent = create_deep_agent(
    model="anthropic:claude-sonnet-4-6",
    system_prompt="You are a helpful analyst. Delegate research tasks to the deep-researcher subagent.",
    tools=[internet_search],
    middleware=[audit_middleware],
    subagents=[research_subagent],
)

result = agent.invoke({
    "messages": [{
        "role": "user",
        "content": "Write a summary of recent developments in open-source LLMs."
    }]
})
This is not all.
The three features above — custom tools, middleware, and subagents — are the first things worth reaching for. Tools give your agent domain knowledge. Middleware gives you observability and control. Subagents give you scale without context blowout.
Try pip install -U deepagents
The full customization docs are worth a read once you’ve got the basics working. They cover a lot more than I’ve touched here: backends (virtual filesystems, sandboxed code execution), human-in-the-loop approvals, persistent memory, and skills (reusable instruction sets).
I’ll be covering more in the upcoming articles.
If you found this useful, feel free to share and hit the ❤️ button!
