How to Build Human Approval for High-Risk Agent Tools with DeepAgents and LangGraph

This article explains how to use DeepAgents, LangGraph, and LangChain to build an agent with human approval: standard queries run automatically, while high-risk delete operations are interrupted, reviewed, and then resumed. This pattern reduces the risk of unauthorized agent behavior and irreversible mistakes. Keywords: Human-in-the-Loop, LangGraph, DeepAgents.

The technical specification snapshot defines the solution at a glance

Parameter and description:
Core language: Python
Agent framework: DeepAgents
Orchestration and state management: LangGraph
Model access protocol: OpenAI-compatible API
Example models: kimi-k2.5 / Tongyi Qianwen-compatible endpoint
Checkpoint implementation: InMemorySaver
Risk control mechanism: interrupt_on human-interrupt approval
Typical actions: query table, delete table, delete file
Code repository: deepagents-practice

This architecture creates an auditable safety gate for agent actions

Agents are powerful because they can plan tasks and call tools on their own. But when a high-risk action is triggered by mistake, the damage is often irreversible: a dropped table, a deleted file, a transfer of funds, or a changed production configuration.

For that reason, the better strategy is not to “hope the model is careful.” Instead, define safety boundaries at the framework layer: low-risk tools execute automatically, while high-risk tools must pause and wait for human approval.

(Cover image: DeepAgents human-in-the-loop walkthrough, using LangGraph to implement human approval for high-risk agent tools. It illustrates the flow in which the agent is interrupted before executing dangerous tools, then reviewed and resumed.)

The example task shows how human intervention is triggered

The example task is: first query the product table, then delete the user table, and finally delete the lucaju.txt file. Querying is a low-risk action and can run directly. Deleting a table and deleting a file are high-risk actions and must be approved.

user_task = "First query data from the product table. Then delete the user table. Finally, delete the lucaju.txt file"
# Queries can run automatically, but delete operations must go through human approval

This input validates that automatic execution and human intervention can coexist within the same task.

Model initialization must explicitly define the compatibility protocol and provider configuration

This article uses an OpenAI-compatible endpoint to access the model. For that reason, model_provider is set to openai, and base_url points to a DashScope-compatible endpoint. The main benefit is a unified model invocation pattern, which makes it easier to swap the underlying model later.

import os
from langchain.chat_models import init_chat_model

llm = init_chat_model(
    model="kimi-k2.5",
    model_provider="openai",  # Use the OpenAI-compatible protocol
    api_key=os.getenv("AliQwen_API"),  # Read the key from an environment variable
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
    model_kwargs={"reasoning_effort": "none"},
)

This code initializes the model and exposes it through a unified OpenAI-compatible interface.

Environment variable configuration must be completed first

export AliQwen_API="your API key"

This step injects the API key securely at runtime and avoids hardcoding secrets.
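Optionally, you can also fail fast at startup if the variable is missing, instead of letting the first model call fail later with a less obvious error. A minimal sketch (the error message is only illustrative):

import os

if not os.getenv("AliQwen_API"):
    raise RuntimeError("AliQwen_API is not set; export it before starting the agent")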

Tool definitions determine which actions enter the approval surface

The tools themselves can stay simple, but the risk classification must be explicit. The example defines three tools: query table data, delete a table, and delete a file. The last two enter the approval flow.

from langchain.tools import tool

@tool
def delete_table(table_name: str) -> str:
    """Delete the specified table"""
    return f"Delete table {table_name}"

@tool
def delete_file(file_name: str) -> str:
    """Delete the specified file"""
    return f"Delete file {file_name}"

@tool
def query_table_data(table_name: str) -> str:
    """Query data from the specified table"""
    return f"Query data from table {table_name}"

This code defines the smallest runnable toolset needed to demonstrate execution strategies across different risk levels.

A checkpointer is the prerequisite for pausing and resuming execution

Human intervention is not just a confirmation dialog. The agent must actually pause at a specific execution node and continue from the same position after approval. This capability depends on LangGraph checkpointing.

from langgraph.checkpoint.memory import InMemorySaver

checkpointer = InMemorySaver()
config = {
    "configurable": {
        "thread_id": "123"  # Used to identify the same conversation
    }
}

This code creates an in-memory checkpoint and binds it to a resumable execution session through thread_id.

Without checkpoints, the interrupted execution state cannot be restored

When a high-risk tool triggers an interrupt, LangGraph saves the context, tool invocation plan, and current execution position. After approval returns, the framework uses the same thread_id to recover the saved state and continue execution.

You should not rely on InMemorySaver in production for long-running use cases, because state is lost after a process restart. Redis, a database, or another persistent store is a safer choice.
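If you want the same pause-and-resume behavior with a persistent store, the checkpointer can be swapped without touching the rest of the code. A minimal sketch, assuming the optional langgraph-checkpoint-sqlite package is installed (Redis or Postgres checkpointers follow the same pattern):

import sqlite3
from langgraph.checkpoint.sqlite import SqliteSaver  # provided by langgraph-checkpoint-sqlite

# A file-backed checkpointer survives process restarts, unlike InMemorySaver
conn = sqlite3.connect("checkpoints.db", check_same_thread=False)
checkpointer = SqliteSaver(conn)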

DeepAgents interrupt_on is the core switch for human approval

When you create the agent, the real control point for “which tools should be intercepted” is not the prompt. It is interrupt_on. This is a framework-level control and is more reliable than prompt-only constraints.

from deepagents import create_deep_agent

main_agent = create_deep_agent(
    model=llm,
    name="Main Agent",
    system_prompt="Reply in Chinese and call the corresponding tools to complete each task.",
    tools=[delete_table, delete_file, query_table_data],
    interrupt_on={
        "delete_table": True,  # High-risk table deletion requires approval
        "delete_file": True,   # High-risk file deletion requires approval
    },
    checkpointer=checkpointer,
)

This code declares the interception rules for high-risk tools and connects state persistence to the agent.

The first execution interrupts at a high-risk tool and returns pending approval data

On the first invocation, the agent executes low-risk actions first and then pauses when it reaches a high-risk tool. The pending approval information appears in the __interrupt__ field.

result_1 = main_agent.invoke(
    {
        "messages": [
            {"role": "user", "content": user_task}
        ]
    },
    config=config,
)

interrupt = result_1["__interrupt__"]  # Read the interrupt information

This code triggers a standard approval flow: execute what is safe first, then interrupt.

__interrupt__ contains the requested action and the allowed decision types

action_requests describes the tool calls waiting to be executed and their parameters. review_configs describes the available review outcomes, usually including approve, edit, and reject.

That means a frontend or approval system can render a full approval card from this data, rather than showing only a vague “do you want to continue?” message.
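As an illustration, the sketch below prints the fields an approval UI would typically surface. It assumes the key names used in this walkthrough (name, args); the argument key may differ across DeepAgents versions, so it is read defensively:

payload = interrupt[0].value

for request in payload["action_requests"]:
    # Each pending call carries the tool name and the arguments the model proposed
    print("Tool:", request["name"])
    print("Arguments:", request.get("args"))

# review_configs lists the decision types the reviewer may return (approve / edit / reject)
print("Review options:", payload.get("review_configs"))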

During approval, you can reject actions or edit parameters before execution

If the action is too risky, you can reject it directly. If the model produced something close to correct but still needs adjustment, you can use edit to modify the arguments before allowing execution. This is far more practical than a binary approve-or-reject model.

decisions = []
action_requests = interrupt[0].value["action_requests"]

for action_request in action_requests:
    if action_request["name"] == "delete_table":
        decisions.append({"type": "reject"})  # Reject table deletion
    elif action_request["name"] == "delete_file":
        decisions.append({"type": "reject"})  # Reject file deletion

This code constructs a list of human decisions in the same order as the pending approval actions. Alternatively, instead of rejecting delete_file outright, you can return an edit decision that corrects its arguments before execution:

decisions.append({
    "type": "edit",
    "edited_action": {
        "name": "delete_file",
        "args": {"file_name": "new-file.txt"}  # Manually override the target argument
    }
})

This code shows how to override the model-generated parameters during review.

The second execution resumes the original flow through Command(resume=...)

After approval is complete, you do not need to send the user message again. Instead, pass the decisions back to the agent through Command(resume=...) and continue using the same thread_id.

from langgraph.types import Command

result_2 = main_agent.invoke(
    Command(resume={"decisions": decisions}),
    config=config,  # You must keep the same thread_id
)

print(result_2["messages"][-1].content)  # Output the final resumed result

This code resumes execution from the interrupt point and determines whether tools should actually run based on the approval results.
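Depending on how the agent batches its tool calls, a resumed run can itself pause again on a later high-risk call. A hedged sketch of a defensive resume loop, reusing the names from this walkthrough and a placeholder reject-everything policy:

result = main_agent.invoke(Command(resume={"decisions": decisions}), config=config)

# If the resumed run hits another high-risk tool, it pauses again with a fresh __interrupt__
while "__interrupt__" in result:
    pending = result["__interrupt__"][0].value["action_requests"]
    new_decisions = [{"type": "reject"} for _ in pending]  # placeholder policy: reject everything
    result = main_agent.invoke(Command(resume={"decisions": new_decisions}), config=config)

print(result["messages"][-1].content)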

The complete minimal example can serve directly as a PoC skeleton

import os
from deepagents import create_deep_agent
from langchain.chat_models import init_chat_model
from langchain.tools import tool
from langgraph.checkpoint.memory import InMemorySaver
from langgraph.types import Command

llm = init_chat_model(
    model="kimi-k2.5",
    model_provider="openai",
    api_key=os.getenv("AliQwen_API"),
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
    model_kwargs={"reasoning_effort": "none"},
)

@tool
def delete_table(table_name: str) -> str:
    """Delete the specified table"""
    return f"Delete table {table_name}"

@tool
def delete_file(file_name: str) -> str:
    """Delete the specified file"""
    return f"Delete file {file_name}"

@tool
def query_table_data(table_name: str) -> str:
    """Query data from the specified table"""
    return f"Query data from table {table_name}"

checkpointer = InMemorySaver()
config = {"configurable": {"thread_id": "123"}}

main_agent = create_deep_agent(
    model=llm,
    name="Main Agent",
    system_prompt="Reply in Chinese and call the corresponding tools to complete each task.",
    tools=[delete_table, delete_file, query_table_data],
    interrupt_on={"delete_table": True, "delete_file": True},
    checkpointer=checkpointer,
)

result_1 = main_agent.invoke(
    {"messages": [{"role": "user", "content": "先查询product表的数据!再删除user表,最后,删除lucaju.txt文件"}]},
    config=config,
)

interrupt = result_1["__interrupt__"]
decisions = []
for action_request in interrupt[0].value["action_requests"]:
    if action_request["name"] in ["delete_table", "delete_file"]:
        decisions.append({"type": "reject"})  # Reject all high-risk actions

result_2 = main_agent.invoke(
    Command(resume={"decisions": decisions}),
    config=config,
)
print(result_2["messages"][-1].content)  # Print the final result after resuming

This code connects the full chain: model initialization, tool definition, interrupt-driven approval, and resumed execution.

In production, human intervention must be designed as a complete control chain

In real projects, you should persist the tool name, parameters, trigger reason, session context, approver identity, approval time, and the before-and-after values of edited parameters. This is necessary for auditing and accountability.

At the same time, approval permissions must integrate with your business identity system. Not everyone should be able to approve destructive database operations, file deletion, or finance-related actions. The approval system itself must enforce privilege isolation.
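As a starting point, and purely as an illustration (the field names, role list, and helper below are hypothetical, not part of DeepAgents or LangGraph), an audit record can be captured at the moment each decision is made, together with a simple approver check:

from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional

@dataclass
class ApprovalRecord:
    # Illustrative audit fields; adapt them to your own schema
    thread_id: str
    tool_name: str
    requested_args: dict
    decision: str                     # "approve", "edit", or "reject"
    edited_args: Optional[dict]
    approver_id: str
    decided_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

ALLOWED_APPROVERS = {"dba-lead", "security-oncall"}  # hypothetical role list

def can_approve(approver_id: str) -> bool:
    # Replace with a real lookup against your identity / permission system
    return approver_id in ALLOWED_APPROVERS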

Compared with a Java approach, this Python stack is better for agent orchestration experiments and rapid prototyping

Spring AI Alibaba is often a better fit for deep integration into enterprise Java ecosystems. By contrast, LangChain + LangGraph + DeepAgents is lighter and faster for multi-step task orchestration, state recovery, and PoC iteration, especially for Python-focused teams.

FAQ

1. Why can’t you rely only on prompts to constrain high-risk tools?

Prompts are soft constraints, and models can drift. interrupt_on is a hard interception mechanism at the framework layer. It can pause execution before the tool is actually called, which provides significantly stronger safety guarantees.

2. Why must you reuse the same thread_id when resuming execution?

Because thread_id is the state index for a session. Without the same identifier, LangGraph cannot locate the checkpoint saved before the interrupt, so it cannot continue from the original position.
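In practice, a common pattern is to mint one thread_id per approval session so that separate tasks never share state, for example:

import uuid

config = {"configurable": {"thread_id": str(uuid.uuid4())}}  # one resumable session per task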

3. Why is InMemorySaver not recommended in production?

InMemorySaver is suitable only for demos. During service restarts, autoscaling events, or multi-instance deployments, in-memory state cannot be shared or recovered, so in-flight approval sessions can simply disappear.

Core summary

This article reconstructs a practical Human-in-the-Loop pattern built on LangChain, LangGraph, and DeepAgents. It focuses on high-risk tool interruption and approval, checkpoint-based resume behavior, the approve / edit / reject decision flow, and production-grade implementation guidance for Python projects.