ReAct is one of the most established execution frameworks for AI agents. It uses a Thought, Action, and Observation loop to combine reasoning, tool use, and feedback-driven iteration, helping large language models avoid skipped steps, distorted execution, and excessive retries. Keywords: ReAct, Agent, CoT.
Technical specifications provide a quick snapshot
| Parameter | Details |
|---|---|
| Framework Name | ReAct (Reasoning and Acting) |
| Applicable Domains | AI Agents, tool calling, task planning |
| Core Flow | Thought → Action → Observation |
| Primary Languages | Language-agnostic, commonly used in Python and TypeScript ecosystems |
| Interaction Protocols | Function calling, HTTP APIs, JSON instructions |
| Core Dependencies | LLM, Tool Registry, prompt templates, executor |
ReAct is the most important execution paradigm for production AI agents
The value of ReAct is not that it can call tools, but that it can call tools in an orderly way. It breaks complex tasks into a series of small decisions so the model can reason first, act next, and then revise its path based on returned results.
This makes it better suited than one-shot question answering for real-world tasks such as ticket lookup, retrieval, order placement, and calls to internal enterprise APIs. External environments are dynamic, so the model must update its next action based on feedback.
The minimum ReAct loop consists of three stages
Thought handles reasoning about the current state, missing information, and the next-step strategy. Action outputs a structured instruction. Observation receives the result returned by a tool and feeds it back into the next round of reasoning.
class ReActAgent:
    def run(self, task, tools, max_steps=10):
        state = {"task": task, "history": []}
        for _ in range(max_steps):         # Bound the loop to avoid runaway retries
            thought = llm_reason(state)    # Generate reasoning from the current state first
            action = llm_act(thought)      # Then convert the reasoning into a structured tool call
            result = tools.execute(action) # Execute the external tool and get the returned result
            state["history"].append({
                "thought": thought,
                "action": action,
                "observation": result
            })
            if done(result):               # Stop the loop when the termination condition is met
                return result
        raise RuntimeError("max_steps reached without completing the task")
This code shows the core of ReAct: the LLM iterates on feedback at every step instead of betting on a final answer in a single shot.
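The loop above leaves `tools.execute` abstract. A minimal registry behind it might look like the following sketch; the names `ToolRegistry`, `register`, and the result/error envelope are assumptions for illustration, not part of any standard:

```python
class ToolRegistry:
    """Minimal tool registry: maps tool names to callables (illustrative sketch)."""

    def __init__(self):
        self._tools = {}

    def register(self, name, fn):
        self._tools[name] = fn

    def execute(self, action):
        # action is the structured instruction from the Action step,
        # e.g. {"tool": "search_flights", "args": {...}}
        name = action.get("tool")
        if name not in self._tools:
            return {"error": f"unknown tool: {name}"}  # surfaced as an Observation
        try:
            return {"result": self._tools[name](**action.get("args", {}))}
        except Exception as exc:
            # Tool failures become Observations too, so the agent can self-correct
            return {"error": str(exc)}

tools = ToolRegistry()
tools.register("add", lambda a, b: a + b)
print(tools.execute({"tool": "add", "args": {"a": 1, "b": 2}}))  # {'result': 3}
print(tools.execute({"tool": "missing", "args": {}}))            # {'error': 'unknown tool: missing'}
```

Wrapping both success and failure in a plain dict means every outcome, including exceptions, flows back into the loop as an Observation rather than crashing it.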
Thought is the primary control lever for reducing errors
Thought is not decorative logging. It is the decision layer. It forces the model to explicitly check goals, parameters, and constraints so it does not appear to be doing work while actually doing the wrong thing. In agent systems, missing Thought is often more dangerous than limited model capability.
It also improves observability. Developers can see why the model selected a tool and why it passed a specific set of parameters, which makes it easier to debug API failures, task drift, and prompt defects.
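One way to get that observability is to record each loop iteration as a structured trace entry. A sketch, assuming a simple JSON-lines log (the field names here are illustrative, not a standard):

```python
import json

def log_step(history, thought, action, observation):
    """Append one Thought -> Action -> Observation round as a structured trace record."""
    entry = {
        "step": len(history) + 1,
        "thought": thought,
        "action": action,
        "observation": observation,
    }
    history.append(entry)
    # One JSON line per step is easy to grep when debugging API failures or task drift
    print(json.dumps(entry, ensure_ascii=False))
    return entry

history = []
log_step(
    history,
    "Need the date in ISO format before calling the flight API",
    {"tool": "normalize_date", "args": {"text": "tomorrow"}},
    {"result": "2026-04-30"},
)
```

Because the Thought is stored next to the Action it produced, a failed call can be traced back to the exact reasoning that chose the tool and its parameters.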
Thought is essentially an engineering extension of CoT for agent scenarios
CoT emphasizes showing the reasoning first and then giving the answer. ReAct extends that idea into reasoning first, acting next, and continuously consuming feedback. The former solves pure text reasoning, while the latter solves task execution that interacts with an environment.
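In practice, this reasoning-then-acting loop is usually elicited with a prompt template that interleaves the three labels. A minimal sketch, where the wording and tool description are assumptions for illustration:

```python
# Illustrative ReAct-style prompt template; exact wording varies by implementation
REACT_TEMPLATE = """You are an agent that solves tasks step by step.
Available tools: {tool_descriptions}

Use this format:
Thought: reason about the current state and what is still missing
Action: a JSON tool call, e.g. {{"tool": "...", "args": {{...}}}}
Observation: the tool result (filled in by the runtime, not by you)

Repeat Thought/Action/Observation until done, then output:
Final Answer: <answer>

Task: {task}
{history}"""

prompt = REACT_TEMPLATE.format(
    tool_descriptions="search_flights(from, to, date, cabin)",
    task="Find the cheapest economy flight from Beijing to Shanghai tomorrow",
    history="",
)
print(prompt)
```

The key difference from a CoT prompt is the Observation slot: the runtime, not the model, fills it in, which is what turns showing-your-work into an interactive loop.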
{
"thought": "The user wants the cheapest economy flight from Beijing to Shanghai tomorrow. First convert the relative date into a standard date, then call the flight search tool and sort by price.",
"action": {
"tool": "search_flights",
"args": {
"from": "Beijing",
"to": "Shanghai",
"date": "2026-04-30",
"cabin": "economy"
}
}
}
This JSON example shows how Thought maps to Action: clarify constraints first, then generate executable parameters.
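Before an Action like the one above reaches a tool, it is worth validating its shape and required arguments. A minimal check, where the per-tool schema below is an assumption made up for this flight example:

```python
# Assumed schema for the example tool; real systems derive this from tool definitions
REQUIRED_ARGS = {"search_flights": {"from", "to", "date", "cabin"}}

def validate_action(action):
    """Reject malformed Actions before they hit an external API."""
    tool = action.get("tool")
    if tool not in REQUIRED_ARGS:
        return False, f"unknown tool: {tool}"
    missing = REQUIRED_ARGS[tool] - set(action.get("args", {}))
    if missing:
        return False, f"missing args: {sorted(missing)}"
    return True, "ok"

ok, msg = validate_action({
    "tool": "search_flights",
    "args": {"from": "Beijing", "to": "Shanghai",
             "date": "2026-04-30", "cabin": "economy"},
})
print(ok, msg)  # True ok
```

A rejected Action can be fed back to the model as an Observation, giving it a chance to repair the call instead of failing at the API boundary.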
Skipping Thought causes agents to lose control quickly
Many failure cases do not come from poor tools. They happen because the model outputs an Action directly. As a result, natural-language parameters enter the interface without normalization. For example, passing “tomorrow” directly to an API can immediately trigger a formatting error.
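Normalizing relative dates in the Thought step avoids exactly this failure. A standard-library sketch that handles only a few English phrases, purely for illustration:

```python
from datetime import date, timedelta

def normalize_date(text, today=None):
    """Map a few relative date phrases to ISO dates; raise on anything unrecognized."""
    today = today or date.today()
    offsets = {"today": 0, "tomorrow": 1, "day after tomorrow": 2}
    key = text.strip().lower()
    if key not in offsets:
        # Refusing loudly is safer than passing raw natural language to an API
        raise ValueError(f"cannot normalize date expression: {text!r}")
    return (today + timedelta(days=offsets[key])).isoformat()

print(normalize_date("tomorrow", today=date(2026, 4, 29)))  # 2026-04-30
```

With this step in place, the API only ever sees ISO dates; "tomorrow" either becomes a concrete date or triggers an explicit, debuggable error.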
A more serious issue is logical step-skipping. A user asks for the cheapest option, but the model may simply pick any flight from the list. A user asks to compare options before executing, but the model may go straight to submission. These are usually not syntax errors. They are goal-alignment failures.
A typical issue is correct parameters but the wrong objective
def choose_flight(user_query, flights):
    # Apply the cabin constraint first, before any price comparison
    economy = [f for f in flights if "economy" in f["cabin"].lower()]
    if not economy:
        raise ValueError("No economy-class flight matches the constraints")
    # Only then select the lowest fare among the qualifying options
    return min(economy, key=lambda x: x["price"])
This logic shows that correct execution depends not only on API availability, but also on whether the model understood the filtering criteria first.
Observation gives agents the ability to self-correct
Observation is the real response from the external world to an Action. It can be a successful result, an error, empty data, a permission denial, or a timeout. Without Observation, the agent can only operate inside an imagined world.
Its real value lies in state updates. After reading the Observation, the model can determine whether parameters are missing, whether it should retry, whether it should switch tools, or whether the termination condition has already been met. That is how the closed loop forms.
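That decision after each Observation can be made explicit. A sketch that classifies the result and picks the next move; the categories and retry policy are assumptions, not a fixed part of ReAct:

```python
def next_move(observation, attempts, max_retries=2):
    """Decide the follow-up to an Observation: retry, switch tools, re-plan, or finish."""
    if observation.get("error") == "timeout" and attempts < max_retries:
        return "retry"              # transient failure: try the same tool again
    if observation.get("error"):
        return "switch_tool"        # persistent or hard failure: pick another tool
    if observation.get("result") in (None, [], {}):
        return "revise_parameters"  # empty data: the query was likely too narrow
    return "finish"                 # usable result: the termination condition is met

print(next_move({"error": "timeout"}, attempts=1))          # retry
print(next_move({"result": [{"price": 420}]}, attempts=1))  # finish
```

In a full agent, the chosen move would be appended to the state so the next Thought can reason over it, which is precisely how the closed loop forms.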
ReAct increases token usage, but total system cost usually goes down
Explicit Thought consumes more tokens. That is true. But in production environments, the truly expensive factors are often not single-step reasoning costs, but incorrect tool calls, repeated retries, and task rollbacks. ReAct trades explainable reasoning for a lower failure rate.
If a workflow without Thought frequently calls the wrong tool, passes incorrect parameters, or misses filtering constraints, then the tokens it saves are only superficial savings, not savings in full task cost. In most business systems, stability is more valuable than short-term token reduction.
ReAct should be evaluated by task completion cost, not single-turn output cost
In engineering practice, it is best to track three categories of metrics: per-step token usage, tool failure rate, and task completion rate. Only by looking at these three together can you determine whether Thought is truly worth keeping.
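All three metrics can be computed from the same trace history. A sketch over a list of per-task traces, where the record layout is an assumption chosen for this example:

```python
def evaluate(traces):
    """Aggregate per-step token usage, tool failure rate, and task completion rate."""
    steps = [s for t in traces for s in t["steps"]]
    failures = sum(1 for s in steps if "error" in s["observation"])
    return {
        "avg_tokens_per_step": sum(s["tokens"] for s in steps) / len(steps),
        "tool_failure_rate": failures / len(steps),
        "task_completion_rate": sum(t["completed"] for t in traces) / len(traces),
    }

traces = [
    {"completed": True, "steps": [
        {"tokens": 120, "observation": {"result": "ok"}},
        {"tokens": 80, "observation": {"error": "timeout"}},
    ]},
    {"completed": False, "steps": [
        {"tokens": 100, "observation": {"result": "ok"}},
    ]},
]
print(evaluate(traces))
# avg_tokens_per_step=100.0, tool_failure_rate~0.33, task_completion_rate=0.5
```

Read together, these numbers show the trade-off directly: extra Thought tokens raise the first metric, and keeping Thought is justified only if the other two improve.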
The difference between ReAct and CoT must be understood through action
CoT is a reasoning-enhancement technique that fits purely cognitive tasks such as math problems, logic problems, and text analysis. It outputs a reasoning process and an answer, without interacting with external systems.
ReAct is an execution framework. It fits tasks such as search, retrieval, order placement, operations workflows, and enterprise workflow orchestration. It embeds reasoning before action and continuously reads environmental feedback, which makes it a natural fit for agent engineering.
Frequently asked questions
1. What is the difference between ReAct and standard function calling?
Standard function calling only answers how to call a tool. ReAct answers when to call it, why to call it, and how to proceed after the call. The former operates at the interface layer, while the latter operates at the execution-framework layer.
2. In production, does Thought always need to be fully exposed to users?
Not necessarily. Many systems keep internal Thought for control and debugging, while exposing only a summary or final result to users in order to balance security, privacy, and product experience.
3. What scenarios are best suited for prioritizing ReAct?
Any task that requires multi-step decisions, depends on external tools, and has intermediate results that affect later steps is a strong fit for ReAct. Examples include retrieval-augmented workflows, data queries, automated approval flows, and intelligent customer service orchestration.
Core Summary: This article systematically reconstructs the execution logic of ReAct in AI agents, focusing on the three-stage closed loop of Thought, Action, and Observation. It explains ReAct’s relationship to CoT, why Thought cannot be skipped, and the real trade-off between token cost and system stability.