EmbedClaw K10 on ESP32-S3: Lightweight AI Agent Architecture and Deployment Guide

EmbedClaw K10 is a lightweight AI agent runtime designed for the ESP32-S3. It enables message ingestion, model inference, tool invocation, and a long-term memory loop directly on a microcontroller. It addresses the core challenge of running a stable agent system on embedded hardware. Keywords: ESP32-S3, AI Agent, Feishu, DeepSeek.

The technical specification snapshot outlines the platform at a glance

Core Platform: Xingkong Board K10 / ESP32-S3
Primary Language: C / C++
RTOS: FreeRTOS
Communication Protocols: Wi-Fi, WebSocket, HTTP, Feishu long connection
Model Interface: OpenAI-compatible API, primarily optimized for DeepSeek
Search Tools: Tavily / Bocha Search
Storage Medium: SD card (FAT32)
Key Dependencies: ESP-IDF, FreeRTOS, SD SPI, JSON Schema
Project Repository: https://gitee.com/genvex/k10-claw

EmbedClaw K10 gives an MCU a complete agent loop

EmbedClaw K10 does not simply embed a chat interface into a development board. Instead, it implements a lightweight runtime on the ESP32-S3 that can connect to message channels, invoke tools, persist memory, and execute ReAct-style decision-making.

It follows OpenClaw’s four-layer decoupling model of Channel → Agent → Inference → Tools, while compressing the design and isolating tasks to fit embedded resource constraints. Its core capabilities include Feishu integration, DeepSeek inference, Tavily web search, SD card memory, and automatic Wi-Fi provisioning.

It fits edge scenarios that require low cost and long uptime

It is well suited for edge intelligence nodes that need low cost, low power consumption, network connectivity, and long-term operation, such as environmental monitoring terminals, remote control panels, personal assistant devices, and educational AIoT development boards.

git clone https://gitee.com/genvex/k10-claw.git
cd k10-claw
idf.py set-target esp32s3  # Lock the chip target to avoid mismatched build artifacts
idf.py build               # Build the firmware
idf.py -p COM3 flash monitor  # Flash the device and view serial logs (replace COM3 with your serial port)

These commands complete repository cloning, target configuration, firmware compilation, flashing, and runtime monitoring.

The four-layer architecture breaks complex agent behavior into controllable components

The Channel layer handles reliable external message ingestion

The message channel supports WebSocket, Feishu, WeChat, and similar integration patterns, with Feishu being the primary optimization target. Its long-connection mode does not require a public IP address, which makes it ideal for ESP32-S3 devices deployed on home or office Wi-Fi networks.

FreeRTOS separates message listening, network requests, and UI refresh into different tasks, reducing the chance of disconnections caused by blocking operations. This allows Feishu messaging, search calls, and model inference to proceed in parallel.
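On the device, this decoupling is built on FreeRTOS tasks exchanging work through queues (xQueueSend/xQueueReceive), so the listener never blocks while inference or search is in flight. The following host-side sketch models the same producer/consumer pattern with a plain ring buffer; the names and sizes are illustrative, not part of the EmbedClaw K10 API.

```c
#include <stdbool.h>
#include <string.h>

// Host-side sketch of the queue decoupling used between FreeRTOS tasks.
// On the device this would be xQueueCreate/xQueueSend/xQueueReceive.
#define QUEUE_DEPTH 8
#define MSG_LEN 64

typedef struct {
    char items[QUEUE_DEPTH][MSG_LEN];
    int head, tail, count;
} msg_queue_t;

static bool queue_send(msg_queue_t *q, const char *msg) {
    if (q->count == QUEUE_DEPTH) return false;  // Queue full: drop rather than block the listener
    strncpy(q->items[q->tail], msg, MSG_LEN - 1);
    q->items[q->tail][MSG_LEN - 1] = '\0';
    q->tail = (q->tail + 1) % QUEUE_DEPTH;
    q->count++;
    return true;
}

static bool queue_receive(msg_queue_t *q, char *out) {
    if (q->count == 0) return false;  // Nothing pending: the worker task can yield
    strcpy(out, q->items[q->head]);
    q->head = (q->head + 1) % QUEUE_DEPTH;
    q->count--;
    return true;
}
```

The Feishu listener only ever enqueues; inference and search run on the consuming side, so a slow HTTPS request cannot stall the long connection.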

The Agent layer executes the ReAct decision loop

The agent core follows a closed loop of Thought, Action, Observation, and Decision. The device first interprets user intent, then decides whether to call search, read or write memory, or perform hardware control, and finally returns the result through Feishu.

To adapt to MCU constraints, the system limits the maximum number of iterations, compresses intermediate results, and releases resources immediately after a task completes. This helps preserve the stability of long connections and subsequent requests.

{
  "reasoning_mode": "ReAct",
  "max_iterations": 10,
  "memory_backend": "sdcard",
  "channels": ["feishu", "websocket"],
  "tools": ["search", "sensor", "file", "cron"]
}

This configuration shows the minimum capability set of an embedded agent.
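The max_iterations cap above can be sketched as a bounded loop. The step function here is a stub standing in for one Thought/Action/Observation cycle; the function names are illustrative, not the project's actual API.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_ITERATIONS 10  // Mirrors "max_iterations" in the configuration above

// One Thought/Action/Observation step. A real step would call the model and
// tools; these stubs only simulate completion behavior.
typedef bool (*agent_step_fn)(int iteration, void *ctx);

static bool demo_step(int iteration, void *ctx) {
    (void)ctx;
    return iteration == 2;  // Pretend the task completes on the third step
}

static bool never_done(int iteration, void *ctx) {
    (void)iteration; (void)ctx;
    return false;  // Simulates a model that never reaches a final answer
}

// Run the ReAct loop with a hard iteration cap so a looping model cannot
// exhaust the MCU; returns the number of steps taken, or -1 if the cap was
// hit without a final answer.
static int react_run(agent_step_fn step, void *ctx) {
    for (int i = 0; i < MAX_ITERATIONS; ++i) {
        if (step(i, ctx)) {
            return i + 1;  // Step reported a final answer
        }
        // In the real loop, intermediate observations would be compressed
        // here before being appended to the context.
    }
    return -1;  // Cap reached: fail closed instead of spinning forever
}
```

Failing closed at the cap is what preserves the long connection: the task releases its resources and reports an error instead of holding memory indefinitely.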

The LLM and Tools layers together define the system ceiling

The LLM layer supports OpenAI-style interfaces, with deepseek-chat recommended as the primary model. Its cost and response speed make it a good fit for a collaborative pattern of cloud-side inference and device-side execution.

The Tools layer standardizes search, file access, scheduled tasks, temperature and humidity readings, LED control, and Wi-Fi status queries into a unified tool interface. As a result, the model only needs to output structured parameters to drive real hardware behavior.
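A unified tool interface means dispatch reduces to a name lookup over a table of identical signatures. The sketch below illustrates that idea with hypothetical tool names and a plain int error code; it is not the project's real registration mechanism.

```c
#include <stddef.h>
#include <string.h>
#include <stdio.h>

// Illustrative unified tool table: every capability exposes the same
// signature, so the agent only needs the tool name the model emitted.
typedef int (*tool_fn)(const char *input_json, char *output, size_t output_size);

typedef struct {
    const char *name;
    tool_fn execute;
} tool_entry_t;

static int tool_led(const char *input_json, char *output, size_t output_size) {
    (void)input_json;  // A real tool would parse {"color": ...} from input_json
    snprintf(output, output_size, "{\"status\":\"ok\",\"tool\":\"led\"}");
    return 0;
}

static int tool_sensor(const char *input_json, char *output, size_t output_size) {
    (void)input_json;
    snprintf(output, output_size, "{\"status\":\"ok\",\"temp_c\":23.5}");
    return 0;
}

static const tool_entry_t s_tools[] = {
    { "led", tool_led },
    { "sensor", tool_sensor },
};

// Dispatch by name; returns -1 for unknown tools so the agent can report the
// error back to the model instead of crashing.
static int tool_dispatch(const char *name, const char *input_json,
                         char *output, size_t output_size) {
    for (size_t i = 0; i < sizeof(s_tools) / sizeof(s_tools[0]); ++i) {
        if (strcmp(s_tools[i].name, name) == 0) {
            return s_tools[i].execute(input_json, output, output_size);
        }
    }
    return -1;
}
```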

The SD card storage design significantly reduces coupling between firmware and configuration

The project stores configuration, sessions, long-term memory, skills, and scheduled snapshots on the SD card. Developers do not need to repeatedly modify and reflash firmware. Instead, they can switch the device’s persona and capability set by replacing files on the card.

Typical directories include config/config.json, SOUL.md, USER.md, session/, memory/, and skills/. This gives the device hot-update capability while reducing the risk of hardcoded sensitive keys.

/sdcard/embedclaw/
├── config/
│   ├── config.json
│   ├── SOUL.md
│   └── USER.md
├── session/
├── memory/
├── skills/
└── cron.json

The key value of this directory layout is simple: swap the card to swap the configuration, and upgrade without losing memory.
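On boot, the firmware reads config.json from the mounted card and extracts the fields it needs. The sketch below shows a minimal string-valued key lookup without a full JSON parser; it is purely illustrative, assumes a "key":"value" layout with no escaped quotes, and says nothing about which JSON library the project actually uses.

```c
#include <stddef.h>
#include <string.h>
#include <stdio.h>

// Minimal illustrative lookup of a string value inside config.json text.
// A real firmware should use a proper JSON parser.
static int config_get_string(const char *json, const char *key,
                             char *out, size_t out_size) {
    char pattern[64];
    snprintf(pattern, sizeof(pattern), "\"%s\"", key);
    const char *p = strstr(json, pattern);
    if (!p) return -1;
    p = strchr(p + strlen(pattern), ':');
    if (!p) return -1;
    p = strchr(p, '"');                    // Opening quote of the value
    if (!p) return -1;
    const char *end = strchr(p + 1, '"');  // Closing quote of the value
    if (!end || (size_t)(end - p - 1) >= out_size) return -1;
    memcpy(out, p + 1, (size_t)(end - p - 1));
    out[end - p - 1] = '\0';
    return 0;
}
```

Because the keys live on the card rather than in firmware, swapping the card really does swap the model, persona, and credentials in one step.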

The deployment workflow is already streamlined around Feishu, DeepSeek, Tavily, and Wi-Fi

The first step is to prepare an SD card formatted as FAT32 and create the /embedclaw/config/ directory. Then write a config.json file containing model, search, Feishu, and Wi-Fi parameters.

The following core configuration template is recommended

{
  "llm": {
    "api_key": "",
    "model": "deepseek-chat",
    "api_url": "https://api.deepseek.com/v1/chat/completions"
  },
  "search": {
    "api_key": ""
  },
  "feishu": {
    "app_id": "",
    "app_secret": ""
  },
  "wifi": {
    "ssid": "",
    "password": ""
  }
}

This configuration defines the four most important external dependencies of the agent.
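At runtime, the llm section maps onto an OpenAI-style chat request body posted to the configured api_url. The sketch below assembles a minimal payload following the chat-completions convention; it is an assumption about the request shape, and field escaping is omitted, so it presumes the prompt contains no quote characters.

```c
#include <stddef.h>
#include <stdio.h>

// Illustrative sketch: build a minimal OpenAI-compatible chat request body.
// Escaping is omitted; the prompt must not contain quote characters.
static int build_chat_request(const char *model, const char *user_prompt,
                              char *out, size_t out_size) {
    int n = snprintf(out, out_size,
                     "{\"model\":\"%s\","
                     "\"messages\":[{\"role\":\"user\",\"content\":\"%s\"}]}",
                     model, user_prompt);
    return (n > 0 && (size_t)n < out_size) ? 0 : -1;  // -1 on truncation
}
```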

The second step is to create an internal enterprise app in the Feishu Open Platform, enable message send/receive permissions, and subscribe to im.message.receive_v1 in event subscriptions using long-connection mode.

The third step is to fill in the DeepSeek and Tavily API keys. The fourth step is to compile and flash the firmware. The fifth step is to complete Wi-Fi provisioning through the configuration file or the hotspot-based setup page. If the screen shows WiFi Connected and Feishu Connected after boot, the full connection path is working.

The hardware resource profile explains why the agent can run on a small board

The ESP32-S3 dual-core 240 MHz processor provides the scheduling foundation. The 8 MB PSRAM and 16 MB Flash buffer inference results, message context, and network data. The SD card provides external storage so that long-term memory does not consume limited on-chip space.

The 240×320 LCD displays connection status and result summaries, while the AHT20 and WS2812 provide the most direct sensing-and-action loop. This is not just a chat terminal. It is an edge agent with basic environmental interaction capabilities.

[Image: EmbedClaw K10 runtime demo. The device in operation shows that the system already integrates Feishu message ingress, Bocha search, DeepSeek inference, and an embedded terminal into a deployable interaction pipeline, emphasizing the closed-loop collaboration of edge hardware, cloud models, and external tools.]

Real-world scenarios show that this is more than an experimental demo

In a remote environment manager scenario, Feishu serves as the entry point, the temperature and humidity sensor collects data, DeepSeek makes decisions, and Tavily can provide external environmental context when needed. Feedback is then delivered through the LED or message notifications.
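The sensing-to-action part of this loop can be made concrete with a local rule that maps a reading to an alert level, for example as a fallback when the cloud model is unreachable. The thresholds and names below are invented for illustration and are not part of the project.

```c
// Illustrative fallback rule for the environment-manager scenario: map a
// temperature/humidity reading to an LED alert level. Thresholds are made-up.
typedef enum { ALERT_NONE = 0, ALERT_WARN = 1, ALERT_CRITICAL = 2 } alert_t;

static alert_t classify_environment(float temp_c, float humidity_pct) {
    if (temp_c > 35.0f || humidity_pct > 85.0f) return ALERT_CRITICAL;
    if (temp_c > 30.0f || humidity_pct > 70.0f) return ALERT_WARN;
    return ALERT_NONE;
}
```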

In a personal assistant scenario, the device can combine web search, scheduled reminders, and long-term memory to provide cross-session service. In educational or warehouse settings, it can also act as a low-cost IoT automation node for continuous deployment.

Extending the system with a custom tool is also straightforward

#include "ec_config_internal.h"
#include "core/ec_tools.h"

static esp_err_t ec_tool_feishu_push_execute(const char *input_json, char *output, size_t output_size) {
    (void)input_json;  // A full implementation would parse the "message" field here
    // Sketch: a real implementation would send the Feishu message at this point,
    // then report the delivery result and Wi-Fi status back to the model
    snprintf(output, output_size,
             "{\"result\":\"Feishu message sent successfully\",\"wifi_status\":\"connected\",\"status\":\"success\"}");
    return ESP_OK;
}

static const ec_tools_t s_feishu_push = {
    .name = "feishu_push_tool",
    .description = "Extend the Feishu message push format with Wi-Fi status feedback",
    .input_schema_json = "{\"type\":\"object\",\"properties\":{\"message\":{\"type\":\"string\"}},\"required\":[\"message\"]}",
    .execute = ec_tool_feishu_push_execute,
};

This code shows how a new tool is declared for the embedded agent: the JSON Schema describes the parameters the model must produce, and the execute callback turns them into device behavior.
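Because the execute callback has a plain C signature, it can also be exercised off-device. The harness below stubs the ESP-IDF esp_err_t type for a host build so the callback's output can be checked with ordinary assertions; the stub typedef is a host-only assumption, on the device the real type comes from esp_err.h.

```c
#include <stddef.h>
#include <stdio.h>

// Host-side harness sketch: stub the ESP-IDF error type so the tool's
// execute callback can be unit-tested off-device.
typedef int esp_err_t;
#define ESP_OK 0

static esp_err_t ec_tool_feishu_push_execute(const char *input_json,
                                             char *output, size_t output_size) {
    (void)input_json;
    snprintf(output, output_size,
             "{\"result\":\"Feishu message sent successfully\","
             "\"wifi_status\":\"connected\",\"status\":\"success\"}");
    return ESP_OK;
}
```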

The current optimization focus has shifted from simply running to running reliably over time

The source material highlights several key optimizations planned for the 2026 version: dual-core task separation, isolation between networking and UI, reserved memory for SSL and DMA, improved SD card compatibility through SPI mode, and a visual Wi-Fi provisioning page that lowers the deployment barrier.

The essence of these optimizations is to move from an AI agent that can be demonstrated in a lab to an edge service device that can stay online sustainably. This is exactly where EmbedClaw K10 delivers more value than a simple API-chaining example.

[Image: Hardware photo of the development board and peripheral layout, showing the display module, main control board, and peripheral interfaces; this confirms the system is not a purely software framework but a complete deployment carrier for real embedded hardware.]

[Image: Device appearance and assembly form factor, highlighting LCD-based visual feedback, compact board integration, and edge-terminal deployment characteristics suitable for an educational board, control node, or low-cost smart terminal.]

FAQ provides structured answers to common deployment questions

1. Why is EmbedClaw K10 a good fit for the ESP32-S3 instead of a weaker MCU?

The ESP32-S3 provides dual cores, PSRAM, Wi-Fi, and a relatively complete peripheral set. It can simultaneously handle long-lived message connections, HTTPS requests, SD card reads and writes, and sensor control. A weaker MCU would struggle to support a full agent loop reliably.

2. Why does Feishu integration not require a public IP address?

Because the project uses Feishu’s long-connection mode, the device actively establishes a connection to the platform and receives events through that session. It does not rely on an externally exposed webhook. This makes it ideal for development boards deployed behind NAT on private networks.

3. How do DeepSeek, Tavily, and local tools work together?

DeepSeek interprets requests and plans actions, Tavily provides real-time external information, and local tools execute hardware and file operations. The agent orchestrates the three parts into a closed loop of understanding, retrieval, execution, and feedback.

Summary

EmbedClaw K10 shows how a lightweight AI agent can run on the ESP32-S3: a four-layer architecture, SD card configuration, Feishu long connections, DeepSeek integration, Tavily search, Wi-Fi provisioning, and a unified tool interface open to custom development.