This zero-dependency self-improvement skill for Agents enables continual optimization across tasks through local Markdown-based persistent memory, captured user corrections, and post-task self-reflection. It addresses recurring mistakes, poor knowledge retention, and unauditable rule accumulation. Keywords: OpenClaw, layered memory, self-reflection.
The technical specification snapshot summarizes the skill at a glance
| Parameter | Description |
|---|---|
| Project Name | self-improving-with-reflection |
| Ecosystem | OpenClaw Skills |
| Version | 1.2.11 |
| Implementation Language | Markdown + file system conventions |
| Storage Protocol | Local file read/write |
| Dependencies | Zero dependencies |
| Memory Architecture | HOT / WARM / COLD |
| Key Files | memory.md, corrections.md, index.md |
| Stars | Original data not provided |
| Core Value | Enables an Agent to accumulate reusable execution experience from corrections and reflection |
The skill’s core goal is to give Agents auditable continual learning
self-improving-with-reflection is not a traditional vector memory store, and it is not a black-box reasoning plugin. Instead, it turns the loop of correction, abstraction, improvement, and archival into a local file system that is readable, editable, and exportable.
Compared with approaches that only record conversation history, this skill focuses more on accumulating execution quality. The Agent does not just remember facts. It also remembers which practices work better, which expressions users tend to correct, and which workflows deserve reuse.
The three-tier memory structure balances performance and maintainability
```text
./self-improving/
├── memory.md        # HOT: always loaded at startup, ≤100 lines
├── index.md         # Memory index and maintenance status
├── corrections.md   # Recent correction log
├── projects/        # WARM: project-level knowledge
├── domains/         # WARM: domain-level knowledge
└── archive/         # COLD: historical archive
```
This directory layout separates high-frequency rules, contextual knowledge, and historical traces into different storage locations, preventing a single file from growing without bound.
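As a sketch, bootstrapping this layout takes only a few file-system calls. The directory and file names come from the skill; the seed contents and the `init_memory` helper are illustrative assumptions:

```python
import os

# Hypothetical one-time initializer for the layout above.
# Seed contents are assumptions; only the names come from the skill.
LAYOUT = {
    "memory.md": "## Confirmed Preferences\n\n## Active Patterns\n\n## Recent (last 7 days)\n",
    "index.md": "# Memory Index\n",
    "corrections.md": "# Corrections Log\n",
}

def init_memory(root="./self-improving"):
    """Create the HOT/WARM/COLD directories and seed the core files."""
    for sub in ("projects", "domains", "archive"):
        os.makedirs(os.path.join(root, sub), exist_ok=True)
    for name, seed in LAYOUT.items():
        path = os.path.join(root, name)
        if not os.path.exists(path):  # never clobber existing memory
            with open(path, "w", encoding="utf-8") as f:
                f.write(seed)
```

Guarding with `os.path.exists` matters: an initializer that overwrites `memory.md` would destroy accumulated experience.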
The layered memory architecture separates short-term high-frequency rules from long-term knowledge
The HOT layer lives in memory.md. It is best suited for confirmed user preferences, stable working patterns, and correction rules that appeared frequently within the last seven days. Because the system must always load it, its size must remain tightly constrained.
The WARM layer supports on-demand retrieval and is divided into projects/ and domains/. The former stores project-specific override rules, while the latter stores domain-specific conventions. This allows the same Agent to switch behavior automatically when working across different repositories.
Namespace inheritance rules reduce the cost of rule conflicts
```text
global (memory.md)
        ↓
domain (domains/*.md)
        ↓
project (projects/*.md)
```
The core precedence is project > domain > global. If a conflict exists at the same level, the most recently modified rule wins. If the conflict still cannot be resolved, the Agent should ask the user for clarification. This strategy works especially well in multi-project and multi-team environments.
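As an illustrative sketch of this precedence, the three layers can be merged so that more specific layers override broader ones. The dict-based rule representation and the rule names below are assumptions, not part of the skill's file format:

```python
# Hypothetical sketch of the project > domain > global precedence rule.
def resolve_rules(global_rules, domain_rules, project_rules):
    """Merge rule layers; later (more specific) layers override earlier ones."""
    merged = {}
    for layer in (global_rules, domain_rules, project_rules):
        merged.update(layer)  # project wins over domain, domain over global
    return merged

# Usage: the domain layer overrides the conflicting "doc_format" rule.
rules = resolve_rules(
    {"doc_format": "prose", "tone": "neutral"},   # global (memory.md)
    {"doc_format": "tables"},                     # domain (domains/*.md)
    {"tone": "concise"},                          # project (projects/*.md)
)
# rules == {"doc_format": "tables", "tone": "concise"}
```

Same-level conflicts (most recently modified wins) and the fallback of asking the user are left out of this sketch for brevity.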
The correction detection mechanism turns natural language feedback into executable rules
The system listens for explicit correction phrases such as “No, that’s not right” or “I prefer X, not Y.” These signals are appended to corrections.md, where the system then evaluates whether they can be abstracted into stable patterns.
Once the same type of pattern appears three times, it graduates from tentative experience into a HOT rule. This prevents one-off overfitting while still promoting genuinely frequent preferences into primary memory.
```markdown
## 2026-04-26 - [23:10] Changed X → Y
Type: format
Context: API documentation output format correction
Confirmed: pending (2/3)
Source: user explicit correction
```
This log format converts vague feedback into structured records that support auditing, counting, and later promotion.
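A minimal sketch of the counting step, assuming corrections.md entries follow the heading format shown above (the regex, sample text, and function name are illustrative, not the skill's actual parser):

```python
import re
from collections import Counter

# Sample corrections.md content in the heading format discussed above.
SAMPLE = """\
## 2026-04-26 - [23:10] Changed X → Y
Type: format
## 2026-04-25 - [11:02] Changed X → Y
Type: format
## 2026-04-24 - [09:40] Changed A → B
Type: wording
## 2026-04-23 - [18:15] Changed X → Y
Type: format
"""

def patterns_ready_to_graduate(text, threshold=3):
    """Count corrections by their 'Changed ...' pattern; return those at or above the threshold."""
    counts = Counter(re.findall(r"##.*?(Changed .+)", text))
    return [pattern for pattern, n in counts.items() if n >= threshold]

print(patterns_ready_to_graduate(SAMPLE))  # ['Changed X → Y']
```

The three-occurrence threshold mirrors the graduation rule: "Changed X → Y" appears three times and is ready for HOT promotion, while the one-off "Changed A → B" stays tentative.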
The self-reflection mechanism lets Agents improve without relying only on user corrections
This skill does not just accept external corrections. It also requires proactive review after multi-step tasks, bug fixes, and negative feedback. Reflection entries usually contain three parts: context, issue, and improvement plan.
That means the Agent learns more than a single answer. It learns why the result was poor and how to avoid the same failure next time. This is the key step that upgrades memory from a passive record system into an execution improvement system.
The reflection template stays minimal and supports later compression
```text
CONTEXT: Building Flutter UI
REFLECTION: Spacing looked off, had to redo
LESSON: Check visual spacing before showing user
```
The value of this template lies in its low-friction capture model while preserving enough information density to remain useful.
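Appending such an entry programmatically could look like the following sketch; the file path, date-only heading, and helper name are assumptions for illustration:

```python
from datetime import date

def append_reflection(path, context, reflection, lesson):
    """Append one CONTEXT/REFLECTION/LESSON entry to a reflection log file."""
    entry = (
        f"## {date.today().isoformat()}\n"
        f"CONTEXT: {context}\n"
        f"REFLECTION: {reflection}\n"
        f"LESSON: {lesson}\n\n"
    )
    with open(path, "a", encoding="utf-8") as f:  # append, never overwrite
        f.write(entry)
```

Because entries are append-only plain text, later compression passes can merge or prune them without any special tooling.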
Lifecycle management rules prevent the memory system from growing out of control
The skill includes built-in migration rules across HOT, WARM, and COLD layers. Frequently reused patterns within seven days stay in HOT. Patterns unused for 30 days move to WARM. Patterns unused for 90 days are archived into COLD.
Maintenance operations include weekly compression of similar entries, updating index.md, and automatically archiving stale rules. This is not just storage. It is a self-cleaning knowledge cache.
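The 30/90-day migration rules can be sketched as follows. Treating a file's mtime as "last used" is an assumption made for illustration; the skill itself tracks maintenance status in index.md:

```python
import os
import shutil
import time

DAY = 86400  # seconds

def tier_for(last_used_ts, now=None):
    """Map a last-used timestamp to a tier: <30 days HOT, <90 days WARM, else COLD."""
    now = time.time() if now is None else now
    age_days = (now - last_used_ts) / DAY
    if age_days < 30:
        return "HOT"
    if age_days < 90:
        return "WARM"
    return "COLD"

def archive_stale(warm_dir="./self-improving/domains",
                  cold_dir="./self-improving/archive"):
    """Move WARM files untouched for 90+ days into the COLD archive."""
    os.makedirs(cold_dir, exist_ok=True)
    for name in os.listdir(warm_dir):
        path = os.path.join(warm_dir, name)
        if os.path.isfile(path) and tier_for(os.path.getmtime(path)) == "COLD":
            shutil.move(path, os.path.join(cold_dir, name))
```

The point of the sketch is the shape of the policy: tiering is a pure function of age, and archival is a plain file move, so the whole lifecycle stays auditable with `ls` and `git diff`.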
```bash
# Inspect the memory directory and read the core files
ls -la ./self-improving/
cat ./self-improving/index.md   # View the index and maintenance status
cat ./self-improving/memory.md  # Load HOT memory
find ./self-improving/domains ./self-improving/projects -maxdepth 1 -type f -name "*.md" | sort
```
These commands initialize the memory loading workflow: read HOT first, then enumerate context files that can be loaded on demand.
The file format design emphasizes both human readability and machine compatibility
memory.md is divided into Confirmed Preferences, Active Patterns, and Recent so the system can distinguish never-decaying preferences from still-maturing rules. corrections.md focuses on timestamp, type, context, and confirmation state. index.md records the size of each layer and the most recent compression time.
The advantage of this format is simple: the Agent can parse it directly, and developers can edit it manually without needing database tooling.
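For illustration, a parser that splits such a file into its sections might look like this; the function name and the dict-of-lists representation are assumptions, not the skill's specified API:

```python
def parse_memory(text):
    """Return {section heading: [bullet lines]} for a memory.md-style file."""
    sections = {}
    current = None
    for line in text.splitlines():
        if line.startswith("## "):          # section heading
            current = line[3:].strip()
            sections[current] = []
        elif line.startswith("- ") and current:  # bullet under a section
            sections[current].append(line[2:].strip())
    return sections

doc = """## Confirmed Preferences
- Summarize parameters in tables
## Active Patterns
- List the execution plan first
"""
print(parse_memory(doc)["Active Patterns"])  # ['List the execution plan first']
```

Fifteen lines of stdlib-only string handling suffice, which is the practical meaning of "no database tooling needed."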
A minimal memory.md template looks like this
```markdown
## Confirmed Preferences
- Prefer summarizing parameters in tables when generating technical documentation

## Active Patterns
- List the execution plan first when handling multi-step tasks

## Recent (last 7 days)
- The user recently corrected capitalization of English technical terms multiple times
```
This template shows how HOT memory separates stable rules, active patterns, and recent signals that may still need confirmation.
Compared with other memory approaches, its strengths are transparency, low cost, and strict constraints
Traditional Agent memory systems often depend on external databases or vector indexes, which makes deployment more complex and auditing more difficult. This skill is completely file-system-based, making it a strong fit for local-first, privacy-sensitive, or manually reviewed environments.
More importantly, it explicitly forbids storing credentials, financial information, medical data, third-party private data, or location patterns, and it supports a "forget everything" command as a reset switch.
Integration with SOUL.md and AGENTS.md completes the operational loop
The recommended integration model is straightforward: in SOUL.md, require the Agent to load HOT memory and related memory files before non-trivial tasks; in AGENTS.md, distinguish factual memory from execution-improvement memory; and in HEARTBEAT.md, periodically check promotion and archival status.
```markdown
## Self-Improving Check
- [ ] Review corrections.md for patterns ready to graduate
- [ ] Check memory.md line count (should be ≤100)
- [ ] Archive patterns unused >90 days
```
This checklist institutionalizes maintenance work and prevents the memory system from being left unmanaged over time.
The FAQ addresses common questions
1. What fundamentally distinguishes this skill from ordinary chat history memory?
Answer: Chat history stores what was said, while this skill stores which practices work better. It functions more like an execution strategy library than a conversation replay log.
2. Why split memory into HOT, WARM, and COLD layers?
Answer: Keeping all memory resident slows loading and adds noise. With layering, high-frequency rules stay resident, project knowledge loads on demand, and old patterns move into archives, improving both efficiency and maintainability.
3. Which Agent scenarios is it best suited for?
Answer: It is a strong fit for coding assistants, personal productivity Agents, documentation generation Agents, and long-running task systems that need stable output style and process quality across sessions.
Core Summary: This article reconstructs and analyzes OpenClaw’s self-improving-with-reflection skill, focusing on its zero-dependency file system, HOT/WARM/COLD layered memory, correction detection, self-reflection, lifecycle management, and integration patterns for AGENTS.md, SOUL.md, and HEARTBEAT.md.