Memory Management: Understanding OpenClaw's 4-Layer System
Overview
OpenClaw's memory system is not a single mechanism—it's four separate memory layers working together. Understanding these layers is critical for building reliable agents that remember what matters and forget what doesn't. This guide explains each layer, how they interact, and how to optimize them for production use.
The Four Memory Layers

Think of OpenClaw's memory like a computer:
- Bootstrap Files = Hard drive (permanent storage)
- Session Transcript = Disk storage (persistent but summarized)
- Context Window = RAM (active working memory)
- Retrieval Index = Search index (queryable archive)
Why Four Layers?
Each layer serves a different purpose:
- Durability - Bootstrap files survive restarts
- History - Transcripts preserve conversation details
- Performance - Context window enables real-time processing
- Scalability - Retrieval index handles large datasets
Layer 1: Bootstrap Files
What They Are
Permanent identity files loaded from disk at every session start.
Location: ~/.openclaw/ or ~/.claude/
Common files:
- soul.md - Agent personality and core instructions
- memory.md - Long-term facts and preferences
- agents.md - Sub-agent configuration
- tools.md - Tool usage instructions
How They Work
- Session starts (daily restart or manual)
- Files are read from disk - Fresh copy every time
- Content injected into context - Immediately available
- Immune to compaction - Never summarized or lost
Critical Characteristics
Always loaded:
- Every session start reads these files
- No exceptions, no caching
- Fresh from filesystem
Not in conversation history:
- Separate from chat transcript
- Changes take effect immediately on next session
- No need to "remind" the agent
Most durable layer:
- Survives context compaction
- Survives session restarts
- Survives agent crashes
Size Limits
Default limits:
- 20,000 characters per file
- 150,000 characters total
Check your usage:
/context list
Output shows:
soul.md: 15,234 / 20,000 characters
memory.md: 8,456 / 20,000 characters
agents.md: 3,221 / 20,000 characters
Total: 26,911 / 150,000 characters
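The size check above can also be scripted. The sketch below is a minimal, hypothetical helper (the directory path, file pattern, and limits are taken from this guide's defaults, not from any official OpenClaw API):

```python
from pathlib import Path

PER_FILE_LIMIT = 20_000    # characters per file (default from this guide)
TOTAL_LIMIT = 150_000      # combined character budget

def check_bootstrap_sizes(memory_dir: str) -> dict:
    """Report character counts for each bootstrap .md file and flag overruns."""
    sizes = {}
    for path in sorted(Path(memory_dir).glob("*.md")):
        n = len(path.read_text(encoding="utf-8"))
        sizes[path.name] = n
        if n > PER_FILE_LIMIT:
            print(f"WARNING: {path.name} is {n} chars; the excess will be truncated")
    total = sum(sizes.values())
    if total > TOTAL_LIMIT:
        print(f"WARNING: total {total} chars exceeds the {TOTAL_LIMIT} budget")
    return sizes
```

Run it against `~/.openclaw/` (or wherever your bootstrap files live) before a session starts, so truncation never surprises you.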
Truncation Warning
If a file exceeds 20,000 characters:
- Content is truncated (cut off)
- No warning given
- Agent sees incomplete instructions
Solution:
- Keep soul.md to 15-30 lines
- Remove unnecessary biographical information
- Focus on work-relevant instructions only
Optimization Tips
Bad soul.md (bloated):
# About Me
I'm a software engineer with 10 years of experience.
I graduated from University X with a degree in Computer Science.
I worked at Company A, then Company B, then Company C.
My hobbies include reading, hiking, and photography.
I have a dog named Max and a cat named Luna.
I prefer coffee over tea, usually with milk.
I wake up at 6 AM and go to bed at 11 PM.
My favorite programming language is Python.
I also know JavaScript, Go, Rust, and Java.
I'm currently learning machine learning.
...
(100+ more lines of personal information)
Good soul.md (focused):
# Core Instructions
You are a specialized development assistant focused on:
- Code review and optimization
- Technical documentation
- API integration
## Communication Style
- Concise, technical responses
- Include code examples
- Cite sources when relevant
## Work Preferences
- Test-driven development
- Prefer TypeScript over JavaScript
- Follow project conventions
Sub-Agent Behavior
Important: Parallel sub-agents only read:
- agents.md
- tools.md
They do NOT read:
- soul.md
- memory.md
- Other bootstrap files
Implication:
- Sub-agents lack main agent's personality
- Task instructions must be in agents.md or passed explicitly
- Sub-agents are "dumber" by design (minimal context)
Layer 2: Session Transcript
What It Is
Full conversation history saved to disk as a file.
Location: ~/.openclaw/sessions/ or similar
Contains:
- User messages
- Assistant messages
- Tool calls and results
- Timestamps
How It Works
- Every message is appended to transcript file
- Transcript is rebuilt into context when continuing a session
- Persists across restarts - Can resume conversations
- Stored in a structured, machine-oriented log format - Not meant to be read directly
The Compaction Problem

When context window approaches limit (typically 200K tokens):
- Auto-compaction triggers
- Old messages are summarized into compact form
- Summary replaces detailed history in context
- Original transcript still exists on disk (but agent can't see it)
Critical distinction:
- Raw transcript file = Still on disk, complete
- Agent's view = Summarized version only
What Survives Compaction
Preserved:
- Last 20,000 tokens (recent messages)
- Anything written to bootstrap files
- General themes and topics
Lost:
- Exact wording of earlier instructions
- Nuance and context from old messages
- Specific constraints mentioned mid-conversation
- Casual preferences stated in chat
- Images from earlier in session
The Walter White Problem
Scenario:
Day 1: Long conversation with agent about project X
Day 2: Context compacts overnight
Day 3: Agent asks "Who are you? What project?"
Why it happens:
- Conversation details were in transcript only
- Never saved to bootstrap files
- Compaction summarized away the specifics
Solution:
Always save important information to files, not chat
Lifespan
Before compaction:
- Full detailed history available
- Agent remembers exact wording
- Can reference specific earlier messages
After compaction:
- Summary + recent 20K tokens only
- General understanding remains
- Specific details are lost
Layer 3: Context Window
What It Is
Active working memory - fixed-size container where everything competes for space.
Size by model:
- Claude Opus/Sonnet: 200,000 tokens
- GPT-4: 128,000 tokens
- Gemini Pro: 1,000,000 tokens
- MiniMax: 200,000 tokens
Conversion: 1 token ≈ 0.75 words (English)
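The 1 token ≈ 0.75 words conversion is only a rule of thumb for English prose (real tokenizers vary by model), but it is handy for back-of-envelope budgeting. A trivial estimator using that heuristic:

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate from word count, using 1 token ~= 0.75 English words.
    This is a heuristic only; an actual tokenizer will give different numbers."""
    words = len(text.split())
    return round(words / 0.75)
```

So a 150-word message costs roughly 200 tokens of context.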
What Fills It
- System prompt - OpenClaw's instructions
- Bootstrap files - Loaded at session start
- Conversation history - Recent messages
- Tool results - File reads, web fetches, API responses
- Current message - Task being processed
Compaction Trigger
Formula:
Trigger = Context Limit - Reserve Tokens - Soft Threshold
Example (200K context):
200,000 - 40,000 - 4,000 = 156,000 tokens
Compaction fires at 156K, not 200K
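The trigger formula is simple arithmetic, shown here as a sketch with the defaults this guide describes (the function name is illustrative, not part of OpenClaw):

```python
def compaction_trigger(context_limit: int,
                       reserve_tokens: int = 40_000,
                       soft_threshold: int = 4_000) -> int:
    """Token count at which auto-compaction fires.
    Defaults match the reserve floor and soft threshold described above."""
    return context_limit - reserve_tokens - soft_threshold
```

For a 200K model this gives 156,000; dropping the reserve to 20,000 for a large task raises the trigger to 176,000.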
Reserve Tokens Floor
Purpose: Space reserved for agent's response
Default: 40,000 tokens
Configurable:
- Large tasks: Reduce to 20,000
- Small tasks: Keep at 40,000
Soft Threshold
Purpose: Additional buffer to prevent edge cases
Default: 4,000 tokens
What Competes for Space
Biggest consumers:
- Tool results - File reads, web snapshots
- Long conversations - Multi-turn back-and-forth
- Code blocks - Full file contents
- Bootstrap files - Loaded every turn
Optimization Strategy
Instead of:
Analyze this YouTube video: [link]
(Agent fetches full transcript via API - 50K tokens)
Do this:
- Get transcript manually
- Save to text file
- Upload file
Token savings: Up to 95%
Layer 4: Retrieval Index
What It Is
Searchable archive that sits beside or outside memory files.
Technology:
- Vector database (SQLite)
- Hybrid search (keyword + semantic)
- Embeddings-based retrieval
How It Works
- Write information to memory files
- OpenClaw indexes the content automatically
- Agent searches with memory_search tool
- Index returns relevant snippets with file paths
- Agent reads full context with memory_get
Two-step process: Search → Retrieve
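The two-step pattern can be illustrated with a toy stand-in. The functions below mimic the shape of the search-then-retrieve flow using plain keyword matching over files; the names echo the tools mentioned above, but this is a self-contained sketch, not the real tool implementation:

```python
from pathlib import Path

def memory_search(query: str, memory_dir: str, window: int = 80) -> list[dict]:
    """Step one (toy version): scan memory files for the query and return
    short snippets plus the file path needed for the follow-up read."""
    hits = []
    for path in Path(memory_dir).glob("**/*.md"):
        text = path.read_text(encoding="utf-8")
        i = text.lower().find(query.lower())
        if i != -1:
            hits.append({"path": str(path),
                         "snippet": text[max(0, i - window): i + window]})
    return hits

def memory_get(path: str) -> str:
    """Step two: read the full file once search has located it."""
    return Path(path).read_text(encoding="utf-8")
```

The real index adds semantic matching on top of this, but the contract is the same: search returns snippets with paths, retrieval fetches full context.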
Enabling Embeddings
Requirement: OpenAI or Gemini API key
Check if enabled:
How does your memory embedding system work?
Agent response should mention:
- Vector database
- Semantic search
- SQLite file in memory directory
If not enabled:
Set up memory embeddings with OpenAI key
Keyword vs. Semantic Search
Keyword search:
- Exact word matching
- "Pepsi" finds "Pepsi"
- Fast but limited
Semantic search:
- Concept matching
- "soda" finds "Pepsi", "Coca-Cola", "soft drink"
- Understands relationships
How Embeddings Work
Simplified explanation:
- Text → Numbers - "Pepsi" becomes vector [0.23, 0.87, 0.45, ...]
- Similar concepts = Similar numbers - "soda" becomes [0.25, 0.85, 0.43, ...]
- Search by similarity - Find vectors close to query vector
- Return relevant content - Matches based on meaning, not just words
Why it matters:
- Computers are good with numbers, not words
- Vector similarity enables semantic understanding
- Scales to large memory archives
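"Search by similarity" usually means cosine similarity between vectors. Here is the computation applied to the toy 3-dimensional vectors from the example above (real embeddings have hundreds or thousands of dimensions):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 = identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

pepsi = [0.23, 0.87, 0.45]   # toy embedding from the example above
soda  = [0.25, 0.85, 0.43]   # a nearby vector: similar concept
print(cosine_similarity(pepsi, soda))  # very close to 1.0 -> semantically related
```

Retrieval simply ranks stored vectors by this score against the query vector and returns the top matches.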
Storage Location
Check for SQLite database:
ls -la ~/.openclaw/memory/
Look for:
- memory.db or similar SQLite file - Not human-readable
- Contains vector embeddings
- Contains vector embeddings
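If you want to verify the database from code rather than `ls`, Python's standard sqlite3 module can peek at it. The database filename is an assumption based on the listing above; table names will vary by version:

```python
import sqlite3
from pathlib import Path

def inspect_memory_db(db_path: str) -> list[str]:
    """Return the table names in the embeddings database, or [] if absent."""
    if not Path(db_path).exists():
        return []
    with sqlite3.connect(db_path) as conn:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'").fetchall()
    return [name for (name,) in rows]
```

A non-empty table list is a good sign the index has actually been built; an empty or missing file means embeddings are not set up.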
Use Cases
Scenario 1: Long-term project memory
Agent: "What did we decide about the pricing model?"
Search: "pricing decision"
Result: Finds discussion from 3 weeks ago
Scenario 2: Offloading large datasets
Store: Daily news scraping results in memory files
Query: "What AI developments happened last week?"
Result: Retrieves relevant articles without loading all data
Scenario 3: Cross-session knowledge
Store: Lessons learned from past projects
Query: "How did we handle authentication in Project X?"
Result: Finds implementation notes from months ago
Integration with External Tools
Obsidian + GitHub pattern:
- Store large datasets in Obsidian (outside OpenClaw memory)
- Sync to GitHub for backup and version control
- Agent searches via retrieval index
- Fetches relevant content on-demand
Benefits:
- No memory directory bloat
- Version-controlled knowledge base
- Accessible outside OpenClaw
Memory Priority
OpenClaw prioritizes recent memory:
- Yesterday's work: Easily accessible
- Last week: Requires search
- Last month: Needs retrieval index
Without embeddings:
- Agent may not find old information
- Relies on bootstrap files and recent transcript
With embeddings:
- Semantic search finds relevant content regardless of age
- Scales to months or years of history
How the Layers Work Together
Session Start Flow
1. Bootstrap files loaded from disk
↓
2. Session transcript rebuilt into context
↓
3. Context window populated with:
- System prompt
- Bootstrap files
- Recent conversation
↓
4. Retrieval index ready for queries
During Conversation
User message
↓
Agent checks context window (active memory)
↓
If needed: Searches retrieval index
↓
If needed: Reads additional files
↓
Generates response
↓
Appends to session transcript
When Context Fills
Context reaches 156K tokens
↓
Auto-compaction triggers
↓
Old messages summarized
↓
Summary replaces detailed history
↓
Bootstrap files remain intact
↓
Retrieval index unaffected
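The compaction flow above can be sketched as a function that keeps the recent tail within a token budget and collapses everything older into a single summary stub. This is an illustrative model of the behavior, not OpenClaw's actual implementation (a real summary is produced by the model, not a placeholder string):

```python
def compact(messages: list[str], keep_tokens: int = 20_000,
            tokens=lambda m: len(m.split())) -> list[str]:
    """Keep the most recent messages that fit in keep_tokens; replace the
    rest with one summary stub. `tokens` is a naive word-count stand-in."""
    kept, budget = [], keep_tokens
    for msg in reversed(messages):          # walk newest -> oldest
        cost = tokens(msg)
        if cost > budget:
            break                           # this message no longer fits
        kept.append(msg)
        budget -= cost
    older = messages[: len(messages) - len(kept)]
    summary = [f"[summary of {len(older)} earlier messages]"] if older else []
    return summary + list(reversed(kept))
```

Note what this models: recent detail survives verbatim, older detail survives only as a summary, and nothing here touches bootstrap files or the retrieval index.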
Memory Failures: Three Common Types
Failure 1: Bootstrap File Truncation
Symptom: Agent forgets core instructions
Cause: File exceeded 20,000 character limit
Solution:
- Check file sizes: /context list
- Trim to under 20,000 characters
- Remove unnecessary content
Failure 2: Chat Instructions Lost
Symptom: Agent forgets instructions given in conversation
Cause: Instructions never saved to file, lost in compaction
Solution:
Save this instruction to my soul.md file
Failure 3: Retrieval Index Not Enabled
Symptom: Agent can't find old information
Cause: No OpenAI/Gemini API key configured
Solution:
- Set up API key
- Verify SQLite database exists
- Test with memory search
Best Practices
1. Save Important Information to Files
Rule: If it's not in a file, it doesn't exist long-term
Good:
Save this rule to my agents.md file:
"Never delete emails without explicit confirmation"
Bad:
Remember: never delete emails without asking
(Will be lost after compaction)
2. Keep Bootstrap Files Minimal
Target:
- soul.md: 15-30 lines
- memory.md: Key facts only
- Total: Under 150,000 characters
Remove:
- Personal biography
- Irrelevant preferences
- Redundant information
3. Use Retrieval Index for Large Datasets
Don't:
- Store all data in bootstrap files
- Load everything into context
Do:
- Store in memory directory
- Enable embeddings
- Search on-demand
4. Organize Memory Directory
Structure:
~/.openclaw/memory/
├── daily/
│ ├── 2026-05-01.md
│ ├── 2026-05-02.md
│ └── 2026-05-03.md
├── projects/
│ ├── project-a.md
│ ├── project-b.md
│ └── project-c.md
├── trading-system/
│ ├── strategy.md
│ ├── rules.md
│ └── performance.md
└── memory.db (SQLite)
Benefits:
- Easy to navigate
- Clear organization
- Scalable structure
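You can bootstrap that layout with a few lines of Python. The subdirectory names are just the ones from the example tree above; substitute your own:

```python
from pathlib import Path

def init_memory_layout(root: str) -> None:
    """Create the example memory-directory layout (daily / projects / ...)."""
    base = Path(root)
    for sub in ("daily", "projects", "trading-system"):
        (base / sub).mkdir(parents=True, exist_ok=True)
```

Running it once against `~/.openclaw/memory/` gives the agent a predictable place to file new notes.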
5. Distinguish Evergreen vs. Ephemeral
Evergreen (store in memory directory):
- Trading system rules
- Project documentation
- Standard operating procedures
Ephemeral (store externally):
- Daily news scraping
- Temporary research
- One-time analysis
Troubleshooting
"Agent doesn't remember conversation from yesterday"
Cause: Context compacted, details summarized
Solution:
- Check if important info was saved to file
- If not, re-provide and save to bootstrap file
- Enable retrieval index for better recall
"Agent forgets core instructions"
Cause: Bootstrap file truncated or not loaded
Solution:
- Check file size: /context list
- Verify file is in correct directory
- Restart session to reload files
"Can't find information from last month"
Cause: Retrieval index not enabled or not working
Solution:
- Verify OpenAI/Gemini API key is set
- Check for SQLite database file
- Test memory search functionality
"Memory directory is huge"
Cause: Too much data stored locally
Solution:
- Move large datasets to external storage (Obsidian, GitHub)
- Archive old daily memory files
- Keep only evergreen content in memory directory
Advanced Patterns
Hybrid Storage Strategy
Local memory (OpenClaw):
- Core instructions
- Current project context
- Frequently accessed data
External storage (Obsidian/GitHub):
- Historical data
- Large datasets
- Archived projects
Access pattern:
Agent searches local memory first
If not found: Query external storage via retrieval
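The access pattern reduces to a local-first lookup with an external fallback. A minimal sketch, assuming `search_local` and `search_external` are whatever search callables your setup provides (these names are illustrative):

```python
def hybrid_lookup(query, search_local, search_external):
    """Local-first retrieval: only fall back to the slower external
    store when local memory returns no hits."""
    hits = search_local(query)
    return hits if hits else search_external(query)
```

This keeps the common case fast while still reaching archived Obsidian/GitHub content when local memory comes up empty.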
Memory Compaction Strategy
Proactive approach:
- Monitor context usage regularly
- Trigger manual compaction at 120K tokens
- Review and save important context before compacting
Reactive approach:
- Let auto-compaction handle it
- Accept some information loss
- Rely on bootstrap files for critical data
Related Resources
Duration: 18 minutes
Difficulty: Intermediate
Video Reference: How OpenClaw Memory ACTUALLY Works