Memory Management: Understanding OpenClaw's 4-Layer System
Overview
OpenClaw's memory system is not a single mechanism—it's four separate memory layers working together. Understanding these layers is critical for building reliable agents that remember what matters and forget what doesn't. This guide explains each layer, how they interact, and how to optimize them for production use.
The Four Memory Layers

Think of OpenClaw's memory like a computer:
- Bootstrap Files = Hard drive (permanent storage)
- Session Transcript = Disk storage (persistent but summarized)
- Context Window = RAM (active working memory)
- Retrieval Index = Search index (queryable archive)
Why Four Layers?
Each layer serves a different purpose:
- Durability - Bootstrap files survive restarts
- History - Transcripts preserve conversation details
- Performance - Context window enables real-time processing
- Scalability - Retrieval index handles large datasets
Layer 1: Bootstrap Files
What They Are
Permanent identity files loaded from disk at every session start.
Location: ~/.openclaw/ or ~/.claude/
Common files:
- soul.md - Agent personality and core instructions
- memory.md - Long-term facts and preferences
- agents.md - Sub-agent configuration
- tools.md - Tool usage instructions
How They Work
- Session starts (daily restart or manual)
- Files are read from disk - Fresh copy every time
- Content injected into context - Immediately available
- Immune to compaction - Never summarized or lost
Critical Characteristics
Always loaded:
- Every session start reads these files
- No exceptions, no caching
- Fresh from filesystem
Not in conversation history:
- Separate from chat transcript
- Changes take effect immediately on next session
- No need to "remind" the agent
Most durable layer:
- Survives context compaction
- Survives session restarts
- Survives agent crashes
Size Limits
Default limits:
- 20,000 characters per file
- 150,000 characters total
Check your usage:
/context list
Output shows:
soul.md: 15,234 / 20,000 characters
memory.md: 8,456 / 20,000 characters
agents.md: 3,221 / 20,000 characters
Total: 26,911 / 150,000 characters
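The size check above can also be scripted. The sketch below is a minimal, hypothetical helper (the directory path, file pattern, and limits are taken from this guide's defaults, not from any official OpenClaw API):

```python
from pathlib import Path

PER_FILE_LIMIT = 20_000    # characters per file (default from this guide)
TOTAL_LIMIT = 150_000      # combined character budget

def check_bootstrap_sizes(memory_dir: str) -> dict:
    """Report character counts for each bootstrap .md file and flag overruns."""
    sizes = {}
    for path in sorted(Path(memory_dir).glob("*.md")):
        n = len(path.read_text(encoding="utf-8"))
        sizes[path.name] = n
        if n > PER_FILE_LIMIT:
            print(f"WARNING: {path.name} is {n} chars; the excess will be truncated")
    total = sum(sizes.values())
    if total > TOTAL_LIMIT:
        print(f"WARNING: total {total} chars exceeds the {TOTAL_LIMIT} budget")
    return sizes
```

Run it against `~/.openclaw/` (or wherever your bootstrap files live) before a session starts, so truncation never surprises you.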
Truncation Warning
If a file exceeds 20,000 characters:
- Content is truncated (cut off)
- No warning given
- Agent sees incomplete instructions
Solution:
- Keep soul.md to 15-30 lines
- Remove unnecessary biographical information
- Focus on work-relevant instructions only
Optimization Tips
Bad soul.md (bloated):
# About Me
I'm a software engineer with 10 years of experience.
I graduated from University X with a degree in Computer Science.
I worked at Company A, then Company B, then Company C.
My hobbies include reading, hiking, and photography.
I have a dog named Max and a cat named Luna.
I prefer coffee over tea, usually with milk.
I wake up at 6 AM and go to bed at 11 PM.
My favorite programming language is Python.
I also know JavaScript, Go, Rust, and Java.
I'm currently learning machine learning.
...
(100+ more lines of personal information)
Good soul.md (focused):
# Core Instructions
You are a specialized development assistant focused on:
- Code review and optimization
- Technical documentation
- API integration
## Communication Style
- Concise, technical responses
- Include code examples
- Cite sources when relevant
## Work Preferences
- Test-driven development
- Prefer TypeScript over JavaScript
- Follow project conventions
Sub-Agent Behavior
Important: Parallel sub-agents only read:
- agents.md
- tools.md
They do NOT read:
- soul.md
- memory.md
- Other bootstrap files
Implication:
- Sub-agents lack main agent's personality
- Task instructions must be in agents.md or passed explicitly
- Sub-agents are "dumber" by design (minimal context)
Layer 2: Session Transcript
What It Is
Full conversation history saved to disk as a file.
Location: ~/.openclaw/sessions/ or similar
Contains:
- User messages
- Assistant messages
- Tool calls and results
- Timestamps
How It Works
- Every message is appended to transcript file
- Transcript is rebuilt into context when continuing a session
- Persists across restarts - Can resume conversations
- Stored in a structured, machine-oriented log format - Not meant to be read directly
The Compaction Problem

When context window approaches limit (typically 200K tokens):
- Auto-compaction triggers
- Old messages are summarized into compact form
- Summary replaces detailed history in context
- Original transcript still exists on disk (but agent can't see it)
Critical distinction:
- Raw transcript file = Still on disk, complete
- Agent's view = Summarized version only
What Survives Compaction
Preserved:
- Last 20,000 tokens (recent messages)
- Anything written to bootstrap files
- General themes and topics
Lost:
- Exact wording of earlier instructions
- Nuance and context from old messages
- Specific constraints mentioned mid-conversation
- Casual preferences stated in chat
- Images from earlier in session
The Walter White Problem
Scenario:
Day 1: Long conversation with agent about project X
Day 2: Context compacts overnight
Day 3: Agent asks "Who are you? What project?"
Why it happens:
- Conversation details were in transcript only
- Never saved to bootstrap files
- Compaction summarized away the specifics
Solution:
Always save important information to files, not chat
Lifespan
Before compaction:
- Full detailed history available
- Agent remembers exact wording
- Can reference specific earlier messages
After compaction:
- Summary + recent 20K tokens only
- General understanding remains
- Specific details are lost
Layer 3: Context Window
What It Is
Active working memory - fixed-size container where everything competes for space.
Size by model:
- Claude Opus/Sonnet: 200,000 tokens
- GPT-4: 128,000 tokens
- Gemini Pro: 1,000,000 tokens
- MiniMax: 200,000 tokens
Conversion: 1 token ≈ 0.75 words (English)
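The 1 token ≈ 0.75 words conversion is only a rule of thumb for English prose (real tokenizers vary by model), but it is handy for back-of-envelope budgeting. A trivial estimator using that heuristic:

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate from word count, using 1 token ~= 0.75 English words.
    This is a heuristic only; an actual tokenizer will give different numbers."""
    words = len(text.split())
    return round(words / 0.75)
```

So a 150-word message costs roughly 200 tokens of context.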
What Fills It
- System prompt - OpenClaw's instructions
- Bootstrap files - Loaded at session start
- Conversation history - Recent messages
- Tool results - File reads, web fetches, API responses
- Current message - Task being processed
Compaction Trigger
Formula:
Trigger = Context Limit - Reserve Tokens - Soft Threshold
Example (200K context):
200,000 - 40,000 - 4,000 = 156,000 tokens
Compaction fires at 156K, not 200K
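The trigger formula is simple arithmetic, shown here as a sketch with the defaults this guide describes (the function name is illustrative, not part of OpenClaw):

```python
def compaction_trigger(context_limit: int,
                       reserve_tokens: int = 40_000,
                       soft_threshold: int = 4_000) -> int:
    """Token count at which auto-compaction fires.
    Defaults match the reserve floor and soft threshold described above."""
    return context_limit - reserve_tokens - soft_threshold
```

For a 200K model this gives 156,000; dropping the reserve to 20,000 for a large task raises the trigger to 176,000.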
Reserve Tokens Floor
Purpose: Space reserved for agent's response
Default: 40,000 tokens
Configurable:
- Large tasks: Reduce to 20,000
- Small tasks: Keep at 40,000
Soft Threshold
Purpose: Additional buffer to prevent edge cases
Default: 4,000 tokens
What Competes for Space
Biggest consumers:
- Tool results - File reads, web snapshots
- Long conversations - Multi-turn back-and-forth
- Code blocks - Full file contents
- Bootstrap files - Loaded every turn
Optimization Strategy
Instead of:
Analyze this YouTube video: [link]
(Agent fetches full transcript via API - 50K tokens)
Do this:
- Get transcript manually
- Save to text file
- Upload file
Token savings: Up to 95%
Layer 4: Retrieval Index
What It Is
Searchable archive that sits beside or outside memory files.
Technology:
- Vector database (SQLite)
- Hybrid search (keyword + semantic)
- Embeddings-based retrieval
How It Works
- Write information to memory files
- OpenClaw indexes the content automatically
- Agent searches with memory_search tool
- Index returns relevant snippets with file paths
- Agent reads full context with memory_get
Two-step process: Search → Retrieve
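The two-step pattern can be illustrated with a toy stand-in. The functions below mimic the shape of the search-then-retrieve flow using plain keyword matching over files; the names echo the tools mentioned above, but this is a self-contained sketch, not the real tool implementation:

```python
from pathlib import Path

def memory_search(query: str, memory_dir: str, window: int = 80) -> list[dict]:
    """Step one (toy version): scan memory files for the query and return
    short snippets plus the file path needed for the follow-up read."""
    hits = []
    for path in Path(memory_dir).glob("**/*.md"):
        text = path.read_text(encoding="utf-8")
        i = text.lower().find(query.lower())
        if i != -1:
            hits.append({"path": str(path),
                         "snippet": text[max(0, i - window): i + window]})
    return hits

def memory_get(path: str) -> str:
    """Step two: read the full file once search has located it."""
    return Path(path).read_text(encoding="utf-8")
```

The real index adds semantic matching on top of this, but the contract is the same: search returns snippets with paths, retrieval fetches full context.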
Enabling Embeddings
Requirement: OpenAI or Gemini API key
Check if enabled:
How does your memory embedding system work?
Agent response should mention:
- Vector database
- Semantic search
- SQLite file in memory directory
If not enabled:
Set up memory embeddings with OpenAI key
Keyword vs. Semantic Search
Keyword search:
- Exact word matching
- "Pepsi" finds "Pepsi"
- Fast but limited
Semantic search:
- Concept matching
- "soda" finds "Pepsi", "Coca-Cola", "soft drink"
- Understands relationships
How Embeddings Work
Simplified explanation:
- Text → Numbers - "Pepsi" becomes vector [0.23, 0.87, 0.45, ...]
- Similar concepts = Similar numbers - "soda" becomes [0.25, 0.85, 0.43, ...]
- Search by similarity - Find vectors close to query vector
- Return relevant content - Matches based on meaning, not just words
Why it matters:
- Computers are good with numbers, not words
- Vector similarity enables semantic understanding
- Scales to large memory archives
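"Search by similarity" usually means cosine similarity between vectors. Here is the computation applied to the toy 3-dimensional vectors from the example above (real embeddings have hundreds or thousands of dimensions):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 = identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

pepsi = [0.23, 0.87, 0.45]   # toy embedding from the example above
soda  = [0.25, 0.85, 0.43]   # a nearby vector: similar concept
print(cosine_similarity(pepsi, soda))  # very close to 1.0 -> semantically related
```

Retrieval simply ranks stored vectors by this score against the query vector and returns the top matches.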
Storage Location
Check for SQLite database:
ls -la ~/.openclaw/memory/
Look for:
- memory.db or similar SQLite file - Not human-readable
- Contains vector embeddings
- Contains vector embeddings
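If you want to verify the database from code rather than `ls`, Python's standard sqlite3 module can peek at it. The database filename is an assumption based on the listing above; table names will vary by version:

```python
import sqlite3
from pathlib import Path

def inspect_memory_db(db_path: str) -> list[str]:
    """Return the table names in the embeddings database, or [] if absent."""
    if not Path(db_path).exists():
        return []
    with sqlite3.connect(db_path) as conn:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'").fetchall()
    return [name for (name,) in rows]
```

A non-empty table list is a good sign the index has actually been built; an empty or missing file means embeddings are not set up.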
Use Cases
Scenario 1: Long-term project memory
Agent: "What did we decide about the pricing model?"
Search: "pricing decision"
Result: Finds discussion from 3 weeks ago
Scenario 2: Offloading large datasets
Store: Daily news scraping results in memory files
Query: "What AI developments happened last week?"
Result: Retrieves relevant articles without loading all data
Scenario 3: Cross-session knowledge
Store: Lessons learned from past projects
Query: "How did we handle authentication in Project X?"
Result: Finds implementation notes from months ago
Integration with External Tools
Obsidian + GitHub pattern:
- Store large datasets in Obsidian (outside OpenClaw memory)
- Sync to GitHub for backup and version control
- Agent searches via retrieval index
- Fetches relevant content on-demand
Benefits:
- No memory directory bloat
- Version-controlled knowledge base
- Accessible outside OpenClaw
Memory Priority
OpenClaw prioritizes recent memory:
- Yesterday's work: Easily accessible
- Last week: Requires search
- Last month: Needs retrieval index
Without embeddings:
- Agent may not find old information
- Relies on bootstrap files and recent transcript
With embeddings:
- Semantic search finds relevant content regardless of age
- Scales to months or years of history
How the Layers Work Together
Session Start Flow
1. Bootstrap files loaded from disk
↓
2. Session transcript rebuilt into context
↓
3. Context window populated with:
- System prompt
- Bootstrap files
- Recent conversation
↓
4. Retrieval index ready for queries
During Conversation
User message
↓
Agent checks context window (active memory)
↓
If needed: Searches retrieval index
↓
If needed: Reads additional files
↓
Generates response
↓
Appends to session transcript
When Context Fills
Context reaches 156K tokens
↓
Auto-compaction triggers
↓
Old messages summarized
↓
Summary replaces detailed history
↓
Bootstrap files remain intact
↓
Retrieval index unaffected
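The compaction flow above can be sketched as a function that keeps the recent tail within a token budget and collapses everything older into a single summary stub. This is an illustrative model of the behavior, not OpenClaw's actual implementation (a real summary is produced by the model, not a placeholder string):

```python
def compact(messages: list[str], keep_tokens: int = 20_000,
            tokens=lambda m: len(m.split())) -> list[str]:
    """Keep the most recent messages that fit in keep_tokens; replace the
    rest with one summary stub. `tokens` is a naive word-count stand-in."""
    kept, budget = [], keep_tokens
    for msg in reversed(messages):          # walk newest -> oldest
        cost = tokens(msg)
        if cost > budget:
            break                           # this message no longer fits
        kept.append(msg)
        budget -= cost
    older = messages[: len(messages) - len(kept)]
    summary = [f"[summary of {len(older)} earlier messages]"] if older else []
    return summary + list(reversed(kept))
```

Note what this models: recent detail survives verbatim, older detail survives only as a summary, and nothing here touches bootstrap files or the retrieval index.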
Memory Failures: Three Common Types
Failure 1: Bootstrap File Truncation
Symptom: Agent forgets core instructions
Cause: File exceeded 20,000 character limit
Solution:
- Check file sizes: /context list
- Trim to under 20,000 characters
- Remove unnecessary content
Failure 2: Chat Instructions Lost
Symptom: Agent forgets instructions given in conversation
Cause: Instructions never saved to file, lost in compaction
Solution:
Save this instruction to my soul.md file
Failure 3: Retrieval Index Not Enabled
Symptom: Agent can't find old information
Cause: No OpenAI/Gemini API key configured
Solution:
- Set up API key
- Verify SQLite database exists
- Test with memory search
Best Practices
1. Save Important Information to Files
Rule: If it's not in a file, it doesn't exist long-term
Good:
Save this rule to my agents.md file:
"Never delete emails without explicit confirmation"
Bad:
Remember: never delete emails without asking
(Will be lost after compaction)
2. Keep Bootstrap Files Minimal
Target:
- soul.md: 15-30 lines
- memory.md: Key facts only
- Total: Under 150,000 characters
Remove:
- Personal biography
- Irrelevant preferences
- Redundant information
3. Use Retrieval Index for Large Datasets
Don't:
- Store all data in bootstrap files
- Load everything into context
Do:
- Store in memory directory
- Enable embeddings
- Search on-demand
4. Organize Memory Directory
Structure:
~/.openclaw/memory/
├── daily/
│ ├── 2026-05-01.md
│ ├── 2026-05-02.md
│ └── 2026-05-03.md
├── projects/
│ ├── project-a.md
│ ├── project-b.md
│ └── project-c.md
├── trading-system/
│ ├── strategy.md
│ ├── rules.md
│ └── performance.md
└── memory.db (SQLite)
Benefits:
- Easy to navigate
- Clear organization
- Scalable structure
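You can bootstrap that layout with a few lines of Python. The subdirectory names are just the ones from the example tree above; substitute your own:

```python
from pathlib import Path

def init_memory_layout(root: str) -> None:
    """Create the example memory-directory layout (daily / projects / ...)."""
    base = Path(root)
    for sub in ("daily", "projects", "trading-system"):
        (base / sub).mkdir(parents=True, exist_ok=True)
```

Running it once against `~/.openclaw/memory/` gives the agent a predictable place to file new notes.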
5. Distinguish Evergreen vs. Ephemeral
Evergreen (store in memory directory):
- Trading system rules
- Project documentation
- Standard operating procedures
Ephemeral (store externally):
- Daily news scraping
- Temporary research
- One-time analysis
Troubleshooting
"Agent doesn't remember conversation from yesterday"
Cause: Context compacted, details summarized
Solution:
- Check if important info was saved to file
- If not, re-provide and save to bootstrap file
- Enable retrieval index for better recall
"Agent forgets core instructions"
Cause: Bootstrap file truncated or not loaded
Solution:
- Check file size: /context list
- Verify file is in correct directory
- Restart session to reload files
"Can't find information from last month"
Cause: Retrieval index not enabled or not working
Solution:
- Verify OpenAI/Gemini API key is set
- Check for SQLite database file
- Test memory search functionality
"Memory directory is huge"
Cause: Too much data stored locally
Solution:
- Move large datasets to external storage (Obsidian, GitHub)
- Archive old daily memory files
- Keep only evergreen content in memory directory
Advanced Patterns
Hybrid Storage Strategy
Local memory (OpenClaw):
- Core instructions
- Current project context
- Frequently accessed data
External storage (Obsidian/GitHub):
- Historical data
- Large datasets
- Archived projects
Access pattern:
Agent searches local memory first
If not found: Query external storage via retrieval
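The access pattern reduces to a local-first lookup with an external fallback. A minimal sketch, assuming `search_local` and `search_external` are whatever search callables your setup provides (these names are illustrative):

```python
def hybrid_lookup(query, search_local, search_external):
    """Local-first retrieval: only fall back to the slower external
    store when local memory returns no hits."""
    hits = search_local(query)
    return hits if hits else search_external(query)
```

This keeps the common case fast while still reaching archived Obsidian/GitHub content when local memory comes up empty.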
Memory Compaction Strategy
Proactive approach:
- Monitor context usage regularly
- Trigger manual compaction at 120K tokens
- Review and save important context before compacting
Reactive approach:
- Let auto-compaction handle it
- Accept some information loss
- Rely on bootstrap files for critical data
Related Resources
Duration: 18 minutes
Difficulty: Intermediate
Video Reference: How OpenClaw Memory ACTUALLY Works