
Memory Management: Understanding OpenClaw's 4-Layer System

Overview

OpenClaw's memory system is not a single mechanism—it's four separate memory layers working together. Understanding these layers is critical for building reliable agents that remember what matters and forget what doesn't. This guide explains each layer, how they interact, and how to optimize them for production use.

The Four Memory Layers

Memory System Layers

Think of OpenClaw's memory like a computer's storage hierarchy:

  • Bootstrap Files = Hard drive (permanent storage)
  • Session Transcript = Append-only log (persistent but summarized)
  • Context Window = RAM (active working memory)
  • Retrieval Index = Search index (queryable archive)

Why Four Layers?

Each layer serves a different purpose:

  • Durability - Bootstrap files survive restarts
  • History - Transcripts preserve conversation details
  • Performance - Context window enables real-time processing
  • Scalability - Retrieval index handles large datasets

Layer 1: Bootstrap Files

What They Are

Permanent identity files loaded from disk at every session start.

Location: ~/.openclaw/ or ~/.claude/

Common files:

  • soul.md - Agent personality and core instructions
  • memory.md - Long-term facts and preferences
  • agents.md - Sub-agent configuration
  • tools.md - Tool usage instructions

How They Work

  1. Session starts (daily restart or manual)
  2. Files are read from disk - Fresh copy every time
  3. Content injected into context - Immediately available
  4. Immune to compaction - Never summarized or lost
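The startup sequence above can be sketched as a simple loader. The directory and file names follow the conventions described in this guide, but the function is hypothetical, not OpenClaw's actual implementation:

```python
from pathlib import Path

# Hypothetical sketch of the bootstrap load described above:
# every session start re-reads the files from disk, so edits
# take effect on the next session with no caching involved.
BOOTSTRAP_DIR = Path.home() / ".openclaw"
BOOTSTRAP_FILES = ["soul.md", "memory.md", "agents.md", "tools.md"]

def load_bootstrap_context(base_dir: Path = BOOTSTRAP_DIR) -> str:
    """Read each bootstrap file fresh from disk and concatenate."""
    sections = []
    for name in BOOTSTRAP_FILES:
        path = base_dir / name
        if path.exists():
            # Fresh read every time -- no cache, so changes apply immediately
            sections.append(f"## {name}\n{path.read_text()}")
    return "\n\n".join(sections)
```

Because nothing is cached, editing soul.md on disk is all it takes; the next session picks up the change automatically.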

Critical Characteristics

Always loaded:

  • Every session start reads these files
  • No exceptions, no caching
  • Fresh from filesystem

Not in conversation history:

  • Separate from chat transcript
  • Changes take effect immediately on next session
  • No need to "remind" the agent

Most durable layer:

  • Survives context compaction
  • Survives session restarts
  • Survives agent crashes

Size Limits

Default limits:

  • 20,000 characters per file
  • 150,000 characters total

Check your usage:

/context list

Output shows:

soul.md: 15,234 / 20,000 characters
memory.md: 8,456 / 20,000 characters
agents.md: 3,221 / 20,000 characters
Total: 26,911 / 150,000 characters
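A report like the /context list output above can be approximated with a short script. The 20,000 per-file and 150,000 total limits are the defaults from this guide; the helper name is made up:

```python
from pathlib import Path

PER_FILE_LIMIT = 20_000    # default character limit per bootstrap file
TOTAL_LIMIT = 150_000      # default character limit across all files

def report_usage(base_dir: Path) -> list[str]:
    """Mimic the /context list report for .md files in a directory."""
    lines, total = [], 0
    for path in sorted(base_dir.glob("*.md")):
        size = len(path.read_text())
        total += size
        # Flag files that would be silently truncated
        flag = "  <-- WILL BE TRUNCATED" if size > PER_FILE_LIMIT else ""
        lines.append(f"{path.name}: {size:,} / {PER_FILE_LIMIT:,} characters{flag}")
    lines.append(f"Total: {total:,} / {TOTAL_LIMIT:,} characters")
    return lines
```

Running a check like this periodically catches files creeping toward the limit before truncation silently cuts off instructions.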

Truncation Warning

If a file exceeds 20,000 characters:

  • Content is truncated (cut off)
  • No warning given
  • Agent sees incomplete instructions

Solution:

  • Keep soul.md to 15-30 lines
  • Remove unnecessary biographical information
  • Focus on work-relevant instructions only

Optimization Tips

Bad soul.md (bloated):

# About Me

I'm a software engineer with 10 years of experience.
I graduated from University X with a degree in Computer Science.
I worked at Company A, then Company B, then Company C.
My hobbies include reading, hiking, and photography.
I have a dog named Max and a cat named Luna.
I prefer coffee over tea, usually with milk.
I wake up at 6 AM and go to bed at 11 PM.
My favorite programming language is Python.
I also know JavaScript, Go, Rust, and Java.
I'm currently learning machine learning.
...
(100+ more lines of personal information)

Good soul.md (focused):

# Core Instructions

You are a specialized development assistant focused on:
- Code review and optimization
- Technical documentation
- API integration

## Communication Style
- Concise, technical responses
- Include code examples
- Cite sources when relevant

## Work Preferences
- Test-driven development
- Prefer TypeScript over JavaScript
- Follow project conventions

Sub-Agent Behavior

Important: Parallel sub-agents only read:

  • agents.md
  • tools.md

They do NOT read:

  • soul.md
  • memory.md
  • Other bootstrap files

Implication:

  • Sub-agents lack main agent's personality
  • Task instructions must be in agents.md or passed explicitly
  • Sub-agents are "dumber" by design (minimal context)

Layer 2: Session Transcript

What It Is

Full conversation history saved to disk as a file.

Location: ~/.openclaw/sessions/ or similar

Contains:

  • User messages
  • Assistant messages
  • Tool calls and results
  • Timestamps

How It Works

  1. Every message is appended to transcript file
  2. Transcript is rebuilt into context when continuing a session
  3. Persists across restarts - Can resume conversations
  4. Stored in vector database format - Not human-readable

The Compaction Problem

Memory Compaction Process

When context window approaches limit (typically 200K tokens):

  1. Auto-compaction triggers
  2. Old messages are summarized into compact form
  3. Summary replaces detailed history in context
  4. Original transcript still exists on disk (but agent can't see it)

Critical distinction:

  • Raw transcript file = Still on disk, complete
  • Agent's view = Summarized version only

What Survives Compaction

Preserved:

  • Last 20,000 tokens (recent messages)
  • Anything written to bootstrap files
  • General themes and topics

Lost:

  • Exact wording of earlier instructions
  • Nuance and context from old messages
  • Specific constraints mentioned mid-conversation
  • Casual preferences stated in chat
  • Images from earlier in session

The Walter White Problem

Scenario:

Day 1: Long conversation with agent about project X
Day 2: Context compacts overnight
Day 3: Agent asks "Who are you? What project?"

Why it happens:

  • Conversation details were in transcript only
  • Never saved to bootstrap files
  • Compaction summarized away the specifics

Solution:

Always save important information to files, not chat

Lifespan

Before compaction:

  • Full detailed history available
  • Agent remembers exact wording
  • Can reference specific earlier messages

After compaction:

  • Summary + recent 20K tokens only
  • General understanding remains
  • Specific details are lost

Layer 3: Context Window

What It Is

Active working memory - fixed-size container where everything competes for space.

Size by model:

  • Claude Opus/Sonnet: 200,000 tokens
  • GPT-4: 128,000 tokens
  • Gemini Pro: 1,000,000 tokens
  • MiniMax: 200,000 tokens

Conversion: 1 token ≈ 0.75 words (English)
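The rough conversion above can be turned into a quick estimator. The 0.75 words-per-token ratio is the approximation from this guide, not an exact tokenizer:

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate using the ~0.75 words-per-token rule of thumb."""
    words = len(text.split())
    return round(words / 0.75)

# A 150,000-word document is therefore roughly 200,000 tokens --
# about one full Claude context window.
```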

What Fills It

  1. System prompt - OpenClaw's instructions
  2. Bootstrap files - Loaded at session start
  3. Conversation history - Recent messages
  4. Tool results - File reads, web fetches, API responses
  5. Current message - Task being processed

Compaction Trigger

Formula:

Trigger = Context Limit - Reserve Tokens - Soft Threshold

Example (200K context):

200,000 - 40,000 - 4,000 = 156,000 tokens

Compaction fires at 156K, not 200K
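The trigger formula can be written out directly; the defaults (40,000 reserve, 4,000 soft threshold) are the ones given above:

```python
def compaction_trigger(context_limit: int,
                       reserve_tokens: int = 40_000,
                       soft_threshold: int = 4_000) -> int:
    """Token count at which auto-compaction fires, per the formula above."""
    return context_limit - reserve_tokens - soft_threshold

# 200K context with defaults -> compaction fires at 156,000 tokens
```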

Reserve Tokens Floor

Purpose: Space reserved for agent's response

Default: 40,000 tokens

Configurable:

  • Large tasks: Reduce to 20,000
  • Small tasks: Keep at 40,000

Soft Threshold

Purpose: Additional buffer to prevent edge cases

Default: 4,000 tokens

What Competes for Space

Biggest consumers:

  • Tool results - File reads, web snapshots
  • Long conversations - Multi-turn back-and-forth
  • Code blocks - Full file contents
  • Bootstrap files - Injected at session start, occupying space on every turn

Optimization Strategy

Instead of:

Analyze this YouTube video: [link]

(Agent fetches full transcript via API - 50K tokens)

Do this:

  1. Get transcript manually
  2. Save to text file
  3. Upload file

Token savings: Up to 95%

Layer 4: Retrieval Index

What It Is

A searchable archive that sits alongside your memory files and can also index content stored outside them.

Technology:

  • Vector database (SQLite)
  • Hybrid search (keyword + semantic)
  • Embeddings-based retrieval

How It Works

  1. Write information to memory files
  2. OpenClaw indexes the content automatically
  3. Agent searches with memory_search tool
  4. Index returns relevant snippets with file paths
  5. Agent reads full context with memory_get

Two-step process: Search → Retrieve
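The two-step flow can be sketched with a toy index. memory_search and memory_get are the tool names from this guide, but the in-memory dictionary standing in for the real SQLite store, and the keyword-only matching, are simplifications:

```python
# Toy stand-in for the indexed memory files (path -> content)
INDEX = {
    "memory/projects/pricing.md": "Decided on usage-based pricing in March.",
    "memory/daily/2026-05-01.md": "Shipped the auth refactor.",
}

def memory_search(query: str) -> list[str]:
    """Step 1: return paths of files whose content matches the query.
    (The real index also does semantic matching; this is keyword-only.)"""
    terms = query.lower().split()
    return [path for path, text in INDEX.items()
            if any(t in text.lower() for t in terms)]

def memory_get(path: str) -> str:
    """Step 2: read the full content for a path returned by search."""
    return INDEX[path]
```

Search returns snippets and paths cheaply; only the files that actually matter get pulled into context with the second call.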

Enabling Embeddings

Requirement: OpenAI or Gemini API key

Check if enabled:

How does your memory embedding system work?

Agent response should mention:

  • Vector database
  • Semantic search
  • SQLite file in memory directory

If not enabled:

Set up memory embeddings with OpenAI key

Keyword vs. Semantic Search

Keyword search:

  • Exact word matching
  • "Pepsi" finds "Pepsi"
  • Fast but limited

Semantic search:

  • Concept matching
  • "soda" finds "Pepsi", "Coca-Cola", "soft drink"
  • Understands relationships

How Embeddings Work

Simplified explanation:

  1. Text → Numbers - "Pepsi" becomes vector [0.23, 0.87, 0.45, ...]
  2. Similar concepts = Similar numbers - "soda" becomes [0.25, 0.85, 0.43, ...]
  3. Search by similarity - Find vectors close to query vector
  4. Return relevant content - Matches based on meaning, not just words

Why it matters:

  • Computers are good with numbers, not words
  • Vector similarity enables semantic understanding
  • Scales to large memory archives
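The similarity step can be illustrated with the toy vectors above. Note that three-dimensional vectors are purely illustrative; real embeddings have hundreds or thousands of dimensions:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Similarity between two embedding vectors: 1.0 = identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

pepsi = [0.23, 0.87, 0.45]   # toy vector for "Pepsi" from the text
soda  = [0.25, 0.85, 0.43]   # toy vector for "soda"
# The two vectors point in nearly the same direction (similarity > 0.99),
# which is how semantic search matches "soda" to "Pepsi".
```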

Storage Location

Check for SQLite database:

ls -la ~/.openclaw/memory/

Look for:

  • memory.db or similar SQLite file
  • Not human-readable
  • Contains vector embeddings

Use Cases

Scenario 1: Long-term project memory

Agent: "What did we decide about the pricing model?"
Search: "pricing decision"
Result: Finds discussion from 3 weeks ago

Scenario 2: Offloading large datasets

Store: Daily news scraping results in memory files
Query: "What AI developments happened last week?"
Result: Retrieves relevant articles without loading all data

Scenario 3: Cross-session knowledge

Store: Lessons learned from past projects
Query: "How did we handle authentication in Project X?"
Result: Finds implementation notes from months ago

Integration with External Tools

Obsidian + GitHub pattern:

  1. Store large datasets in Obsidian (outside OpenClaw memory)
  2. Sync to GitHub for backup and version control
  3. Agent searches via retrieval index
  4. Fetches relevant content on-demand

Benefits:

  • No memory directory bloat
  • Version-controlled knowledge base
  • Accessible outside OpenClaw

Memory Priority

OpenClaw prioritizes recent memory:

  • Yesterday's work: Easily accessible
  • Last week: Requires search
  • Last month: Needs retrieval index

Without embeddings:

  • Agent may not find old information
  • Relies on bootstrap files and recent transcript

With embeddings:

  • Semantic search finds relevant content regardless of age
  • Scales to months or years of history

How the Layers Work Together

Session Start Flow

1. Bootstrap files loaded from disk
   ↓
2. Session transcript rebuilt into context
   ↓
3. Context window populated with:
   - System prompt
   - Bootstrap files
   - Recent conversation
   ↓
4. Retrieval index ready for queries
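The flow above amounts to concatenating the pieces in a fixed order. A toy assembly, with hypothetical names:

```python
def build_context(system_prompt: str,
                  bootstrap_files: dict[str, str],
                  transcript: list[str]) -> str:
    """Assemble the context window in the order of the flow above:
    system prompt, then bootstrap files, then conversation history."""
    parts = [system_prompt]
    for name, content in bootstrap_files.items():
        parts.append(f"## {name}\n{content}")
    parts.extend(transcript)
    return "\n\n".join(parts)
```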

During Conversation

User message
   ↓
Agent checks context window (active memory)
   ↓
If needed: Searches retrieval index
   ↓
If needed: Reads additional files
   ↓
Generates response
   ↓
Appends to session transcript

When Context Fills

Context reaches 156K tokens
   ↓
Auto-compaction triggers
   ↓
Old messages summarized
   ↓
Summary replaces detailed history
   ↓
Bootstrap files remain intact
   ↓
Retrieval index unaffected
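A minimal sketch of the compaction step above, assuming a summarizer function and the 20K-token recent window described earlier. Both the function names and the word-count token proxy are simplifications:

```python
RECENT_BUDGET = 20_000  # tokens of recent history preserved verbatim

def count_tokens(message: str) -> int:
    """Crude token proxy: one word ~ one token (real systems use a tokenizer)."""
    return len(message.split())

def compact(history: list[str], summarize, budget: int = RECENT_BUDGET) -> list[str]:
    """Keep the most recent messages within budget; summarize the rest."""
    kept, used = [], 0
    for msg in reversed(history):          # walk backwards from newest
        cost = count_tokens(msg)
        if used + cost > budget:
            break
        kept.append(msg)
        used += cost
    kept.reverse()
    older = history[:len(history) - len(kept)]
    if not older:
        return kept                        # everything fit; nothing lost
    return [summarize(older)] + kept       # summary replaces old detail
```

Note what this implies: everything in `older` survives only as whatever `summarize` keeps, which is exactly why instructions stated only in chat get lost.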

Memory Failures: Three Common Types

Failure 1: Bootstrap File Truncation

Symptom: Agent forgets core instructions

Cause: File exceeded 20,000 character limit

Solution:

  1. Check file sizes: /context list
  2. Trim to under 20,000 characters
  3. Remove unnecessary content

Failure 2: Chat Instructions Lost

Symptom: Agent forgets instructions given in conversation

Cause: Instructions never saved to file, lost in compaction

Solution:

Save this instruction to my soul.md file

Failure 3: Retrieval Index Not Enabled

Symptom: Agent can't find old information

Cause: No OpenAI/Gemini API key configured

Solution:

  1. Set up API key
  2. Verify SQLite database exists
  3. Test with memory search

Best Practices

1. Save Important Information to Files

Rule: If it's not in a file, it doesn't exist long-term

Good:

Save this rule to my agents.md file:
"Never delete emails without explicit confirmation"

Bad:

Remember: never delete emails without asking

(Will be lost after compaction)

2. Keep Bootstrap Files Minimal

Target:

  • soul.md: 15-30 lines
  • memory.md: Key facts only
  • Total: Under 150,000 characters

Remove:

  • Personal biography
  • Irrelevant preferences
  • Redundant information

3. Use Retrieval Index for Large Datasets

Don't:

  • Store all data in bootstrap files
  • Load everything into context

Do:

  • Store in memory directory
  • Enable embeddings
  • Search on-demand

4. Organize Memory Directory

Structure:

~/.openclaw/memory/
├── daily/
│   ├── 2026-05-01.md
│   ├── 2026-05-02.md
│   └── 2026-05-03.md
├── projects/
│   ├── project-a.md
│   ├── project-b.md
│   └── project-c.md
├── trading-system/
│   ├── strategy.md
│   ├── rules.md
│   └── performance.md
└── memory.db (SQLite)

Benefits:

  • Easy to navigate
  • Clear organization
  • Scalable structure
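A layout like the tree above can be bootstrapped with a small helper; the subdirectory names are just the examples from this guide:

```python
from pathlib import Path

def create_memory_layout(base: Path) -> None:
    """Create the example memory directory layout shown above."""
    for sub in ("daily", "projects", "trading-system"):
        (base / sub).mkdir(parents=True, exist_ok=True)
```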

5. Distinguish Evergreen vs. Ephemeral

Evergreen (store in memory directory):

  • Trading system rules
  • Project documentation
  • Standard operating procedures

Ephemeral (store externally):

  • Daily news scraping
  • Temporary research
  • One-time analysis

Troubleshooting

"Agent doesn't remember conversation from yesterday"

Cause: Context compacted, details summarized

Solution:

  1. Check if important info was saved to file
  2. If not, re-provide and save to bootstrap file
  3. Enable retrieval index for better recall

"Agent forgets core instructions"

Cause: Bootstrap file truncated or not loaded

Solution:

  1. Check file size: /context list
  2. Verify file is in correct directory
  3. Restart session to reload files

"Can't find information from last month"

Cause: Retrieval index not enabled or not working

Solution:

  1. Verify OpenAI/Gemini API key is set
  2. Check for SQLite database file
  3. Test memory search functionality

"Memory directory is huge"

Cause: Too much data stored locally

Solution:

  1. Move large datasets to external storage (Obsidian, GitHub)
  2. Archive old daily memory files
  3. Keep only evergreen content in memory directory

Advanced Patterns

Hybrid Storage Strategy

Local memory (OpenClaw):

  • Core instructions
  • Current project context
  • Frequently accessed data

External storage (Obsidian/GitHub):

  • Historical data
  • Large datasets
  • Archived projects

Access pattern:

Agent searches local memory first
If not found: Query external storage via retrieval

Memory Compaction Strategy

Proactive approach:

  1. Monitor context usage regularly
  2. Trigger manual compaction at 120K tokens
  3. Review and save important context before compacting

Reactive approach:

  1. Let auto-compaction handle it
  2. Accept some information loss
  3. Rely on bootstrap files for critical data

Duration: 18 minutes
Difficulty: Intermediate
Video Reference: How OpenClaw Memory ACTUALLY Works
