What are Agent Memory Systems?
Memory systems allow agents to retain and recall information beyond the immediate context window. They enable agents to learn from past interactions and maintain coherent long-term behavior.
Types of Memory
Agent memory systems typically combine multiple memory types for different purposes.
Short-Term Memory
The current conversation context, limited by the size of the model's context window.
Long-Term Memory
Persistent storage of past interactions, facts, and learned preferences.
Episodic Memory
Specific past events and interactions that can be recalled.
Semantic Memory
General knowledge and facts extracted from experiences.
Implementation Approaches
Several techniques are commonly used to implement agent memory, often in combination.
Vector Stores
Store embeddings of past interactions for semantic retrieval.
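A minimal sketch of semantic retrieval over stored memories. The `embed` function is a toy bag-of-words stand-in, used here only as an assumption so the example runs without a model; a real system would call an embedding model and usually a dedicated vector database. `VectorMemory` and its method names are hypothetical.

```python
import math
from collections import Counter
from typing import List, Tuple


def embed(text: str) -> Counter:
    # Toy stand-in for an embedding model: a bag-of-words count vector.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class VectorMemory:
    """Hypothetical in-memory store: keeps (text, vector) pairs."""

    def __init__(self) -> None:
        self.items: List[Tuple[str, Counter]] = []

    def add(self, text: str) -> None:
        self.items.append((text, embed(text)))

    def search(self, query: str, k: int = 3) -> List[str]:
        # Rank every stored memory by similarity to the query (brute force).
        q = embed(query)
        ranked = sorted(self.items, key=lambda it: cosine(q, it[1]), reverse=True)
        return [text for text, _ in ranked[:k]]
```

Brute-force ranking is fine for a few thousand memories; larger stores typically rely on an approximate nearest-neighbor index instead.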
Conversation Summaries
Periodically summarize long conversations to preserve key information.
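One way to sketch this is a rolling buffer that keeps the most recent turns verbatim and folds older ones into a running summary. `SummaryBuffer` is a hypothetical name, and `naive_summarize` is a placeholder that only truncates text; in practice that callable would be an LLM call.

```python
from typing import Callable, List


class SummaryBuffer:
    """Keeps the last `max_turns` turns verbatim; older turns go into a summary."""

    def __init__(self, summarize: Callable[[str, List[str]], str], max_turns: int = 4):
        self.summarize = summarize
        self.max_turns = max_turns
        self.summary = ""
        self.turns: List[str] = []

    def add_turn(self, turn: str) -> None:
        self.turns.append(turn)
        if len(self.turns) > self.max_turns:
            # Split off the turns that no longer fit and fold them into the summary.
            overflow = self.turns[: -self.max_turns]
            self.turns = self.turns[-self.max_turns :]
            self.summary = self.summarize(self.summary, overflow)

    def context(self) -> str:
        # Text that would be placed in the prompt: summary first, then recent turns.
        header = [f"Summary so far: {self.summary}"] if self.summary else []
        return "\n".join(header + self.turns)


def naive_summarize(previous: str, old_turns: List[str]) -> str:
    # Placeholder (assumption): truncate each old turn instead of calling an LLM.
    return " | ".join(filter(None, [previous] + [t[:40] for t in old_turns]))
```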
Key-Value Stores
Store explicit facts and user preferences for direct lookup.
Hybrid Memory Patterns
Modern agents combine episodic and semantic memory for human-like recall. The MemGPT pattern and similar architectures treat memory as a first-class resource the agent actively manages.
MemGPT Architecture
Agents manage memory explicitly, moving data between fast context and slow storage much as an operating system moves data between RAM and disk.
Tiered Memory
Hot (context), warm (vector cache), and cold (archive) tiers with automatic promotion and demotion based on access patterns.
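The promotion and demotion logic can be sketched with three containers and a least-recently-used policy. This is an illustrative assumption, not a specific framework's design: `TieredMemory` is a hypothetical class, the warm tier is a plain dictionary standing in for a vector cache, and the cold tier stands in for an archive.

```python
from collections import OrderedDict


class TieredMemory:
    def __init__(self, hot_size: int = 2, warm_size: int = 3):
        self.hot = OrderedDict()   # in-context items, most recently used last
        self.warm = OrderedDict()  # stand-in for a vector cache
        self.cold = {}             # stand-in for an archive
        self.hot_size = hot_size
        self.warm_size = warm_size

    def put(self, key, value):
        # Remove stale copies from lower tiers, then insert as most recent in hot.
        self.warm.pop(key, None)
        self.cold.pop(key, None)
        self.hot[key] = value
        self.hot.move_to_end(key)
        self._demote()

    def get(self, key):
        for tier in (self.hot, self.warm, self.cold):
            if key in tier:
                value = tier.pop(key)
                self.put(key, value)  # any access promotes the item to hot
                return value
        return None

    def _demote(self):
        # Push least recently used items down one tier when a tier overflows.
        while len(self.hot) > self.hot_size:
            k, v = self.hot.popitem(last=False)
            self.warm[k] = v
        while len(self.warm) > self.warm_size:
            k, v = self.warm.popitem(last=False)
            self.cold[k] = v
```

Here every access promotes straight to the hot tier; a production system might instead promote only after repeated hits, to avoid thrashing.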
Self-Editing Memory
Agents update, consolidate, and restructure their own memories instead of relying on append-only storage.
Dual Encoder Retrieval
Separate encoders for queries and for stored memories enable asymmetric retrieval, with each encoder optimized for the kind of text it handles.
Temporal Knowledge Graphs
Store memories with explicit time relationships, enabling queries like "what did we discuss last week?" and detecting knowledge drift over time.
Entity-Relationship Tracking
Extract entities (people, projects, concepts) and their relationships from conversations, building a queryable knowledge graph.
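As a rough sketch, the resulting graph can be stored as subject, relation, object triples. The extraction step itself (usually an LLM or NER model) is assumed to have already produced the triples; `EntityGraph` and the example relation names are hypothetical.

```python
from collections import defaultdict
from typing import List, Optional


class EntityGraph:
    def __init__(self) -> None:
        # subject -> set of (relation, object) pairs
        self.edges = defaultdict(set)

    def add(self, subject: str, relation: str, obj: str) -> None:
        self.edges[subject].add((relation, obj))

    def query(self, subject: str, relation: Optional[str] = None) -> List[str]:
        # Return objects linked to `subject`, optionally filtered by relation.
        pairs = self.edges.get(subject, ())
        return sorted(o for r, o in pairs if relation in (None, r))
```

For example, after `g.add("Alice", "works_on", "learn-guide")`, calling `g.query("Alice", "works_on")` returns the projects linked to that person.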
Time-Weighted Retrieval
Combine semantic similarity with recency, importance, and access frequency for more relevant recall.
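A possible scoring function is a weighted sum of the four signals. The weights, the 24-hour half-life, and the log-based frequency term below are illustrative assumptions rather than recommended values, and `Memory` and `score` are hypothetical names.

```python
import math
from dataclasses import dataclass


@dataclass
class Memory:
    text: str
    similarity: float   # semantic similarity to the query, in [0, 1]
    importance: float   # assigned importance, in [0, 1]
    age_hours: float    # time since the memory was last touched
    access_count: int   # how often it has been retrieved


def score(m: Memory, half_life_hours: float = 24.0,
          weights=(0.5, 0.25, 0.15, 0.10)) -> float:
    w_sim, w_rec, w_imp, w_freq = weights
    # Recency halves every `half_life_hours`.
    recency = 0.5 ** (m.age_hours / half_life_hours)
    # Frequency saturates: 0 for never-accessed, approaching 1 slowly.
    frequency = 1.0 - 1.0 / (1.0 + math.log1p(m.access_count))
    return (w_sim * m.similarity + w_rec * recency
            + w_imp * m.importance + w_freq * frequency)
```

With these weights, a slightly less similar but recent memory can outrank a more similar one that is a month old, which is the intended effect of time weighting.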
Zep-Style Memory
Automatic extraction of facts, entities, and temporal relations with bi-temporal modeling (when something happened vs. when it was recorded).
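To make the bi-temporal idea concrete, each fact can carry two timestamps: when it became true and when the system recorded it. The sketch below is a generic illustration and does not reproduce Zep's actual API or schema; `Fact`, `BiTemporalStore`, and `as_of` are hypothetical names.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Fact:
    subject: str
    predicate: str
    value: str
    valid_from: datetime   # when the fact became true in the world
    recorded_at: datetime  # when the system learned about it


class BiTemporalStore:
    def __init__(self) -> None:
        self.facts: List[Fact] = []

    def record(self, fact: Fact) -> None:
        self.facts.append(fact)  # append-only: history is never overwritten

    def as_of(self, subject: str, predicate: str,
              valid_at: datetime, known_at: datetime) -> Optional[str]:
        # "What was true at `valid_at`, given only what was known at `known_at`?"
        candidates = [
            f for f in self.facts
            if f.subject == subject and f.predicate == predicate
            and f.valid_from <= valid_at and f.recorded_at <= known_at
        ]
        if not candidates:
            return None
        return max(candidates, key=lambda f: (f.valid_from, f.recorded_at)).value
```

Keeping both timestamps lets the agent distinguish "the preference changed on March 1" from "we only found out on March 10".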
Memory Management
Production memory systems require active management to stay within token budgets while preserving valuable information.
Deduplication
Detect and merge semantically similar memories to prevent bloat. Use embedding similarity thresholds or LLM-based comparison.
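A minimal sketch of the embedding-threshold variant, assuming each memory already comes with an embedding vector. The 0.9 threshold is an arbitrary illustrative value, and this version simply keeps the first of each near-duplicate group rather than merging their contents.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def deduplicate(memories: List[Tuple[str, Sequence[float]]],
                threshold: float = 0.9) -> List[Tuple[str, Sequence[float]]]:
    kept: List[Tuple[str, Sequence[float]]] = []
    for text, vec in memories:
        # Keep a memory only if it is not too similar to anything already kept.
        if all(cosine(vec, kept_vec) < threshold for _, kept_vec in kept):
            kept.append((text, vec))
    return kept
```

The pairwise comparison is quadratic in the number of memories, so larger stores would usually compare each new memory only against its nearest neighbors.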
Token Budgets
Allocate fixed token counts to different memory types. When budget is exceeded, compress or evict lowest-priority items.
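The eviction half of this rule can be sketched as follows. Token counting by whitespace split is a crude assumption (a real system would use the model's tokenizer), and `fit_to_budget` is a hypothetical helper.

```python
from typing import List, Tuple


def count_tokens(text: str) -> int:
    # Crude approximation (assumption): one token per whitespace-separated word.
    return len(text.split())


def fit_to_budget(items: List[Tuple[int, str]], budget: int) -> List[Tuple[int, str]]:
    """items are (priority, text) pairs; higher priority is kept longer.

    Returns the surviving items in their original order.
    """
    total = sum(count_tokens(text) for _, text in items)
    kept = list(items)
    # Evict from lowest priority upward until the total fits.
    for item in sorted(items, key=lambda it: it[0]):
        if total <= budget:
            break
        kept.remove(item)
        total -= count_tokens(item[1])
    return kept
```

A fuller version would try compressing an item (for example, summarizing it) before evicting it outright.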
Garbage Collection
Periodically scan memories for stale, redundant, or low-value entries. Common eviction strategies include LRU (least recently used), LFU (least frequently used), and importance-weighted eviction.
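A periodic sweep might look like the sketch below, which evicts entries that are stale or low in importance unless they are pinned. The one-week idle limit and the 0.2 importance floor are illustrative assumptions, and `Entry` and `sweep` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Entry:
    text: str
    last_access: float   # seconds since the epoch
    importance: float    # in [0, 1]
    pinned: bool = False


def sweep(entries: List[Entry], now: float,
          max_idle: float = 7 * 24 * 3600,
          min_importance: float = 0.2) -> Tuple[List[Entry], List[Entry]]:
    kept, evicted = [], []
    for e in entries:
        stale = now - e.last_access > max_idle
        low_value = e.importance < min_importance
        # Pinned entries survive regardless of age or importance.
        if (stale or low_value) and not e.pinned:
            evicted.append(e)
        else:
            kept.append(e)
    return kept, evicted
```

Returning the evicted entries, rather than dropping them silently, leaves room to archive or summarize them first.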
Priority Rules
Define what memories matter most: user preferences, task context, recent interactions, or explicitly pinned facts.
Adaptive Retention
Adaptive retention strategies decide what to keep, summarize, or forget, loosely mimicking how human memory decays and consolidates.
Context Summarization
Summarize progressively: keep full detail for recent context, summaries for older conversations, and only key facts for the distant past.
Entity Extraction
Automatically identify and store important entities (names, preferences, decisions) separately from raw conversation logs.
Decay Strategies
Exponential or logarithmic decay functions reduce memory importance over time unless reinforced by access or explicit importance markers.
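The two decay shapes can be compared with a few lines of code. The half-life, the `rate` constant, and the particular logarithmic form are illustrative assumptions (there is no single standard definition of logarithmic decay), and `reinforce` is a hypothetical helper for the "reinforced by access" case.

```python
import math


def exponential_decay(importance: float, age_hours: float,
                      half_life_hours: float = 72.0) -> float:
    # Importance halves every `half_life_hours`.
    return importance * 0.5 ** (age_hours / half_life_hours)


def logarithmic_decay(importance: float, age_hours: float,
                      rate: float = 0.5) -> float:
    # Drops quickly at first, then flattens out; never reaches zero.
    return importance / (1.0 + rate * math.log1p(age_hours))


def reinforce(importance: float, boost: float = 0.2) -> float:
    # An access bumps importance, capped at 1.0; the caller also resets the age.
    return min(1.0, importance + boost)
```

Exponential decay forgets old memories almost completely, while the logarithmic form leaves a long tail, so the choice depends on whether distant memories should ever resurface.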
Key Takeaways
- Memory extends agent capabilities beyond the context window
- Combine multiple memory types for best results
- Memory retrieval adds latency, so balance comprehensiveness with speed
- Consider privacy and data retention when storing memories
- Hybrid patterns like MemGPT enable agents to manage their own memory like an operating system
- Temporal knowledge graphs add time-awareness for more contextual recall