Memory (RAG)
Overview
Memory Settings
public class AgentMemorySettings
{
// Basic Settings
public bool autoSave = true; // Auto-save conversations
public bool generateTitle = true; // Generate conversation titles
public int maxContextMessages = 20; // Max messages in context (10-100)
// RAG Settings
public bool useVectorStore = false; // Enable long-term RAG retrieval
public string summaryModelId; // Model for summarization
public string embeddingModelId; // Model for embeddings
public int retrievalTopK = 8; // Top K results (1-32)
public float retrievalMinSim = 0.5f; // Min similarity threshold (0.0-1.0)
}
Basic Settings
Auto Save
Generate Title
Max Context Messages
RAG Settings
Use Vector Store
Embedding Model
Retrieval Top K
Retrieval Min Similarity
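Putting the settings above together, a typical RAG-enabled configuration might look like the following sketch. The model ids are placeholders, not values the SDK prescribes:

```csharp
// Illustrative only: enable long-term RAG retrieval on top of the
// default short-term context window.
var settings = new AgentMemorySettings
{
    autoSave = true,               // persist conversations automatically
    generateTitle = true,          // have the summary model title new conversations
    maxContextMessages = 20,       // short-term window (valid range 10-100)

    useVectorStore = true,         // turn on long-term RAG retrieval
    summaryModelId = "summary-model-id",      // placeholder id
    embeddingModelId = "embedding-model-id",  // placeholder id
    retrievalTopK = 8,             // return up to 8 chunks per query (1-32)
    retrievalMinSim = 0.5f         // discard matches below 0.5 similarity
};
```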
How Memory Works
Short-Term Context (Without RAG)
Long-Term Retrieval (With RAG)
Detailed Workflow
Message
Similarity
Recency
Same Thread
Final Score
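Retrieved messages are ranked by a final score that combines vector similarity with recency and a same-thread bonus. The exact weighting is implementation-defined; the sketch below only illustrates the idea, and all weights are assumptions:

```csharp
// Hypothetical scoring sketch -- the real weights are implementation-defined.
float ScoreChunk(float similarity, System.TimeSpan age, bool sameThread)
{
    // Vector similarity (0.0-1.0) is the dominant signal.
    float score = similarity;

    // Recency: decay the bonus for older memories, e.g. halving every 7 days.
    float recency = (float)System.Math.Exp(-age.TotalDays / 7.0);
    score += 0.1f * recency;

    // Messages from the same thread as the query get a small fixed boost.
    if (sameThread) score += 0.1f;

    return score; // chunks below retrievalMinSim were already filtered out
}
```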
Configuration Examples
Minimal Memory (Token Efficient)
Balanced Memory (Recommended)
Maximum Memory (Knowledge Intensive)
Custom Configuration
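The presets above might translate to settings like these. The values are illustrative trade-off points, not normative defaults:

```csharp
// Minimal: smallest context window, RAG off -- cheapest in tokens.
var minimal = new AgentMemorySettings
{
    maxContextMessages = 10,
    useVectorStore = false
};

// Balanced: moderate window plus RAG at the documented defaults.
var balanced = new AgentMemorySettings
{
    maxContextMessages = 20,
    useVectorStore = true,
    retrievalTopK = 8,
    retrievalMinSim = 0.5f
};

// Maximum: large window and aggressive retrieval for knowledge-heavy agents.
var maximum = new AgentMemorySettings
{
    maxContextMessages = 100,
    useVectorStore = true,
    retrievalTopK = 32,
    retrievalMinSim = 0.3f
};
```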
Conversation Stores
Local File Store
Threads API Store (OpenAI)
Conversations API Store (OpenAI)
Realtime API Store (OpenAI)
Custom Store
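A custom store would implement the same contract the built-in stores satisfy. The interface below is a hypothetical sketch of that contract, not the SDK's actual API; `Conversation` and `ConversationInfo` are assumed types:

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical contract a custom conversation store might implement.
public interface IConversationStore
{
    Task<string> CreateAsync(string title);            // returns a conversation id
    Task<Conversation> LoadAsync(string id);           // load one conversation
    Task<IReadOnlyList<ConversationInfo>> ListAsync(); // enumerate stored conversations
    Task SaveAsync(Conversation conversation);         // persist changes
    Task DeleteAsync(string id);                       // remove permanently
}
```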
Memory Management APIs
Creating Conversations
Loading Conversations
Listing Conversations
Saving Conversations
Deleting Conversations
Accessing Conversation Data
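The operations above can be sketched end to end as follows. The store and its method names are assumptions for illustration (inside an async method), and `Debug.Log` assumes a Unity context:

```csharp
// Hypothetical usage sketch; method and property names are assumptions.
var id = await store.CreateAsync("Support session");   // create
var conversation = await store.LoadAsync(id);          // load

foreach (var info in await store.ListAsync())          // list
    Debug.Log(info.Title);

await store.SaveAsync(conversation);                   // save (or rely on autoSave)
await store.DeleteAsync(id);                           // delete
```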
RAG Performance Tuning
Optimizing Retrieval Quality
Optimizing Token Usage
Optimizing Response Speed
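The three tuning goals pull on the same few fields of `AgentMemorySettings`. A sketch of the typical trade-offs:

```csharp
// Illustrative trade-offs using the fields defined above.
settings.retrievalTopK = 4;       // fewer chunks -> fewer tokens, faster responses
settings.retrievalMinSim = 0.7f;  // stricter threshold -> higher precision, more misses
settings.maxContextMessages = 10; // smaller window -> leaner prompts, less verbatim recall
```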
Best Practices
1. Choose Appropriate Store Types
2. Enable RAG for Long Conversations
3. Balance Context Window and RAG
4. Monitor Token Usage
5. Implement Smart Caching
6. Handle Vector Store Initialization
7. Clean Up Old Conversations
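A periodic cleanup pass keeps local stores from growing without bound. The sketch below assumes a store with list/delete operations and a `LastUpdated` timestamp on conversation metadata; none of these names are confirmed SDK API:

```csharp
// Hypothetical cleanup pass (inside an async method): delete conversations
// untouched for 30+ days. Store API and metadata fields are assumptions.
var cutoff = System.DateTime.UtcNow.AddDays(-30);
foreach (var info in await store.ListAsync())
{
    if (info.LastUpdated < cutoff)
        await store.DeleteAsync(info.Id);
}
```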
Troubleshooting
RAG Not Working
High Token Costs
Slow Response Times
Conversations Not Persisting
Memory Leaks with Large Conversations
Related Documentation