DRAFT Agentic Design Patterns - Memory Management
Building AI agents that remember past interactions, maintain context across sessions, and learn from experience using TypeScript, LangChain, LangGraph, and Vercel's serverless platform.
Mental Model: Building Memory-First Agents
Think of agent memory like a web application's state management system on steroids. Just as React apps use Redux for state, Context API for shared data, and localStorage for persistence, AI agents need working memory (like useState), episodic memory (like session storage), semantic memory (like a database), and procedural memory (like cached computations). LangGraph acts as your memory orchestrator (similar to Redux), while external stores like Redis and vector databases serve as your persistent layer. The key difference: agent memory must handle both structured data and semantic understanding, operating efficiently within serverless constraints.
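To make the mapping concrete, here is a minimal sketch of the four tiers as TypeScript types. These interfaces are purely illustrative, not part of LangChain or LangGraph; they only show the kind of data each tier tends to hold.
// lib/memory/types.ts (illustrative only; not a library API)
// Working memory: the current turn's context (like useState)
interface WorkingContext {
  recentMessages: string[];
  activeGoal?: string;
}
// Episodic memory: what happened in past sessions (like session storage)
interface EpisodicRecord {
  sessionId: string;
  summary: string;
  endedAt: number;
}
// Semantic memory: facts retrievable by meaning (like a database plus embeddings)
interface SemanticFact {
  fact: string;
  embedding: number[];
}
// Procedural memory: reusable results keyed by the task that produced them (like cached computations)
interface ProceduralEntry {
  taskSignature: string;
  cachedResult: unknown;
}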
Basic Example: Conversation Buffer Memory
1. Simple Buffer Memory Implementation
// lib/memory/buffer-memory.ts
import { BufferMemory } from 'langchain/memory';
import { ChatGoogleGenerativeAI } from '@langchain/google-genai';
import { ConversationChain } from 'langchain/chains';
export function createBasicMemoryChain() {
const model = new ChatGoogleGenerativeAI({
modelName: 'gemini-2.5-flash',
temperature: 0.7,
});
const memory = new BufferMemory({
returnMessages: true,
memoryKey: 'history',
});
const chain = new ConversationChain({
llm: model,
memory,
verbose: false, // Set to true for debugging
});
return chain;
}
// app/api/chat-memory/route.ts
import { createBasicMemoryChain } from '@/lib/memory/buffer-memory';
import { BufferMemory } from 'langchain/memory';
import { NextResponse } from 'next/server';
export const runtime = 'nodejs';
export const maxDuration = 60;
// Module scope: every request hitting the same warm instance shares this chain
// (and its buffer); the buffer is lost on cold starts.
const chain = createBasicMemoryChain();
export async function POST(req: Request) {
  const { message } = await req.json();
  const result = await chain.call({ input: message });
  // ConversationChain types `memory` as BaseMemory, so cast to read the history
  const history = await (chain.memory as BufferMemory).chatHistory.getMessages();
  return NextResponse.json({
    response: result.response,
    memorySize: history.length,
  });
}
Creates a basic conversation chain with BufferMemory that keeps every message for the life of the server instance. Because the chain lives at module scope, all requests that reach the same warm serverless instance share one history, and it resets on cold starts, so treat this as a demo rather than per-user production memory.
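A quick way to see the buffer at work is to send two related turns to the route; the second question only makes sense if the first was remembered. A sketch, assuming the dev server runs on localhost:3000:
// scripts/try-buffer-memory.ts (illustrative; run with the dev server started)
async function ask(message: string) {
  const res = await fetch('http://localhost:3000/api/chat-memory', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ message }),
  });
  return res.json();
}
async function main() {
  // First turn establishes a fact...
  console.log(await ask('My favorite city is Lisbon.'));
  // ...the second turn only answers correctly if the buffer kept it.
  console.log(await ask('What is my favorite city?'));
}
main();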
2. Window Memory with Limited Context
// lib/memory/window-memory.ts
import { BufferWindowMemory } from 'langchain/memory';
import { ChatGoogleGenerativeAI } from '@langchain/google-genai';
import { ConversationChain } from 'langchain/chains';
import { takeRight } from 'es-toolkit';
interface WindowMemoryConfig {
windowSize?: number;
returnMessages?: boolean;
}
export class WindowMemoryManager {
private chain: ConversationChain;
private windowSize: number;
constructor(config: WindowMemoryConfig = {}) {
const model = new ChatGoogleGenerativeAI({
modelName: 'gemini-2.5-flash',
temperature: 0.7,
});
this.windowSize = config.windowSize || 10;
const memory = new BufferWindowMemory({
k: this.windowSize,
returnMessages: config.returnMessages ?? true,
});
this.chain = new ConversationChain({
llm: model,
memory,
});
}
  async processMessage(message: string) {
    const result = await this.chain.call({ input: message });
    // Inspect the stored history (cast needed: ConversationChain types memory as BaseMemory)
    const messages = await (this.chain.memory as BufferWindowMemory).chatHistory.getMessages();
    const windowMessages = takeRight(messages, this.windowSize * 2); // user + AI message pairs
    return {
      response: result.response,
      windowSize: windowMessages.length,
      droppedMessages: Math.max(0, messages.length - windowMessages.length),
    };
  }
}
Implements a sliding window memory that keeps only the last K message pairs, automatically dropping older context to stay within limits.
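A small script-style sketch (model and API key assumed to be configured) makes the trade-off visible: once enough turns accumulate, early facts fall out of the window and droppedMessages starts climbing.
// scripts/try-window-memory.ts (illustrative usage of WindowMemoryManager)
import { WindowMemoryManager } from '@/lib/memory/window-memory';

async function demo() {
  const manager = new WindowMemoryManager({ windowSize: 2 }); // keep only 2 exchanges
  const turns = [
    'My name is Ada.',
    'I work on compilers.',
    'I live in Zurich.',
    'What is my name?', // likely already outside the window
  ];
  for (const question of turns) {
    const { response, windowSize, droppedMessages } = await manager.processMessage(question);
    console.log({ question, response, windowSize, droppedMessages });
  }
}
demo();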
3. Summary Memory for Long Conversations
// lib/memory/summary-memory.ts
import { ConversationSummaryMemory } from 'langchain/memory';
import { ChatGoogleGenerativeAI } from '@langchain/google-genai';
import { ConversationChain } from 'langchain/chains';
export class SummaryMemoryManager {
  private chain: ConversationChain;
  private memory: ConversationSummaryMemory;
  constructor() {
    const model = new ChatGoogleGenerativeAI({
      modelName: 'gemini-2.5-pro',
      temperature: 0.7,
    });
    // A cheaper, deterministic model handles the summarization calls
    const summaryModel = new ChatGoogleGenerativeAI({
      modelName: 'gemini-2.5-flash',
      temperature: 0,
    });
    this.memory = new ConversationSummaryMemory({
      llm: summaryModel,
      returnMessages: false,
      inputKey: 'input',
      outputKey: 'response',
    });
    this.chain = new ConversationChain({
      llm: model,
      memory: this.memory,
    });
  }
  async processWithSummary(message: string) {
    // ConversationSummaryMemory re-summarizes inside saveContext, which
    // ConversationChain calls after every turn, so no manual update is needed.
    const result = await this.chain.call({ input: message });
    return {
      response: result.response,
      currentSummary: this.memory.buffer,
    };
  }
}
Uses an LLM to progressively summarize the conversation, maintaining context while reducing token usage in long interactions.
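Wiring the manager into a route mirrors the earlier example; a sketch (the route path is an assumption) that returns the running summary alongside each reply so the client can show how the conversation is being compressed:
// app/api/summary-chat/route.ts (illustrative wiring for SummaryMemoryManager)
import { SummaryMemoryManager } from '@/lib/memory/summary-memory';
import { NextResponse } from 'next/server';

export const runtime = 'nodejs';
export const maxDuration = 60;

// Shared per warm instance, like the earlier demo routes
const manager = new SummaryMemoryManager();

export async function POST(req: Request) {
  const { message } = await req.json();
  const result = await manager.processWithSummary(message);
  return NextResponse.json(result);
}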
Advanced Example: Multi-Tier Memory System with LangGraph
1. State-Based Memory with LangGraph
// lib/memory/langgraph-memory.ts
import { StateGraph, Annotation } from '@langchain/langgraph';
import { BaseMessage, SystemMessage } from '@langchain/core/messages';
import { ChatGoogleGenerativeAI } from '@langchain/google-genai';
import { MemorySaver } from '@langchain/langgraph';
import { takeRight } from 'es-toolkit';
// Define the state structure
const StateAnnotation = Annotation.Root({
messages: Annotation<BaseMessage[]>({
reducer: (x, y) => [...x, ...y],
default: () => [],
}),
summary: Annotation<string>({
reducer: (x, y) => y || x,
default: () => '',
}),
userProfile: Annotation<Record<string, any>>({
reducer: (x, y) => ({ ...x, ...y }),
default: () => ({}),
}),
sessionMetadata: Annotation<Record<string, any>>({
reducer: (x, y) => ({ ...x, ...y }),
default: () => ({ startTime: Date.now() }),
}),
});
export function createStatefulMemoryAgent() {
const model = new ChatGoogleGenerativeAI({
modelName: 'gemini-2.5-pro',
temperature: 0.7,
});
const summaryModel = new ChatGoogleGenerativeAI({
modelName: 'gemini-2.5-flash',
temperature: 0,
});
const workflow = new StateGraph(StateAnnotation)
.addNode('processMessage', async (state) => {
      // Build the prompt from the profile, the running summary, and the recent window
      const contextMessages = [
        new SystemMessage(`User Profile: ${JSON.stringify(state.userProfile)}`),
        new SystemMessage(`Summary of past: ${state.summary}`),
        ...takeRight(state.messages, 10), // most recent messages
      ];
const response = await model.invoke(contextMessages);
return {
messages: [response],
};
})
.addNode('updateProfile', async (state) => {
      // Extract user preferences from the most recent turns
      const recentMessages = takeRight(state.messages, 5);
const analysis = await summaryModel.invoke([
new SystemMessage('Extract user preferences and facts as JSON'),
...recentMessages,
]);
try {
const preferences = JSON.parse(analysis.content as string);
return { userProfile: preferences };
} catch {
return {};
}
})
    .addNode('summarizeIfNeeded', async (state) => {
      // Summarize once the transcript exceeds the threshold
      if (state.messages.length > 20) {
        const oldMessages = state.messages.slice(0, -10);
        const summary = await summaryModel.invoke([
          new SystemMessage('Summarize this conversation concisely'),
          ...oldMessages,
        ]);
        // Note: the messages reducer appends, so returning a trimmed array here
        // would not shrink the stored state. The summary stands in for older turns,
        // and processMessage only feeds the model the most recent window anyway.
        return {
          summary: summary.content as string,
        };
      }
      return {};
    });
// Define the flow
workflow.addEdge('__start__', 'processMessage');
workflow.addEdge('processMessage', 'updateProfile');
workflow.addEdge('updateProfile', 'summarizeIfNeeded');
workflow.addEdge('summarizeIfNeeded', '__end__');
// Add persistence
const checkpointer = new MemorySaver();
return workflow.compile({ checkpointer });
}
// app/api/stateful-chat/route.ts
import { createStatefulMemoryAgent } from '@/lib/memory/langgraph-memory';
import { HumanMessage } from '@langchain/core/messages';
import { NextResponse } from 'next/server';
export const runtime = 'nodejs';
export const maxDuration = 300;
const agent = createStatefulMemoryAgent();
export async function POST(req: Request) {
const { message, threadId } = await req.json();
const result = await agent.invoke(
{
messages: [new HumanMessage(message)],
},
{
configurable: { thread_id: threadId || 'default' },
}
);
return NextResponse.json({
response: result.messages[result.messages.length - 1].content,
profile: result.userProfile,
messageCount: result.messages.length,
});
}
Implements a sophisticated memory system with automatic summarization, user profiling, and thread-based persistence using LangGraph.
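Because the graph is compiled with a checkpointer, a thread's accumulated state can also be read back without running it. A sketch of a GET handler added to the same route file (it must share the module-level agent, since MemorySaver keeps checkpoints in that instance's memory):
// app/api/stateful-chat/route.ts (added below the POST handler in the same file)
export async function GET(req: Request) {
  const { searchParams } = new URL(req.url);
  const threadId = searchParams.get('threadId') ?? 'default';
  // Reads the latest checkpoint for this thread without invoking the graph
  const snapshot = await agent.getState({
    configurable: { thread_id: threadId },
  });
  return NextResponse.json({
    messageCount: snapshot.values.messages?.length ?? 0,
    summary: snapshot.values.summary,
    userProfile: snapshot.values.userProfile,
  });
}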
2. Vector Memory with Semantic Search
// lib/memory/vector-memory.ts
import { MemoryVectorStore } from 'langchain/vectorstores/memory';
import { GoogleGenerativeAIEmbeddings } from '@langchain/google-genai';
import { VectorStoreRetrieverMemory } from 'langchain/memory';
import { ChatGoogleGenerativeAI } from '@langchain/google-genai';
import { ConversationChain } from 'langchain/chains';
import { Document } from '@langchain/core/documents';
import { sortBy } from 'es-toolkit';
export class SemanticMemoryManager {
private vectorStore: MemoryVectorStore;
private memory: VectorStoreRetrieverMemory;
private chain: ConversationChain;
private embeddings: GoogleGenerativeAIEmbeddings;
constructor() {
this.embeddings = new GoogleGenerativeAIEmbeddings({
modelName: 'embedding-001',
});
this.vectorStore = new MemoryVectorStore(this.embeddings);
this.memory = new VectorStoreRetrieverMemory({
vectorStoreRetriever: this.vectorStore.asRetriever(5),
memoryKey: 'history',
inputKey: 'input',
outputKey: 'response',
returnDocs: true,
});
const model = new ChatGoogleGenerativeAI({
modelName: 'gemini-2.5-pro',
temperature: 0.7,
});
this.chain = new ConversationChain({
llm: model,
memory: this.memory,
});
}
async addMemory(content: string, metadata: Record<string, any> = {}) {
const doc = new Document({
pageContent: content,
metadata: {
...metadata,
timestamp: Date.now(),
},
});
await this.vectorStore.addDocuments([doc]);
}
async searchMemories(query: string, k: number = 5) {
const results = await this.vectorStore.similaritySearchWithScore(query, k);
// Sort by relevance and recency
const sorted = sortBy(results, [
(r) => -r[1], // Score (descending)
(r) => -r[0].metadata.timestamp, // Timestamp (descending)
]);
return sorted.map(([doc, score]) => ({
content: doc.pageContent,
metadata: doc.metadata,
relevanceScore: score,
}));
}
async processWithSemanticMemory(message: string) {
// Search for relevant past interactions
const relevantMemories = await this.searchMemories(message, 3);
// Add relevant context to the conversation
const contextualMessage = relevantMemories.length > 0
? `Relevant context: ${relevantMemories.map(m => m.content).join('\n')}\n\nUser: ${message}`
: message;
const result = await this.chain.call({ input: contextualMessage });
// Store the interaction
await this.addMemory(
`User: ${message}\nAssistant: ${result.response}`,
{ type: 'conversation' }
);
return {
response: result.response,
relevantMemories: relevantMemories.slice(0, 2),
      totalMemories: this.vectorStore.memoryVectors.length,
};
}
}
Creates a semantic memory system using vector embeddings to find relevant past interactions based on meaning rather than recency.
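A short sketch of how the manager might be seeded and queried; the facts below are illustrative, and the point is that retrieval matches on meaning, so a question about a "meal app" surfaces the "recipe-planning" memory even without shared keywords:
// scripts/try-semantic-memory.ts (illustrative usage of SemanticMemoryManager)
import { SemanticMemoryManager } from '@/lib/memory/vector-memory';

async function demo() {
  const memory = new SemanticMemoryManager();
  // Seed a couple of facts outside of any conversation
  await memory.addMemory('User prefers TypeScript over JavaScript', { type: 'preference' });
  await memory.addMemory('User is building a recipe-planning app', { type: 'project' });
  // Retrieval is semantic, not keyword-based
  const result = await memory.processWithSemanticMemory(
    'Any suggestions for structuring my meal app?'
  );
  console.log(result.relevantMemories);
  console.log(result.response);
}
demo();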
3. Persistent Memory with Redis for Serverless
// lib/memory/redis-memory.ts
import { Redis } from '@upstash/redis';
import { BufferMemory, ChatMessageHistory } from 'langchain/memory';
import { AIMessage, HumanMessage, BaseMessage } from '@langchain/core/messages';
import { ChatGoogleGenerativeAI } from '@langchain/google-genai';
import { ConversationChain } from 'langchain/chains';
import { chunk } from 'es-toolkit';
const redis = new Redis({
url: process.env.UPSTASH_REDIS_URL!,
token: process.env.UPSTASH_REDIS_TOKEN!,
});
export class RedisMemoryManager {
private sessionTTL = 3600; // 1 hour
private maxMessagesPerSession = 100;
  // Load the session straight from Redis on every request; serverless instances are
  // short-lived, and caching parsed sessions in process memory risks serving stale data.
  private async loadSession(sessionId: string) {
    const data = await redis.get(`session:${sessionId}`);
    return (data as any) || { messages: [], metadata: {} };
  }
async getMemory(sessionId: string): Promise<BufferMemory> {
const session = await this.loadSession(sessionId);
const messages = session.messages.map((msg: any) =>
msg.type === 'human'
? new HumanMessage(msg.content)
: new AIMessage(msg.content)
);
const chatHistory = new ChatMessageHistory(messages);
return new BufferMemory({
chatHistory,
returnMessages: true,
memoryKey: 'history',
});
}
async saveMemory(sessionId: string, messages: BaseMessage[]) {
// Keep only recent messages for serverless constraints
const recentMessages = messages.slice(-this.maxMessagesPerSession);
const serialized = recentMessages.map(msg => ({
type: msg._getType(),
content: msg.content,
}));
const session = {
messages: serialized,
metadata: {
lastAccess: Date.now(),
messageCount: serialized.length,
},
};
await redis.setex(
`session:${sessionId}`,
this.sessionTTL,
JSON.stringify(session)
);
// Update session index for cleanup
await redis.zadd('sessions:active', {
score: Date.now(),
member: sessionId,
});
}
async processWithPersistence(
sessionId: string,
message: string
) {
const model = new ChatGoogleGenerativeAI({
modelName: 'gemini-2.5-flash',
temperature: 0.7,
});
const memory = await this.getMemory(sessionId);
const chain = new ConversationChain({
llm: model,
memory,
});
    const result = await chain.call({ input: message });
    // Save updated memory (read the history back from the BufferMemory built above)
    const history = await memory.chatHistory.getMessages();
    await this.saveMemory(sessionId, history);
// Get session stats
const stats = await redis.get(`session:${sessionId}:stats`) as any || {};
stats.messageCount = (stats.messageCount || 0) + 1;
stats.lastActive = Date.now();
await redis.setex(
`session:${sessionId}:stats`,
this.sessionTTL,
JSON.stringify(stats)
);
return {
response: result.response,
sessionId,
stats,
};
}
async cleanupOldSessions() {
// Remove sessions older than 24 hours
const cutoff = Date.now() - 24 * 60 * 60 * 1000;
    const oldSessions = (await redis.zrange('sessions:active', 0, cutoff, {
      byScore: true,
    })) as string[];
// Batch delete in chunks for efficiency
const sessionChunks = chunk(oldSessions, 10);
for (const batch of sessionChunks) {
await Promise.all(
batch.map(sessionId =>
redis.del(`session:${sessionId}`)
)
);
}
await redis.zremrangebyscore('sessions:active', 0, cutoff);
return oldSessions.length;
}
}
Implements Redis-based persistent memory optimized for Vercel's serverless environment with automatic cleanup and session management.
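Cleanup has to be triggered explicitly in a serverless deployment. One option is a Vercel cron job; a sketch with an assumed path and schedule, configured in vercel.json with something like { "crons": [{ "path": "/api/cron/cleanup-sessions", "schedule": "0 3 * * *" }] }:
// app/api/cron/cleanup-sessions/route.ts (path and schedule are assumptions)
import { RedisMemoryManager } from '@/lib/memory/redis-memory';
import { NextResponse } from 'next/server';

export const runtime = 'nodejs';

export async function GET() {
  const manager = new RedisMemoryManager();
  // Deletes sessions older than 24 hours and trims the session index
  const removed = await manager.cleanupOldSessions();
  return NextResponse.json({ removed });
}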
4. Hierarchical Memory with Profile Store
// lib/memory/hierarchical-memory.ts
import { StateGraph, Annotation, MemorySaver } from '@langchain/langgraph';
import { ChatGoogleGenerativeAI } from '@langchain/google-genai';
import { Redis } from '@upstash/redis';
import { pick, omit, merge } from 'es-toolkit';
interface UserProfile {
preferences: Record<string, any>;
facts: string[];
interests: string[];
lastUpdated: number;
}
interface SessionMemory {
messages: any[];
summary: string;
topics: string[];
}
interface WorkingMemory {
  userId: string;
  currentTopic: string;
  context: string[];
  activeGoal: string;
}
const HierarchicalStateAnnotation = Annotation.Root({
// Working memory - immediate context
working: Annotation<WorkingMemory>({
reducer: (x, y) => ({ ...x, ...y }),
    default: () => ({
      userId: '',
      currentTopic: '',
      context: [],
      activeGoal: '',
    }),
}),
// Session memory - current conversation
session: Annotation<SessionMemory>({
reducer: (x, y) => ({ ...x, ...y }),
default: () => ({
messages: [],
summary: '',
topics: [],
}),
}),
// Long-term memory - user profile
profile: Annotation<UserProfile>({
reducer: (x, y) => merge(x, y),
default: () => ({
preferences: {},
facts: [],
interests: [],
lastUpdated: Date.now(),
}),
}),
});
export class HierarchicalMemorySystem {
private redis: Redis;
  private graph: ReturnType<HierarchicalMemorySystem['buildMemoryGraph']>;
private model: ChatGoogleGenerativeAI;
constructor() {
this.redis = new Redis({
url: process.env.UPSTASH_REDIS_URL!,
token: process.env.UPSTASH_REDIS_TOKEN!,
});
this.model = new ChatGoogleGenerativeAI({
modelName: 'gemini-2.5-pro',
temperature: 0.7,
});
this.graph = this.buildMemoryGraph();
}
private buildMemoryGraph() {
const workflow = new StateGraph(HierarchicalStateAnnotation)
.addNode('processInput', async (state) => {
// Extract topic and intent from input
const analysis = await this.model.invoke([
new SystemMessage('Extract the main topic and user intent as JSON'),
new HumanMessage(state.working.context[state.working.context.length - 1]),
]);
try {
const { topic, intent } = JSON.parse(analysis.content as string);
return {
working: {
currentTopic: topic,
activeGoal: intent,
},
};
} catch {
return {};
}
})
.addNode('retrieveRelevantMemory', async (state) => {
// Get relevant long-term memories
const profileKey = `profile:${state.working.currentTopic}`;
const relevantProfile = await this.redis.get(profileKey) || {};
// Get relevant session memories
const sessionKey = `session:recent:${state.working.currentTopic}`;
const relevantSessions = await this.redis.get(sessionKey) || [];
return {
working: {
context: [
...state.working.context,
`Profile context: ${JSON.stringify(relevantProfile)}`,
`Previous discussions: ${JSON.stringify(relevantSessions)}`,
],
},
};
})
.addNode('generateResponse', async (state) => {
        // Generate the reply with all memory tiers in the prompt
        const contextMessages = [
          new SystemMessage(`User profile: ${JSON.stringify(state.profile)}`),
          new SystemMessage(`Current topic: ${state.working.currentTopic}`),
          new SystemMessage(`Session context: ${state.session.summary}`),
          ...state.working.context.map((c) => new SystemMessage(c)),
        ];
        const response = await this.model.invoke(contextMessages);
return {
session: {
messages: [...state.session.messages, response.content],
},
};
})
.addNode('updateMemoryTiers', async (state) => {
// Promote important information up the hierarchy
const importantFacts = await this.extractImportantFacts(
state.session.messages
);
if (importantFacts.length > 0) {
return {
profile: {
facts: [...state.profile.facts, ...importantFacts],
lastUpdated: Date.now(),
},
};
}
return {};
})
.addNode('persistMemory', async (state) => {
// Save to Redis with different TTLs
// Working memory - 5 minutes
await this.redis.setex(
`working:${state.working.currentTopic}`,
300,
JSON.stringify(state.working)
);
// Session memory - 1 hour
await this.redis.setex(
`session:${Date.now()}`,
3600,
JSON.stringify(state.session)
);
        // Profile: long-lived and keyed by user, so process() can read it back (no TTL here)
        await this.redis.set(
          `profile:${state.working.userId || 'user'}`,
          JSON.stringify(state.profile)
        );
return {};
});
workflow.addEdge('__start__', 'processInput');
workflow.addEdge('processInput', 'retrieveRelevantMemory');
workflow.addEdge('retrieveRelevantMemory', 'generateResponse');
workflow.addEdge('generateResponse', 'updateMemoryTiers');
workflow.addEdge('updateMemoryTiers', 'persistMemory');
workflow.addEdge('persistMemory', '__end__');
const checkpointer = new MemorySaver();
return workflow.compile({ checkpointer });
}
private async extractImportantFacts(messages: string[]): Promise<string[]> {
if (messages.length === 0) return [];
const extraction = await this.model.invoke([
new SystemMessage('Extract important facts about the user as a JSON array'),
...messages.map(m => new HumanMessage(m)),
]);
try {
return JSON.parse(extraction.content as string);
} catch {
return [];
}
}
async process(userId: string, message: string) {
// Load existing profile
const profile = await this.redis.get(`profile:${userId}`) as UserProfile || {
preferences: {},
facts: [],
interests: [],
lastUpdated: Date.now(),
};
const result = await this.graph.invoke(
{
        working: {
          userId,
          context: [message],
        },
profile,
},
{
configurable: { thread_id: userId },
}
);
return {
response: result.session.messages[result.session.messages.length - 1],
memoryStats: {
workingMemorySize: result.working.context.length,
sessionMessages: result.session.messages.length,
profileFacts: result.profile.facts.length,
},
};
}
}
Implements a three-tier memory hierarchy with working memory for immediate context, session memory for conversations, and long-term profile storage.
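A sketch of the endpoint the chat component in the next example talks to, wiring the hierarchical system into /api/memory-chat and shaping the response fields the component reads (the anonymous fallback is an assumption; in production the user id would come from auth):
// app/api/memory-chat/route.ts (illustrative wiring for HierarchicalMemorySystem)
import { HierarchicalMemorySystem } from '@/lib/memory/hierarchical-memory';
import { NextResponse } from 'next/server';

export const runtime = 'nodejs';
export const maxDuration = 300;

const memorySystem = new HierarchicalMemorySystem();

export async function POST(req: Request) {
  const { message, sessionId } = await req.json();
  // The client-generated session id doubles as the memory thread / user key here
  const result = await memorySystem.process(sessionId ?? 'anonymous', message);
  return NextResponse.json({
    response: result.response,
    memoryStats: {
      workingMemory: result.memoryStats.workingMemorySize,
      sessionMessages: result.memoryStats.sessionMessages,
      profileFacts: result.memoryStats.profileFacts,
    },
  });
}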
5. Frontend Integration with Memory Status
// components/MemoryChat.tsx
'use client';
import { useState, useEffect } from 'react';
import { useMutation } from '@tanstack/react-query';
import { groupBy, debounce } from 'es-toolkit';
interface MemoryStats {
workingMemory: number;
sessionMessages: number;
profileFacts: number;
relevantMemories?: Array<{
content: string;
relevanceScore: number;
}>;
}
export default function MemoryChat() {
const [message, setMessage] = useState('');
const [sessionId] = useState(() =>
`session-${Date.now()}-${Math.random()}`
);
const [memoryStats, setMemoryStats] = useState<MemoryStats>({
workingMemory: 0,
sessionMessages: 0,
profileFacts: 0,
});
const [messages, setMessages] = useState<Array<{
role: 'user' | 'assistant';
content: string;
timestamp: number;
}>>([]);
const sendMessage = useMutation({
mutationFn: async (text: string) => {
const response = await fetch('/api/memory-chat', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
message: text,
sessionId
}),
});
return response.json();
},
    onSuccess: (data, sentText) => {
      // Use the mutation variable rather than component state to avoid a stale closure
      setMessages(prev => [
        ...prev,
        {
          role: 'user',
          content: sentText,
          timestamp: Date.now()
        },
        {
          role: 'assistant',
          content: data.response,
          timestamp: Date.now()
        },
      ]);
      setMemoryStats(prev => data.memoryStats ?? prev);
      setMessage('');
    },
});
  // Auto-save the session after a quiet period
  useEffect(() => {
    const saveSession = debounce(async () => {
      await fetch('/api/memory-chat/save', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ sessionId, messages }),
      });
    }, 10000);
    if (messages.length > 0) {
      saveSession();
    }
    // Cancel the pending save if messages change again before the delay elapses
    return () => saveSession.cancel();
  }, [messages, sessionId]);
const handleSubmit = (e: React.FormEvent) => {
e.preventDefault();
if (message.trim()) {
sendMessage.mutate(message);
}
};
// Group messages by time period
const messageGroups = groupBy(messages, (msg) => {
const date = new Date(msg.timestamp);
return date.toLocaleDateString();
});
return (
<div className="flex flex-col h-screen max-h-[800px] bg-base-100">
{/* Memory Status Bar */}
<div className="navbar bg-base-200 px-4">
<div className="flex-1">
<h2 className="text-xl font-bold">AI Memory Chat</h2>
</div>
<div className="flex-none">
<div className="stats stats-horizontal shadow">
<div className="stat place-items-center py-2 px-4">
<div className="stat-title text-xs">Working</div>
<div className="stat-value text-lg">
{memoryStats.workingMemory}
</div>
</div>
<div className="stat place-items-center py-2 px-4">
<div className="stat-title text-xs">Session</div>
<div className="stat-value text-lg">
{memoryStats.sessionMessages}
</div>
</div>
<div className="stat place-items-center py-2 px-4">
<div className="stat-title text-xs">Profile</div>
<div className="stat-value text-lg">
{memoryStats.profileFacts}
</div>
</div>
</div>
</div>
</div>
{/* Messages Area */}
<div className="flex-1 overflow-y-auto p-4 space-y-4">
{Object.entries(messageGroups).map(([date, msgs]) => (
<div key={date}>
<div className="divider text-sm">{date}</div>
{msgs.map((msg, idx) => (
<div
key={idx}
className={`chat ${
msg.role === 'user' ? 'chat-end' : 'chat-start'
}`}
>
<div className="chat-header">
{msg.role === 'user' ? 'You' : 'AI'}
</div>
<div
className={`chat-bubble ${
msg.role === 'user'
? 'chat-bubble-primary'
: 'chat-bubble-secondary'
}`}
>
{msg.content}
</div>
</div>
))}
</div>
))}
{/* Show relevant memories if available */}
{memoryStats.relevantMemories &&
memoryStats.relevantMemories.length > 0 && (
<div className="alert alert-info">
<div>
<h4 className="font-bold">Relevant Memories:</h4>
{memoryStats.relevantMemories.map((mem, idx) => (
<div key={idx} className="text-sm mt-1">
• {mem.content} (relevance: {mem.relevanceScore.toFixed(2)})
</div>
))}
</div>
</div>
)}
{sendMessage.isPending && (
<div className="flex justify-start">
<div className="chat-bubble">
<span className="loading loading-dots loading-sm"></span>
</div>
</div>
)}
</div>
{/* Input Area */}
<form onSubmit={handleSubmit} className="p-4 bg-base-200">
<div className="join w-full">
<input
type="text"
value={message}
onChange={(e) => setMessage(e.target.value)}
placeholder="Type your message..."
className="input input-bordered join-item flex-1"
disabled={sendMessage.isPending}
/>
<button
type="submit"
className="btn btn-primary join-item"
disabled={sendMessage.isPending || !message.trim()}
>
Send
</button>
</div>
</form>
</div>
);
}
React component that visualizes memory statistics in real time, showing working memory, session messages, and profile facts.
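The component also autosaves to /api/memory-chat/save, which is not shown above; a minimal sketch of that endpoint, snapshotting the client-side transcript to Redis under an assumed key prefix:
// app/api/memory-chat/save/route.ts (illustrative autosave endpoint)
import { Redis } from '@upstash/redis';
import { NextResponse } from 'next/server';

export const runtime = 'nodejs';

const redis = new Redis({
  url: process.env.UPSTASH_REDIS_URL!,
  token: process.env.UPSTASH_REDIS_TOKEN!,
});

export async function POST(req: Request) {
  const { sessionId, messages } = await req.json();
  // Keep the raw transcript snapshot for 24 hours
  await redis.setex(`transcript:${sessionId}`, 86400, JSON.stringify(messages));
  return NextResponse.json({ saved: messages.length });
}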
Conclusion
Memory management transforms stateless AI interactions into intelligent, context-aware conversations that build relationships over time. This guide demonstrated progression from simple buffer memory through sophisticated hierarchical systems optimized for Vercel's serverless constraints. Key patterns include using LangGraph's StateGraph for complex memory flows, implementing Redis for serverless persistence, leveraging vector stores for semantic retrieval, and building multi-tier architectures that mirror human cognitive systems. The combination of working, episodic, semantic, and procedural memory enables agents to maintain context, learn from interactions, and provide increasingly personalized assistance while operating efficiently within serverless execution limits.