DRAFT Agentic Design Patterns - Memory Management
Building AI agents that remember past interactions, maintain context across sessions, and learn from experience using TypeScript, LangChain, LangGraph, and Vercel's serverless platform.
Mental Model: Building Memory-First Agents
Think of agent memory like a web application's state management system on steroids. Just as React apps use Redux for state, Context API for shared data, and localStorage for persistence, AI agents need working memory (like useState), episodic memory (like session storage), semantic memory (like a database), and procedural memory (like cached computations). LangGraph acts as your memory orchestrator (similar to Redux), while external stores like Redis and vector databases serve as your persistent layer. The key difference: agent memory must handle both structured data and semantic understanding, operating efficiently within serverless constraints.
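To make the mapping concrete, here is a minimal sketch of the four tiers as TypeScript types. These interfaces are purely illustrative, not part of LangChain or LangGraph; they only show the kind of data each tier tends to hold.
// lib/memory/types.ts (illustrative only; not a library API)
// Working memory: the current turn's context (like useState)
interface WorkingContext {
  recentMessages: string[];
  activeGoal?: string;
}
// Episodic memory: what happened in past sessions (like session storage)
interface EpisodicRecord {
  sessionId: string;
  summary: string;
  endedAt: number;
}
// Semantic memory: facts retrievable by meaning (like a database plus embeddings)
interface SemanticFact {
  fact: string;
  embedding: number[];
}
// Procedural memory: reusable results keyed by the task that produced them (like cached computations)
interface ProceduralEntry {
  taskSignature: string;
  cachedResult: unknown;
}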
Basic Example: Conversation Buffer Memory
1. Simple Buffer Memory Implementation
// lib/memory/buffer-memory.ts
import { BufferMemory } from 'langchain/memory';
import { ChatGoogleGenerativeAI } from '@langchain/google-genai';
import { ConversationChain } from 'langchain/chains';
export function createBasicMemoryChain() {
const model = new ChatGoogleGenerativeAI({
modelName: 'gemini-2.5-flash',
temperature: 0.7,
});
const memory = new BufferMemory({
returnMessages: true,
memoryKey: 'history',
});
const chain = new ConversationChain({
llm: model,
memory,
verbose: false, // Set to true for debugging
});
return chain;
}
// app/api/chat-memory/route.ts
import { createBasicMemoryChain } from '@/lib/memory/buffer-memory';
import { BufferMemory } from 'langchain/memory';
import { NextResponse } from 'next/server';
export const runtime = 'nodejs';
export const maxDuration = 60;
// Module scope: every request hitting the same warm instance shares this chain
// (and its buffer); the buffer is lost on cold starts.
const chain = createBasicMemoryChain();
export async function POST(req: Request) {
  const { message } = await req.json();
  const result = await chain.call({ input: message });
  // ConversationChain types `memory` as BaseMemory, so cast to read the history
  const history = await (chain.memory as BufferMemory).chatHistory.getMessages();
  return NextResponse.json({
    response: result.response,
    memorySize: history.length,
  });
}
Creates a basic conversation chain with BufferMemory that keeps every message for the life of the server instance. Because the chain lives at module scope, all requests that reach the same warm serverless instance share one history, and it resets on cold starts, so treat this as a demo rather than per-user production memory.
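A quick way to see the buffer at work is to send two related turns to the route; the second question only makes sense if the first was remembered. A sketch, assuming the dev server runs on localhost:3000:
// scripts/try-buffer-memory.ts (illustrative; run with the dev server started)
async function ask(message: string) {
  const res = await fetch('http://localhost:3000/api/chat-memory', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ message }),
  });
  return res.json();
}
async function main() {
  // First turn establishes a fact...
  console.log(await ask('My favorite city is Lisbon.'));
  // ...the second turn only answers correctly if the buffer kept it.
  console.log(await ask('What is my favorite city?'));
}
main();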
2. Window Memory with Limited Context
// lib/memory/window-memory.ts
import { BufferWindowMemory } from 'langchain/memory';
import { ChatGoogleGenerativeAI } from '@langchain/google-genai';
import { ConversationChain } from 'langchain/chains';
import { takeRight } from 'es-toolkit';
interface WindowMemoryConfig {
windowSize?: number;
returnMessages?: boolean;
}
export class WindowMemoryManager {
private chain: ConversationChain;
private windowSize: number;
constructor(config: WindowMemoryConfig = {}) {
const model = new ChatGoogleGenerativeAI({
modelName: 'gemini-2.5-flash',
temperature: 0.7,
});
this.windowSize = config.windowSize || 10;
const memory = new BufferWindowMemory({
k: this.windowSize,
returnMessages: config.returnMessages ?? true,
});
this.chain = new ConversationChain({
llm: model,
memory,
});
}
  async processMessage(message: string) {
    const result = await this.chain.call({ input: message });
    // Inspect the stored history (cast needed: ConversationChain types memory as BaseMemory)
    const messages = await (this.chain.memory as BufferWindowMemory).chatHistory.getMessages();
    const windowMessages = takeRight(messages, this.windowSize * 2); // user + AI message pairs
    return {
      response: result.response,
      windowSize: windowMessages.length,
      droppedMessages: Math.max(0, messages.length - windowMessages.length),
    };
  }
}
Implements a sliding window memory that keeps only the last K message pairs, automatically dropping older context to stay within limits.
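A small script-style sketch (model and API key assumed to be configured) makes the trade-off visible: once enough turns accumulate, early facts fall out of the window and droppedMessages starts climbing.
// scripts/try-window-memory.ts (illustrative usage of WindowMemoryManager)
import { WindowMemoryManager } from '@/lib/memory/window-memory';

async function demo() {
  const manager = new WindowMemoryManager({ windowSize: 2 }); // keep only 2 exchanges
  const turns = [
    'My name is Ada.',
    'I work on compilers.',
    'I live in Zurich.',
    'What is my name?', // likely already outside the window
  ];
  for (const question of turns) {
    const { response, windowSize, droppedMessages } = await manager.processMessage(question);
    console.log({ question, response, windowSize, droppedMessages });
  }
}
demo();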
3. Summary Memory for Long Conversations
// lib/memory/summary-memory.ts
import { ConversationSummaryMemory } from 'langchain/memory';
import { ChatGoogleGenerativeAI } from '@langchain/google-genai';
import { ConversationChain } from 'langchain/chains';
export class SummaryMemoryManager {
  private chain: ConversationChain;
  private memory: ConversationSummaryMemory;
  constructor() {
    const model = new ChatGoogleGenerativeAI({
      modelName: 'gemini-2.5-pro',
      temperature: 0.7,
    });
    // A cheaper, deterministic model handles the summarization calls
    const summaryModel = new ChatGoogleGenerativeAI({
      modelName: 'gemini-2.5-flash',
      temperature: 0,
    });
    this.memory = new ConversationSummaryMemory({
      llm: summaryModel,
      returnMessages: false,
      inputKey: 'input',
      outputKey: 'response',
    });
    this.chain = new ConversationChain({
      llm: model,
      memory: this.memory,
    });
  }
  async processWithSummary(message: string) {
    // ConversationSummaryMemory re-summarizes inside saveContext, which
    // ConversationChain calls after every turn, so no manual update is needed.
    const result = await this.chain.call({ input: message });
    return {
      response: result.response,
      currentSummary: this.memory.buffer,
    };
  }
}
Uses an LLM to progressively summarize the conversation, maintaining context while reducing token usage in long interactions.
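Wiring the manager into a route mirrors the earlier example; a sketch (the route path is an assumption) that returns the running summary alongside each reply so the client can show how the conversation is being compressed:
// app/api/summary-chat/route.ts (illustrative wiring for SummaryMemoryManager)
import { SummaryMemoryManager } from '@/lib/memory/summary-memory';
import { NextResponse } from 'next/server';

export const runtime = 'nodejs';
export const maxDuration = 60;

// Shared per warm instance, like the earlier demo routes
const manager = new SummaryMemoryManager();

export async function POST(req: Request) {
  const { message } = await req.json();
  const result = await manager.processWithSummary(message);
  return NextResponse.json(result);
}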
Advanced Example: Multi-Tier Memory System with LangGraph
1. State-Based Memory with LangGraph
// lib/memory/langgraph-memory.ts
import { StateGraph, Annotation } from '@langchain/langgraph';
import { BaseMessage, SystemMessage } from '@langchain/core/messages';
import { ChatGoogleGenerativeAI } from '@langchain/google-genai';
import { MemorySaver } from '@langchain/langgraph';
import { takeRight } from 'es-toolkit';
// Define the state structure
const StateAnnotation = Annotation.Root({
messages: Annotation<BaseMessage[]>({
reducer: (x, y) => [...x, ...y],
default: () => [],
}),
summary: Annotation<string>({
reducer: (x, y) => y || x,
default: () => '',
}),
userProfile: Annotation<Record<string, any>>({
reducer: (x, y) => ({ ...x, ...y }),
default: () => ({}),
}),
sessionMetadata: Annotation<Record<string, any>>({
reducer: (x, y) => ({ ...x, ...y }),
default: () => ({ startTime: Date.now() }),
}),
});
export function createStatefulMemoryAgent() {
const model = new ChatGoogleGenerativeAI({
modelName: 'gemini-2.5-pro',
temperature: 0.7,
});
const summaryModel = new ChatGoogleGenerativeAI({
modelName: 'gemini-2.5-flash',
temperature: 0,
});
const workflow = new StateGraph(StateAnnotation)
.addNode('processMessage', async (state) => {
      // Build the prompt from the profile, the running summary, and the recent window
      const contextMessages = [
        new SystemMessage(`User Profile: ${JSON.stringify(state.userProfile)}`),
        new SystemMessage(`Summary of past: ${state.summary}`),
        ...takeRight(state.messages, 10), // most recent messages
      ];
const response = await model.invoke(contextMessages);
return {
messages: [response],
};
})
.addNode('updateProfile', async (state) => {
      // Extract user preferences from the most recent turns
      const recentMessages = takeRight(state.messages, 5);
const analysis = await summaryModel.invoke([
new SystemMessage('Extract user preferences and facts as JSON'),
...recentMessages,
]);
try {
const preferences = JSON.parse(analysis.content as string);
return { userProfile: preferences };
} catch {
return {};
}
})
    .addNode('summarizeIfNeeded', async (state) => {
      // Summarize once the transcript exceeds the threshold
      if (state.messages.length > 20) {
        const oldMessages = state.messages.slice(0, -10);
        const summary = await summaryModel.invoke([
          new SystemMessage('Summarize this conversation concisely'),
          ...oldMessages,
        ]);
        // Note: the messages reducer appends, so returning a trimmed array here
        // would not shrink the stored state. The summary stands in for older turns,
        // and processMessage only feeds the model the most recent window anyway.
        return {
          summary: summary.content as string,
        };
      }
      return {};
    });
// Define the flow
workflow.addEdge('__start__', 'processMessage');
workflow.addEdge('processMessage', 'updateProfile');
workflow.addEdge('updateProfile', 'summarizeIfNeeded');
workflow.addEdge('summarizeIfNeeded', '__end__');
// Add persistence
const checkpointer = new MemorySaver();
return workflow.compile({ checkpointer });
}
// app/api/stateful-chat/route.ts
import { createStatefulMemoryAgent } from '@/lib/memory/langgraph-memory';
import { HumanMessage } from '@langchain/core/messages';
import { NextResponse } from 'next/server';
export const runtime = 'nodejs';
export const maxDuration = 300;
const agent = createStatefulMemoryAgent();
export async function POST(req: Request) {
const { message, threadId } = await req.json();
const result = await agent.invoke(
{
messages: [new HumanMessage(message)],
},
{
configurable: { thread_id: threadId || 'default' },
}
);
return NextResponse.json({
response: result.messages[result.messages.length - 1].content,
profile: result.userProfile,
messageCount: result.messages.length,
});
}
Implements a sophisticated memory system with automatic summarization, user profiling, and thread-based persistence using LangGraph.
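Because the graph is compiled with a checkpointer, a thread's accumulated state can also be read back without running it. A sketch of a GET handler added to the same route file (it must share the module-level agent, since MemorySaver keeps checkpoints in that instance's memory):
// app/api/stateful-chat/route.ts (added below the POST handler in the same file)
export async function GET(req: Request) {
  const { searchParams } = new URL(req.url);
  const threadId = searchParams.get('threadId') ?? 'default';
  // Reads the latest checkpoint for this thread without invoking the graph
  const snapshot = await agent.getState({
    configurable: { thread_id: threadId },
  });
  return NextResponse.json({
    messageCount: snapshot.values.messages?.length ?? 0,
    summary: snapshot.values.summary,
    userProfile: snapshot.values.userProfile,
  });
}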
2. Vector Memory with Semantic Search
// lib/memory/vector-memory.ts
import { MemoryVectorStore } from 'langchain/vectorstores/memory';
import { GoogleGenerativeAIEmbeddings } from '@langchain/google-genai';
import { VectorStoreRetrieverMemory } from 'langchain/memory';
import { ChatGoogleGenerativeAI } from '@langchain/google-genai';
import { ConversationChain } from 'langchain/chains';
import { Document } from '@langchain/core/documents';
import { sortBy } from 'es-toolkit';
export class SemanticMemoryManager {
private vectorStore: MemoryVectorStore;
private memory: VectorStoreRetrieverMemory;
private chain: ConversationChain;
private embeddings: GoogleGenerativeAIEmbeddings;
constructor() {
this.embeddings = new GoogleGenerativeAIEmbeddings({
modelName: 'embedding-001',
});
this.vectorStore = new MemoryVectorStore(this.embeddings);
this.memory = new VectorStoreRetrieverMemory({
vectorStoreRetriever: this.vectorStore.asRetriever(5),
memoryKey: 'history',
inputKey: 'input',
outputKey: 'response',
returnDocs: true,
});
const model = new ChatGoogleGenerativeAI({
modelName: 'gemini-2.5-pro',
temperature: 0.7,
});
this.chain = new ConversationChain({
llm: model,
memory: this.memory,
});
}
async addMemory(content: string, metadata: Record<string, any> = {}) {
const doc = new Document({
pageContent: content,
metadata: {
...metadata,
timestamp: Date.now(),
},
});
await this.vectorStore.addDocuments([doc]);
}
async searchMemories(query: string, k: number = 5) {
const results = await this.vectorStore.similaritySearchWithScore(query, k);
// Sort by relevance and recency
const sorted = sortBy(results, [
(r) => -r[1], // Score (descending)
(r) => -r[0].metadata.timestamp, // Timestamp (descending)
]);
return sorted.map(([doc, score]) => ({
content: doc.pageContent,
metadata: doc.metadata,
relevanceScore: score,
}));
}
async processWithSemanticMemory(message: string) {
// Search for relevant past interactions
const relevantMemories = await this.searchMemories(message, 3);
// Add relevant context to the conversation
const contextualMessage = relevantMemories.length > 0
? `Relevant context: ${relevantMemories.map(m => m.content).join('\n')}\n\nUser: ${message}`
: message;
const result = await this.chain.call({ input: contextualMessage });
// Store the interaction
await this.addMemory(
`User: ${message}\nAssistant: ${result.response}`,
{ type: 'conversation' }
);
return {
response: result.response,
relevantMemories: relevantMemories.slice(0, 2),
      totalMemories: this.vectorStore.memoryVectors.length,
};
}
}
Creates a semantic memory system using vector embeddings to find relevant past interactions based on meaning rather than recency.
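A short sketch of how the manager might be seeded and queried; the facts below are illustrative, and the point is that retrieval matches on meaning, so a question about a "meal app" surfaces the "recipe-planning" memory even without shared keywords:
// scripts/try-semantic-memory.ts (illustrative usage of SemanticMemoryManager)
import { SemanticMemoryManager } from '@/lib/memory/vector-memory';

async function demo() {
  const memory = new SemanticMemoryManager();
  // Seed a couple of facts outside of any conversation
  await memory.addMemory('User prefers TypeScript over JavaScript', { type: 'preference' });
  await memory.addMemory('User is building a recipe-planning app', { type: 'project' });
  // Retrieval is semantic, not keyword-based
  const result = await memory.processWithSemanticMemory(
    'Any suggestions for structuring my meal app?'
  );
  console.log(result.relevantMemories);
  console.log(result.response);
}
demo();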
3. Persistent Memory with Redis for Serverless
// lib/memory/redis-memory.ts
import { Redis } from '@upstash/redis';
import { BufferMemory, ChatMessageHistory } from 'langchain/memory';
import { AIMessage, HumanMessage, BaseMessage } from '@langchain/core/messages';
import { ChatGoogleGenerativeAI } from '@langchain/google-genai';
import { ConversationChain } from 'langchain/chains';
import { chunk } from 'es-toolkit';
const redis = new Redis({
url: process.env.UPSTASH_REDIS_URL!,
token: process.env.UPSTASH_REDIS_TOKEN!,
});
export class RedisMemoryManager {
private sessionTTL = 3600; // 1 hour
private maxMessagesPerSession = 100;
  // Load the session straight from Redis on every request; serverless instances are
  // short-lived, and caching parsed sessions in process memory risks serving stale data.
  private async loadSession(sessionId: string) {
    const data = await redis.get(`session:${sessionId}`);
    return (data as any) || { messages: [], metadata: {} };
  }
async getMemory(sessionId: string): Promise<BufferMemory> {
const session = await this.loadSession(sessionId);
const messages = session.messages.map((msg: any) =>
msg.type === 'human'
? new HumanMessage(msg.content)
: new AIMessage(msg.content)
);
const chatHistory = new ChatMessageHistory(messages);
return new BufferMemory({
chatHistory,
returnMessages: true,
memoryKey: 'history',
});
}
async saveMemory(sessionId: string, messages: BaseMessage[]) {
// Keep only recent messages for serverless constraints
const recentMessages = messages.slice(-this.maxMessagesPerSession);
const serialized = recentMessages.map(msg => ({
type: msg._getType(),
content: msg.content,
}));
const session = {
messages: serialized,
metadata: {
lastAccess: Date.now(),
messageCount: serialized.length,
},
};
await redis.setex(
`session:${sessionId}`,
this.sessionTTL,
JSON.stringify(session)
);
// Update session index for cleanup
await redis.zadd('sessions:active', {
score: Date.now(),
member: sessionId,
});
}
async processWithPersistence(
sessionId: string,
message: string
) {
const model = new ChatGoogleGenerativeAI({
modelName: 'gemini-2.5-flash',
temperature: 0.7,
});
const memory = await this.getMemory(sessionId);
const chain = new ConversationChain({
llm: model,
memory,
});
    const result = await chain.call({ input: message });
    // Save updated memory (read the history back from the BufferMemory built above)
    const history = await memory.chatHistory.getMessages();
    await this.saveMemory(sessionId, history);
// Get session stats
const stats = await redis.get(`session:${sessionId}:stats`) as any || {};
stats.messageCount = (stats.messageCount || 0) + 1;
stats.lastActive = Date.now();
await redis.setex(
`session:${sessionId}:stats`,
this.sessionTTL,
JSON.stringify(stats)
);
return {
response: result.response,
sessionId,
stats,
};
}
async cleanupOldSessions() {
// Remove sessions older than 24 hours
const cutoff = Date.now() - 24 * 60 * 60 * 1000;
    const oldSessions = (await redis.zrange('sessions:active', 0, cutoff, {
      byScore: true,
    })) as string[];
// Batch delete in chunks for efficiency
const sessionChunks = chunk(oldSessions, 10);
for (const batch of sessionChunks) {
await Promise.all(
batch.map(sessionId =>
redis.del(`session:${sessionId}`)
)
);
}
await redis.zremrangebyscore('sessions:active', 0, cutoff);
return oldSessions.length;
}
}
Implements Redis-based persistent memory optimized for Vercel's serverless environment with automatic cleanup and session management.
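Cleanup has to be triggered explicitly in a serverless deployment. One option is a Vercel cron job; a sketch with an assumed path and schedule, configured in vercel.json with something like { "crons": [{ "path": "/api/cron/cleanup-sessions", "schedule": "0 3 * * *" }] }:
// app/api/cron/cleanup-sessions/route.ts (path and schedule are assumptions)
import { RedisMemoryManager } from '@/lib/memory/redis-memory';
import { NextResponse } from 'next/server';

export const runtime = 'nodejs';

export async function GET() {
  const manager = new RedisMemoryManager();
  // Deletes sessions older than 24 hours and trims the session index
  const removed = await manager.cleanupOldSessions();
  return NextResponse.json({ removed });
}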
4. Hierarchical Memory with Profile Store
// lib/memory/hierarchical-memory.ts
import { StateGraph, Annotation, MemorySaver } from '@langchain/langgraph';
import { ChatGoogleGenerativeAI } from '@langchain/google-genai';
import { Redis } from '@upstash/redis';
import { pick, omit, merge } from 'es-toolkit';
interface UserProfile {
preferences: Record<string, any>;
facts: string[];
interests: string[];
lastUpdated: number;
}
interface SessionMemory {
messages: any[];
summary: string;
topics: string[];
}
interface WorkingMemory {
  userId: string;
  currentTopic: string;
  context: string[];
  activeGoal: string;
}
const HierarchicalStateAnnotation = Annotation.Root({
// Working memory - immediate context
working: Annotation<WorkingMemory>({
reducer: (x, y) => ({ ...x, ...y }),
    default: () => ({
      userId: '',
      currentTopic: '',
      context: [],
      activeGoal: '',
    }),
}),
// Session memory - current conversation
session: Annotation<SessionMemory>({
reducer: (x, y) => ({ ...x, ...y }),
default: () => ({
messages: [],
summary: '',
topics: [],
}),
}),
// Long-term memory - user profile
profile: Annotation<UserProfile>({
reducer: (x, y) => merge(x, y),
default: () => ({
preferences: {},
facts: [],
interests: [],
lastUpdated: Date.now(),
}),
}),
});
export class HierarchicalMemorySystem {
private redis: Redis;
  private graph: ReturnType<HierarchicalMemorySystem['buildMemoryGraph']>;
private model: ChatGoogleGenerativeAI;
constructor() {
this.redis = new Redis({
url: process.env.UPSTASH_REDIS_URL!,
token: process.env.UPSTASH_REDIS_TOKEN!,
});
this.model = new ChatGoogleGenerativeAI({
modelName: 'gemini-2.5-pro',
temperature: 0.7,
});
this.graph = this.buildMemoryGraph();
}
private buildMemoryGraph() {
const workflow = new StateGraph(HierarchicalStateAnnotation)
.addNode('processInput', async (state) => {
// Extract topic and intent from input
const analysis = await this.model.invoke([
new SystemMessage('Extract the main topic and user intent as JSON'),
new HumanMessage(state.working.context[state.working.context.length - 1]),
]);
try {
const { topic, intent } = JSON.parse(analysis.content as string);
return {
working: {
currentTopic: topic,
activeGoal: intent,
},
};
} catch {
return {};
}
})
.addNode('retrieveRelevantMemory', async (state) => {
// Get relevant long-term memories
const profileKey = `profile:${state.working.currentTopic}`;
const relevantProfile = await this.redis.get(profileKey) || {};
// Get relevant session memories
const sessionKey = `session:recent:${state.working.currentTopic}`;
const relevantSessions = await this.redis.get(sessionKey) || [];
return {
working: {
context: [
...state.working.context,
`Profile context: ${JSON.stringify(relevantProfile)}`,
`Previous discussions: ${JSON.stringify(relevantSessions)}`,
],
},
};
})
.addNode('generateResponse', async (state) => {
        // Generate the reply with all memory tiers in the prompt
        const contextMessages = [
          new SystemMessage(`User profile: ${JSON.stringify(state.profile)}`),
          new SystemMessage(`Current topic: ${state.working.currentTopic}`),
          new SystemMessage(`Session context: ${state.session.summary}`),
          ...state.working.context.map((c) => new SystemMessage(c)),
        ];
        const response = await this.model.invoke(contextMessages);
return {
session: {
messages: [...state.session.messages, response.content],
},
};
})
.addNode('updateMemoryTiers', async (state) => {
// Promote important information up the hierarchy
const importantFacts = await this.extractImportantFacts(
state.session.messages
);
if (importantFacts.length > 0) {
return {
profile: {
facts: [...state.profile.facts, ...importantFacts],
lastUpdated: Date.now(),
},
};
}
return {};
})
.addNode('persistMemory', async (state) => {
// Save to Redis with different TTLs
// Working memory - 5 minutes
await this.redis.setex(
`working:${state.working.currentTopic}`,
300,
JSON.stringify(state.working)
);
// Session memory - 1 hour
await this.redis.setex(
`session:${Date.now()}`,
3600,
JSON.stringify(state.session)
);
        // Profile: long-lived and keyed by user, so process() can read it back (no TTL here)
        await this.redis.set(
          `profile:${state.working.userId || 'user'}`,
          JSON.stringify(state.profile)
        );
return {};
});
workflow.addEdge('__start__', 'processInput');
workflow.addEdge('processInput', 'retrieveRelevantMemory');
workflow.addEdge('retrieveRelevantMemory', 'generateResponse');
workflow.addEdge('generateResponse', 'updateMemoryTiers');
workflow.addEdge('updateMemoryTiers', 'persistMemory');
workflow.addEdge('persistMemory', '__end__');
const checkpointer = new MemorySaver();
return workflow.compile({ checkpointer });
}
private async extractImportantFacts(messages: string[]): Promise<string[]> {
if (messages.length === 0) return [];
const extraction = await this.model.invoke([
new SystemMessage('Extract important facts about the user as a JSON array'),
...messages.map(m => new HumanMessage(m)),
]);
try {
return JSON.parse(extraction.content as string);
} catch {
return [];
}
}
async process(userId: string, message: string) {
// Load existing profile
const profile = await this.redis.get(`profile:${userId}`) as UserProfile || {
preferences: {},
facts: [],
interests: [],
lastUpdated: Date.now(),
};
const result = await this.graph.invoke(
{
        working: {
          userId,
          context: [message],
        },
profile,
},
{
configurable: { thread_id: userId },
}
);
return {
response: result.session.messages[result.session.messages.length - 1],
memoryStats: {
workingMemorySize: result.working.context.length,
sessionMessages: result.session.messages.length,
profileFacts: result.profile.facts.length,
},
};
}
}
Implements a three-tier memory hierarchy with working memory for immediate context, session memory for conversations, and long-term profile storage.
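A sketch of the endpoint the chat component in the next example talks to, wiring the hierarchical system into /api/memory-chat and shaping the response fields the component reads (the anonymous fallback is an assumption; in production the user id would come from auth):
// app/api/memory-chat/route.ts (illustrative wiring for HierarchicalMemorySystem)
import { HierarchicalMemorySystem } from '@/lib/memory/hierarchical-memory';
import { NextResponse } from 'next/server';

export const runtime = 'nodejs';
export const maxDuration = 300;

const memorySystem = new HierarchicalMemorySystem();

export async function POST(req: Request) {
  const { message, sessionId } = await req.json();
  // The client-generated session id doubles as the memory thread / user key here
  const result = await memorySystem.process(sessionId ?? 'anonymous', message);
  return NextResponse.json({
    response: result.response,
    memoryStats: {
      workingMemory: result.memoryStats.workingMemorySize,
      sessionMessages: result.memoryStats.sessionMessages,
      profileFacts: result.memoryStats.profileFacts,
    },
  });
}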
5. Frontend Integration with Memory Status
// components/MemoryChat.tsx
'use client';
import { useState, useEffect } from 'react';
import { useMutation } from '@tanstack/react-query';
import { groupBy, debounce } from 'es-toolkit';
interface MemoryStats {
workingMemory: number;
sessionMessages: number;
profileFacts: number;
relevantMemories?: Array<{
content: string;
relevanceScore: number;
}>;
}
export default function MemoryChat() {
const [message, setMessage] = useState('');
const [sessionId] = useState(() =>
`session-${Date.now()}-${Math.random()}`
);
const [memoryStats, setMemoryStats] = useState<MemoryStats>({
workingMemory: 0,
sessionMessages: 0,
profileFacts: 0,
});
const [messages, setMessages] = useState<Array<{
role: 'user' | 'assistant';
content: string;
timestamp: number;
}>>([]);
const sendMessage = useMutation({
mutationFn: async (text: string) => {
const response = await fetch('/api/memory-chat', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
message: text,
sessionId
}),
});
return response.json();
},
    onSuccess: (data, sentText) => {
      // Use the mutation variable rather than component state to avoid a stale closure
      setMessages(prev => [
        ...prev,
        {
          role: 'user',
          content: sentText,
          timestamp: Date.now()
        },
        {
          role: 'assistant',
          content: data.response,
          timestamp: Date.now()
        },
      ]);
      setMemoryStats(prev => data.memoryStats ?? prev);
      setMessage('');
    },
});
  // Auto-save the session after a quiet period
  useEffect(() => {
    const saveSession = debounce(async () => {
      await fetch('/api/memory-chat/save', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ sessionId, messages }),
      });
    }, 10000);
    if (messages.length > 0) {
      saveSession();
    }
    // Cancel the pending save if messages change again before the delay elapses
    return () => saveSession.cancel();
  }, [messages, sessionId]);
const handleSubmit = (e: React.FormEvent) => {
e.preventDefault();
if (message.trim()) {
sendMessage.mutate(message);
}
};
// Group messages by time period
const messageGroups = groupBy(messages, (msg) => {
const date = new Date(msg.timestamp);
return date.toLocaleDateString();
});
return (
<div className="flex flex-col h-screen max-h-[800px] bg-base-100">
{/* Memory Status Bar */}
<div className="navbar bg-base-200 px-4">
<div className="flex-1">
<h2 className="text-xl font-bold">AI Memory Chat</h2>
</div>
<div className="flex-none">
<div className="stats stats-horizontal shadow">
<div className="stat place-items-center py-2 px-4">
<div className="stat-title text-xs">Working</div>
<div className="stat-value text-lg">
{memoryStats.workingMemory}
</div>
</div>
<div className="stat place-items-center py-2 px-4">
<div className="stat-title text-xs">Session</div>
<div className="stat-value text-lg">
{memoryStats.sessionMessages}
</div>
</div>
<div className="stat place-items-center py-2 px-4">
<div className="stat-title text-xs">Profile</div>
<div className="stat-value text-lg">
{memoryStats.profileFacts}
</div>
</div>
</div>
</div>
</div>
{/* Messages Area */}
<div className="flex-1 overflow-y-auto p-4 space-y-4">
{Object.entries(messageGroups).map(([date, msgs]) => (
<div key={date}>
<div className="divider text-sm">{date}</div>
{msgs.map((msg, idx) => (
<div
key={idx}
className={`chat ${
msg.role === 'user' ? 'chat-end' : 'chat-start'
}`}
>
<div className="chat-header">
{msg.role === 'user' ? 'You' : 'AI'}
</div>
<div
className={`chat-bubble ${
msg.role === 'user'
? 'chat-bubble-primary'
: 'chat-bubble-secondary'
}`}
>
{msg.content}
</div>
</div>
))}
</div>
))}
{/* Show relevant memories if available */}
{memoryStats.relevantMemories &&
memoryStats.relevantMemories.length > 0 && (
<div className="alert alert-info">
<div>
<h4 className="font-bold">Relevant Memories:</h4>
{memoryStats.relevantMemories.map((mem, idx) => (
<div key={idx} className="text-sm mt-1">
• {mem.content} (relevance: {mem.relevanceScore.toFixed(2)})
</div>
))}
</div>
</div>
)}
{sendMessage.isPending && (
<div className="flex justify-start">
<div className="chat-bubble">
<span className="loading loading-dots loading-sm"></span>
</div>
</div>
)}
</div>
{/* Input Area */}
<form onSubmit={handleSubmit} className="p-4 bg-base-200">
<div className="join w-full">
<input
type="text"
value={message}
onChange={(e) => setMessage(e.target.value)}
placeholder="Type your message..."
className="input input-bordered join-item flex-1"
disabled={sendMessage.isPending}
/>
<button
type="submit"
className="btn btn-primary join-item"
disabled={sendMessage.isPending || !message.trim()}
>
Send
</button>
</div>
</form>
</div>
);
}
React component that visualizes memory statistics in real time, showing working memory, session messages, and profile facts.
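The component also autosaves to /api/memory-chat/save, which is not shown above; a minimal sketch of that endpoint, snapshotting the client-side transcript to Redis under an assumed key prefix:
// app/api/memory-chat/save/route.ts (illustrative autosave endpoint)
import { Redis } from '@upstash/redis';
import { NextResponse } from 'next/server';

export const runtime = 'nodejs';

const redis = new Redis({
  url: process.env.UPSTASH_REDIS_URL!,
  token: process.env.UPSTASH_REDIS_TOKEN!,
});

export async function POST(req: Request) {
  const { sessionId, messages } = await req.json();
  // Keep the raw transcript snapshot for 24 hours
  await redis.setex(`transcript:${sessionId}`, 86400, JSON.stringify(messages));
  return NextResponse.json({ saved: messages.length });
}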
Conclusion
Memory management transforms stateless AI interactions into intelligent, context-aware conversations that build relationships over time. This guide demonstrated progression from simple buffer memory through sophisticated hierarchical systems optimized for Vercel's serverless constraints. Key patterns include using LangGraph's StateGraph for complex memory flows, implementing Redis for serverless persistence, leveraging vector stores for semantic retrieval, and building multi-tier architectures that mirror human cognitive systems. The combination of working, episodic, semantic, and procedural memory enables agents to maintain context, learn from interactions, and provide increasingly personalized assistance while operating efficiently within serverless execution limits.