DRAFT Agentic Design Patterns - Exception Handling

Master the art of building fault-tolerant AI agents that gracefully handle errors, recover from failures, and maintain reliability in production using TypeScript, LangChain, LangGraph, and Vercel's serverless platform.

Mental Model: The Adaptive Emergency Response System

Think of exception handling in agents like a hospital's emergency response system. Just as a hospital has different protocols for different severity levels (minor injury → nurse, moderate → doctor, critical → full trauma team), agents need tiered error responses. Network errors are like supply delays (wait and retry), API failures are like equipment malfunction (use backup equipment), and system crashes are like power outages (activate emergency generators and alert administrators). The system doesn't just react to problems—it learns from them, adapts its protocols, and maintains service continuity. Your agent should similarly detect issues early, respond appropriately, recover gracefully, and improve from each incident.
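
The tiers in this analogy can be sketched as a small lookup table. This is only an illustration of the mental model: the `FailureKind` names and the `triage` function are hypothetical and are not used elsewhere in this guide.

```typescript
// Hypothetical triage table mirroring the hospital analogy.
type FailureKind = 'network' | 'api' | 'system';
type Severity = 'transient' | 'recoverable' | 'critical';
type TieredResponse = 'retry' | 'fallback' | 'escalate';

interface Triage {
  severity: Severity;
  response: TieredResponse;
}

function triage(kind: FailureKind): Triage {
  switch (kind) {
    case 'network': // supply delay: wait and retry
      return { severity: 'transient', response: 'retry' };
    case 'api': // equipment malfunction: use the backup
      return { severity: 'recoverable', response: 'fallback' };
    case 'system': // power outage: alert administrators
      return { severity: 'critical', response: 'escalate' };
  }
}
```

The point of the table is that the response is chosen from the kind of failure, not hard-coded at each call site.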

Basic Example: Robust Agent with Error Boundaries

1. Define Error Hierarchy and Recovery Strategies

// lib/errors/exception-types.ts
import { z } from 'zod';
import { isError, isString } from 'es-toolkit';

export const ErrorLevelSchema = z.enum(['transient', 'recoverable', 'critical']);
export type ErrorLevel = z.infer<typeof ErrorLevelSchema>;

export const RecoveryStrategySchema = z.enum([
  'retry',
  'fallback',
  'cache',
  'degrade',
  'escalate',
  'abort'
]);
export type RecoveryStrategy = z.infer<typeof RecoveryStrategySchema>;

export interface ErrorContext {
  timestamp: Date;
  attempt: number;
  maxAttempts: number;
  strategy: RecoveryStrategy;
  metadata?: Record<string, any>;
}

export class AgentException extends Error {
  constructor(
    message: string,
    public level: ErrorLevel,
    public strategy: RecoveryStrategy,
    public context?: ErrorContext
  ) {
    super(message);
    this.name = 'AgentException';
  }

  static fromError(error: unknown, level: ErrorLevel = 'recoverable'): AgentException {
    if (error instanceof AgentException) return error;
    
    const message = isError(error) ? error.message : 
                   isString(error) ? error : 
                   'Unknown error occurred';
    
    return new AgentException(message, level, 'retry');
  }
}

export class ToolException extends AgentException {
  constructor(
    public toolName: string,
    message: string,
    strategy: RecoveryStrategy = 'fallback'
  ) {
    super(`Tool [${toolName}]: ${message}`, 'recoverable', strategy);
    this.name = 'ToolException';
  }
}

export class ValidationException extends AgentException {
  constructor(
    message: string,
    public validationErrors?: z.ZodIssue[]
  ) {
    super(message, 'transient', 'retry');
    this.name = 'ValidationException';
  }
}

Creates a structured error hierarchy with clear recovery strategies and contextual information for intelligent error handling.
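
To show why normalizing thrown values matters, here is a pared-down, self-contained stand-in for the hierarchy. `MiniAgentException` and `describeFailure` are hypothetical names used only in this sketch; the real classes above carry more fields.

```typescript
type ErrorLevel = 'transient' | 'recoverable' | 'critical';

// Pared-down stand-in for AgentException (hypothetical name).
class MiniAgentException extends Error {
  level: ErrorLevel;

  constructor(message: string, level: ErrorLevel) {
    super(message);
    this.name = 'MiniAgentException';
    this.level = level;
  }

  // Turn anything that was thrown into a typed exception.
  static fromError(error: unknown, level: ErrorLevel = 'recoverable'): MiniAgentException {
    if (error instanceof MiniAgentException) return error;
    const message =
      error instanceof Error ? error.message :
      typeof error === 'string' ? error :
      'Unknown error occurred';
    return new MiniAgentException(message, level);
  }
}

// Callers can branch on `level` without caring what was originally thrown.
function describeFailure(thrown: unknown): string {
  const e = MiniAgentException.fromError(thrown);
  return `${e.level}: ${e.message}`;
}
```

Because JavaScript allows throwing any value, a single normalization step like `fromError` keeps the rest of the recovery code free of `typeof` checks.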

2. Build Error Recovery Manager

// lib/recovery/recovery-manager.ts
import { delay } from 'es-toolkit';
import { 
  AgentException, 
  ErrorLevel, 
  RecoveryStrategy,
  ErrorContext 
} from '@/lib/errors/exception-types';

interface RecoveryConfig {
  maxRetries: number;
  baseDelay: number;
  maxDelay: number;
  timeout: number;
  fallbackHandlers: Map<string, () => Promise<any>>;
}

export class RecoveryManager {
  private errorHistory: AgentException[] = [];
  private config: RecoveryConfig;

  constructor(config: Partial<RecoveryConfig> = {}) {
    this.config = {
      maxRetries: config.maxRetries ?? 3,
      baseDelay: config.baseDelay ?? 1000,
      maxDelay: config.maxDelay ?? 30000,
      timeout: config.timeout ?? 300000, // 300 s, matching the route's maxDuration on Vercel
      fallbackHandlers: config.fallbackHandlers ?? new Map()
    };
  }

  async executeWithRecovery<T>(
    operation: () => Promise<T>,
    operationName: string
  ): Promise<T> {
    const startTime = Date.now();
    let lastError: AgentException | null = null;

    for (let attempt = 1; attempt <= this.config.maxRetries; attempt++) {
      try {
        // Check timeout
        if (Date.now() - startTime > this.config.timeout) {
          throw new AgentException(
            'Operation timeout exceeded',
            'critical',
            'abort'
          );
        }

        // Execute operation
        const result = await Promise.race([
          operation(),
          this.createTimeout(this.config.timeout - (Date.now() - startTime))
        ]);

        // Log when success came only after one or more failed attempts
        if (attempt > 1) {
          console.log(`Recovery successful for ${operationName} on attempt ${attempt}`);
        }
        
        return result;

      } catch (error) {
        lastError = AgentException.fromError(error);
        lastError.context = {
          timestamp: new Date(),
          attempt,
          maxAttempts: this.config.maxRetries,
          strategy: this.determineStrategy(lastError, attempt)
        };

        this.errorHistory.push(lastError);
        
        // Apply recovery strategy
        const recovered = await this.applyStrategy(
          lastError,
          operationName,
          attempt
        );

        if (recovered !== null) {
          return recovered;
        }

        // Calculate backoff delay
        if (attempt < this.config.maxRetries) {
          const delayMs = Math.min(
            this.config.baseDelay * Math.pow(2, attempt - 1),
            this.config.maxDelay
          );
          await delay(delayMs);
        }
      }
    }

    throw lastError || new AgentException(
      `${operationName} failed after ${this.config.maxRetries} attempts`,
      'critical',
      'escalate'
    );
  }

  private determineStrategy(
    error: AgentException,
    attempt: number
  ): RecoveryStrategy {
    // Critical errors should escalate immediately
    if (error.level === 'critical') return 'escalate';
    
    // Transient errors should retry first
    if (error.level === 'transient' && attempt < this.config.maxRetries) {
      return 'retry';
    }
    
    // Recoverable errors should try fallback
    if (error.level === 'recoverable') {
      return 'fallback';
    }
    
    // Default to degraded service
    return 'degrade';
  }

  private async applyStrategy<T>(
    error: AgentException,
    operationName: string,
    attempt: number
  ): Promise<T | null> {
    const strategy = error.context?.strategy || error.strategy;

    switch (strategy) {
      case 'fallback': {
        // Braces scope the declaration to this case
        const fallback = this.config.fallbackHandlers.get(operationName);
        if (fallback) {
          console.log(`Applying fallback for ${operationName}`);
          return await fallback();
        }
        break;
      }

      case 'cache':
        // Return cached result if available
        console.log(`Would return cached result for ${operationName}`);
        break;

      case 'degrade':
        console.log(`Degrading service for ${operationName}`);
        return null;

      case 'escalate':
        console.error(`Escalating error for ${operationName}:`, error.message);
        throw error;

      case 'abort':
        console.error(`Aborting ${operationName}`);
        throw error;
    }

    return null;
  }

  private createTimeout(ms: number): Promise<never> {
    return new Promise((_, reject) =>
      setTimeout(() => reject(new AgentException(
        'Operation timeout',
        'transient',
        'retry'
      )), ms)
    );
  }

  getErrorHistory(): AgentException[] {
    return [...this.errorHistory];
  }

  clearHistory(): void {
    this.errorHistory = [];
  }
}

Implements a recovery manager that picks a strategy from the error level and attempt number, then applies it: run a registered fallback, retry after a capped exponential backoff delay, or rethrow on escalate and abort.
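
The backoff schedule used between attempts is easy to check in isolation. The sketch below restates the same formula as a standalone function; `backoffDelay` is a hypothetical helper name, and the defaults mirror the `baseDelay` and `maxDelay` defaults in the config above.

```typescript
// Delay before the next attempt: baseDelay * 2^(attempt - 1), capped at maxDelay.
function backoffDelay(attempt: number, baseDelay = 1000, maxDelay = 30000): number {
  return Math.min(baseDelay * Math.pow(2, attempt - 1), maxDelay);
}

// First six attempts with the defaults:
// 1000, 2000, 4000, 8000, 16000, 30000 (the sixth would be 32000 without the cap)
const schedule = [1, 2, 3, 4, 5, 6].map((attempt) => backoffDelay(attempt));
```

Doubling keeps early retries fast while the cap prevents a long outage from producing delays that exceed the function's time budget.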

3. Create Error-Aware Agent with Tools

// lib/agents/error-aware-agent.ts
import { DynamicStructuredTool } from '@langchain/core/tools';
import { z } from 'zod';
import { RecoveryManager } from '@/lib/recovery/recovery-manager';
import { ToolException, ValidationException } from '@/lib/errors/exception-types';

// Safe tool wrapper that handles errors
export function createSafeTool(
  name: string,
  description: string,
  schema: z.ZodSchema,
  implementation: (input: any) => Promise<any>,
  fallback?: () => Promise<any>
) {
  return new DynamicStructuredTool({
    name,
    description,
    schema,
    func: async (input) => {
      const recoveryManager = new RecoveryManager({
        maxRetries: 2,
        fallbackHandlers: fallback ? 
          new Map([[name, fallback]]) : 
          new Map()
      });

      try {
        // Validate input
        const validated = schema.safeParse(input);
        if (!validated.success) {
          throw new ValidationException(
            'Invalid tool input',
            validated.error.issues
          );
        }

        // Execute with recovery
        return await recoveryManager.executeWithRecovery(
          () => implementation(validated.data),
          name
        );

      } catch (error) {
        console.error(`Tool ${name} failed:`, error);
        throw new ToolException(name, 
          error instanceof Error ? error.message : 'Unknown error'
        );
      }
    }
  });
}

// Example tools with built-in error handling
export function createResilientTools() {
  const weatherTool = createSafeTool(
    'get_weather',
    'Get current weather for a location',
    z.object({
      location: z.string().min(1),
      units: z.enum(['celsius', 'fahrenheit']).default('celsius')
    }),
    async (input) => {
      // Simulate occasional failures
      if (Math.random() < 0.3) {
        throw new Error('Weather API unavailable');
      }
      return {
        location: input.location,
        temperature: 22,
        units: input.units,
        conditions: 'Partly cloudy'
      };
    },
    async () => ({
      location: 'Unknown',
      temperature: 20,
      units: 'celsius',
      conditions: 'Data unavailable',
      source: 'fallback'
    })
  );

  const calculatorTool = createSafeTool(
    'calculator',
    'Perform mathematical calculations',
    z.object({
      expression: z.string(),
      precision: z.number().int().min(0).max(10).default(2)
    }),
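    async (input) => {
      // The Function constructor executes arbitrary JavaScript, so first accept
      // only digits, whitespace, parentheses, decimal points and + - * / %
      if (!/^[\d\s+\-*/%().]+$/.test(input.expression)) {
        throw new Error('Invalid mathematical expression');
      }
      try {
        const result = new Function(`"use strict"; return (${input.expression});`)();
        if (typeof result !== 'number' || !Number.isFinite(result)) {
          throw new Error('Expression did not produce a finite number');
        }
        return {
          expression: input.expression,
          result: Number(result.toFixed(input.precision))
        };
      } catch {
        throw new Error('Invalid mathematical expression');
      }
    }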
  );

  const searchTool = createSafeTool(
    'web_search',
    'Search the web for information',
    z.object({
      query: z.string().min(1).max(200),
      maxResults: z.number().int().min(1).max(10).default(5)
    }),
    async (input) => {
      // Simulate search with possible failures
      if (Math.random() < 0.2) {
        throw new Error('Search service timeout');
      }
      
      return {
        query: input.query,
        results: Array.from({ length: input.maxResults }, (_, i) => ({
          title: `Result ${i + 1} for "${input.query}"`,
          snippet: `Sample content for result ${i + 1}`,
          url: `https://example.com/result-${i + 1}`
        }))
      };
    },
    async () => ({
      query: 'fallback',
      results: [],
      error: 'Search unavailable, please try again later'
    })
  );

  return [weatherTool, calculatorTool, searchTool];
}

Creates error-aware tools with validation, recovery mechanisms, and fallback handlers for reliable execution.
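
The calculator tool receives model-generated expressions, so it is worth considering an evaluator that never hands input to the JavaScript engine. The following is a minimal sketch of a recursive-descent parser for `+ - * /` and parentheses; `evaluateArithmetic` is a hypothetical helper, not part of any library used here.

```typescript
// Evaluates basic arithmetic without eval or the Function constructor.
function evaluateArithmetic(expression: string): number {
  const src = expression.replace(/\s+/g, '');
  let pos = 0;

  // expression := term (('+' | '-') term)*
  function parseExpression(): number {
    let value = parseTerm();
    while (src[pos] === '+' || src[pos] === '-') {
      const op = src[pos++];
      const rhs = parseTerm();
      value = op === '+' ? value + rhs : value - rhs;
    }
    return value;
  }

  // term := factor (('*' | '/') factor)*
  function parseTerm(): number {
    let value = parseFactor();
    while (src[pos] === '*' || src[pos] === '/') {
      const op = src[pos++];
      const rhs = parseFactor();
      value = op === '*' ? value * rhs : value / rhs;
    }
    return value;
  }

  // factor := '-' factor | '(' expression ')' | number
  function parseFactor(): number {
    if (src[pos] === '-') {
      pos++;
      return -parseFactor();
    }
    if (src[pos] === '(') {
      pos++;
      const value = parseExpression();
      if (src[pos] !== ')') throw new Error('Expected closing parenthesis');
      pos++;
      return value;
    }
    const match = /^\d+(\.\d+)?/.exec(src.slice(pos));
    if (!match) throw new Error(`Unexpected input at position ${pos}`);
    pos += match[0].length;
    return Number(match[0]);
  }

  const result = parseExpression();
  if (pos !== src.length) throw new Error(`Unexpected input at position ${pos}`);
  if (!Number.isFinite(result)) throw new Error('Result is not a finite number');
  return result;
}
```

Anything that is not a number, operator, or parenthesis is rejected with an error, which the tool wrapper can then surface as a `ToolException`.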

4. Implement Agent API with Error Boundaries

// app/api/error-aware-agent/route.ts
import { NextResponse } from 'next/server';
import { ChatGoogleGenerativeAI } from '@langchain/google-genai';
import { createResilientTools } from '@/lib/agents/error-aware-agent';
import { RecoveryManager } from '@/lib/recovery/recovery-manager';
import { AgentException } from '@/lib/errors/exception-types';
import { createReactAgent } from '@langchain/langgraph/prebuilt';
import { HumanMessage } from '@langchain/core/messages';

export const runtime = 'nodejs';
export const maxDuration = 300;

export async function POST(req: Request) {
  const recoveryManager = new RecoveryManager();
  
  try {
    const { message } = await req.json();
    
    if (!message || typeof message !== 'string') {
      // Malformed input is a client error: reject it without entering recovery
      return NextResponse.json(
        { success: false, error: 'Invalid request: message is required' },
        { status: 400 }
      );
    }

    // Execute agent with recovery
    const result = await recoveryManager.executeWithRecovery(
      async () => {
        const model = new ChatGoogleGenerativeAI({
          modelName: 'gemini-2.5-flash',
          temperature: 0.3,
          maxOutputTokens: 2048
        });

        const tools = createResilientTools();
        
        const agent = createReactAgent({
          llm: model,
          tools,
          messageModifier: `You are a helpful assistant with error recovery capabilities.
            If a tool fails, acknowledge it gracefully and try alternatives if possible.
            Always provide helpful responses even when tools are unavailable.`
        });

        const response = await agent.invoke({
          messages: [new HumanMessage(message)]
        });

        return response.messages[response.messages.length - 1].content;
      },
      'agent-execution'
    );

    const errorHistory = recoveryManager.getErrorHistory();
    
    return NextResponse.json({
      success: true,
      result,
      recoveryAttempts: errorHistory.length,
      errors: errorHistory.map(e => ({
        message: e.message,
        level: e.level,
        strategy: e.context?.strategy ?? e.strategy, // strategy chosen for this attempt
        attempt: e.context?.attempt
      }))
    });

  } catch (error) {
    const agentError = AgentException.fromError(error, 'critical');
    
    console.error('Agent execution failed:', agentError);
    
    return NextResponse.json(
      {
        success: false,
        error: agentError.message,
        level: agentError.level,
        strategy: agentError.strategy,
        errorHistory: recoveryManager.getErrorHistory()
      },
      { status: agentError.level === 'critical' ? 500 : 503 }
    );
  }
}

API route implementing comprehensive error boundaries with detailed error tracking and recovery attempts.
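
One way to make the status code choice explicit is a small mapping function. This is a sketch under assumptions: `statusForFailure` and its `causedByClientInput` flag are hypothetical, and the 400 branch is an addition that the route may or may not use.

```typescript
type ErrorLevel = 'transient' | 'recoverable' | 'critical';

// Hypothetical helper: maps a failure to an HTTP status code.
function statusForFailure(level: ErrorLevel, causedByClientInput = false): number {
  if (causedByClientInput) return 400;  // malformed request: retrying unchanged will not help
  if (level === 'critical') return 500; // unrecoverable server-side failure
  return 503;                           // temporary condition: the client may retry later
}
```

Separating client errors from server errors tells callers whether a retry is worthwhile.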

5. Create Frontend with Error Feedback

// components/ErrorAwareChat.tsx
'use client';

import { useState } from 'react';
import { useMutation } from '@tanstack/react-query';

interface ChatResponse {
  success: boolean;
  result?: string;
  error?: string;
  level?: string;
  recoveryAttempts?: number;
  errors?: Array<{
    message: string;
    level: string;
    strategy: string;
    attempt?: number;
  }>;
}

export default function ErrorAwareChat() {
  const [message, setMessage] = useState('');
  const [chatHistory, setChatHistory] = useState<Array<{
    role: 'user' | 'assistant' | 'error';
    content: string;
    metadata?: any;
  }>>([]);

  const sendMessage = useMutation<ChatResponse, Error, string>({
    mutationFn: async (message: string) => {
      const response = await fetch('/api/error-aware-agent', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ message }),
      });

      const data = await response.json();
      
      if (!response.ok || !data.success) {
        throw new Error(data.error || 'Request failed');
      }
      
      return data;
    },
    onSuccess: (data) => {
      if (data.success) {
        setChatHistory(prev => [
          ...prev,
          { role: 'assistant', content: data.result!, metadata: data }
        ]);
      }
    },
    onError: (error) => {
      setChatHistory(prev => [
        ...prev,
        { 
          role: 'error', 
          content: `Error: ${error.message}`,
          metadata: { timestamp: new Date() }
        }
      ]);
    }
  });

  const handleSubmit = (e: React.FormEvent) => {
    e.preventDefault();
    if (!message.trim()) return;

    setChatHistory(prev => [...prev, { role: 'user', content: message }]);
    sendMessage.mutate(message);
    setMessage('');
  };

  const getRecoveryBadge = (attempts?: number) => {
    if (!attempts) return null;
    
    const badgeClass = attempts > 2 ? 'badge-warning' : 'badge-info';
    return (
      <div className={`badge ${badgeClass} badge-sm`}>
        {attempts} recovery attempt{attempts > 1 ? 's' : ''}
      </div>
    );
  };

  return (
    <div className="card w-full bg-base-100 shadow-xl">
      <div className="card-body">
        <h2 className="card-title">Error-Aware AI Assistant</h2>
        
        {/* Chat History */}
        <div className="h-96 overflow-y-auto space-y-2 p-4 bg-base-200 rounded-lg">
          {chatHistory.map((msg, idx) => (
            <div
              key={idx}
              className={`chat ${msg.role === 'user' ? 'chat-end' : 'chat-start'}`}
            >
              <div className="chat-header">
                {msg.role === 'user' ? 'You' : 
                 msg.role === 'assistant' ? 'Assistant' : 'System'}
              </div>
              <div className={`chat-bubble ${
                msg.role === 'error' ? 'chat-bubble-error' :
                msg.role === 'user' ? 'chat-bubble-primary' :
                'chat-bubble-secondary'
              }`}>
                {msg.content}
                {msg.metadata?.recoveryAttempts > 0 && (
                  <div className="mt-2">
                    {getRecoveryBadge(msg.metadata.recoveryAttempts)}
                  </div>
                )}
              </div>
              
              {msg.metadata?.errors && msg.metadata.errors.length > 0 && (
                <div className="chat-footer opacity-50 text-xs">
                  Recovery strategy: {msg.metadata.errors[0].strategy}
                </div>
              )}
            </div>
          ))}
          
          {sendMessage.isPending && (
            <div className="chat chat-start">
              <div className="chat-bubble chat-bubble-secondary">
                <span className="loading loading-dots loading-sm"></span>
              </div>
            </div>
          )}
        </div>

        {/* Input Form */}
        <form onSubmit={handleSubmit} className="join w-full mt-4">
          <input
            type="text"
            className="input input-bordered join-item flex-1"
            placeholder="Ask me anything..."
            value={message}
            onChange={(e) => setMessage(e.target.value)}
            disabled={sendMessage.isPending}
          />
          <button
            type="submit"
            className="btn btn-primary join-item"
            disabled={sendMessage.isPending || !message.trim()}
          >
            Send
          </button>
        </form>

        {/* Error Status */}
        {sendMessage.isError && (
          <div className="alert alert-error mt-2">
            <span>Failed to send message. Please try again.</span>
          </div>
        )}
      </div>
    </div>
  );
}

React component with visual feedback for error recovery attempts and graceful error display.

Advanced Example: Self-Correcting Multi-Agent System

1. Implement Error Propagation in Agent Hierarchy

// lib/agents/error-propagation.ts
import { EventEmitter } from 'events';
import { throttle } from 'es-toolkit';

export interface ErrorEvent {
  agentId: string;
  parentId?: string;
  error: Error;
  timestamp: Date;
  handled: boolean;
  propagated: boolean;
}

export class ErrorPropagationManager extends EventEmitter {
  private errorChain: Map<string, ErrorEvent[]> = new Map();
  private handlers: Map<string, (error: ErrorEvent) => Promise<boolean>> = new Map();
  
  constructor() {
    super();
    this.setupErrorHandling();
  }

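  private setupErrorHandling() {
    // Throttle logging of generic 'error' events to prevent flooding.
    // Re-emitting 'error' from its own listener would feed the event back
    // into itself, so this listener only logs.
    const throttledLog = throttle((event: ErrorEvent) => {
      console.error(`Unhandled agent error from ${event.agentId}:`, event.error.message);
    }, 1000);

    this.on('error', throttledLog);
  }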

  registerAgent(
    agentId: string,
    parentId?: string,
    errorHandler?: (error: ErrorEvent) => Promise<boolean>
  ) {
    this.errorChain.set(agentId, []);
    
    if (errorHandler) {
      this.handlers.set(agentId, errorHandler);
    }
    
    // Always listen, so root agents (no parent) still run their own handler
    this.on(`error:${agentId}`, async (event: ErrorEvent) => {
      // Try local handler first
      const handled = await this.tryHandleError(agentId, event);
      
      if (!handled && parentId) {
        // Propagate the same event object to the parent so that a parent
        // handler's result is visible to reportError through event.handled
        event.propagated = true;
        event.parentId = parentId;
        this.emit(`error:${parentId}`, event);
      }
    });
  }

  async reportError(agentId: string, error: Error): Promise<boolean> {
    const event: ErrorEvent = {
      agentId,
      error,
      timestamp: new Date(),
      handled: false,
      propagated: false
    };

    // Store in error chain
    const chain = this.errorChain.get(agentId) || [];
    chain.push(event);
    this.errorChain.set(agentId, chain);

    // Emit for handling
    this.emit(`error:${agentId}`, event);
    
    // Give handlers a fixed 100 ms window to run; a handler that takes
    // longer is reported as unhandled
    return new Promise((resolve) => {
      setTimeout(() => {
        resolve(event.handled);
      }, 100);
    });
  }

  private async tryHandleError(
    agentId: string,
    event: ErrorEvent
  ): Promise<boolean> {
    const handler = this.handlers.get(agentId);
    
    if (handler) {
      try {
        event.handled = await handler(event);
        return event.handled;
      } catch (handlerError) {
        console.error(`Handler for ${agentId} failed:`, handlerError);
        return false;
      }
    }
    
    return false;
  }

  getErrorChain(agentId: string): ErrorEvent[] {
    return this.errorChain.get(agentId) || [];
  }

  clearErrorChain(agentId?: string) {
    if (agentId) {
      this.errorChain.delete(agentId);
    } else {
      this.errorChain.clear();
    }
  }
}

Manages error propagation through agent hierarchies with local handling and parent escalation.
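
The escalation rule itself (try the local handler, then walk up to the parent) can be illustrated without events or timers. The sketch below is synchronous and self-contained; `AgentNode` and `findHandlingAgent` are hypothetical names, and the agent ids are borrowed from the workflow in the next step.

```typescript
type Handler = (message: string) => boolean;

interface AgentNode {
  parentId?: string;
  handler?: Handler;
}

// Returns the id of the first agent in the parent chain whose handler
// accepts the error, or null if nobody handles it.
function findHandlingAgent(
  agents: Map<string, AgentNode>,
  startId: string,
  message: string
): string | null {
  const visited = new Set<string>(); // guards against cycles in the hierarchy
  let currentId: string | undefined = startId;
  while (currentId !== undefined && !visited.has(currentId)) {
    visited.add(currentId);
    const node = agents.get(currentId);
    if (!node) return null;
    if (node.handler?.(message)) return currentId;
    currentId = node.parentId;
  }
  return null;
}

const exampleAgents = new Map<string, AgentNode>([
  ['extraction', { handler: () => false }],
  ['validation', { parentId: 'extraction', handler: () => true }],
  ['analysis', { parentId: 'validation' }],
]);

// 'analysis' has no handler, so its parent 'validation' handles the error.
const handledBy = findHandlingAgent(exampleAgents, 'analysis', 'parse failed');
```

An error that reaches the top of the chain unhandled returns `null`, which is the point at which a real system would escalate or alert.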

2. Build Self-Correcting Workflow with Validation

// lib/workflows/self-correcting-workflow.ts
import { StateGraph, END } from '@langchain/langgraph';
import { ChatGoogleGenerativeAI } from '@langchain/google-genai';
import { BaseMessage, HumanMessage, SystemMessage } from '@langchain/core/messages';
import { z } from 'zod';
import { ErrorPropagationManager, ErrorEvent } from '@/lib/agents/error-propagation';

// Output validation schemas
const DataExtractionSchema = z.object({
  entities: z.array(z.string()),
  relationships: z.array(z.object({
    source: z.string(),
    target: z.string(),
    type: z.string()
  })),
  metadata: z.record(z.any())
});

const AnalysisResultSchema = z.object({
  insights: z.array(z.string()),
  confidence: z.number().min(0).max(1),
  recommendations: z.array(z.string())
});

interface WorkflowState {
  messages: BaseMessage[];
  stage: string;
  extractedData?: z.infer<typeof DataExtractionSchema>;
  analysisResult?: z.infer<typeof AnalysisResultSchema>;
  validationErrors: string[];
  correctionAttempts: number;
  finalOutput?: string;
}

export class SelfCorrectingWorkflow {
  private model: ChatGoogleGenerativeAI;
  private errorManager: ErrorPropagationManager;
  private maxCorrectionAttempts = 3;

  constructor() {
    this.model = new ChatGoogleGenerativeAI({
      modelName: 'gemini-2.5-pro',
      temperature: 0.2,
      maxOutputTokens: 4096
    });
    
    this.errorManager = new ErrorPropagationManager();
    this.setupErrorHandlers();
  }

  private setupErrorHandlers() {
    // Register agents with error handlers
    this.errorManager.registerAgent('extraction', undefined, async (event) => {
      console.log('Extraction error:', event.error.message);
      return false; // Let it propagate
    });

    this.errorManager.registerAgent('validation', 'extraction', async (event) => {
      console.log('Validation error, attempting correction');
      return true; // Handle locally
    });

    this.errorManager.registerAgent('analysis', 'validation');
    this.errorManager.registerAgent('synthesis', 'analysis');
  }

  createWorkflow() {
    const workflow = new StateGraph<WorkflowState>({
      channels: {
        messages: {
          value: (x: BaseMessage[], y: BaseMessage[]) => [...x, ...y],
          default: () => []
        },
        stage: {
          value: (x: string, y: string) => y || x,
          default: () => 'extraction'
        },
        extractedData: {
          value: (x: any, y: any) => y || x,
          default: () => undefined
        },
        analysisResult: {
          value: (x: any, y: any) => y || x,
          default: () => undefined
        },
        validationErrors: {
          // Replace rather than append, so a node can clear errors by returning []
          value: (x: string[], y: string[]) => y ?? x,
          default: () => []
        },
        correctionAttempts: {
          value: (x: number, y: number) => y ?? x,
          default: () => 0
        },
        finalOutput: {
          value: (x: string, y: string) => y || x,
          default: () => undefined
        }
      }
    });

    // Extraction node with structured output
    workflow.addNode('extraction', async (state) => {
      try {
        const prompt = `Extract entities and relationships from the following text.
Return a JSON with this structure:
{
  "entities": ["entity1", "entity2"],
  "relationships": [
    {"source": "entity1", "target": "entity2", "type": "relation_type"}
  ],
  "metadata": {}
}

Text: ${state.messages[0].content}`;

        const response = await this.model.invoke([
          new SystemMessage('You are a data extraction specialist. Always return valid JSON.'),
          new HumanMessage(prompt)
        ]);

        // Parse and validate JSON
        const jsonStr = response.content.toString()
          .replace(/```json\n?/g, '')
          .replace(/```\n?/g, '');
        
        const parsed = JSON.parse(jsonStr);
        const validated = DataExtractionSchema.parse(parsed);

        return {
          extractedData: validated,
          stage: 'validation'
        };
      } catch (error) {
        await this.errorManager.reportError('extraction', error as Error);
        
        return {
          stage: 'correction',
          validationErrors: [`Extraction failed: ${error}`]
        };
      }
    });

    // Validation node
    workflow.addNode('validation', async (state) => {
      if (!state.extractedData) {
        return {
          stage: 'correction',
          validationErrors: ['No data to validate']
        };
      }

      const errors: string[] = [];

      // Validate extracted data quality
      if (state.extractedData.entities.length === 0) {
        errors.push('No entities extracted');
      }

      if (state.extractedData.relationships.length === 0 && 
          state.extractedData.entities.length > 1) {
        errors.push('Multiple entities but no relationships defined');
      }

      // Check for orphaned relationships
      const entities = new Set(state.extractedData.entities);
      for (const rel of state.extractedData.relationships) {
        if (!entities.has(rel.source) || !entities.has(rel.target)) {
          errors.push(`Relationship references unknown entity: ${rel.source} -> ${rel.target}`);
        }
      }

      if (errors.length > 0) {
        return {
          stage: 'correction',
          validationErrors: errors
        };
      }

      return { stage: 'analysis' };
    });

    // Self-correction node
    workflow.addNode('correction', async (state) => {
      if (state.correctionAttempts >= this.maxCorrectionAttempts) {
        return {
          stage: 'failure',
          finalOutput: `Failed after ${this.maxCorrectionAttempts} correction attempts. Errors: ${state.validationErrors.join('; ')}`
        };
      }

      const correctionPrompt = `The previous extraction had these errors:
${state.validationErrors.join('\n')}

Please correct the extraction for this text:
${state.messages[0].content}

Ensure you address all the validation errors.`;

      try {
        const response = await this.model.invoke([
          new SystemMessage('You are a data extraction specialist. Fix the errors and return valid JSON.'),
          new HumanMessage(correctionPrompt)
        ]);

        const jsonStr = response.content.toString()
          .replace(/```json\n?/g, '')
          .replace(/```\n?/g, '');
        
        const parsed = JSON.parse(jsonStr);
        const validated = DataExtractionSchema.parse(parsed);

        return {
          extractedData: validated,
          stage: 'validation',
          correctionAttempts: state.correctionAttempts + 1,
          validationErrors: [] // Clear errors
        };
      } catch (error) {
        return {
          stage: 'correction',
          correctionAttempts: state.correctionAttempts + 1,
          validationErrors: [...state.validationErrors, `Correction failed: ${error}`]
        };
      }
    });

    // Analysis node
    workflow.addNode('analysis', async (state) => {
      if (!state.extractedData) {
        return {
          stage: 'failure',
          finalOutput: 'No data available for analysis'
        };
      }

      try {
        const analysisPrompt = `Analyze the following extracted data and provide insights:
Entities: ${state.extractedData.entities.join(', ')}
Relationships: ${JSON.stringify(state.extractedData.relationships)}

Provide analysis in JSON format:
{
  "insights": ["insight1", "insight2"],
  "confidence": 0.0-1.0,
  "recommendations": ["rec1", "rec2"]
}`;

        const response = await this.model.invoke([
          new SystemMessage('You are a data analyst. Provide thoughtful insights.'),
          new HumanMessage(analysisPrompt)
        ]);

        const jsonStr = response.content.toString()
          .replace(/```json\n?/g, '')
          .replace(/```\n?/g, '');
        
        const parsed = JSON.parse(jsonStr);
        const validated = AnalysisResultSchema.parse(parsed);

        return {
          analysisResult: validated,
          stage: 'synthesis'
        };
      } catch (error) {
        await this.errorManager.reportError('analysis', error as Error);
        
        // Degrade gracefully
        return {
          analysisResult: {
            insights: ['Analysis partially completed'],
            confidence: 0.3,
            recommendations: ['Manual review recommended']
          },
          stage: 'synthesis'
        };
      }
    });

    // Synthesis node
    workflow.addNode('synthesis', async (state) => {
      const report = `## Analysis Report

### Extracted Data
- **Entities Found**: ${state.extractedData?.entities.length || 0}
- **Relationships Identified**: ${state.extractedData?.relationships.length || 0}

### Key Insights
${state.analysisResult?.insights.map(i => `- ${i}`).join('\n') || 'No insights available'}

### Confidence Level
${Math.round((state.analysisResult?.confidence || 0) * 100)}%

### Recommendations
${state.analysisResult?.recommendations.map(r => `- ${r}`).join('\n') || 'No recommendations'}

### Data Quality
- Validation Errors Encountered: ${state.validationErrors.length}
- Correction Attempts: ${state.correctionAttempts}
- Final Status: ${state.validationErrors.length === 0 ? 'Success' : 'Partial Success'}`;

      return {
        finalOutput: report,
        stage: 'complete'
      };
    });

    // Define edges
    // Each router function returns the key of the next node
    workflow.addConditionalEdges(
      'extraction',
      (s: WorkflowState) => (s.stage === 'validation' ? 'validation' : 'correction'),
      { validation: 'validation', correction: 'correction' }
    );

    workflow.addConditionalEdges(
      'validation',
      (s: WorkflowState) => (s.stage === 'analysis' ? 'analysis' : 'correction'),
      { analysis: 'analysis', correction: 'correction' }
    );

    workflow.addConditionalEdges(
      'correction',
      (s: WorkflowState) =>
        s.stage === 'validation' ? 'validation' :
        s.stage === 'failure' ? 'synthesis' :
        'correction', // a failed correction retries until the attempt cap
      { validation: 'validation', synthesis: 'synthesis', correction: 'correction' }
    );

    workflow.addEdge('analysis', 'synthesis');
    workflow.addEdge('synthesis', END);

    workflow.setEntryPoint('extraction');

    return workflow.compile();
  }

  async execute(input: string): Promise<{
    success: boolean;
    output: string;
    metrics: {
      correctionAttempts: number;
      validationErrors: string[];
      errorChains: Map<string, ErrorEvent[]>;
    };
  }> {
    const workflow = this.createWorkflow();
    
    const result = await workflow.invoke({
      messages: [new HumanMessage(input)],
      stage: 'extraction',
      validationErrors: [],
      correctionAttempts: 0
    });

    return {
      success: result.validationErrors.length === 0,
      output: result.finalOutput || 'Processing failed',
      metrics: {
        correctionAttempts: result.correctionAttempts,
        validationErrors: result.validationErrors,
        errorChains: this.errorManager['errorChain']
      }
    };
  }
}

Implements a self-correcting workflow that validates outputs and automatically attempts corrections when errors are detected.
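
The routing decision at the heart of this loop can be isolated as a pure function, which makes the retry budget explicit and easy to test. The sketch below is illustrative only: `nextStageAfterValidation`, `CorrectionState`, and the limit of 3 in `MAX_CORRECTION_ATTEMPTS` are assumptions, not names or values taken from the workflow above.

```typescript
// Hypothetical sketch: stage names mirror the workflow above, everything
// else (type names, function name, attempt limit) is assumed for illustration.
type NextStage = 'analysis' | 'correction' | 'synthesis';

interface CorrectionState {
  validationErrors: string[];
  correctionAttempts: number;
}

// Assumed retry budget; tune it to the cost of one correction pass.
const MAX_CORRECTION_ATTEMPTS = 3;

// Decide where the workflow goes after a validation pass.
function nextStageAfterValidation(
  state: CorrectionState,
  maxAttempts: number = MAX_CORRECTION_ATTEMPTS
): NextStage {
  // Nothing to fix: continue with the normal pipeline.
  if (state.validationErrors.length === 0) return 'analysis';
  // Errors remain and there is budget left: try another correction.
  if (state.correctionAttempts < maxAttempts) return 'correction';
  // Budget exhausted: skip analysis and report a partial result.
  return 'synthesis';
}
```

Keeping this decision free of I/O means it can be unit-tested without calling a model, and the same function can serve as the routing callback for a conditional edge.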

3. Create Monitoring Dashboard API

// app/api/self-correcting/route.ts
import { NextResponse } from 'next/server';
import { SelfCorrectingWorkflow } from '@/lib/workflows/self-correcting-workflow';
import { z } from 'zod';

export const runtime = 'nodejs';
export const maxDuration = 300;

const RequestSchema = z.object({
  text: z.string().min(10).max(5000),
  enableMonitoring: z.boolean().default(true)
});

export async function POST(req: Request) {
  const encoder = new TextEncoder();
  const stream = new TransformStream();
  const writer = stream.writable.getWriter();

  (async () => {
    try {
      const body = await req.json();
      const validation = RequestSchema.safeParse(body);
      
      if (!validation.success) {
        await writer.write(
          encoder.encode(`data: ${JSON.stringify({
            type: 'error',
            message: 'Invalid request',
            errors: validation.error.issues
          })}\n\n`)
        );
        // The finally block below closes the writer; closing it here as well
        // would make that second close() reject.
        return;
      }

      const { text, enableMonitoring } = validation.data;
      
      // Initial acknowledgment
      await writer.write(
        encoder.encode(`data: ${JSON.stringify({
          type: 'start',
          message: 'Starting self-correcting workflow'
        })}\n\n`)
      );

      const workflow = new SelfCorrectingWorkflow();
      
      // Execute workflow
      const startTime = Date.now();
      const result = await workflow.execute(text);
      const executionTime = Date.now() - startTime;

      // Report correction attempts (the workflow has already finished here)
      if (result.metrics.correctionAttempts > 0) {
        await writer.write(
          encoder.encode(`data: ${JSON.stringify({
            type: 'correction',
            attempts: result.metrics.correctionAttempts,
            errors: result.metrics.validationErrors
          })}\n\n`)
        );
      }

      // Stream error chains if monitoring enabled
      if (enableMonitoring && result.metrics.errorChains.size > 0) {
        const errorSummary = Array.from(result.metrics.errorChains.entries())
          .map(([agent, events]) => ({
            agent,
            errorCount: events.length,
            handled: events.filter(e => e.handled).length
          }));

        await writer.write(
          encoder.encode(`data: ${JSON.stringify({
            type: 'monitoring',
            errorSummary
          })}\n\n`)
        );
      }

      // Final result
      await writer.write(
        encoder.encode(`data: ${JSON.stringify({
          type: 'complete',
          success: result.success,
          output: result.output,
          executionTime,
          metrics: {
            correctionAttempts: result.metrics.correctionAttempts,
            errorCount: result.metrics.validationErrors.length
          }
        })}\n\n`)
      );

    } catch (error) {
      console.error('Workflow execution error:', error);
      
      await writer.write(
        encoder.encode(`data: ${JSON.stringify({
          type: 'critical_error',
          message: error instanceof Error ? error.message : 'Unknown error'
        })}\n\n`)
      );
    } finally {
      await writer.close();
    }
  })();

  return new Response(stream.readable, {
    headers: {
      'Content-Type': 'text/event-stream',
      'Cache-Control': 'no-cache',
      'Connection': 'keep-alive',
    },
  });
}

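Streaming API endpoint that reports workflow results as server-sent events. Because `execute` resolves only after the whole workflow has finished, the correction and monitoring events arrive together at the end rather than while each step runs.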
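
Two small helpers can keep a handler like this tidy: one that serializes an event into the SSE wire format, and one that makes closing the writer idempotent, so an early return combined with a `finally` block cannot close it twice. This is a minimal sketch; `formatSSE` and `closeOnce` are hypothetical names, not part of Next.js or the Streams API.

```typescript
// Serialize one event as a single SSE "data:" line followed by the
// blank line that terminates an event. JSON.stringify escapes newlines,
// so the payload always fits on one line.
function formatSSE(event: unknown): string {
  return `data: ${JSON.stringify(event)}\n\n`;
}

// Wrap a close function so that repeated calls reuse the first call's
// promise instead of closing again (a second close() on a stream writer
// rejects).
function closeOnce(close: () => Promise<void>): () => Promise<void> {
  let pending: Promise<void> | null = null;
  return () => {
    if (pending === null) {
      pending = close();
    }
    return pending;
  };
}

// Usage inside a handler (assumes `writer` and `encoder` as in the route above):
//   const close = closeOnce(() => writer.close());
//   await writer.write(encoder.encode(formatSSE({ type: 'start' })));
//   await close(); // safe to call again in a finally block
```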

4. Build Interactive Monitoring Dashboard

// components/SelfCorrectingDashboard.tsx
'use client';

import { useState } from 'react';
import { useMutation } from '@tanstack/react-query';

interface WorkflowEvent {
  type: 'start' | 'correction' | 'monitoring' | 'complete' | 'error' | 'critical_error';
  message?: string;
  attempts?: number;
  errors?: string[];
  errorSummary?: Array<{
    agent: string;
    errorCount: number;
    handled: number;
  }>;
  output?: string;
  success?: boolean;
  executionTime?: number;
  metrics?: {
    correctionAttempts: number;
    errorCount: number;
  };
}

export default function SelfCorrectingDashboard() {
  const [inputText, setInputText] = useState('');
  const [events, setEvents] = useState<WorkflowEvent[]>([]);
  const [isProcessing, setIsProcessing] = useState(false);
  const [enableMonitoring, setEnableMonitoring] = useState(true);
  const [result, setResult] = useState<string | null>(null);

  const processWorkflow = useMutation({
    mutationFn: async (params: { text: string; enableMonitoring: boolean }) => {
      setEvents([]);
      setResult(null);
      setIsProcessing(true);

      const response = await fetch('/api/self-correcting', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify(params),
      });

      if (!response.ok) {
        throw new Error('Workflow failed');
      }

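      const reader = response.body?.getReader();
      if (!reader) {
        throw new Error('Response has no body');
      }
      const decoder = new TextDecoder();
      let buffer = '';

      while (true) {
        const { done, value } = await reader.read();
        if (done) break;

        // A chunk can end in the middle of an event, so buffer the text and
        // only parse events that are terminated by a blank line.
        buffer += decoder.decode(value, { stream: true });
        const parts = buffer.split('\n\n');
        buffer = parts.pop() ?? '';

        for (const part of parts) {
          for (const line of part.split('\n')) {
            if (!line.startsWith('data: ')) continue;
            try {
              const event: WorkflowEvent = JSON.parse(line.slice(6));
              setEvents(prev => [...prev, event]);

              if (event.type === 'complete' && event.output) {
                setResult(event.output);
              }
            } catch (e) {
              console.error('Failed to parse event:', e);
            }
          }
        }
      }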
    },
    onSettled: () => {
      setIsProcessing(false);
    },
  });

  const handleSubmit = (e: React.FormEvent) => {
    e.preventDefault();
    if (inputText.trim().length >= 10) {
      processWorkflow.mutate({ text: inputText, enableMonitoring });
    }
  };

  const getCorrectionStats = () => {
    const correctionEvents = events.filter(e => e.type === 'correction');
    if (correctionEvents.length === 0) return null;
    
    const lastCorrection = correctionEvents[correctionEvents.length - 1];
    return {
      attempts: lastCorrection.attempts || 0,
      errors: lastCorrection.errors || []
    };
  };

  const getExecutionMetrics = () => {
    const completeEvent = events.find(e => e.type === 'complete');
    if (!completeEvent) return null;
    
    return {
      time: completeEvent.executionTime,
      success: completeEvent.success,
      corrections: completeEvent.metrics?.correctionAttempts || 0,
      errors: completeEvent.metrics?.errorCount || 0
    };
  };

  return (
    <div className="container mx-auto p-4 space-y-4">
      {/* Header */}
      <div className="card bg-base-100 shadow-xl">
        <div className="card-body">
          <h1 className="card-title text-2xl">Self-Correcting Workflow Dashboard</h1>
          <p className="text-base-content/70">
            Watch AI automatically detect and correct errors in real-time
          </p>
        </div>
      </div>

      {/* Input Form */}
      <div className="card bg-base-100 shadow-xl">
        <div className="card-body">
          <form onSubmit={handleSubmit} className="space-y-4">
            <div className="form-control">
              <label className="label">
                <span className="label-text">Input Text (min 10 characters)</span>
              </label>
              <textarea
                className="textarea textarea-bordered h-32"
                placeholder="Enter text for extraction and analysis..."
                value={inputText}
                onChange={(e) => setInputText(e.target.value)}
                disabled={isProcessing}
              />
            </div>

            <div className="form-control">
              <label className="label cursor-pointer">
                <span className="label-text">Enable Error Monitoring</span>
                <input
                  type="checkbox"
                  className="toggle toggle-primary"
                  checked={enableMonitoring}
                  onChange={(e) => setEnableMonitoring(e.target.checked)}
                />
              </label>
            </div>

            <button
              type="submit"
              className="btn btn-primary w-full"
              disabled={isProcessing || inputText.trim().length < 10}
            >
              {isProcessing ? (
                <>
                  <span className="loading loading-spinner"></span>
                  Processing Workflow...
                </>
              ) : 'Execute Self-Correcting Workflow'}
            </button>
          </form>
        </div>
      </div>

      {/* Execution Timeline */}
      {events.length > 0 && (
        <div className="card bg-base-100 shadow-xl">
          <div className="card-body">
            <h2 className="card-title">Execution Timeline</h2>
            
            <ul className="timeline timeline-vertical">
              {events.map((event, idx) => (
                <li key={idx}>
                  {idx > 0 && <hr />}
                  <div className="timeline-start">{idx + 1}</div>
                  <div className="timeline-middle">
                    <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 20 20" fill="currentColor" className={`w-5 h-5 ${
                      event.type === 'error' || event.type === 'critical_error' ? 'text-error' :
                      event.type === 'correction' ? 'text-warning' :
                      event.type === 'complete' ? 'text-success' :
                      'text-primary'
                    }`}>
                      <path fillRule="evenodd" d="M10 18a8 8 0 100-16 8 8 0 000 16zm3.857-9.809a.75.75 0 00-1.214-.882l-3.483 4.79-1.88-1.88a.75.75 0 10-1.06 1.061l2.5 2.5a.75.75 0 001.137-.089l4-5.5z" clipRule="evenodd" />
                    </svg>
                  </div>
                  <div className="timeline-end timeline-box">
                    <div className="font-semibold capitalize">{event.type}</div>
                    {event.message && (
                      <p className="text-sm opacity-80">{event.message}</p>
                    )}
                    {event.attempts !== undefined && event.attempts > 0 && (
                      <div className="badge badge-warning badge-sm mt-1">
                        {event.attempts} correction attempts
                      </div>
                    )}
                  </div>
                  {idx < events.length - 1 && <hr />}
                </li>
              ))}
            </ul>
          </div>
        </div>
      )}

      {/* Metrics Dashboard */}
      {getExecutionMetrics() && (
        <div className="card bg-base-100 shadow-xl">
          <div className="card-body">
            <h2 className="card-title">Execution Metrics</h2>
            
            <div className="stats stats-vertical lg:stats-horizontal shadow">
              <div className="stat">
                <div className="stat-title">Status</div>
                <div className="stat-value text-lg">
                  {getExecutionMetrics()?.success ? (
                    <span className="text-success">Success</span>
                  ) : (
                    <span className="text-warning">Partial</span>
                  )}
                </div>
              </div>
              
              <div className="stat">
                <div className="stat-title">Execution Time</div>
                <div className="stat-value text-lg">
                  {(getExecutionMetrics()?.time || 0) / 1000}s
                </div>
              </div>
              
              <div className="stat">
                <div className="stat-title">Corrections</div>
                <div className="stat-value text-lg">
                  {getExecutionMetrics()?.corrections || 0}
                </div>
              </div>
              
              <div className="stat">
                <div className="stat-title">Errors Handled</div>
                <div className="stat-value text-lg">
                  {getExecutionMetrics()?.errors || 0}
                </div>
              </div>
            </div>
          </div>
        </div>
      )}

      {/* Result Display */}
      {result && (
        <div className="card bg-base-100 shadow-xl">
          <div className="card-body">
            <h2 className="card-title">Workflow Result</h2>
            <div className="mockup-code">
              <pre className="text-sm"><code>{result}</code></pre>
            </div>
          </div>
        </div>
      )}

      {/* Error Display */}
      {processWorkflow.isError && (
        <div className="alert alert-error shadow-lg">
          <span>Workflow execution failed. Please try again.</span>
        </div>
      )}
    </div>
  );
}

Interactive dashboard providing real-time visualization of workflow execution, error corrections, and performance metrics.
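
A fetch stream may deliver an SSE event split across two reads, so the client has to buffer text until it sees the blank line that ends an event. The sketch below pulls that logic into a standalone function that can be tested without React; `createSSEParser` is a hypothetical helper name, and it handles only the single-line `data: ` events this endpoint emits.

```typescript
// Returns a function that accepts decoded text chunks in arrival order and
// calls onEvent once per complete, parseable event.
function createSSEParser<T>(onEvent: (event: T) => void): (chunk: string) => void {
  let buffer = '';

  return (chunk: string): void => {
    buffer += chunk;

    // Events are separated by a blank line; anything after the last
    // separator is an incomplete event and stays in the buffer.
    let boundary = buffer.indexOf('\n\n');
    while (boundary !== -1) {
      const rawEvent = buffer.slice(0, boundary);
      buffer = buffer.slice(boundary + 2);

      for (const line of rawEvent.split('\n')) {
        if (!line.startsWith('data: ')) continue;
        try {
          onEvent(JSON.parse(line.slice(6)) as T);
        } catch {
          // Skip a malformed payload rather than aborting the stream.
        }
      }
      boundary = buffer.indexOf('\n\n');
    }
  };
}

// Usage in the read loop (assumes `decoder` and `value` as in the component):
//   const feed = createSSEParser<WorkflowEvent>(e => setEvents(prev => [...prev, e]));
//   feed(decoder.decode(value, { stream: true }));
```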

Conclusion

Exception handling in agentic design patterns goes beyond simple try-catch blocks: it is about building self-healing systems that anticipate failures, recover gracefully, and learn from errors. The patterns demonstrated here (error boundaries, recovery managers, self-correction, and error propagation) form the foundation for production-ready agents. In serverless environments, proper timeout management, state persistence, and graceful degradation are essential. The goal is a system that not only handles errors but also uses them to improve reliability and user experience.
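
Timeout management in a serverless function often comes down to a time budget: before starting another retry or correction pass, check whether enough of the function's allowed duration remains. A minimal sketch, assuming a hypothetical `createDeadline` helper and an arbitrary 5-second safety margin (neither comes from Vercel or LangGraph):

```typescript
interface Deadline {
  // Milliseconds left before the budget (minus the safety margin) runs out.
  remainingMs(): number;
  // True if an operation with the given estimated cost still fits.
  canAfford(estimatedMs: number): boolean;
}

// The clock is injectable so the budget logic can be tested deterministically.
function createDeadline(
  maxDurationMs: number,
  safetyMarginMs: number = 5000,
  now: () => number = Date.now
): Deadline {
  const startedAt = now();
  const remainingMs = (): number =>
    Math.max(0, maxDurationMs - safetyMarginMs - (now() - startedAt));

  return {
    remainingMs,
    canAfford: (estimatedMs: number): boolean => remainingMs() >= estimatedMs,
  };
}

// Usage (the route above sets maxDuration = 300 seconds):
//   const deadline = createDeadline(300 * 1000);
//   if (!deadline.canAfford(20000)) { /* skip the retry and degrade instead */ }
```

When the budget is too small for another attempt, the agent can return a degraded result while it still has time to respond, instead of being cut off by the platform.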