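Draft: "Agentic Design Patterns - Prompt Chaining"

Learn how to build production-grade sequential AI workflows in a Next.js application: break complex tasks into manageable steps and pass outputs between specialized prompts to implement sophisticated reasoning and processing.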

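Mental Model: An Assembly Line for AI

Think of prompt chaining as a modern assembly line in a Tesla factory. Each station (prompt) performs one specific task: one installs the wheels, another applies the paint, a third inspects quality. The output of each station becomes the input of the next, producing a refined final product. On the AI side, you refine information instead of building cars: first extract the data, then analyze it, format it, and finally validate it. Just as the assembly line transformed manufacturing efficiency, prompt chaining changes how complex AI tasks are handled by splitting them into specialized, manageable steps that can each be optimized independently.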
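The assembly-line idea can be sketched without any library. In this minimal sketch each station is a plain function and the chain feeds each output into the next. The names `Step` and `runChain` and the toy stations are illustrative assumptions, not LangChain APIs; real LLM steps would be asynchronous.

```typescript
// A step takes the previous station's output and returns the next one.
type Step = (input: string) => string;

// Run the steps left to right, feeding each output into the next step.
function runChain(steps: Step[], input: string): string {
  return steps.reduce((acc, step) => step(acc), input);
}

// Toy stations standing in for LLM calls (hypothetical).
const extract: Step = (text) => text.trim();
const analyze: Step = (text) => text.toUpperCase();
const format: Step = (text) => `[${text}]`;

const output = runChain([extract, analyze, format], '  raw data  ');
console.log(output);
```

Because every station has the same shape, any one of them can be swapped or tuned without touching the others.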

Implementing a Basic Sequential Chain

1. A Simple Text Processing Chain

// lib/chains/basic-chain.ts
import { ChatGoogleGenerativeAI } from '@langchain/google-genai';
import { PromptTemplate } from '@langchain/core/prompts';
import { StringOutputParser } from '@langchain/core/output_parsers';
import { RunnableSequence } from '@langchain/core/runnables';

export function createBasicChain() {
  const model = new ChatGoogleGenerativeAI({
    modelName: 'gemini-2.5-flash',
    temperature: 0,
    maxRetries: 3,
  });

  // Step 1: summarize the input
  const summarizePrompt = PromptTemplate.fromTemplate(
    `Summarize the following text in 2-3 sentences:

    {text}`
  );

  // Step 2: extract key points
  const extractPrompt = PromptTemplate.fromTemplate(
    `Extract 3-5 key points from this summary:

    {summary}`
  );

  // Step 3: generate action items
  const actionPrompt = PromptTemplate.fromTemplate(
    `Based on these key points, generate actionable recommendations:

    {keyPoints}`
  );

  // Build the chain with LCEL
  const chain = RunnableSequence.from([
    // First step: summarize
    summarizePrompt,
    model,
    new StringOutputParser(),
    // Wrap the summary under the key the next prompt expects
    (summary: string) => ({ summary }),
    extractPrompt,
    model,
    new StringOutputParser(),
    // Wrap the key points under the key the final prompt expects
    (keyPoints: string) => ({ keyPoints }),
    actionPrompt,
    model,
    new StringOutputParser(),
  ]);

  return chain;
}

Creates a three-stage chain that progressively refines text from raw input into actionable recommendations.
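The small arrow functions between stages matter: each parser returns a plain string, while the next prompt expects an object keyed by its placeholder name. A rough sketch of why the key has to match is shown here. `fillTemplate` is a hypothetical, simplified stand-in for template substitution, not how `PromptTemplate` is actually implemented.

```typescript
// Replace each {name} placeholder with the matching value; fail loudly if one is missing.
function fillTemplate(template: string, vars: Record<string, string>): string {
  return template.replace(/\{(\w+)\}/g, (_match: string, name: string) => {
    if (!(name in vars)) throw new Error(`Missing variable: ${name}`);
    return vars[name];
  });
}

const summary = 'Sales rose 10% in Q3.';

// Wrapping the plain string as { summary } is what makes {summary} resolvable.
const prompt = fillTemplate('Extract key points from: {summary}', { summary });
console.log(prompt);
```

Passing `{ summary }` to a template that expects `{keyPoints}` would throw in this sketch, which is the mismatch the wrapper functions guard against.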

2. API Route for the Chain

// app/api/basic-chain/route.ts
import { createBasicChain } from '@/lib/chains/basic-chain';
import { NextResponse } from 'next/server';

export const runtime = 'nodejs';
export const maxDuration = 60;

export async function POST(req: Request) {
  try {
    const { text } = await req.json();

    if (!text || text.length < 50) {
      return NextResponse.json(
        { error: 'Please enter at least 50 characters of text' },
        { status: 400 }
      );
    }

    const chain = createBasicChain();
    const result = await chain.invoke({ text });

    return NextResponse.json({
      success: true,
      result,
      timestamp: new Date().toISOString(),
    });
  } catch (error) {
    console.error('Chain execution error:', error);
    return NextResponse.json(
      { error: 'Chain processing failed' },
      { status: 500 }
    );
  }
}

Handles chain execution with input validation and error handling suited to a serverless deployment.
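One gap worth noting in the route's check: a non-string `text` such as a number has no `length`, so `text.length < 50` evaluates to false and the value slips through. A stricter validator could look like the following sketch. `validateText` and `ValidationResult` are hypothetical names, and the 50-character minimum is taken from the route.

```typescript
type ValidationResult =
  | { ok: true; text: string }
  | { ok: false; error: string };

const MIN_LENGTH = 50; // same threshold the route uses

// Narrow an untrusted JSON body down to a usable text string.
function validateText(body: unknown): ValidationResult {
  if (typeof body !== 'object' || body === null) {
    return { ok: false, error: 'Request body must be a JSON object' };
  }
  const text = (body as { text?: unknown }).text;
  if (typeof text !== 'string' || text.trim().length < MIN_LENGTH) {
    return { ok: false, error: `Please enter at least ${MIN_LENGTH} characters of text` };
  }
  return { ok: true, text };
}

console.log(validateText({ text: 12345 }).ok); // a number is rejected
```

In the route, a failed result would map to the 400 response and a successful one to `chain.invoke({ text })`.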

3. Frontend Integration with TanStack Query

// components/BasicChainInterface.tsx
'use client';

import { useMemo, useState } from 'react';
import { useMutation } from '@tanstack/react-query';
import { debounce } from 'es-toolkit';

interface ChainResponse {
  success: boolean;
  result: string;
  timestamp: string;
}

export default function BasicChainInterface() {
  const [text, setText] = useState('');
  const [charCount, setCharCount] = useState(0);

  const processChain = useMutation<ChainResponse, Error, string>({
    mutationFn: async (inputText: string) => {
      const response = await fetch('/api/basic-chain', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ text: inputText }),
      });

      if (!response.ok) {
        const error = await response.json();
        throw new Error(error.error || 'Chain processing failed');
      }

      return response.json();
    },
    retry: 2,
    retryDelay: (attemptIndex) => Math.min(1000 * 2 ** attemptIndex, 30000),
  });

  // Create the debounced function once; recreating it on every render
  // would reset its timer and defeat the debounce.
  const handleTextChange = useMemo(
    () => debounce((value: string) => setCharCount(value.length), 300),
    []
  );

  const handleSubmit = (e: React.FormEvent) => {
    e.preventDefault();
    if (text.length >= 50) {
      processChain.mutate(text);
    }
  };

  return (
    <div className="card w-full bg-base-100 shadow-xl">
      <div className="card-body">
        <h2 className="card-title">Text Processing Chain</h2>

        <form onSubmit={handleSubmit} className="space-y-4">
          <div className="form-control">
            <label className="label">
              <span className="label-text">Input text</span>
              <span className="label-text-alt">{charCount} characters</span>
            </label>
            <textarea
              className="textarea textarea-bordered h-32"
              placeholder="Enter text to process (at least 50 characters)..."
              value={text}
              onChange={(e) => {
                setText(e.target.value);
                handleTextChange(e.target.value);
              }}
              disabled={processChain.isPending}
            />
          </div>

          <button
            type="submit"
            className="btn btn-primary"
            disabled={processChain.isPending || text.length < 50}
          >
            {processChain.isPending ? (
              <>
                <span className="loading loading-spinner"></span>
                Processing chain...
              </>
            ) : (
              'Process text'
            )}
          </button>
        </form>

        {processChain.isError && (
          <div className="alert alert-error mt-4">
            <span>{processChain.error.message}</span>
          </div>
        )}

        {processChain.isSuccess && (
          <div className="card bg-base-200 mt-4">
            <div className="card-body">
              <h3 className="font-semibold">Result:</h3>
              <p className="whitespace-pre-wrap">{processChain.data.result}</p>
              <p className="text-xs text-base-content/60 mt-2">
                Processed at: {new Date(processChain.data.timestamp).toLocaleString()}
              </p>
            </div>
          </div>
        )}
      </div>
    </div>
  );
}

A React component with debounced character counting and error handling built on TanStack Query.

Advanced Multi-Step Chains with State Management

1. Document Analysis Chain with LangGraph

// lib/chains/document-analyzer.ts
import { StateGraph, START, END, Annotation } from '@langchain/langgraph';
import { ChatGoogleGenerativeAI } from '@langchain/google-genai';
import { BaseMessage, HumanMessage, AIMessage } from '@langchain/core/messages';
import { z } from 'zod';
import { StructuredOutputParser } from '@langchain/core/output_parsers';
import { chunk } from 'es-toolkit';

// Metadata schema defined with Zod
const DocumentMetadata = z.object({
  title: z.string(),
  category: z.string(),
  confidence: z.number().min(0).max(1),
  keywords: z.array(z.string()),
});

// Graph state definition
const GraphState = Annotation.Root({
  document: Annotation<string>(),
  chunks: Annotation<string[]>(),
  metadata: Annotation<z.infer<typeof DocumentMetadata>>(),
  summary: Annotation<string>(),
  insights: Annotation<string[]>(),
  messages: Annotation<BaseMessage[]>({
    reducer: (current, update) => current.concat(update),
    default: () => [],
  }),
});

export function createDocumentAnalyzer() {
  const model = new ChatGoogleGenerativeAI({
    modelName: 'gemini-2.5-pro',
    temperature: 0.1,
    maxRetries: 3,
  });

  const metadataParser = StructuredOutputParser.fromZodSchema(DocumentMetadata);

  const workflow = new StateGraph(GraphState)
    // Node 1: chunk the document
    .addNode('chunk_document', async (state) => {
      const doc = state.document;
      // Split into segments of 100 words each (no overlap)
      const chunks = chunk(doc.split(' '), 100).map(words => words.join(' '));

      return {
        chunks,
        messages: [new AIMessage(`Document split into ${chunks.length} segments`)],
      };
    })
    // Node 2: extract metadata
    .addNode('extract_metadata', async (state) => {
      const prompt = `Analyze this document and extract its metadata.

Document: ${state.chunks[0]}

${metadataParser.getFormatInstructions()}`;

      const response = await model.invoke([new HumanMessage(prompt)]);
      const metadata = await metadataParser.parse(response.content as string);

      return {
        metadata,
        messages: [new AIMessage(`Metadata extracted: ${metadata.category}`)],
      };
    })
    // Node 3: generate the summary
    .addNode('generate_summary', async (state) => {
      const combinedText = state.chunks.slice(0, 3).join('\n\n');
      const prompt = `Write a comprehensive summary of this document.
      Category: ${state.metadata.category}

      Document excerpts:
      ${combinedText}`;

      const response = await model.invoke([new HumanMessage(prompt)]);

      return {
        summary: response.content as string,
        messages: [new AIMessage('Summary generated')],
      };
    })
    // Node 4: extract insights
    .addNode('extract_insights', async (state) => {
      const prompt = `Provide 3-5 key insights based on this summary and metadata:

      Category: ${state.metadata.category}
      Keywords: ${state.metadata.keywords.join(', ')}
      Summary: ${state.summary}`;

      const response = await model.invoke([new HumanMessage(prompt)]);
      const insights = (response.content as string)
        .split('\n')
        .filter(line => line.trim().length > 0);

      return {
        insights,
        messages: [new AIMessage(`${insights.length} insights extracted`)],
      };
    })
    // Define the edges
    .addEdge(START, 'chunk_document')
    .addEdge('chunk_document', 'extract_metadata')
    .addEdge('extract_metadata', 'generate_summary')
    .addEdge('generate_summary', 'extract_insights')
    .addEdge('extract_insights', END);

  return workflow.compile();
}

Implements a stateful document processing pipeline that includes metadata extraction and insight generation.
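The chunking node splits on fixed word counts with no overlap, so a sentence that straddles a boundary is cut in two. If overlapping chunks are wanted, a sliding window is one option. The sketch below is an assumption about how that might look; `chunkWords` is a hypothetical helper, not an es-toolkit function.

```typescript
// Split text into windows of `size` words, where consecutive windows share `overlap` words.
function chunkWords(text: string, size: number, overlap = 0): string[] {
  if (size <= 0 || overlap < 0 || overlap >= size) {
    throw new Error('Require size > 0 and 0 <= overlap < size');
  }
  const words = text.split(/\s+/).filter((w) => w.length > 0);
  const chunks: string[] = [];
  const stride = size - overlap;
  for (let start = 0; start < words.length; start += stride) {
    chunks.push(words.slice(start, start + size).join(' '));
    if (start + size >= words.length) break; // this window already reached the end
  }
  return chunks;
}

const overlapped = chunkWords('a b c d e f g', 3, 1);
console.log(overlapped);
```

With `overlap` set to 0 this behaves like plain fixed-size chunking, so it could replace the `chunk(...)` call without changing the rest of the node.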

2. Streaming API Route for Document Analysis

// app/api/analyze-document/route.ts
import { createDocumentAnalyzer } from '@/lib/chains/document-analyzer';
import { HumanMessage } from '@langchain/core/messages';

export const runtime = 'nodejs';
export const maxDuration = 120;

export async function POST(req: Request) {
  const { document } = await req.json();

  if (!document || document.length < 100) {
    return new Response(
      JSON.stringify({ error: 'The document must be at least 100 characters long' }),
      { status: 400 }
    );
  }

  const encoder = new TextEncoder();
  const stream = new TransformStream();
  const writer = stream.writable.getWriter();

  const analyzer = createDocumentAnalyzer();

  (async () => {
    try {
      // Only the document is supplied; each node fills in the rest of the state
      const eventStream = await analyzer.stream({ document });

      for await (const event of eventStream) {
        const update = {
          type: 'update',
          node: Object.keys(event)[0],
          data: event,
          timestamp: new Date().toISOString(),
        };

        await writer.write(
          encoder.encode(`data: ${JSON.stringify(update)}\n\n`)
        );
      }

      await writer.write(
        encoder.encode(`data: ${JSON.stringify({ type: 'complete' })}\n\n`)
      );
    } catch (error) {
      await writer.write(
        encoder.encode(`data: ${JSON.stringify({
          type: 'error',
          error: error instanceof Error ? error.message : 'Unknown error'
        })}\n\n`)
      );
    } finally {
      await writer.close();
    }
  })();

  return new Response(stream.readable, {
    headers: {
      'Content-Type': 'text/event-stream',
      'Cache-Control': 'no-cache',
      'Connection': 'keep-alive',
    },
  });
}

Streams each node's result in real time using Server-Sent Events.
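On the client, the `data: ...` frames arrive in arbitrary network chunks, so a frame may be split across reads. One way to handle that, sketched here under the assumption that every frame is a single `data:` line followed by a blank line (as the route writes them), is to buffer text and parse only complete frames. `parseSseBuffer` and `StreamEvent` are hypothetical names.

```typescript
interface StreamEvent {
  type: string;
  [key: string]: unknown;
}

// Parse every complete frame in the buffer and return the unfinished tail for the next read.
function parseSseBuffer(buffer: string): { events: StreamEvent[]; rest: string } {
  const frames = buffer.split('\n\n');
  const rest = frames.pop() ?? ''; // the last piece may be an incomplete frame
  const events: StreamEvent[] = [];
  for (const frame of frames) {
    for (const line of frame.split('\n')) {
      if (line.startsWith('data: ')) {
        events.push(JSON.parse(line.slice('data: '.length)) as StreamEvent);
      }
    }
  }
  return { events, rest };
}

const parsed = parseSseBuffer(
  'data: {"type":"update","node":"chunk_document"}\n\ndata: {"type":"comp'
);
console.log(parsed.events.length, parsed.rest);
```

A reader loop would decode each chunk from `response.body`, append it to `rest`, and call the parser again until a `complete` or `error` event shows up.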

3. Advanced Chain with Error Recovery

// lib/chains/resilient-chain.ts
import { ChatGoogleGenerativeAI } from '@langchain/google-genai';
import { RunnableSequence, RunnableWithFallbacks } from '@langchain/core/runnables';
import { PromptTemplate } from '@langchain/core/prompts';
import { StringOutputParser } from '@langchain/core/output_parsers';
import { retry, delay } from 'es-toolkit';
import { kv } from '@vercel/kv';

interface ChainCache {
  get: (key: string) => Promise<string | null>;
  set: (key: string, value: string, ttl?: number) => Promise<void>;
}

class VercelKVCache implements ChainCache {
  async get(key: string): Promise<string | null> {
    try {
      return await kv.get<string>(key);
    } catch {
      return null;
    }
  }

  async set(key: string, value: string, ttl = 3600): Promise<void> {
    try {
      await kv.set(key, value, { ex: ttl });
    } catch {
      // Cache writes fail silently
    }
  }
}

export function createResilientChain() {
  const cache = new VercelKVCache();

  // Primary model with a fallback
  const primaryModel = new ChatGoogleGenerativeAI({
    modelName: 'gemini-2.5-pro',
    temperature: 0,
    maxRetries: 2,
  });

  const fallbackModel = new ChatGoogleGenerativeAI({
    modelName: 'gemini-2.5-flash',
    temperature: 0,
    maxRetries: 1,
  });

  const modelWithFallback = primaryModel.withFallbacks({
    fallbacks: [fallbackModel],
  });

  // Cached prompt executor
  const cachedExecutor = async (prompt: string, input: any) => {
    const cacheKey = `chain:${prompt.substring(0, 20)}:${JSON.stringify(input)}`;

    // Check the cache first
    const cached = await cache.get(cacheKey);
    if (cached) return cached;

    // Run with retries (es-toolkit's `retry` accepts `retries` and a `delay` in milliseconds)
    const result = await retry(
      async () => {
        const template = PromptTemplate.fromTemplate(prompt);
        const chain = template.pipe(modelWithFallback).pipe(new StringOutputParser());
        return await chain.invoke(input);
      },
      { retries: 3, delay: 1000 }
    );

    // Cache the result
    await cache.set(cacheKey, result);
    return result;
  };

  // Build the resilient chain
  const chain = RunnableSequence.from([
    async (input: { query: string }) => {
      // Step 1: query understanding, with caching
      const understanding = await cachedExecutor(
        'Rephrase this query for clarity: {query}',
        input
      );
      return { understanding };
    },
    async (state: { understanding: string }) => {
      // Step 2: research, with model fallback
      const research = await cachedExecutor(
        'Provide detailed research on the following: {understanding}',
        state
      );
      return { ...state, research };
    },
    async (state: { understanding: string; research: string }) => {
      // Step 3: synthesis, with output validation
      const synthesis = await cachedExecutor(
        `Synthesize this research into actionable insights:
        Topic: {understanding}
        Research: {research}`,
        state
      );

      // Validate the output; nothing retries at this point, so the error goes to the caller
      if (synthesis.length < 100) {
        throw new Error('Synthesis is too short');
      }

      return { ...state, synthesis };
    },
  ]);

  return chain;
}

Implements a production-grade chain with caching, fallbacks, and retry logic using es-toolkit utilities.
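If exponential backoff between attempts is wanted, the wait has to double each time rather than grow by a fixed amount. The following is a minimal, library-free sketch. `backoffDelay` and `retryWithBackoff` are hypothetical helpers, and the 1-second base and 30-second cap are assumed values.

```typescript
// Delay before retry number `attempt` (1-based): base, 2x base, 4x base, ... up to the cap.
function backoffDelay(attempt: number, baseMs = 1000, capMs = 30000): number {
  return Math.min(baseMs * 2 ** (attempt - 1), capMs);
}

// Default sleep; injectable so tests can skip real waiting.
const defaultSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => {
    setTimeout(resolve, ms);
  });

// Call `fn` up to `maxAttempts` times, sleeping with exponential backoff between failures.
async function retryWithBackoff<T>(
  fn: () => Promise<T>,
  maxAttempts = 3,
  sleep: (ms: number) => Promise<void> = defaultSleep,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
      if (attempt < maxAttempts) await sleep(backoffDelay(attempt));
    }
  }
  throw lastError;
}

console.log([1, 2, 3, 4].map((n) => backoffDelay(n)));
```

The cap keeps a long run of failures from stretching past a serverless function's time limit.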

4. Parallel Chain Processing

// lib/chains/parallel-processor.ts
import { ChatGoogleGenerativeAI } from '@langchain/google-genai';
import { RunnableParallel, RunnableSequence } from '@langchain/core/runnables';
import { PromptTemplate } from '@langchain/core/prompts';
import { StringOutputParser } from '@langchain/core/output_parsers';
import { chunk, uniqBy } from 'es-toolkit';

export function createParallelProcessor() {
  const model = new ChatGoogleGenerativeAI({
    modelName: 'gemini-2.5-flash',
    temperature: 0.3,
  });

  // Define the parallel analysis branches
  const sentimentAnalysis = PromptTemplate.fromTemplate(
    'Analyze the sentiment of this text (positive/negative/neutral): {text}'
  ).pipe(model).pipe(new StringOutputParser());

  const entityExtraction = PromptTemplate.fromTemplate(
    'Extract all named entities (people, places, organizations) from the following: {text}'
  ).pipe(model).pipe(new StringOutputParser());

  const topicClassification = PromptTemplate.fromTemplate(
    'Classify the main topic of this text: {text}'
  ).pipe(model).pipe(new StringOutputParser());

  const keywordExtraction = PromptTemplate.fromTemplate(
    'Extract 5-10 keywords from this text: {text}'
  ).pipe(model).pipe(new StringOutputParser());

  // Create the parallel step (RunnableParallel is a class, so build it with .from)
  const parallelAnalysis = RunnableParallel.from({
    sentiment: sentimentAnalysis,
    entities: entityExtraction,
    topic: topicClassification,
    keywords: keywordExtraction,
  });

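  // Synthesis step
  const synthesisPrompt = PromptTemplate.fromTemplate(`
    Write a comprehensive analysis report based on the following:

    Sentiment: {sentiment}
    Entities: {entities}
    Topic: {topic}
    Keywords: {keywords}

    Format it as a structured report with sections.
  `);

  // Full chain: parallel analysis, then synthesis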
  const chain = RunnableSequence.from([
    parallelAnalysis,
    synthesisPrompt,
    model,
    new StringOutputParser(),
  ]);

  return chain;
}

// Batch processor for multiple documents
export async function processBatch(documents: string[]) {
  const processor = createParallelProcessor();
  const BATCH_SIZE = 5;

  // Process in batches to avoid rate limits
  const batches = chunk(documents, BATCH_SIZE);
  const results = [];

  for (const batch of batches) {
    const batchResults = await Promise.all(
      batch.map(async (doc) => {
        try {
          const result = await processor.invoke({ text: doc });
          return { success: true, result, document: doc };
        } catch (error) {
          return {
            success: false,
            error: error instanceof Error ? error.message : 'Unknown error',
            document: doc
          };
        }
      })
    );
    results.push(...batchResults);

    // Rate-limit delay between batches
    if (batches.indexOf(batch) < batches.length - 1) {
      await new Promise(resolve => setTimeout(resolve, 1000));
    }
  }

  return results;
}

Runs several analysis tasks in parallel, synthesizes the results, and includes batch processing support.
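The fan-out and fan-in shape of the parallel step can be illustrated with plain promises. This is only a conceptual sketch of the pattern, not how `RunnableParallel` is implemented; `fanOut`, `zipToRecord`, and the toy branches are hypothetical.

```typescript
type Branch = (text: string) => Promise<string>;

// Pair branch names with their outputs, preserving order.
function zipToRecord(keys: string[], values: string[]): Record<string, string> {
  const record: Record<string, string> = {};
  keys.forEach((key, i) => {
    record[key] = values[i];
  });
  return record;
}

// Start every branch on the same input at once, then collect the results by name.
async function fanOut(
  branches: Record<string, Branch>,
  text: string,
): Promise<Record<string, string>> {
  const names = Object.keys(branches);
  const outputs = await Promise.all(names.map((name) => branches[name](text)));
  return zipToRecord(names, outputs);
}

// Toy branches standing in for the analysis prompts.
const demo = fanOut(
  {
    sentiment: async (t) => (t.includes('great') ? 'positive' : 'neutral'),
    length: async (t) => String(t.length),
  },
  'a great day',
);
demo.then((result) => console.log(result));
```

The resulting record has one key per branch, which is the shape the synthesis prompt consumes through its `{sentiment}`, `{entities}`, `{topic}`, and `{keywords}` placeholders.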

5. Context-Aware Chain with Memory

// lib/chains/contextual-chain.ts
import { ChatGoogleGenerativeAI } from '@langchain/google-genai';
import { BufferMemory } from 'langchain/memory';
import { ConversationChain } from 'langchain/chains';
import { PromptTemplate } from '@langchain/core/prompts';
import { MessagesPlaceholder } from '@langchain/core/prompts';
import { RunnableSequence } from '@langchain/core/runnables';

export class ContextualChain {
  private memory: BufferMemory;
  // Assigned in initializeChain(), hence the definite-assignment assertion
  private chain!: RunnableSequence;
  private model: ChatGoogleGenerativeAI;

  constructor() {
    this.model = new ChatGoogleGenerativeAI({
      modelName: 'gemini-2.5-pro',
      temperature: 0.7,
    });

    this.memory = new BufferMemory({
      // Return the history as one string so it can be interpolated into {history}
      returnMessages: false,
      memoryKey: 'history',
      inputKey: 'input',
      outputKey: 'output',
    });

    this.initializeChain();
  }

  private initializeChain() {
    const contextPrompt = PromptTemplate.fromTemplate(`
      You are analyzing a document step by step.
      Previous context: {history}

      Current task: {task}
      Current input: {input}

      Provide a detailed response that builds on the previous analysis.
    `);

    this.chain = RunnableSequence.from([
      async (input: { task: string; input: string }) => {
        const history = await this.memory.loadMemoryVariables({});
        return { ...input, history: history.history || '' };
      },
      contextPrompt,
      this.model,
      async (response) => {
        const content = response.content as string;
        // Save the exchange to memory
        await this.memory.saveContext(
          { input: this.lastInput },
          { output: content }
        );
        return content;
      },
    ]);
  }

  private lastInput: string = '';

  async process(task: string, input: string): Promise<string> {
    this.lastInput = `${task}: ${input}`;
    return await this.chain.invoke({ task, input });
  }

  async reset() {
    await this.memory.clear();
  }

  async getHistory(): Promise<string> {
    const history = await this.memory.loadMemoryVariables({});
    return history.history || 'No history available';
  }
}

// Usage from an API route
export async function processWithContext(
  sessionId: string,
  task: string,
  input: string
) {
  // Keep one chain per session (in-memory only, so it does not survive a serverless cold start)
  const chainStore = global as any;
  chainStore.contextChains = chainStore.contextChains || new Map();

  if (!chainStore.contextChains.has(sessionId)) {
    chainStore.contextChains.set(sessionId, new ContextualChain());
  }

  const chain = chainStore.contextChains.get(sessionId);
  return await chain.process(task, input);
}

Maintains conversation context across multiple chain runs for stateful document analysis.
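`BufferMemory` keeps the whole conversation, so the `{history}` portion of the prompt grows with every call. One way to bound it, sketched here as an assumption rather than something the chain above does, is a sliding window over the most recent turns. `SlidingHistory` is a hypothetical class, not a LangChain API.

```typescript
interface Turn {
  input: string;
  output: string;
}

// Keeps only the most recent `maxTurns` exchanges.
class SlidingHistory {
  private turns: Turn[] = [];
  private readonly maxTurns: number;

  constructor(maxTurns: number) {
    this.maxTurns = maxTurns;
  }

  add(turn: Turn): void {
    this.turns.push(turn);
    if (this.turns.length > this.maxTurns) this.turns.shift(); // drop the oldest turn
  }

  // Render the window as text suitable for a {history} placeholder.
  toPromptString(): string {
    return this.turns.map((t) => `Human: ${t.input}\nAI: ${t.output}`).join('\n');
  }
}

const history = new SlidingHistory(2);
history.add({ input: 'a', output: 'A' });
history.add({ input: 'b', output: 'B' });
history.add({ input: 'c', output: 'C' });
console.log(history.toPromptString());
```

The trade-off is that anything older than the window is forgotten, so the window size has to cover however many steps a single analysis needs.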

6. Production-Grade Chain with Monitoring

// lib/chains/monitored-chain.ts
import { ChatGoogleGenerativeAI } from '@langchain/google-genai';
import { RunnableSequence } from '@langchain/core/runnables';
import { PromptTemplate } from '@langchain/core/prompts';
import { StringOutputParser } from '@langchain/core/output_parsers';

interface ChainMetrics {
  executionTime: number;
  tokenCount: number;
  cost: number;
  steps: Array<{
    name: string;
    duration: number;
    tokens: number;
  }>;
}

export class MonitoredChain {
  private metrics: ChainMetrics[] = [];

  async executeWithMonitoring(input: string): Promise<{
    result: string;
    metrics: ChainMetrics;
  }> {
    const startTime = Date.now();
    const stepMetrics: ChainMetrics['steps'] = [];

    const model = new ChatGoogleGenerativeAI({
      modelName: 'gemini-2.5-flash',
      temperature: 0,
      callbacks: [
        {
          handleLLMStart: async (llm, prompts) => {
            console.log('LLM start:', prompts[0].substring(0, 100));
          },
          handleLLMEnd: async (output) => {
            const tokens = output.llmOutput?.tokenUsage?.totalTokens || 0;
            stepMetrics[stepMetrics.length - 1].tokens = tokens;
          },
        },
      ],
    });

    // Step 1: classification
    const stepStart = Date.now();
    stepMetrics.push({ name: 'classification', duration: 0, tokens: 0 });

    const classificationPrompt = PromptTemplate.fromTemplate(
      'Classify this text into a category: {text}'
    );

    const classification = await classificationPrompt
      .pipe(model)
      .pipe(new StringOutputParser())
      .invoke({ text: input });

    stepMetrics[0].duration = Date.now() - stepStart;

    // Step 2: enhancement
    const step2Start = Date.now();
    stepMetrics.push({ name: 'enhancement', duration: 0, tokens: 0 });

    const enhancementPrompt = PromptTemplate.fromTemplate(
      'Enrich this classification with more detail: {classification}'
    );

    const enhancement = await enhancementPrompt
      .pipe(model)
      .pipe(new StringOutputParser())
      .invoke({ classification });

    stepMetrics[1].duration = Date.now() - step2Start;

    // Compute total metrics
    const totalTokens = stepMetrics.reduce((sum, step) => sum + step.tokens, 0);
    const metrics: ChainMetrics = {
      executionTime: Date.now() - startTime,
      tokenCount: totalTokens,
      cost: totalTokens * 0.000001, // example price
      steps: stepMetrics,
    };

    this.metrics.push(metrics);

    return {
      result: enhancement,
      metrics,
    };
  }

  getAverageMetrics(): ChainMetrics | null {
    if (this.metrics.length === 0) return null;

    const avgTime = this.metrics.reduce((sum, m) => sum + m.executionTime, 0) / this.metrics.length;
    const avgTokens = this.metrics.reduce((sum, m) => sum + m.tokenCount, 0) / this.metrics.length;
    const avgCost = this.metrics.reduce((sum, m) => sum + m.cost, 0) / this.metrics.length;

    return {
      executionTime: Math.round(avgTime),
      tokenCount: Math.round(avgTokens),
      cost: avgCost,
      steps: [],
    };
  }
}

Tracks detailed metrics for each chain run, including timing, token usage, and cost.
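`getAverageMetrics` returns an empty `steps` array, so per-step averages are lost. If they are useful, they could be computed by grouping step metrics by name across runs. This is a sketch; `averageSteps` is a hypothetical helper and `StepMetric` mirrors the step shape in `ChainMetrics`.

```typescript
interface StepMetric {
  name: string;
  duration: number;
  tokens: number;
}

// Average duration and tokens per step name across several runs.
function averageSteps(runs: StepMetric[][]): StepMetric[] {
  const totals = new Map<string, { duration: number; tokens: number; count: number }>();
  for (const run of runs) {
    for (const step of run) {
      const t = totals.get(step.name) ?? { duration: 0, tokens: 0, count: 0 };
      t.duration += step.duration;
      t.tokens += step.tokens;
      t.count += 1;
      totals.set(step.name, t);
    }
  }
  return Array.from(totals, ([name, t]) => ({
    name,
    duration: t.duration / t.count,
    tokens: t.tokens / t.count,
  }));
}

const averaged = averageSteps([
  [
    { name: 'classification', duration: 100, tokens: 50 },
    { name: 'enhancement', duration: 200, tokens: 70 },
  ],
  [
    { name: 'classification', duration: 300, tokens: 150 },
    { name: 'enhancement', duration: 400, tokens: 30 },
  ],
]);
console.log(averaged);
```

Per-step averages make it easier to see which stage dominates latency or token spend.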

7. Frontend Dashboard for Chain Monitoring

// components/ChainDashboard.tsx
'use client';

import { useState, useEffect } from 'react';
import { useQuery, useMutation } from '@tanstack/react-query';
import { groupBy, mean, round } from 'es-toolkit';

interface ChainExecution {
  id: string;
  chainName: string;
  status: 'pending' | 'processing' | 'complete' | 'failed';
  startTime: string;
  endTime?: string;
  metrics?: {
    executionTime: number;
    tokenCount: number;
    cost: number;
  };
}

export default function ChainDashboard() {
  const [executions, setExecutions] = useState<ChainExecution[]>([]);

  // Fetch the execution history
  const { data: history } = useQuery({
    queryKey: ['chain-history'],
    queryFn: async () => {
      const response = await fetch('/api/chain-history');
      return response.json();
    },
    refetchInterval: 5000,
  });

  // Run a new chain
  const executeChain = useMutation({
    mutationFn: async (chainConfig: { name: string; input: string }) => {
      const response = await fetch('/api/execute-chain', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify(chainConfig),
      });
      return response.json();
    },
    onSuccess: (data) => {
      setExecutions(prev => [...prev, data]);
    },
  });

  // Compute statistics
  const stats = executions.reduce(
    (acc, exec) => {
      if (exec.status === 'complete' && exec.metrics) {
        acc.totalExecutions++;
        acc.totalTokens += exec.metrics.tokenCount;
        acc.totalCost += exec.metrics.cost;
        // Incremental mean over completed runs (totalExecutions already counts this run)
        acc.avgTime += (exec.metrics.executionTime - acc.avgTime) / acc.totalExecutions;
      } else if (exec.status === 'failed') {
        acc.failures++;
      }
      return acc;
    },
    {
      totalExecutions: 0,
      totalTokens: 0,
      totalCost: 0,
      avgTime: 0,
      failures: 0,
    }
  );

  return (
    <div className="p-6 space-y-6">
      {/* Stats grid */}
      <div className="stats shadow w-full">
        <div className="stat">
          <div className="stat-title">Total runs</div>
          <div className="stat-value">{stats.totalExecutions}</div>
          <div className="stat-desc">
            {stats.failures > 0 && `${stats.failures} failed`}
          </div>
        </div>

        <div className="stat">
          <div className="stat-title">Average run time</div>
          <div className="stat-value">{round(stats.avgTime / 1000, 2)}s</div>
          <div className="stat-desc">Per chain run</div>
        </div>

        <div className="stat">
          <div className="stat-title">Total tokens</div>
          <div className="stat-value">{stats.totalTokens.toLocaleString()}</div>
          <div className="stat-desc">
            ${round(stats.totalCost, 4)} total cost
          </div>
        </div>
      </div>

      {/* Execution timeline */}
      <div className="card bg-base-100 shadow-xl">
        <div className="card-body">
          <h2 className="card-title">Recent Runs</h2>

          <div className="overflow-x-auto">
            <table className="table table-zebra">
              <thead>
                <tr>
                  <th>Chain</th>
                  <th>Status</th>
                  <th>Duration</th>
                  <th>Tokens</th>
                  <th>Cost</th>
                </tr>
              </thead>
              <tbody>
                {executions.slice(-10).reverse().map((exec) => (
                  <tr key={exec.id}>
                    <td>{exec.chainName}</td>
                    <td>
                      <div className={`badge ${
                        exec.status === 'complete' ? 'badge-success' :
                        exec.status === 'failed' ? 'badge-error' :
                        exec.status === 'processing' ? 'badge-warning' :
                        'badge-ghost'
                      }`}>
                        {exec.status === 'complete' ? 'Complete' :
                         exec.status === 'failed' ? 'Failed' :
                         exec.status === 'processing' ? 'Processing' : 'Pending'}
                      </div>
                    </td>
                    <td>
                      {exec.metrics
                        ? `${round(exec.metrics.executionTime / 1000, 2)}s`
                        : '-'}
                    </td>
                    <td>{exec.metrics?.tokenCount || '-'}</td>
                    <td>
                      {exec.metrics
                        ? `$${round(exec.metrics.cost, 4)}`
                        : '-'}
                    </td>
                  </tr>
                ))}
              </tbody>
            </table>
          </div>
        </div>
      </div>

      {/* Run a new chain */}
      <div className="card bg-base-100 shadow-xl">
        <div className="card-body">
          <h2 className="card-title">Run Chain</h2>

          <div className="form-control">
            <label className="label">
              <span className="label-text">Select chain type</span>
            </label>
            <select className="select select-bordered">
              <option>Basic sequential</option>
              <option>Document analyzer</option>
              <option>Parallel processor</option>
              <option>Contextual chain</option>
            </select>
          </div>

          <button
            className="btn btn-primary"
            onClick={() => executeChain.mutate({
              name: 'Basic sequential',
              input: 'Sample input text',
            })}
            disabled={executeChain.isPending}
          >
            {executeChain.isPending ? (
              <>
                <span className="loading loading-spinner"></span>
                Running...
              </>
            ) : (
              'Run chain'
            )}
          </button>
        </div>
      </div>
    </div>
  );
}

A real-time dashboard that monitors chain runs with statistics and cost tracking.
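An average run time can be maintained incrementally, without storing every sample, by nudging the current mean toward each new value. This is a small sketch of that update rule; `updateMean` is a hypothetical helper.

```typescript
// Incremental mean: `count` is the number of samples including the new value.
function updateMean(mean: number, count: number, value: number): number {
  return mean + (value - mean) / count;
}

const samples = [1000, 2000, 6000]; // run times in milliseconds
let runningMean = 0;
samples.forEach((value, i) => {
  runningMean = updateMean(runningMean, i + 1, value);
});
console.log(runningMean);
```

By contrast, repeatedly averaging the previous mean with the newest value weights recent runs far too heavily and does not produce the true mean.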

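Conclusion

Prompt chaining turns complex AI tasks into manageable sequential steps that can be optimized, monitored, and scaled independently. By implementing these patterns with LangChain's LCEL and LangGraph's stateful workflows on Vercel's serverless platform, you can build production-grade AI applications that handle sophisticated reasoning tasks reliably. The key is to start with a simple chain and add resilience features such as caching, fallbacks, and monitoring incrementally as the application grows.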