Translations:FACTS About Building Retrieval Augmented Generation-based Chatbots/4/ja: Difference between revisions

Latest revision as of 07:13, 20 February 2025

Message definition (FACTS About Building Retrieval Augmented Generation-based Chatbots)

Enterprise chatbots powered by generative AI are rapidly emerging as the most explored initial applications of this technology in industry, aimed at enhancing employee productivity. Retrieval Augmented Generation (RAG), Large Language Models (LLMs), and LLM orchestration frameworks such as LangChain and LlamaIndex serve as the key technological components for building generative-AI-based chatbots. However, building successful enterprise chatbots is not easy: they require meticulous engineering of RAG pipelines. This includes fine-tuning semantic embeddings and LLMs, extracting relevant documents from vector databases, rephrasing queries, reranking results, designing effective prompts, honoring document access controls, providing concise responses, including pertinent references, safeguarding personal information, and building agents to orchestrate all of these activities. In this paper, we present a framework for building effective RAG-based chatbots, drawing on our first-hand experience of building three chatbots at NVIDIA: chatbots for IT and HR benefits, company financial earnings, and general enterprise content. Our contributions in this paper are three-fold. First, we introduce our FACTS framework for building enterprise-grade RAG-based chatbots that address the challenges mentioned above. The FACTS mnemonic refers to the five dimensions that RAG-based chatbots must get right: content freshness (F), architectures (A), cost economics of LLMs (C), testing (T), and security (S). Second, we present fifteen control points of RAG pipelines and techniques for optimizing chatbots’ performance at each stage. Finally, we present empirical results from our enterprise data on the accuracy-latency tradeoffs between large and small LLMs. To the best of our knowledge, this is the first paper of its kind to provide a holistic view of both the factors and the solutions for building secure enterprise-grade chatbots.

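Enterprise chatbots that use generative AI are rapidly gaining attention as the first applications of this technology in industry, with the goal of improving employee productivity. Retrieval Augmented Generation (RAG), Large Language Models (LLMs), and LLM orchestration frameworks of the Langchain/Llamaindex type serve as the key technical elements for building generative-AI-based chatbots. However, building a successful enterprise chatbot is not easy. It requires careful design of RAG pipelines. Specifically, this includes fine-tuning semantic embeddings and LLMs, extracting relevant documents from vector databases, rephrasing queries, reranking results, designing effective prompts, honoring document access controls, providing concise responses, including relevant references, protecting personal information, and building agents that coordinate all of these activities. In this paper, we present a framework for building effective RAG-based chatbots, based on our experience building three chatbots at NVIDIA (IT and HR benefits, company financial earnings, and general enterprise content). This paper makes three contributions. First, we introduce the FACTS framework for building enterprise-grade RAG-based chatbots, which addresses the challenges such chatbots face. The letters of FACTS refer to the five dimensions that RAG-based chatbots must get right: content freshness (F), architectures (A), cost economics of LLMs (C), testing (T), and security (S). Second, we present fifteen control points of RAG pipelines and techniques for optimizing chatbot performance at each stage. Finally, we present empirical results from enterprise data on the tradeoff between accuracy and latency for large and small LLMs. To the best of our knowledge, this is the first paper to give a comprehensive view of the factors and solutions for building secure enterprise-grade chatbots.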
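
Several of the pipeline stages named in the abstract (retrieving relevant documents, honoring document access controls, ranking results, and building a prompt that carries references) can be illustrated with a small sketch. It is not the authors' implementation. The `Document` class, the `allowed_groups` field, the sample documents, and the word-count `embed` function are all hypothetical stand-ins chosen so the example runs without external libraries; a real system would presumably use a learned embedding model, a vector database, and a separate reranker.

```python
import math
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    doc_id: str
    text: str
    allowed_groups: frozenset  # hypothetical access-control list


def embed(text):
    """Toy stand-in for a semantic embedding: lowercase word counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, docs, user_groups, top_k=2):
    """Return up to top_k documents the user may read, best match first."""
    q = embed(query)
    # Access control is applied before ranking, so restricted text
    # never reaches the prompt.
    visible = [d for d in docs if d.allowed_groups & user_groups]
    scored = [(cosine(q, embed(d.text)), d) for d in visible]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for score, d in scored[:top_k] if score > 0]


def build_prompt(query, hits):
    """Assemble a prompt that asks for a concise, cited answer."""
    context = "\n".join(f"[{d.doc_id}] {d.text}" for d in hits)
    return (
        "Answer concisely and cite sources by id.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


# Hypothetical sample corpus, for illustration only.
DOCS = [
    Document("hr-1", "Employees accrue vacation days every month", frozenset({"all"})),
    Document("it-1", "Reset your VPN password from the IT portal", frozenset({"all"})),
    Document("fin-1", "Draft quarterly earnings figures before release", frozenset({"finance"})),
]

if __name__ == "__main__":
    question = "how do I reset my VPN password"
    hits = retrieve(question, DOCS, frozenset({"all"}))
    print(build_prompt(question, hits))
```

Filtering by access group before scoring is one possible design choice: it keeps restricted documents out of both the ranking and the prompt, at the cost of requiring group metadata on every indexed document.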