Translations:FACTS About Building Retrieval Augmented Generation-based Chatbots/4/ko

Latest revision as of 07:19, 20 February 2025

Message definition (FACTS About Building Retrieval Augmented Generation-based Chatbots)

Enterprise chatbots powered by generative AI are rapidly emerging as the most explored initial applications of this technology in industry, aimed at enhancing employee productivity. Retrieval Augmented Generation (RAG), Large Language Models (LLMs), and LLM orchestration frameworks such as LangChain and LlamaIndex serve as key technological components in building generative-AI-based chatbots. However, building successful enterprise chatbots is not easy: they require meticulous engineering of RAG pipelines. This includes fine-tuning semantic embeddings and LLMs, extracting relevant documents from vector databases, rephrasing queries, reranking results, designing effective prompts, honoring document access controls, providing concise responses, including pertinent references, safeguarding personal information, and building agents to orchestrate all these activities. In this paper, we present a framework for building effective RAG-based chatbots based on our first-hand experience of building three chatbots at NVIDIA: chatbots for IT and HR benefits, company financial earnings, and general enterprise content. Our contributions in this paper are three-fold. First, we introduce our FACTS framework for building enterprise-grade RAG-based chatbots that address the challenges mentioned. The FACTS mnemonic refers to the five dimensions that RAG-based chatbots must get right: content freshness (F), architectures (A), cost economics of LLMs (C), testing (T), and security (S). Second, we present fifteen control points of RAG pipelines and techniques for optimizing chatbots’ performance at each stage. Finally, we present empirical results from our enterprise data on the accuracy-latency tradeoffs between large and small LLMs. To the best of our knowledge, this is the first paper of its kind that provides a holistic view of the factors as well as solutions for building secure enterprise-grade chatbots.
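
A few of the pipeline stages listed above (retrieving relevant documents, honoring document access controls, and building a prompt that carries references) can be illustrated with a minimal Python sketch. This is not the paper's implementation. It assumes a toy bag-of-words cosine similarity in place of semantic embeddings and a vector database, makes no LLM call, and every name in it (`Doc`, `retrieve`, `build_prompt`, `DOCS`) is hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
import math


@dataclass
class Doc:
    doc_id: str
    text: str
    allowed_groups: frozenset  # groups permitted to read this document


def tokenize(text):
    """Lowercase whitespace tokens with surrounding punctuation stripped."""
    tokens = (t.strip(".,?!").lower() for t in text.split())
    return [t for t in tokens if t]


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, docs, user_groups, k=2):
    """Return up to k documents the user may read, most similar first."""
    q = Counter(tokenize(query))
    # Access control is applied before ranking, so restricted text
    # never reaches the prompt.
    visible = [d for d in docs if d.allowed_groups & user_groups]
    scored = [(cosine(q, Counter(tokenize(d.text))), d) for d in visible]
    scored = [(s, d) for s, d in scored if s > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for _, d in scored[:k]]


def build_prompt(query, hits):
    """Assemble a prompt that asks for a concise, cited answer."""
    context = "\n".join(f"[{d.doc_id}] {d.text}" for d in hits)
    return (
        "Answer concisely using only the context and cite document ids.\n"
        f"Context:\n{context}\n"
        f"Question: {query}"
    )


# Hypothetical toy corpus (assumption, not data from the paper).
DOCS = [
    Doc("hr-1", "Employees accrue vacation days every month.",
        frozenset({"employees"})),
    Doc("it-1", "Reset your VPN password from the IT portal.",
        frozenset({"employees"})),
    Doc("fin-1", "Draft vacation budget forecast for finance staff.",
        frozenset({"finance"})),
]

if __name__ == "__main__":
    question = "How many vacation days do employees get?"
    hits = retrieve(question, DOCS, {"employees"})
    print(build_prompt(question, hits))
```

In a real pipeline the resulting prompt would be sent to an LLM, and stages such as query rephrasing and reranking would sit on either side of `retrieve`; they are omitted here to keep the sketch short.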
