Translations:FACTS About Building Retrieval Augmented Generation-based Chatbots/46/ko: Difference between revisions

Latest revision as of 07:19, 20 February 2025

Message definition (FACTS About Building Retrieval Augmented Generation-based Chatbots)

'''Bigger Vs. Smaller Models''': Smaller open-source LLMs are becoming viable alternatives to larger commercial LLMs for many use cases, giving companies a cost-effective option. As open-source models catch up with larger commercial models, they offer increasingly comparable accuracy, as demonstrated in our NVHelp bot empirical evaluation in Figure [[#S3.F3|3]], and generally have lower latency than larger models. GPU optimization of inference can further reduce processing time: open-source models optimized with NVIDIA's TensorRT-LLM inference libraries, for instance, have run faster than non-optimized models. These strategies help balance cost efficiency against the need to maintain high performance and security standards.

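Bigger vs. smaller models: For a growing number of use cases, smaller open-source LLMs are becoming a practical substitute for larger commercial LLMs, which gives companies a cost-effective alternative. As open-source models close the gap with larger commercial models, they deliver increasingly similar accuracy, as shown in the NVHelp bot empirical evaluation in Figure 3, and they generally have better latency than larger models. In addition, GPU optimization can further shorten the processing time of inference models. For example, open-source models optimized with NVIDIA's TensorRT-LLM inference libraries have shown faster performance than non-optimized models. These strategies help keep costs down while maintaining high performance and security standards.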
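
The trade-off described above can be read as a simple selection rule: among the models that meet an accuracy floor and a latency ceiling, prefer the cheapest. The following is a minimal sketch of that rule, not the evaluation procedure used for the NVHelp bot. The `Candidate` class, the `pick_model` function, the model names, and every number are hypothetical placeholders chosen for illustration; they are not measurements from the source.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Candidate:
    """One model under consideration (all fields are illustrative)."""
    name: str
    accuracy: float             # fraction of evaluation questions answered correctly
    p95_latency_s: float        # 95th-percentile response time in seconds
    cost_per_1k_queries: float  # cost in arbitrary currency units


def pick_model(
    candidates: Iterable[Candidate],
    min_accuracy: float,
    max_latency_s: float,
) -> Optional[Candidate]:
    """Return the cheapest candidate that meets both thresholds, or None."""
    eligible = [
        c for c in candidates
        if c.accuracy >= min_accuracy and c.p95_latency_s <= max_latency_s
    ]
    return min(eligible, key=lambda c: c.cost_per_1k_queries, default=None)


# Hypothetical placeholder values, NOT results from the source evaluation.
candidates = [
    Candidate("large-commercial", accuracy=0.90, p95_latency_s=4.0, cost_per_1k_queries=20.0),
    Candidate("small-open", accuracy=0.86, p95_latency_s=1.5, cost_per_1k_queries=3.0),
    Candidate("small-open-optimized", accuracy=0.86, p95_latency_s=0.9, cost_per_1k_queries=2.5),
]

choice = pick_model(candidates, min_accuracy=0.85, max_latency_s=2.0)
print(choice.name if choice else "no model meets the thresholds")
```

In practice the thresholds would come from the use case, and the accuracy and latency figures from an evaluation on the team's own data. Returning `None` when no model qualifies makes the failure case explicit instead of silently picking a model that misses a requirement.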